National Microbial Pathogen Data Resource

Corrected annotations in an environment for comparative analysis of genomes and biological subsystems, with an emphasis on pathogenic species of Campylobacter, Listeria, Staphylococcus, Streptococcus, and Vibrio.

With luck, there will be new genomes in the next version. Goals for v22 include:

Version 22
Development of the next version

 


BLAST or Scan | Genes | Subsystems | Organisms

   Quick-start guide to using NMPDR

  • Begin by entering keywords in the box above to search all publicly available, essentially complete genomes for a gene or protein to work on. Keywords may be any alphanumeric label, including Enzyme Commission (EC) numbers or accession numbers assigned by other data resources such as GenBank and SwissProt. Multiple search terms will automatically be joined with "AND." Try, for example, "vibrio toxin."

    • Limit the search to your focus group of organisms by starting on an organism summary page, linked at left, or by clicking the Genes link for more options.

    • BLAST your protein or gene sequence from the BLAST link.

  • Continue from the table of text or BLAST search results, which will allow you to examine your protein in the GBrowse or NMPDR environment. Click the NMPDR button to view your protein (highlighted green) in the context of its neighbors in close proximity on the genome and to compare this region with homologous regions in other genomes.

  • Context of your protein on the genome is presented in a graphic and in a table that provides the functional annotations of genes within about 8 kb up- and downstream of your focus gene.

  • Functional Clusters that include your protein (highlighted green) are represented as blue arrows in the context graphic. In the context table, proximal genes that are functionally clustered with the focus gene will have a score, fc-sc. The score is approximately equal to the number of different species (not strains) in which the two genes are co-localized. Clusters in other genomes and a table of homologous pairs are returned by clicking the CL button or the fc-sc, respectively.

  • Compare your protein with homologous regions in the maximum number of other genomes by clicking the Pins button in the context table. A subset of the pins display, initially limited to 5 genomes that share the most similar proteins in this region, is returned by clicking "Show Compare Regions." The size of the chromosomal region and number of genomes displayed may be reset by the user.

  • Align your protein with others you select from a table of orthologs, identified by a pre-calculated reciprocal BLAST analysis, by clicking the "Bidirectional Best Hits" button.

  • Sequences of the protein and DNA are provided in FASTA format. Flanking sequences of 500 nucleotides up- and downstream of the gene are also readily available.

  • Instructions are available in the list of frequently asked questions or in the tutorials Searching NMPDR and Navigating NMPDR.
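
As a rough illustration of the fc-sc score described in the guide above, the following minimal sketch counts the distinct species (not strains) in which two genes are co-localized. The function name, the input layout, and the toy observations are assumptions made for this example; NMPDR's actual computation is not documented here, and the guide itself says the score is only approximately equal to this count.

```python
def functional_coupling_score(observations):
    """Count distinct species in which a gene pair is co-localized.

    `observations` is an iterable of (species, strain, colocalized) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    # A set keeps each species once, so several strains of one species
    # contribute a single point to the score.
    return len({species for species, _strain, colocalized in observations
                if colocalized})


# Made-up observations for one pair of neighboring genes.
toy_observations = [
    ("Vibrio cholerae", "strain A", True),
    ("Vibrio cholerae", "strain B", True),   # same species, counted once
    ("Listeria monocytogenes", "strain C", True),
    ("Staphylococcus aureus", "strain D", False),
]

print(functional_coupling_score(toy_observations))
```

Counting species rather than strains keeps densely sequenced species from inflating the score.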


   How do I use NMPDR to …  find a degenerate peptide motif in selected organisms?

  1. Select the BLAST or Scan search option.
  2. Select the protScan tool from the drop down list.
  3. Type the motif of interest in the sequence box.
    • For example, use a collagen-binding motif implicated in acute rheumatic fever (ARF) following streptococcal infection, AXYLXXLN (J Biol Chem 282:18686).
  4. Select the genomes to search in the genome list.
    • For example, select all strains of Streptococcus pyogenes from the genomes list quickly by typing "pyogenes" in the text box and clicking "Select genomes containing."
  5. Now click the big, green button.
  6. Matching sequences are presented in a table with links to respective NMPDR protein pages and a new Context Viewer.
    • ProtScan finds this matching sequence, AEYLKGLN, in the M protein of two M3 strains.
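
The kind of degenerate-motif matching described in the steps above can be sketched with a regular expression, treating X as "any single residue." This is not the protScan implementation; the function names and the toy sequence (invented apart from the AEYLKGLN peptide quoted above) are assumptions for illustration.

```python
import re


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a degenerate peptide motif; 'X' matches any one residue."""
    parts = ["." if ch == "X" else re.escape(ch) for ch in motif.upper()]
    return re.compile("".join(parts))


def scan_protein(motif: str, sequence: str) -> list:
    """Return (1-based start, matched peptide) for each hit, overlaps included."""
    pattern = motif_to_regex(motif)
    # A zero-width lookahead lets the scan report overlapping occurrences.
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start() + 1, m.group(1))
            for m in lookahead.finditer(sequence.upper())]


# Toy sequence: only the embedded AEYLKGLN peptide comes from the text above.
toy_sequence = "MKT" + "AEYLKGLN" + "DQ"
print(scan_protein("AXYLXXLN", toy_sequence))  # [(4, 'AEYLKGLN')]
```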

   How do I use NMPDR to …  find genes that may be characteristic of a phenotype?

  1. Use NMPDR's Signature Genes Tool to compare and contrast whole genomes with the goal of defining a phenotype.
    • For example, what genes are characteristic of the El Tor biotype responsible for the seventh cholera pandemic?
  2. As the reference genome, select an organism that exemplifies the phenotype.
    • For example, choose Vibrio cholerae strain N16961, which has the O1 serotype and El Tor biotype.
  3. In the inclusion set, select any number of genomes (zero or more) that share the phenotype to compare with the reference genome.
    • For example, include Vibrio cholerae strain MO10, which has the O139 serotype and El Tor biotype.
  4. In the exclusion set, select any number of related genomes that do NOT share the phenotype to contrast with the reference genome.
    • For example, choose Vibrio cholerae strain O395, which has the O1 serotype and classical biotype.
  5. Leave the remainder of the settings at their default values, and click the Go button.
  6. The tool searches the database to find every protein in the reference genome that has a bidirectional best BLASTP hit (BBH) in the genomes of the inclusion set but not in those of the exclusion set. Results with a score of 1.000 have a BBH in every genome of the inclusion set (Set 1) and no BBH in any genome of the exclusion set (Set 2). When Set 1 and/or Set 2 is large, proteins with less-than-perfect scores will also be returned.
    • For example, 76 proteins in Vibrio cholerae strain N16961 have BBHs with strain MO10, but not with strain O395. These are a starting point for finding genes in the El Tor biotype that may be responsible for pandemic virulence.
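
The selection logic in step 6 can be sketched as set operations. This is a minimal sketch under stated assumptions, not the Signature Genes Tool itself: the function names, the dictionary layouts, and the toy identifiers are hypothetical, and only the perfect-score case (a BBH in every inclusion genome and in no exclusion genome) is modeled, because the formula behind lower scores is not given above.

```python
def bidirectional_best_hits(best_a_to_b, best_b_to_a):
    """Pairs (a, b) where b is a's best hit in B and a is b's best hit in A."""
    return {(a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a}


def signature_proteins(bbh_genomes, inclusion, exclusion):
    """Reference proteins with a BBH in all inclusion and no exclusion genomes.

    `bbh_genomes` maps each reference protein ID to the set of genomes in
    which that protein has a BBH (hypothetical layout for this sketch).
    """
    inclusion, exclusion = set(inclusion), set(exclusion)
    return sorted(
        protein
        for protein, genomes in bbh_genomes.items()
        # Subset test covers the inclusion set; an empty intersection
        # means no exclusion genome holds a BBH for this protein.
        if inclusion <= genomes and not (exclusion & genomes)
    )


# Made-up reference proteins and genome labels.
toy_bbh_genomes = {
    "ref_p1": {"genome_in"},
    "ref_p2": {"genome_in", "genome_out"},
    "ref_p3": set(),
}
print(signature_proteins(toy_bbh_genomes, ["genome_in"], ["genome_out"]))
```

An empty inclusion set is allowed, matching step 3; the filter then keeps every reference protein that lacks a BBH in the exclusion genomes.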

 

Add NMPDR and PubMed search engines to your browser!

Go to Mycroft to download quick keyword search plugins that work in Firefox (Mac and PC) as well as IE (PC).