Tuesday, December 27, 2011

NCBI bioinformatics tools

Amino Acid Explorer: This tool allows users to explore the characteristics of amino acids by comparing their structural and chemical properties, predicting protein sequence changes caused by mutations, viewing common substitutions, and browsing the functions of given residues in conserved domains.

Assembly Archive: Links the raw sequence information found in the Trace Archive with assembly information found in publicly available sequence repositories (GenBank/EMBL/DDBJ). The Assembly Viewer allows a user to see the multiple sequence alignments as well as the actual sequence chromatogram.


BLAST Microbial Genomes: Performs a BLAST search for similar sequences from selected complete eukaryotic and prokaryotic genomes.

BLAST RefSeqGene: Performs a BLAST search of the genomic sequences in the RefSeqGene/LRG set. The default display provides ready navigation to review alignments in the Graphics display.


Batch Entrez: Allows you to retrieve records from many Entrez databases by uploading a file of GI or accession numbers from the Nucleotide or Protein databases, or a file of unique identifiers from other Entrez databases. Search results can be saved in various formats directly to a local file on your computer.

BioAssay Services: Tools that summarize the biological test results in the PubChem database and provide alternative ways to view bioassay results and structure-activity relationships. Users also can download their analyses and data tables.

CDTree: A stand-alone application for classifying protein sequences and investigating their evolutionary relationships. CDTree can import, analyze and update existing Conserved Domain (CDD) records and hierarchies, and also allows users to create their own. CDTree is tightly integrated with Entrez CDD and Cn3D, and allows users to create and update protein domain alignments.

COBALT: COBALT is a protein multiple sequence alignment tool that finds a collection of pairwise constraints derived from the conserved domain database, the protein motif database, and sequence similarity, using RPS-BLAST, BLASTP, and PHI-BLAST. The pairwise constraints are then incorporated into a progressive multiple alignment.


Coffee Break: Part of the NCBI Bookshelf, Coffee Break combines reports on recent biomedical discoveries with use of NCBI tools. Each report incorporates interactive tutorials that show how NCBI bioinformatics tools are used as a part of the research process.

Concise Microbial Protein BLAST: A specialized BLAST service in which the queried database consists of all proteins from complete microbial (prokaryotic) genomes. NCBI has precalculated clusters of similar proteins at the genus level, and one representative is chosen from each cluster to shrink the dataset, which reduces search time and provides a broader taxonomic view.

Conserved Domain Architecture Retrieval Tool (CDART): Displays the functional domains that make up a given protein sequence. It lists proteins with similar domain architectures and can retrieve proteins that contain particular combinations of domains.


Digital Differential Display (DDD): A tool for comparing EST profiles in order to identify genes with significantly different expression levels.

E-Bench: This interactive tool allows users to build E-utility URLs, either from a form or by hand, and then view their raw output. The tool provides a simple environment for testing E-utility URLs before including them in applications.
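
As a rough illustration of what "building an E-utility URL by hand" involves, the sketch below assembles an ESearch URL from a utility name and query parameters. The helper name build_eutil_url is hypothetical, and the example query is only an assumption chosen for illustration; the code builds the URL string and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the E-utilities; the utility name is appended to it.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_eutil_url(utility, **params):
    """Return the URL for one E-utility call (hypothetical helper).

    utility: name such as "esearch" or "efetch"
    params:  query parameters, URL-encoded in the order given
    """
    return f"{EUTILS_BASE}{utility}.fcgi?{urlencode(params)}"


# Example query (an assumption for illustration): search PubMed for a gene name.
url = build_eutil_url("esearch", db="pubmed", term="BRCA1[gene]", retmax=5)
print(url)
```

Pasting the printed URL into a browser shows the raw XML, which is essentially the test cycle E-Bench offers through its form.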


Ebot: A tool that allows users to construct an E-utility analysis pipeline using an online form, and then generates a Perl script to execute the pipeline.

Electronic PCR (e-PCR): A computational procedure that is used to identify sequence tagged sites (STSs) within DNA sequences. e-PCR looks for potential STSs in DNA sequences by searching for subsequences that closely match the PCR primers and have the correct order, orientation, and spacing that could represent the PCR primers used to generate known STSs.
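
The order, orientation, and spacing test can be sketched in a few lines. This is not the e-PCR implementation, only a naive scan under simple assumptions: both primers are written 5' to 3', the reverse primer binds the opposite strand, and a hit is reported when the forward primer and the reverse complement of the reverse primer bracket a product whose length falls in the expected range. The function names and the toy template are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sts(seq, fwd, rev, min_size, max_size, max_mismatch=0):
    """Return (start, end) of candidate products; end is exclusive."""
    seq, fwd = seq.upper(), fwd.upper()
    rev_site = reverse_complement(rev)  # how the reverse primer appears on this strand
    hits = []
    for i in range(len(seq) - len(fwd) + 1):
        if mismatches(seq[i:i + len(fwd)], fwd) > max_mismatch:
            continue
        for size in range(min_size, max_size + 1):
            end = i + size
            if end > len(seq):
                break
            j = end - len(rev_site)
            if j < i + len(fwd):
                continue  # primer sites would overlap
            if mismatches(seq[j:end], rev_site) <= max_mismatch:
                hits.append((i, end))
    return hits


# Toy template: forward site, 10-base spacer, reverse site, with short flanks.
template = "CCC" + "ATGCGT" + "A" * 10 + "GATTAC" + "GGG"
print(find_sts(template, "ATGCGT", "GTAATC", 15, 30))  # [(3, 25)]
```

A production tool would index the sequence rather than scan every position, and would also check the opposite strand.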

Frequency-weighted Link (FLink): FLink is a tool that enables you to link from a group of records in a source database to a ranked list of associated records in a destination database based on frequency-weighted statistics.

Gene Expression Omnibus (GEO) BLAST: Tool for aligning a query sequence (nucleotide or protein) to GenBank sequences included on microarray or SAGE platforms in the GEO database.

Gene Plot: A tool for pairwise comparison of two prokaryotic genomes that displays pairs of protein homologs that are symmetrical best hits between the two genomes.
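
"Symmetrical best hits" means that protein a in one genome has b as its top hit in the other genome, and b in turn has a as its top hit. A minimal sketch of that filter follows, assuming the pairwise scores are already available as dictionaries; the data layout, function names, and scores are made up for illustration and are not Gene Plot's own.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: {(query, subject): score}
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def symmetrical_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    a_best = best_hits(a_vs_b)
    b_best = best_hits(b_vs_a)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)


# Invented scores: a3's best hit is b2, but b2 prefers a2, so a3 is dropped.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 120, ("a3", "b2"): 90}
b_vs_a = {("b1", "a1"): 195, ("b2", "a2"): 118, ("b2", "a3"): 80}
print(symmetrical_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1'), ('a2', 'b2')]
```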

Genetic Codes: Displays the genetic codes for organisms in the Taxonomy database in tables and on a taxonomic tree.

Genome BLAST: This tool compares nucleotide or protein sequences to genomic sequence databases and calculates the statistical significance of matches using the Basic Local Alignment Search Tool (BLAST) algorithm.


Genome Remapping Service: NCBI's Remap tool allows users to project annotation data and convert the locations of features from one genomic assembly to another, or to RefSeqGene sequences, through a base-by-base analysis. Options are provided to adjust the stringency of remapping, and summary results are displayed on the web page. Full results can be downloaded for viewing in NCBI's Genome Workbench graphical viewer. Annotation data for the remapped features and summary data are also available for download.


LinkOut: A service that allows third parties to link directly from PubMed and other Entrez database records to relevant web-accessible resources beyond the Entrez system. Examples of LinkOut resources include full-text publications, biological databases, consumer health information and research tools.



NCBI Toolbox: A set of software and data exchange specifications used by NCBI to produce portable, modular software for molecular biology. The software in the Toolbox is primarily designed to read records in Abstract Syntax Notation One (ASN.1) format, a data representation format standardized by the International Organization for Standardization (ISO).

OSIRIS: A public domain quality assurance software package that facilitates the assessment of multiplex short tandem repeat (STR) DNA profiles based on laboratory-specific protocols. OSIRIS evaluates the raw electrophoresis data using an independently derived mathematically-based sizing algorithm. It offers two new peak quality measures - fit level and sizing residual. It can be customized to accommodate laboratory-specific signatures such as background noise settings, customized naming conventions and additional internal laboratory controls.

Open Mass Spectrometry Search Algorithm (OMSSA) Search: An efficient search engine for identifying MS/MS peptide spectra by searching libraries of known protein sequences. OMSSA scores significant hits with a probability score developed using classical hypothesis testing, the same statistical method used in BLAST.

Open Reading Frame Finder (ORF Finder): A graphical analysis tool that finds all open reading frames in a user's sequence or in a sequence already in the database. Sixteen different genetic codes can be used. The deduced amino acid sequence can be saved in various formats and searched against protein databases using BLAST.
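
To make the idea of scanning reading frames concrete, here is a simplified sketch. It is an approximation under stated assumptions, not ORF Finder's algorithm: it uses only the standard stop codons, accepts only ATG as a start, and reports coordinates on whichever strand was scanned. All names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code only (an assumption)


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def orfs_in_strand(seq, min_codons):
    """Yield (frame, start, end) for ATG...stop ORFs on one strand.

    end is exclusive and includes the stop codon; min_codons counts
    codons before the stop.
    """
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    yield frame, start, i + 3
                start = None


def find_orfs(seq, min_codons=2):
    """Scan all six frames; minus-strand coordinates refer to the reverse complement."""
    seq = seq.upper()
    found = [("+", f, s, e) for f, s, e in orfs_in_strand(seq, min_codons)]
    found += [("-", f, s, e)
              for f, s, e in orfs_in_strand(reverse_complement(seq), min_codons)]
    return found


print(find_orfs("CCATGAAATTTTAGCC"))  # [('+', 2, 2, 14)]
```

Supporting other genetic codes would mostly mean swapping in a different set of start and stop codons.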

PSSM Viewer: Allows users to display, sort, subset and download position-specific score matrices (PSSMs) either from CDD records or from Position Specific Iterated (PSI)-BLAST protein searches. The tool also can align a query protein to the PSSM and highlight positions of high conservation.

Phenotype-Genotype Integrator (PheGenI): Supports finding human phenotype/genotype relationships with queries by phenotype, chromosome location, gene, and SNP identifiers. Currently includes information from dbGaP, the NHGRI GWAS Catalog, and GTEx. Displays results on the genome, on sequence, or in tables for download.



PubChem Power User Gateway (PUG): PUG provides access to PubChem services via a programmatic interface. PUG allows users to download data, initiate chemical structure searches, standardize chemical structures and interact with the E-utilities. PUG can be accessed using either standard URLs or via SOAP.

PubChem Standardization Service: Standardization, in PubChem terminology, is the processing of chemical structures in the same way used to create PubChem Compound records from contributors' original structures. This service lets users see how PubChem would handle any structure they would like to submit.



PubMed Tutorials: A collection of web and Flash tutorials on PubMed searching and linking, saving searches in My NCBI, using MeSH, and other PubMed services.

Related Structures: The Related Structures tool allows users to find 3D structures from the Molecular Modeling Database (MMDB) that are similar in sequence to a query protein. Although the query protein may not yet have a resolved structure, the 3D shape of a similar protein sequence can shed light on the putative shape and biological function of the query protein.

SNP Database Specialized Search Tools: A variety of tools are available for searching the SNP database, allowing search by genotype, method, population, submitter, markers and sequence similarity using BLAST. These are linked under "Search" on the left sidebar of the dbSNP main page.


Sequence Viewer: Provides a configurable graphical display of a nucleotide or protein sequence and features that have been annotated on that sequence. In addition to use on NCBI sequence database pages, this viewer is available as an embeddable webpage component.  
 

TaxPlot: A tool for comparing genomes on the basis of the protein sequences they encode. To use TaxPlot, one selects a reference genome and two species for comparison. Pre-computed BLAST results are then used to plot a point for each predicted protein in the reference genome, based on the best alignment with proteins in each of the two genomes being compared.
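
The plotting step reduces to computing one (x, y) pair per reference protein: the best score against genome A and the best score against genome B. The sketch below assumes the pre-computed scores are held in plain dictionaries and that a protein with no hit gets a score of 0. Both choices, along with the names and numbers, are assumptions for illustration.

```python
def taxplot_points(hits_a, hits_b):
    """Return {protein: (best score vs genome A, best score vs genome B)}.

    hits_a, hits_b: {reference_protein: [alignment scores]}
    Proteins with no hit in a genome get 0.0 on that axis (an assumption).
    """
    points = {}
    for protein in sorted(set(hits_a) | set(hits_b)):
        x = max(hits_a.get(protein, []), default=0.0)
        y = max(hits_b.get(protein, []), default=0.0)
        points[protein] = (x, y)
    return points


# Invented scores for three reference proteins.
hits_a = {"p1": [310.0, 120.0], "p2": [85.0]}
hits_b = {"p1": [295.0], "p3": [40.0]}
for protein, (x, y) in taxplot_points(hits_a, hits_b).items():
    print(protein, x, y)
```

Points near the diagonal are proteins equally similar to both genomes, while points far from it are candidates for lineage-specific conservation.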



Taxonomy Statistics: Displays the number of taxonomic nodes in the database for a given rank and date of inclusion.

Taxonomy Status Reports: Displays the current status of a set of taxonomic nodes or IDs.

Variation Reporter: A tool designed to search for and report human sequence variation data from dbSNP and dbVar. Individual variations or batch files can be submitted in HGVS, GVF or BED formats. Related information will be retrieved and reported in a downloadable table containing variation identifiers, nucleotide and cytogenetic band locations on various genomic assemblies, allele type and minor allele frequencies, predicted functional consequences (missense, nonsense, frameshift, splice site, etc.), reported clinical significance, and relevant citations.

VecScreen: A system for quickly identifying segments of a nucleic acid sequence that may be of vector origin. VecScreen searches a query sequence for segments that match any sequence in a specialized non-redundant vector database (UniVec).


Viral Genotyping Tool: This tool helps identify the genotype of a viral sequence. A window is slid along the query sequence and each window is compared by BLAST to each of the reference sequences for a particular virus.
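
The sliding-window idea can be sketched as follows. To keep the example self-contained, BLAST is replaced here by a plain percent-identity comparison against references that are assumed to be pre-aligned to the query and of equal length. That substitution, the function names, and the toy sequences are all assumptions for illustration.

```python
def windows(seq, size, step):
    """Yield (start, subsequence) for each full window along seq."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def identity(a, b):
    """Fraction of matching positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def genotype_windows(query, references, size, step):
    """Return [(window start, best-matching genotype)].

    references: {genotype: sequence in query coordinates}
    Percent identity stands in for the BLAST comparison.
    """
    calls = []
    for start, window in windows(query, size, step):
        scores = {name: identity(window, ref[start:start + size])
                  for name, ref in references.items()}
        calls.append((start, max(scores, key=scores.get)))
    return calls


# Toy recombinant: first half resembles genotype "A", second half genotype "C".
query = "AAAAAAAACCCCCCCC"
references = {"A": "A" * 16, "C": "C" * 16}
print(genotype_windows(query, references, size=8, step=8))  # [(0, 'A'), (8, 'C')]
```

Reporting a call per window, not one call for the whole sequence, is what lets this approach reveal recombinant sequences whose best match changes along their length.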
