Print Article: KA-03570

Who generates genome assemblies that are present in the NCBI databases?

NCBI receives genome sequencing and assembly data from individual researchers as well as from large sequencing centers. Complex eukaryotic genomes are often sequenced and assembled by consortia of collaborating groups from across the world (examples). NCBI does not conduct any type of nucleotide sequencing, nor does it generate genome assemblies from data found in the NCBI databases. However, some NCBI staff actively participate in the Genome Reference Consortium (GRC), which is responsible for maintaining and improving the human, mouse, zebrafish, and chicken reference assemblies.

To make their assemblies publicly available, researchers submit them to NCBI (via GenBank) or to another member of the International Nucleotide Sequence Database Collaboration (INSDC). (The GRC also submits its assembly data to INSDC.) All levels* of assembly are acceptable: contig, scaffold, chromosome, or complete. Currently, most researchers who sequence entire genomes submit them as Whole Genome Shotgun (WGS) submissions. NCBI also accepts raw (unassembled) sequencing reads*, which submitters deposit separately into the Sequence Read Archive (SRA).

Submitters may or may not annotate their assemblies (that is, indicate the locations of genes and other features on the DNA) before submitting them to GenBank. NCBI processes all submitted assemblies.

*See the article on assembly processes and assembly levels for details.