
the genome size of RGD3.0, the organism with the 473-gene genome, is 531 kbp according to the paper. this means that the genome encompasses ~531,000 nucleotides (base pairs). for reference, the human genome encompasses approximately 3.3 billion base pairs.

you can download the human genome from here http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.... (it is recommended to use ftp, see: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/). it is an ~800mb download.

if it takes 800 mb to store 3.3 billion bases, i would assume it takes (531k / 3.3b) * 800 mb ≈ 0.129 mb to store this genome.
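a minimal sketch of that back-of-envelope estimate, using only the rough figures quoted above. the function names are made up for illustration, and the 2-bits-per-base packing is an assumption added for comparison (four possible bases fit in 2 bits), not something stated about the download itself.

```python
# Back-of-envelope storage estimate; all inputs are the rough figures from the comment.
HUMAN_BASES = 3.3e9           # approximate human genome length, in base pairs
HUMAN_DOWNLOAD_MB = 800       # approximate size of the hg38 download
SMALL_GENOME_BASES = 531_000  # 531 kbp


def scaled_size_mb(bases, ref_bases=HUMAN_BASES, ref_mb=HUMAN_DOWNLOAD_MB):
    """Scale the reference download size linearly by genome length."""
    return bases / ref_bases * ref_mb


def two_bit_size_mb(bases):
    """Size if each base (A, C, G, T) is packed into 2 bits, ignoring any header."""
    return bases * 2 / 8 / 1e6


if __name__ == "__main__":
    print(f"scaled from download: {scaled_size_mb(SMALL_GENOME_BASES):.3f} MB")
    print(f"2 bits per base:      {two_bit_size_mb(SMALL_GENOME_BASES):.3f} MB")
```

both estimates land at roughly 0.13 mb, so the linear scaling is consistent with a plain 2-bit encoding of the sequence.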



Thanks! By the way, when you say "you can download the human genome", presumably this is the genome of a specific individual whose DNA was sequenced? What person is that?


no, it is a consensus sequence determined by sequencing the genomes of several individuals. i do not know how many individuals were sequenced for hg38, but the previous assembly, GRCh37 (hg19 at UCSC), was derived from 13 individuals according to [1].

genomic regions with higher variability between individuals are annotated separately and are not represented in the main chromosomal sequence, but they can still be aligned to it.

[1] https://en.wikipedia.org/wiki/Reference_genome


You can download Craig Venter's genome here:

http://hgdownload.soe.ucsc.edu/goldenPath/venter1/bigZips/

However, the human "reference" genome (the updated main product of the Human Genome Project) is a patchwork of about 20 donors' genomes. Many (most?) of the donors are anonymous volunteers from Buffalo, NY, USA.

http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/



