• tetris11@lemmy.ml
    10 days ago

    3 billion nucleotides, and each nucleotide can be one of 4 bases, so each one takes 2 bits to encode. That makes 6 billion bits of info, or (6e9 / 8) = 750 MB of data.

    But if all of the sequence were used for data, then the sperm wouldn’t be a sperm. If we reserve the ~20,000 coding genes, which make up ~2% of the genome, that still leaves us with (750 * 0.98) = 735 MB.

    But an organism is more than its gene templates. It also has functional regions where things bind, block things, and join other things, and we’re not entirely sure what share of the non-coding regions these make up. I’m gonna go with 80% of the genome, which on top of the 2% coding leaves 18% free: (750 * 0.18) = 135 MB.

    Since the sperm cell is haploid, it has 23 chromosomes instead of the 46 (23 pairs), so it has half the data redundancy of normal DNA. We might also need to add our own error-correcting codes, which will take up some of the space. I’m gonna pull a factor of 3.6 out of my ass, and thus (135 / 3.6) - voila - 37.5 MB
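    The whole back-of-the-envelope chain can be sketched in a few lines. This is only a restatement of the guesses above, not a real estimate: the constant and function names are made up, the 18% figure is read as 100% minus 2% coding minus 80% functional non-coding, and the 3.6 factor is the same admitted fudge.

```python
# All figures are the guesses from the comment above, not measured values.
N_BASES = 3_000_000_000        # nucleotides in a haploid genome (rounded)
BITS_PER_BASE = 2              # 4 possible bases -> 2 bits each
CODING_PCT = 2                 # ~20,000 coding genes
FUNCTIONAL_NONCODING_PCT = 80  # guessed share taken by functional regions
FUDGE_FACTOR = 3.6             # redundancy / error-correction hand-wave


def estimate_mb() -> dict:
    """Return each intermediate size in MB (1 MB = 1e6 bytes)."""
    total = N_BASES * BITS_PER_BASE / 8 / 1e6
    non_coding = total * (100 - CODING_PCT) / 100
    # Assumption: the free share is whatever is neither coding nor functional.
    free_pct = 100 - CODING_PCT - FUNCTIONAL_NONCODING_PCT
    free = total * free_pct / 100
    usable = free / FUDGE_FACTOR
    return {
        "total": total,
        "non_coding": non_coding,
        "free": free,
        "usable": usable,
    }


if __name__ == "__main__":
    for step, size in estimate_mb().items():
        print(f"{step:>10}: {size:.1f} MB")
```

    Changing any one of the guessed percentages moves the final number a lot, which is rather the point.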

  • RejZoR@lemmy.ml
    10 days ago

    Hey, I can fit that load on my Samsung SSD and still have space to install Doom The Dark Ages. Win win!

  • xia@lemmy.sdf.org
    9 days ago

    If that’s half of it, then a “full human” could be defined by an 80 MB file?

    • TauZero@mander.xyz
      9 days ago

      The numbers in the meme are off. One sperm is 750 MB, or about 1 CD, so a full human is 2 CDs. Or a couple of 1.44 MB floppies if you only store the diff from the reference genome.