How much information is in DNA?
89 crescit_eundo 65 5/8/2025, 5:42:33 PM dynomight.substack.com ↗
- DNA methylation (https://en.wikipedia.org/wiki/DNA_methylation)
- Interactions of alleles (what article refers to as the "two versions of each base pair")
- Duplications, deletions, inversions, and other structural variations (https://www.genome.gov/genetics-glossary/Structural-Variatio...)
- Physical proximity interactions in 3-dimensional space (https://cmbl.biomedcentral.com/articles/10.1186/s11658-023-0...)
- Combinatorial effect (massive) of different alleles in complex systems
Overall, it's not sensible to compare a linear sequence of bits, like a CD (sibling comment) or DVD (the article), to the linear sequence of the genome and conclude that their information content, based on length alone, is in any way comparable.
And in that context, hundreds of MBs is a heck of a lot of complexity.
Not that it means they can’t be right, but the author also doesn’t seem to have any particular expertise in genetics. Their ideas need to survive a lot more criticism by people who know what they’re talking about before you could start to see them as convincing.
The laws of physics are another component.
From there you would need to simulate nature to be able to decompress all the data, like how computer programs can use procedural generation.
Imagine a game like Minecraft. You can generate practically infinitely many screenshots of Minecraft worlds, but all that data can be derived from the game code and the jvm.
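A minimal sketch of that idea, not how Minecraft itself works: a tiny seed plus a fixed generator expands deterministically into a much larger world. The function name `generate_world` and the four-character tile set are made up for illustration.

```python
import random


def generate_world(seed: int, width: int = 64, height: int = 16) -> list[str]:
    """Deterministically expand a small integer seed into a grid of terrain tiles."""
    rng = random.Random(seed)  # private generator, so output depends only on the seed
    tiles = "~.^#"  # hypothetical tile set: water, plain, hill, rock
    return ["".join(rng.choice(tiles) for _ in range(width)) for _ in range(height)]


world_a = generate_world(42)
world_b = generate_world(42)
print(world_a == world_b)                # the same seed always yields the same world
print(sum(len(row) for row in world_a))  # 1024 tiles derived from one small integer
```

Everything in the output is recoverable from the seed and the generator code, which is the sense in which the screenshots carry no extra information.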
This sounds a bit suspect. A maximally compressed version would be very sensitive to mutations, which wouldn't be great for adaptation via mutations. My understanding is that only a small fraction of mutations lead to unviable phenotypes.
Also, AFAIK the current understanding is that the majority of DNA is "junk", i.e., it doesn't seem to affect the phenotype. That would be a partial explanation for the above.
The process of genetic expression is indeed something like procedural generation, but if maximal compression is about something like Kolmogorov complexity, the produced phenotype doesn't contain more information than the genetic information.
In the end, the author literally says: "nobody knows". Yes, you cannot compare a linear sequence of bits to a macromolecule that interacts structurally with its environment, and the author does not make that claim. The question he tries to answer is: how much data is needed to re-create a similar macromolecule that interacts in a similar way. His main point, in which you both agree: only the exons are surely not enough because the encoded proteins are just a (small?) part of how DNA interacts.
Show me how this isn't more confusing than useful as an explanation, even for the bright ten-year-old or so at whose level it appears to be pitched, and I'll grant it may have some value.
We know now that environmental factors change how DNA is expressed as well through epigenetics.
I don't know how any of it works. Something to do with the shape of the DNA when it is wound up and how that changes the output when RNA produces proteins.
This is how parents can do things like pass some of the athleticism they earn through training to their children. It is possible for athletic parents to pass genes in such a way that it produces children even more athletic than they were.
All of this means that DNA has the ability to encode information and produce proteins in different ways using the same sequences.
So I am guessing that a lot of the DNA that is considered "junk" may not actually be. They are just missing a piece of the puzzle in how it gets read in.
1. Maaaaybe you could make a case for DNA methylation, but that still requires some DNA signatures so ...
It doesn't matter much, unless you use it to sneak in what you think we should care about, or use it to make philosophical arguments whose circularity is carefully hidden.
Millions of chimeric cells on the same petri dish? That's 1PB on a single glass slide.
Depending on the sequencing tech paired with the rise of Spatial data, the read speed could be formidable.
Needlessly complex setup though. Let's just stick with metals for now.
https://en.wikipedia.org/wiki/G_banding
Which is a bummer because it is circular. There is also a point on the strand where two separate genes overlap. The end of one has the same code as the beginning of another.
So even DNA has its own native compression scheme.
[1] https://en.m.wikipedia.org/wiki/Xeno_nucleic_acid
How much information can you __store__ in DNA without affecting the organism too much?
Pretty sure the substack and main site are the same. First paragraph is at least.
It's fascinating to realize that the "messiness" of DNA isn't a bug, but a feature—a side effect of evolution's raw material supply chain.
Mutations, repeats, transposons, and imperfect repairs all contribute to a noisy genomic landscape. But it's exactly this noise that enables biological diversity. No mutations, no variation. No variation, no selection. No selection, no evolution.
The genome is not a blueprint—it's a living, adapting scratchpad. Messiness is the canvas on which nature paints diversity.
Godless evolution suggests randomness produced all of it over time. Yet, that's never worked in anything we've built. Even our GAs required laws, an environment, a computer, software, and fine-tuning. Pre-existing or by intelligent design (human inventors). Without these, it produced no results.
So, I'll correct you by saying empirical data suggests evolution didn't produce this. We're seeing God's design skills in adaptive, resilient, complex, self-replicating systems. His work is truly beautiful to behold. Humans still can't produce something similar from scratch. Actually, they can't even be sure how the existing design works.
Nope. Randomness _and_ a selection function. Natural selection, ie: surviving to create the next generation.
> Yet, that's never worked in anything we've built.
It works completely fine in things we've built. We don't have the processing power to simulate something on the scale of the computational complexity happening in a small tide pond, though. But you can see 'evolution by natural selection' in a rule set as simple as Game of Life.
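For anyone who hasn't seen how small that rule set is, here is a minimal sketch of Conway's Game of Life on an unbounded grid. It only shows the rules and a glider moving under them; the function name `step` and the set-of-coordinates representation are choices made for this example.

```python
from collections import Counter


def step(live: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Advance one generation; `live` holds the (x, y) coordinates of live cells."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
state = glider
for _ in range(4):
    state = step(state)

# After 4 generations the glider has the same shape, shifted by (1, 1).
print(state == {(x + 1, y + 1) for x, y in glider})
```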
> Even our GA's required laws, an environment, a computer, software, and fine-tuning. Pre-existing or by intelligent design (human inventors). Without these, it produced no results.
The laws/environment/computer are the equivalent of having a universe with physical laws. If you want to claim that god created the universe and tuned the constants of the universe, well, maybe. Or maybe every possible universe exists and we're just not around in the ones that don't lead to conscious life, in the same way that Game of Life universe is too simple/constrained to evolve conscious life on the scales we can simulate.
https://www.youtube.com/watch?v=vHb07ynsPgo
It takes more than that for the chemical bonds to form, for the encoding to exist, for the bootstrapping environments to form, for the transitions to happen, and so on. Also, if a selection function exists, where did it come from and why does it work? Why does the math work? Why isn't math less useful or changing constantly?
"But you can see 'evolution by natural selection' in a rule set as simple as Game of Life."
That's false. You're repeating the same false premises as in the original claim I refuted. If godless and random could do it, then the questions below would all be No.
Does the game run in an environment made by intelligent designers? Does that environment need to be maintained?
Does it require rules made and maintained by intelligent designers?
Does it take an initial state in those rules to get to the specific outcomes you are looking for?
Does it produce simple, temporary patterns that are useless? Or complex machinery that's actually useful?
Or did all of the above happen randomly, keep happening, and produce increasingly complex and useful things?
"Or maybe every possible universe exists"
Science starts from observations to produce hypotheses. That is a faith-based belief popular in science fiction. It's also sort of a cop out because they're going to imagine something as infinite as God, but not mention God, to hope this would pop out randomly. If one does, they still have the "maintain it with stability over long periods" problem for that or those universes. They'll probably drag it deeper into infinity to say it will finally happen accidentally. Let's do science instead.
What we observe is a universe that is highly chaotic, almost every cubic inch is deadly, and the safest places are dead. We see nothing happening from it with Earth and humans being mind-boggling exceptions. Looking deeper at classical physics, we find reality itself also emerges in an orderly fashion from endless, quantum events that should be too random to support order. It also appears to work perfectly without failure for long periods of time.
We've also observed countless phenomena that are truly random and chaotic, like July 4th fireworks, which never produce life or complex machines. Never self-replicating artifacts whose complexity increases over time. Never emergent intelligence from anything that didn't show evidence of design or have human input. We have billions of observations of chaotic events which themselves sometimes have a high magnitude of particles, chemicals, etc. Also, nothing lasts on its own due to physics with our intelligent designs requiring maintenance over time.
Our first hypothesis is that our reality should be total chaos. Our second hypothesis is something with unimaginable power is forcing a specific order to consistently come out of chaos. Third hypothesis is that the universe doesn't support life without being forced to. Fourth hypothesis is an intelligent being went uphill against the deadly universe to create us and our planet. Fifth hypothesis is that being is sustaining us despite a whole universe of threats to our lives. Sixth is that the creator is perfect. God is the Occam's Razor explanation of all of this.
There's also revelatory knowledge. God revealed Himself to us via His Word which came with prophecies, miracles, and testable predictions about lifestyles. Jesus, who died for humanity's sins, had a perfect life on top of the same, other attributes. Neither nobody nor nothing else had these traits to support their claimed revelations. So, outside empirical knowledge, revelatory knowledge reinforces the God theory into a highly-proven, saving belief. The life transformations that follow add anecdotal evidence to it.
Scientists tell us all ideas, whether a proposal or dissent, are evaluated strictly on evidential merit. Yet, evolution as origin of life had little evidence, many flaws, was forced on people anyway, and dissenting papers aren't allowed.
If it is dogmatic, and dissent isn't allowed, it is not science at all. Just a godless religion or political domination done with scientific wording in their papers. A consensus by people who force everyone to think one way isn't a scientific consensus. A theory whose rebuttals aren't even allowed in scientific journals isn't a scientific theory.
Until alternatives are allowed, and a real debate happens, I reject macro-evolution as either the truth or even a scientific consensus. I'll throw in some example counters, most being strong, which I wasn't taught in high school or college.
https://www.epm.org/resources/2010/Oct/3/ten-major-flaws-evo...
https://www.icr.org/article/four-scientific-reasons-that-ref...
> Mutation, chaos, and randomness may actually be the fertile ground where biological diversity emerges.
At the same time, I fully agree with your key point:
> "The adaptive, complex, self-replicating systems we see don’t persist just because of pure randomness."
In my view, this doesn’t necessarily mean a “God” designed it in a human-like way. But it does point to a deeper structural order and cosmic regularity.
Maybe we can call it a kind of “design of laws,” rather than a personal designer.
After all, nature seems to operate within a set of elegant, consistent rules:
- F = ma (Newton's 2nd Law): A foundational rule in classical mechanics.
- E = mc² (Einstein): Energy and mass are interchangeable.
- V = IR (Ohm’s Law): Governs how voltage, current, and resistance relate.
- a² + b² = c² (Pythagorean Theorem): Geometry’s timeless backbone.
- Entropy always increases (2nd Law of Thermodynamics): Order tends toward disorder unless something resists it.
So maybe we can say:
- In religious terms, this is “God’s design.”
- In philosophical terms, it’s the “underlying order of the universe.”
- In scientific terms, it’s the “laws of nature, structural stability, and the boundary conditions of evolution.”
This doesn't fit what random, survival-oriented processes produce. It doesn't fit what random, chaotic systems produce. It looks more like an intelligent being designed and maintained the universe. That should amaze you.
They also hardwired us with a specific morality. Children are born looking for God, wanting to be loved, and with a sense of justice (fairness). That the creation has these morals implies the creator either has them or knows of them. If people have done evil, they should be quite afraid of what that implies.
Divine revelation came later with miracles as proof. God's Word told us we have to seek God, love others, do good, and do justice. That fits with our natural design. That specific God fits the profile of one who would design that elegant universe with only human life in it. That should reinforce the need to repent and follow Christ, or burn alive. In His Word, he also said He created us very personally before He began driving those laws you're talking about.
This rules out pretty much every nutty theory which evolutionary psychologists propose, such as that we evolved for altruism, that we evolved to believe in religion, etc. Complete B.S. Exactly how much information would you need to specify a behavior like being predisposed to a belief in religion? There's less than 80 minutes of music's worth of information in our genomes, and most of that is concerned with just keeping us alive.
You are not predisposed to be anything. Go create the kind of person you want to be.
Or an awful lot of text information (state-of-the-art compressors can reach up to a 1:10 ratio for plain text and the decoder itself is rather small, so 750MB compressed could potentially contain something like 7GB of text data).
Also, look at the demoscene. A 4k intro (4 kB is the size of the executable) can do crazy things, and 64kB can fit a lot of nice 3D objects, music, text, complex effects etc. while weighing less than any screenshot of any moment of the running demo. In 95kB you can have a full game (google .kkrieger).
P.S. better example: full snake game in 56 BYTES https://github.com/donno2048/snake
For comparison, the link above is 34 bytes and the whole sentence is 83 bytes. It's possible to do a lot if we're talking about code.
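Those two byte counts are easy to check. A quick sketch, assuming the sentence meant is the "P.S." line above exactly as written and that it is encoded as UTF-8:

```python
url = "https://github.com/donno2048/snake"
sentence = "P.S. better example: full snake game in 56 BYTES " + url

# Both strings are pure ASCII, so one character is one byte in UTF-8.
print(len(url.encode("utf-8")))       # 34
print(len(sentence.encode("utf-8")))  # 83
```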
There are limits to how Kolmogorov complexity scales up. Many of these tricks are exploiting procedural techniques that can be expressed in minimal terms. Once you start feeding in actual information that is not feasible to express procedurally (i.e., is already compressed/high-entropy), you are forced to accumulate bits. An obvious example of this would be incorporating a texture that is multiple megabytes when compressed as a jpeg on disk.
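A rough way to see the difference, using zlib as a stand-in (it is only a practical compressor, not a measure of Kolmogorov complexity): data with a short procedural description shrinks dramatically, while high-entropy bytes do not shrink at all. The helper name `compressed_size` and the specific test patterns are assumptions made for this sketch.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of `data` after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


size = 1_000_000
procedural = bytes(i % 251 for i in range(size))  # regular pattern with a short description
high_entropy = os.urandom(size)                   # stands in for an already-compressed texture

for name, data in (("procedural", procedural), ("high-entropy", high_entropy)):
    print(f"{name}: {len(data)} -> {compressed_size(data)} bytes")
```

The random buffer comes out at roughly its original size (slightly larger, because of container overhead), which is the "forced to accumulate bits" case.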
> An obvious example of this would be incorporating a texture
Some random range of storage data is now the texture. It was used to process formatting logic but now also a texture
Yet at the same time the result of this random code is extremely compressed, to the point we compare it to procedural generative code.
Not sure what we can do with this but it certainly seems like we can once again get inspired by nature on this one.
To expand upon your compression idea, the index it is using exists outside the DNA encoding itself which means it could be holding an absolute ton of data.
Bonus: https://xkcd.com/3056/
I am fond of the analogy of DNA to procedural generation. The "operating system", as I see it, is physics. Everything else is primitives built on top of that.
Our brains can't begin to comprehend the untold multitudes of interactions occurring at a molecular scale over geologic time.
That’s a very misleading take: this is lossless audio, and the majority of the bits are spent encoding noise. You can encode way more audio at a perceptually, but not technically, lossless level in that space.
What an insanely bad take.
Not only did you not read and/or comprehend the article, the article itself undersells the information content of the genome (I'll post on this at the top level).
> You are not predisposed to be anything.
This does not logically follow from your preceding statement, even if we were to accept the foregoing limited information content as factual.
What you can learn will just swamp any predispositions you have.
1. You can dramatically increase the amount of information stored by compression. Uh, no. Information content is measured, as it were, "post-compression". There is a limit to how much information you can store in a gigabyte, and that limit is--a gigabyte.
2. Information content is not just stored in the DNA, but also in all of the ancillary proteins, etc. Well, the information on how to create the proteins themselves is stored in DNA. Any additional information, then, has to be contributed by the environment. But that is exactly my point--environment matters way more than the information stored in the DNA.
3. "you are wrong", "you are a mathematical ignoramus" etc etc is ad hom. It is not a valid argument, contributes nothing to the conversation, and is not a good look. If you disagree let's see some math.
4. No matter how much information you think is stored in DNA, the amount of information stored in your brain is at least 5 orders of magnitude larger. The information you can learn swamps out any predispositions you might have. Go become the kind of person you want to be.
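Since point 1 asks for math, here is a minimal sketch of the "post-compression" idea: the Shannon entropy of a sequence in bits per symbol, and the resulting ceiling for a four-letter alphabet. The figure of 3.1 billion base pairs is an assumption used for the arithmetic, not a number from this thread, and this per-symbol estimate ignores correlations between positions, so real sequences may compress further.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(seq: str) -> float:
    """Empirical Shannon entropy of `seq`, treating each symbol as independent."""
    counts = Counter(seq)
    total = len(seq)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Four equally likely bases hit the maximum of log2(4) = 2 bits per base.
print(entropy_bits_per_symbol("ACGT" * 1000))  # 2.0

# A skewed composition carries less than 2 bits per base.
print(entropy_bits_per_symbol("AAAAAAAC" * 1000) < 2.0)  # True

ASSUMED_BASE_PAIRS = 3_100_000_000  # assumption: rough size of one human genome copy
upper_bound_megabytes = ASSUMED_BASE_PAIRS * 2 / 8 / 1e6
print(upper_bound_megabytes)  # 775.0
```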
I think a more apt comparison would be that of an LLM of that size. qwen:0.5b is about 400MB, and its abilities are laughable compared to the likes of ChatGPT, but it can, for instance, write coherently about general topics.
It is not a statement about LLMs, more about what you can achieve with "just" 400MB of storage. The other similarity is that LLMs are also "messy". If you want to see the results of finely crafted work in a really small amount of space, look at what sizecoders can do with a few kB of code or less.