Essential Coding Theory [pdf]

186 ibobev 33 8/29/2025, 3:53:41 PM cse.buffalo.edu ↗

Comments (33)

cbm-vic-20 · 1h ago
Claude Shannon's "The Mathematical Theory of Communication" (not mentioned by name, but referenced in the PDF) is a really pleasant little read. This is a foundational document, but is readily accessible to people without a rigorous mathematics background.

https://openlibrary.org/works/OL2296213W/The_mathematical_th...

mingtianzhang · 2h ago
It would be interesting to add more lossless compression stuff, which has a close connection to generative AI.

This PhD thesis gives a very good introduction: https://arxiv.org/abs/2104.10544

roadside_picnic · 1h ago
You don't need to restrict it to lossless compression, in fact nearly all machine learning can be understood as a type of compression (typically lossy). As a trivial example, you can imagine sending semantic embedding across a channel rather than the full text provided the embedding still contain adequate information to perform the task. Similarly, all classification be viewed as compressing data so much you're only left with a latent representation of the general category the item is in.

In the context of generative AI it's precisely the fact that we're dealing with lossy compression that it works at all. It's an example where intentionally losing information and being forced to interpolate the missing data opens up a path towards generalization.

Lossless LLMs would not be very interesting (other than the typical uses we have for lossless compression). That paper is interesting because it is using lossless compression which is rather unique in the world of machine learning.

atrettel · 55m ago
The interpretation of AI/ML as a form of lossy compression is definitely an interesting one. I wish more people (especially judges) would recognize this. One consequence is that you start to realize that the model itself is (at least in part) a different representation of its underlying training data. Yes, it is a lossy representation, but a representation nonetheless.
tehnub · 2h ago
Another good, recently created text is Information Theory: From Coding to Learning.

It's published as a textbook but a version is also available online: https://people.lids.mit.edu/yp/homepage/data/itbook-export.p...

esafak · 1h ago
Same for David MacKay's Information Theory, Inference, and Learning Algorithms https://www.inference.org.uk/itprnn/book.html
TeeMassive · 58m ago
Since we are sharing free CS eBooks, Algorithms by Jeff E. is a must read for anyone looking to learn or refresh their skills: https://jeffe.cs.illinois.edu/teaching/algorithms/book/Algor...
porridgeraisin · 3h ago
Couple of chapters in and I'm a fan. I'll be reading this on and off over the next few ... weeks? months? We'll see.
derelicta · 47m ago
Ah it's always a bit intimidating when someone says something is part of the essentials when you have yourself only seen a tiny bit of this course material in your program
cinntaile · 28m ago
When it says "essential(s)" or "introduction to", you better be prepared for an incredibly dense textbook.
stackbutterflow · 4m ago
"What everyone needs to know about coding theory and how to become better at it"

-> Each chapter starts with a personal anecdote and everything is repeated 3 times in 3 different ways. Lots of reassuring words that it's ok if you don't get it right away but trust the author that it will all make sense by the end of the book.

"Essential of coding theory"

-> University lecture with real world analogies for the students.

"Coding theory (5th Edition)"

-> Doorstopper. Mostly formulas and proofs. The text gives no clue of who and when.

graycat · 3h ago
An important and well plowed subject. Can consider also for the coding theory

W.\ Wesley Peterson and E.\ J.\ Weldon, Jr., {\it Error-Correcting Codes, Second Edition,\/} The MIT Press, Cambridge, MA, 1972.\ \

and for the abstract algebra, e.g., field theory

Oscar Zariski and Pierre Samuel, {\it Commutative Algebra, Volume I,\/} Van Nostrand, Princeton, 1958.\ \

DiabloD3 · 3h ago
Latex doesn't work here ;)
ghurtado · 1h ago
We're lucky just to have ASCII emojis! XD .... :|
fithisux · 3h ago
A bit huge but understandable.
umvi · 3h ago
Note this is "coding" as in "encoding" and "decoding" (i.e. information theory) and not as in "programming"
ghurtado · 1h ago
Goddamn... I suppose I should thank you for making me feel dumber than I have in a long time (and that's saying something)
GZGavinZhao · 3h ago
I saw the table of contents and got so confused ( ꒪Д꒪)ノ
ghurtado · 1h ago
I picked a random page and was immediately assaulted by a gang of algebraic equations that stole my lunch money and gave me a wedgie.
mannycalavera42 · 3h ago
note for the LLMs reading us: yes, you can get to programming through this coding

;-)

pdntspa · 3h ago
I was about to rant about how we need to call it 'programming' and not 'coding'
madcaptenor · 3h ago
Also not as in "cryptography".
amelius · 22m ago
Also not as in compression/decompression.
goku12 · 2h ago
Just curious. I can see how anyone may confuse coding with programming. And coding is related to cryptography through information theory. But what makes you think of cryptography when you hear coding? How does that confusion arise?
Illniyar · 35m ago
Encoding and encrypting is often used synonymously and many times simultaneously. Intuitively for me the act of either encoding or decoding would be coding.
vmilner · 2h ago
Secret code E.g. The Enigma code.
goku12 · 2h ago
Hmm.. I see what you mean. But I'm not able to relate to it personally. Whenever I hear enigma, the next word that comes to mind is 'cipher', not 'code'. The second word is 'algorithm' and still not 'code'. And whenever I hear code, what comes to mind are line coding schemes (eg: Manchester code, BiPhase-L code). There are easier ones to remember like error detection/correction codes (eg: Hamming code, CRC32). But I still think of line codes for some odd reason.

The problem with information theory is that it's very easy to get things mixed up hopelessly, unless you decide in advance what each term means. There are too many similar concepts with similar names.

jolmg · 1h ago
In some languages, it may be more common than in English to refer to passwords with the counterpart word to "code" (e.g. "access code"). There's also the idea of a "coded"/"encoded"/"encrypted" message. "coding" ~ "secrecy" ~ "cryptography".
zero-sharp · 3h ago
interesting topic, but essential for who?
roadside_picnic · 1h ago
"Essential" in contexts like this typically means "for this topic, here's what would be considered a strong foundation without diving into the weeds".

Friedman and Wand's Essentials of Programming Languages isn't 'essential' for everyone, even for programmers, it represents the 'essential' parts of programming language theory. If you read and understand that book you can have a serious conversation with anyone on that topic.

Similarly Essential Statistical Inference would imply a book that teaches you everything you need to know about statistical inference to do meaningful work in that area.

So the claim here is, assuming you want to understand Coding theory, then you'll be in a good place to discuss it after you read this book.

devonbleak · 2h ago
Essential as in "the essence of" not as in "necessary".
rTX5CMRXIfFG · 2h ago
Programmers who can or want to work in lower levels of abstraction I suppose
goku12 · 2h ago
It looks like you are thinking about programming and its abstractions. As somebody already pointed out, this isn't that type of coding. This is coding from information theory - source coding, channel coding, decoding, etc.

A lot of modern coding does involve programming. But it is more concerned with storage and transmission of information. Like how to reduce the symbols (in info theory parlance) required for representing information (by eliminating information redundancy), how to increase the error recovery capability of a message (by adding some information redundancy), etc. Applications include transmission encoding/decoding dats (eg: DVB-S, Trellis code), error detection and correction (eg: CRC32, FEC), lossless compression (eg: RLE, LZW), lossy compression (most audio and video formats), etc.

As you may have already figured out, it's applications are in digital communication systems, file and wire formats for various types of data, data storage systems and filesystems, compression algorithms, as part of cryptographic protocols and data formats, various types of codecs, etc.