Ask HN: Why are PDFs so hard to edit?

5 superconduct123 5 6/17/2025, 2:26:04 AM
What is it about the underlying format that makes it so difficult to edit a PDF?

Comments (5)

necovek · 8h ago
Because it was designed as a graphical output format, not an editable format.

Some of the "compression" tricks it allows (e.g. font subsetting, or even remapping characters so text takes fewer bits to encode) preserve only the appearance of the data; the semantic encoding is lost (for example, "A" may stand for "#").
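To make the remapping point concrete, here is a minimal Python sketch. It is a toy model, not real PDF font handling: `subset_encode`, `naive_extract` and `mapped_extract` are hypothetical names, and starting the codes at 0x41 ("A") is an arbitrary choice for illustration. The reverse table plays the role of the optional ToUnicode map a PDF font can carry; when that map is missing, only the glyph shapes survive.

```python
def subset_encode(text):
    """Assign each distinct character a compact code in order of first use.

    Toy model of a subset font: codes start at 0x41 ('A') purely for
    illustration, so the stored bytes look like ordinary letters.
    """
    code_for = {}
    codes = []
    for ch in text:
        if ch not in code_for:
            code_for[ch] = 0x41 + len(code_for)
        codes.append(code_for[ch])
    # Reverse table: stored code -> original character.
    to_unicode = {code: ch for ch, code in code_for.items()}
    return bytes(codes), to_unicode


def naive_extract(stored):
    """What a tool sees if it treats the stored codes as plain text."""
    return stored.decode("latin-1")


def mapped_extract(stored, to_unicode):
    """Recover the real text, which only works if the reverse map was kept."""
    return "".join(to_unicode[code] for code in stored)


stored, to_unicode = subset_encode("#1 #2")
print(naive_extract(stored))               # ABCAD
print(mapped_extract(stored, to_unicode))  # #1 #2
```

The page would render "#1 #2" either way, because the font draws the right shapes for codes A, B, C and D; an editor without the reverse map only sees "ABCAD".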

It's actually quite similar in nature to TeX's DVI format (boxes and their positions), though obviously not a bitmap format but a vector one with all the deps embedded.

This means that, for instance, using non-default kerning and whitespace can turn all the text into one box per character, each positioned individually on the page.
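A rough Python sketch of what an editor then has to do: guess lines and word breaks back from coordinates alone. The `Glyph` record, the `rebuild_lines` helper and both thresholds are assumptions made up for this example, not how any particular PDF library works.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the glyph box
    y: float      # baseline; larger y is higher on the page
    width: float


def rebuild_lines(glyphs, line_tol=2.0, space_gap=1.5):
    """Group glyphs into lines by baseline, then infer spaces from gaps.

    line_tol and space_gap are guessed thresholds in page units.
    """
    lines = []
    for g in sorted(glyphs, key=lambda g: -g.y):  # top of page first
        if lines and abs(lines[-1][0].y - g.y) <= line_tol:
            lines[-1].append(g)
        else:
            lines.append([g])

    result = []
    for line in lines:
        line.sort(key=lambda g: g.x)  # left to right within a line
        text = line[0].char
        for prev, cur in zip(line, line[1:]):
            # A gap wider than space_gap is assumed to be a word break.
            if cur.x - (prev.x + prev.width) > space_gap:
                text += " "
            text += cur.char
        result.append(text)
    return result


# Glyphs stored in drawing order, which need not be reading order.
glyphs = [
    Glyph("b", 19.0, 680.0, 5.0),
    Glyph("i", 16.2, 700.3, 3.0),
    Glyph("a", 10.0, 680.0, 5.0),
    Glyph("H", 10.0, 700.0, 6.0),
]
print(rebuild_lines(glyphs))  # ['Hi', 'a b']
```

Note that the space in "a b" never existed in the data; it is inferred from a gap. Pick the thresholds slightly wrong and words merge or split, which is part of why edited PDFs so often reflow badly.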

superconduct123 · 7h ago
I see, so it's like a lower-level format than, say, a Word doc or Markdown

k310 · 8h ago
There's a pretty decent explanation here:

https://mailmergic.com/blog/why-pdf-are-hard-to-edit/

The most compelling tidbit I found was this:

> The Technical Architecture of PDF: A Labyrinth of Objects

> Beneath the surface, PDF files are complex compositions made up of objects: text blocks, images, vectors, fonts, metadata, and instructions for rendering. These elements are often stored in fragmented sequences that are optimized for viewing rather than editing. The text is not always stored in logical reading order, and words may be divided into separate character objects placed precisely on the page based on coordinates.

Lots more there. No more spoilers.

PaulHoule · 8h ago
Maybe 10 years ago I was a student of file formats and I actually liked PDF as it had a clear theory of how you serialize a graph of objects. It's more like the old Microsoft Word format or the current DOCX and much better than the atrocious PSD format. PDF is a good format for one developed in the 1990s for what it was intended to do.
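
As a rough illustration of the "graph of objects" idea, here is a simplified Python sketch. PDF does write numbered objects as `N 0 obj ... endobj`, refers to them as `N 0 R`, and keeps a cross-reference table of byte offsets; everything else here is an assumption for the example. The `serialize` and `read_object` helpers are hypothetical, and the output is not a valid PDF (no header, xref section or trailer).

```python
def serialize(objects):
    """Write numbered objects back to back and record each byte offset.

    objects maps object number -> body text; references inside a body
    are written as 'N 0 R'. The offsets dict stands in for the xref table.
    """
    out = bytearray()
    offsets = {}
    for num in sorted(objects):
        offsets[num] = len(out)
        out += f"{num} 0 obj\n{objects[num]}\nendobj\n".encode("ascii")
    return bytes(out), offsets


def read_object(data, offsets, num):
    """Jump straight to one object via its offset, without scanning the file."""
    start = offsets[num]
    end = data.index(b"endobj", start)
    # Drop the 'N 0 obj' header line and keep only the body.
    return data[start:end].decode("ascii").split("\n", 1)[1].strip()


# A tiny graph with a cycle: the page points at its parent and vice versa.
objects = {
    1: "<< /Type /Catalog /Pages 2 0 R >>",
    2: "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    3: "<< /Type /Page /Parent 2 0 R >>",
}
data, offsets = serialize(objects)
print(read_object(data, offsets, 3))  # << /Type /Page /Parent 2 0 R >>
```

Because references are by number rather than by position, cycles and shared objects serialize cleanly, and a reader can fetch any object directly from the offset table.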
fuzzfactor · 8h ago
>Why are PDFs so hard to edit?

This is by design.

IIRC the original objective was to require a costly proprietary program from Adobe called "Acrobat" to create the file to begin with, and it was intended not to be edited. Rather it was supposed to be readable and printable with good consistency between PCs and Macs.

"Acrobat Reader" has always been free, to help popularize the format and make sure that anybody could open and read the file. But no editing for you the user. And the "publishers" who routinely generated the early PDFs using the full Acrobat suite wanted to distribute documents for people to trust that they had not been edited from the source. At least not as easily as a Word DOC file could be edited.