People would argue that storing that much binary data in a SQLite database is not really appropriate. But an application file format usually has this requirement: bundle large binary data in one nice file, rather than in many files that you need to copy together to make it work.
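
As a rough illustration of the "one nice file" idea, here is a minimal sketch using Python's standard sqlite3 module. The table name `asset` and the helper functions are hypothetical, chosen only for this example, and are not taken from any real application format.

```python
import sqlite3

def open_document(path):
    """Open (or create) a single-file document with a table for binary assets."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS asset ("
        " name TEXT PRIMARY KEY,"
        " data BLOB NOT NULL)"
    )
    return conn

def put_asset(conn, name, payload):
    """Store or overwrite one binary asset under the given name."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO asset (name, data) VALUES (?, ?)",
            (name, payload),
        )

def get_asset(conn, name):
    """Return the asset bytes, or None if no asset has that name."""
    row = conn.execute(
        "SELECT data FROM asset WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]
```

Whether this is appropriate for very large blobs depends on the application; the sketch only shows that the bundling itself takes very little code.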

chmaynard · 1h ago

Dr. Hipp occasionally gets on a soapbox and extols the virtues of SQLite databases for use as an application file format. He also preaches about the superiority of Fossil over Git. His arguments generally make sense. I tolerate his sermons because he is one of the truly great software developers of our time, and a personal hero of mine.

floating-io · 1h ago

An interesting skim, but it would have been more meaningful if it had tackled text documents or spreadsheets to show what additional functionality would be enabled with those beyond "versioning".

Maybe it's just me, but I see the presentation functionality as one of the less used aspects of the OpenOffice family.

sgc · 1h ago

It seems like it would be relatively straightforward to make a SQLite-based file format and just have users add a plugin if for some reason they couldn't upgrade their older version of LibreOffice etc. I agree with the other commenter who mentioned that the benefits for text and spreadsheet files need more explanation. But it seems like a good enough idea to have a LibreOffice working group perform a more in-depth study. If the significant memory reduction is real and would translate to fewer crashes, it would be a huge boost even if it had no other benefits, IMHO.

sakesun · 1h ago

If I remember correctly, the Mendix project file format is simply a SQLite db. I thought the designer was lazy, but it turns out it's a reasonable decision.

Recently, the DuckDB team raised a similar question about the data lake catalog format: why not just use a SQL database for that? It's simpler and more efficient as well.

conorbergin · 53m ago

I've been trying out SQLite for a side project of mine, a virtual whiteboard. I haven't quite got my head around it, but so far it seems to be much less of a bother than interacting with file system APIs. The problem I haven't really solved is how sync, and maybe collaboration, is going to interact with it. So far I have:

1. Plaintext format (JSON or similar) or SQLite dump files versioned by git

2. Some sort of modern local first CRDT thing (Turso, libsql, Electric SQL)

3. Server/Client architecture that can also be run locally

Has anyone had any success in this department?
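
For option 1, a minimal sketch of the "SQLite dump files versioned by git" approach, using `Connection.iterdump()` from Python's standard sqlite3 module. The helper names are made up for this example, and whether the resulting diffs are readable enough in practice is an open question that depends on the schema.

```python
import sqlite3

def dump_sql(conn):
    """Return the whole database as SQL text that can be committed to git."""
    return "\n".join(conn.iterdump()) + "\n"

def load_sql(sql_text):
    """Rebuild an in-memory database from a previously dumped SQL text."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(sql_text)
    return conn
```

The text dump is what gets versioned; the binary database file is rebuilt from it after a checkout or merge.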

rogerbinns · 33m ago

SQLite has a built-in session extension that can be used to record and replay groups of changes, with all the necessary handling. I don't necessarily recommend session as your solution, but it is at least a good idea to see how it compares to others.

https://sqlite.org/sessionintro.html

That provides a C level API. If you know Python and want to do some prototyping and exploration then you may find my SQLite wrapper useful as it supports the session extension. This is the example giving a feel for what it is like to use:

https://rogerbinns.github.io/apsw/example-session.html

tombert · 12m ago

I remember I played with some software called "The Illumination Software Creator" [1], and I remember the saved project files were just SQLite databases.

I actually thought it was kind of cool, because I was able to play with it easily with some SQLite explorer tool (I forget which one) and I could easily look at how the save files actually worked.

I haven't really used SQLite for anything serious [2], but always found the idea of it kind of charming. Maybe I should dust it off and try it again.

[1] https://en.wikipedia.org/wiki/Illumination_Software_Creator by Bryan Lunduke before I realized how much of a pseudo-intellectual dimwit that he is.

[2] At least outside of the "included" database in a few web frameworks.

RainyDayTmrw · 57m ago

Juggling all the fragments inside the database, garbage collecting all the unused ones, and maintaining consistency are all quite challenging in this use case.
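
To make the garbage collection part concrete, here is one way it might look, assuming a purely hypothetical schema in which document versions reference shared fragments. The tables `fragment` and `doc_fragment` are invented for this sketch and are not from the article.

```python
import sqlite3

# Hypothetical schema: each document version lists the fragments it uses.
SCHEMA = """
CREATE TABLE fragment (
    id   INTEGER PRIMARY KEY,
    body BLOB NOT NULL
);
CREATE TABLE doc_fragment (
    version     INTEGER NOT NULL,
    position    INTEGER NOT NULL,
    fragment_id INTEGER NOT NULL REFERENCES fragment(id),
    PRIMARY KEY (version, position)
);
"""

def collect_garbage(conn):
    """Delete fragments that no document version references; return the count."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM fragment "
            "WHERE id NOT IN (SELECT fragment_id FROM doc_fragment)"
        )
        return cur.rowcount
```

The sweep itself is a single statement here; the harder parts the comment points at (deciding when to run it, and keeping references consistent while versions are added and dropped) are not addressed by this sketch.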

supportengineer · 1h ago

What if, instead of APIs for data sets, we simply placed a SQLite file onto a web server as a static asset, so you could just periodically do a GET and have a local copy?

yupyupyups · 1h ago

This works as long as the data is "small" and you have no ACL for it. Assuming you mean automatic downloads.

Devdocs does something similar, but there you request to download the payload manually, and the data is still browsable online without you having to download all of it. The data is also split in a convenient manner (by programming language/library). In other words, you can download individual parts. The UI also remains available offline, which is pretty cool.

https://devdocs.io/

abtinf · 1h ago

A few years ago someone posted a site that showed how to query portions of a SQLite file without having to pull the whole thing down.

dbarlett · 1h ago

https://news.ycombinator.com/item?id=27016630

supportengineer · 39m ago

>> I implemented a virtual file system that fetches chunks of the database with HTTP Range requests

That's wild!
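
A rough sketch of the arithmetic behind that quote, not the linked project's actual code: a SQLite file is a sequence of fixed-size pages, and the page size is stored as a big-endian 16-bit value at byte offset 16 of the 100-byte header (with the value 1 meaning 65536). Given that, a single page maps to one HTTP Range header. The function names are made up for this example.

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"

def page_size_from_header(header):
    """Read the page size from the first 100 bytes of a SQLite file."""
    if len(header) < 100 or header[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite database header")
    (raw,) = struct.unpack(">H", header[16:18])
    return 65536 if raw == 1 else raw

def range_header_for_page(page_number, page_size):
    """Build an HTTP Range header covering one 1-based database page."""
    if page_number < 1:
        raise ValueError("SQLite page numbers start at 1")
    start = (page_number - 1) * page_size
    end = start + page_size - 1  # the end of an HTTP byte range is inclusive
    return {"Range": f"bytes={start}-{end}"}
```

A virtual file system built on this would first fetch bytes 0-99 to learn the page size, then request only the pages a query touches.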

abtinf · 1h ago

With an S3 object lambda, I suppose you could generate the sqlite file on the fly.

anon291 · 1h ago

You can do this today by using the WASM-compiled SQLite module with a custom JavaScript VFS that implements the SQLite VFS API appropriately for your backend. I've used it extensively in the past to serve static data sets directly from S3 at low cost.

More industrious people have apparently wrapped this up on NPM: https://www.npmjs.com/package/sqlite-wasm-http

librasteve · 1h ago

Wouldn’t an XML database be easier?

duskwuff · 1h ago

You can't* index into XML. You have to read through the whole document until you get to the part you want.

*: without adding an index of your own, at which point it isn't really XML anymore, it's some kind of homebrew XML-based archive format.

floating-io · 1h ago

Does an embeddable XML database engine exist at a similar level of reliability?

supportengineer · 1h ago

No.

renecito · 1h ago

LOL!

mac-attack · 1h ago

I'm a fan of both as a Linux user. Interesting thought experiment.

What If OpenDocument Used SQLite? (sqlite.org)

What If OpenDocument Used SQLite?

Comments (22)