What If OpenDocument Used SQLite?

59 whatisabcdefgh 22 9/4/2025, 9:36:50 PM sqlite.org ↗

Comments (22)

liuliu · 17m ago
One thing I would call out, if you use SQLite as an application format:

The BLOB type is limited to 2 GiB in size (int32). Depending on your use case, that might seem high, or not.

People would argue that if you store that much binary data in a SQLite database, it is not really appropriate. But application formats usually have this requirement to bundle large binary data in one nice file, rather than many files that you need to copy together to make it work.
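
One way to live with the per-value limit, sketched here with Python's built-in sqlite3 module, is to split a large payload into fixed-size chunks stored as separate rows. The table name `asset_chunk`, the helper names, and the chunk size are assumptions made up for this illustration, not anything SQLite or the article prescribes.

```python
import sqlite3

CHUNK_SIZE = 1024 * 1024  # 1 MiB per row; an arbitrary choice for this sketch


def store_asset(conn, name, data, chunk_size=CHUNK_SIZE):
    """Split `data` into chunks and store each chunk as its own BLOB row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS asset_chunk ("
        " name TEXT NOT NULL, seq INTEGER NOT NULL, data BLOB NOT NULL,"
        " PRIMARY KEY (name, seq))"
    )
    with conn:  # replace any previous version of the asset in one transaction
        conn.execute("DELETE FROM asset_chunk WHERE name = ?", (name,))
        conn.executemany(
            "INSERT INTO asset_chunk (name, seq, data) VALUES (?, ?, ?)",
            (
                (name, seq, data[offset:offset + chunk_size])
                for seq, offset in enumerate(range(0, len(data), chunk_size))
            ),
        )


def load_asset(conn, name):
    """Reassemble an asset by concatenating its chunks in order."""
    rows = conn.execute(
        "SELECT data FROM asset_chunk WHERE name = ? ORDER BY seq", (name,)
    )
    return b"".join(row[0] for row in rows)
```

The file still travels as a single database, and no single value ever gets near the limit. A real application would probably stream chunks instead of holding the whole payload in memory as this sketch does.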

chmaynard · 1h ago
Dr. Hipp occasionally gets on a soapbox and extols the virtues of SQLite databases for use as an application file format. He also preaches about the superiority of Fossil over Git. His arguments generally make sense. I tolerate his sermons because he is one of the truly great software developers of our time, and a personal hero of mine.
floating-io · 1h ago
An interesting skim, but it would have been more meaningful if it had tackled text documents or spreadsheets to show what additional functionality would be enabled with those beyond "versioning".

Maybe it's just me, but I see the presentation functionality as one of the less used aspects of the OpenOffice family.

sgc · 1h ago
It seems like it would be relatively straightforward to make a SQLite-based file format and just have users add a plugin if for some reason they couldn't upgrade their older version of LibreOffice etc. I agree with the other commenter who mentioned that the benefits for text and spreadsheet files need more explanation. But it seems like a good enough idea to have a LibreOffice working group perform a more in-depth study. If the significant memory reduction is real and it translates to fewer crashes, it would be a huge boost even if it had no other benefits, IMHO.
sakesun · 1h ago
If I remember correctly, the Mendix project file format is simply a SQLite db. I thought the designer was lazy, but it turns out it's a reasonable decision.

Recently, the DuckDB team raised a similar question about the data lake catalog format: why not just use a SQL database for that? It's simpler and more efficient as well.

conorbergin · 53m ago
I've been trying out SQLite for a side project of mine, a virtual whiteboard. I haven't quite got my head around it, but so far it seems to be much less of a bother than interacting with file system APIs. The problem I haven't really solved is how sync and maybe collaboration are going to interact with it. So far I have:

1. Plaintext format (JSON or similar) or SQLite dump files versioned by git

2. Some sort of modern local first CRDT thing (Turso, libsql, Electric SQL)

3. Server/Client architecture that can also be run locally

Has anyone had any success in this department?
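
For option 1, Python's built-in sqlite3 module can already produce a text dump that is suitable for committing to git. A minimal sketch follows; the function names and file paths are hypothetical, and it assumes the restore target is a fresh, empty database file.

```python
import sqlite3


def dump_to_sql(db_path, sql_path):
    """Write the database at db_path out as SQL text statements."""
    conn = sqlite3.connect(db_path)
    try:
        with open(sql_path, "w", encoding="utf-8") as out:
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()


def restore_from_sql(sql_path, db_path):
    """Rebuild a database file from a dump produced by dump_to_sql."""
    conn = sqlite3.connect(db_path)
    try:
        with open(sql_path, encoding="utf-8") as src:
            conn.executescript(src.read())
    finally:
        conn.close()
```

The dump is line-oriented text, so git can diff it, although large binary columns will still make for noisy diffs.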

rogerbinns · 33m ago
SQLite has a built-in session extension that can be used to record and replay groups of changes, with all the necessary handling. I don't necessarily recommend session as your solution, but it is at least a good idea to see how it compares to others.

https://sqlite.org/sessionintro.html

That provides a C level API. If you know Python and want to do some prototyping and exploration then you may find my SQLite wrapper useful as it supports the session extension. This is the example giving a feel for what it is like to use:

https://rogerbinns.github.io/apsw/example-session.html

tombert · 12m ago
I remember I played with some software called "The Illumination Software Creator" [1], and I remember the saved project files were just SQLite databases.

I actually thought it was kind of cool, because I was able to play with it easily with some SQLite explorer tool (I forget which one) and I could easily look at how the save files actually worked.

I haven't really used SQLite for anything serious [2], but always found the idea of it kind of charming. Maybe I should dust it off and try it again.

[1] https://en.wikipedia.org/wiki/Illumination_Software_Creator by Bryan Lunduke before I realized how much of a pseudo-intellectual dimwit that he is.

[2] At least outside of the "included" database in a few web frameworks.

RainyDayTmrw · 57m ago
Juggling all the fragments inside the database, garbage collecting all the unused ones, and maintaining consistency are all quite challenging in this use case.
supportengineer · 1h ago
What if, instead of APIs for data sets, we simply placed a SQLite file onto a web server as a static asset, so you could just periodically do a GET and have a local copy?
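
A rough sketch of what the client side of that could look like, using only the Python standard library. The function names are made up for this example, and the URL is whatever static location the file is published at.

```python
import os
import pathlib
import shutil
import sqlite3
import tempfile
import urllib.request


def refresh_local_copy(url, dest_path):
    """Download the database at `url`, then swap it into place at `dest_path`."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Download next to the destination so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".download")
    try:
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, tmp)
        os.replace(tmp_path, dest_path)  # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp_path)  # do not leave a partial download behind
        raise


def open_read_only(dest_path):
    """Open the local copy in read-only mode via a SQLite URI filename."""
    uri = pathlib.Path(dest_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Calling `refresh_local_copy` on a timer gives the "periodic GET" behaviour. A conditional request (for example with `If-None-Match`) would avoid re-downloading an unchanged file, but that is left out of this sketch.
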
yupyupyups · 1h ago
This works as long as the data is "small" and you have no ACL for it. Assuming you mean automatic downloads.

Devdocs does something similar, but there you request to download the payload manually, and the data is still browsable online without you having to download all of it. The data is also split in a convenient manner (by programming language/library). In other words, you can download individual parts. The UI also remains available offline, which is pretty cool.

https://devdocs.io/

abtinf · 1h ago
A few years ago someone posted a site that showed how to query portions of a SQLite file without having to pull the whole thing down.
dbarlett · 1h ago
supportengineer · 39m ago
>> I implemented a virtual file system that fetches chunks of the database with HTTP Range requests

That's wild!
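
The core arithmetic behind that trick is small. A SQLite file is a sequence of fixed-size pages, and the page size is stored in the 100-byte file header, so a reader can turn "I need page N" into an HTTP `Range` header. The sketch below shows only that mapping; the function names are hypothetical, and it is not the code of the project quoted above.

```python
import struct

HEADER_SIZE = 100  # the SQLite file header occupies the first 100 bytes


def page_size_from_header(header):
    """Read the page size from the first bytes of a SQLite database file."""
    if header[:16] != b"SQLite format 3\x00":
        raise ValueError("not a SQLite 3 database header")
    # Bytes 16-17 hold the page size as a big-endian 16-bit integer.
    (raw,) = struct.unpack(">H", header[16:18])
    return 65536 if raw == 1 else raw  # the stored value 1 means 65536


def range_header_for_page(page_number, page_size):
    """Build the Range header that fetches exactly one page.

    Pages are numbered from 1, so page N starts at byte (N - 1) * page_size.
    """
    if page_number < 1:
        raise ValueError("SQLite page numbers start at 1")
    start = (page_number - 1) * page_size
    end = start + page_size - 1  # HTTP byte ranges are inclusive
    return {"Range": f"bytes={start}-{end}"}
```

A client would first request `bytes=0-99` to learn the page size, then fetch only the pages a query actually touches.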

abtinf · 1h ago
With an S3 object lambda, I suppose you could generate the sqlite file on the fly.
anon291 · 1h ago
You can do this today by using the WASM-compiled SQLite module with a custom JavaScript VFS that implements the SQLite VFS API appropriately for your backend. I've used it extensively in the past to serve static data sets directly from S3 at low cost.

More industrious people have apparently wrapped this up on NPM: https://www.npmjs.com/package/sqlite-wasm-http

librasteve · 1h ago
Wouldn’t an XML database be easier?
duskwuff · 1h ago
You can't* index into XML. You have to read through the whole document until you get to the part you want.

*: without adding an index of your own, at which point it isn't really XML anymore, it's some kind of homebrew XML-based archive format.

floating-io · 1h ago
Does an embeddable XML database engine exist at a similar level of reliability?
supportengineer · 1h ago
No.
renecito · 1h ago
LOL!
mac-attack · 1h ago
I'm a fan of both as a Linux user. Interesting thought experiment.