In 2014, the common heuristic was 256kB based on measurements in many systems, so the 128kB value is in line with that. At the time, optimal block sizing wasn't that sensitive to the I/O architecture so many people arrived at the same values.

In 2024, the optimal block size based on measurement largely reflects the quality and design of your I/O architecture. Vast improvements in storage hardware expose limitations of the software design to a much greater extent than a decade ago. As a general observation, the optimal I/O sizing in sophisticated implementations has been trending toward smaller sizes over the last decade, not larger.

The seeming optimality of large block sizes is often a symptom of an I/O scheduling design that can't keep up with the performance of current storage hardware.

marginalia_nu · 5h ago

I wonder if a more robust option is to peek at the sysfs queue info on Linux.

It has some nice information about hardware I/O operation limits, and also an optimal_io_size hint.

https://www.kernel.org/doc/html/v5.3/block/queue-sysfs.html
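
As a rough illustration of the sysfs approach, the sketch below reads a few of the queue attributes documented at the link above. The helper name `read_queue_limits` and its `sysfs_root` parameter are made up for this example; the attribute file names are the ones listed in the kernel's queue-sysfs documentation. Note that optimal_io_size is often 0 when the device reports no preference, so it is a hint and not a guarantee.

```python
from pathlib import Path

# Attribute names as listed in the kernel's queue-sysfs documentation.
QUEUE_ATTRS = (
    "logical_block_size",
    "physical_block_size",
    "minimum_io_size",
    "optimal_io_size",
    "max_sectors_kb",
)


def read_queue_limits(device, sysfs_root="/sys/block"):
    """Return the integer queue attributes of a block device, e.g. 'nvme0n1'.

    Attributes that are missing on this kernel or device are skipped.
    """
    queue_dir = Path(sysfs_root) / device / "queue"
    limits = {}
    for name in QUEUE_ATTRS:
        attr = queue_dir / name
        if attr.is_file():
            limits[name] = int(attr.read_text().strip())
    return limits
```

Calling `read_queue_limits("nvme0n1")` on a Linux machine with that device would return a dict such as `{"logical_block_size": ..., "optimal_io_size": ...}`; the device name is only an example.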

marginalia_nu · 8h ago

I urge you to read the papers and articles I linked at the end if any of this is your jam. They are incredible bangers all of them.

6r17 · 8h ago

Thanks for sharing this!

codeaether · 6h ago

Actually, to fully utilize NVMe performance, one really needs to avoid OS overhead by leveraging async I/O such as io_uring. In fact, 4KB pages work quite well if you can issue enough outstanding requests. See the paper by the TUM folks linked below.

https://dl.acm.org/doi/abs/10.14778/3598581.3598584
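
The point about outstanding requests follows from Little's law: sustained throughput is roughly queue depth times block size divided by per-request latency. A back-of-the-envelope sketch, where the 100 microsecond latency and the 3.5 GB/s target are illustrative assumptions and not figures from the paper:

```python
import math


def throughput_bytes_per_s(queue_depth, block_size, latency_s):
    """Little's law: requests in flight * bytes per request / latency."""
    return queue_depth * block_size / latency_s


def queue_depth_needed(target_bytes_per_s, block_size, latency_s):
    """Smallest number of in-flight requests that reaches the target rate."""
    return math.ceil(target_bytes_per_s * latency_s / block_size)


BLOCK = 4096       # 4 KB reads
LATENCY = 100e-6   # assumed 100 microseconds per read (hypothetical)
TARGET = 3.5e9     # assumed 3.5 GB/s device limit (hypothetical)

print(throughput_bytes_per_s(1, BLOCK, LATENCY))   # one request at a time
print(queue_depth_needed(TARGET, BLOCK, LATENCY))  # requests needed in flight
```

Under those assumed numbers, a single outstanding 4 KB read tops out near 41 MB/s, while roughly 86 requests in flight would be needed to reach the target.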

dataflow · 5h ago

SPDK is what folks who really care about this use, I think.

jandrewrogers · 1h ago

The only thing SPDK buys you is somewhat lower latency, which isn't that important for most applications because modern high-performance I/O schedulers usually are not that latency sensitive anyway.

The downside of SPDK is that it is unreasonably painful to use in most contexts. When it was introduced there were few options for doing high-performance storage I/O but a lot has changed since then. I know many people that have tested SPDK in storage engines, myself included, but none that decided the juice was worth the squeeze.

vlovich123 · 1h ago

SPDK requires taking over the device. OP is correct if you want to have a multi tenant application where the disk is also used for other things.

marginalia_nu · 5h ago

As part of the problem domain in index lookups, issuing multiple requests at the same time is not possible, unless as part of some entirely guess-based readahead scheme that may indeed drive up disk utilization but is unlikely to do much else. Large blocks are a solution with that constraint as a given.

That paper seems to mostly focus on throughput via concurrent independent queries, rather than single-query performance. It's arriving at a different solution because it's optimizing for a different variable.

throwaway81523 · 58m ago

In most search engines the top few tree layers are in ram cache, and can also have disk addresses for the next levels. So maybe that can let you start some concurrent requests.

Veserv · 3h ago

Large block reads are just a readahead scheme where you prefetch the next N small blocks. So you are just stating that contiguous readahead is close enough to arbitrary readahead especially if you tune your data structure appropriately to optimize for larger regions of locality.
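
To make that equivalence concrete, here is a small sketch showing that one large read and a run of contiguous small reads cover the same bytes. It uses plain `os.pread` (Unix only) instead of io_uring, the function names are invented for the example, and it says nothing about relative speed.

```python
import os


def read_large(fd, offset, block_size=128 * 1024):
    """Fetch the whole block with a single read."""
    return os.pread(fd, block_size, offset)


def read_as_small_blocks(fd, offset, block_size=128 * 1024, small=4096):
    """Fetch the same range as consecutive small reads (contiguous readahead)."""
    parts = []
    end = offset + block_size
    for start in range(offset, end, small):
        length = min(small, end - start)
        parts.append(os.pread(fd, length, start))
    return b"".join(parts)
```

With the default sizes, a 128 KB block is covered by 32 reads of 4 KB. On a regular file both functions return identical bytes; what differs is the number of requests the I/O stack has to handle.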

marginalia_nu · 3h ago

Well I mean yes, you can use io_uring to read the 128 KB blocks as 32 4 KB blocks, but that's a very roundabout way of doing it that doesn't significantly improve your performance, since with either method the operation time is more or less the same. If a 128 KB read takes roughly the same time as a 4K read, 32 parallel 4K reads aren't going to be faster with io_uring.

Also, an index with larger block sizes is not equivalent to a structure with smaller block sizes plus readahead. The index structure is not the same: larger coherent blocks give you better precision in your indexing structure for the same number of total forward pointers. Since there's no need to index within each 128 KB block, the forward pointer resolution that would have gone to distinguishing between 4K blocks can instead help you rapidly find the next relevant 128 KB block.

ozgrakkurt · 5h ago

4KB is much slower than 512KB if you are using all of the data. Smaller should be better if there is read amplification.

kvemkon · 7h ago

> 256 KB vs 512 B

> A counter argument might be that this drives massive read amplification,

For that, one needs to know the true minimal block size the SSD controller is able to physically read from flash. Asking for less than this wouldn't avoid the amplification.

mgerdts · 4h ago

> Modern enterprise NVMe SSDs are very fast…. This is a simple benchmark on a Samsung PM9A1 on a with a theoretical maximum transfer rate of 3.5 GB/s. … It should be noted that this is a sub-optimal setup that is less powerful than what the PM9A1 is capable of due to running on a downgraded PCIe link.

Samsung has client, datacenter, and enterprise lines. The PM9A1 is part of the OEM client segment and is about the same as a 980 Pro. Its top speeds (about 7GB/s read, 5GB/s write) are better than those of the comparable datacenter-class drive, the PM9A3. These top speeds come with less consistent performance than you get with a PM9A3 or an enterprise drive like a PM1733 from the same era (early PCIe Gen 4 drives).

dataflow · 3h ago

Beginner(?) question: why is the model

  map<term_id, 
      list<pair<document_id, positions_idx>>
     > inverted_index;

and not

  map<term_id, 
      map<document_id, list<positions_idx>>
     > inverted_index;

(or using set<> in lieu of list<> as appropriate)?
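
For readers comparing the two shapes, here is a sketch in Python of both models and a conversion between them. The function name `group_by_document` and the sample data are invented for illustration, the sketch assumes a document may appear more than once in the flat list, and the in-memory dicts are only a mental model, not the on-disk layout.

```python
from collections import defaultdict

# Model 1: term_id -> list of (document_id, positions_idx) pairs.
flat_index = {
    7: [(1, 0), (1, 4), (2, 9)],
}


def group_by_document(flat):
    """Convert model 1 into model 2: term_id -> document_id -> [positions_idx]."""
    nested = {}
    for term_id, postings in flat.items():
        per_doc = defaultdict(list)
        for document_id, positions_idx in postings:
            per_doc[document_id].append(positions_idx)
        nested[term_id] = dict(per_doc)
    return nested


# Model 2: term_id -> {document_id: [positions_idx, ...]}.
nested_index = group_by_document(flat_index)
```

The practical difference is the lookup: in model 2, `nested_index[7][1]` goes straight to one document's entries, whereas model 1 needs a scan of the term's list to find them.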

marginalia_nu · 2h ago

This is meant as a metaphor, a mental model for the actual data structures on disk, so there's some tradeoff in finding the most accurate metaphor for what is happening.

I actually think you are right, list<pair<...>> is a bit of a weird choice that doesn't convey the data structures very well. Map is better.

The most accurate thing would probably be something like map<term_id, map<document_id, pair<document_id, positions_idx>>>, but I corrected it to just a map<document_id, positions_idx> to avoid making things too confusing.

jeffbee · 7h ago

Fun post. One unmentioned parameter is the LBA format being used. Most devices come from the factory configured for 512B, so you can boot NetWare or some other dumb compatibility concern. But there isn't a workload from this century where this makes sense, so it pays to explore the performance impact of the LBA formats your device offers. Using a larger one can mean your device manages io backlogs more efficiently.

Faster Index I/O with NVMe SSDs

Comments (19)