Preserving Order in Concurrent Go Apps: Three Approaches Compared

69 points by destel | 21 comments | 9/1/2025, 6:14:31 AM | destel.dev ↗

Comments (21)

destel · 4h ago
UPD.

I've just made a small but important clarification to the article. While in many cases it's easier, and even preferred, to calculate all results, accumulate them somewhere, and then sort, this article focuses on memory-bound algorithms that support infinite streams and backpressure.

latchkey · 4h ago
Thanks, but I'd still use a queue over this solution.

Real-time Log Enrichment: a perfect fit for my example [0]; you're firing off endless tasks, and RT logs have a timestamp.

Finding the First Match in a File List: Files tend to be static. I'd use a queue to first build an index and then a queue to process the index.

Time Series Data Processing: Break the data into chunks; you mention 600MB, which isn't that big at all given that Cloud Run memory maxes out at 32GB.

[0] https://news.ycombinator.com/item?id=45094387

tetraodonpuffer · 8h ago
Thanks for the write-up! In my current application I have a few different scenarios that are a bit different from yours but still require processing aggregated data in order.

1. Reading from various files where each file has lines with a unique identifier I can use to process in order: I open all the files and build a min-heap from the first line of each, then process by repeatedly grabbing the lowest entry from the min-heap. After consuming a line from a file, I read its next line and push it back into the min-heap (each min-heap cell holds the open file descriptor for its file).

2. Aggregating across goroutines that service data generators with different latencies and throughputs. I have one goroutine interfacing with each generator and consider these “producers”. Using a global atomic integer I can quickly assign a unique, increasing index to incoming messages, and these can then be serviced with a min-heap the same as above. There are some considerations about dropping messages that are too old, so an alternative approach for some cases is to key the min-heap on received time and process only up to time.Now() minus some buffering time, allowing more time for things to settle before dropping anything (trading total latency for this).

3. Similar to the above, I have another scenario where ingestion throughput is more important and repeated processing happens in order, but there is no requirement that all messages have been processed every time, just that they are processed in order (this is the backing for a log viewer). In this case I just slab-allocate and dump what I receive without ordering concerns, but I also keep a btree of the indexes that I iterate over when it’s time to process. I originally had this buffering like (2) to guarantee mostly ordered insertions in the slabs themselves (which I simply iterated over), but if a stall happened in a goroutine, shifting over the items in the slab when the old items finally came in became very expensive and could spiral badly.

destel · 5h ago
Wow, that’s some seriously sophisticated stuff - it’s not that often you see a heap used in typical production code (outside of libraries)!

Your first example definitely gives me merge-sort vibes - a really clean way to keep things ordered across multiple sources. The second and third scenarios are a bit beyond what I’ve tackled so far, but super interesting to read about.

This also reminded me of a WIP PR I drafted for rill (probably too niche, so I’m not sure I’ll ever merge it). It implements a channel buffer that behaves like a heap - basically a fixed-size priority queue where re-prioritization only happens for items that pile up due to backpressure. Maybe some of that code could be useful for your future use cases: https://github.com/destel/rill/pull/50

tetraodonpuffer · 5h ago
Hah, not sure about “production”; I am currently between jobs and am taking advantage of that to work on a docker/k8s/file TUI log viewer.

I am using those techniques respectively for loading backups (I store each container log in a separate file inside a big zip file, which allows concurrent reading without unpacking) and for servicing the various log-producing goroutines (which use the Docker/k8s APIs as well as fsnotify for files), since I allow creating “views” of containers that consequently need to aggregate in order. The TUI itself, using tview, runs in a separate goroutine at a configurable FPS, reading from these buffers.

I have things mostly working; the latest significant refactoring was introducing the btree-based reading after noticing the “fix the order” stalls were too bad, and I am planning to do a Show HN when I’m finished. It has been a lot of fun going back to solo-dev greenfield stuff after many years of architecture-focused work.

I definitely love Go, but despite being careful and having access to great tools like rr and dlv in GoLand, it can sometimes get difficult to debug deadlocks, especially when mixing channels and locks. I have found this library quite useful for chasing down deadlocks in some scenarios: https://github.com/sasha-s/go-deadlock

destel · 8h ago
Hi everyone, I’m the author of the article. Happy to answer any questions or discuss concurrency patterns in Go. Curious how others tackle such problems.
kunley · 2h ago
chan chan Foo seems like a cool trick; looking forward to using it in code. Thanks for the idea.

PS. I realize you present an even better solution; still, the first version seems like a nice enough thing to have in a toolbox.

destel · 2h ago
Thanks. This replyTo pattern is very similar to promises in other languages.
Traubenfuchs · 7h ago
> Curious how others tackle such problems.

What do you think about the order preserving simplicity of Java?

  List<Input> inputs = ...;

  List<Output> results = inputs.parallelStream()
      .map(this::processTask)
      .collect(toList());

If you want more control or have more complex use cases, you can use an ExecutorService of your choice, handle the futures yourself, or get creative with Java's new structured concurrency.
Groxx · 6h ago
Their planned semantics don't allow for that - there's no backpressure in that system, so it might race ahead and process up to e.g. item 100 while still working on item 1.

If everything fits in memory, that's completely fine. And then yeah, this is wildly overcomplicated; just use a waitgroup and a slice, write each result into its slice index, and wait for everything to finish - that matches your Java example.

But when it doesn't fit in memory, that means you have unbounded buffer growth that might OOM.

destel · 7h ago
I haven’t used Java for about a decade, so I’m not very familiar with the Streams API.

Your snippet looks good and concise.

One thing I haven’t emphasized enough in the article is that all the algorithms there are designed to work with potentially infinite streams.

kamranjon · 7h ago
Often in Go I’ll create some data structure like a map to hold the new value keyed by the original index (basically a for loop with goroutines inside that close over the index value) - then I just reorder them after waiting for all of them to complete.

Is this basically what Java is doing?

I think that maybe the techniques in this article are a little more complex, allowing you to optimize further (basically continue working as soon as possible instead of just waiting for everything to complete and reordering after the fact) but I’d be curious to know if I’ve missed something.

gleenn · 7h ago
It's a reasonable solution. The problem with it is mentioned in the article: you necessarily get worst-case memory usage, because you have to store everything in the map first. If you don't have too much to store, it will work.
abtinf · 8h ago
Another scenario where order matters is in Temporal workflows. Temporal’s replay capability requires deterministic execution.
Groxx · 5h ago
That's a rather special case: they and Cadence control when calls into their code unblock, and they use that to run your code as if it were a single-threaded event loop. That way, the stuff they do can be deterministic while simulating parallel execution (but only concurrency).
latchkey · 5h ago
For something like this, I would instinctively reach for an external queue mechanism instead of trying to work through the complexity of Go's concurrency.

Create a bunch of sequentially numbered jobs that write their output into a Postgres database. Then have N workers process the jobs. Something like GCP's Cloud Tasks is perfect for this because the "workers" are just GCP Cloud Functions, so you can have a near-infinite number of them (limited by concurrent DB connections).

This approach also buys you durability of the queue for free (i.e., what happens when you need to stop your Go process mid-queue?).

Then it is just a query:

  select * from finished_jobs order by job_num;
destel · 4h ago
I've just made a small but important clarification to the article. While in many cases it's easier, and even preferred, to calculate all results, accumulate them somewhere, and then sort, this article focuses on memory-bound algorithms that support infinite streams and backpressure.
candiddevmike · 6h ago
Personally, I've come to really hate channels in Go. They are a source of some seriously heinous deadlock bugs that are really hard to debug, and closing channels in the wrong spot can crash your entire app. I try using plain locks until it hurts before I reach for channels these days.
Groxx · 6h ago
Well over half of the code I've ever seen that uses three or more channels (i.e. two semantic ones plus a cancellation or shutdown) has had serious flaws in it.

Granted, that generally means they're doing something non-trivial with concurrency, and that correlates strongly with "has concurrency bugs". But I see issues FAR more frequently when they reach for channels rather than mutexes. It's bad enough that I just check absolutely every three-chan chunk of code proactively now.

I lay part of the blame on Go's "♥ safe and easy concurrency with channels! ♥" messaging, and another large chunk on the lack of generics (until recently), which made abstracting these kinds of things extremely painful. Combined, you get "just do it by hand lol, it's easy / get good" programming, which is always a source of "fun".

__turbobrew__ · 4h ago
Agreed. I especially think it was common to overuse channels when Go was younger, as that was “the Go way”. I think people have started to realize that channels are a complex and sharp abstraction, and they should not be used frivolously.

I can’t think of the last time I actually wrote code that directly created channels. Of course, things like contexts, tickers, etc. are implemented with channels, and I think that is ideally how they should be used: in well-defined and self-contained library code.

kunley · 2h ago
Totally different perspective here. Never disappointed with channels; can't stand async.