How randomness improves algorithms (2023)
32 points by kehiy on 8/14/2025, 9:43:53 AM | 16 comments | quantamagazine.org
Randomness seems to help more with intentionally belligerent users than anything else: there is no worst-case pattern to aim for, because the same question asked twice yields a different result.

For internal use it can help a little with batch processing. The last time I did that seriously I found a big problem with clustering of outlier workloads: one team of users had 8-10x the cost of any other team. Since the work involved populating several caches, I sorted the jobs by user instead of by team, and that fixed it. It also meant that one request for team A was more likely to arrive after another request had already populated the team's data in the cache, instead of two of them concurrently asking the same questions. So it not only smoothed out server load, it also reduced the overall read intensity a bit. But simply shuffling the data worked almost as well and took hours instead of days.
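A minimal sketch of the two orderings, with made-up job fields; the point is just that a random permutation is a one-line way to break up clustered work, while the sort-based fix takes more knowledge of the data:

```python
import random

# Illustrative only: the job records and field names ("user", "team") are hypothetical.
def order_jobs(jobs, strategy="shuffle"):
    """jobs: list of dicts like {"user": ..., "team": ...}."""
    if strategy == "by_user":
        # The deliberate fix described above: order by user rather than by team,
        # so one team's jobs are less likely to run concurrently and more likely
        # to find the team-level cache already populated.
        return sorted(jobs, key=lambda j: j["user"])
    # The cheap alternative: shuffling spreads clustered work out almost as well.
    shuffled = list(jobs)
    random.shuffle(shuffled)
    return shuffled
```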
Here's another crossover analogy. In game theory, there are games where you can't play a deterministic (pure) strategy and hope to get the best outcome; you need a randomized (mixed) one. Otherwise the adversary could anticipate your strategy and generate a pathological response.
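A toy illustration (rock-paper-scissors, not from the article): any pure strategy can be countered by an adversary who knows it, while the uniform mixed strategy breaks even in expectation no matter what the adversary does.

```python
import random

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def adversary(pure_move):
    # An adversary who anticipates a deterministic player always counters it.
    return {"scissors": "rock", "rock": "paper", "paper": "scissors"}[pure_move]

# Pure strategy: the adversary wins every round.
print(payoff("rock", adversary("rock")))           # -1

# Mixed strategy: expected payoff is 0 against any fixed opponent move.
trials = 100_000
avg = sum(payoff(random.choice(MOVES), "paper") for _ in range(trials)) / trials
print(round(avg, 2))                               # ~0.0
```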
This thesis presents and analyzes a simple principle for building systems: that there should be a random component in all arbitrary decisions. If no randomness is used, system performance can vary widely and unpredictably due to small changes in the system workload or configuration. This makes measurements hard to reproduce and less meaningful as predictors of performance that could be expected in similar situations.
[1] https://tlb.org/docs/thesis.pdf
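A tiny sketch of how I read that principle, with hypothetical names: when several options look equally good, break the tie with a coin flip rather than always taking the first one, so performance doesn't hinge on incidental ordering.

```python
import random

# Hypothetical example of the thesis's principle: put a random component in
# decisions that are otherwise arbitrary.
def pick_server(servers, load):
    least = min(load[s] for s in servers)
    candidates = [s for s in servers if load[s] == least]
    # Deterministic tie-breaking (candidates[0]) makes behavior depend on list
    # order, so small workload or config changes can shift every decision the
    # same way; a random choice removes that sensitivity.
    return random.choice(candidates)

print(pick_server(["a", "b", "c"], {"a": 2, "b": 1, "c": 1}))  # "b" or "c"
```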
But all else is seldom equal and Random 2 works as well or better.
In case anyone is curious, the way to phrase it as a question would be, "How does randomness improve algorithms?"
It's that there's a population of values (integers for factoring, nodes-to-delete for the graph) where we know a way to get a lot of information cheaply from most values, but we don't know which values, so we sample them.
Which isn't to say the PRNG isn't doing any work. Maybe it is: maybe any straightforward iteration through the sample space has problems, such as failure values being clumped together or similar values providing overlapping information.

If so, that suggests to me that you can do better sampling than a PRNG, although maybe the benefit is small. When the article talks about 'derandomizing' an algorithm, is it referring to removing the concept of sampling from this space entirely, or to doing a better job of sampling than 'random'?
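One small, standard instance of this pattern (not the article's example): in the Fermat primality test, a large fraction of bases reveal a composite number cheaply, but we don't know which bases in advance, so we sample them at random. (Miller-Rabin fixes the Carmichael-number blind spot; this is just the shape of the idea.)

```python
import random

def probably_prime(n, rounds=20):
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        # A cheap modular exponentiation: if this fails, a is a "witness"
        # that n is composite.
        if pow(a, n - 1, n) != 1:
            return False
    return True

print(probably_prime(221))   # False: 221 = 13 * 17
print(probably_prime(1009))  # True: 1009 is prime
```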
A pseudorandom sequence of choices is still sufficiently detached from the input. Random here means "I'm making a decision in a way that is sufficiently independent of the input that structuring the input adversarially won't cause worst-case performance," coupled with "the cost of this algorithm is analyzed assuming truly random numbers."

That's the work the randomness is doing.

INB4 worst case: you can do worst-case analysis on randomized algorithms, but it's either the worst case over every possible random choice or the worst case in expectation, not the worst case given a poor implementation of the RNG. Effectively, randomization sometimes serves to shake you out of an increasingly niche and unlikely series of bad decisions, which is the crux of an adversarial input.
To wit
> In the rare cases where the algorithm makes an unlucky choice and gets bogged down at the last step, they could just stop and run it again.
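The textbook example of that "detached from the input" framing (not from the article): quicksort with a fixed pivot rule has inputs that force quadratic behavior, while a random pivot makes the cost depend on the coin flips rather than on how the input was arranged.

```python
import random

def quicksort(xs, randomized=True):
    if len(xs) <= 1:
        return xs
    # Fixed rule (first element) is exploitable; a random pivot is not tied
    # to any particular input ordering.
    pivot = random.choice(xs) if randomized else xs[0]
    less = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    greater = [x for x in xs if x > pivot]
    return quicksort(less, randomized) + equal + quicksort(greater, randomized)

data = list(range(3000))               # adversarial for the first-element pivot
# quicksort(data, randomized=False)    # recursion depth ~3000: blows the stack
print(quicksort(data, randomized=True) == sorted(data))  # True
```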
Instead we could say "we know shortcuts that work for most values in a population, but we don't know how to figure out which values without expensive computation" and that's not very mystical or paradoxical.
And using random sampling is now the obvious way to deal with that situation. But it's not the randomness doing the work; that's an implementation detail of how you do the sampling.

It may well be that there isn't another obvious way to iterate through the sample space that isn't a disaster. Maybe bad values are clumped together, so when you retry on failure you retry a lot. But if that's the case, there might also be a better way to sample than randomly: if bad values are clumped, then there may be an intelligent way to sample such that you hit two bad values in a row less often than random sampling does.

My question is whether that's what's being referred to as 'derandomizing': taking one of these algorithms that uses random sampling and sampling more intelligently, or instead using what was learned from the (so-called) probabilistic algorithm to go back to a more traditional form.
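A toy simulation of that clumping worry, with made-up numbers: if all the failures sit in one block, a sequential scan from an unlucky starting point retries over and over, while uniform random sampling retries about 1/(1 - failure fraction) times in expectation no matter where the failures live.

```python
import random

N = 1000
BAD = set(range(0, 100))            # one contiguous clump of bad values (10%)

def tries_sequential(start):
    i, tries = start, 1
    while i in BAD:                 # scanning walks through the whole clump
        i = (i + 1) % N
        tries += 1
    return tries

def tries_random():
    tries = 1
    while random.randrange(N) in BAD:
        tries += 1
    return tries

print(tries_sequential(0))                                   # 101
print(sum(tries_random() for _ in range(10_000)) / 10_000)   # ~1.11
```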
How randomization helps is by making it much easier to design algorithms. E.g. verifying a solution is cheap, so proving that your random choice lands in some class of good choices, and building an algorithm that turns such a choice into a solution, is still an interesting approach and opens up solutions to things we can't yet solve.

Derandomizing, as presented, apparently means proving that for a few random but careful choices you're in the funnel towards the right solution, or (hopefully) can detect otherwise, and that a few transformations or reasonably performant steps then produce your solution.
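A bare-bones sketch of that guess-then-verify shape (my example, not the article's): if most random choices are good and goodness is cheap to check, the expected number of retries is a small constant. Here the "good choice" is a number coprime to n.

```python
import random
from math import gcd

def random_coprime(n):
    while True:
        guess = random.randrange(2, n)   # cheap random choice
        if gcd(guess, n) == 1:           # cheap verification
            return guess

# For n = 101 * 103, roughly 98% of candidates are coprime,
# so this loop almost always succeeds on the first or second try.
print(random_coprime(101 * 103))
```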
Oh, that's cool, do you have a reference for that?