Traps to Developers

144 qouteall 56 8/16/2025, 10:34:35 AM qouteall.fun ↗

Comments (56)

mdaniel · 3h ago
> A method that returns Optional<T> may return null.

projects that do this drive me bananas

If I had the emotional energy, I'd open a JEP for a new @java.lang.NonNullReference and any type annotated with it would be a compiler error to assign null to it

  public interface Alpha {}
  @java.lang.NonNullReference
  public interface Beta {}

  Alpha a = null; // ok
  Beta b = null; // compiler error
javac will tolerate this

  Beta b;
  if (new java.util.Random().nextBoolean()) {
    b = getBeta();
  } else {
    b = newBeta();
  }
but I would need to squint at the language specification to see if dead code elimination is a nicety or a formality

  Beta b;
  if (true) {
    b = getBeta();
  } else {
    b = null; // I believe this will be elided and thus technically legal
  }
Spivak · 2h ago
I question the wisdom of even having Optional<T> in a language with nulls. It would raise some eyebrows if a function in Python returned an Optional type object rather than T | None. You have to do a check either way unless you're doing some cute monad-y stuff.
crooked-v · 1h ago
There's a lot of quality-of-life stuff enabled by it in Java, since the base language's equivalents to Optional.empty(), Optional.ofNullable(...).orElse(...), etc are painfully verbose by comparison.
singron · 2h ago
Maybe this is cute monady stuff, but there isn't an equivalent to Optional<Optional<T>> with only null/None. You usually don't directly write that, but you might incidentally instantiate that type when composing generic code, or a container/function won't allow nulls.
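
To make the nesting point concrete, here is a minimal Python sketch. The `Some` wrapper and both lookup helpers are hypothetical names made up for illustration, not from any library. With only None, "key absent" and "key present but holding None" collapse into the same result; an explicit wrapper keeps them apart, which is roughly what Optional<Optional<T>> buys you:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class Some:
    """Hypothetical wrapper meaning 'a value is present', even if that value is None."""
    value: Any


def lookup_plain(table: Dict[str, Any], key: str) -> Any:
    # None here means either "key missing" or "key stored with None": ambiguous.
    return table.get(key)


def lookup_wrapped(table: Dict[str, Any], key: str) -> Optional[Some]:
    # None means "key missing"; Some(None) means "key stored with None".
    return Some(table[key]) if key in table else None


table = {"nickname": None}
```

Here `lookup_plain` gives the same answer for "nickname" and for a key that was never stored, while `lookup_wrapped` tells the two cases apart.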
Someone · 5h ago
> Java, C# and JS use UTF-16-like encoding for in-memory string

That’s incorrect for Java, possibly also for C# and JS.

In any language where strings are opaque enough types [1], the in-memory representation is an implementation detail. Java has been such a language since release 9 (https://openjdk.org/jeps/254)

[1] The ‘enough’ is because some languages have fully opaque types, but specify efficiency of some operations and through it, effectively proscribe implementation details. Having a foreign function interface also often means implementation details cannot be changed because doing that would break backwards compatibility.

> JS use floating point for all numbers. The max accurate integer is 2⁵³−1

That is incorrect. Much larger integers can be represented exactly, for example 2¹⁰⁰.

What is true is that 2⁵³−1 is the largest integer n such that n-1, n, and n+1 can be represented exactly in an IEEE double. That, in turn, means n == n-1 and n == n+1 both will evaluate to false, as expected in ‘normal’ arithmetic.
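
A quick way to check this, sketched in Python rather than JS on the assumption that Python floats are IEEE 754 doubles (true on mainstream platforms), which is the same format JS numbers use. The variable names are just for illustration:

```python
# 2**100 needs only one significand bit, so it converts to a double exactly.
big = 2 ** 100
exact_power_of_two = float(big) == big

# Just past 2**53, neighbouring integers start to collide after rounding.
limit = 2 ** 53
collides = float(limit + 1) == float(limit)         # 2**53 + 1 rounds back to 2**53
still_distinct = float(limit - 1) != float(limit)   # below the limit, neighbours differ
```

So the limit is about when adjacent integers stop being distinguishable, not about the largest integer that can be stored exactly.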

debugnik · 4h ago
> possibly also for C# and JS

The representation for C# is very much fixed, as it allows, and very commonly uses, direct access into the string buffer as a ReadOnlySpan<char> or a raw char pointer, where char is the type of UTF-16 code units.

JS could maybe get away with it.

hinkley · 1h ago
When you have code that works a lot with strings the cost overhead of building an app on iso-latin-1 but encoding as utf-16 can be substantial.

I think Java moved away from this back around 8, or possibly 9.

seangrogg · 2h ago
Yeah, I think they didn't mean max "accurate" integer and rather meant max "safe" integer.
mikojan · 1h ago
> > Java, C# and JS use UTF-16-like encoding for in-memory string
>
> That’s incorrect for Java,

Maybe so, technically, but if you Base64 encode a string in a language that uses UTF-8 (or UTF-16 with a different endianness) and decode it in Java, Java's UTF-16 representation will be the problem you will be dealing with.
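
A rough Python sketch of that mismatch, as I understand it. Base64 operates on bytes, so what matters is which encoding turned the string into bytes on each side. The sample text and the `decode` helper are my own choices for illustration:

```python
import base64

text = "é"

# The same one-character string yields different bytes, hence different Base64.
utf8_b64 = base64.b64encode(text.encode("utf-8"))
utf16le_b64 = base64.b64encode(text.encode("utf-16-le"))
utf16be_b64 = base64.b64encode(text.encode("utf-16-be"))


def decode(b64: bytes, encoding: str) -> str:
    """Undo Base64, then interpret the bytes with the given encoding."""
    return base64.b64decode(b64).decode(encoding)
```

Decoding with the encoding the sender used round-trips; decoding little-endian bytes as big-endian silently produces a different character instead of an error.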

scarface_74 · 4h ago
I started to say something about C# strings, and then I remembered the clusterfuck of Windows development, where, depending on which API you call, a string is represented in one of a dozen different ways.

https://stackoverflow.com/questions/689211/interop-sending-s...

OptionOfT · 2h ago
> Some routers and firewall silently kill idle TCP connections without telling application. Some code (like HTTP client libraries, database clients) keep a pool of TCP connections for reuse, which can be silently invalidated. To solve it you can configure system TCP keepalive. For HTTP you can use Connection: keep-alive Keep-Alive: timeout=30, max=1000 header.

Once a TCP connection has been established there is no state on routers in between the 2 ends of the connection. The issue here is firewalls / NAT entries timing out. And indeed, no RSTs are sent.

We had the issue in K8s with the conntrack module set too low.

Now, you can try to put in an HTTP Keep-Alive, but that will not help you. The HTTP Keep-Alive is merely for connection re-use at the HTTP level, i.e. it doesn't close the connection: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...

An HTTP Keep-Alive does not generate any packets, it merely postpones the close.

A TCP Keep-Alive generates packets, which reset the timers.
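
For reference, a minimal sketch of turning on TCP keepalive per socket in Python. The helper name and the timing values (30 s idle, 10 s interval, 3 probes) are assumptions for illustration and should be tuned below whatever your NAT or conntrack timeout is. The finer-grained options are not exposed on every platform, so they are guarded:

```python
import socket


def enable_tcp_keepalive(sock: socket.socket, idle_s: int = 30,
                         interval_s: int = 10, probes: int = 3) -> None:
    """Hypothetical helper: ask the kernel to send keepalive probes on an idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants exist on Linux; other platforms may lack some of them.
    tuning = (("TCP_KEEPIDLE", idle_s),
              ("TCP_KEEPINTVL", interval_s),
              ("TCP_KEEPCNT", probes))
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Client libraries that pool connections often expose a hook for socket options, which is where something like this would plug in.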

andunie · 8h ago
That's a nice compendium of tips and useful information.

I wonder if anyone can learn from this. I feel like I only understood what I already knew, or at least was very close to knowing. That's the same thing that happens with teaching manuals about any topic: they're organized in a way that makes sense and it's easy for people who already know the topics, but often very bad at teaching the same topics to an audience that doesn't know anything.

skydhash · 8h ago
> with teaching manuals about any topic: they're organized in a way that makes sense and it's easy for people who already know the topic

I think that's the reason for a manual's existence: to have a written record so we don't have to trust our memory. This is what most unix manuals are. You already know what the software can do, you just need to remember the specifics of how to get something done.

> often very bad at teaching the same topics to an audience that doesn't know anything.

What you need then is a tutorial (beginner seeking to learn) or a guide (beginner/intermediate seeking to do). Manuals in this case only serve to have better questions (Now you know what you don't know).

ozim · 5m ago
Kind of what I noticed for myself.

When I was a kid I was trying to learn Linux and commands and it was disappointing.

Over the years of using it I don’t need to learn it but I do need to look stuff up.

jmull · 4h ago
This looks like not so much traps, but a list of things the author has learned.

Much of it would only apply in certain relatively narrow contexts, but the contexts aren't necessarily mentioned.

Some of it appears to be just wrong.

I guess I'm saying: I would not take this literally, but as something almost like a stream-of-consciousness.

nayuki · 2h ago
Largely a good listicle. Some feedback:

> Unicode unification. Different characters in different language use the same code point. Different languages' font variants render the same code point differently. 語

This isn't a trap. The given example character means the same thing in Chinese and Japanese, and the Japanese version was imported from China. People from both languages recognize both font variants as the same conceptual character.

The author is making it sound like the letter 'A' in English should have a different code point than an 'A' in French. Or that a lowercase 'a' with the top tail should be a different character than a lowercase 'a' without the top tail.

Anyway, this is discussed at length in https://en.wikipedia.org/wiki/Han_unification

> There is a negative zero -0.0 which is different to normal zero. The negative zero equals zero when using floating point comparison. Normal zero is treated as "positive zero".

And there are two ways to distinguish negative zero from normal zero: By their integer bit patterns, or by the fact that 1.0/-0.0 == -Inf vs. 1.0/0.0 == +Inf.
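
A small Python sketch of both checks. One caveat: Python raises ZeroDivisionError on 1.0 / -0.0 instead of returning an infinity, so `math.copysign` stands in for the division test here. The `bits` helper is just an illustrative name:

```python
import math
import struct


def bits(x: float) -> str:
    """Reinterpret the double's 8 bytes as a 64-bit integer, shown in hex."""
    return format(struct.unpack(">Q", struct.pack(">d", x))[0], "016x")


neg, pos = -0.0, 0.0
compares_equal = neg == pos                   # ordinary comparison cannot tell them apart
neg_bits, pos_bits = bits(neg), bits(pos)     # only the sign bit differs
neg_sign = math.copysign(1.0, neg)            # carries the sign of -0.0
pos_sign = math.copysign(1.0, pos)
```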

> It's recommended to configure the server's time zone as UTC.

Big yes. I use UTC for servers, logs, photos, and anything that is worth archiving and timestamping properly. Local time is only for colloquial use.

> For integer (low + high) / 2 may overflow. A safer way is low + (high - low) / 2

Yes, but if low and high could be negative numbers, then you've just shifted the overflow to a different range. This matters for general binary search over an integer range, as opposed to unsigned binary search over an array.
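
To illustrate, a Python sketch that simulates signed 32-bit wraparound, since Python ints never overflow on their own. The wrapping mimics Java; in C, signed overflow is undefined behavior, so treat this only as a model. All helper names are made up for the example:

```python
def wrap32(x: int) -> int:
    """Simulate two's-complement wraparound of a signed 32-bit integer."""
    return (x + 2**31) % 2**32 - 2**31


def div2(x: int) -> int:
    """Halve with truncation toward zero, as C and Java integer division do."""
    return -((-x) // 2) if x < 0 else x // 2


def mid_sum(low: int, high: int) -> int:
    # (low + high) / 2, with wraparound on the addition.
    return div2(wrap32(low + high))


def mid_diff(low: int, high: int) -> int:
    # low + (high - low) / 2, with wraparound on each step.
    return wrap32(low + div2(wrap32(high - low)))
```

With two large positive bounds only `mid_sum` goes wrong; with low = -2_000_000_000 and high = 2_000_000_000 the subtraction in `mid_diff` is what wraps, while `mid_sum` is fine.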

> C/C++

I'm going to throw in one of my lists of pitfalls - just using integer types and arithmetic correctly in C/C++ is a massive developer trap. That's like the most basic thing in programming. https://www.nayuki.io/page/summary-of-c-cpp-integer-rules

> Rebase can rewrite history

"Can" is a weasel word; rebase does nothing but rewrite history.

cyberax · 19s ago
> The author is making it sound like the letter 'A' in English should have a different code point than an 'A' in French. Or that a lowercase 'a' with the top tail should be a different character than a lowercase 'a' without the top tail.

But we do have А and A. Even though they look the same. And unified Han characters are often quite distinct, it tripped me up as a learner of Chinese more than once. For example, a very common character '喝' (drink) looks quite a bit different: https://en.wiktionary.org/wiki/%E5%96%9D - they have a different number of strokes even.

Han unification is a mess.

skobes · 7h ago
The first "trap" on the page says "min-width: auto makes min width determined by content", but this is false outside of flex/grid.

From MDN: "For block boxes, inline boxes, inline blocks, and all table layout boxes auto resolves to 0."

https://developer.mozilla.org/en-US/docs/Web/CSS/min-width

jfengel · 6h ago
CSS cascade for text properties more or less makes sense.

I have been unable to comprehend CSS layout from any perspective: page designer, implementer, user, anything. It must have someone in mind but I have no idea who that is.

chrisweekly · 6h ago
https://every-layout.dev has by far the best explanations and coherent usage of CSS I've encountered since I started doing webdev for a living in 1998.
lemonberry · 12m ago
Every Layout changed how I look at and do CSS. Great resource with a good philosophy behind it: CubeCSS. It really made CSS fun for me again.
skobes · 6h ago
Layout is more bazaar than cathedral. It has had many ideas mixed in by different contributors over decades.
diggan · 7h ago
I guess the first trap should really be: "You cannot read any CSS property in isolation because, as the name implies, defaults and the effect of each value cascade through all the rules your document ends up using"
FFFXXX · 6h ago
The part about C# volatile accesses using release-acquire ordering seems to be wrong if I read the C# docs correctly.

"There is no guarantee of a single total ordering of volatile writes as seen from all threads of execution"

https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...

charleslmunger · 6h ago
>A volatile write operation prevents earlier memory operations on the thread from being reordered to occur after the volatile write. A volatile read operation prevents later memory operations on the thread from being reordered to occur before the volatile read

Looks like release/acquire to me? A total ordering would be sequential consistency.

FFFXXX · 6h ago
I think you are quoting from https://learn.microsoft.com/en-us/dotnet/api/system.threadin...

"In C#, using the volatile modifier on a field guarantees that every access to that field is a volatile memory operation"

This makes it sound like you are right and the volatile keyword has the same behaviour as the Volatile class which explicitly says it has acquire-release ordering.

But that seems to contradict "The volatile keyword doesn't provide atomicity for operations other than assignment, doesn't prevent race conditions, and doesn't provide ordering guarantees for other memory operations." from the volatile keyword documentation?

charleslmunger · 2h ago
I too interpret those docs as contradictory, and I wonder if, like how Java 5 strengthened volatile semantics, this happened at some point in C# too and the docs weren't updated? Either way the specification, which the docs say is definitive, says it's acquire/release.

https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...

"When a field_declaration includes a volatile modifier, the fields introduced by that declaration are volatile fields. [...] For volatile fields, such reordering optimizations are restricted:

    A read of a volatile field is called a volatile read. A volatile read has “acquire semantics”; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence.

    A write of a volatile field is called a volatile write. A volatile write has “release semantics”; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence."
judofyr · 2h ago
Acquire-release ordering provides ordering guarantees for all memory operations. If an acquire observes a release, the thread is also guaranteed to see all the previous writes done by the other thread - regardless of the atomicity of those writes. (There still can't be any other data races though.)

This volatile keyword appears to only consider that specific memory location whereas the Volatile class seems to implement acquire-release.

dataflow · 6h ago
Somewhat off topic, but what is a realistic example of where you need atomics with sequential consistency? Like, what useful data structure or pattern requires it? I feel like I've seen every other ordering except that one (and consume) in real world code.
judofyr · 2h ago
A mutex would be the most trivial example. I don't believe that is possible to implement, in the general case, with only acquire-release.

Sequential consistency mostly becomes relevant when you have more than two threads interacting with both reads and writes. However, if you only have single-consumer (i.e. only one thread reading) or single-producer (i.e. only one thread writing) then the acquire-release semantics end up becoming sequential since the single-consumer/producer implicitly enforces a sequential ordering. I can potentially see some multi-producer multi-consumer lock-free queues needing sequential atomics.

I think it's rare to see atomics with sequential consistency in practice since you typically either choose (1) a mutex to simplify the code at the expense of locking or (2) acquire-release (or weaker) to minimize the synchronization.

ngruhn · 8h ago
A recent trap for me:

Regex semantics is subtly different across languages. E.g. a{,3} matches between 0 and 3 "a" characters in Python. In JavaScript it matches the literal string "a{,3}".
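
The Python half of this is easy to check with the standard `re` module. Spelling out the lower bound as a{0,3} is, as far as I know, read as a 0-to-3 quantifier by both engines, so it is the safer form when a pattern is shared across languages. The variable names are just for illustration:

```python
import re

bounded = re.compile(r"a{,3}")     # Python: an omitted lower bound defaults to 0
portable = re.compile(r"a{0,3}")   # explicit lower bound, same meaning in Python

samples = ["", "a", "aaa", "aaaa", "a{,3}"]
bounded_hits = [s for s in samples if bounded.fullmatch(s)]
portable_hits = [s for s in samples if portable.fullmatch(s)]
```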

skydhash · 8h ago
Regex is more a technique than an actual specification. It would be best to find the time to go and read an introductory book about Theory of Computation where they explain the underlying mechanism.
ryandv · 7h ago
> Theory of Computation

Computer science? Seriously? What a fucking waste of time. Better just take a bootcamp and get the LLM to write your regexes for you. Cut four years into eight weeks.

Time to get with the times, gramps. The singularity is near.

skydhash · 7h ago
It's half a chapter in most books I know. Or a subset of this 1h MIT video [0], but the instructor also explains Finite Automata, which is the basic mechanism that does all the stuff.

[0]: https://www.youtube.com/watch?v=9syvZr-9xwk

jraph · 6h ago
I'll assume sarcasm (from your comment history) but for people actually taking this literally: good luck debugging an incorrect regex if you haven't practiced regexes. Especially if it was generated by an LLM.
danhau · 7h ago
I always use regex101 to develop my regexes. It allows you to switch between different engines.
PhilipRoman · 7h ago
Honorable mention to [a-z], gotta be my favorite trap
dataflow · 6h ago
What's the trap for this one? I can't think of any engine that parses this to mean anything other than the letters a through z.
PhilipRoman · 4h ago
In some common implementations, if $LANG is set to certain values, it will fail to match some ASCII letters. This is because not all languages written in the Latin script put Z last in the alphabet.

Try this (you probably need to enable and generate the locale first)

    echo y | LANG=lv_LV.UTF-8 grep '[a-z]'
Locales in general should be considered a "trap", just look at Windows CSV separator handling, etc.
dataflow · 7m ago
That's wild. Thanks for explaining. I had no idea this depends on the locale. Looks like I have about a million scripts to fix...
1718627440 · 27m ago
Not locales in general, but using them for anything other than presentation.
dpkirchner · 5h ago
It depends on its use, ultimately, but if your goal is to find a string of letters (a common use IMO), you'll want to use something like \p{L} to ensure you don't miss non-ASCII characters.

eta: fixed regex, I had typed \L, shared from my faulty memory.

accoil · 4h ago
[A-z] is a fun one though, as it includes a few extra symbols between upper and lowercase.
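
A quick check in Python, whose `re` module treats a range as a span of code points (other engines or encodings may differ, so take this as one data point). The variable names are mine:

```python
import re

# Code points strictly between 'Z' (90) and 'a' (97).
between = [chr(c) for c in range(ord("Z") + 1, ord("a"))]

# Which of those does [A-z] accept?
matched = [ch for ch in between if re.fullmatch(r"[A-z]", ch)]

# The intended class, with the two ranges written separately.
letters_only = re.compile(r"[A-Za-z]")
```

All six characters in the gap get matched by [A-z], while [A-Za-z] rejects them.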
1718627440 · 25m ago
Does it? I thought Regex are defined on character classes not on numeric ASCII values. What would a Regex do on a different encoding then?
koromak · 3h ago
Does anyone truly understand all the little edge cases with CSS?

I've written tons and tons of CSS, and have done so for a decade. I don't sit and think about the exact interactions, I just know a couple things that might work if I'm getting something unexpected.

I don't really see it possible to commit that to memory, unless I literally start working on an interpreter myself.

yurishimo · 3h ago
I think there can be a different way to think about CSS that can help with that feeling of never understanding it all. Recently I’ve heard people influential in the CSS world describe it as a “suggestion” to the browser. The browser has its own styles, the user might have some custom stylesheet on top of the browser’s version, extensions, etc etc and at some point CSS is really more a long list of “suggestions” about how the site should look.

If you embrace that idea to the fullest, you can create some interesting designs/patterns that can be more resilient. The “downside” is that this way of writing css will likely make the pixel perfect head of the marketing department hate you unless they also write code.

I think it’s also okay to say that some ways of writing css just aren’t relevant anymore. A good parallel in my mind is building construction and general carpentry. These days, a quick 2x4 stud wall or insulated concrete forms is fast, cheap, and standardized around the world. However, many craftspeople still exist that will create beautiful joinery for what is ultimately a simple thing, but we can appreciate that art standalone. With CSS, I don’t suspect we will ever need to go back to floats or crazy background images or whatever but it’s nice that those tools are still there for not only the sake of back compat, but also as a way to tinker and “craft” something bespoke for a special project or just because you like it. Education will eventually catch up and grid and flexbox will keep gaining popularity until we decide that it’s too complicated and come up with some new algorithm. That can all be true though and you can bring value as a developer without knowing every single aspect of the public API.

1718627440 · 29m ago
But you need to, you know, actually float something in a text. I think to do it with flexbox/grid you need JS that calculates heights and then manually splits the text into boxes with heights, so essentially you are doing rendering.

Also is there another way to position boxes side-by-side in an inline context without float?

upghost · 3h ago
> Unset variables. If DIR is unset, rm -rf $DIR/ becomes rm -rf /. Using set -u can make bash error when encountering unset variable.

sweet mercy :O

Someone call the Inquisition

AnimalMuppet · 2h ago
Instead, say

  rm -rf $DIR
That is, skip the trailing slash. Then if $DIR is not set, it becomes an invalid command, because no file names were supplied.
Terr_ · 1h ago
Better to make the requirement explicit, instead of relying on the argument-parsing details of rm or some other command:

    # Default message
    $ rm -rf "${DIR:?}"
    bash: DIR: parameter null or not set

    # Custom message
    $ rm -rf "${DIR:?It is not set OMG}"
    bash: DIR: It is not set OMG
QuadmasterXLII · 8h ago
CSS and C++ both have the “pick a subset and enforce that, or suffer” nature. On my to-do list: make a github action that requires manual override to merge any pull request with a css attribute not already present
dschuessler · 3h ago
I am unsure how this is supposed to work for CSS. To my knowledge, most CSS properties cannot be substituted for each other. If the subset to be enforced is "CSS properties already present", what is a developer supposed to do if their CSS property is not already present? Change the design?
QuadmasterXLII · 2h ago
Well, (like C++) new css attributes are constantly added. This means you constantly have to choose between the old way or the new way: either is fine, but “pick old or new at random on a per pull request basis” isn’t.
dschuessler · 2h ago
You seem to assume that old CSS properties can be substituted for new ones. But as I said, to my knowledge this isn’t possible in most cases. Can you give an example of two CSS properties where 'either is fine, but only one should be used'?

Or do you mean something else altogether by 'CSS attributes'?

QuadmasterXLII · 2h ago
The specific case that inspired this comment was a random mix of margin and gap
bradfitz · 4h ago
> Golang use UTF-8 for in-memory string.

Nope. It’s just bytes with no encoding.

https://go.dev/blog/strings

ivanjermakov · 3h ago
There is no such thing as "just bytes" when it comes to Unicode. UTF-8 is a way to represent Unicode codepoints in binary.

But I agree that the author's statement is wrong. Go strings are equivalent to byte slices.