> A lot of boolean data is representing a temporal event having happened. For example, websites often have you confirm your email. This may be stored as a boolean column, is_confirmed, in the database. It makes a lot of sense.

> But, you're throwing away data: when the confirmation happened. You can instead store when the user confirmed their email in a nullable column. You can still get the same information by checking whether the column is null. But you also get richer data for other purposes.

So the Boolean should be something else + NULL?

Now we have another problem ...

buckle8017 · 16m ago

It should be a timestamp of the last time the email was verified.

It's a surprisingly useful piece of data to have.
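A minimal sketch of the nullable-timestamp idea, using Python's built-in `sqlite3` with an in-memory database. The table and column names (`users`, `email_confirmed_at`) are made up for illustration and do not come from the article.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL,"
    " email_confirmed_at TEXT"  # NULL means "not confirmed yet"
    ")"
)
conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO users (id, email) VALUES (2, 'b@example.com')")

# Confirming an email records *when* it happened instead of flipping a flag.
confirmed_at = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc).isoformat()
conn.execute(
    "UPDATE users SET email_confirmed_at = ? WHERE id = ?", (confirmed_at, 1)
)

# The boolean is still available: derive it from the column being non-NULL.
rows = conn.execute(
    "SELECT id, email_confirmed_at IS NOT NULL AS is_confirmed"
    " FROM users ORDER BY id"
).fetchall()
print(rows)
```

User 1 comes back as confirmed and user 2 as unconfirmed, and the stored value additionally answers "when?" for user 1.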

amelius · 6m ago

Even more useful is a log of all the changes in the database. This gives you what you want, and it would be automatic for any data you store.

So, keep the Boolean, and use a log.

OskarS · 7h ago

A piece of advice I read somewhere early in my career was "a boolean should almost never be an argument to a function". I didn't understand what the problem was at the time, but then years later I started at a company with a large Lua code-base (mostly written by one or two developers) and there were many lines of code that looked like this:

   serialize(someObject, true, false, nil, true)

What do those extra arguments do? Who knows; it's impossible to tell without looking at the function definition.

Basically, what had happened was that the developer had written a function ("serialize()", in this example) and then later discovered that they wanted slightly different behaviour in some cases (maybe pretty printed or something). Since Lua allows you to change arity of a function without changing call-sites (missing arguments are just nil), they had just added a flag as an argument. And then another flag. And then another.

I now believe very strongly that you should virtually never have a boolean as an argument to a function. There are exceptions, but not many.

StopDisinfo910 · 20m ago

Named arguments are a solution to precisely this issue. With optional arguments with default values, you get to do precisely what was being done in your Lua code, but with self-documenting code.

I personally believe very strongly that people shouldn’t use programming languages lacking basic functionalities.
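To make the named-arguments point concrete, here is a small sketch in Python (not Lua). The `serialize` function and its `pretty_print` and `sort_keys` options are hypothetical stand-ins for the function in the parent comment; the bare `*` in the signature is what forces callers to name each flag.

```python
import json

def serialize(obj, *, pretty_print=False, sort_keys=False):
    """Serialize obj to JSON; options must be passed by name."""
    indent = 2 if pretty_print else None
    return json.dumps(obj, indent=indent, sort_keys=sort_keys)

compact = serialize({"b": 1, "a": 2})
pretty = serialize({"b": 1, "a": 2}, pretty_print=True, sort_keys=True)

# serialize({"b": 1}, True, False) raises TypeError: the bare `*` makes
# positional flags impossible, so every call site names its options.
print(compact)
print(pretty)
```

The call sites now document themselves, although the number of possible flag combinations is unchanged.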

account42 · 6h ago

But this isn't really a boolean problem - even in your example there is another mystery argument: nil

And you can get the same problem with any argument type. What do the arguments in

  copy(objectA, objectB, "")

mean?

In general, you're going to need some kind of way to communicate the purpose - named parameters, IDE autocomplete, whatever - and once you have that then booleans are not worse than any other type.

OskarS · 6h ago

You're correct in principle, but I'm saying that "in practice", boolean arguments are usually a feature flag that changes the behavior of the function in some way instead of being some pure value. And that can be really problematic, not least for testing, where you now aren't testing a single function, you're testing a combinatorial explosion's worth of functions with different feature flags.

Basically, if you have a function that takes a boolean in your API, just have two functions instead with descriptive names.

hamburglar · 6h ago

> Basically, if you have a function that takes a boolean in your API, just have two functions instead with descriptive names.

Yeah right like I’m going to expand this function that takes 10 booleans into 1024 functions. I’m sticking with it. /s

Viliam1234 · 6m ago

Hopefully you could refactor it automatically into 1024 functions and then find out that 1009 of them are never called in the project, so you can remove them.

OrderlyTiamat · 5h ago

If your function has a McCabe complexity higher than 1024, then boolean arguments are the least of your problems...

8-prime · 6h ago

True, but I think it's worth noting that inferring what a parameter could be is much easier if it's something other than a boolean.

You could of course store the boolean in a variable and have the variable name speak for its meaning, but at that point you might as well just use an enum and do it properly.

For things like strings you either have a variable name - ideally a well describing one - or a string literal which still contains much more information than simply a true or false.
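A sketch of the "just use an enum" suggestion, in Python. `Layout` and this `serialize` signature are invented for the example; the point is only that the call site reads `Layout.PRETTY` rather than a bare `True`.

```python
from enum import Enum
import json

class Layout(Enum):
    COMPACT = "compact"
    PRETTY = "pretty"

def serialize(obj, layout):
    """Serialize obj to JSON using the requested layout."""
    indent = 2 if layout is Layout.PRETTY else None
    return json.dumps(obj, indent=indent)

text = serialize({"a": 1}, Layout.PRETTY)
print(text)
```

A third layout can later be added as another enum member without changing the function's arity.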

arethuza · 7h ago

If you use keyword arguments then something like that doesn't look too bad:

serialize(someObject, prettyPrint:true)

NB I have no idea whether Lua has keyword arguments but if your language does then that would seem to address your particular issue?

OskarS · 6h ago

Lua doesn't directly support keyword arguments, but you can simulate it using tables:

    serialize(someObject, { prettyPrint = true })

And indeed that is a big improvement (and commonly done), but it doesn't solve all problems. Say you have X flags, then there's 2^X different configurations you have to check and test and so forth. In reality, all 2^X configurations will not be used, only a tiny fraction will be. In addition, some configurations will simply not be legal (i.e. if flag A is true, then flag B must be as well), and then you have a "make illegal states unrepresentable" situation.

If the tiny fraction is small enough, just write different functions for it ("serialize()" and "prettyPrint()"). If it's not feasible to do it, have a good long think about the API design and if you can refactor it nicely. If the number of combinations is enormous, something like the "builder pattern" is probably a good idea.

It's a hard problem to solve, because there's all sorts of programming principles in tension here ("don't repeat yourself", "make illegal states unrepresentable", "feature flags are bad") and in your way of solving a practical problem. It's interesting to study how popular libraries do this. libcurl is a good example, which has a GAZILLION options for how to do a request, and you do it "statefully" by setting options [1]. libcairo for drawing vector graphics is another interesting example, where you really do have a combinatorial explosion of different shapes, strokes, caps, paths and fills [2]. They also do it statefully.

[1]: https://curl.se/libcurl/c/curl_easy_setopt.html

[2]: https://cairographics.org/manual/cairo-cairo-t.html
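One way to deal with the "some configurations are not legal" problem is to gather the flags into a single options value that checks itself on construction. This is only a sketch in Python: `SerializeOptions`, `colorize`, and the rule tying it to `pretty_print` are hypothetical. Note that it rejects illegal combinations at runtime rather than making them unrepresentable in the type.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SerializeOptions:
    pretty_print: bool = False
    # Hypothetical rule for illustration: colour output only makes sense
    # when pretty printing is on ("if flag A is true, flag B must be too").
    colorize: bool = False

    def __post_init__(self):
        if self.colorize and not self.pretty_print:
            raise ValueError("colorize requires pretty_print")

ok = SerializeOptions(pretty_print=True, colorize=True)

try:
    SerializeOptions(colorize=True)
    rejected = False
except ValueError:
    rejected = True

print(ok, rejected)
```

The validation lives in one place, so functions that accept `SerializeOptions` never see the illegal combination.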

lelanthran · 7h ago

It's a failing of many type systems of older languages (except Pascal).

The best way in many languages for flags is using unsigned integers that are bitwise-ORed together.

In pseudocode:

    Object someObject;
    foo (someObject, Object.Flag1 | Object.Flag2 | Object.Flag3);

Whatever language you are using, it probably has some namespaced way to define flags as `(1 << 0)` and `(1 << 1)` etc.
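In Python, for instance, the standard library's `enum.Flag` gives namespaced bit flags that combine with `|`. The flag names below are hypothetical; `auto()` assigns successive powers of two.

```python
from enum import Flag, auto

class SerializeFlag(Flag):
    PRETTY_PRINT = auto()   # 1 << 0
    SORT_KEYS = auto()      # 1 << 1
    ASCII_ONLY = auto()     # 1 << 2

def describe(flags):
    """Return the names of the flags that are set, in declaration order."""
    return [f.name for f in SerializeFlag if f in flags]

chosen = SerializeFlag.PRETTY_PRINT | SerializeFlag.ASCII_ONLY
names = describe(chosen)
print(chosen.value, names)
```

The combined value is still a single integer underneath, so it can be stored or passed around as one.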

arethuza · 7h ago

If you really need all of that I think I'd go with a separate object holding all of the options:

    options = new SerializeOptions();
    options.PrettyPrint = true;
    options.Flag2 = "red";
    options.Flag3 = 27;
    serialize(someObject, options);

vanviegen · 13m ago

So 1 line of C/C++ becomes 5 lines of Java/C#? That sounds about right! :-) Though I'm sure we can get to 30 if we squeeze in an abstract factory or two!

dandersch · 5h ago

It's always crazy to see languages like C being able to beat high-level languages at some ergonomics (which is usually their #1 point of pride) just because C has bitfields and they often don't.

0x3444ac53 · 3h ago

I think the answer to this (specific to lua) is passing a table as an argument that gets unpacked.

nutjob2 · 6h ago

> I now believe very strongly that you should virtually never have a boolean as an argument to a function. There are exceptions, but not many.

Really? That sounds unjustified outside of some specific context. As a general rule I just can't see it.

I don't see what's fundamentally wrong with it. What's the alternative? Multiple static functions with different names corresponding to the flags and code duplication, plus switch statements to select the right function?

Or maybe you're making some other point?

fifticon · 7h ago

The scope of TFA is data modelling, where it advises to use more descriptive data values, such as enums or happenedAtTimestamp.

However, personally I agree with the advice, in another context: Function return types, and if-statements.

Often, some critical major situation or direction is communicated with returned booleans. They will indicate something like 'did-optimizer-pass-succeed-or-run-to-completion-or-finish', stuff like that. And this will determine how the program proceeds next (retry, abort, continue, etc.)

A problem arises when multiple developers (maybe yourself, in 3 months) need to communicate about and understand this correctly.

Sometimes, that returned value will mean 'function-was-successful'. Sometimes it means 'true if there were problems/issues' (the way to this perspective, is when the function is 'checkForProblems'/verify/sanitycheck() ).

Another way to make confusion with this, is when multiple functions are available to plug in or proceed to call - and people assume they all agree on "true is OK, false is problems" or vice versa.

A third and maybe most important variant, is when 'the return value doesn't quite mean what you thought'. - 'I thought it meant "a map has been allocated".' - but it means 'a map exists' (but has not necessarily been allocated, if it was pre-existing).

All this can be attacked with two-value enums, NO_CONVERSION_FAILED=0, YES_CONVERSION_WAS_SUCCESSFUL=1 . (and yes, I see the peril in putting 0 and 1 there, but any value will be dangerous..)
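A sketch of the two-value-enum return type in Python. `ConversionResult` and `convert` are illustrative names; string values are used instead of 0 and 1 to sidestep the peril mentioned above.

```python
from enum import Enum

class ConversionResult(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"

def convert(text):
    """Try to parse text as an integer and say explicitly what happened."""
    try:
        int(text)
    except ValueError:
        return ConversionResult.FAILED
    return ConversionResult.SUCCEEDED

good = convert("42")
bad = convert("forty-two")
print(good, bad)
```

Callers have to compare explicitly (`result is ConversionResult.SUCCEEDED`), because plain `Enum` members are all truthy; an `if convert(x):` would silently accept failures.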

1718627440 · 5h ago

That's why you have coding style guides and documentation. Both choices are "correct", you just need to be consistent.

Fraterkes · 8h ago

I’m not a very experienced programmer, but the first example immediately strikes me as weird. The consideration for choosing types is often to communicate intent to others (and your future self). I think that’s also why code is often broken up into functions, even if the logic does not need to be modular / repeatable: the function signature kind of “summarizes” that bit of code.

Making a boolean a datetime, just in case you ever want to use the data, is not the kind of pattern that makes your code clearer in my opinion. The fact that you only save a binary true/false value tells the person looking at the code a ton about what the program currently is meant to do.

turboponyy · 7h ago

I actually completely agree with both the article and your point that your code should directly communicate your intent.

The angle I'd approach it from is this: recording whether an email is verified as a boolean is actually misguided - that is, the intent is wrong.

The actual things of interest are the email entity and the verification event. If you record both, 'is_verified' is trivial to derive.

However, consider if you now must implement the rule that "emails are verified only if a verification took place within the last 6 months." Recording verifications as events handles this trivially, whilst this doesn't work with booleans.

Some other examples - what is the rate of verifications per unit of time? How many verification emails do we have to send out?

Flipping a boolean when the first of these events occurs without storing the event itself works in special cases, but not in general. Storing a boolean is overly rigid, throws away the underlying information of interest, and overloads the model with unrelated fields (imagine storing say 7 or 8 different kinds of events linked to some model).
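The "verified only if a verification took place within the last 6 months" rule could be sketched like this in Python, assuming the verification events are available as a list of datetimes. The 182-day window and the function name are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical validity window; the comment says "6 months", approximated here.
VERIFICATION_TTL = timedelta(days=182)

def is_verified(verification_events, now):
    """True if the most recent verification is within the validity window."""
    if not verification_events:
        return False
    return now - max(verification_events) <= VERIFICATION_TTL

events = [datetime(2024, 1, 10), datetime(2025, 3, 1)]

fresh = is_verified(events, now=datetime(2025, 6, 1))  # 92 days after last event
stale = is_verified(events, now=datetime(2026, 1, 1))  # 306 days after last event
never = is_verified([], now=datetime(2025, 6, 1))
print(fresh, stale, never)
```

With only a stored boolean, the `stale` case cannot be distinguished from the `fresh` one.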

joshstrange · 6h ago

Normally you'd name the field `created_at`, `updated_at`, or similar which I think makes it very clear.

> Making a boolean a datetime, just in case you ever want to use the data, is not the kind of pattern that makes your code clearer in my opinion.

I don't follow at all, if your field is named as when a thing happened (`_at` suffix) then that seems very clear. Also, even if you never expose this via UI it can be a godsend for debugging "Oh, it was updated on XXXX-XX-XX, that's when we had Y bug or that's why Z service was having an issue".

bluGill · 8h ago

In the case of a database you often can't fix mistakes so overdesign just in case makes sense. Many have been burned.

hahn-kev · 6h ago

See always having a synthetic primary key

bsoles · 7h ago

This is such weird advice, and it seems to come from a particular experience of software development.

How about using Booleans for binary things? Is the LED on or off, is the button pressed or not, is the microcontroller pin low or high? Using Enums, etc. to represent those values in the embedded world would be a monumental waste of memory, where a single bit would normally suffice.

jilles · 7h ago

* led status: on, off, non-responsive

* button status: idle, pressing, pressed

I'm with you by the way, but you can often think of a way to use enums instead (not saying you should).

nh23423fefe · 43m ago

well yes. every boolean is iso to 2, and every 2 can be embedded in 3. and every N can be embedded in N+1

aDyslecticCrow · 6h ago

The boolean type is the massive waste, not the enum. A boolean in C is just a full int. So it's definitely not a waste to use an enum, which is also an int.

And usually you use operations to isolate the bit from a status byte or word, which is how it's also stored and accessed in registers anyway.

So it's still not a boolean type despite expressing boolean things.

Enums also help keep the state machine clear. {Init, on, off, error} capture a larger part of the program behavior in a clear format than 2-3 binary flags, despite describing the same function. Every new boolean flag is a two state composite state machine hiding edgecases.

leni536 · 7h ago

> Using Enums, etc. to represent those values in the embedded world would be a monumental waste of memory, where a single bit would normally suffice.

In C++ you can use enums in bit-fields, not sure what the case is in C.

kps · 7h ago

They're boolean (single bit of information) but not boolean (single bit interpreted as meaning true or false). The LED isn't true or false, the microcontroller pin isn't true or false.

bsoles · 7h ago

This is semantic pedantry. The association true/1/high and false/0/low is well-known and understood.

kps · 2h ago

Plenty of signals are asserted (true) by being brought low, or have 1=low (e.g. CAN).

marcellus23 · 6h ago

huh? The LED isn't true or false, but whether the LED is on is true or false.

simondw · 35m ago

And whether the LED is off is false or true.

padjo · 7h ago

I think it’s implicitly in the context of datastore design. In that context it feels like decent advice that would prevent a lot of mess.

taylodl · 8h ago

What I'm getting out of this is boolean shouldn't be a state that's durably stored, it's ephemeral, an artifact of runtime processing. You wouldn't likely durably store a boolean in an OLTP store, but your ETL into the OLAP store may capture a boolean to simplify logic for all the systems using the OLAP store to drive decision support. That is, it's an optimization. That feels right, but I've never really thought through this before. Interesting!

jbreckmckye · 8h ago

This makes intuitive sense because booleans are obviously reductive, as reductive as it gets (ideally stored in 1 bit), but for processing and analysis there's typically no reason to store data so sparingly

taylodl · 7h ago

For processing and analysis, you're centralizing the compute of complex analysis and storing the result so downstream decision support systems can use the result as a criterion in their analysis - and not have to distribute, and maintain, that logic throughout the set of applications. A contrived example: is_valued_customer. This is a simple boolean, but its computation can be involved and you wouldn't want to have to replicate and maintain this logic throughout all the applications. But at the same time, it likely has no business being in the OLTP store.

jbreckmckye · 7h ago

You might persist that value as an optimisation, but if you make it your source of truth, and discard your inputs, you better make sure you never ever ever ever have a bug in deriveValuedCustomer() or else you have lost data permanently

taylodl · 7h ago

Good point - you wouldn't want to discard your inputs. You're going to need them should you ever redefine deriveValuedCustomer() - which is likely for a system that will be in production for 10-20 years or more.

jbreckmckye · 8h ago

To summarise: booleans should be derived, not stored

bayindirh · 7h ago

I'll expand on the first example, the datetime one.

Many user databases use soft-deletes where fields can change or be deleted, so user's actions can be logged, investigated or rolled back.

When a user changes their e-mail (or adds another one), we add a row, and "verifiedAt" is now null. The user verifies the new email, so its time is recorded in the "verifiedAt" field.

Now, we have many e-mails for the same user with valid "verifiedAt" fields. Which one is the current one? We need another boolean for that (isCurrent). Selecting the last one doesn't make sense all the time, because we might have primary and backup mails, and the oldest one might be the primary one.

If we want to support multiple valid e-mails for a single account, we might need another boolean field "isPrimary". So it makes two additional booleans. isCurrent, isPrimary.

I can merge it into a nice bit field or a comma separated value list, but it defeats the purpose and wanders into code-golf territory.

Booleans are nice. Love them, and don't kick them around because they're small, and sometimes round.

dang · 1h ago

That boolean should probably be something else - https://news.ycombinator.com/item?id=44423995 - June 2025 (1 comment, but it's solid)

alphazard · 7h ago

The timestamps instead of boolean thing is something good engineers stumble upon pretty reliably. One gotcha is the database might be weird about indexing nulls. I'm not going to give an example because you should really read the docs for your specific database if this matters.

The ever growing set of boolean flags seems to be an attractor state for database schemas. Unless you take steps to avoid/prohibit it, people will reach for a single boolean flag for their project/task. Fortunately it's pretty easy to explain why it's bad with a counting argument. e.g. There are this many states with booleans, and this fraction are valid vs. this many with the enum and this fraction are valid. There is no verification, so a misunderstanding is more likely to produce an invalid state than a valid state.
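The counting argument can be run directly. In this Python sketch the three flag names and the "exactly one stage" rule are hypothetical; the code enumerates every combination of the booleans and counts how many satisfy the rule.

```python
from itertools import product

flags = ("is_draft", "is_published", "is_archived")  # hypothetical columns

def is_valid(state):
    """Hypothetical rule: a record is in exactly one lifecycle stage."""
    return sum(state) == 1

all_states = list(product((False, True), repeat=len(flags)))
valid_states = [s for s in all_states if is_valid(s)]

# 3 booleans can encode 8 states, but only 3 are meaningful;
# a 3-member enum can encode exactly those 3 and nothing else.
total, valid = len(all_states), len(valid_states)
print(total, valid)
```

Each additional flag doubles `total`, while the number of meaningful states typically grows by one.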

pixelfarmer · 7h ago

There can be verification for such things.

ck45 · 8h ago

One argument that I’m missing in the article is that with an enumerated type, states are mutually exclusive, while with several booleans, there could be some limbo state of several bool columns with value true, e.g. is_guest and is_admin, which is an invalid state.

cjs_ac · 8h ago

In that case, you set the enumeration up to use separate bit flags for each boolean, e.g., is_guest is the least significant bit, is_admin is the second least significant bit, etc. Of course, then you've still got a bunch of booleans that you need to test individually, but at least they're in the same column.

cratermoon · 7h ago

look up the typestate pattern.

mrheosuper · 7h ago

I don't like this pattern.

The author's example, checking if "Datetime is null" to check if the user is authorized or not, is not clear.

What if there are other fields associated with the login session, like login Location? Now you don't know exactly what field to check.

Or if you receive Null in the Datetime field, is it because the user has not logged in, or because there was a problem when retrieving the Datetime?

This is just micro-optimization for no good reason

monkeyelite · 7h ago

> Now you dont know exactly what field to check.

Yes you do - you have a helper method that encapsulates the details.

In the DB you could also make a view or generated column.

> This is just micro-optimization for no good reason

It’s conceptually simpler to have a representation with fewer states, and bugs are hopefully impossible. For example what would it mean for the bool authorized to be false but the authorized date time to be non-null?
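As a rough illustration of the view suggestion, here is a sketch using Python's built-in `sqlite3`. The `sessions` table, the `authorized_at` column, and the `session_status` view are hypothetical names; the view is the one place that knows which field to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (user_id INTEGER PRIMARY KEY, authorized_at TEXT)"
)
# The view encapsulates the "non-NULL timestamp means authorized" convention.
conn.execute(
    "CREATE VIEW session_status AS"
    " SELECT user_id, authorized_at IS NOT NULL AS is_authorized"
    " FROM sessions"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [(1, "2025-08-25T10:00:00"), (2, None)],
)
status = dict(conn.execute("SELECT user_id, is_authorized FROM session_status"))
print(status)
```

Queries that only need the yes/no answer read `is_authorized` from the view and never touch the timestamp directly.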

chikinpotpi · 7h ago

I generally prefer to let one value mean one thing.

Allowing the presence of a dateTime (UserVerificationDate for example) to have a meaning in addition to its raw value seems safe and clean. But over time in any system these double meanings pile up and lose their context.

Having two fields (i.e. UserHasVerified, UserVerificationDate) doesn't waste THAT much more space, and leaves no room for interpretation.

jerf · 7h ago

But it does leave room for "UserHasVerified = false, UserVerificationDate = 2025/08/25" and "UserHasVerified = true, UserVerificationDate = NULL".

The better databases can be given a key to force the two fields to match. Most programming languages can be written in such a way that there's no way to separate the two fields and represent the broken states I show above.

However the end result of doing that ends up isomorphic to simply having the UserVerificationDate also indicate verification. You just spent more effort to get there. You were probably better off with a comment indicating that "NULL" means not verified.

In a perfect world I would say it's obvious that NULL means not verified. In the real world I live in I encounter random NULLs that do not have a clear intentionality behind them in my databases all the time. Still, some comments about this (or other documentation) would do the trick, and the system should still tend to evolve towards this field being used correctly once it gets wired in to the first couple of uses.
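If both fields are kept, one way to force them to agree is a `CHECK` constraint (used here as a stand-in for the "key" mentioned above, which is an assumption about what was meant). The sketch uses Python's built-in `sqlite3`; column names follow the thread, the table name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " user_has_verified INTEGER NOT NULL,"
    " user_verification_date TEXT,"
    # The flag must agree with the presence of the date.
    " CHECK (user_has_verified = (user_verification_date IS NOT NULL))"
    ")"
)
conn.execute("INSERT INTO users VALUES (1, 1, '2025-08-25')")  # consistent
conn.execute("INSERT INTO users VALUES (2, 0, NULL)")          # consistent

def try_insert(row):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute("INSERT INTO users VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

out_of_sync_a = try_insert((3, 0, "2025-08-25"))  # flag false, date set
out_of_sync_b = try_insert((4, 1, None))          # flag true, date missing
print(out_of_sync_a, out_of_sync_b)
```

Both out-of-sync rows are rejected, which shows the boolean carries no information the date column did not already have.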

cratermoon · 7h ago

> Having two fields (i.e. UserHasVerified, UserVerificationDate)

What happens when they get out of sync?

Duanemclemore · 5h ago

APL and its descendants don't have booleans, just 0 and 1 [0]. Which is awesome. It allows for bitmasks, sums / reductions, and even conditionals via Iverson Brackets. [1]

[0] https://aplwiki.com/wiki/Boolean

[1] https://en.m.wikipedia.org/wiki/Iverson_bracket
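The same trick can be approximated in Python (not APL), where `True` and `False` behave as 1 and 0 in arithmetic. The temperature data is made up for the example.

```python
temperatures = [12, 31, 27, 35, 18]

# Each comparison yields True/False, which Python treats as 1/0 in arithmetic,
# so a sum over comparisons is a count (an Iverson-bracket style reduction).
hot_days = sum(t > 30 for t in temperatures)

# The same 0/1 values work as a mask: multiply to keep or zero out elements.
mask = [int(t > 30) for t in temperatures]
masked = [t * m for t, m in zip(temperatures, mask)]
print(hot_days, masked)
```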

coin · 7h ago

> But, you're throwing away data

Often it’s intentional for privacy. Record no more data than what’s needed.

zwieback · 1h ago

Maybe for the DB domain the author is talking about, but the nice thing about a bool is that it's true or false. I don't have to dig around documentation or look through the code to find the convention for converting an enum, datetime, etc. to true/false. 1970/1/1 (I was four years old then, just sayin), -6000 or something else?

Nullable helps a lot here but not all languages support that the same way.

eflim · 7h ago

I would add counters to this list. Start from zero (false), and then you know not just whether an event has occurred, but how many times.

the__alchemist · 7h ago

I read an article with the same premise here a few years ago.

A Boolean is a special, universal case of an enum (or whatever you prefer to call these choice types...) that is semantically valid for many uses.

I'm also an enum fanboy, and agree with the article's examples. Its conclusion of not using booleans because enums are more appropriate in some cases is wrong.

Some cases are good uses of booleans. If you find a Boolean isn't semantically clear, or you need a third variant, then move to an enum.

arethuza · 7h ago

I once, briefly, worked with a developer who believed that you should never use primitive types for fields or parameters...

fenesiistvan · 7h ago

I was hoping to read about bitfields or bit flags.

usernamed7 · 7h ago

replace "should" with "could".

I do think it's wise to consider when a boolean could be inferred from some other mechanism, but I also use booleans a lot because they are the best solution for many problems. Sure, sometimes what is now a boolean may need to become something later like an enum, and that's fine too. But I would not suggest jumping to those out of the gate.

Booleans are good toggles and representatives of 2 states like on/off, public/private. But sometimes an association, or datetime, or field presence can give you more data and said data is more useful to know than a separate attribute.

That boolean should probably be something else

Comments (63)