For something this small we can enumerate all the cases (this is a Scheme version of the Mathematica `Tuples` function):
> (list-tuples '(B G) 2)
((B B) (B G) (G B) (G G))
3 cases have at least one girl
of those 3 cases, 1/3 are both girls
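The same enumeration can be sketched in Python, where `itertools.product` plays the role of `list-tuples`. The variable names are illustrative, and the four ordered families are assumed equally likely.

```python
from fractions import Fraction
from itertools import product

# All ordered (older, younger) families, assumed equally likely.
families = list(product("BG", repeat=2))

# Condition on "at least one girl", then count the two-girl families.
at_least_one_girl = [f for f in families if "G" in f]
both_girls = [f for f in at_least_one_girl if f == ("G", "G")]

p_both_girls = Fraction(len(both_girls), len(at_least_one_girl))
print(families)
print(p_both_girls)  # 1/3
```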
sixo · 1h ago
It's not even worth mentioning this problem unless you talk about how the result depends on the data generating process. If you take it to be something like "you randomly sample from families with two children, discarding any without at least one girl", you get the 1/3 result, but there are various other ways to read a sampling process from the problem statement which lead to other results.
pontus · 55m ago
Just to pile on here, there's also ambiguity around how the observed girl is selected. Consider the following framing:
I go to a random house on a random street and knock on the door. A young girl opens the door. I ask how many siblings they have and they say one. What's the probability that they have a sister?
Now it's 50% even though cosmetically it seems like it'd be fair to say that the family has at least one daughter. The reason is that once I see a girl at the door, I'm slightly more confident that it's a GG household, since a GB or BG household would sometimes show a boy opening the door (assuming the two kids are equally likely to open the door).
P(GG | G at door) = P(G at door | GG) P(GG) / P(G at door)
P(G at door) = 1/2 (by symmetry)
So,
P(GG | G at door) = 1 * 1/4 * 2 = 1/2
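The Bayes computation above can be checked by brute force. This is a minimal sketch that enumerates every (family, door-opener) pair, assuming the four ordered families are equally likely and each child is equally likely to answer the door. All names are illustrative.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)

# Joint distribution over (family, index of the child who opens the door).
joint = {}
for family in product("BG", repeat=2):
    for opener in (0, 1):
        joint[(family, opener)] = Fraction(1, 4) * half

# Keep only outcomes where the child at the door is a girl.
girl_at_door = {k: p for k, p in joint.items() if k[0][k[1]] == "G"}
p_girl_at_door = sum(girl_at_door.values())
p_gg_and_girl = sum(p for (family, _), p in girl_at_door.items()
                    if family == ("G", "G"))

p_gg_given_girl_at_door = p_gg_and_girl / p_girl_at_door
print(p_girl_at_door, p_gg_given_girl_at_door)  # 1/2 1/2
```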
MontyCarloHall · 47m ago
This is the crux of the "paradox," which is really just an interpretation problem. Most people assume that the question asks exactly your scenario, i.e. if a specific child is selected and it's a girl, what's the probability that the sibling is also a girl? In that case, the event space is just GB or GG, and p(GG)/(p(GB) + p(GG)) = 0.5. (BG is not in the event space because we are conditioning on a specific child being a girl.)
However, if the question is interpreted as "what's the probability of having two girls if we know there aren't two boys," then the event space is GB, BG, GG, and p(GG)/(p(GB) + p(BG) + p(GG)) = 1/3. Both GB and BG are in the event space because we are not conditioning on the sex of one specific child.
the_gipsy · 1h ago
Why can you not frame it as: "a random family has been sampled, the sample family has two children, one of them is a girl"?
I.e. without "discarding", just giving some additional, but not complete, information on the random sample. Is adding information about the picked sample the same as discarding all contrarian samples? Why is this relevant?
AnotherGoodName · 55m ago
If there were two possible statements they asked
"a random family has been sampled, the sample family has two children, one of them is a girl"?
and
"a random family has been sampled, the sample family has two children, one of them is a boy"?
and they selected each statement based on randomly picking a child from a random family then the probability actually becomes 50% boy/girl for the next child since the boy/boy or girl/girl has twice the chance of generating the above statement for the respective gender compared to the mixed gender children family.
I.e. if they say one is a girl, that statement had a 50% chance of being generated by a girl/girl family (since we pick the statement based on a random selection of one of the two children's genders and there are 2 girls, doubling the chance of a statement that one's a girl coming from a girl/girl family), a 25% chance of being generated by a girl/boy family, and a 25% chance of being generated by a boy/girl family.
If you take 50% chance girl/girl, 25% chance boy/girl and 25% girl/boy you'll see there's a 50/50 chance of the next child being either gender.
All this due to changing how we sampled.
ndr · 1h ago
I took this to mean exactly that:
> Assume the family is selected at random because they have at least one girl.
And then again, if they sampled all families with 2 children the posterior would not change, would it?
Still assuming boy vs girls are completely iid and equally probable
two_handfuls · 1h ago
That's how I read it. What other ways were you thinking about?
bloak · 1h ago
Well, one way of getting families with two children, at least one of which is a girl, would be to go to a girls' school and ask the children to raise their hand if they have exactly one sibling.
aidenn0 · 1h ago
I would expect that would yield a 50% chance of the other being a girl, right?
jmount · 1h ago
Peter Winkler shares some great variations of this: "Boy Born on Tuesday" (p. xix) and "Men with Sisters" (p. xxii) in "Mathematical Puzzles".
"Mrs. Chance has two children of different ages. At least one of them is a boy born on Tuesday. What is the probability that both of them are boys?"
(note: it is a puzzle, not a biology or data demography problem. so there are 50/50 independence assumptions on gender and uniform day of week assumptions prior to adding the conditioning.)
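Under the assumptions in the note (independent 50/50 sex, uniform day of week), and taking the statement at face value as "at least one child is a boy born on a Tuesday", the answer can be found by enumerating all 196 ordered families. This is only a sketch of that one reading: the names below are illustrative, and a different story about how the statement was produced would give a different number.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)   # day-of-week indices
TUESDAY = 1       # hypothetical label; any fixed index gives the same answer

# Each child is a (sex, birth weekday) pair; all 14 combinations are assumed
# equally likely and the two children independent.
children = list(product("BG", DAYS))
families = list(product(children, repeat=2))  # 14 * 14 = 196 ordered pairs

def is_tuesday_boy(child):
    return child == ("B", TUESDAY)

conditioned = [f for f in families if any(is_tuesday_boy(c) for c in f)]
both_boys = [f for f in conditioned if all(sex == "B" for sex, _ in f)]

p_both_boys = Fraction(len(both_boys), len(conditioned))
print(len(conditioned), len(both_boys), p_both_boys)  # 27 13 13/27
```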
layer8 · 51m ago
Here “on Tuesday” is ambiguous, in my opinion. I first thought it meant “on a Tuesday” and that it was just a diversion. But it is likely intended to mean “last Tuesday” or “this Tuesday” (which excludes the boy-then-girl case). Wording it more clearly would likely reduce the ratio of wrong answers.
Furthermore, “of different ages” is likely intended to exclude the case of twins. However, even with twins, one is generally nominally older than the other. (Not to mention that it’s possible for two non-twin siblings to be the same age in years, at certain points in time.) Why not just say “that aren’t twins”?
I loathe when logic puzzles are obscured by ambiguous language, turning them more into “gotcha” text interpretation riddles than logic puzzles.
jmount · 5m ago
Puzzles are definitely odd birds. I myself have gotten into a literal screaming match trying to push my belief that they should never be used in interviews. The bulk of that was that an interviewer said the interviewee was "clearly confused when they were asked a puzzle" yet refused to agree that this may be evidence that the presentation of the puzzle was in fact confusing (and not measuring anything).
I can't speak for Winkler, but both he and Jaynes implicitly separate the reading of the puzzle from the work. Winkler starts his book with a few awful "reading trick" ones, but in the explanations gives a few reading directions to try and avoid that going forward. I happen to know he meant "on a Tuesday." But a correct solution to a different read would be a correct solution even if it doesn't match the book text. I don't think he was trying to set a text trap; it is just hard to be clear, concise, and unambiguous at the same time. (Even "on a Tuesday" isn't completely clear: it could mean "all I am telling you is that the day of the week was Tuesday" or "it was a very specific Tuesday that I am not telling you.")
stronglikedan · 42m ago
I agree with the first one, but age is measured in days at a minimum, so twins are always the same age. (I'm sure there are cases where they are born farther apart due to some issue with the pregnancy, but that is statistically insignificant here.)
layer8 · 34m ago
Even if you take days, one can be born at 23:58 and the other at 00:03 the next day. (And it could be New Year’s day — in some cultures that would even imply different ages in years.) Regardless of days, it’s not uncommon to talk about who is the older twin.
Of course colloquially twins are the same age, but we are talking about a mathematical puzzle about probabilities here, where precision is paramount.
jimmaswell · 48m ago
Why can't you just disregard the existing boy and reframe the question as the probability that the other child is a boy, and the space of all possible answers is BG and BB, equally probable (1/2)? Not really following explanations I find online.
joshuaissac · 14m ago
Because GB is also a possibility. You are not told that the existing boy is the elder child.
Rickasaurus · 1h ago
Great book, I highly recommend it too.
tocs3 · 2h ago
"This might seem abstract, but I've seen variations of this problem pop up in business and I've had difficult conversations with non-technical people as a result."
Does anyone have some real life examples? I cannot think of any offhand but would like to be able to cite a couple if someone says "So, what is this good for?".
tromp · 1h ago
Perhaps people would be more likely to give the correct answer if "at least one of them is a girl" is rephrased as the equivalent "the youngest is a girl or the oldest is a girl".
justonceokay · 1h ago
Well constructing a different question to remove the “trick” of the problem is one approach. Kind of the “no child left behind” approach to riddles.
jihadjihad · 50m ago
From the Wikipedia article linked in TFA:
> Following classical probability arguments, we consider a large urn containing two children.
I like how they modified a classic from probability texts, drawing items from an urn, and made sure it would be big enough in this example to accommodate two kids.
bitwize · 35m ago
This is better than an "assume a spherical cow" joke in the wild.
D13Fd · 43m ago
The "paradox" problem is in the setup. It's easy to mistake it as "a couple has one girl, what is the probability that their next child will be a girl," in which case the answer is 50%.
tantalor · 42m ago
If you allow misunderstanding the question, then any answer is allowed.
in_cahoots · 36m ago
But it's a valid point, the question is not well-posed. If you said, "I looked at both children and saw that at least one was a girl" more people would get the right answer. Many people will assume that the author looked at only one child, not both. And there's nothing in the wording to indicate either way.
As others are pointing out, this is just the Monty Hall problem. But the way the question is posed there is much clearer.
tantalor · 17m ago
I don't know how this could be made more clear:
"You're told that at least one of them is a girl"
> Many people will assume that the author looked at only one child
There is no mention of "looking"
Vermin2000 · 2h ago
The sisters paradox is a maddening example of counter-intuitive probability. The resolution is straightforward, but it's really easy to get tied up in knots.
EMM_386 · 31m ago
I don't understand at all how this is maddening or counter-intuitive.
When you have a child, the odds are ~50% ... so the chance the next child is a boy or girl is almost equal. Is it because of the way it's framed that makes people think harder than they need to be?
This is like when I (very rarely) play something like "pick six".
I play 1, 2, 3, 4, 5, and 6. People think I'm crazy. They don't realize I have the same odds as any ticket they purchase.
bell-cot · 1h ago
> maddening example of counter-intuitive probability.
Not how I'd describe it. The setup is mundane enough for people to just assume that their intuition will work fine. The difference between the naive and correct answers is too small to spot in a small-n dataset. And ~0% of the population is actually familiar with analyzing such situations, for their "intuition" to be applicable.
It's a bit like Gell-Mann amnesia - people are too quick to apply an easy cognitive strategy, when (in theory) they know enough to rule that strategy out.
spadros · 37m ago
Yes, I found this one easy. Was surprised my data management intuition came back after all these years since school. There’s really only three options:
- boy - boy
- boy - girl
- girl - girl
So it must be 1/3 chance. If you’re looking at permutations in order, that’s a different question.
hdgvhicv · 1h ago
> Select only the families that have at least one girl.
That’s not what the first question said. The first question was: select a family
(bb, bg, gb, gg)
and then learn that they happen to have a girl.
tantalor · 32m ago
I fail to see the distinction.
bitwize · 36m ago
Related to the Monty Hall "paradox". Spoiler: You'll get the car if you switch doors with 2/3 probability.
Is it? Would it change your intuition when I tell you "A couple has 100 kids, at least 99 are girls, what is the probability 100 are girls?"
I'm a bit at a loss I have to admit.
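For what it's worth, if "at least 99 are girls" is read as a plain filter over equally likely birth sequences (an assumption, and the same one that gives 1/3 in the two-child case), the 100-kid version can be worked out by counting. A minimal sketch:

```python
from fractions import Fraction
from math import comb

n = 100
# The number of equally likely birth sequences with exactly k girls is C(n, k).
all_girls = comb(n, n)        # 1 sequence
exactly_99 = comb(n, n - 1)   # 100 sequences: the lone boy can be any child

p_all_girls = Fraction(all_girls, all_girls + exactly_99)
print(p_all_girls)  # 1/101
```

With n = 2 the same count gives 1 / (1 + 2) = 1/3.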
luxcem · 1h ago
The 'sample space' reduction method is indeed also used to solve the Monty Hall problem.
flappyeagle · 37m ago
It is exactly the same
pmg101 · 1h ago
GG / BG GB GG is 1 / 3.
What's the paradox?
AnotherGoodName · 46m ago
Because it's entirely dependent on sampling assumptions. Go to a random house where there's two children, one of which randomly opens the door. Each bb, gg, bg, gb is equal probability and a random child opens the door.
Now if you see a boy disregard that since you can't make the statement that one is a girl.
If you see a girl, go ahead and make the statement "a family has two children. You're told that at least one of them is a girl."
What is the probability now?
You have twice the chance of making that statement if you encounter a gg family rather than a bg/gb family, since in a gg family either of the two girls could answer the door.
So there's a 50% chance of that statement coming from a gg family, a 25% chance of it coming from a bg family, and a 25% chance of it coming from a gb family. Which means a 50% chance the other child's a girl and a 50% chance the other child's a boy.
The probabilities here are entirely dependent on details of the sampling which is not made explicit here.
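A quick simulation makes the dependence on sampling concrete. This sketch runs both protocols over the same random families: "keep any family with at least one girl" versus "a random child opens the door and happens to be a girl". The function name, trial count, and seed are arbitrary choices for illustration.

```python
import random

def simulate(trials=200_000, seed=0):
    rng = random.Random(seed)
    filter_hits = filter_total = 0   # "at least one girl" used as a filter
    door_hits = door_total = 0       # a girl happens to open the door
    for _ in range(trials):
        family = [rng.choice("BG"), rng.choice("BG")]
        two_girls = family == ["G", "G"]
        if "G" in family:
            filter_total += 1
            filter_hits += two_girls
        if rng.choice(family) == "G":
            door_total += 1
            door_hits += two_girls
    return filter_hits / filter_total, door_hits / door_total

p_filter, p_door = simulate()
print(f"filter: {p_filter:.3f}  door: {p_door:.3f}")
```

The first estimate should land near 1/3 and the second near 1/2, up to sampling noise.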
teekert · 1h ago
Paradoxes don't exist in reality (they do in hypothetical situations), so there is indeed no paradox as you correctly observe. Instead, most people answer this wrongly, for some reason. And for some reason we call situations where this happens "a paradox". Though I agree that we shouldn't.
Edit, ok, there are things like "This statement is false.", but we should perhaps stick to "self-referential problems" with those.
I think paradoxes just exist in our theories, languages, and formal systems when we make flawed assumptions or create inconsistent frameworks. But physical reality itself just is what it is - no contradictions, just phenomena we sometimes struggle to describe accurately.
If contradictions (paradoxes) can exist, then anything becomes possible through the principle of "explosion in logic". From a contradiction, any statement can be "proven" true. The whole foundation of rational thought would be undermined. Right?
luxcem · 1h ago
The Medical Test Paradox, or whatever it's called, does exist in the sense that when a test is positive for a rare disease we always run a second one.
teekert · 1h ago
To me that is not a paradox, just logic: with very low incidence you simply find many more false positives than real positives. What is paradoxical about that?
It's why we don't screen for just any condition in the general population. I.e. we just do it for 65+ y/o's, 3 packs/day smokers because there we may actually find it worth the cost of the program.
There's no contradiction anywhere in this scenario, just people's incorrect intuitions meeting (mathematical) reality.
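To put rough numbers on the false-positive point, here is a small Bayes' rule sketch. The prevalence, sensitivity, and false-positive rate are hypothetical values picked only for illustration, not figures for any real test.

```python
from fractions import Fraction

# Hypothetical numbers chosen only for illustration.
prevalence = Fraction(1, 1000)          # 1 in 1000 people has the disease
sensitivity = Fraction(99, 100)         # P(test positive | disease)
false_positive_rate = Fraction(5, 100)  # P(test positive | no disease)

true_positives = prevalence * sensitivity
false_positives = (1 - prevalence) * false_positive_rate

# Bayes' rule: P(disease | positive test)
ppv = true_positives / (true_positives + false_positives)
print(ppv, float(ppv))
```

With these made-up inputs, only about 2% of positive results are real, which is why a second test gets run.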
hammock · 1h ago
Confusing permutations and combinations
pmg101 · 49m ago
Oh yes. Strange.
lightvector · 1h ago
One of the challenges with puzzles like this is that they give you "at least one of them is a girl" as a mathematical assertion where you're not supposed to further introspect the context of how/why you're being given that fact.
But that's unrealistic. In real life, the context for how and why there would be a speaker telling you such a thing in the first place can be relevant and affect the probability!
How is this possible? Suppose among all the math-riddle-loving parents of two children who would ask such a puzzle in the first place there are an equal number of parents of B-B, B-G, G-B, G-G, and that each is equally likely to ask you such a riddle when you meet them.
Suppose when asking such a riddle the B-B parents tell you "at least one of them is a boy" (they don't have any girls, so that's the only way they can ask this kind of riddle), the G-G parents tell you "at least one of them is a girl" (same thing but in reverse), while the B-G and G-B parents say one of "at least one of them is a boy" and "at least one of them is a girl" equally at random.
Then, conditioned on being told that "at least one of them is a girl", the probability of another girl is actually 1/2, not 1/3 like the paradox answer claims. To see this, imagine 40 instances of the above puzzle being asked. You get 10 B-B parents saying "at least one of them is a boy", 10 G-G parents saying "at least one of them is a girl", and among the 20 (B-G and G-B) parents, since they choose randomly, you have 10 saying "at least one of them is a boy" and 10 saying "at least one of them is a girl".
So out of the 20 times where "at least one of them is a girl" is said, there are 10 cases where it's a G-G family and 10 cases where it's a B-G or G-B family, therefore conditioned on being told "at least one of them is a girl", the probability of two girls is actually 1/2.
If there were some gender bias in how the B-G and G-B families might ask the question, or other differences that affect how likely different of these people would be posing the puzzle to you, then the probability could be yet different than either of 1/3 or 1/2.
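The effect of such a bias can be sketched with one parameter. Below, `q` is a hypothetical knob: the probability that a mixed (B-G or G-B) parent chooses to say "at least one of them is a girl". Everything else follows the setup above: four equally likely family types, G-G parents always say "girl", and B-B parents never do.

```python
from fractions import Fraction

def p_two_girls_given_told_girl(q):
    """q = probability that a mixed (B-G or G-B) parent says 'girl'."""
    quarter = Fraction(1, 4)
    told_girl_gg = quarter * 1          # G-G parents always say "girl"
    told_girl_mixed = 2 * quarter * q   # B-G and G-B parents say it w.p. q
    return told_girl_gg / (told_girl_gg + told_girl_mixed)

print(p_two_girls_given_told_girl(Fraction(1, 2)))  # 1/2
print(p_two_girls_given_told_girl(Fraction(1)))     # 1/3
print(p_two_girls_given_told_girl(Fraction(1, 4)))  # 2/3
```

Setting `q = 1/2` reproduces the 1/2 answer from the 40-parent count, and `q = 1` recovers the textbook 1/3.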
So there's a difference between being presented something as a flat mathematical assertion that you're supposed to take at face value and not question further (where the probability is 1/3, as the article claims), and being told something in real life, where you always need to take into account the context and situation of the speaker, and the probability could be different.
There are real life implications of this too - the big classic one being publication bias / newsworthiness bias. As most people intuitively know by now, it is also often wrong to take the statistical analysis or claims of a particular research study or paper entirely at face value, because there is a bias in the fact that "positive" and "exciting" results are more likely to be reported in the first place, and so statistical outliers that aren't actually replicable are disproportionately likely to be reported (see also https://xkcd.com/882/). And publication bias still occurs with respect to the reporting of results, amplification or not in the media etc, even when the authors themselves are trustworthy and have done their analysis within the paper in a statistically proper way. So conditioned on you hearing about the result in the first place, it is often less likely to be true (and less likely to replicate in the future, etc) than you would think if you just took the statistical analysis in the paper at face value, even when that analysis was done correctly. The situation in the "sisters paradox" of computing a probability taking a statement entirely at logical face value is rare in real life.
fkyoureadthedoc · 1h ago
> a family has two children. You're told that at least one of them is a girl. What's the probability both are girls?
https://en.m.wikipedia.org/wiki/Monty_Hall_problem
> A simpler question
> Let's image you're asked a simpler question.
> A family has two children. What's the probability both are girls?