Let's Encrypt stopped its certificate expiration email notification service a while ago, and I hadn't found a replacement yet. As a result, I didn't receive an expiration notice this time and failed to renew my certificate in advance. The certificate expired today, making my website inaccessible. I logged into my VPS to renew it manually, but the process failed every time. I then checked my cloud provider's platform and saw a notification at the top, which made me realize the problem was with the certificate provider. A quick look at Hacker News confirmed it: Let's Encrypt was having an outage. I want to post this news on my website, but I can't, because my site is down due to the expired certificate.
mixdup · 9h ago
They have been communicating the end of the email notices for quite a while, and have been telling users to have some other monitoring in place to avoid exactly this situation.
andrewmcwatters · 8h ago
Yes, but what’s weird is the recommended service they referred people to for new email notifications was not… sending me emails.
So, what gives?
kevincox · 8h ago
Yeah, the recommended service is awful and not nearly as useful as the one they had.
Which is disappointing because you should be able to recreate the service they had nearly exactly with certificate transparency logs.
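As a sketch of that idea: pull a domain's logged certificates from a CT log frontend and warn when the newest one is close to expiry. This assumes crt.sh's public JSON endpoint and output format, and example.com is a placeholder:

    import json
    import urllib.request
    from datetime import datetime, timezone

    DOMAIN = "example.com"   # placeholder
    WARN_DAYS = 14

    # crt.sh exposes CT log data as JSON; the "not_after" field and its
    # ISO format are assumptions based on its current output.
    url = f"https://crt.sh/?q={DOMAIN}&output=json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        entries = json.load(resp)

    # The newest expiry across all logged certs is when coverage runs out.
    latest = max(datetime.fromisoformat(e["not_after"]) for e in entries)
    days_left = (latest.replace(tzinfo=timezone.utc) - datetime.now(timezone.utc)).days

    if days_left < WARN_DAYS:
        print(f"WARNING: newest logged cert for {DOMAIN} expires in {days_left} days")
    else:
        print(f"OK: {DOMAIN} covered for another {days_left} days")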
CamperBob2 · 9h ago
Also, beware of the leopard.
jojobas · 8h ago
If you didn't see their sunset notification emails you wouldn't have seen your cert expiration email either.
Jedd · 7h ago
> Let's Encrypt stopped its certificate expiration email notification service a while ago, and I hadn't found a replacement yet.
This sounds like an easy problem to identify root cause for.
I think I received about 15 'we're disabling email notifications soon' emails over the past several months - one of which was interesting, but none were needed, as I'd originally set this up, per documentation, to auto-renew every 30 days.
Perhaps create a calendar reminder for the short term?
attentive · 1h ago
Monitoring the health of your site is your job.
You should have it on auto-renewal anyway.
You can grab a cert from ZeroSSL and probably some others.
You can also get a 1-year cert from AWS for like $15, though I'd stick with auto-renews.
Haven't they always, from day one, insisted that their primary goal was to encourage (force) automation of certificate maintenance, as a mechanism to make TLS ubiquitous (mandatory everywhere)?
ffsm8 · 3h ago
Yes, we had lengthy discussions in IT ops (I had an admin role when LE was launched) about it.
The team lead couldn't get over the slogan "devops, automating downtimes since 2010" whenever someone wanted to add a new nonessential automation that does things on prod servers.
I mean, he wasn't completely wrong. It was a non-essential automation with high risk and very little reward (<1h saved every 2 yrs), which is why we never switched to LE for our main site; only internal tooling was allowed to use it.
cpach · 3h ago
Perhaps you know this already but in the future, certs issued by a “real” CA will not be allowed to live for more than 47 days.
https://www.digicert.com/blog/tls-certificate-lifetimes-will...
I was merely retelling an anecdote about how LE was always positioned to be exclusively about refreshing certs automatically, though. I moved out of (dev-)ops roles around 2016/2017, so I'm really not up to date with operations topics.
Because you're not supposed to rely on emails. You should have an automated certificate renewal in place. I'm under the impression that Let's Encrypt wants to reduce certificate validity even further from the current 90 days.
compumike · 9h ago
Oof, you're right, that's rough that it's so soon after they discontinued their email service!
I wrote this blog post a few weeks ago: "Minimal, cron-ready scripts in Bash, Python, Ruby, Node.js (JavaScript), Go, and Powershell to check when your website's SSL certificate expires." https://heiioncall.com/blog/barebone-scripts-to-check-ssl-ce... which may be helpful if you want to roll your own.
(Disclosure: at Heii On-Call we also offer free SSL certificate expiration monitoring, among other things.)
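If you just want the bare minimum, something like this stdlib-only Python sketch (not taken from the post above; the host and threshold are placeholders) is enough to run from cron:

    import socket
    import ssl
    import sys
    from datetime import datetime, timezone

    HOST = "example.com"   # placeholder
    WARN_DAYS = 21

    ctx = ssl.create_default_context()
    with socket.create_connection((HOST, 443), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
            cert = tls.getpeercert()

    # 'notAfter' looks like 'Jun 10 12:00:00 2026 GMT'
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc)
    days_left = (expires - datetime.now(timezone.utc)).days
    print(f"{HOST}: certificate expires in {days_left} days")
    sys.exit(1 if days_left < WARN_DAYS else 0)   # non-zero exit is what cron/monitoring alerts on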
mr_toad · 8h ago
Isn’t the recommended practice to update every ~60 days or so, regardless?
cpach · 2h ago
Either that, or use an ACME client that has support for ARI so that the CA can signal to the client when it’s time to renew.
https://letsencrypt.org/2024/04/25/guide-to-integrating-ari-...
> As a result, I didn't receive an expiration notice this time and failed to renew my certificate in advance.
Shouldn't that happen automatically a bit beforehand?
Kholin · 9h ago
Due to some legacy reasons, my service runs using a docker + nginx setup. However, certbot was initially used in its native nginx mode to generate the certificate, which prevented it from auto-renewing. I later switched it to standalone mode, but I'm not sure if I configured the auto-renewal correctly. In any case, the certificate happened to expire today, and it didn't renew automatically. On a side note, I was actually planning to see what an expired website certificate looked like first and then deal with the auto-renewal issue. After all, it's just a small hobby website, so it's not that big of a deal.
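If it helps, one common pattern for standalone mode behind Docker is to let cron run `certbot renew` and free up port 80 around it. A rough sketch, where "nginx" is a hypothetical container name and renew reuses the standalone authenticator recorded at issuance:

    # /etc/cron.d/certbot-renew -- certbot only renews certs within ~30 days
    # of expiry, so twice a day is cheap; "nginx" is a hypothetical container
    # name, and port 80 must be free while the standalone challenge runs.
    0 3,15 * * * root certbot renew --pre-hook "docker stop nginx" --post-hook "docker start nginx" >> /var/log/certbot-renew.log 2>&1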
dylan604 · 8h ago
That sounds like a "you're holding it wrong" type of situation to me. A major point of Let's Encrypt (besides the obvious "free") is that it deliberately keeps cert lifetimes short to avoid the "someone who no longer works here set this up two years ago" type of situation, with certbot checking twice a day and updating when necessary. So breaking what Let's Encrypt is doing by not using certbot definitely feels like you're holding it wrong.
rfv6723 · 7h ago
I use self-hosted gatus to monitor my certs and other services' status.
It can send alerts to multiple alerting providers.
https://github.com/TwiN/gatus
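For reference, a cert check in gatus looks roughly like this, based on its documented [CERTIFICATE_EXPIRATION] condition (endpoint and thresholds are placeholders):

    # gatus endpoint sketch: fail the check when the cert has <30 days left
    endpoints:
      - name: my-site
        url: "https://example.com"            # placeholder
        interval: 1h
        conditions:
          - "[STATUS] == 200"
          - "[CERTIFICATE_EXPIRATION] > 720h" # 720h = 30 days
        # plus an `alerting:` section and per-endpoint `alerts:` for your provider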
If it's a personal website you should consider HTTP+HTTPS. It offers the best of both worlds and your website would always be accessible even if some third party CA is not (or if there's some local issue, or if the HTTP client connecting has cert issues). MITM attacks on personal websites are extremely, extremely rare.
greyface- · 9h ago
Good time to note that Buypass offers free certificates over ACME. I have a few of my domains configured to use them instead of LetsEncrypt, just for redundancy and to ensure I have a working non-LE cert source in case LE suffers problems like this over a longer time period.
Example OpenBSD /etc/acme-client.conf:
    authority buypass {
            api url "https://api.buypass.com/acme/directory"
            account key "/etc/acme/buypass-privkey.pem"
            contact "mailto:youremail@example.com"
    }
    domain example.com {
            domain key "/etc/ssl/private/example.com.key"
            domain full chain certificate "/etc/ssl/example.com.pem"
            sign with buypass
    }
CGamesPlay · 8h ago
This is neat. Does cert-manager have facilities to automatically use a fallback ACME provider, so I could automate using this? I'd also accept a pool of ACME providers, but a priority ordering seems ideal. I don't see the functionality listed anywhere, maybe there's some security argument that this is a bad idea?
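As far as I know cert-manager has no built-in fallback ordering; the closest thing is defining a second (Cluster)Issuer for another ACME CA and switching a Certificate's issuerRef to it when needed. A rough sketch using the Buypass directory URL from the sibling comment; the solver details are assumptions about your cluster:

    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: buypass
    spec:
      acme:
        server: https://api.buypass.com/acme/directory
        email: youremail@example.com          # placeholder
        privateKeySecretRef:
          name: buypass-account-key
        solvers:
          - http01:
              ingress:
                class: nginx                  # assumes an nginx ingress controller
    # then point a Certificate's issuerRef at it:
    #   issuerRef: { name: buypass, kind: ClusterIssuer }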
attentive · 1h ago
Caddy will auto-issue/renew with LE or ZeroSSL depending on availability.
ninjin · 8h ago
Cheers! They look like decent chaps, and they're also outside the US, for some additional certificate diversity. Are there other trustworthy ACME issuers out there?
A pity that acme-client(1) does not allow for fallbacks, but I will add a mental note about it being an easy enough patch to contribute if I ever find the time.
grodriguez100 · 2h ago
ZeroSSL works very well for me. I found it because it is now the default for the acme.sh client.
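For anyone curious, switching CAs in acme.sh is roughly the following (flags as documented; paths and email are placeholders):

    # ZeroSSL wants an account email the first time
    acme.sh --register-account -m me@example.com --server zerossl
    # issue via ZeroSSL (acme.sh's current default CA) using webroot validation
    acme.sh --issue -d example.com -w /var/www/example.com --server zerossl
    # or pin the default CA back to Let's Encrypt
    acme.sh --set-default-ca --server letsencrypt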
pgporada · 9h ago
It's DNS, we're working on it. Sorry, thank you for bearing with us.
Titan2189 · 9h ago
It's not DNS
There's no way it's DNS
It was DNS
deadbabe · 9h ago
Five stages of DNS outage:
1. Denial: It’s not DNS.
2. Anger: What the fuck is it!
3. Bargaining: Maybe it’s a firewall, or Cloudflare!
4. Depression: We’ve checked everything…
5. Acceptance: It’s DNS.
trinsic2 · 6h ago
LOL. I just went through this the other day. My site was intermittently inaccessible. DNS was the last thing I thought it was, until I ran a crawler on my site and spotted some 404 errors. Found that my non-www URL was pointed at the wrong IP and I forgot to update it when I transferred my domain to a new host.
woleium · 9h ago
That ttl is a killer, eh?
senectus1 · 8h ago
Whoa whoa whoa.. slow down! You don't just leap to "It's DNS"... you have to try to blame everything else first before you get to DNS. It's like foreplay!
dylan604 · 8h ago
When all of the interns have jumped around the corner before the blame hammer was wielded, you have to move to the next item on the list.
adamcharnock · 8h ago
It's always either DNS or MTU.
(Or, as I recently encountered, it can also be a McAfee corporate firewall trying to be helpful by showing a download progress bar in place of an HTTP SSE stream. I was sure that was being caused by MTU, but alas no.)
jasonthorsness · 9h ago
Mostly this should be a non-event due to renewal long before expiration? Although it's a huge deal, I suppose, for services that require issuing new certificates constantly; Let's Encrypt would be a major failure mode for them.
As they move to shorter-lifetime certs (6 days now https://letsencrypt.org/2025/01/16/6-day-and-ip-certs/?utm_s...) this puts it in the realm of possibility that an incident could impact long-running services.
I encountered this while trying to issue a new certificate for a service. As a temporary fix, I started using ZeroSSL, which conveniently also supports the ACME protocol. While not a big problem, if you have something like `cert-manager` in use on Kubernetes, it requires quite a bit of reconfiguration, and you may spend a couple of hours trying to figure out why a certificate hasn't been issued yet.
That said, I'm unbelievably grateful for the great product (and work!) LetsEncrypt has provided for free. Hope they're able to get their infrastructure back up soon.
jasonthorsness · 9h ago
Let's Encrypt was a huge deal right from the beginning. They truly moved the web forward.
It truly was a radical advancement. Makes me wonder: a decade from now, what will we look back on with a similar perspective?
pepa65 · 2h ago
From the announcement:
Subscribers will be able to opt in to short-lived certificates via a certificate profile mechanism being added to our ACME API.
We hope to make short-lived certificates generally available by the end of 2025.
The earliest short-lived certificates we issue may not support IP addresses, but we intend to enable IP address support by the time short-lived certificates reach general availability.
sugarpimpdorsey · 9h ago
I'm sure those six-day lifetime certificates will work out real nice.
ocdtrekkie · 9h ago
I think I am going to become a fan of shorter certificate lifetimes because as soon as the chuckleheads in the CAB truly break the Internet on the level they are pushing for, the sooner we get to discard the entire PKI dumpster fire.
NooneAtAll3 · 9h ago
what's the alternative to PKI?
haiku2077 · 8h ago
Certainly something a hell of a lot simpler than X.509 - and without assumptions from the 1990s hardcoded into it.
cpach · 8m ago
Is it really X.509 that is the big “problem”? If so, I fail to see how.
We're seeing a lot of downstream effects of this at StatusGator. Of course any provider that relies on LetsEncrypt to issue certs (such as Heroku) is affected.
One notable exception is Cloudflare: They famously no longer rely solely on LetsEncrypt.
wnevets · 9h ago
This is the first time I remember something like this ever happening with LetsEncrypt
bethekidyouwant · 10h ago
I hope no one was migrating infra EOD West Coast
jaeh · 9h ago
Can't be it, it's not Friday.
keysdev · 9h ago
Shall we have some way of freely encrypting the web that is relying on one authority?
Especially something that needs to be renewed every 90 days, or is it 40 now? How about issuing 100-year certificates as a default?
kyrra · 9h ago
Many of the cloud providers give free certs via ACME.
https://cloud.google.com/certificate-manager/docs/public-ca-... (EDIT: Google is their own CA, with https://pki.goog/ )
The browsers and security people have been pushing towards shorter certs, not longer ones. Knowing how to rotate a cert every year, if not shorter, helps when your certificate or any of your parent certs are compromised and require an emergency rotation.
CGamesPlay · 8h ago
Does AWS provide something similar? I found ACM "exportable certificates", but that involves AWS managing your private key.
schoen · 8h ago
Last I knew, AWS would issue a free certificate to people using certain AWS services, but, as you say, only if Amazon is managing the private key. You can also use ACM APIs to import keys and certificates from other CAs.
RiverCrochet · 8h ago
Long expiration times = compromised certs that hang around longer than they should. It's bad.
Note that you can make your own self-signed CA certificate, create any server and client certificates you want signed with that CA cert, and deploy them whenever and wherever you want. Of course you want the root CA private key securely put somewhere and all that stuff.
The only reason it won't work at large without a bit of friction is because your CA cert isn't in the default trusted root store of major browsers (phone and PC). It's easy enough to add it - it does pop up warnings and such on Windows, Android, iOS and hopefully Mac OS X, but they're necessary here.
No, it's not going to let the whole world do TLS with you warning-free without doing some sort of work, but for small scales (the type that Let's Encrypt is often used for anyway) it's fine.
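For the record, the whole private-CA dance is a handful of openssl commands. A minimal sketch; names and lifetimes are placeholders, and the SAN on the last step is what modern browsers insist on:

    # 1. CA key + self-signed CA certificate (keep ca.key somewhere safe/offline)
    openssl req -x509 -newkey rsa:4096 -sha256 -days 3650 -nodes \
        -keyout ca.key -out ca.crt -subj "/CN=My Private CA"

    # 2. Server key + certificate signing request
    openssl req -newkey rsa:2048 -nodes -keyout server.key -out server.csr \
        -subj "/CN=internal.example.com"

    # 3. Sign the CSR with the CA, adding the SAN that browsers require
    openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
        -sha256 -days 365 -out server.crt \
        -extfile <(printf "subjectAltName=DNS:internal.example.com")

    # 4. Import ca.crt into each client's trust store (OS/browser specific)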
Marsymars · 9h ago
> Shall we have some way of freely encrypting the web that is relying on one authority?
Caddy uses ZeroSSL as a fallback if Let’s Encrypt fails!
gregsadetsky · 9h ago
But it's not on by default, right..? (i.e. is there a particular config needed for that?)
I'm using Caddy here and it's not falling back on ZeroSSL. Thanks for your help
EDIT: hmm, it should be automatic...! https://caddyserver.com/docs/automatic-https#issuer-fallback interesting, I'll double check my config
woah... it's probably related to this! https://github.com/caddyserver/caddy/issues/7084 TLDR: "Caddy doesn't fall back to ZeroSSL for domains added using API" (which is my case)
This is largely not an issue thanks to ACME which they spearheaded. You can use multiple providers as backup options.
Also, you have days to weeks of slack time for renewals. The only real impact is trying to issue new certs if you are solely dependent on LE.
Dylan16807 · 9h ago
Revocation doesn't work well, so we're simplifying and relying on expiration for that. So no to the super long certs.
0xbadcafebee · 9h ago
The bigger question that's going unasked: what the hell is the point of an expiration date if it keeps getting shorter? At some point we will refresh the cert every second.
The whole point of the expiration is in case a hacker gets the private key to the cert and can then MITM, they can keep MITMing successfully until the cert the hacker gives to the clients expires (or was revoked by something like OCSP, assuming the client verifies OCSP). A very long expiration is very bad because it means the hacker could keep MITMing for years.
The way things like this work with modern security is ephemeral security tokens. Your program starts and it requests a security token, and it refreshes the token over X time (within 24 hrs). If a hacker gets the token, they can attack using it until 1) you notice and revoke the existing tokens AND sessions, or 2) the token expires (and we assume they for some reason don't have an advanced persistent threat in place).
Nobody puts any emphasis on the fact that 1) you have to NOTICE THE ATTACK AND REVOKE SHIT for any of these expirations to have any impact on security whatsoever, and 2) if they got the private key once, they can probably get it again after it expires, UNLESS YOU NOTICE AND PLUG THE HOLE. If you have nothing in place to notice a hacker has your private key, and if revocation isn't effective, the impact is exactly the same whether expiration is 1 second or 1 year.
How many people are running security scans on their whole stack every day? How many are patching security holes within a week? How many have advanced software designed to find rootkits and other exploits? Or any other measure to detect active attacks? My guess is maybe 0.0001% of you do. So you will never know when they gain access to your certs, so the fast expiration is mostly pointless.
We should be completely reinventing the whole protocol to be a token-based authorization service, because that's where it's headed. And we should be focusing more on mitigating active exploits rather than just hoping nobody ever exploits anything. But that would scare people, or require additional work. So instead we let like 3 companies slowly do whatever they want with the entire web in an uncoordinated way. And because we let them do whatever they want with the web, they keep introducing more failure modes and things get shittier. We are enabling the enshittification happening in front of our eyes.
schoen · 8h ago
The other benefit of expiration dates in a PKI is in case the subject information is no longer accurate.
In old-school X.509 PKI this might be "in case this person is no longer affiliated with the issuer" (for organizational PKI) or "in case this contact information for this person is otherwise no longer accurate".
In web PKI this might be "in case this person no longer controls this domain name" or "in case this person no longer controls this IP address".
The key-compromise issue you mention was more urgent for the web PKI before TLS routinely used ciphersuites providing forward secrecy. In that case, a private key compromise would allow the attacker to passively decrypt all TLS sessions during the lifetime of that private key. With more modern ciphersuites, a private key compromise allows the attacker to actively impersonate an endpoint for future sessions during the lifetime of that private key. This is comparatively much less catastrophic.
0xbadcafebee · 4h ago
TLS 1.0, 1.1 and 1.2 are still in use, despite 1.0 and 1.1 being deprecated, and only 1.3 requires forward secrecy. So any attacker that can MITM can just force a protocol that doesn't require forward secrecy.
In terms of "no longer controls this domain name", or "no longer controls this IP address", there are a raft of other issues related to this that expiration doesn't cover:
- Does the real domain owner still have a DNS record pointing to an IP address they no longer own? If yes, attacker that now has that IP can serve valid TLS.
- Does the attacker control either the registrar account, or the name server account, or can poison DNS, or an HTTP server, or an email server, or BGP? If yes, the attacker can make new certs.
There's so many holes in TLS it's swiss cheese. Expiration as security is like a cardboard box as a bulletproof vest. Yet that cardboard box is so bulky and cumbersome it makes normal life worse.
tialaramex · 8h ago
> The whole point of the expiration is in case a hacker gets the private key to the cert and can then MITM
Nope. So all that happened here is that you were wrong.
XorNot · 9h ago
You've always been able to do this. Whether it's useful to your clients has always been the problem.
In a practical sense you likely wouldn't like the alternatives, because for most people's usage of the internet there's exactly one authority which matters: the local government and its legal system - i.e. most of my necessary use of TLS is for ecommerce. Which means the ultimate authority is "are you a trusted business entity in the local jurisdiction?"
Very few people would have any reason to ever expand the definition beyond this, and fewer would have the knowledge to do so safely even if we provided the interfaces - i.e. no one knows what safety numbers in Signal mean, if I can even get them to use Signal.
progmetaldev · 9h ago
Maybe I'm misinterpreting this, but local government's legal system is not the "one authority which matters." What local government is able to keep up to date on TLS certificates?
Your users that visit your website and get a TLS warning are the authority to worry about, if you're running a business that needs security. Depending on what you're selling, that one user could be a gigantic chunk of your business. Showing your local government that you have a process in place to renew your TLS certificates, and that your provider was down, is most likely going to be more than enough to indemnify you for any kind of maliciousness or ignorance (ignorantia juris non excusat). Obviously, different countries/locations have varying laws, but I highly doubt you'd be held liable for such a major outage of a provider that is in such heavy use. Honestly, if you were held liable, or think you would be for this type of event, I'd think twice about operating from that location.
benlivengood · 8h ago
Hopefully the thundering herd when service is restored doesn't knock things offline again. I know LE designs for huge throughput (something like 3X total outstanding certificates in 24 hours, at one point) and the automated client recommendations for backoff are pretty good, but there will be a lot of manual applications/renewals I'm sure.
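The usual client-side answer is randomized exponential backoff, so retries don't arrive in lockstep when the CA comes back. A toy sketch of the idea, not what any particular ACME client actually does:

    import random
    import time

    def retry_with_jitter(attempt_renewal, max_tries=8, base=60, cap=3600):
        """Retry a renewal with full-jitter exponential backoff (seconds)."""
        for attempt in range(max_tries):
            if attempt_renewal():
                return True
            # full jitter: random delay up to min(cap, base * 2^attempt),
            # so a fleet of clients doesn't retry in lockstep
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
        return False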
bravetraveler · 4h ago
Well, that does it: certificate lifetimes are even shorter now.
If only the same zest applied to probes
adamsiem · 9h ago
I thought I got rate-limited. Bad timing to spin up a new service.
phillipseamore · 9h ago
I want DANE!
rocqua · 9h ago
That ship has sailed. DNSSEC is not liked even a little bit.
Given that control over DNS is how domain validated certs are handed out, it would make a lot of sense to cut out the middle man.
But DNS does not have a good reliable authenticated transport mechanism. I wonder if there was a way to build this that would have worked.
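For context, DANE pins the certificate (or its key) in DNS itself, which is why it stands or falls with DNSSEC. A TLSA record looks roughly like this; the hash is a placeholder for the SHA-256 of the server's public key:

    ; "3 1 1" = DANE-EE: match the server's SubjectPublicKeyInfo via SHA-256
    _443._tcp.www.example.com. IN TLSA 3 1 1 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef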
phillipseamore · 8h ago
My biggest problem is how centralized issuance is.
Half the year I live on an island that is reliant on submarine cables and has historically had weeks- and months-long outages, and with a changing world I suspect that might become reality once again. Locally this wasn't much of an issue: the ccTLD continues to function, and most services (now about 35%) are locally hosted. Then HTTPS comes along. Zero certificates could be (re-)issued during an outage. A locally run CA isn't really an option (standalone simply isn't feasible, and getting into root stores takes time and money), so you are left with teaching users to ignore certificate errors a few weeks into an extended outage.
I could see someone like LE working with TLD registrars to enable local issuance (with delegated/sub-CA certificates restricted to the TLD), that could also mitigate problems like today (decentralize issuance) and the registrars are already the primary source of truth for DV validation.
ocdtrekkie · 6h ago
Realistically there's no reason except Google retaining centralized control of the Internet for there to be a specific group of trusted CAs that meet Google's arcane specifications which can issue certificates the entire world trusts.
Your registrar should be able to validate your ownership of the domain, ergo your registrar should be your CA. Instead of a bunch of arbitrary and capricious rules to be trusted, a CA should not be "trusted" by the browser, but only able to sign certificates for domains registered to it.
phillipseamore · 6h ago
s/Google/Apple, Google, Microsoft, and Mozilla/
ocdtrekkie · 2h ago
Not in any realistic way, no. Chrome is by far the majority of the market, so what Google ships is what is available on the web. If Google unilaterally decides it is going to distrust a CA, it doesn't really matter who else does or not; the CA is dead.
Not that the other parties are that independent anyways: Microsoft's browser is a Google fork, and is wholly dependent on it. Mozilla's entire funding is Google. Apple is arguably the only somewhat independent party here, but that multibillion dollar annual search deal... let's say it incentivizes collaboration.
rob_c · 10h ago
Did the LLM delete this as well?
burnte · 9h ago
Either that or someone took Ambien last night, that seems to make people do crazy mistakes. ;)
88j88 · 9h ago
The response he received had a correction to the code that the user did not expect.