
More likely that their core database hit some scaling limit and fell over. Their status page talks constantly about them working with their "upstream database provider" (presumably AWS) to find a fix.

My guess: they use AWS-hosted PostgreSQL, autovacuum fell permanently behind without them noticing and now can't keep up with organic growth, and they can't scale vertically because they already maxed that out. So they have to do crash migrations of data off their core DB, which is why it's taking so long.

esafak · 11h ago

If so it is probably a good time to apply for an SRE position there unless they really do not get it!

acedTrex · 11h ago

An outage of this magnitude is almost ALWAYS the direct and immediate fault of senior leadership's priorities and focus. Pushing too hard in some areas, not listening to engineers on needed maintenance tasks, etc.

AsmodiusVI · 9h ago

And engineers never are the cause of mistakes? There can't possibly be any data to back up that major outages are more often caused by leadership. I've been in SIEs simply because someone pushed a network outage to a switch network. Statements like these only go to show how much we have to learn, humble ourselves, and stop blaming others all the time.

acedTrex · 8h ago

PROLONGED outages are a failure mode that, more often than not, requires organizational dysfunction to happen.

AlotOfReading · 9h ago

Leadership can include engineers responsible for technical priorities. If you're down for that long though, it's usually an organizational fuck-up because the priorities didn't include identifying and mitigating systemic failure modes. The proximate cause isn't all that important and the people who set organizational priorities are by-and-large not engineers.

bravesoul2 · 7h ago

Think of airplane safety. I think it is similar. A good culture can make sure $root-cause is more likely detected, tested, isolated, monitored, easy to roll back and so on.

nusl · 12h ago

My sympathy for those in the mud dealing with this. Never a fun place to be. Hope y'all figure it out and manage to de-stress :)

mattbillenstein · 11h ago

We're sorry https://www.youtube.com/watch?v=9u0EL_u4nvw

Edit: an outage of this length smells of bad systems architecture...

hinkley · 11h ago

Prediction: Someone confidently broke something, then confidently 'fixed' it, with the consequence of breaking more things instead. And now they have either been pulled off of the cleanup work or they wish they had been.

bravesoul2 · 12h ago

Wow, >31h. I am surprised they couldn't rebuild their entire systems in parallel on new infra in that time. Can be hard if data loss is involved tho (a guess). Would love to see the post mortem so we all can learn.

stackskipton · 11h ago

I doubt it’s infra failure but software failure. Their bad design has caught up and they can’t toss more hardware for some reason. Most companies have this https://xkcd.com/2347/ in their stack and it’s fallen over.

wavemode · 12h ago

CEO's statement: https://www.reddit.com/r/webflow/comments/1mcmxco/from_webfl...

progbits · 12h ago

> 99.99%+ uptime is the standard we need to meet, and lately, we haven’t.

Four nines is not what I would be citing at this point. (That's less than an hour per year, so they burned that for the next three decades.)

Maybe aim for 99% first.

Otherwise a pretty honest and solid response, kudos for that!
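
The downtime arithmetic here can be sanity-checked with a rough sketch. It assumes a 365-day year and treats the incident as a single 31-hour outage (the figure from the thread title); the function names are made up for illustration and are not from any Webflow tooling.

```python
# Rough sketch: how much yearly downtime an availability target allows,
# and how many years of that budget a single long outage consumes.
# Assumptions: 365-day year, one continuous outage of 31 hours.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float) -> float:
    """Allowed downtime per year, in minutes, for an availability target in percent."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def years_of_budget_burned(outage_hours: float, availability_pct: float) -> float:
    """Number of yearly downtime budgets consumed by one outage of the given length."""
    return outage_hours * 60 / downtime_budget_minutes(availability_pct)


if __name__ == "__main__":
    outage_hours = 31  # hypothetical input, taken from the thread title
    for target in (99.99, 99.9, 99.0):
        budget = downtime_budget_minutes(target)
        burned = years_of_budget_burned(outage_hours, target)
        print(f"{target}% allows {budget:.1f} min/year; "
              f"a {outage_hours}h outage uses {burned:.2f} years of budget")
```

Under those assumptions, four nines allows roughly 52.6 minutes per year, so a 31-hour outage uses up about 35 years of budget, while a 99% target allows about 87.6 hours per year.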

zamadatix · 11h ago

One could have nearly 3 such incidents per year and still have hit 99%.

I always strive for 7 9s myself, just not necessarily consecutive digits.

manquer · 9h ago

It could be consecutive too, and even start with a 9 and be all nines. Here you go: 9.9999999%

Spivak · 12h ago

I strive for one 9, thank you. No need to overcomplicate. We use Lambda on top of Glacier.

jeeyoungk · 12h ago

why go for 9's when you can go for 8s? you can aim for 88.8888888!

hinkley · 11h ago

There's an old rant I cannot find at the moment that argued that most systems that believe they are 5 9's are really more like 5 8's.

hnlmorg · 11h ago

Hit that and you also master time travel.

aspenmayer · 5h ago

I know you were going for a BTTF reference, but a Primer (2004) reference would be a better fit for a VC forum.

https://en.wikipedia.org/wiki/Primer_(film)

theideaofcoffee · 11h ago

Lots get starry-eyed and aim for five nines right out of the gate where they should have been targeting nine fives and learning from that. Walk before you run.

edoceo · 12h ago

Interesting the phrase "I'm sorry" was in there. Almost feels like someone in the Big Chair taking a bit of responsibility. Cheers to that.

thih9 · 11h ago

> Change controls are tighter, and we’re investing in long-term performance improvements, especially in the CMS.

This reads as if overall performance was an afterthought and this doesn’t seem practical; it should be a business metric, it is important to the users after all.

Then again, it’s easy to comment like this in hindsight. We’ll see what happens long term.

newZWhoDis · 11h ago

As a former webflow customer I can assure you performance was always an afterthought.

stackskipton · 11h ago

I mean, if customers don’t leave them over this, higher ups likely won’t care after dust settles.

bravesoul2 · 7h ago

Decent update. Guess people are really waiting for a fix tho!

willejs · 11h ago

Hugops to the people working on this for the last 31+ hours. Running incidents of this significance is hard, draining and requires a lot of effort, this going on for so long must be very difficult for all involved.

bravesoul2 · 6h ago

Hopefully they are rotating teams not people staying awake for a dangerous amount of time.

dangoodmanUT · 12h ago

Hugs for their SREs sweating bullets rn

sangeeth96 · 12h ago

Hugs to the ones dealing with this and the users of Webflow who invested in them for their clientele. Hoping they'll release a full postmortem once the sky clears up.

betaby · 11h ago

I'm more surprised that WordPress-like platforms are profitable businesses in 2025.

bravesoul2 · 6h ago

Because imagine your local biz can either pay a designer 1k a year or DIY and pay godaddy 200 bucks. Or 30 bucks for Wordpress and 20 hours of fiddling and asking their cousin for help.

It's not great by our standards, but I bet many of us drink the house wine, not something more sophisticated, right :)

bogzz · 11h ago

Why? Genuinely asking. Did you mean because there are free alternatives to self-host? I don't think that it would be so easy for someone in the market for a WYSIWYG blog builder to set everything up themselves.

betaby · 11h ago

Exactly. Because of the abundance of the one-click deploy WordPress offerings from value providers like OVH / Hetzner I would think margins are very low for WYSIWYG site builders.

esseph · 10h ago

Decent demand just awful margins.

And most non-tech (and many in tech) have never heard about OVH/Hetzner.

newZWhoDis · 11h ago

We moved away from webflow because it was slow (got the nickname web-slow internally).

Plus, despite marketing begging for the WYSIWYG interface they actually weren't creative enough to generate new content at a pace that required it.

We massively increased conversion rates by going full native and having 1 Engineer churn out parts kits/kitbash LPs from said kits.

Scale for reference: ~$10M/month

wewewedxfgdf · 11h ago

Companies get very good at handling disasters - after the disaster has happened.

dylan604 · 11h ago

The problem is they get good at that specific disaster. They can only plug a hole in the dike after the hole exists, then they look at the hole and make a plug the exact shape of that hole. The next hole starts the process over for it specifically. Each time. There's no generic plug that can be used each time. So sure, they get very good at making specific plugs. They never get to the point of making a better dike that doesn't spring so many leaks.

wewewedxfgdf · 11h ago

It is the job of the CTO to ensure the company has anticipated as many as possible such situations.

It's not a very interesting thing to do however.

dylan604 · 11h ago

okay. and? the CTO isn't the last word in anything. if they are overruled to keep releasing new features, acquiring new users/clients, sales forward dev cycles, then the whole thing has potential to collapse under the weight of itself.

It's actually the job of the CEO to keep all of the c-suite people doing jobs. Doesn't seem to stop the CEO salary explosions.

wewewedxfgdf · 11h ago

I think we are agreed.

Companies, after a disaster, focus lots of effort on that particular disaster, leaving all the other potential disasters unplanned for.

If you work at Webflow, you can anticipate LOTS of work in disaster recovery in the next 12 months. This has magically become a high priority for the CEO, who previously wanted features more than disaster recovery planning.

They will wait to focus massive resources on their security until after they get hacked.

esseph · 11h ago

You just described every company.

(And also why security is always a losing battle)

plutaniano · 11h ago

Will the company survive long enough to produce a postmortem?

chupchap · 11h ago

Bring back Failwhale

ChrisArchitect · 12h ago

Incident link: https://status.webflow.com/incidents/0xg8xq3l0h0q

nojs · 11h ago

Wow, that whole page does not inspire confidence. It’s 99% LLM slop.

What We’re Doing:

-We are making ongoing adjustments to our infrastructure to improve stability and ensure reliable scaling under elevated load

-Analyzing system patterns and optimizing backend processes where resource contention is highest

-Implementing protective measures to safeguard platform integrity

esseph · 11h ago

Expect everything you read from here on out to be "AI Slop".

It's not going to get better in any way.

Marciplan · 10h ago

y’all relax they are vibe coding the fix right now

ActionHank · 11h ago

So now they’re Webno?

pton_xd · 12h ago

Claude, here is the bug, fix it. This is the new log output, fix the error. Fix the bug. Try a different approach. Reimplement the tests you modified. The bug is still happening, fix it. Fix the error.

We're out of credits, create a new account. We've been API rate limited? When did that start happening? When are we going to get access again?

Good luck engineers of the future!

lgl · 11h ago

Comment of the year 2025! Thanks for that :D

ed_mercer · 11h ago

You forgot to add “think hard!” :)

esafak · 11h ago

And a subtle threat: "... or else".

zvmaz · 11h ago

How do you know?

troyvit · 11h ago

More like "Good luck users of the future" that have to wade through failing infrastructure and tools that were vibe coded to begin with, rate limits notwithstanding.

xyst · 11h ago

I have no clue what "webflow" is for based on its marketing/buzzword-filled landing page, but it seems it's just a "no code" abstraction on top of HTML/CSS?

yet another SaaS that really does not need to be online 24/7. It could have been a simple app where you could "no code" on local machine and async state with webflow servers.

douglee650 · 57m ago

It's painful to use, but lets non-technical clients edit copy and create content in a safe environment. There's a runtime CMS types creator and a WYSIWYG html editor with facility for code blocks from global to inline scopes. Also comes with batteries included deploy. It's basically a one or two levels higher Squarespace/Wix

dylan604 · 11h ago

if you have a web based SaaS, everyone gets the updates. if you have a "simple app", then you are dependent on all of the users being up to date which you just cannot guarantee. also, what is a "simple app" that does not care about differences among various OSes found in the wild? how large of a team do you need for each of those OSes to support as wide of a user base as a web only app?

esseph · 11h ago

Cost of having a reliable product with some self-determination for the customer.

dylan604 · 10h ago

the customer can self-determine just fine using a web based SaaS no-code website builder. it's not like this is a different type of app. the thing is making a website that is also hosted by the maker of the app. if you want to make a website to host on your own servers, then you are not the target audience of the web app.

you're like the person complaining that the hammer isn't very useful for driving in the screw. you need a different tool/app if you want to make a site you host yourself

Webflow Down for >31 Hours

Comments (62)