One thing I'd love to see is dynamic CPU allocation or otherwise something similar to Jenkins' concept of a flyweight executor. Certain pipelines can often spend minutes to hours using zero CPU just polling for completion (e.g. CloudFormation, hosted E2E tests, etc.). In these cases I'd be charged for 2 vCPUs but use almost nothing.

Otherwise, the customers are stuck with the same sizing/packing/utilisation problems. And imagine being the CI vendor in this world: you know which pipeline steps use what resources on average (and at the p99), and with that information you could over-provision customer jobs so that you sell 20 vCPUs but schedule them on 10 vCPUs. 200% utilisation baby!
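
A rough sketch of the overselling arithmetic described above. The per-job average and p99 figures are made up for illustration and do not come from any real vendor; `Job` and `oversubscription` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Job:
    sold_vcpus: float  # what the customer is billed for
    avg_vcpus: float   # hypothetical measured average usage
    p99_vcpus: float   # hypothetical measured p99 usage


def oversubscription(jobs, host_vcpus):
    """Return (sold / physical ratio, whether the summed p99s still fit on the host)."""
    sold = sum(j.sold_vcpus for j in jobs)
    p99_total = sum(j.p99_vcpus for j in jobs)
    return sold / host_vcpus, p99_total <= host_vcpus


# Ten 2-vCPU jobs that mostly sit polling, packed onto a 10-vCPU host.
jobs = [Job(sold_vcpus=2, avg_vcpus=0.1, p99_vcpus=0.9) for _ in range(10)]
ratio, fits = oversubscription(jobs, host_vcpus=10)
```

Summing the p99s is a conservative check, since the jobs are unlikely to all hit their p99 at the same moment; a real scheduler would presumably accept more risk than this.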

hinkley · 124d ago

I had a service that was used to do a bunch of compute at deployment time but even with the ramp up in deployment rates anticipated by the existence of the tool, we had machines that were saturated about 6 hours a month, 12 at the outside. The amount of hardware we had sitting around for this was initially equivalent to about 10% of our primary cluster, and I got it down to about 3%.

But at the end of that project I realized that all this work could have been done on a CI agent if only they had more compute on them. My little cluster was still almost the size the build agent pool tended to be. If I could convince them to double or quadruple the instance size on the CI pipeline I could turn these machines off entirely, which would be a lower total cost at 2x and only a 30% increase at 4x, especially since some builds would go faster, resulting in less autoscaling.

So if one other team could also eliminate a similar service, it would be a huge win. I unfortunately did not get to finish that thought due to yet another round of layoffs.

matt-p · 124d ago

I'm sure they're doing this, they'd be mad not to - firecracker has cgroup support.

arccy · 124d ago

i think cloudflare workers does this

shadowgovt · 124d ago

Interesting writeup. I wonder somewhat what this looks like from the customer side; one downside I've observed with some serverless in the past is that it can introduce up-front latency delays as the system spins up support to handle your spike. I know the CI consensus seems to be that latency matters little in a process that's going to take a long time to run to completion anyway... But I'm also a developer of CI, and that latency is painful during a tight-loop development cycle.

(The good news is that if the spikes are regular, a sufficiently-advanced serverless can "prime the pump" and prep-and-launch instances into surplus compute before the spike since historical data suggests the spike is coming).

aayushshah15 · 124d ago

> one downside I've observed with some serverless in the past is that it can introduce up-front latency delays as the system spins up support to handle your spike

[cofounder of blacksmith here]

This is exactly one of the symptoms of running CI on traditional hyperscalers we're setting out to solve. The fundamental requirement for CI is that each job requires its own fresh VM (which is unlike traditional serverless workloads like lambdas). To provision an EC2 instance for a CI job:

- you're contending against general on-demand production workloads (which have a particular demand curve based on, say, the time of day). This can typically imply high variance in instance provisioning times.

- since AWS/GCP/Azure deploy capacity out as spot instances with a guaranteed pre-emption warning, you're also waiting for the pre-emption windows to expire before a VM can be handed to you!

shadowgovt · 124d ago

Excellent! I did some work in the past on prediction of behavior given past data, and I can tell you two things we learned:

- there are low-frequency and high-frequency effects (so you can make predictions based on last week, for example, but those predictions fall flat if the company rushes launches at the EOQ or takes the last couple weeks in December off).

- you can try to capture those low-frequency effects, but in practice we consistently found that comprehension by end-users beat out a high-fidelity model, and users were just not going to learn an idea like "you can generate any wave by summing two other waves." The user feedback they got was that they consistently preferred the predictive model being a very dumb "The next four weeks look like the past four weeks" and an explicit slider to flag "Christmas is coming: we anticipate our load to be 10% of normal" (which can simultaneously tune the prediction for Christmas and drop Christmas as an outlier when making future predictions). When they set the slider wrong they'd get the wrong predictions, but they were wrong predictions that were "their fault" and they understood; they preferred wrong predictions they could understand to less-wrong predictions they had to think about Fourier analysis to understand.
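
A minimal sketch of the "dumb" model described above, assuming weekly load totals as input. The function names and the shape of the `overrides` argument are hypothetical, not what the original tool used.

```python
def forecast_next_weeks(past_weeks, overrides=None):
    """Predict that the next N weeks repeat the past N weeks.

    past_weeks: observed weekly loads, oldest first.
    overrides: {week_index: multiplier} set by the user, e.g. {3: 0.1}
               for "we anticipate our load to be 10% of normal".
    """
    overrides = overrides or {}
    return [load * overrides.get(i, 1.0) for i, load in enumerate(past_weeks)]


def history_without_outliers(weeks, flagged):
    """Drop user-flagged weeks so they do not skew later forecasts."""
    return [load for i, load in enumerate(weeks) if i not in flagged]
```

The same flag does double duty here: it scales the forecast for the holiday week and marks that week for removal from the history used next time.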

LunaSea · 124d ago

> The fundamental requirement for CI is that each job requires its own fresh VM (which is unlike traditional serverless workloads like lambdas). To provision an EC2 instance for a CI job

Is this different from lambdas or ECS services because you need to set up a VM / container, and nested virtualisation / Docker-in-Docker is not supported?

TuringTest · 124d ago

Back in the ancient era of the mainframes, this "multitenancy" concept would have been called "time sharing".

It looks like everything old is new again.

notyourwork · 123d ago

If you stay around long enough the pendulum swings full circle.

hkt · 123d ago

Fascinating how the wording has changed from sharing to tenancy. Maybe reading a bit more into it, but isn't it funny how the modern word evokes the landlord/tenant relationship?

(I'm reading into this further than it needs to go for fun, primarily)

andrewstuart · 124d ago

It’s a common refrain on HN that this thing is the same as something old. Dagnab those young folks!

nitwit005 · 124d ago

People trying to market things create a lot of new terms. They don't want to seem like they're selling the same old thing, but something new and innovative.

Occasionally, the new term is warranted, of course, but that's far less common than simply trying to appear different.

Imustaskforhelp · 123d ago

> Crud app

> Slaps cryptocurrency sticker in 2019-2021 era

Gets 1 million $ funding

> in 2025, slaps AI powered sticker

Gets 10 million $ funding.

But it's still a CRUD app nonetheless.

I know it sounds like an over-the-line example, but I am sure there are cases like this where the same thing gets a new label and a lot of funding; that is, there is an incentive to put on new stickers.

The goal is not to appear different, the goal is probably profit, which they can get if they can get better funding I suppose, and they get better funding by slapping stickers.

morkalork · 124d ago

It pretty much is the same, the only change is the level of abstraction. Apparently the easiest thing for everyone is just giving the user access to the whole damn OS via a container, rather than have them deal with vendor-specific mainframe minutiae.

pphysch · 124d ago

Yeah, isn't this just HPCaaS, with an emphasis on CI workloads?

latchkey · 115d ago

Seems amazing given that most HPC is locked up in super computers.

hinkley · 124d ago

I was kind of disappointed the first time I saw an IBM mainframe and it kinda just looked like a rack of servers. To be fair, it was taking up a little bit of a server room that had clearly been designed for a larger predecessor and now almost had enough free space for a proper game of ping pong.

Hyperscaler rack designs definitely blur this line further. In some ways I think Oxide is trying to reinvent the mainframe, in a world where the suppliers got too much leverage and started getting uppity.

toast0 · 123d ago

My school district growing up had a decently sized Unisys mainframe. While I was working there, they upgraded to a new machine.

The old machine was about desk height and 15 feet wide. The new machine was a 4u box running a Unisys mainframe emulator on NT 4, on a quad processor pentium pro. Pretty sad. But they did keep the giant line printer, at least while I was working there.

esseph · 123d ago

Nah, oxide is standardizing pluggable, scalable, rack-unit API driven local cloud, with extremely tight integration that nobody else has largely except Apple.

There's different takes on it, that's just mine. I really appreciate and respect their work.

hinkley · 123d ago

They aren’t standardizing the way the PC was standardized. They’re making custom hardware with custom connectors. Their parts are interchangeable with themselves but the rack is the unit of delivery and operation. Their software is closer to standardizing in the normal sense of the word.

That’s a mainframe, sport. At their height they were modular and in at least IBM’s case they could run with damaged parts and were delivered with dark hardware that could replace damaged parts until a maintenance person could arrive, or be remotely enabled to increase throughput for a fee.

All of this cloud stuff, except the geographical redundancy parts, is recreating software that business had versions of forty years ago.

steveklabnik · 122d ago

It really depends on what you mean by “mainframe.” Architecturally, we are nothing like mainframes. But for what mainframe seems to mean for you, I can see your perspective.

(While the rack is the unit of delivery overall, we can ship individual sleds and they’re operator replaceable, if say, one of your sleds dies, incidentally.)

esseph · 121d ago

Some of us still touch mainframes, almost daily.

jeffreygoesto · 124d ago

Oh the memory. IBM3090 MVS with TSO...

Havoc · 124d ago

Surprised they’re doing fixed leases. I would have thought a fixed base with a layer of spot priced VMs for peaks would be more efficient on cost

matt-p · 124d ago

Outside of the big clouds, just buying a 1-year lease (say) on a dedicated server is so cheap that you'd not be saving much vs spot instances, and with spot instances you need code to manage this and you're introducing risk of slowdowns. Probably not worth the trade-off.

To illustrate: a 128GB RAM, 20-core server with a 10Gbps NIC and some small SSD storage is probably going to cost you <$2000 USD for a year's rental.

Havoc · 124d ago

They've got usage that plummets 80% 2 days a week and the other 5 have a broad predictable time based pattern where usage drops ~66% judging by graph.

If that works out to same prices as keeping compute at literally your peak requirement level round the clock then something is very wrong somewhere. Maybe that issue is not in-house at blacksmith - perhaps spot pricing is a joke...but something there doesn't check out.

Loads of companies do scaling with much less predictable patterns.

>risk of slowdowns

Yeah you do probably want the scaling to be super conservative...but -80% fluctuation is a comically large gap to not actively scale

>To illustrate

Better view I'd say is: That chart looks like ~4.5 hours of peak a day. Over roughly 20 weekdays that's about 90 hours, so you're paying for 730 hours of peak capacity and using all of it for about 90 hrs.

Given that they wrote a blog about this topic they probably have a good reason for doing it this way. Just doesn't really make sense to me based on given info

matt-p · 124d ago

I think you're reading a graph for a single tenant not overall infrastructure.

m7i.4xlarge on AWS spot price right now is $0.39/hour whereas renting the server is about half that per hour.

Havoc · 123d ago

No, I'm not. I'm looking at the 5 day average across fleet graph right at the bottom. That shows very roughly 2/3 drop from peak to lowest, while the 80% is from the text as fleet wide.

>whereas renting the server is about half that per hour.

If you're at capacity only 90 out of 730 a month then paying 2x for spot to cover those peaks is a slam dunk
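
A back-of-the-envelope sketch of that comparison. It uses the $0.39/hour spot price and "about half that" rental rate quoted upthread, and the 90-of-730-hours figure; the fleet sizes (30 hosts at peak, 10 at base) are invented. It also assumes load is either at base or at full peak, which understates the spot hours a real fleet would need, and it ignores the QoS and host-tuning concerns raised elsewhere in the thread.

```python
HOURS_PER_MONTH = 730

SPOT_RATE = 0.39            # $/hour, spot price quoted upthread
FIXED_RATE = SPOT_RATE / 2  # $/hour, "about half that" for a rented server


def all_fixed(peak_hosts, fixed_rate=FIXED_RATE):
    """Monthly cost of renting enough hosts to cover peak around the clock."""
    return peak_hosts * HOURS_PER_MONTH * fixed_rate


def base_plus_spot(base_hosts, peak_hosts, peak_hours,
                   fixed_rate=FIXED_RATE, spot_rate=SPOT_RATE):
    """Monthly cost of a fixed base fleet plus spot hosts during peak hours only."""
    fixed = base_hosts * HOURS_PER_MONTH * fixed_rate
    spot = (peak_hosts - base_hosts) * peak_hours * spot_rate
    return fixed + spot


fixed_cost = all_fixed(peak_hosts=30)
mixed_cost = base_plus_spot(base_hosts=10, peak_hosts=30, peak_hours=90)
```

With these made-up fleet sizes the all-fixed option comes to about $4,270 a month and the base-plus-spot option to about $2,125, so the conclusion hinges on how many hours the load really spends above base.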

tsaifu · 124d ago

yeah, like the others have said, the tradeoff isn't really worth it for us as a business. spot instances also generally come with low qos guarantees (since they tend to be interruptible). tbf there are on-demand alternatives with better guarantees though

another thing to note is that we bootstrap the hosts, and tune them a decent amount, to support certain high-performance features which takes time and makes control + fixed-term ownership desirable

[disclaimer: i work at blacksmith]

whizzter · 124d ago

If their business model is high performance runners and cheap cost they probably don't want to budge on speed, and once renting something fast on the cloud the costs run up quickly enough that they are probably just better off with a few more machines that pay for themselves over time.

coolcase · 124d ago

Was thinking about this exact thing today. Where I work combining X services from their own scaling sets to pack them together into a kubernetes cluster (or similar tech) should "smooth out" the spikes relatively and reduce wastage and also need to scale. This is on cloud so no fixed hardware concern but even then it helps with reserve instances, discounts and keeping cost down generally. This was intuition but I might math the maths on it now inspired by this.
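
One way to start on that math, as a sketch with invented load numbers: compare sizing each service for its own peak against sizing one shared pool for the peak of the combined load.

```python
def capacity_separate(loads):
    """Capacity needed when each service is sized for its own peak."""
    return sum(max(series) for series in loads)


def capacity_pooled(loads):
    """Capacity needed when one shared pool is sized for the combined peak."""
    return max(sum(step) for step in zip(*loads))


# Hypothetical load per time step for three services that peak at different times.
loads = [
    [8, 2, 2, 2],
    [2, 8, 2, 2],
    [2, 2, 8, 2],
]
separate = capacity_separate(loads)
pooled = capacity_pooled(loads)
```

The pooled figure can never exceed the separate one, but the gap closes to zero if all services peak at the same time, so the saving depends entirely on how correlated the spikes are.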

jillesvangurp · 123d ago

For the last two years, I've been running github actions like this:

- start a paused vm in google cloud

- run the build there (via a gcloud ssh command) and capture the output

- pause the vm after it is done

Takes about 4-5 minutes. Maybe a few dozen times per month. It's a nice fast machine with lots of CPU and memory. Would cost a small fortune to run 24x7. It would cost more than it costs to run our entire production environment. But a few hours of build time per month barely moves the needle.

Our build and tests max out those CPUs. We only pay for the minutes it is running. Without that it would take 2-3 times as long. And it would sometimes fail because some of our async tests time out if they take too long.

It's not the most elegant thing I've ever done but it works and hasn't failed me in the two years I've been using this setup.

But it's also a bit artificial because bare metal is cheaper and it runs 24x7. The real underlying issue is the vastly inflated price of virtual machines cloud providers rent out vs. the cost of the hardware that powers them. The physical servers pay themselves back within weeks of coming online. Everything after is pure profit.

latchkey · 115d ago

We are building something like this for AMD MI300x GPU CI workloads. Except that every time someone shuts down a VM, we start a fresh one; hot spares cuts down your start up time.

We take on the small fortune expense and multi tenancy does the rest. The plan is to offer this as inexpensively as possible (far cheaper than deploying and running this type of compute yourself), so that it is a no brainer to take advantage of.

Imustaskforhelp · 123d ago

very interesting idea!

How much cheaper is this compared to github actions?

also why are you using gcloud? would certain other competitors like aws/(hetzner? if we are talking about vps) also suit the case.

I would love it if you could write a blog post about it.

jillesvangurp · 123d ago

We don't really pay for gh actions; we're staying below the freemium limit.

We use gcloud for convenience. Our production environment is there. So spinning up a vm is easy. Our builds also deploy there so we need gcloud credentials in our gh actions anyway. It only runs for a few hours per month in total. So the cost isn't very high. A few dollars at most.

No time for blog posts but feel free to adapt my gh action: https://gist.github.com/jillesvangurp/cccf5f9d61f4b457a994dc...

It basically runs a script on the vm. Should be fairly easy to adapt. There's a bit of bash in there that waits for the machine to come up before it does the ssh command that runs the build script.
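
For anyone who would rather read the flow than the gist, here is a rough Python sketch of the same idea. It is not the linked action: it assumes "paused" means a suspended instance (swap `resume`/`suspend` for `start`/`stop` if the VM is stopped instead), and the instance name, zone and build command are placeholders.

```python
import subprocess
import time


def build_steps(instance, zone, build_cmd):
    """Return the gcloud invocations for one CI run: resume, build over ssh, suspend."""
    base = ["gcloud", "compute"]
    target = [instance, f"--zone={zone}"]
    return [
        base + ["instances", "resume"] + target,
        base + ["ssh"] + target + [f"--command={build_cmd}"],
        base + ["instances", "suspend"] + target,
    ]


def wait_for_ssh(instance, zone, runner, attempts=10, delay=5.0, sleep=time.sleep):
    """Poll with a no-op ssh command until the VM accepts connections."""
    probe = ["gcloud", "compute", "ssh", instance, f"--zone={zone}", "--command=true"]
    for _ in range(attempts):
        if runner(probe, check=False).returncode == 0:
            return True
        sleep(delay)
    return False


def run_build(instance, zone, build_cmd, runner=subprocess.run, sleep=time.sleep):
    """Resume the VM, run the build, and always suspend again, even on failure."""
    resume, build, suspend = build_steps(instance, zone, build_cmd)
    runner(resume, check=True)
    try:
        if not wait_for_ssh(instance, zone, runner, sleep=sleep):
            raise RuntimeError("VM did not accept SSH in time")
        return runner(build, check=True)
    finally:
        runner(suspend, check=True)
```

The `finally` block is the important design choice: a failed or timed-out build must still suspend the machine, otherwise the "only pay for the minutes it is running" property is lost.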

solatic · 123d ago

Great write-up. OP's next steps are probably to offer off-peak capacity at a discount. If you pass along some of the savings to your customers, they'll happily take their once-daily or once-weekly jobs and make sure they get scheduled off-peak instead of getting scheduled at some time that was either random or arbitrary. Win-win.

mgaunard · 124d ago

In my experience scaling dynamically just makes things slower and it doesn't reduce costs significantly compared to having dedicated resources.

Resourcing dynamically is also difficult because you don't actually know upfront how many resources your CI needs.

notyourwork · 123d ago

It can when workloads are relatively predictable day-over-day but have low lows and high peaks. For example, my team has a service with a daily traffic peak that is >5x our traffic minimum.

Scaling on traffic and resource demand gives us an increased average utilization rate for the hardware we pay for. Especially when peaks are short lived, an hour out of 24 for example.

mlhpdx · 124d ago

Everyone doing multi-tenant SaaS wants cost to be a sub-linear function of usage. This model of large unit capacity divided by small work units is an example of how to get there. The tough bit is that it’s stepwise at low volumes, and becomes linear at large scale, so it’s only magic during the growth phase — which is pretty solid for a growth phase company showing numbers for the next raise.
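
A toy model of that curve, assuming each host serves a fixed number of jobs and costs a fixed amount (both numbers invented): total cost moves in steps, and cost per job falls steeply at low volume before flattening out.

```python
import math


def fleet_cost(jobs, jobs_per_host, host_cost):
    """Cost of the smallest whole number of hosts that can serve `jobs`."""
    return math.ceil(jobs / jobs_per_host) * host_cost


def cost_per_job(jobs, jobs_per_host, host_cost):
    """Average cost per job at a given volume."""
    return fleet_cost(jobs, jobs_per_host, host_cost) / jobs


# Hypothetical host: serves 100 jobs, costs 200 per period.
for volume in (1, 50, 100, 101, 10_000):
    print(volume, fleet_cost(volume, 100, 200), cost_per_job(volume, 100, 200))
```

Going from 1 job to 100 costs nothing extra, job 101 doubles the bill, and by 10,000 jobs the per-job cost has settled at the fully utilised rate, which is the linear regime.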

hinkley · 124d ago

Something for nothing or the Tragedy of the Commons. Many want a fair division of the cost but an unfair portion of the shared resource, subsidized by people who have not figured out how to minmax their slice of the pie. Doesn’t work when several clever people share the same resource pool.

mlhpdx · 123d ago

I’m not sure I get the connection? Designing systems for sub-linear cost scaling is just an engineering problem.

Aissen · 123d ago

> We have a fleet of hundreds of bare-metal gaming CPUs

Curious what you use and why. Larger datacenter CPUs have a steep entry price, but usually better economics. Also, don't trust the public pricing — it's totally broken in this industry, unfortunately.

0xbadcafebee · 124d ago

tl;dr for this particular case it's bin packing

other business cases have economics where multitenancy has (almost) nothing to do with "efficient computing", and more to do with other efficiencies, like human costs, organizational costs, and (like the other post linked in the article) functional efficiencies

__turbobrew__ · 123d ago

Am I reading this right, they are going to rack their own servers for this business?

If I were them I would be looking at renting from bargain-bin hosting providers like hetzner or ovh to run this on. The great thing is that hetzner also has a large pool of racked servers that you can tap into.

You are basically going to re-implement hetzner at a smaller (and probably worse) scale by creating your own multitenant mini cloud for running these ci jobs.

Free advice: set up a giant kubernetes cluster on hetzner/ovh, use gvisor runtime for isolation, submit ci workloads as k8s jobs, give the k8s jobs different priority classes based on job urgency and/or some sort of credit system, jobs will naturally be executed/preempted based upon priority.

There you go, that is the product using nearly 100% existing and open source software.
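
A sketch of what the manifests for that setup might look like, built as plain Python dicts. It assumes the cluster already has a RuntimeClass named `gvisor` (pointing at the runsc handler); the priority class names, values, image and command are placeholders.

```python
import json


def priority_class(name, value):
    """A cluster-wide PriorityClass; higher values are scheduled (and preempt) first."""
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,
        "globalDefault": False,
    }


def ci_job(name, image, command, priority):
    """A one-shot Job running a CI step under the (assumed) gvisor RuntimeClass."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed CI step automatically
            "template": {
                "spec": {
                    "runtimeClassName": "gvisor",
                    "priorityClassName": priority,
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "ci", "image": image, "command": command}
                    ],
                }
            },
        },
    }


manifests = [
    priority_class("ci-urgent", 1000),
    priority_class("ci-batch", 100),
    ci_job("build-1234", "example.invalid/ci-runner:latest", ["make", "test"], "ci-urgent"),
]
manifest_json = json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2)
```

The resulting JSON can be piped to `kubectl apply -f -`. This only covers scheduling and isolation; billing, caching and the credit system mentioned above would still need building.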

Imustaskforhelp · 123d ago

Yeah, your goal is pretty nice if you want to open source blacksmith at their level, but I think most people would be pretty happy with just a hetzner vm using act https://github.com/nektos/act if they want github actions, or jenkins.

I think we can rent Hetzner VMs on a per-hour basis, or maybe we can't, but I do know that there are services (Linode, I guess?) which use a per-second model.

Combine that with I think automatic installation of act and you pay for per second use of your CI.

Plus points if we can use criu to scale from lower end machines to higher end machines depending upon the task while continuing the task from where it was left.

How the economics of multitenancy work

Comments (46)