There's a difference between small scale CI and large scale CI.
Small scale: a project is almost small enough to run the build and tests locally, but you still want to have a consistent environment and avoid "works on my machine" problems.
Large scale: a project is so large that you need to leverage remote, distributed computing to run everything with a reasonable feedback loop, ideally under 10 minutes.
The opposite ends of the spectrum warrant different solutions. For small scale, actually being able to run the whole CI stack locally is ideal. For large scale, it's not feasible.
> A CI system that’s a joy to use, that sounds like a fantasy. What would it even be like? What would make using a CI system joyful to you?
I spent the past few years building RWX[1] to make a CI system joyful to use for large scale projects.
- Local CLI to read the workflow definitions locally and then run remotely. That way you can test changes to workflow definitions without having to commit and push.
- Remote breakpoints to pause execution at any point and connect via ssh, which is necessary when running on remote infrastructure.
- Automatic content-based caching with sandboxed executions, so that you can skip all of the duplicative steps that large-scale CI would otherwise rerun. Sandboxing ensures that the cache never produces false positives.
- Graph-based task definitions, rather than the 1 job : 1 VM model. This results in automatic and maximum parallelization, with no redundancy in setup for each job.
- The graph-based model also provides an improved retry experience and more flexibility in resource allocation. For example, one task in the DAG can crank up CPU and memory without having to run more resources for downstream tasks (steps in other platforms). (Toy sketch of the scheduling idea below.)
We've made dozens of other improvements to the UX for projects with large build and test workflows. Big engineering teams love the experience.
[1] https://rwx.com
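To show the graph-based scheduling idea in the abstract: this is not RWX's configuration or API, just a toy Python sketch with made-up task names and durations, illustrating why a shared dependency graph parallelizes better than isolated jobs.

    # dag_demo.py -- toy illustration of graph-based task scheduling.
    # NOT RWX's API or config format; only the general idea that a shared
    # dependency graph lets every unblocked task run concurrently after one setup.
    from concurrent.futures import ThreadPoolExecutor
    import time

    # task name -> (dependencies, simulated work in seconds); all made up
    TASKS = {
        "setup":  ([], 1.0),
        "build":  (["setup"], 2.0),
        "lint":   (["setup"], 0.5),
        "test_a": (["build"], 1.5),
        "test_b": (["build"], 1.5),
    }

    def run(name: str) -> str:
        time.sleep(TASKS[name][1])  # stand-in for real work
        return name

    def run_dag() -> None:
        done = set()
        with ThreadPoolExecutor() as pool:
            while len(done) < len(TASKS):
                ready = [t for t, (deps, _) in TASKS.items()
                         if t not in done and all(d in done for d in deps)]
                assert ready, "cycle or missing dependency"
                # every currently-unblocked task runs concurrently
                for finished in pool.map(run, ready):
                    done.add(finished)

    if __name__ == "__main__":
        run_dag()  # setup runs once; build/lint overlap, then test_a/test_b overlap

In a 1 job : 1 VM model, lint and both test shards would each repeat the setup step; here it runs once and everything it unblocks proceeds in parallel.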
> I find the most frustrating part of using CI to be to wait for a CI run to finish on a server and then try to deduce from the run log what went wrong. I’ve alleviated this by writing an extension to rad to run CI locally: rad-ci.
locally running CI should be more common
__MatrixMan__ · 3h ago
Agreed. I've got a crazy idea that I think might help...
Most tests have a step where you collect some data, and another step where you make assertions about that data. Normally, that data only ever lives in a variable, so it isn't kept around for later analysis. All you get when you're viewing a failed test is logs with either an exception or a failed assertion. It's not enough to tell the full story, and I think this contributes to the frustration you're talking about.
I've been playing with the idea that all of the data generation should happen first (since it's the slow part), then get added to a commit (overwriting the data from the previous CI run), and then all of the assertions should run afterwards (this is typically very fast).
So when CI fails, you can pull the updated branch and either:
- rerun the assertions without bothering to regenerate the data (faster, and useful if the fix is changing an assertion)
- diff the new data against data from the previous run (often instructive about the nature of the breakage)
- regenerate the data and diff it against whatever caused CI to fail (useful for knowing that your change will indeed make CI happy once more)
Most tools are uncomfortable with using git to transfer state from the failed CI run to your local machine so that you can rerun just the relevant parts locally. There's some hackery involved, but when it works out it feels like a superpower.
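A minimal sketch of what that split can look like, assuming pytest; the `ci_data/` directory and the file contents are just placeholders:

    # test_data_split.py -- one-file sketch: a slow "generate" phase that writes
    # data into the repo (CI commits it), and fast assertions that only read it.
    import json
    from pathlib import Path

    DATA_DIR = Path("ci_data")  # hypothetical directory, tracked in git

    def generate() -> None:
        """Slow phase: stand-in for hitting services, running pipelines, etc."""
        DATA_DIR.mkdir(exist_ok=True)
        report = {"widget_count": 42, "errors": []}
        (DATA_DIR / "report.json").write_text(json.dumps(report, indent=2))

    # Fast phase: plain assertions over whatever is committed in ci_data/,
    # rerunnable locally after `git pull` without regenerating anything.
    def test_report_has_no_errors():
        report = json.loads((DATA_DIR / "report.json").read_text())
        assert report["errors"] == []

    if __name__ == "__main__":
        generate()  # CI runs this, commits ci_data/, then runs pytest separately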
everforward · 1h ago
I've poked at this a few times, and I think it breaks down for CI that needs/wants to run integration tests against other services, e.g. when it wants to spin up a Postgres server to actually execute queries against.
Managing those lifetimes is annoying, especially when it needs to work on desktops too. On the server side, you can do things like spin up a VM that CI runs in, use Docker in the VM to make dependencies in containers, and then delete the whole VM.
That's a lot of tooling to do locally though, and even then it's local but has so many abstractions that it might as well be running in the cloud.
popsticl3 · 2h ago
I've been using brisk to run my CI from my local machine (so it runs in the cloud from my local terminal). The workflow is just a drop-in replacement for running locally. They've recently changed their backend and it seems to be working pretty smoothly. It also works very well with AI agents running in the terminal: they can run my tests for me when they make a change, and it doesn't kill my machine.
andrewaylett · 4h ago
Hear, hear.
Although I'd slightly rephrase that to "if you don't change anything, you should end up running pretty much the same code locally as in CI".
GitHub Actions is really annoying for this, as it has no supported local mode. Act is amazing, but insufficient: the default runner images are huge, so you can't use the same environment, and it isn't officially supported.
Pre-commit, on the other hand, is fantastic for this kind of issue, as you can run it locally and it'll fairly trivially run the same checks in CI as it does locally. You want it to be fast, though, and in practice I normally wind up having pre-commit run only cacheable tests locally and excluding any build and test hooks from CI, because I'll run those as separate CI jobs.
I did release my own GHA action for pre-commit (https://github.com/marketplace/actions/cached-pre-commit), because the official one doesn't cache very heavily and the author prefers folk to use his competing service.
This can be achieved by running in CI whatever you commonly run locally.
E.g. if your build process is simply invoking `build.sh`, it should be trivial to run exactly that in any CI.
ambicapter · 4h ago
This is fine until you run into differences between your machine and the CI one (or you're writing code for a different architecture than the one you're using), but I agree, this is definitely the first step.
0x457 · 1h ago
Plot twist, my build.sh invokes nix build and all I have to do on CI is to install nix and setup caching.
maratc · 4h ago
I agree, but if there's an architecture gap then locally running CI is not gonna help you to bridge it either.
esafak · 4h ago
Be sure to run it in a container, so you have a semblance of parity.
maratc · 3h ago
Where possible. (If your build process builds containers and your tests bring them up and make them talk, doing that in a container is a challenge.)
However, there are stateless VMs and stateless BMs too.
BeeOnRope · 2h ago
What is a BM?
maratc · 2h ago
Bare metal (usually meaning "the whole server").
Norfair · 2h ago
nix-ci.com is built with this as one of the two central features.
The other is that it figures out what to do by itself; you don't have to write any YAML.
gsaslis · 2h ago
It should!
And yet, that's technically not CI.
The whole reason we started using automation servers as an integration point was to avoid the "it works on my machine" drama. (I've watched at least 5 seasons of it - they were all painful!)
+1 on running the test harness locally though (where feasible) before triggering the CI server.
esafak · 4h ago
dagger does this, at the expense of increased complexity.
apwell23 · 5h ago
You probably have really bad code and tests if everything passes locally but regularly fails on CI.
goku12 · 5h ago
Not necessarily. For one, the local dev environment may be different or less pristine than what's encountered in CI. I use bubblewrap (the sandboxing engine behind flatpak) sometimes to isolate the dev environment from the base system. Secondly, CI often does a lot more than what's possible on the local system. For example, it may run a lot more tests than what's practical on a local system. Or the upstream repo may have code that you don't have in your local repo yet.
Besides all that, this is not at all what the author and your parent commenter are discussing. They are saying that the practice of triggering and running CI jobs entirely locally should be more common, rather than having to rely on a server. We do have CI runners that work locally, but the CI job management is still done largely from servers.
apwell23 · 4h ago
> For example, it may run a lot more tests than what's practical in a local system.
Yes, this is what I was talking about. If there are lots of tests that are not practical to run locally, then they are bad tests, no matter how useful one might think they are. The only good tests are the ones that run fast. It is also a sign that the code itself is bad if you are forced to write tests that interact with the outside world.
For example, you can extract logic into a presentation layer and write unit tests for that, instead of mixing UI and business logic and writing browser tests for it. There are also well-known patterns for this, like 'model-view-presenter'.
I would rather put my effort into this than into figuring out how to run tests that launch databases and browsers, call APIs, start containers, etc. Everywhere I've seen these kinds of tests, they've contributed to the "it sucks to work on this code" feeling, and bad vibes are the worst thing that can happen to code.
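A minimal sketch of that presenter split (the names and data shapes here are made up):

    # A pure presenter plus a plain unit test -- no browser, database, or container.
    from dataclasses import dataclass

    @dataclass
    class OrderView:
        title: str
        total_label: str

    def present_order(order: dict) -> OrderView:
        """Turn a raw order into display-ready strings (presentation logic only)."""
        total = sum(item["price"] * item["qty"] for item in order["items"])
        return OrderView(title=f"Order #{order['id']}", total_label=f"${total:.2f}")

    def test_present_order_formats_total():
        order = {"id": 7, "items": [{"price": 2.50, "qty": 3}]}
        assert present_order(order).total_label == "$7.50"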
8n4vidtmkvmk · 4h ago
It does suck when those large scale integration tests fail but sometimes that's the only real way to test something. E.g. I have to call a service owned by another team. It has a schema and documentation so I can mock out what I think it will return, but how will I truly know the API is going to do what it says or what I think it says without actually calling the API?
girvo · 3h ago
Tbf that's what post-deployment verification tests are ideal for, instead of integration/e2e tests blocking your merges/deployments.
apwell23 · 4h ago
> I truly know the API is going to do what it says or what I think it says without actually calling the API?
What if the API changes all of a sudden in production? What about cases where the API stays the same but the content of the response is all wrong? How do tests protect you from that?
Edit: these are not hypothetical scenarios. Wrong responses are way more common than schema breakage, and upstream tooling is often pretty good at catching schema breakages.
Wrong responses often cause way more havoc than schema breakages, because you get an alert for schema failures in the app anyway.
chriswarbo · 2h ago
Tests can't catch everything; it's a question of cost/benefit, and of stopping when the diminishing returns from further tests (or other QA work) are no longer enough to justify the cost of further investment in them (including the opportunity cost of time we could spend improving QA elsewhere).
For your example, the best place to invest would be in that API's own test suite (e.g. sending its devs examples of usage that we rely on); but of course we can't rely on others to make our lives easier. Contracts can help with that, to make the API developers responsible for following some particular change notification process.
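A hand-rolled sketch of that kind of consumer-side contract check (tools like Pact formalize this; the endpoint and fields below are made up):

    # contract_check.py -- assert that the provider still returns the fields and
    # types our code relies on; share EXPECTED_FIELDS with the API's developers.
    import requests

    # The subset of the response we actually depend on (made-up example).
    EXPECTED_FIELDS = {
        "id": int,
        "status": str,
        "items": list,
    }

    def check_order_contract(base_url: str) -> None:
        resp = requests.get(f"{base_url}/orders/example")  # hypothetical endpoint
        resp.raise_for_status()
        body = resp.json()
        for field, expected_type in EXPECTED_FIELDS.items():
            assert field in body, f"provider dropped field {field!r}"
            assert isinstance(body[field], expected_type), f"field {field!r} changed type"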
Still, such situations are hypothetical; whereas the sorts of integration tests that the parent is describing are useful to avoid our deployments from immediately blowing up.
NewJazz · 4h ago
It is a tradeoff, e.g. running tests with a real database or other supporting service and taking longer vs. mocking things and having a test environment that is less like reality.
andrewaylett · 4h ago
https://testcontainers.com/ is not quite the solution to all your problems, but it makes working with real databases and supporting services pretty much as easy as mocking them would be.
I'd really recommend against mocking dependencies for most tests though. Don't mock what you don't own, do make sure you test each abstraction layer appropriately.
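With the Python bindings, roughly (a minimal sketch close to the library's documented usage; the image tag and query are placeholders, and it needs a local Docker daemon):

    # test_db.py -- spin up a throwaway Postgres container for the test, run a real
    # query against it, and let testcontainers tear it down afterwards.
    import sqlalchemy
    from testcontainers.postgres import PostgresContainer

    def test_select_one():
        with PostgresContainer("postgres:16") as postgres:
            engine = sqlalchemy.create_engine(postgres.get_connection_url())
            with engine.connect() as conn:
                value = conn.execute(sqlalchemy.text("SELECT 1")).scalar()
            assert value == 1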
apwell23 · 4h ago
Do you really need to test the Postgres API in your own code?
lbotos · 36m ago
OP, Radicle had a very glitchy-styled home page before it went more 8-bit. Do you have an archive of that anywhere? I'd like to use it as a reference for a style period in design!