Bzip2 crate switches from C to 100% Rust

230 points by Bogdanp · 6/17/2025, 8:06:54 PM · 82 comments (trifectatech.org)

Comments (82)

dralley · 9h ago

How realistic is it for the Trifecta Tech implementation to start displacing the "official" implementation used by linux distros, which hasn't seen an upstream release since 2019?

Fedora recently swapped the original Adler zlib implementation with zlib-ng, so that sort of thing isn't impossible. You just need to provide a C ABI compatible with the original one.

wmf · 8h ago

Ubuntu is using Rust sudo so it's definitely possible.

masfuerte · 8h ago

They do provide a compatible C ABI. Someone "just" needs to do the work to make it happen.

tiffanyh · 6h ago

I think that is the goal of uutils.

https://uutils.github.io/

cocoa19 · 4h ago

I hope some are improved too.

The performance boost in tools like ripgrep and tokei is insane compared to the tools they replace (grep and cloc respectively).

deknos · 19m ago

I'll wait until they get to the hard stuff like awk, sed and grep.

rlpb · 8h ago

> You just need to provide a C ABI compatible with the original one.

How does this interact with dynamic linking? Doesn't the current Rust toolchain mandate static linking?

alxhill · 5h ago

The commenters below are confusing two things - Rust binaries can be dynamically linked, but because Rust doesn’t have a stable ABI you can’t do this across compiler versions the way you would with C. So in practice, everything is statically linked.

eru · 5h ago

Static linking also produces smaller binaries and lets you do link-time-optimisation.

emidln · 47m ago

Static linking doesn't produce smaller binaries. You are literally adding the symbols from a library into your executable rather than simply mentioning them and letting the dynamic linker figure out how to map those symbols at runtime.

The sum size of a dynamic binary plus the dynamic libraries may be larger than one static linked binary, but whether that holds for more static binaries (2, 3, or 100s) depends on the surface area your application uses of those libraries. It's relatively common to see certain large libraries only dynamically linked, with the build going to great lengths to build certain libraries as shared objects with the executables linking them using a location-relative RPATH (using the $ORIGIN feature) to avoid the extra binary size bloat over large sets of binaries.

IshKebab · 9m ago

Static linking does produce smaller binaries when you bundle dependencies. You're conflating two things - static vs dynamic linking, and bundled vs shared dependencies.

They are often conflated because you can't have shared dependencies with static linking, and bundling dynamically linked libraries is uncommon in FOSS Linux software. It's very common on Windows or with commercial software on Linux though.

connicpu · 5h ago

Specifically, the rust dependencies are statically linked. It's extremely easy to dynamically link anything that has a C ABI from rust.

quotemstr · 2h ago

C++ binaries should be doing the same. Externally, speak C ABI. Internally, statically link Rust stdlib or C++ stdlib.

bluGill · 6h ago

Rust cannot dynamically link to Rust. It can dynamically link to C and be dynamically linked by C. If you combine the two you can cheat, but it is still C that you are dealing with, not Rust, even if Rust is on both sides.

filmor · 1h ago

Rust can absolutely link to Rust libraries dynamically. There is no stable ABI, so it has to be the same compiler version, but it will still be dynamically linked.

mjevans · 5h ago

It might help to think of it as two IPC 'servers' written in rust that happen to have the C ABI interfaces as their communication protocol.

sedatk · 8h ago

No. https://doc.rust-lang.org/reference/linkage.html#r-link.dyli...

arcticbull · 8h ago

Rust lets you generate dynamic C-linkage libraries.

Use crate-type=["cdylib"]
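A minimal sketch of what such a library exports, assuming a `[lib]` section with `crate-type = ["cdylib"]` in Cargo.toml. The function name `demo_checksum` and its behaviour are hypothetical and have nothing to do with the bzip2 crate's actual API; the point is only the `extern "C"` signature and the unmangled symbol.

```rust
// Assumed Cargo.toml fragment (not shown in the thread):
//   [lib]
//   crate-type = ["cdylib"]

/// Hypothetical exported function: sums the bytes of a caller-owned buffer.
/// `extern "C"` selects the C calling convention, and the `no_mangle`
/// attribute keeps the symbol name readable for C callers.
/// (Toolchains older than Rust 1.82 spell the attribute `#[no_mangle]`.)
#[unsafe(no_mangle)]
pub extern "C" fn demo_checksum(data: *const u8, len: usize) -> u32 {
    // A C caller may hand us NULL, so check before touching the pointer.
    if data.is_null() {
        return 0;
    }
    // SAFETY: the caller promises `data` points to `len` readable bytes.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    bytes.iter().fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}
```

From C this would be declared as `uint32_t demo_checksum(const uint8_t *data, size_t len);` and linked like any other shared library.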

nicoburns · 8h ago

Dynamic linking works fine if you target the C ABI.

conradev · 6h ago

Rust importing Rust must be statically linked, yes. You can statically link Rust into a dynamic library that other libraries link to, though!

timeon · 7h ago

You can use dynamic linking in Rust with C ABI. Which means going through `unsafe` keyword - also known as 'trust me bro'. Static linking directly to Rust source means it is checked by compiler so there is no need for unsafe.
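To illustrate the `unsafe` boundary mentioned here, a small sketch that calls the C library's `abs` through a declared foreign function and wraps it in a safe Rust function. The wrapper name `c_abs` is made up for this example; `abs` itself is the standard C function, and the sketch assumes the platform C library is linked as usual.

```rust
use std::ffi::c_int;

// Declaration of a foreign function with the C ABI. The compiler cannot
// check that this signature matches the real one, hence `unsafe`.
// (Before Rust 1.82 this block is written as plain `extern "C" { ... }`.)
unsafe extern "C" {
    fn abs(x: c_int) -> c_int;
}

/// Safe wrapper (hypothetical name): rules out the one input for which
/// C's `abs` has undefined behaviour, then makes the foreign call.
pub fn c_abs(x: c_int) -> Option<c_int> {
    if x == c_int::MIN {
        return None; // |INT_MIN| is not representable as an int
    }
    // SAFETY: `abs` has no preconditions other than x != INT_MIN.
    Some(unsafe { abs(x) })
}
```

Callers of `c_abs` never write `unsafe` themselves; the "trust me" part is confined to the declaration and the single call site.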

rwaksmunski · 7h ago

I use this crate to process 100s of TB of Common Crawl data, I appreciate the speedups.

viraptor · 7h ago

What's the reason for using bz2 here? Wouldn't it be faster to do a one off conversion to zstd? It beats bzip2 in every metric at higher compression levels as far as I know.

rwaksmunski · 6h ago

Common Crawl delivers the data as bz2. Indeed I store intermediate data in zstd with ZFS.

declan_roberts · 7h ago

That assumes you're processing the data more than once.

anon-3988 · 5h ago

Is this data available as torrents?

malux85 · 7h ago

Yeah came here to say a 14% speed up in compression is pretty good!

aidenn0 · 2h ago

bzip2 (particularly parallel implementations thereof) is already relatively competitive for compression. The decompression time is where it lags behind because lz77 based algorithms can be incredibly fast at decompression.

koakuma-chan · 6h ago

It's blazingly fast

broken_broken_ · 46m ago

About not having perf on macOS: you can get quite far with dtrace for profiling. That’s what the original flame graph script in Perl mentions using and what the flame graph Rust reimplementation also uses. It does not have some metrics like cache misses or micro instructions retired but still it can be very useful.

firesteelrain · 8h ago

Anyone know if this will by default resolve the 11 outstanding CVEs?

Ironically there is one CVE reported in the bzip2 crate

[1] https://app.opencve.io/cve/?product=bzip2&vendor=bzip2_proje...

tialaramex · 7h ago

There's certainly a contrast between the "Oops a huge file causes a runtime failure" reported for that crate and a bunch of "Oops we have bounds misses" in C. I wonder how hard anybody worked on trying to exploit the bounds misses to get code execution. It may or may not be impossible to achieve that escalation.

Philpax · 8h ago

> The bzip2 crate before 0.4.4

They're releasing 0.6.0 today :>

a-dub · 6h ago

i'd be curious if they're using the same llvm codegen backend (with the same optimization level) for the c and rust versions. if so, where are the speedups coming from?

(ie, is it some kind of rust auto-simd thing, did they use the opportunity to hand optimize other parts or is it making use of newer optimized libraries, or... other)

eru · 5h ago

Just speculating: Rust can hand over more hints to the code generator. Eg you don't have to worry about aliasing as much as with C pointers. See https://en.wikipedia.org/wiki/Aliasing_(computing)#Conflicts...
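A tiny sketch of the aliasing point, with a made-up function name. In C, `void add_twice(int *acc, const int *delta)` must allow for both pointers referring to the same `int`, so `*delta` may need to be reloaded after the first store. In Rust, a `&mut i32` and a `&i32` that are live at the same time cannot alias, so the optimizer is allowed to load `*delta` once. Whether this accounts for any of the measured bzip2 speedup is not something the thread establishes.

```rust
/// Adds `delta` to `acc` twice. Because `acc` is an exclusive reference,
/// the compiler may assume the write through `acc` does not change `*delta`.
pub fn add_twice(acc: &mut i32, delta: &i32) {
    *acc += *delta;
    *acc += *delta;
}
```

Trying to call `add_twice(&mut x, &x)` is rejected by the borrow checker, which is what makes the assumption sound.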

MBCook · 3h ago

This makes a lot of sense to me, though I don’t know the official answer so I’m just sort of guessing along too.

Linked from the article is another on how they used c2rust to do the initial translation.

https://trifectatech.org/blog/translating-bzip2-with-c2rust/

For our purposes, it points out places where the code isn’t very optimal because the C code has no guarantees on the ranges of variables, etc.

It also points out a lot of people just use ‘int’ even when the number will never be very big.

But with the proper type the Rust compiler can decide to do something else if it will perform better.

So I suspect your idea that it allows unlocking better optimizations though more knowledge is probably the right answer.
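As a rough illustration of how a narrower type carries range information (the function names are invented and this is not code from the bzip2 crate): indexing a 256-entry table with a `u8` can never go out of range, whereas an index that arrives as a general `int` needs a check.

```rust
/// Index arrives as a general-purpose signed integer, as is common in
/// translated C. A negative or too-large value makes the lookup panic,
/// so a bounds check has to exist somewhere.
pub fn lookup_int(table: &[u32; 256], i: i32) -> u32 {
    table[i as usize]
}

/// Index is a `u8`, so every possible value is a valid position in a
/// 256-entry table and the compiler is free to drop the bounds check.
pub fn lookup_u8(table: &[u32; 256], i: u8) -> u32 {
    table[usize::from(i)]
}
```

Whether the check is actually elided depends on the optimizer; the sketch only shows that the type gives it enough information to do so.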

Too · 48m ago

Ergonomics of using the right data structures and algorithms can also play a big role. In C, everything beyond a basic array is too much hassle.

adgjlsfhk1 · 1h ago

C is honestly a pretty bad language for writing modern high performance code. Between C99 and C23, there was a ~20 year gap where the language just didn't add features needed to idiomatically target lots of the new instructions added (without inline asm). Just getting good abstract machine instructions for clz/popcnt/clmul/pdep etc helps a lot for writing this kind of code.
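For comparison, Rust exposes several of these operations as ordinary methods on the integer types, with no compiler-specific builtins. A short sketch (the wrapper name `bit_stats` is made up; `leading_zeros`, `trailing_zeros` and `count_ones` are standard library methods):

```rust
/// Returns (clz, ctz, popcount) for a 32-bit value using the portable
/// integer methods from the standard library.
pub fn bit_stats(x: u32) -> (u32, u32, u32) {
    (x.leading_zeros(), x.trailing_zeros(), x.count_ones())
}
```

Unlike some C builtins, these are defined for zero as well: `bit_stats(0)` yields 32 leading and 32 trailing zeros. PDEP/PEXT and carry-less multiply are not covered by these methods.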

zzo38computer · 1h ago

Popcount, clz, and ctz are provided as nonstandard functions in GCC (and clang might also support them in GNU mode, but I don't know for sure). PDEP and PEXT do not seem to be, but I think they should be (and PEXT is something that INTERCAL already had, anyways), although PDEP and PEXT can be used with -mbmi2 on x86, but are not available for general use. The MOR and MXOR of MMIX are also something that I would want to be available as built-in functions.

WhereIsTheTruth · 2h ago

any rewrite, in X, Y, Z language gives you the opportunity to speed things up, there is nothing inherent to rust

xvilka · 3h ago

I hope they or Prossimo will also look at reimplementing, in a similar fashion, the core Internet protocols: BGP, OSPF and RIP, other routing implementations, DNS servers, and so on.

dataking · 45m ago

https://www.memorysafety.org/initiative/ this page mentions TLS and DNS which goes some way towards your suggestion.

solarized · 7h ago

Do they use any LLM to transpile the C to Rust?

Twirrim · 6h ago

If you're going to use tools to transpile, don't use something that hallucinates. You want it to be precise.

https://github.com/immunant/c2rust reportedly works pretty well. Blog post from a few years ago of them transpiling quake3 to rust: https://immunant.com/blog/2020/01/quake3/. The rust produced ain't pretty, but you can then start cleaning it up and making it more "rusty"

dataking · 5h ago

They indeed used c2rust for the initial transpile according to https://trifectatech.org/blog/translating-bzip2-with-c2rust/

nightfly · 7h ago

Task that requires precision and potentially hard to audit? Exactly where I'd use an LLM /s

CGamesPlay · 6h ago

Without commenting on whether an LLM is the right approach, I don't think this task is particularly hard to audit. There is almost assuredly a huge test suite for bzip2 archives; fuzzing file formats is very easy; and you can restrict / audit the use of unsafe by the translator.
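A sketch of the kind of round-trip check that makes a codec translation auditable. Everything here is a stand-in: the toy run-length codec is not bzip2, and the pseudo-random generator is a simple LCG rather than a real fuzzer. The shape of the check (encode, decode, compare against the input) is what carries over.

```rust
/// Toy run-length encoder (stand-in for a real codec): emits (count, byte) pairs.
pub fn rle_encode(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = input.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut run: u8 = 1;
        // Extend the run while the next byte matches and the count fits in a u8.
        while run < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(byte);
    }
    out
}

/// Matching decoder. Rejects malformed input instead of panicking.
pub fn rle_decode(encoded: &[u8]) -> Option<Vec<u8>> {
    if encoded.len() % 2 != 0 {
        return None;
    }
    let mut out = Vec::new();
    for pair in encoded.chunks_exact(2) {
        if pair[0] == 0 {
            return None; // a zero-length run is never produced by the encoder
        }
        out.extend(std::iter::repeat(pair[1]).take(usize::from(pair[0])));
    }
    Some(out)
}

/// Small linear congruential generator so the check is deterministic.
fn lcg(state: &mut u32) -> u32 {
    *state = state.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
    *state
}

/// Runs `cases` pseudo-random inputs through encode and decode and
/// reports whether every one of them came back unchanged.
pub fn roundtrip_holds(cases: u32) -> bool {
    let mut state: u32 = 0x1234_5678;
    for _ in 0..cases {
        let len = (lcg(&mut state) >> 24) as usize; // 0..=255 bytes
        // Four-symbol alphabet so that runs actually occur.
        let data: Vec<u8> = (0..len).map(|_| (lcg(&mut state) >> 30) as u8).collect();
        if rle_decode(&rle_encode(&data)).as_deref() != Some(data.as_slice()) {
            return false;
        }
    }
    true
}
```

For a translated implementation the same idea extends to differential testing: feed identical inputs to the original and the translation and compare the outputs byte for byte.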

MBCook · 3h ago

You’re right, there is a large existing test suite. It’s mentioned in an article linked from this one.

https://trifectatech.org/blog/translating-bzip2-with-c2rust/

I suspect attempting to debug it would be a nightmare though. Given the LLM could hallucinate anything anywhere you’d likely waste a ton of time.

I suspect it would be faster to just try and write a new implementation based on the spec and debug that against the test suite. You’d likely be closer.

In fact, since they used c2rust, they had a perfectly working version from the start. From there they just had to clean up the Rust code and make sure it didn’t break anything. Clearly the best of the three options.

dale_huevo · 8h ago

A lot of this "rewrite X in Rust" stuff feels like burning your own house down so you can rebuild and paint it a different color.

Counting CPU cycles as if it's an accomplishment seems irrelevant in a world where 50% of modern CPU resources are allocated toward UI eye candy.

cornstalks · 7h ago

> Counting CPU cycles as if it's an accomplishment seems irrelevant in a world where 50% of modern CPU resources are allocated toward UI eye candy.

That's the kind of attitude that leads to 50% of modern CPU resources being allocated toward UI eye candy.

0cf8612b2e1e · 8h ago

Every cycle saved is longer battery life. Someone paid the one time cost of porting it, and now we can enjoy better performance forever.

dale_huevo · 8h ago

They kicked off the article saying that no one uses bzip2 anymore. A million cycles saved for something no one uses (according to them) is still 0% battery life saved.

If modern CPUs are so power efficient and have so many spare cycles to allocate to e.g. eye candy no one asked for, then no one is counting and the comparison is irrelevant.

yuriks · 8h ago

It sounds like the main motivation for the conversion was to simplify builds and reduce the chance of security issues. Old parts of protocols that no one pays much attention to anymore do seem to be a common place where those pop up. The performance gain looks more like just a nice side effect of the rewrite; I imagine they were at most targeting performance parity.

spartanatreyu · 7h ago

Exactly, even if we can't remove "that one dependency" (https://xkcd.com/2347/), we can reinforce everything that uses it.

jimktrains2 · 8h ago

Isn't bzip used quite a bit, especially for tar files?

Philpax · 8h ago

The Wikipedia data dumps [0] are multistream bz2. This makes them relatively easy to partially ingest, and I'm happy to be able to remove the C dependency from the Rust code I have that deals with said dumps.

[0]: https://meta.wikimedia.org/wiki/Data_dump_torrents#English_W...

jeffbee · 8h ago

If so, only by misguided users. Why would anyone choose bz2 in 2025?

0x457 · 7h ago

To unpack an archive made from the time when bz2 was used?

ben-schaaf · 7h ago

Of course no one uses systems, tools and files created before 2025!

jeffbee · 7h ago

bzip2 hasn't been the best at anything in at least 20 years.

appreciatorBus · 6h ago

The same could be said of many things that, nonetheless, are still used by many, and will continue to be used by many for decades to come. A thing does not need to be best to justify someone wanting to make it a bit better.

MBCook · 3h ago

I use plain old zip files almost every day.

“Best” is measured along a lot more axes than just performance. And you don’t always get to choose what format you use. It may be dictated to you by some 3rd party you can’t influence.

Twirrim · 6h ago

So? If I need to consume a resource compressed using bz2, I'm not just going to sit around and wait for them to use zstd. I'm going to break out bz2. If I can use a modern rewrite that's faster, I'll take every advantage I can get.

tcfhgj · 7h ago

> Counting CPU cycles as if it's an accomplishment seems irrelevant in a world where 50% of modern CPU resources are allocated toward UI eye candy.

Attitude which leads to electron apps replacing native ones, and I hate it. I am not buying better cpus and more ram just to have it wasted like this

stevefan1999 · 4h ago

You know it is just Wirth's law in action: "Software gets slower faster than hardware gets faster." [^1]

In fact, it's the Jevons paradox: when technological progress increases the efficiency with which a resource is used, the rate of consumption of that resource rises due to increasing demand. Essentially, efficiency improvements can lead to increased consumption rather than the intended conservation. [^2][^3]

[^1]: https://www.comp.nus.edu.sg/~damithch/quotes/quote27.htm

[^2]: https://www.greenchoices.org/news/blog-posts/the-jevons-para...

[^3]: https://quickonomics.com/terms/jevons-paradox/

Rucadi · 8h ago

I personally find the part about "Enabling cross-compilation" a lot more relevant, which in my opinion is important and a win.

The same about exported symbols and being able to compile to wasm easily.

Terr_ · 8h ago

It seems to me like binary file format parsing (and construction) is probably a good place for using languages that aren't as prone to buffer-overflows and the like. Especially if it's for a common format and the code might be used in all sorts of security-contexts.

wahern · 3h ago

Buffer overflows are more a library problem, not a language problem, though for newer ecosystems like Rust the distinction is kind of lost on people. But point being, if you rewrote bzip2 using an equivalent to std::Vec, you'd end up in the same place. Unfortunately, the norm among C developers, especially in the past, was to open code most buffer manipulation, so you wind up with 1000 manually written overflow checks, some of which are wrong or outright missing, as opposed to a single check in a shared implementation. Indeed, even that Rust code had an off-by-one (in "safe" code), it just wasn't considered a security issue because it would result in data corruption, not an overflow.

What Rust-the-language does offer is temporal safety (i.e. the borrow checker), and there's no easy way to get that in C.
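A sketch of the "single check in a shared implementation" idea, using a hypothetical parsing helper. Instead of open-coding `if (offset + 2 > len)` at every call site, the range check lives inside the standard slice method `get`, and the caller only has to handle the `None` case.

```rust
/// Reads a little-endian u16 at `offset`, or returns None if the buffer
/// is too short. (Hypothetical helper, not taken from the bzip2 crate.)
pub fn read_u16_le(buf: &[u8], offset: usize) -> Option<u16> {
    // checked_add guards against `offset + 2` wrapping around.
    let end = offset.checked_add(2)?;
    // `get` performs the one shared bounds check and yields None on failure.
    let bytes = buf.get(offset..end)?;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]))
}
```

The same structure is available in C with a small buffer library; the difference described above is that in Rust it is the default way to write the code.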

viraptor · 7h ago

Those cycles translate directly to $ saved in a few places. Mostly in places far away from having any UI at all.

Scuds · 3h ago

you're just an end user, you don't have to maintain the suite.

In OSS every hour of volunteer time is precious Manna from heaven, flavored with unicorn tears. So any way to remove Toil and introduce automation is gold.

Rust's strict compiler and an appropriate test suite guarantee a level of correctness far beyond C. There's less onus on the reviewer to ensure everything still works as expected when reviewing a pull request.

It's a win-win situation.

hoseja · 1h ago

It's like "adapting" Akallabêth so you can tell your own empowering story for modern audiences.

anonnon · 8h ago

> Counting CPU cycles

And that's assuming they aren't lying about the counting: https://desuarchive.org/g/thread/104831348/#q104831479

DaSHacka · 7h ago

Rust devs continuing to use misleading benchmarks? I, for one, am absolutely shocked. Flabbergasted, even.

bitwize · 5h ago

It's a lot like X11 vs. Wayland. The current graphics developers, who trend younger, don't want to maintain the boomer-written C code in the X server. Too risky and time-consuming. So one of the goals of Wayland is to completely abolish X so it can be replaced with something more long-term maintainable. Turns out, current systems-level developers don't want to maintain boomer-written GNU code or any C code at all, really, for similar reasons. C is inherently problematic because even seasoned developers have trouble avoiding its footguns. So an unstated, but important, goal of Rust is to abolish all critical C code and replace it with Rust code. Ubuntu is on board with this.

jxjnskkzxxhx · 7h ago

> lot of this "rewrite X in Rust" stuff feels like

Indeed. You know how the React-Angular-Vue-whatever cycle is churn? It appears that the trend of people pushing stuff because it benefits their careers is coming to the low-level world.

I for one still find it mystifying that Linus Torvalds let these people into the kernel. Linus, who famously banned C++ from the kernel, not because of C++ in itself, but to ban C++ programmer culture.
