As someone who has spent days wrestling with Python dependency hell just to get a model running, a simple cargo run feels like a dream. But I'm wondering, what was the most painful part of NOT having a framework? I'm betting my coffee money it was debugging the backpropagation logic.
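A common way to debug hand-written backpropagation is a gradient check: compare the hand-derived gradient against a finite-difference estimate. The sketch below is only an illustration on a toy scalar loss. The function names and the loss itself are assumptions made for this example and are not taken from the RustGPT repository.

```rust
/// Toy scalar loss, assumed for illustration: L(w) = (w * x - y)^2.
fn loss(w: f64, x: f64, y: f64) -> f64 {
    (w * x - y).powi(2)
}

/// Hand-derived gradient of the toy loss: dL/dw = 2 * x * (w * x - y).
fn analytic_grad(w: f64, x: f64, y: f64) -> f64 {
    2.0 * x * (w * x - y)
}

/// Central finite-difference estimate of dL/dw with step size `eps`.
fn numeric_grad(w: f64, x: f64, y: f64, eps: f64) -> f64 {
    (loss(w + eps, x, y) - loss(w - eps, x, y)) / (2.0 * eps)
}

/// Returns true when the two gradients differ by less than `tol`.
fn gradients_agree(w: f64, x: f64, y: f64, tol: f64) -> bool {
    (analytic_grad(w, x, y) - numeric_grad(w, x, y, 1e-5)).abs() < tol
}
```

Calling `gradients_agree(0.5, 2.0, 3.0, 1e-6)` on a correct derivation should return true; a sign error or a missing factor in `analytic_grad` would make it return false.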

ricardobeat · 13m ago

Have you tried uv [1]? It has removed 90% of the pain of running python projects for me.

[1] https://github.com/astral-sh/uv

codetiger · 31m ago

I guess, resource utilization like GPU, etc

taminka · 48m ago

lowkey ppl who praise cargo seem to have no idea of the tradeoffs involved in dependency management

the difficulty of including a dependency should be proportional to the risk you're taking on, meaning it shouldn't be as difficult as it is in, say, C, where every other library is continually reinventing the same 5 utilities, but also not as easy as it is with npm or cargo, because you get insane dependency clutter, and all the related issues like security, build times, etc

how good a build system is isn't equivalent to how easy it is to include a dependency. modern languages should have a consistent build system, but having a centralised package repository that anyone can freely push to and pull from, and having those dependencies freely take on any number of other dependencies, is a bad way to handle dependencies

quantumspandex · 15m ago

Security is another problem, and should be tackled systematically. Artificially making dependency inclusion hard is not it and is detrimental to the more casual use cases.

jokethrowaway · 1m ago

Is your argument that python's package management & ecosystem is bad by design - to increase security?

In my experience it's just bugs and poor decision making on the maintainers (eg. pytorch dropping support for intel mac, leftpad in node) or on the language and package manager developers side (py2->3, commonjs, esm, go not having a package manager, etc).

Cargo has less friction than pypi and npm. npm has less friction than pypi.

And yet, you just need to compromise one lone, unpaid maintainer to wreck the security of the ecosystem.

itsibitzi · 16m ago

What tool or ecosystem does this well, in your opinion?

Snuggly73 · 11m ago

Congrats - there is a very small problem with the LLM - it's reusing transformer blocks and you want to use different instances of them.

It's a very cool exercise. I did the same with Zig and MLX a while back to get a nice foundation, but since then, as I got hooked and kept adding stuff to it, I switched to Pytorch/Transformers.
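The distinction between reusing one block and stacking separate instances can be sketched as follows. This is a hypothetical toy, not the repository's code: `Block` reduces a transformer block to a single weight, and `build_stack`, `forward_stack` and `forward_shared` are names invented for this example.

```rust
/// Hypothetical stand-in for a transformer block: one weight plays the
/// role of all of the block's parameters.
#[derive(Debug)]
struct Block {
    pub weight: f32,
}

impl Block {
    fn new(init: f32) -> Self {
        Block { weight: init }
    }

    fn forward(&self, x: f32) -> f32 {
        x * self.weight
    }

    /// Toy update step standing in for a real optimizer.
    fn update(&mut self, grad: f32, lr: f32) {
        self.weight -= lr * grad;
    }
}

/// Builds `n` independent blocks, each owning its own parameters.
fn build_stack(n: usize) -> Vec<Block> {
    (0..n).map(|i| Block::new(1.0 + i as f32 * 0.1)).collect()
}

/// Passes the input through every distinct block in order.
fn forward_stack(blocks: &[Block], x: f32) -> f32 {
    blocks.iter().fold(x, |acc, b| b.forward(acc))
}

/// Applies the same block `depth` times, so every layer shares one weight.
fn forward_shared(block: &Block, depth: usize, x: f32) -> f32 {
    (0..depth).fold(x, |acc, _| block.forward(acc))
}
```

With `build_stack(3)`, updating `stack[0]` leaves `stack[1]` and `stack[2]` untouched, so each layer can learn something different. With `forward_shared`, any update to the single block changes every layer at once.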

icemanx · 8m ago

correction: It's a cool exercise if you write it yourself and don't use GPT

Snuggly73 · 4m ago

well, hopefully the author did learn something or at least enjoyed the process :)

(the code looks like a very junior or a non-dev wrote it tbh).

jlmcgraw · 13m ago

Some commentary from the author here: https://www.reddit.com/r/rust/comments/1nguv1a/i_built_an_ll...

Goto80 · 1h ago

Nice. Mind putting a license on that?

thomask1995 · 1m ago

License added! Good catch

techsystems · 1h ago

> ndarray = "0.16.1"
> rand = "0.9.0"
> rand_distr = "0.5.0"

Looking good!

kachapopopow · 1h ago

I was slightly curious:

  cargo tree
  llm v0.1.0 (RustGPT)
  ├── ndarray v0.16.1
  │   ├── matrixmultiply v0.3.9
  │   │   └── rawpointer v0.2.1
  │   │   [build-dependencies]
  │   │   └── autocfg v1.4.0
  │   ├── num-complex v0.4.6
  │   │   └── num-traits v0.2.19
  │   │       └── libm v0.2.15
  │   │       [build-dependencies]
  │   │       └── autocfg v1.4.0
  │   ├── num-integer v0.1.46
  │   │   └── num-traits v0.2.19 (*)
  │   ├── num-traits v0.2.19 (*)
  │   └── rawpointer v0.2.1
  ├── rand v0.9.0
  │   ├── rand_chacha v0.9.0
  │   │   ├── ppv-lite86 v0.2.20
  │   │   │   └── zerocopy v0.7.35
  │   │   │       ├── byteorder v1.5.0
  │   │   │       └── zerocopy-derive v0.7.35 (proc-macro)
  │   │   │           ├── proc-macro2 v1.0.94
  │   │   │           │   └── unicode-ident v1.0.18
  │   │   │           ├── quote v1.0.39
  │   │   │           │   └── proc-macro2 v1.0.94 (*)
  │   │   │           └── syn v2.0.99
  │   │   │               ├── proc-macro2 v1.0.94 (*)
  │   │   │               ├── quote v1.0.39 (*)
  │   │   │               └── unicode-ident v1.0.18
  │   │   └── rand_core v0.9.3
  │   │       └── getrandom v0.3.1
  │   │           ├── cfg-if v1.0.0
  │   │           └── libc v0.2.170
  │   ├── rand_core v0.9.3 (*)
  │   └── zerocopy v0.8.23
  └── rand_distr v0.5.1
      ├── num-traits v0.2.19 (*)
      └── rand v0.9.0 (*)

yep, still looks relatively good.

cmrdporcupine · 33m ago

linking both rand-core 0.9.0 and rand-core 0.9.3 which the project could maybe avoid by just specifying 0.9 for its own dep on it

tonyhart7 · 1h ago

is this satire, or is there context behind this comment that I need to know???

stevedonovan · 1h ago

These are a few well-chosen dependencies for a serious project.

Rust projects can really go bananas on dependencies, partly because it's so easy to include them

obsoleszenz · 1h ago

The project only has 3 dependencies which i interpret as a sign of quality

kachapopopow · 1h ago

This looks rather similar to what I got when I asked an AI to implement a basic XOR problem solver. I guess fundamentally there's only a very limited number of ways to implement this.

abricq · 38m ago

This is great! Congratulations. I really like your project, and I especially like how easy it is to peek at.

Do you plan on moving forward with this project? As I understand it, all the training is done on the CPU, and your next steps include optimizing that. Are you considering GPU acceleration?

Also, do you have any benchmarks on known hardware? E.g., how long would it take to train on a latest-gen MacBook or on your own computer?

Charon77 · 1h ago

Absolutely love how readable the entire project is

emporas · 1h ago

It is very procedural/object oriented. This is not considered good Rust practice. Iterators make it more functional, which is better, more succinct that is, and enums more algebraic. But it's totally fine for a thought experiment.
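To make the contrast concrete, here is a small sketch of an index-based loop next to an iterator chain, plus an enum matched exhaustively. The function names and the `Activation` enum are made up for this example and do not come from the project.

```rust
/// Procedural style: index-based loop with a mutable accumulator.
fn sum_squares_loop(xs: &[f32]) -> f32 {
    let mut total = 0.0;
    for i in 0..xs.len() {
        total += xs[i] * xs[i];
    }
    total
}

/// Functional style: the same computation as an iterator chain.
fn sum_squares_iter(xs: &[f32]) -> f32 {
    xs.iter().map(|x| x * x).sum()
}

/// Algebraic style: each variant can carry its own data, and `match`
/// must handle every variant.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Activation {
    Relu,
    LeakyRelu(f32),
    Identity,
}

impl Activation {
    fn apply(self, x: f32) -> f32 {
        match self {
            Activation::Relu => x.max(0.0),
            Activation::LeakyRelu(slope) => {
                if x >= 0.0 {
                    x
                } else {
                    slope * x
                }
            }
            Activation::Identity => x,
        }
    }
}
```

Both `sum_squares` functions compute the same value; the iterator version is shorter and avoids manual indexing. Adding a new `Activation` variant makes the compiler flag every `match` that does not yet handle it.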

koakuma-chan · 36m ago

It's AI generated

Revisional_Sin · 27m ago

How do you know? The over-commenting?

koakuma-chan · 19m ago

I know because this is how an AI generated project looks. Clearly AI generated README, "clean" code, the way files are named, etc.

magackame · 10m ago

Not sure myself. Commit messages look pretty human. But the emojis in readme and comments like "// Re-export key structs for easier access", "# Add any test-specific dependencies here if needed" are sus indeed.

cmrdporcupine · 15m ago

To me it looks like LLM generated README, but not necessarily the source (or at least not all of it).

Or there's been a cleaning pass done over it.

koakuma-chan · 10m ago

I think pretty clearly the source is also at least partially generated. Nonetheless, just a README like that already sends a strong signal to stop looking and not trust anything written there.

GardenLetter27 · 12m ago

The repeated Impls are strange.

magackame · 6m ago

Where? Don't see any on latest main (685467e).

yieldcrv · 1h ago

Never knew Rust could be that readable. Makes me think other Rust engineers are stuck in a masochistic ego driven contest, which would explain everything else I've encountered about the Rust community and recruiting on that side.

GardenLetter27 · 10m ago

Most Rust code looks like this - only generic library code goes crazy with all the generics and lifetimes, due to the need to avoid unnecessary mallocs and also provide a flexible API to users.

But most people aren't writing libraries.

jmaker · 1h ago

Not sure what you’re alluding to but that’s just ordinary Rust without performance or async IO concerns.

ndai · 1h ago

I’m curious where you got your training data? I will look myself, but saw this and thought I’d ask. I have a CPU-first, no-backprop architecture that works very well on classification datasets. It can do single‑example incremental updates which might be useful for continuous learning. I made a toy demo to train on tiny.txt and it can predict next characters, but I’ve never tried to make an LLM before. I think my architecture might work well as an on-device assistant or for on-premises needs, but I want to work with it more before I embarrass myself. Any open-source LLM training datasets you would recommend?

electroglyph · 1h ago

https://huggingface.co/datasets/NousResearch/Hermes-3-Datase...

Snuggly73 · 9m ago

To my untrained eye, this looks more like an instruct dataset.

For just plain text, I really like this one - https://huggingface.co/datasets/roneneldan/TinyStories

kachapopopow · 1h ago

huggingface has plenty of OpenAI and Anthropic user-to-assistant chains. beware, there are dragons (hallucinations), but they're good enough for instruction training. I actually recommend distilling kimi k2 instead for instruction-following capabilities.

enricozb · 1h ago

I did this [0] (gpt in rust) with picogpt, following the great blog by jaykmody [1].

[0]: https://github.com/enricozb/picogpt-rust [1]: https://jaykmody.com/blog/gpt-from-scratch/

bigmuzzy · 57m ago

nice

RustGPT: A pure-Rust transformer LLM built from scratch

Comments (41)