Hey all!
I've just released qwen3-rs, a Rust project for running and exporting Qwen3 models (Qwen3-0.6B, 4B, 8B, DeepSeek-R1-0528-Qwen3-8B, etc.) with minimal dependencies and no Python required.
- Educational: Core algorithms are reimplemented from scratch for learning and transparency.
- CLI tools: Export HuggingFace Qwen3 models to a custom binary format, then run inference on CPU.
- Modular: Clean separation between export, inference, and CLI.
- Safety: Some unsafe code is used, mostly to work with memory-mapped files (this helps lower memory requirements during export and inference).
- Future plans: I'm curious how to extend it to support:
* fine-tuning of small models
* faster inference (e.g. optimized matmul operations)
* a WASM build to run inference in a browser
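To make the matmul item a bit more concrete, here is a minimal sketch of the kind of change I have in mind. It is illustrative only and not taken from the qwen3-rs source: the function names `matvec_naive` and `matvec_unrolled` are made up for this post, and the row-major flat weight layout is an assumption. The second version keeps four independent accumulators per row, which may make it easier for the compiler to vectorize the inner loop. I have not benchmarked it.

```rust
/// Naive matrix-vector product: out[i] = sum_j w[i * n + j] * x[j].
/// `w` is assumed to be a row-major (out.len() x x.len()) matrix in a flat slice.
fn matvec_naive(out: &mut [f32], x: &[f32], w: &[f32]) {
    let n = x.len();
    assert_eq!(w.len(), out.len() * n, "weight shape mismatch");
    for (i, o) in out.iter_mut().enumerate() {
        let row = &w[i * n..(i + 1) * n];
        *o = row.iter().zip(x).map(|(a, b)| a * b).sum();
    }
}

/// Same product, but each row is processed in chunks of four with
/// four independent accumulators; leftover elements go into `tail`.
fn matvec_unrolled(out: &mut [f32], x: &[f32], w: &[f32]) {
    let n = x.len();
    assert_eq!(w.len(), out.len() * n, "weight shape mismatch");
    for (i, o) in out.iter_mut().enumerate() {
        let row = &w[i * n..(i + 1) * n];
        let row_chunks = row.chunks_exact(4);
        let x_chunks = x.chunks_exact(4);
        // Handle the (n % 4) trailing elements separately.
        let tail: f32 = row_chunks
            .remainder()
            .iter()
            .zip(x_chunks.remainder())
            .map(|(a, b)| a * b)
            .sum();
        let mut acc = [0.0f32; 4];
        for (r, v) in row_chunks.zip(x_chunks) {
            for k in 0..4 {
                acc[k] += r[k] * v[k];
            }
        }
        *o = acc.iter().sum::<f32>() + tail;
    }
}
```

Note that the unrolled version adds the products in a different order, so its results can differ from the naive one by floating-point rounding on real weights.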
Basically, I used https://github.com/adriancable/qwen3.c as a reference implementation and translated it from C/Python to Rust with the help of commercial LLMs (mostly Claude Sonnet 4). Please note that my primary goal is self-learning in this field, so there are likely some inaccuracies.
GitHub: https://github.com/reinterpretcat/qwen3-rs