Voxtral – Frontier open source speech understanding models

66 meetpateltech 18 7/15/2025, 2:47:02 PM mistral.ai ↗

Comments (18)

kamranjon · 1h ago
I'm pretty excited to play around with this. I've worked with whisper quite a bit; it's awesome to have another model in the same class, and from Mistral, who tend to be very open. I'm sure unsloth is already working on some GGUF quants - will probably spin it up tomorrow and try it on some audio.
homarp · 9h ago
Running Voxtral-Mini-3B-2507 on GPU requires ~9.5 GB of GPU RAM in bf16 or fp16.

Running Voxtral-Small-24B-2507 on GPU requires ~55 GB of GPU RAM in bf16 or fp16.
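Those figures line up with a rough back-of-the-envelope estimate: 2 bytes per parameter in bf16/fp16 for the weights, plus headroom for the audio encoder, activations, and KV cache. A minimal sketch of the weights-only arithmetic (the overhead beyond weights is not broken down in Mistral's numbers, so only the first term is checkable):

```python
def weight_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """GiB needed just for the model weights (2 bytes/param in bf16 or fp16)."""
    return n_params * bytes_per_param / 2**30

# Weights alone: ~5.6 GiB for 3B params, ~44.7 GiB for 24B params.
# The quoted ~9.5 GB / ~55 GB figures also have to cover the audio
# encoder, activations, and KV cache, hence the gap above weights-only.
print(f"Voxtral-Mini-3B weights:   {weight_gib(3e9):.1f} GiB")
print(f"Voxtral-Small-24B weights: {weight_gib(24e9):.1f} GiB")
```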

ipsum2 · 5h ago
24B is crazy expensive for speech transcription. Conspicuously, there's no comparison with Parakeet, a 600M-param model that's currently dominating leaderboards (but only for English).
sheerun · 2h ago
In the demo they mention that Polish pronunciation is pretty bad, spoken as if Polish were the second language of a native English speaker. I wonder if it's the same for other languages. On the other hand, whispered English is hilariously good, especially with different emotions.
GaggiX · 10h ago
There is also a Voxtral Small 24B model available to download: https://huggingface.co/mistralai/Voxtral-Small-24B-2507
lostmsu · 9h ago
Does it support realtime transcription? What is the approximate latency?
danelski · 13h ago
They claim to undercut competitors of similar quality by half for both models, yet they released both as Apache 2.0 instead of following the smaller-open, larger-closed strategy used in their recent releases. What's different here?
halJordan · 5h ago
They didn't release Voxtral Large, so your question doesn't really make sense
wmf · 6h ago
They're working on a bunch of features so maybe those will be closed. I guess they're feeling generous on the base model.
Havoc · 6h ago
Probably not looking to compete directly in the transcription space
lostmsu · 9h ago
My Whisper v3 Large Turbo is $0.001/min, so their price comparison isn't exactly perfect.
ImageXav · 9h ago
How did you achieve that? I was looking into it and $0.006/min is quoted everywhere.
lostmsu · 8h ago
Harvesting idle compute. https://borgcloud.org/speech-to-text
4b11b4 · 40m ago
This is your service?
BetterWhisper · 7h ago
Do you support speaker recognition?
lostmsu · 5h ago
No. I found models doing that unreliable when there are many speakers.