DeepSeek v3.1 is not having a moment

5 paulpauper 1 8/23/2025, 4:21:47 PM thezvi.substack.com ↗

Comments (1)

karmakaze · 2h ago
What I find impressive about V3.1 is what's different, especially the efficiency:

Significant improvements in training efficiency through innovations like FP8 mixed-precision training, which cuts memory use by up to 75% (the saving from storing a value in 8 bits instead of 32) and speeds up training.
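For anyone wondering where the "up to 75%" comes from, here is a rough back-of-the-envelope sketch. It only counts bytes per stored value and assumes the baseline is FP32. Real mixed-precision training keeps some tensors in higher precision, so the actual saving is smaller. The function names are made up for illustration.

```python
# Bytes needed to store a single value in each numeric format.
BYTES_PER_VALUE = {"fp32": 4, "bf16": 2, "fp8": 1}


def memory_saving(baseline: str, target: str) -> float:
    """Fraction of storage saved by moving from `baseline` to `target`."""
    return 1 - BYTES_PER_VALUE[target] / BYTES_PER_VALUE[baseline]


def tensor_gib(num_values: int, fmt: str) -> float:
    """Storage in GiB for `num_values` values held in format `fmt`."""
    return num_values * BYTES_PER_VALUE[fmt] / 2**30


if __name__ == "__main__":
    print(memory_saving("fp32", "fp8"))  # 0.75
    print(memory_saving("bf16", "fp8"))  # 0.5
```

So 75% is the ceiling against an FP32 baseline; against BF16, the same arithmetic gives at most 50%.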

Faster inference with a multi-token prediction architecture that generates multiple tokens per step, resulting in roughly 2-3x faster output.
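To sanity-check the speedup figure, here is a toy model of how many tokens one decoding step yields when extra tokens are predicted and then verified. It assumes each extra token is accepted independently with the same probability, that acceptance stops at the first rejection, and that verification costs nothing. None of these assumptions or numbers come from DeepSeek; the function name is hypothetical.

```python
def expected_tokens_per_step(accept_prob: float, extra_tokens: int) -> float:
    """Expected tokens emitted per step under the toy model.

    One token is always produced. Extra token i only counts if it and every
    extra token before it were accepted, which happens with probability
    accept_prob ** i.
    """
    expected = 1.0
    survive = 1.0  # probability that all extra tokens so far were accepted
    for _ in range(extra_tokens):
        survive *= accept_prob
        expected += survive
    return expected


if __name__ == "__main__":
    for extra in (1, 2, 3):
        print(extra, expected_tokens_per_step(0.8, extra))
```

Under this model, one extra token per step can never exceed 2x, so a 2-3x figure would need at least two extra tokens with high acceptance rates, and that is before subtracting any verification overhead.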

A new hybrid thinking mode that allows switching between a fast non-thinking mode and slower, more deliberate reasoning without quality loss.