Qwen3-235B-A22B-Instruct-2507

33 tosh 2 7/21/2025, 5:19:27 PM huggingface.co ↗

Comments (2)

jackmhny · 3h ago
these benches are crazy

    +-------------+----------+-----------+-------+-------+-------+
    | Task        | A22B-Ins |      A22B |    K2 | Opus4 | Deeps |
    +-------------+----------+-----------+-------+-------+-------+
    | GPQA        |    *77.5 |      62.9 | +75.1 | -74.9 |  68.4 |
    | AIME25      |    *70.3 |      24.7 | +49.5 |  33.9 | -46.6 |
    | LiveCB_v6   |    *51.8 |      32.9 | +48.9 |  44.6 | -45.2 |
    | ArenaHard2  |    *79.2 |     -52.0 | +66.1 |  51.5 |  45.6 |
    | BFCL_v3     |    *70.9 |     +68.0 | -65.2 |  60.1 |  64.7 |
    +-------------+----------+-----------+-------+-------+-------+
    * = 1st, + = 2nd, - = 3rd

homarp · 5h ago
teased on twitter, https://x.com/JustinLin610/status/1947281769134170147

and later they will release the thinking model

on selected benchmarks, it beats Kimi K2