I wanted to share a very concrete technical challenge that I submitted to all the major AIs on the market (Claude, Gemini, Mistral, etc.) and that every one of them failed… except ChatGPT-4. The challenge:

    “Construct opcode 0x08C0C166 (rol ax,8) in ECX, starting from zeroed registers,
    with no memory access, no stack, no immediate values, only using classic instructions.
    No cheating by assuming registers already contain the desired value.”
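
As a sanity check on the target value: x86 stores dwords little-endian, so 0x08C0C166 corresponds to the byte sequence 66 C1 C0 08, which in 32-bit mode encodes rol ax, 8 (operand-size prefix 66, opcode C1 /0 with ModRM C0, immediate 08). A minimal Python sketch of that byte-order relationship (the names `TARGET` and `encoded` are illustrative, not from the original challenge):

```python
import struct

TARGET = 0x08C0C166

# x86 is little-endian: the low byte of the dword is the first byte in memory.
encoded = struct.pack("<I", TARGET)
print(encoded.hex(" "))  # 66 c1 c0 08
```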

This question tests not only x86 assembly knowledge but, above all, pure algorithmic reasoning.

The results:

    Claude (Anthropic) and other advanced AIs: unable to provide a valid solution
    (some even admitted “that’s genius” when shown the answer!)
    ChatGPT-4:
        Not only solved it,
        But actually outperformed my own (human) solution, optimizing it to 17 instructions where I needed 18!

The code for those interested:

    xor cl,cl
    inc cl
    inc cl
    mov al,cl
    inc cl
    mov ch,cl
    ror cl,cl
    add cl,ch
    add cl,ch
    rol ch,cl
    mov bl,ch
    inc ch
    bswap ecx
    mul al
    add al,al
    mov cl,al
    mov ch,bl
    bswap ecx
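
One way to check this listing without an assembler is to trace it in a small emulator. The sketch below is a hypothetical Python helper (`run` and `PROGRAM` are names made up here, not part of the original solution). It models only the handful of instructions used, on the 8-bit subregisters of EAX, EBX, and ECX, and it ignores flags on the assumption that nothing in the sequence reads them.

```python
PROGRAM = """
xor cl,cl
inc cl
inc cl
mov al,cl
inc cl
mov ch,cl
ror cl,cl
add cl,ch
add cl,ch
rol ch,cl
mov bl,ch
inc ch
bswap ecx
mul al
add al,al
mov cl,al
mov ch,bl
bswap ecx
"""

def run(program):
    """Execute a tiny x86 subset on zeroed registers and return ECX."""
    regs = {"eax": 0, "ebx": 0, "ecx": 0}
    # 8-bit register name -> (32-bit parent, bit offset)
    sub = {"al": ("eax", 0), "ah": ("eax", 8), "bl": ("ebx", 0),
           "bh": ("ebx", 8), "cl": ("ecx", 0), "ch": ("ecx", 8)}

    def get(r):
        parent, shift = sub[r]
        return (regs[parent] >> shift) & 0xFF

    def put(r, v):
        parent, shift = sub[r]
        cleared = regs[parent] & ~(0xFF << shift) & 0xFFFFFFFF
        regs[parent] = cleared | ((v & 0xFF) << shift)

    for line in program.strip().splitlines():
        op, _, args = line.strip().partition(" ")
        a = [x.strip() for x in args.split(",")]
        if op == "xor":
            put(a[0], get(a[0]) ^ get(a[1]))
        elif op == "inc":
            put(a[0], get(a[0]) + 1)
        elif op == "mov":
            put(a[0], get(a[1]))
        elif op == "add":
            put(a[0], get(a[0]) + get(a[1]))
        elif op in ("rol", "ror"):
            # The CPU masks the count to 5 bits; a byte rotate then repeats every 8.
            n = (get(a[1]) & 0x1F) % 8
            if op == "ror":
                n = (8 - n) % 8  # rotate right n == rotate left 8 - n
            v = get(a[0])
            put(a[0], ((v << n) | (v >> (8 - n))) & 0xFF)
        elif op == "mul":
            # mul r8 computes AX = AL * r8
            prod = get("al") * get(a[0])
            regs["eax"] = (regs["eax"] & 0xFFFF0000) | (prod & 0xFFFF)
        elif op == "bswap":
            regs[a[0]] = int.from_bytes(regs[a[0]].to_bytes(4, "little"), "big")
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs["ecx"]

print(hex(run(PROGRAM)))  # 0x8c0c166
```

Python's `hex` drops the leading zero, so the printed value is the same number as 0x08C0C166.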

Why share this here?

This challenge is:

    100% reproducible
    Impossible to “cheat” by copy-pasting from the web,
    A real benchmark for testing an AI’s deep reasoning,
    And, in my tests, ChatGPT-4 was the only AI to both solve and optimize it!

Kudos to the OpenAI team for this level of reasoning, and I encourage the community to share more “real world” challenges like this to truly compare AI model strength!

(PS: If any OpenAI team member wants more details or would like to see full logs/comparisons with other AIs, I can provide all outputs on request.)

Let me show you what ChatGPT-4o came up with.

    xor     cl, cl        ; CL = 0
    inc     cl            ; CL = 1
    inc     cl            ; CL = 2
    mov     al, cl        ; AL = 2               (keep a 2 for later)
    rol     al, cl        ; AL = 8   (2 rol 2)   <- replaces the mul+add combo
    inc     cl            ; CL = 3
    mov     ch, cl        ; CH = 3
    ror     cl, cl        ; CL = 96 (0x60)       (3 ror 3)
    add     cl, ch        ; CL = 99 (0x63)
    add     cl, ch        ; CL = 102 (0x66)
    rol     ch, cl        ; CH = 0xC0            (count 0x66 is masked to 6: 3 rol 6 = 0xC0)
    mov     bl, ch        ; BL = 0xC0            (save the C0)
    inc     ch            ; CH = 0xC1
    bswap   ecx           ; ECX = 0x66C10000
    mov     cl, al        ; CL = 0x08            (put the 08 in the low byte)
    mov     ch, bl        ; CH = 0xC0            (restore the C0)
    bswap   ecx           ; ECX = 0x08C0C166
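
The rotate steps are where this version does its work, so they are worth checking by hand. Here is a minimal Python sketch of 8-bit rotates, assuming the usual x86 behaviour that the count in CL is masked to 5 bits before use; `rol8` and `ror8` are hypothetical helper names introduced only for this illustration.

```python
def rol8(value, count):
    """Rotate an 8-bit value left; the count is masked to 5 bits first."""
    n = (count & 0x1F) % 8
    return ((value << n) | (value >> (8 - n))) & 0xFF

def ror8(value, count):
    """Rotate an 8-bit value right, expressed as the complementary left rotate."""
    n = (count & 0x1F) % 8
    return rol8(value, (8 - n) % 8)

al = rol8(2, 2)          # rol al, cl with AL = CL = 2
cl = ror8(3, 3) + 3 + 3  # ror cl, cl followed by two add cl, ch (CH = 3)
ch = rol8(3, cl)         # rol ch, cl: the count 0x66 is masked down to 6

print(hex(al), hex(cl), hex(ch))  # 0x8 0x66 0xc0
```

The single `rol al, cl` yields 8 directly from the saved 2, which is why it can stand in for the `mul al` / `add al, al` pair.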

Why I’m convinced that 17 instructions is the true minimum:

When I gave this challenge to ChatGPT-4o, it took almost two full minutes of intense reasoning and step-by-step computation to produce a solution in 17 instructions. This wasn’t a random guess — it involved deep optimization, clever register reuse, and a brilliant use of ROL, ROR, and BSWAP to avoid any 32-bit immediates or memory usage.

Here’s why I believe a 16-instruction solution is nearly impossible:

    ChatGPT-4o is a cutting-edge reasoning model, though not an exhaustive search, so this is strong evidence rather than proof.
    It found a solution with no constants, no stack, and no memory — just pure register arithmetic.
    Every instruction in the final solution is essential. There’s no fluff.

So unless someone discovers an undocumented opcode trick or abuses the architecture beyond normal constraints, 17 is likely the hard floor.

If you want to try, here’s your target output: ECX = 0x08C0C166 using clean 32-bit PE code, no stack, no memory, and no immediate 0x08C0C166.

Comments (4)

zgs · 21h ago

It might be shorter to do some multiplications:

0x08C0C166 = 146,850,150 = 2⁴ × 3 × 5⁵ × 11 × 89 + 150 = 10⁴ × 3 × 5 × 11 × 89 + 150

Plenty of values that can be reused (11 = 10 + 1 and 89 = 10² - 11).

Still, quite a bit of manipulation is required, and there are only 17 instructions to do it in.
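
A decomposition like this is easy to check numerically. The short Python sketch below (variable names are illustrative) computes the product of the listed factors and the remainder left over against the target:

```python
TARGET = 0x08C0C166

product = 10**4 * 3 * 5 * 11 * 89
# Both ways of writing the factors give the same number.
assert product == 2**4 * 3 * 5**5 * 11 * 89

remainder = TARGET - product
print(TARGET, product, remainder)  # 146850150 146850000 150
```

The remainder is itself 3 × 5 × 10, so it can be built from the same reusable pieces.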

xddj · 5h ago

That's a very clever approach — I hadn't even thought of factoring the target value like that.

Decomposing `0x08C0C166` into `2⁴ × 3 × 5⁵ × 11 × 89 + 150` and reusing parts like `11 = 10 + 1` and `89 = 10² - 11` is genuinely interesting.

Still, as you said, packing all the necessary manipulations into just 17 instructions is the real challenge — especially when you try to avoid any immediate constants, memory access, or stack usage.

If you do find a shorter sequence that matches the constraints exactly, please share! I’d love to see how far this can be optimized.

zgs · 22h ago

The initial XOR instruction isn't required, registers are zeroed as per the problem statement.

xddj · 5h ago

Yes, you're absolutely right — the initial `xor cl, cl` is technically redundant if we assume all registers are zeroed at start, as stated in the problem.

I kept it in the solution mostly out of habit and to make the logic more explicit, but you're correct that it could be removed, bringing the count down to 16.

That said, for consistency (and because some AI models needed it to understand the logic flow), I still include it when comparing instruction count across different versions.

But you're totally right: under the problem's assumptions, `xor cl, cl` is free.