Show HN: Single-agent long-horizon reasoning within one LLM run

4 hyluo 1 7/23/2025, 3:35:17 PM huggingface.co ↗
- We built the Thread Inference Model (TIM), a transformer-based model, and its dedicated runtime, TIMRUN.

- With TIM + TIMRUN, intelligent workflow generation, context engineering, and multi-hop tool use all happen at the runtime level.

- TIM + TIMRUN supports virtually unlimited reasoning via context pruning, which significantly improves efficiency on long-horizon reasoning tasks.

- Inference API is live at https://subconscious.dev/

- More details: https://github.com/subconscious-systems/TIMRUN

Comments (1)

kevin8704 · 1d ago
Awesome! It looks like you’re building a “reasoning tree” approach with runtime-level context engineering and pruning.

Quick question — how does the context-pruning mechanism decide which KV states to discard vs. retain? Just trying to understand how it balances memory efficiency with reasoning depth.

I’ll sign up and give the API a try; excited to see it in action!