Show HN: Single-agent long-horizon reasoning within one LLM run
4 hyluo 1 7/23/2025, 3:35:17 PM huggingface.co ↗
- We built the Thread Inference Model (TIM) on the transformer architecture, along with its dedicated runtime, TIMRUN.
- TIM + TIMRUN = intelligent workflow generation, context engineering, and multi-hop tool use all happen at the runtime level
- TIM + TIMRUN supports virtually unlimited reasoning via context pruning, significantly improving the efficiency of long-horizon reasoning tasks
- Inference API is live at https://subconscious.dev/
- More details: https://github.com/subconscious-systems/TIMRUN
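To give a rough intuition for the pruning idea, here is a minimal, purely illustrative sketch of subtask-based context pruning: intermediate steps of a finished subtask are dropped from the working context, and only its conclusion is retained. All names here are assumptions for illustration, not the actual TIM/TIMRUN implementation:

```python
# Hypothetical sketch of subtask-based context pruning
# (illustrative only; not the actual TIM/TIMRUN mechanism).

class ReasoningThread:
    def __init__(self):
        self.context = []  # entries kept in the working context

    def push_subtask(self, name):
        self.context.append(("subtask_start", name))

    def add_step(self, text):
        self.context.append(("step", text))

    def complete_subtask(self, conclusion):
        # Prune everything back to the most recent subtask_start,
        # keeping only a short conclusion in the working context.
        while self.context and self.context[-1][0] != "subtask_start":
            self.context.pop()
        name = self.context.pop()[1]
        self.context.append(("conclusion", f"{name}: {conclusion}"))


thread = ReasoningThread()
thread.push_subtask("lookup population")
thread.add_step("call search tool")
thread.add_step("parse result: 8.3M")
thread.complete_subtask("population is 8.3M")
print(thread.context)  # only the pruned conclusion remains
```

In a real system the pruned entries would correspond to KV-cache states, so finished subtasks stop consuming attention memory while their conclusions stay visible to later reasoning.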
Quick question — how does the context-pruning mechanism decide which KV states to discard vs. retain? Just trying to understand how it balances memory efficiency with reasoning depth.
I’ll sign up and give the API a try; excited to see it in action!