Open-Source Refact.ai Agent is #1 on SWE-bench Lite With a 59.7% Score

Comments (1)

kate_at_refact · 7h ago

Open-source Refact.ai achieves #1 on SWE-bench Lite with a 59.7% score. Our approach: fully autonomous Agent, no manual intervention needed.

How we did this:

• Prompt strategy: https://github.com/smallcloudai/refact/blob/swe-boosted-prom...
• Claude 3.7 Sonnet as orchestrator
• deep_analysis() tool (powered by o4-mini) for reasoning
• Tool suite for repository exploration, code modification, and testing, used dynamically based on task needs
• One correct solution through iteration!

Autonomy = our core strength.

Refact.ai Agent completes the entire dev workflow independently: plans, executes, tests, self-corrects, and delivers a production-ready result. For each task, it made one multi-step run to generate a single correct solution, creating custom strategies rather than following rigid scripts.
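To make the plan / execute / test / self-correct loop a bit more concrete, here is a minimal sketch of how an orchestrator could pick tools dynamically until the tests pass. It is not Refact.ai's actual code: every name in it (`choose_tool`, `run_agent`, `TOOLS`, the state keys) is hypothetical, the tools are stubs that only flip flags, and the hard-coded rules in `choose_tool` stand in for the decisions an orchestrator model would make.

```python
# Illustrative sketch only; all names and behaviour here are assumptions.

def initial_state():
    return {"explored": False, "patched": False, "analysed": False,
            "tests_passed": None, "attempts": 0}

def explore(state):
    # Stub for repository exploration.
    state["explored"] = True

def edit(state):
    # Stub for code modification; counts how many patches were tried.
    state["attempts"] += 1
    state["patched"] = True

def run_tests(state):
    # Stub for testing: pretend the first patch fails and the second passes.
    state["tests_passed"] = state["attempts"] >= 2
    if not state["tests_passed"]:
        state["patched"] = False  # a failed patch has to be redone

def deep_analysis(state):
    # Stub for a separate reasoning step consulted after a failure.
    state["analysed"] = True

TOOLS = {"explore": explore, "edit": edit,
         "run_tests": run_tests, "deep_analysis": deep_analysis}

def choose_tool(state):
    # Stand-in for the orchestrator's decision about what to do next.
    if not state["explored"]:
        return "explore"
    if state["tests_passed"] is False and not state["analysed"]:
        return "deep_analysis"
    if not state["patched"]:
        return "edit"
    return "run_tests"

def run_agent(state, tools, max_steps=10):
    """Run one multi-step attempt; return (solved, list of tools used)."""
    trace = []
    for _ in range(max_steps):
        name = choose_tool(state)
        tools[name](state)
        trace.append(name)
        if state["tests_passed"]:
            return True, trace
    return False, trace

if __name__ == "__main__":
    solved, trace = run_agent(initial_state(), TOOLS)
    print(solved, trace)
```

In this toy run the first patch fails its tests, the loop calls the analysis step once, patches again, and stops when the tests pass, so a single run ends with one accepted solution.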

You can read the technical details of our SWE-bench approach here: https://refact.ai/blog/2025/sota-on-swe-bench-lite-open-sour...

Your questions are welcome! You're also welcome to try Refact.ai Agent in VS Code and JetBrains: https://linktr.ee/refactai
