New Prompt Engineering Metaheuristic – (NoA) Network of Agents

3 scraper01 1 8/15/2025, 10:31:34 PM github.com ↗

Comments (1)

scraper01 · 8h ago
I've been looking into the idea of "deep thinking" in AI, but it seems reserved for big models with huge compute budgets. I wanted to see if a different approach was possible: trading instantaneous computation for a slower burn. To explore this, I've been building an open-source research project called Network of Agents (NoA).

The goal is to turn a modest laptop (I'm developing on a 32GB RAM machine) into a "solution mining" rig. You can set up a hard problem, and using a local LLM (via Ollama and a quantized Qwen model like Qwen 30b a3b), let a society of agents work on it for hours or days, iteratively refining their collective answer.

The Core Idea: Backpropagation with Natural Language

The system is built with LangGraph and is inspired by neural networks. It runs in epochs, with each epoch consisting of a "Forward Pass" and a "Reflection Pass".

1. The Forward Pass (Inference):
• Instead of numerical weights, the network's "weights" are the natural language system prompts of its agents.
• The process starts by procedurally generating a multi-layered network of agents. The first layer gets cognitive diversity from MBTI archetypes and "seed verbs" related to the user's problem.
• Subsequent "hidden" layers are built by having an agent-analyst chain create a "hard request" designed to challenge the previous layer, then spawning a new agent specialized for that challenge.
• Information flows through the network layer by layer, with the combined JSON outputs of one layer being broadcast as input to all agents in the next.

2. The Reflection Pass (Learning): This is where I've tried to simulate backpropagation.
• Critique as the "Loss Function": After the final layer's outputs are synthesized into a single solution, a critique_agent assesses it against the original problem and generates a constructive critique.
• Propagating the "Gradient": This critique is the error signal. It's propagated backward through the network. An agent in layer N-1 receives a targeted critique based on its contribution to the final answer generated by layer N.
• The "Optimizer" Meta-Prompt: At each step of the backward pass, an update_agent_prompts_node uses the incoming critique as the main input to a meta-prompt. This meta-prompt's job is to completely rewrite and evolve the receiving agent's system prompt (its skills, attributes, and even its career) to better address the critique.
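
To make the epoch structure above concrete, here is a minimal Python sketch of one forward pass followed by one reflection pass. It is not taken from the NoA repository and does not use LangGraph: `Agent`, `forward_pass`, `reflection_pass`, `run_epochs`, and the `llm` callable are all hypothetical names, outputs are plain strings rather than JSON, and the per-agent "targeting" call is a simplifying assumption standing in for the layer-by-layer critique propagation described.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for a local model call (for example via Ollama):
# takes (system_prompt, user_message) and returns the reply text.
LLM = Callable[[str, str], str]


@dataclass
class Agent:
    system_prompt: str  # the natural-language "weight" of this agent


def forward_pass(layers: List[List[Agent]], problem: str, llm: LLM) -> List[List[str]]:
    """Run layers in order; each agent receives all outputs of the previous layer."""
    outputs_per_layer: List[List[str]] = []
    layer_input = problem
    for layer in layers:
        outputs = [llm(agent.system_prompt, layer_input) for agent in layer]
        outputs_per_layer.append(outputs)
        layer_input = "\n".join(outputs)  # combined outputs are broadcast onward
    return outputs_per_layer


def reflection_pass(layers: List[List[Agent]], outputs_per_layer: List[List[str]],
                    problem: str, llm: LLM) -> Tuple[str, str]:
    """Critique the synthesized solution, then rewrite prompts from last layer to first."""
    solution = llm("Synthesize these outputs into one solution.",
                   "\n".join(outputs_per_layer[-1]))
    critique = llm("Critique this solution against the problem.",
                   f"PROBLEM: {problem}\nSOLUTION: {solution}")
    for layer, outputs in zip(reversed(layers), reversed(outputs_per_layer)):
        for agent, output in zip(layer, outputs):
            # Assumption: one extra call narrows the critique to this agent's contribution.
            targeted = llm("Target this critique at the agent's contribution.",
                           f"CRITIQUE: {critique}\nCONTRIBUTION: {output}")
            # The "optimizer" step: a meta-prompt rewrites the agent's system prompt.
            agent.system_prompt = llm(
                "Rewrite this system prompt so the agent addresses the critique.",
                f"PROMPT: {agent.system_prompt}\nCRITIQUE: {targeted}")
    return solution, critique


def run_epochs(layers: List[List[Agent]], problem: str, llm: LLM, epochs: int = 3) -> str:
    """Repeat forward and reflection passes; return the last synthesized solution."""
    solution = ""
    for _ in range(epochs):
        outputs = forward_pass(layers, problem, llm)
        solution, _critique = reflection_pass(layers, outputs, problem, llm)
    return solution
```

Passing a plain function for `llm` keeps the loop testable without a model; swapping in a real client would only change that one callable.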

The entire network learns and adapts its own instructions, not through a central controller, but through a distributed process of peer-to-peer challenge.

The Long-Term Vision: A New Kind of Training Data

This is the part that I find most exciting. Every run of this system produces a complete, structured trace of a multi-agent collaborative process: the initial agent personas, the layer-by-layer reasoning (CoT traces), the critiques, and the evolution of each agent's prompts across epochs. This is a new kind of dataset that captures the dynamics of reasoning, not just static information.

My long-term, ambitious goal is to use this data to train a "World Language Model": a model trained not just on text, but on the fundamental patterns of collaboration, error correction, and social intelligence.

This is an early-stage research project. The code is available for anyone to run, and the immediate roadmap includes dynamic memory for small models, P2P networking for distributed mining, and better visualization.

I'd love to get this community's feedback. What do you think of this approach? Is the analogy to backpropagation sound? How would you improve the meta-prompts that drive the evolution?

Thanks for reading.