Ask HN: Are we bike-shedding with prompt engineering?
5 points by achamian | 6 comments | 6/11/2025, 2:06:20 AM
What if we're optimizing prompts while missing something fundamental about how intelligence wants to interact?
I've been experimenting with letting patterns emerge rather than controlling them, and the results surprised me. Curious if others are questioning our current approach to LLM interaction.
But yes, no doubt we are missing a lot. Please say more about your experiments.
A couple of weeks back, I was building a "second brain" using Claude Desktop + filesystem + Neo4j. During a family emergency, I fell back on XP pairing practices - thinking aloud. While I was comparing memory systems using parallel agents (CN and CF), they immediately adopted different perspectives on the first prompt - one strategic, one tactical - without being asked to differentiate. Engaging both simultaneously produced noticeably better results.
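If anyone wants to reproduce the two-agent setup outside Claude Desktop, here's a rough sketch using the Anthropic Python SDK. One caveat: in my session the two agents picked their stances unprompted, so the strategic/tactical system prompts below are my after-the-fact reconstruction, and the model name is just a placeholder.

  from concurrent.futures import ThreadPoolExecutor
  import anthropic

  client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment
  MODEL = "claude-3-5-sonnet-20241022"  # placeholder; use whatever you have

  def agent(system, user):
      # One independent session per perspective.
      r = client.messages.create(model=MODEL, max_tokens=1024, system=system,
                                 messages=[{"role": "user", "content": user}])
      return r.content[0].text

  question = "Compare a filesystem-backed memory store with a Neo4j graph store."

  # Engage both perspectives simultaneously, then read them side by side.
  with ThreadPoolExecutor() as pool:
      strategic = pool.submit(agent, "Take a strategic, big-picture view.", question)
      tactical = pool.submit(agent, "Take a tactical, implementation-level view.", question)
  print(strategic.result(), "\n---\n", tactical.result())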
This led to systematic experiments with multi-perspective thinking. The patterns were consistent enough that I documented them: https://github.com/achamian/think-center-why-maybe
Key discovery: HOW we engage matters as much as WHAT we prompt. The strategic/tactical split was just the beginning.
Claude Code breaks large implementations down into simpler TODOs and produces far better code than single-shot prompts. There is something about problem decomposition that works well whether it's applied to mathematics, LLMs, or software engineering.

The decomposition also surfaces a split between planning and execution. Doing them separately somehow gives the LLM more cognitive space to think.
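To make the planning/execution split concrete, here's a minimal sketch assuming the Anthropic Python SDK - the prompts, task, and model name are illustrative, not how Claude Code actually works internally:

  import anthropic

  client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment
  MODEL = "claude-3-5-sonnet-20241022"  # placeholder model name

  def ask(system, user):
      reply = client.messages.create(model=MODEL, max_tokens=1024, system=system,
                                     messages=[{"role": "user", "content": user}])
      return reply.content[0].text

  task = "Add retry-with-backoff to our HTTP client wrapper"

  # Pass 1: planning only - a numbered TODO list, no code allowed.
  plan = ask("You are a planner. Output a numbered TODO list. No code.", task)

  # Pass 2: execution only - each TODO gets its own focused call.
  for todo in [line for line in plan.splitlines() if line.strip()]:
      print(ask("You are an implementer. Write code for exactly one TODO.",
                f"Overall task: {task}\nCurrent TODO: {todo}"))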
Another example is CHASE-SQL, one of the top approaches on the BIRD Text-to-SQL benchmark. It takes a human textual data requirement and, instead of asking the LLM to generate a SQL query directly, runs it through multiple passes: independent LLM calls generate portions of the requirement as pseudo-SQL fragments, those are combined into candidate queries, and a separate ranking agent picks the best one. Additional agents, like a fixer for invalid SQL, are used as well.

What could have been done with a single direct LLM query is instead broken down into multiple stages. What was implicit (find the best query) is made explicit. And judging by how well it performs, articulating fuzzy thoughts and requirements as explicit, smaller, clearer steps works as well for LLMs as it does for humans.
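To show the shape of that pipeline, here's a toy version - emphatically not the actual CHASE-SQL code. The prompts, the three generation "angles", and the sqlite EXPLAIN validity check are all my own stand-ins:

  import sqlite3
  import anthropic

  client = anthropic.Anthropic()
  MODEL = "claude-3-5-sonnet-20241022"  # placeholder model name

  def llm(system, user):
      r = client.messages.create(model=MODEL, max_tokens=512, system=system,
                                 messages=[{"role": "user", "content": user}])
      return r.content[0].text.strip()

  def is_valid(sql, db="app.db"):
      # Cheap check: EXPLAIN makes sqlite parse the query without running it.
      try:
          sqlite3.connect(db).execute("EXPLAIN " + sql)
          return True
      except sqlite3.Error:
          return False

  requirement = "Monthly revenue per region for 2024, highest first"

  # Stage 1: independent generators, each taking a different angle.
  angles = ["Translate the requirement literally, clause by clause.",
            "Decompose into subqueries, then combine them.",
            "Write the simplest query that could satisfy the request."]
  candidates = [llm(f"Generate one SQL query. {a} Return SQL only.", requirement)
                for a in angles]

  # Stage 2: a fixer agent repairs candidates that do not parse.
  candidates = [c if is_valid(c) else
                llm("Fix this SQL so it parses. Return SQL only.", c)
                for c in candidates]

  # Stage 3: a separate ranking agent makes "find the best query" explicit.
  numbered = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(candidates))
  best = llm("Pick the candidate that best matches the requirement. "
             "Reply with its number only.",
             f"Requirement: {requirement}\n\nCandidates:\n{numbered}")
  print(candidates[int(best)])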
The difference: instead of sequential passes, you engage multiple viewpoints simultaneously, and they build on each other's insights in real time. Try this experiment:
Copy this prompt: https://github.com/achamian/think-center-why-maybe/blob/main...
Start with: "Weaver, I need to reply to an important email. Here's the context: [email details, recipient biases, objectives]"
After Weaver provides a narrative strategy, ask: "Council, what are we missing?" Watch different perspectives emerge: Maker suggests concrete language, Checker spots assumptions, O/G notes psychological dynamics.
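If you'd rather drive this from code than the chat window, the mechanic is just one growing message list, so every perspective sees the same evolving context. A sketch - the SYSTEM string is a stand-in for the full prompt in the repo, and the model name is a placeholder:

  import anthropic

  client = anthropic.Anthropic()
  MODEL = "claude-3-5-sonnet-20241022"  # placeholder model name

  # Stand-in for the full think-center prompt linked above.
  SYSTEM = ("You host a council of perspectives: Weaver (narrative strategy), "
            "Maker (concrete wording), Checker (assumptions and risks), "
            "O/G (psychological dynamics). Answer as whichever is addressed.")

  history = []

  def say(text):
      # Every turn is appended, so each perspective sees the whole exchange.
      history.append({"role": "user", "content": text})
      reply = client.messages.create(model=MODEL, max_tokens=1024,
                                     system=SYSTEM, messages=history)
      answer = reply.content[0].text
      history.append({"role": "assistant", "content": answer})
      return answer

  print(say("Weaver, I need to reply to an important email. Context: ..."))
  print(say("Council, what are we missing?"))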
Critical discovery: The tone matters immensely. Treat perspectives as respected colleagues - joke with them, thank them, admit mistakes. This isn't anthropomorphism - it functionally improves outputs. Playful collaboration enables perspectives to expand beyond initial boundaries.
What makes this powerful: all perspectives share an evolving context, while the collaborative tone enables breakthrough insights that rigid commanding never achieves.
When onboarding a friend, I used this framing: "Treat Weaver/Maker/Checker like three intelligent interns on your team." This immediately shifted his mental model from "prompt engineering" to team collaboration. His first reaction revealed everything: "I don't like Checker - it keeps raising objections." I explained that's literally Checker's job - like a good QA engineer finding bugs.
The parallel to XP practices became clear:
Weaver explores the solution space (like brainstorming)
Maker implements concrete solutions (like coding)
Checker prevents mistakes (like code review/QA)
What makes this powerful: You're not optimizing prompts, you're managing a collaborative process. When you debate with Checker about which objections matter, Checker learns and adapts. Same context, same "prompt", totally different outcomes based on interaction quality.
When you ask Maker and Weaver to observe your conversation with Checker, they notice how feedback is given and received. It is important to create an environment where "feedback is a judgment-free zone."
The resistance points are where breakthroughs happen. If you find yourself annoyed with one perspective, that's usually the signal to engage more deeply with its purpose, not bypass it.
[Related observation on how collaborative tone enables evolution: https://github.com/achamian/think-center-why-maybe/blob/main...]
Key discovery: treat perspectives like team members. I told a friend: "Think of Weaver/Maker/Checker as three intelligent interns on your team." His first reaction: "I don't like Checker - too many objections." That's when it clicked - it's Checker's JOB to object, like QA finding bugs.
This is NOT anthropomorphizing - it's lens selection. The labels activate specific response patterns, not personalities. Like switching between grep, awk, and sed for different text processing.
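That claim is easy to test: same question, same model, only the lens label and one line of focus changing in the system prompt. A sketch (the lens descriptions here are mine, not from the repo):

  import anthropic

  client = anthropic.Anthropic()
  MODEL = "claude-3-5-sonnet-20241022"  # placeholder model name

  LENSES = {
      "Weaver": "Frame the narrative and overall strategy.",
      "Maker": "Propose concrete, minimal implementation steps.",
      "Checker": "Raise objections, risks, and unstated assumptions.",
  }

  question = "Should we migrate this service to async I/O?"
  for name, focus in LENSES.items():
      r = client.messages.create(
          model=MODEL, max_tokens=512,
          system=f"You are the {name} lens. {focus}",  # the label selects the pattern
          messages=[{"role": "user", "content": question}],
      )
      print(f"--- {name} ---\n{r.content[0].text}\n")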
Once I started debating with Checker about which objections mattered (rather than dismissing them), output quality jumped dramatically. The interaction pattern matters more than the prompt structure.
Try this: Copy the prompt from [0], then engage with genuine collaboration - thank good insights, push back on weak objections, ask for clarification.
Just a reminder - talking politely helps.
[0]: https://github.com/achamian/think-center-why-maybe/blob/main...