1. Research
2. Specification
3. Blueprint
4. Roadmap

Research

Step 1 - Context

  • Description
  • User
  • Job to be done

Research

  • APIs
  • Code libraries

Specification

Step 2 - o4-mini

Ask me one question at a time so we can develop a thorough, step-by-step spec for an idea. Each question should build on my previous answers, and our end goal is to have a detailed specification I can hand off to a developer. Let’s do this iteratively and dig into every relevant detail.

Remember, only one question at a time.

Here’s the idea:

IDEA

Step 3

Now that we’ve wrapped up the brainstorming process, can you compile our findings into a comprehensive, developer-ready specification? Include all relevant requirements, architecture choices, data handling details, error handling strategies, and a testing plan so a developer can immediately begin implementation.

Blueprint

Step 4 - Claude 3.7

Draft a detailed, step-by-step blueprint for building this project. Then, once you have a solid plan, break it down into small, iterative chunks that build on each other. Look at these chunks and then go another round to break it into small steps. Review the results and make sure that the steps are small enough to be implemented safely with strong testing, but big enough to move the project forward. Iterate until you feel that the steps are right sized for this project.

From here you should have the foundation to provide a series of prompts for a code-generation LLM that will implement each step in a test-driven manner. Prioritize best practices, incremental progress, and early testing, ensuring no big jumps in complexity at any stage. Make sure that each prompt builds on the previous prompts, and ends with wiring things together. There should be no hanging or orphaned code that isn’t integrated into a previous step.

Make sure to separate each prompt section and use prompting best practices. Use markdown.

Each prompt should be tagged as text using code tags. The goal is to output prompts, but context is important as well.

Make sure to write out all the prompts, shorten them if needed to ensure we have every prompt needed to build this entire project. Each prompt should stand alone and not reference other prompts.

Roadmap

Step 5

Can you make a `todo.md` that I can use as a checklist? Be thorough.
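The resulting checklist might look something like this (contents purely illustrative — the actual items come from your blueprint):

```markdown
# todo.md

## Setup
- [ ] Initialize repository and CI pipeline
- [ ] Scaffold project structure

## Implementation
- [ ] Step 1: Core data model + tests
- [ ] Step 2: API layer + tests
- [ ] Step 3: Wire components together

## Verification
- [ ] All tests pass
- [ ] Manual smoke test
```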

Efficient use of GitHub Copilot agent (Pro)

| Action / Mode | Base Model (×0 cost) | When to Escalate (Signal) | Escalation Model (Cost) |
| --- | --- | --- | --- |
| 1. Assign story / issue to Copilot | GPT-4o / GPT-4.1 / GPT-5 mini | Misunderstands scope, wrong architecture, misses constraints | Gemini 2.5 Pro (×1) or Claude Sonnet 4.5 (×1) |
| 2. Resolve CI/CD workflow failures | GPT-5 mini / GPT-4.1 | Log analysis is shallow, wrong root cause, circular fixes | Claude Sonnet 4.5 (×1) or Gemini 2.5 Pro (×1) |
| 3. Interactive debugging (MCP, local) | GPT-4o / GPT-5 mini | Fails to track state, can’t reason across functions/files | Claude Sonnet 4.5 (×1) or GPT-5 (×1) |
| 4. Product planning & backlog management | GPT-4o / GPT-4.1 | Output is generic, poor prioritization, weak structure | Gemini 2.5 Pro (×1) or Claude Sonnet 4.5 (×1) |
| 5. Test suite development | GPT-4o / Grok Code Fast (×0.25) | Misses edge cases, bad mocks, shallow coverage | GPT-5 mini (×0) or Claude Sonnet 4.5 (×1) |
| 6. Large refactor / multi-file changes | GPT-5 mini / GPT-4.1 | Loses global coherence, breaks invariants, wrong decomposition | GPT-5 (×1) or Claude Opus 4.5 (×3) |
| 7. Architecture / system design | GPT-4o / GPT-5 mini | Lacks tradeoff analysis, no dependency reasoning | Gemini 2.5 Pro (×1) or Claude Opus 4.5 (×3) |
| 8. Repetitive boilerplate / scaffolding | Grok Code Fast (×0.25) | Needs domain reasoning or cross-module awareness | GPT-4o (×0) or GPT-5 mini (×0) |
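The escalate-on-signal logic above can be sketched as a simple lookup. This is purely illustrative — the model names and signals come from the table, but the function, data structure, and task keys are invented for the example:

```python
# Illustrative sketch of the escalation ladder: start on a 0x model,
# escalate only when a failure signal from the table is observed.
# Task keys, helper name, and structure are made up for this example.

LADDER = {
    "ci_failure":  {"base": "GPT-5 mini",     "escalate": "Claude Sonnet 4.5"},
    "debugging":   {"base": "GPT-5 mini",     "escalate": "Claude Sonnet 4.5"},
    "refactor":    {"base": "GPT-5 mini",     "escalate": "GPT-5"},
    "scaffolding": {"base": "Grok Code Fast", "escalate": "GPT-4o"},
}

# Generic failure signals (the "When to Escalate" column, condensed).
ESCALATION_SIGNALS = {"loses state", "loops", "misses global structure"}

def pick_model(task: str, observed_signals: set) -> str:
    """Return the 0x base model unless a known failure signal was seen."""
    entry = LADDER[task]
    if observed_signals & ESCALATION_SIGNALS:
        return entry["escalate"]
    return entry["base"]
```

The point of encoding it this way is that escalation is a reaction to observed failure, never the default.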

Below is a task-by-task, zero-multiplier playbook:

  • which 0-cost model to default to

  • why it’s the best fit

  • and how to prompt it so it behaves like a specialist instead of a chatbot.

🧭 Zero-Multiplier Model Map + Prompting Tactics

| Action / Mode | Best 0× Model | Why This One Wins | How to Prompt It (Pattern) |
| --- | --- | --- | --- |
| 1. Assign story / issue to Copilot | GPT-4.1 | Best balance of instruction-following + code gen | “You are implementing exactly this story. Restate requirements, list files to change, then implement. Do not invent scope.” |
| 2. Resolve CI/CD workflow failures | GPT-5 mini | Better multi-step reasoning over logs/config | “Given this error and this workflow, enumerate 3 likely causes, rank them, propose minimal fix for top one.” |
| 3. Interactive debugging (MCP, local) | GPT-5 mini | Tracks state & logic better across functions | “Simulate execution path. At each step state variable values. Find first divergence from expected behavior.” |
| 4. Product planning & backlog management | GPT-4o | Best at structuring, summarizing, clustering ideas | “Given this context, produce: (1) Epics, (2) Prioritized stories, (3) Acceptance criteria, (4) Dependencies.” |
| 5. Test suite development | GPT-4.1 | Strong at code patterns + test idioms | “Given this function, list edge cases first, then generate table-driven tests covering them.” |
| 6. Large refactor / multi-file changes | GPT-5 mini | Better coherence across multiple components | “First propose a refactor plan with steps and invariants. Wait. Then apply step 1 only.” |
| 7. Architecture / system design | GPT-4o | Best at synthesis and narrative structure | “Design 3 options. For each: tradeoffs, risks, migration path. Then recommend one.” |
| 8. Boilerplate / scaffolding | GPT-4.1 | Fast, consistent, low hallucination | “Generate only the skeleton. No business logic. Follow this existing style.” |

🎯 Why These Pairings Work (Mental Models)

  • GPT-4.1 = “Senior implementer”
    Great at: code, tests, scaffolding, following instructions precisely.
  • GPT-4o = “Product + architecture synthesizer”
    Great at: structuring messy thoughts, plans, backlogs, options, narratives.
  • GPT-5 mini = “Junior architect / debugger”
    Great at: step-by-step reasoning, tracing logic, cross-file understanding.

🧠 Prompting Patterns That 2–3× Their Effectiveness

  1. Force a Two-Phase Response

Phase 1: Analyze and plan. Phase 2: Execute. Do not mix.

This dramatically improves coherence, especially for GPT-5 mini.

  2. Always Ask for Structure Before Code

Bad: “Fix this.” Good: “First explain what is wrong and propose a minimal fix. Then implement.”

  3. Use Role + Constraints

You are a cautious senior engineer. Constraints:

  • Do not change public APIs
  • Do not add dependencies
  • Do not refactor unrelated code

This reduces “helpful but expensive” rewrites.

  4. Use Invariants for Refactors

Invariants:

  • All tests must still pass
  • Behavior X must remain unchanged
  • Only touch these files

This keeps free models from going wild.

  5. Use Edge-Case-First for Tests

First list all edge cases. Then generate tests. Then show coverage gaps.

  6. Use Ranking & Elimination for Debugging

List 5 hypotheses. Eliminate 3. Deep dive on the top 2.

Propose a minimal experiment.

This turns GPT-5 mini into a systematic debugger.
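These patterns compose mechanically, so they are easy to template. A minimal sketch — the helper name is invented, and the template wording is lifted from the Role + Constraints and Two-Phase patterns above:

```python
# Illustrative prompt builder combining the "Role + Constraints" and
# "Two-Phase Response" patterns. Function name and exact wording are
# assumptions for this sketch, not a real API.

def build_prompt(role: str, constraints: list, task: str) -> str:
    """Compose a constrained, two-phase prompt from reusable parts."""
    constraint_lines = "\n".join(f"- {c}" for c in constraints)
    return (
        f"You are {role}. Constraints:\n"
        f"{constraint_lines}\n\n"
        "Phase 1: Analyze and plan. Phase 2: Execute. Do not mix.\n\n"
        f"Task: {task}"
    )

prompt = build_prompt(
    role="a cautious senior engineer",
    constraints=["Do not change public APIs", "Do not add dependencies"],
    task="Fix the failing CI workflow with a minimal change.",
)
```

Templating the patterns once and reusing them keeps every request to a 0× model in the shape those models handle best.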

🪜 The Escalation Ladder (Internalized)

For any task:

  1. Start with the best 0× model for that task.
  2. Watch for failure signals: it loses state, loops, or misses global structure.
  3. Only when those signals appear, escalate to Claude Sonnet / Gemini / GPT-5.

🧠 The Core Idea

Premium models are not for “better answers”.

They are for “bigger cognitive workspaces”. If you structure the problem well, the 0× models can handle 80–90% of your workload.