Prompting Coding Agents

Techniques that get better results from CLI AI coding tools.

1. Agents vs Chatbots

Agent prompting differs fundamentally from chatbot prompting. A chatbot generates text in response to open-ended questions. An agent dynamically directs its own process: it reads files, runs commands, edits code, and decides what to do next based on the results. Each turn produces one concrete action, not a wall of suggestions.

Chatbot prompts reward exploration and open-endedness. Agent prompts require precision: name the specific files, state the constraints, define measurable success criteria. Vague instructions send agents into loops; precise ones let them execute autonomously.
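
The difference shows up directly in the prompt text. For example (the file and type names here are illustrative):

Vague:   "Improve the error handling."
Precise: "In src/api/client.ts, wrap the three fetch calls in
         try/catch, log failures with the existing logger, and
         return a typed ApiError. Done when `npm test` passes
         with no new lint warnings."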

Provide environmental context your agent cannot infer. The OS, shell, working directory, language version, and test framework all shape what commands the agent should run. Most CLI agents pick this up automatically, but confirming it in your prompt eliminates guesswork.
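
One or two lines at the top of the prompt are enough; the details below are just an example of the kind of context worth stating:

Environment: macOS 14, zsh, repo root ~/work/api-server,
Node 20, TypeScript strict mode, tests run with `npm test` (Jest).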

Characteristic   | Chatbot Prompt           | Agent Prompt
Goal type        | Open-ended exploration   | Specific, measurable outcome
Context          | Conversational history   | Files, env, working directory
Output           | Text response            | Code changes + tool actions
Iteration        | User steers each turn    | Agent self-directs between turns
Failure mode     | Wrong or vague answer    | Infinite loop or silent corruption
Verification     | User reads and judges    | Tests, linters, build output

2. The Four-Phase Workflow

The most reliable pattern for agent-driven development is Explore-Plan-Implement-Commit. Each phase has a distinct purpose, and skipping one tends to produce the exact failure mode it prevents.

Explore (read code): understand first → Plan (Ctrl+G): design approach → Implement (normal mode): write code → Verify + Commit: tests pass, then ship

Explore

Start every session by asking the agent to read the relevant code. Point it at directories, entry points, or config files. The agent builds a mental model of the codebase before touching anything. Without this step, the agent writes code that compiles but doesn't fit the existing architecture.
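
An explore prompt can be a single instruction to read and summarize, with an explicit guard against editing (paths here are illustrative):

Read src/payments/ and the entry point in src/index.ts.
Summarize how a charge flows from the route handler to the
provider client. Do not change anything yet.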

Plan

Switch to Plan Mode (Ctrl+G in Claude Code) for research and architecture. In plan mode the agent thinks and reads but does not write files. Use this phase to agree on the approach before any code is generated. Ask the agent to outline the files it will change, the functions it will add, and the tests it will write.
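
A plan-mode prompt might look like this (the feature is hypothetical):

Plan the retry logic for failed webhook deliveries.
List the files you will change, the functions you will add with
their signatures, the tests you will write, and anything you are
unsure about. Wait for my approval before implementing.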

Implement

Switch back to Normal Mode and let the agent execute the plan. Because the plan exists, the agent has a clear scope and is less likely to wander into unrelated refactors.
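
Once the plan is approved, the implement prompt can simply point back at it:

Implement steps 1 through 3 of the approved plan. Stay within
the files listed in the plan; if you need to touch anything
else, stop and ask first.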

Verify and Commit

Run the test suite, check linter output, review the diff. Only commit when all checks pass. Skip planning only when the change is trivially scoped: a one-line fix, a config value update, a typo correction.
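
In an npm project the closing prompt might read like this (the exact commands depend on your stack):

Run `npm test` and the linter. If both pass, show me the full
diff, then commit with a one-line message describing the change.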

Why this works: The four-phase pattern prevents "infinite exploration" (where the agent keeps reading without acting) and "kitchen sink sessions" (where a single conversation accumulates unrelated changes until context degrades).

3. Stepwise Prompting

Complex tasks should be decomposed into sequential steps, each approved before the next begins. The anti-pattern is the mega-prompt: "Generate a complete Node.js app with auth, a React front-end, and deployment scripts." That prompt guarantees hallucinated dependencies, skipped error handling, and an architecture nobody asked for.

The fix is straightforward: one step at a time, wait for approval. First prompt: "Set up the project structure and install dependencies." Review. Second prompt: "Add the auth module using Passport with JWT." Review. Each step is small enough to verify and cheap enough to discard.
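
Laid out as a session, the sequence reads like this (the third step is added for illustration):

Step 1: "Set up the project structure and install dependencies."
        Review the result, approve.
Step 2: "Add the auth module using Passport with JWT."
        Review the result, approve.
Step 3: "Add login and registration routes, with tests."
        Review the result, approve.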

Goal enumeration

For refactoring tasks, number your objectives explicitly. The agent can track progress against the list and you can confirm each one independently.

Refactor the data-fetching layer. Objectives:
(1) Eliminate duplicate fetch calls across components
(2) Fetch user and settings data in parallel
(3) Preserve error specificity — each endpoint's errors
    must surface distinct messages, not a generic fallback

Rubber duck prompting

For debugging, ask the agent to walk through the code line-by-line, tracking variable values at each step. This forces the agent to simulate execution rather than pattern-match against common bugs. The bug often becomes obvious during the walkthrough without the agent needing to guess.
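
A rubber-duck prompt for a hypothetical pagination bug might read:

Walk through parsePagination() in src/utils/pagination.ts line
by line for the input { page: 0, perPage: 25 }. At each line,
state the value of every local variable. Do not propose a fix
until the walkthrough is complete.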

4. Verification Is the Highest-Leverage Technique

If you adopt only one technique from this guide, make it this: give the agent a way to verify its own work. Provide tests, screenshots of expected output, a curl command that should return 200, a grep pattern the build log must contain. Without verification criteria, agents write plausible-looking code that passes agent-written tests but breaks in production.

The gap between "looks right" and "is right" is where agent-produced bugs hide. An agent can write a function and a test for that function in the same turn. If the agent misunderstands the requirement, the test encodes the same misunderstanding. The test passes. The feature is wrong.

Warning: Agent-produced tests may be technically valid but miss critical edge cases. Always provide your own test criteria — expected inputs, outputs, and boundary conditions the agent must satisfy.

Effective verification prompts look like this:

After making changes, verify:
- `npm test` passes with 0 failures
- The /api/users endpoint returns 401 without a token
- The migration runs cleanly on a fresh database
- No TypeScript errors in strict mode

5. Common Failure Patterns

Recognizing failure patterns early saves hours. These are the most common ways agent sessions go wrong, along with the fix for each.

Pattern                  | Symptom                                             | Fix
Kitchen-sink session     | One conversation accumulates unrelated changes      | Clear context between tasks (/clear)
Correction spiral        | Correcting the agent makes it worse each turn       | Clear after 2 failed corrections and re-prompt
Over-specified CLAUDE.md | Agent follows stale or contradictory rules          | Prune rules or convert to hooks
Trust-then-verify gap    | Agent says "done" but nothing was tested            | Always include verification steps in the prompt
Infinite exploration     | Agent keeps reading files without writing anything  | Scope narrowly or delegate to subagents
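
For the correction spiral in particular, recovery is usually a clear followed by a fresh, tighter prompt (the bug and file here are hypothetical):

/clear
The last attempt broke token refresh. Start over: read
src/auth/refresh.ts, fix the expiry comparison, and do not
touch the session middleware. Done when `npm test` passes.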

Recursive Self-Improvement Prompting (RSIP)

When quality matters, tell the agent to iterate on its own output: generate, evaluate against criteria, improve, then repeat. Two to three cycles are usually enough. Diminishing returns set in fast, so cap the iterations explicitly in your prompt.

Write the function, then review it for edge cases.
Fix any issues you find. Repeat once more, then stop.