These aren't the obvious mistakes. They're the ones that look fine in review, pass a quick smoke test, and then fail subtly in production when a task gets complex enough or an edge case hits.
Words like "prefer," "try to," "when possible," and "ideally" turn a rule into a suggestion. The model treats suggestions as optional, which is what they are.
The pattern to watch for: any instruction where a person could reasonably say "I tried but it was complicated." That's preference language. If you mean it as a rule, write it as one: no conditionals, no hedges.
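One way to catch this pattern mechanically is a lint-style pass over your prompts before they ship. A minimal sketch in Python; the hedge list and the two prompt strings are illustrative examples, not an exhaustive inventory:

```python
import re

# Hedge phrases that turn a rule into a suggestion (illustrative, not exhaustive).
HEDGES = [
    r"\bprefer\b",
    r"\btry to\b",
    r"\bwhen possible\b",
    r"\bideally\b",
    r"\bif you can\b",
]

def find_hedges(prompt: str) -> list[str]:
    """Return every hedge pattern that appears in a system prompt."""
    return [h for h in HEDGES if re.search(h, prompt, re.IGNORECASE)]

rule = "Edit existing files. Do not create new files."
suggestion = "Prefer to edit existing files when possible."

find_hedges(rule)        # → []
find_hedges(suggestion)  # flags "prefer" and "when possible"
```

A check like this won't judge whether a hedge is intentional, but it forces the question: is this line a rule or a preference? If it's a rule, rewrite it until the linter passes.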
When you don't specify output format, you get whatever the model thinks is appropriate. That might match what you want most of the time. But "most of the time" isn't a contract, and when it breaks you're parsing something unexpected.
This matters most when you're parsing the output programmatically. But it also helps with manual review: when you know what format to expect, you notice immediately when something's wrong.
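To make "contract" concrete, here is one hypothetical shape of that check: the prompt tells the agent to return a JSON object with exactly the keys `status` and `summary` (both names invented for this example), and the parser rejects anything else instead of limping along:

```python
import json

def parse_agent_output(raw: str) -> dict:
    """Parse agent output against the contract: a JSON object with
    exactly the keys 'status' and 'summary'. Fail loudly otherwise."""
    data = json.loads(raw)  # raises ValueError on non-JSON output
    expected = {"status", "summary"}
    if set(data) != expected:
        raise ValueError(f"unexpected keys: {sorted(set(data) ^ expected)}")
    return data

parse_agent_output('{"status": "ok", "summary": "renamed 3 files"}')
```

The point isn't this particular schema; it's that a declared format lets a dozen lines of parsing code tell you the moment the model drifts from it.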
Most system prompts describe the happy path. What happens when the agent can't find the file? When the API returns an error? When the task is ambiguous and the agent doesn't know how to proceed?
Without explicit error handling instructions, agents fill in the gap themselves. Sometimes that means they quietly skip the failing step. Sometimes they hallucinate what the result should have been. Sometimes they ask a clarifying question in the middle of a batch job where nobody's watching.
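One way to close that gap is a dedicated error-handling section in the prompt itself. A hypothetical example for a file-processing agent, held in a Python string so it can be checked in tests; the wording is illustrative, not canonical:

```python
# A hypothetical error-handling section for a file-processing agent's
# system prompt. Each rule names a failure mode and says what to do.
ERROR_HANDLING = """\
If a file cannot be found, report the exact path you tried and stop.
If an API call fails, report the status code and error body and stop.
If the task is ambiguous, state the ambiguity and list the options; do not guess.
Do not continue past an error condition.
"""
```

The final line is the catch-all: it covers the failures the specific rules didn't anticipate.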
The last line matters as much as the error handling itself. "Do not continue past an error condition" tells the agent what to do when it doesn't know what to do: stop. Without that, it keeps going.
They all look reasonable. "Prefer to edit existing files" sounds like sensible guidance. "Identify architectural components" sounds clear. "Read the config file" seems complete. None of them trigger the reviewer's instinct that something is wrong, because they're not wrong in any obvious way.
The problem is that writing a system prompt isn't like writing documentation. Documentation describes intent. A system prompt is more like a specification — it needs to handle the cases where intent isn't obvious, not just the normal path. The gap between "describes the happy path clearly" and "handles everything the model might encounter" is where these failures live.
The Agent Prompt Playbook has 25 system prompts written for production — each one annotated with the reasoning behind the specific wording choices. $29 with the LAUNCH code. Get it on Payhip.