Claude Code · Testing

Claude Code writes tests that pass but don't test anything. Here's the prompt that fixes it.

Ask Claude Code to add tests and you'll often get something like this:

test('should process order', () => {
  const result = processOrder(mockOrder)
  expect(result).toBeDefined()
  expect(result).not.toBeNull()
})

That test is useless. It confirms processOrder doesn't throw and returns something. It won't catch a bug where the order total is calculated wrong, the status isn't set correctly, or the timestamps are missing.
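To see it concretely, here is a hypothetical processOrder with exactly that kind of bug. The implementation, the field names, and the tax handling are all illustrative, not taken from any real codebase:

```javascript
// Hypothetical implementation with a deliberate bug: taxRate is ignored.
function processOrder(order) {
  const subtotal = order.items.reduce((sum, i) => sum + i.price * i.qty, 0)
  return {
    total: subtotal,          // bug: should be subtotal * (1 + order.taxRate)
    status: 'pending',
    createdAt: Date.now(),
  }
}

const result = processOrder({ items: [{ price: 100, qty: 1 }], taxRate: 0.1 })

// The weak assertions still hold, so the test stays green:
console.log(result !== undefined)  // true
console.log(result !== null)       // true

// But the actual behavior is wrong:
console.log(result.total)          // 100 instead of 110
```

The defined-and-not-null checks are satisfied by any object at all, so the wrong total sails through.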

Claude generates these because "write tests for this function" is ambiguous. Defined-and-not-null is technically a valid response to that instruction.

The prompt that changes it

Add this to your CLAUDE.md:

When writing tests:
- Test specific return values, not just that something returned
- Every test must fail if the core behavior changes
- No expect(result).toBeDefined() or expect(result).not.toBeNull() as the only assertion
- For each test, write a comment explaining what bug it would catch

The last rule matters most. Forcing a comment about "what bug this catches" makes Claude reason about what the test is actually for before writing the assertion. The quality improves noticeably.

What good looks like

Same function, with that rule in CLAUDE.md:

// Catches: wrong total calculation (tax not included)
test('order total includes tax at configured rate', () => {
  const order = processOrder({ items: [{ price: 100, qty: 1 }], taxRate: 0.1 })
  expect(order.total).toBe(110)
})

// Catches: status not set on new orders
test('new orders start with pending status', () => {
  const order = processOrder({ items: [{ price: 50, qty: 2 }] })
  expect(order.status).toBe('pending')
})

// Catches: missing created timestamp
test('processed orders have a created timestamp', () => {
  const before = Date.now()
  const order = processOrder({ items: [{ price: 10, qty: 1 }] })
  expect(order.createdAt).toBeGreaterThanOrEqual(before)
  expect(order.createdAt).toBeLessThanOrEqual(Date.now())
})

Each test fails for a specific, named reason. That's the bar.

The empty-return check

Add this as a second rule:

After writing tests, review each one: if the implementation returned nothing,
would this test still pass? If yes, rewrite it.

A test that passes even when the function returns nothing isn't testing behavior. It's testing that the function runs.
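One way to apply that review mechanically is to rerun each test body against a stub that returns nothing: any test still green is testing execution, not behavior. A minimal sketch, where survivesEmptyReturn and the inline throw-based assertions are hypothetical illustrations, not part of any test framework:

```javascript
// Hypothetical review helper: run a test body against a stub that returns nothing.
// If the test still passes, it isn't asserting on behavior and should be rewritten.
const stub = () => undefined

function survivesEmptyReturn(testFn) {
  try { testFn(stub); return true }  // passed against the stub: rewrite it
  catch { return false }             // failed against the stub: it tests real behavior
}

// The weak not-null check lets undefined slip through...
const weakTest = (processOrder) => {
  const result = processOrder({ items: [] })
  if (result === null) throw new Error('result was null')
}

// ...while a concrete assertion on the return value does not.
const strongTest = (processOrder) => {
  const result = processOrder({ items: [{ price: 100, qty: 1 }], taxRate: 0.1 })
  if (result?.total !== 110) throw new Error('wrong total')
}

console.log(survivesEmptyReturn(weakTest))    // true: useless test
console.log(survivesEmptyReturn(strongTest))  // false: real test
```

Mutation-testing tools automate this same idea at scale, but the manual version costs nothing and catches the worst offenders.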

Why this matters more with agents

When Claude Code runs autonomously across multiple sessions, it writes tests as checkpoints. If those checkpoints don't verify actual behavior, they give you false confidence, and you only discover it when something breaks in production.

Two minutes in CLAUDE.md catches most of it.

About 25 prompts like these are in the Agent Prompt Playbook — instructions that change Claude's default behavior in specific ways. $29.