My Claude Code Review Checklist

This is the checklist I run through when reviewing Claude Code output. I built it from things I've actually caught — not from security advisories or best practice guides. Each item is there because I found it in real code.

The checklist

Does it do what I asked, specifically? Read the output against the spec, not against what looks reasonable. Claude will sometimes produce something slightly different that's still sensible. "Slightly different" can matter.

Are there any try-catch blocks that swallow errors silently? Empty catch blocks or catch blocks that only log "something went wrong." These hide bugs in production.

Are there any hardcoded values that should be configurable? Timeout values, retry counts, URL paths, page sizes. These usually need to come from config or environment variables.

Did it touch anything I didn't ask it to? Check the diff against the scope of the request. Claude sometimes refactors adjacent code it wasn't asked to change.

Are the type assertions justified? Any as SomeType cast. Usually Claude uses these when it couldn't resolve a type correctly. The cast makes the compiler happy but the underlying issue is still there.

Does the test cover the actual behavior, or just the happy path? Run through: what input would break this function? Does the test catch it?

Are all external inputs validated before use? User input, API responses, config values. Claude sometimes assumes the shape of incoming data matches what's expected.

Is there anything that would only fail at scale? N+1 queries, unbounded loops over potentially large datasets, synchronous operations on something that could grow.

Did it leave any TODO comments? Claude sometimes writes TODOs for things it decided not to implement. These need to be resolved, not committed.

Do the variable and function names describe what they actually do? Claude picks reasonable names but they're sometimes generic. Rename anything that would confuse someone reading it in six months.

How long this takes

For a small function: two minutes. For a larger feature: ten minutes. The checklist scales with the code size.

The most important item is the first one. Everything else catches specific categories of bugs. The first item catches the entire class of "Claude solved a different problem than I described."