If user stories are the most-skipped agile artefact, acceptance criteria are the most-skipped part of user stories. Most teams write the title and description, then ship without explicit criteria for what “done” means. Then they’re surprised when QA pushes back, or when a stakeholder demos something the engineer didn’t think they needed.
This is the practical guide. Five acceptance-criteria formats, when each one fits, and the structural reasons most teams’ criteria fail.
What acceptance criteria actually do
A user story says what and why. Acceptance criteria say how we know it’s done.
Without them, “done” gets negotiated post-hoc — usually after the work, often after the sprint. With them, the negotiation happens before anyone writes code.
The signal that your team is missing acceptance criteria:
- Tickets get rejected in review for missing scope nobody discussed.
- “But I didn’t realise we needed that” is said in retro.
- QA finds “bugs” that are actually scope gaps.
- Stakeholders ask “is it done?” and the engineer can’t answer cleanly.
Three out of those four are scope-clarity problems, not engineering problems.
Format 1: The bullet list
Plain-language bullets. The most common format, the easiest to write, the easiest to read.
Story: User can reset password via email
Acceptance criteria:
- User enters email on the reset page
- System sends a reset email if the email matches an account
- System shows the same message (“check your email”) whether the email matches or not (no enumeration)
- Reset link expires after 24 hours
- Reset link works exactly once
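The last two bullets (24-hour expiry, exactly-once use) are the kind of criteria an engineer can translate directly into logic. A minimal sketch, with hypothetical names and an in-memory store standing in for a real database:

```python
import secrets
import time

# In-memory token store; a real system would persist this.
_tokens = {}  # token -> {"email": str, "created": float, "used": bool}

TOKEN_TTL = 24 * 60 * 60  # 24 hours, per the criterion above


def issue_reset_token(email):
    token = secrets.token_urlsafe(32)
    _tokens[token] = {"email": email, "created": time.time(), "used": False}
    return token


def redeem_reset_token(token):
    """Return the email if the token is valid, else None.
    Enforces both criteria: 24-hour expiry and exactly-once use."""
    entry = _tokens.get(token)
    if entry is None or entry["used"]:
        return None
    if time.time() - entry["created"] > TOKEN_TTL:
        return None
    entry["used"] = True  # consume it: a second redemption fails
    return entry["email"]
```

Notice the criteria said nothing about token format or storage; those are implementation choices the criteria deliberately leave open.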
Use it when: the criteria are independent and the story is small. Most stories fit here.
Skip it when: the behaviour depends on context (input X gives output Y), or the criteria interact with each other.
The trap: writers fall into “the system shall…” spec language. Don’t. Acceptance criteria are pre-conversation, not a contract document.
Format 2: Given / When / Then (Gherkin)
Borrowed from BDD frameworks (Cucumber, etc.). Forces explicit scenario thinking.
Scenario: Email matches an existing account
Given a user with email “[email protected]” exists
When the user requests a password reset for “[email protected]”
Then a reset email is sent to that address
And the system shows “Check your email”

Scenario: Email does not match any account
Given no user has email “[email protected]”
When someone requests a password reset for “[email protected]”
Then no email is sent
And the system shows “Check your email” (same message)

Scenario: Reset link is reused
Given a valid reset link was clicked once successfully
When the same link is clicked again
Then the user sees “This link has already been used”
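One reason Gherkin earns its ceremony here: each scenario maps one-to-one onto a test. A sketch of the first two scenarios as plain test functions, with a hypothetical in-memory service standing in for the real system:

```python
class ResetService:
    """Hypothetical system under test, with an in-memory user store."""

    def __init__(self, known_emails):
        self.known = set(known_emails)
        self.sent = []  # emails actually dispatched

    def request_reset(self, email):
        # Same outward message either way: no account enumeration.
        if email in self.known:
            self.sent.append(email)
        return "Check your email"


def test_email_matches_existing_account():
    svc = ResetService(["[email protected]"])            # Given
    message = svc.request_reset("[email protected]")  # When
    assert svc.sent == ["[email protected]"]          # Then: reset email sent
    assert message == "Check your email"            # And: standard message


def test_email_does_not_match_any_account():
    svc = ResetService([])                          # Given
    message = svc.request_reset("[email protected]")  # When
    assert svc.sent == []                           # Then: no email sent
    assert message == "Check your email"            # And: same message
```

The Given/When/Then structure survives the translation, which is the point: the criteria and the tests stay in lockstep.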
Use it when: behaviour varies by input or state. The format makes branching explicit.
Skip it when: the story is one straightforward thing (logging in with valid credentials). Gherkin adds ceremony without adding clarity for trivial cases.
The trap: most teams overuse Gherkin. They Gherkin-format a list of facts that aren’t actually scenarios. Three Givens, no Whens, no Thens — that’s just a bullet list pretending.
Format 3: Examples table
When the story is “compute the right value for these N inputs”, a table beats prose.
Story: Pricing for annual vs monthly billing
| Plan | Seats | Term | Expected total |
| --- | --- | --- | --- |
| Pro | 1 | Monthly | £5 |
| Pro | 5 | Monthly | £25 |
| Pro | 1 | Annual | £50 (£60 - £10 discount) |
| Pro | 5 | Annual | £250 (£300 - £50 discount) |
| Pro | 0 | Either | Reject — invalid |
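A table like this converts almost mechanically into parametrised assertions. A sketch, assuming a hypothetical `total_price` function that implements the rule implied by the rows (£5 per seat per month, annual prepay is 12 months minus a £10-per-seat discount):

```python
# Hypothetical pricing rule matching the table above.
MONTHLY_PER_SEAT = 5
ANNUAL_DISCOUNT_PER_SEAT = 10


def total_price(plan, seats, term):
    if plan != "Pro":
        raise ValueError("unknown plan")
    if seats < 1:
        raise ValueError("invalid seat count")  # the table's edge row
    if term == "Monthly":
        return seats * MONTHLY_PER_SEAT
    if term == "Annual":
        return seats * (12 * MONTHLY_PER_SEAT - ANNUAL_DISCOUNT_PER_SEAT)
    raise ValueError("unknown term")


# Each table row becomes one assertion.
CASES = [
    ("Pro", 1, "Monthly", 5),
    ("Pro", 5, "Monthly", 25),
    ("Pro", 1, "Annual", 50),
    ("Pro", 5, "Annual", 250),
]
for plan, seats, term, expected in CASES:
    assert total_price(plan, seats, term) == expected
```

If a row and the code disagree, the table wins: the rows are the agreed criteria, the function is just one implementation of them.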
Use it when: the work is a lookup, calculation, or transformation with multiple input cases. Pricing, tax, permissions matrices, validation rules.
Skip it when: behaviour isn’t reducible to a row.
The trap: writers list happy-path rows and skip the boundaries. The interesting cases are the empty input, the maximum, the negative number, the invalid currency. If your table has 4 happy rows and zero edge rows, the criteria are incomplete.
Format 4: Acceptance test list
Each criterion is the title of a test that will exist after the work.
Story: Sprint capacity respects holidays
Tests to pass:
- capacity_calculator_excludes_holidays
- capacity_calculator_handles_partial_day_holidays
- capacity_calculator_uses_team_timezone_for_holiday_resolution
- capacity_calculator_returns_zero_when_all_days_are_holidays
- capacity_calculator_warns_when_more_than_50_percent_of_sprint_is_holiday
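To show how a test-name criterion turns into an actual test, here is a minimal sketch covering just the first and fourth names on the list (the calculator shape and names are hypothetical; full-day holidays only):

```python
from datetime import date, timedelta


def sprint_capacity(start, end, hours_per_day, holidays):
    """Working hours from start to end inclusive, excluding weekends
    and any date in `holidays`. Full-day holidays only in this sketch."""
    total = 0
    day = start
    while day <= end:
        if day.weekday() < 5 and day not in holidays:
            total += hours_per_day
        day += timedelta(days=1)
    return total


# capacity_calculator_excludes_holidays
# Mon 3 June to Fri 7 June 2024, one holiday: 4 working days x 8h.
assert sprint_capacity(date(2024, 6, 3), date(2024, 6, 7), 8,
                       holidays={date(2024, 6, 5)}) == 32

# capacity_calculator_returns_zero_when_all_days_are_holidays
week = {date(2024, 6, 3) + timedelta(days=i) for i in range(5)}
assert sprint_capacity(date(2024, 6, 3), date(2024, 6, 7), 8, week) == 0
```

The other three names on the list (partial days, timezones, the 50% warning) would each get the same treatment; the criterion and the test share a name, so "done" is unambiguous.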
Use it when: the team writes tests-first, or when “done” is provably automated. Particularly useful for backend stories with no UI surface.
Skip it when: the work is exploratory, design-heavy, or has no testable assertions yet.
The trap: not every story has a meaningful automated test. Forcing this format on a UI-polish story is theatre.
Format 5: Definition of done + story-specific delta
For mature teams: instead of repeating the same criteria on every story (has tests, code reviewed, docs updated), the team’s Definition of Done covers them. The story only lists what’s specific to this story.
Story-specific acceptance criteria:
- Velocity calculator handles 0-velocity sprints without dividing by zero
- Sprint forecaster uses median, not mean, when velocity has wide variance
- Both calculators show the underlying numbers, not just the result
(Standard DoD applies — tests, review, deploy, docs, accessibility check.)
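The first two story-specific bullets are concrete enough to sketch. Assuming a hypothetical `forecast_sprints` helper, both the divide-by-zero guard and the median-over-mean choice fit in a few lines:

```python
import math
from statistics import mean, median

# Hypothetical velocity history with one outlier (say, a half-holiday sprint).
velocities = [21, 23, 22, 4, 24]

# The mean gets dragged down by the outlier; the median stays representative.
print(mean(velocities))    # 18.8
print(median(velocities))  # 22


def forecast_sprints(remaining_points, velocities):
    """Sprints left to burn down `remaining_points`, per the criteria above:
    median rather than mean, and no divide-by-zero on a 0-velocity history."""
    v = median(velocities)
    if v == 0:
        return None  # nothing to forecast from yet; surface it, don't crash
    return math.ceil(remaining_points / v)
```

Returning None for the no-data case is one possible design; the criterion only demands that the calculator not divide by zero, not how the gap is reported.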
Use it when: the team has a working DoD. (See definition-of-done-engineering and the DoD generator tool.)
Skip it when: the team’s DoD is aspirational rather than actually enforced. Don’t reference a DoD that doesn’t gate merges.
The trap: stories with only “see DoD” and no story-specific criteria. That means the work isn’t actually defined yet — it’s just “do the thing”.
How to pick the format
Most stories don’t need a format conversation. Most stories need bullets. Pick a non-bullet format only when bullets won’t carry the load:
- Branching behaviour → Gherkin
- Lookup or calculation → Table
- Backend-heavy / test-first → Test list
- Mature team with stable DoD → DoD + delta
Mixing formats inside one story is fine. A story can have three bullets and a small examples table.
The five anti-patterns
Most criteria failures aren’t about format. They’re structural.
1. Restating the title
Story: User can log in
Acceptance criteria:
- User can log in
Zero added information. The criterion has to define what counts as logging in — credentials accepted, session set, redirect happens — or it’s noise.
2. Implementation steps disguised as criteria
Acceptance criteria:
- Add a `password_resets` table
- Add a `PasswordResetsController`
- Add a `PasswordResetMailer`
These are tasks, not criteria. Criteria describe outcomes; tasks describe work. If the team picks a different implementation tomorrow (a third-party auth provider), the criteria should still hold. Implementation tasks belong in a separate checklist or the engineer’s todo, not in acceptance criteria.
3. Untestable criteria
- System should be fast
- UI should look professional
- Code should be clean
If you can’t agree whether it’s met, it’s not a criterion. “Page loads in under 200ms at the 95th percentile” is testable. “Should be fast” is wishful thinking.
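The testable version can literally be a function. A sketch of the nearest-rank percentile check, with hypothetical sample data; note how one bad outlier does not fail a p95 criterion the way it would fail a "maximum latency" criterion:

```python
import math


def p95(samples_ms):
    """Nearest-rank 95th percentile: the smallest value that
    covers at least 95% of the samples."""
    ordered = sorted(samples_ms)
    index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return ordered[index]


# 20 page-load samples; one bad outlier should not fail the criterion.
latencies_ms = [100, 105, 110, 115, 120, 125, 130, 135, 140, 145,
                150, 155, 160, 165, 170, 175, 180, 185, 190, 480]

# "Page loads in under 200ms at the 95th percentile": checkable, yes or no.
assert p95(latencies_ms) < 200
```

"Should be fast" cannot be written as that assert statement; that is the whole test for whether a criterion is a criterion.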
4. Hidden criteria
The engineer ships, the reviewer says “but it should also handle X”, and X was never written down. If X was important, it was a criterion. If X wasn’t important enough to write down, it doesn’t get to block merge.
The fix: when a reviewer asks for X, either (a) reject the request because the criteria don’t say X, or (b) accept X and add it to the criteria, so the reviewer of the next story knows the bar moved. Don’t let the bar drift silently.
5. Criteria added after the work started
Engineer, two days into the work: “Wait, can we add some acceptance criteria?”
If criteria appear after work begins, they’re observations of what was already built, not a contract for what to build. They’re either retroactively confirming work that’s already done (fine, but admit it) or scope-creeping the work mid-flight (not fine — defer it to a new story).
The structural fix: refuse to start work without criteria. Refinement is where they get written. (See backlog refinement runbook.)
The five-minute rule
If acceptance criteria take more than five minutes to write for a normal-sized story, the story is too big or unclear. The criteria-writing process is a quality check on the story itself.
If you keep needing 30 minutes:
- The story is an epic in disguise (split it)
- The team doesn’t understand the requirement (more refinement, less typing)
- The criteria are being used to litigate scope with a difficult stakeholder (a process problem, not a writing problem)
A small, clear story should generate clear criteria fast. If it doesn’t, the problem is upstream.
How acceptance criteria interact with the rest of the system
- Story estimation: points should reflect the criteria as written. If criteria expand mid-sprint, the points were wrong, not the team. (See story point estimation.)
- Definition of Done: team-wide gates that apply to every story. Criteria are story-specific. The two don’t overlap; they stack.
- Sprint goal: criteria define when one story is done. The sprint goal defines when the sprint delivered something coherent. A sprint with 100% criteria-met but no sprint-goal progress had bad sprint planning, not bad criteria.
- Retrospectives: if the same kind of criterion gets missed sprint after sprint (UX polish, accessibility, error states), it belongs in the team’s DoD, not on every story.
The minimum viable acceptance criteria
If your team currently has none, start here:
- Every story has at least one bullet that says specifically what “working” looks like.
- Every story has at least one bullet about an edge case (empty input, error state, the obvious-but-skipped case).
- Every story has at least one bullet about how the user knows it succeeded.
Three bullets. Five minutes. That catches most of the “I didn’t know we needed that” complaints within the first two sprints.
After two months, escalate the format only where the simple bullets aren’t carrying the load.
Bottom line
Acceptance criteria aren’t compliance. They’re how the team and the requester agree what “done” means before the work starts, so the work doesn’t get redefined after it’s shipped.
Don’t write more than the story needs. Don’t write less than the story needs. The five-minute rule and the three-bullet minimum will catch most teams up.
When you graduate beyond that, format the criteria around the kind of story it is, not around what your training course taught.