Can AI coding agents do Test-Driven Development (TDD)? Great Pods

Key Points

  • Test-driven development (TDD) cycles through three phases: red (write a failing test), green (write the minimal code to make it pass), and refactor (clean up the code)
  • AI coding agents do not follow TDD by default; they tend to prefer a big-bang approach over short iterative cycles
  • Standard prompting techniques to encourage TDD give mixed results, and agents often give up after a few attempts
  • TDD Guard is a library that enforces TDD principles through guardrails that cannot be bypassed
  • The experiment used the eShopOnWeb reference application, a .NET application built around domain-driven design concepts
  • The test feature was "splitting a basket" - separating expensive items (>$100) from cheap items into two baskets
  • Without TDD Guard: Claude Code completed the feature in 4 minutes but used "test first" development, not true TDD
  • With TDD Guard: The process took 10 minutes but enforced proper TDD methodology by blocking violations
  • TDD Guard works by using hooks that trigger before file writes, calling an internal AI judge to validate TDD compliance
  • The library uses its own LLM instance as a judge to determine if TDD principles are being followed
  • TDD Guard supports multiple languages, including Python, TypeScript, Go, and .NET
  • The tool requires hook support in the AI coding assistant and language-specific test framework integration
  • TDD Guard successfully caught violations like writing multiple tests at once instead of one at a time
  • The library enforced an outside-in development approach, flagging bottom-up implementation attempts
  • Code quality with TDD Guard was somewhat higher because it stopped the agent from inventing unnecessary methods
  • Trade-offs include increased token consumption and slower execution due to AI judge validation
  • TDD Guard maintains state through shared folders containing file updates and test outcomes
  • The library is presented as the best available tool for enforcing TDD with AI coding agents, despite its performance costs
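
The red and green phases can be illustrated on the basket-splitting feature. This is a minimal Python sketch, not the eShopOnWeb code (which is .NET): `Basket`, `BasketItem`, `split`, and the `threshold` parameter are hypothetical names chosen for illustration. In a TDD cycle the test at the bottom would be written first and seen to fail before `split` exists.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BasketItem:
    # Hypothetical item type; the real domain model is not shown in the source.
    name: str
    unit_price: float


@dataclass
class Basket:
    items: list[BasketItem] = field(default_factory=list)

    def split(self, threshold: float = 100.0) -> tuple[Basket, Basket]:
        """Green phase: the minimal code that makes the test below pass.

        Items priced strictly above the threshold go to the first basket,
        everything else goes to the second.
        """
        expensive = Basket([i for i in self.items if i.unit_price > threshold])
        cheap = Basket([i for i in self.items if i.unit_price <= threshold])
        return expensive, cheap


def test_split_separates_expensive_from_cheap_items() -> None:
    # Red phase: this single test is written before Basket.split exists.
    basket = Basket([BasketItem("mug", 12.0), BasketItem("jacket", 150.0)])

    expensive, cheap = basket.split()

    assert [i.name for i in expensive.items] == ["jacket"]
    assert [i.name for i in cheap.items] == ["mug"]
```

The next cycle would add exactly one more failing test (for example, an item priced at exactly $100) before touching the implementation again.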

Full Transcript