
Duplicate Code: What It Costs and How to Fix It

Identical or near-identical code blocks that appear in multiple locations, requiring synchronized maintenance.

Annual Cost: $3.6k - $30k
Severity: 4/5
Category: Dispensables
Detection: 4 tools

What It Is

Duplicate Code is among the most commonly detected smells and one of the most costly. It occurs when the same logic exists in two or more places. The duplication may be exact (copy-paste) or structural (the same algorithm with different variable names). When a bug is found in duplicated code, it must be fixed in every copy; when a requirement changes, every copy must be updated. The cost scales with the number of duplicates and the frequency of changes.

Threshold: Any block of 6+ lines duplicated in 2+ locations warrants extraction. Structural duplication (same algorithm, different names) is harder to detect but equally costly.
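To make the threshold concrete, here is a minimal sketch of exact-duplicate detection using a sliding window of 6 non-blank lines. It is an illustration only, not how any of the tools listed on this page work internally. The function names and the choice to ignore whitespace differences are assumptions. It does not catch structural duplication, because renamed variables produce different lines.

```python
from collections import defaultdict

MIN_BLOCK_LINES = 6  # the 6-line threshold quoted above


def normalize(line: str) -> str:
    """Collapse whitespace so re-indented copies still match."""
    return " ".join(line.split())


def find_duplicate_blocks(files: dict[str, str], window: int = MIN_BLOCK_LINES):
    """Return {block: [(file, first_line), ...]} for blocks seen 2+ times.

    `files` maps a file name to its source text. Blank lines are skipped,
    and line numbers are 1-based positions in the original text.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        lines = [
            (number, normalize(raw))
            for number, raw in enumerate(text.splitlines(), start=1)
            if raw.strip()
        ]
        # Slide a fixed-size window over the non-blank lines.
        for start in range(len(lines) - window + 1):
            block = tuple(content for _, content in lines[start:start + window])
            seen[block].append((name, lines[start][0]))
    return {block: places for block, places in seen.items() if len(places) >= 2}
```

A longer duplicated region shows up as several overlapping 6-line hits, so a real tool would merge adjacent matches before reporting them.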

Why It Costs Money

1. Bug fixes must be applied to every copy. When a developer fixes a bug in one copy and misses another, the bug persists in production. In codebases with extensive duplication, teams report that 20-30% of 'new' bugs are actually unfixed duplicates of previously resolved issues.

2. Requirements changes multiply effort linearly. If a validation rule exists in 5 copies and the rule changes, the developer must find and update all 5. This is not 5x the typing; it is 5x the testing, 5x the code review, and 5x the risk of inconsistency.

3. Codebase size inflates artificially. Duplicated code increases the surface area that must be read, understood, and maintained. A 100,000-line codebase with 20% duplication has 20,000 lines that exist only to create maintenance burden.
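The first point is easiest to see in a small example. The sketch below uses two hypothetical validation functions that started as copies of each other. One was later patched to reject an empty local part and the other was missed, so the two now disagree on the same input.

```python
def is_valid_signup_email(email: str) -> bool:
    # Copy 1: patched to reject an empty local part such as "@example.com".
    local, sep, domain = email.partition("@")
    return bool(local) and sep == "@" and "." in domain


def is_valid_invite_email(email: str) -> bool:
    # Copy 2: pasted before the fix and never updated, so the bug survives.
    _, sep, domain = email.partition("@")
    return sep == "@" and "." in domain
```

Both functions accept "a@example.com", but only the unpatched copy still accepts "@example.com". That is the kind of 'new' bug that is really an old one.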

Specific Cost Mechanisms

  • Unfixed-duplicate bugs: each missed copy costs 4-8 hours to diagnose when the 'fixed' bug reappears
  • Change multiplication: each change to duplicated logic requires N updates, N test runs, N review cycles
  • Reading burden: developers waste time reading code they have already seen, wondering if the copies are identical or subtly different
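As a rough worked example of the first mechanism, the sketch below turns the 4-8 hour range into a cost range. It uses the $120,000 average salary this page bases its estimates on. The 2,080 working hours per year (52 weeks x 40 hours) is an assumption made here to get an hourly rate. This is not the page's methodology, only a back-of-the-envelope illustration with hypothetical names.

```python
SALARY = 120_000             # average developer salary used on this page
WORK_HOURS_PER_YEAR = 2_080  # assumption: 52 weeks x 40 hours


def missed_copy_cost(missed_copies: int,
                     hours_low: float = 4,
                     hours_high: float = 8) -> tuple[float, float]:
    """Return the (low, high) cost of diagnosing bugs left in missed copies."""
    hourly_rate = SALARY / WORK_HOURS_PER_YEAR
    low = missed_copies * hours_low * hourly_rate
    high = missed_copies * hours_high * hourly_rate
    return low, high
```

Under these assumptions, ten missed copies in a year cost roughly $2,300 to $4,600 in diagnosis time alone, before counting the change-multiplication and reading costs.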

Estimated Annual Cost

Cost per instance by team size and codebase size. Based on $120,000 average developer salary.

Team Size   Small (<50k LOC)   Medium (50k-200k LOC)   Large (200k+ LOC)
3 devs      $3,600             $9,000                  $18,000
5 devs      $6,000             $15,000                 $30,000
10 devs     $9,000             $22,500                 $30,000
20 devs     $12,000            $30,000                 $30,000

How to Detect It

Specific rules and thresholds for automated detection.

  • SonarQube (common-java:DuplicatedBlocks): detects blocks of 10+ duplicated lines (configurable)
  • CodeClimate (identical-code / similar-code): detects both exact and structural duplication
  • PMD (CPD, the Copy/Paste Detector): language-agnostic duplicate block detection
  • jscpd (configurable thresholds): standalone tool supporting 150+ languages

Refactoring Patterns

Proven techniques to eliminate this smell.

Extract Method

Use when: Duplicated blocks appear within the same class

Effort: 15-45 minutes per extraction
Impact: Eliminates the duplication entirely
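A minimal sketch of Extract Method, using a hypothetical Invoice class. Assume the subtotal loop was previously pasted into both total() and summary(). After extraction it lives in one private method that both call.

```python
class Invoice:
    def __init__(self, items: list[tuple[float, int]], tax_rate: float):
        self.items = items        # (unit price, quantity) pairs
        self.tax_rate = tax_rate

    def _subtotal(self) -> float:
        # Extracted method: the single remaining copy of the shared logic.
        return sum(price * quantity for price, quantity in self.items)

    def total(self) -> float:
        return self._subtotal() * (1 + self.tax_rate)

    def summary(self) -> str:
        return f"{len(self.items)} items, subtotal {self._subtotal():.2f}"
```

A later change to how the subtotal is computed, such as rounding per line, now happens in exactly one place.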

Extract Class/Module

Use when: Duplicated logic appears across different classes

Effort: 2-4 hours
Impact: Centralises the logic in one place
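A minimal sketch of Extract Class, with hypothetical names throughout. Assume Customer and Supplier each carried their own pasted copy of postal-code clean-up and validation. The shared logic moves into a small PostalCode class that both now use. The validation rule itself is invented for illustration.

```python
class PostalCode:
    """Single home for logic that was duplicated in Customer and Supplier."""

    def __init__(self, raw: str):
        # Normalise once: strip spaces and upper-case.
        self.value = raw.replace(" ", "").upper()

    def is_valid(self) -> bool:
        # Illustrative rule only: 5 to 7 alphanumeric characters.
        return 5 <= len(self.value) <= 7 and self.value.isalnum()


class Customer:
    def __init__(self, name: str, postal_code: str):
        self.name = name
        self.postal_code = PostalCode(postal_code)


class Supplier:
    def __init__(self, company: str, postal_code: str):
        self.company = company
        self.postal_code = PostalCode(postal_code)
```

The same idea applies at module level in languages or codebases that favour functions over classes: the shared function moves to one module that both callers import.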

Form Template Method

Use when: Subclasses contain similar algorithms with slight variations

Effort: 3-6 hours
Impact: Captures the common algorithm structure while allowing variation
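A minimal sketch of Form Template Method, again with hypothetical names. Assume two exporter classes each contained the same "header, then one line per row, then join" algorithm and differed only in formatting. The shared steps move into a base-class method, and the subclasses supply only the parts that vary.

```python
from abc import ABC, abstractmethod


class TableExporter(ABC):
    def export(self, columns: list[str], rows: list[list]) -> str:
        """Template method: the shared algorithm is written once here."""
        lines = [self.header(columns)]
        lines.extend(self.row(values) for values in rows)
        return "\n".join(lines)

    @abstractmethod
    def header(self, columns: list[str]) -> str: ...

    @abstractmethod
    def row(self, values: list) -> str: ...


class CsvExporter(TableExporter):
    def header(self, columns):
        return ",".join(columns)

    def row(self, values):
        return ",".join(str(v) for v in values)


class MarkdownExporter(TableExporter):
    def header(self, columns):
        titles = "| " + " | ".join(columns) + " |"
        rule = "| " + " | ".join("---" for _ in columns) + " |"
        return titles + "\n" + rule

    def row(self, values):
        return "| " + " | ".join(str(v) for v in values) + " |"
```

Adding a third format means writing two short methods rather than pasting the whole export loop again.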

Related Smells