Cost models are illustrative. Numbers come from academic and industry research; methodology documented below. Not investment or engineering advice; your mileage will vary.
Last verified May 2026 · 11 min read

Methodology: sources, math, and limitations

Data verified May 2026

Every dollar figure on CodeSmellCost.com traces to a named source. This page is the source list, the per-component formula, the limitations, and the refresh cadence. If a number on the site does not check out against this methodology, email [email protected] and we will correct it.

The methodology rests on three layers: the canonical software-engineering references (Fowler, Martin, Beck, Feathers, McConnell, Ousterhout) for vocabulary and pattern catalogue; peer-reviewed empirical research (Bavota, Bird, Rahman, Khomh) for the smell-to-defect correlation evidence; and named industry research (Stripe, DORA, Pluralsight, GitClear) for the productivity-cost translation that makes the dollar figures defensible to a finance team.

§ 01
Sources
Source | Type | Refresh | What we take
Fowler, Refactoring (2nd ed., 2018, Addison-Wesley) | Canonical reference | Static (book) | The 22-smell taxonomy, the refactoring catalog, and the prose definitions. Every per-smell page on the site cites the relevant chapter or section reference.
Robert C. Martin, Clean Code (2008, Prentice Hall) | Canonical reference | Static (book) | Naming discipline, function-length guidance, and the Single Responsibility Principle. Cited alongside Fowler on the home page and /the-22-smells, with the explicit caveat that some Clean Code guidance is contested by Ousterhout.
Kent Beck, Test-Driven Development: By Example (2002, Addison-Wesley) | Canonical reference | Static (book) | TDD as a precondition for cheap refactoring. Cited in /refactoring-roi as the answer to the CFO question "how do you refactor safely without breaking everything?".
Michael Feathers, Working Effectively with Legacy Code (2004, Prentice Hall) | Canonical reference | Static (book) | Seams, characterisation tests, and the legacy-code refactoring playbook. Cited in /case-studies and /refactoring-roi for the practical "how do you start refactoring a smell-dense codebase?" question.
Steve McConnell, Code Complete (2nd ed., 2004, Microsoft Press) | Canonical reference | Static (book) | Empirical defect-density data on function length, complexity, and naming. Cited on /pr-review-time and /cost-model for the cognitive-complexity to defect-rate translation.
John Ousterhout, A Philosophy of Software Design (2018, Yaknyam Press) | Counterpoint reference | Static (book) | The structural counterpoint to Clean Code's small-function gospel: deep modules over shallow ones. Cited prominently on /duplicate-code (the DRY counterpoint) and /when-smells-are-ok.
Bavota et al., ICSE 2015, "Are Code Smells Harmful?" | Peer-reviewed research | Static (published) | Empirical evidence that smells emerge under schedule pressure and immediately increase fault-proneness. Cited on /bug-rate-correlations as one of the three primary defect-density references.
Bird et al., FSE 2011, "Don't Touch My Code!" | Peer-reviewed research | Static (published) | Fragmented ownership doubles or triples defect rates. Cited on /bug-rate-correlations and /cost-model for the incident-cost-attribution component.
Rahman 2025 meta-study | Peer-reviewed research | Static (published) | Effect sizes across 28 primary studies: God Class rho = 0.38, Feature Envy rho = 0.31, Duplicate Code rho = 0.27. Cited prominently across all per-smell pages and /bug-rate-correlations.
Khomh et al. 2012, "An Exploratory Study of the Impact of Code Smells on Software Change-proneness" | Peer-reviewed research | Static (published) | Change-proneness correlation with smell density, especially Shotgun Surgery and Divergent Change. Cited on /the-22-smells and the per-smell change-preventer pages.
Stripe Developer Coefficient Report (2018, updated 2023) | Industry research | Periodic | The macro number: $85 billion per year lost globally to technical debt and bad code (16 percent of developer time). Used as the upper-bound sanity check for our per-team cost ranges; our team-of-8 estimates summed across smells fall within Stripe's 15-25 percent range.
DORA State of DevOps 2024 | Industry research | Annual | The performance-tier benchmarks (elite, high, medium, low) and the 10-25 percent rework band for low performers vs under 5 percent for elites. Cited on the home page card grid, /refactoring-roi, /pr-review-time, and /cost-model.
Pluralsight State of Developer Onboarding 2024 | Industry research | Annual | Time-to-first-PR data: 2-4 weeks on clean codebases vs 6-12 weeks on smell-dense ones. The 4-8 week gap, at fully-loaded cost, drives the onboarding component of our cost model. Cited on /onboarding-drag and /cost-model.
Cisco / SmartBear, Best Kept Secrets of Peer Code Review (2007) | Industry research | Static (published) | Reviewer effectiveness drops after 60 minutes and 400 LOC. Cited on /pr-review-time and /calculator for the PR review drag formula.
GitClear State of the Code 2024 | Industry research | Annual | AI-assistant copy-paste-rate research. Cited on /duplicate-code with the explicit caveat that the AI-assistant-era duplication-rate observation is still being validated.
SonarSource, SonarQube documentation | Vendor reference | Continuous | The Code Smell taxonomy reference (Sonar's internal categorisation), the Cognitive Complexity definition (Campbell, SonarSource white paper, 2018), and the rule set most teams run in production. Cited on /tools and per-smell detection sections.
Software Engineering Institute (CMU SEI), Technical Debt research program | Academic / institutional | Periodic | The standard taxonomy of technical-debt items and the cost-of-deferral framing. Cited on /refactoring-roi and /cost-model.
IEEE SWEBOK (Software Engineering Body of Knowledge, v3) | Academic / institutional | Periodic | The discipline-level definitions of software maintainability, quality attributes, and measurement. Cited on /cost-model and /methodology.
Empirical Software Engineering journal (Springer) | Peer-reviewed venue | Continuous | Source venue for Khomh 2012 and several Bavota / Palomba follow-up studies. Treated as the gold-standard publication venue alongside ICSE / FSE / ASE.
BLS Occupational Employment and Wage Statistics (Software Developers) | Public statistics | Annual | Anchor for fully-loaded cost defaults: US median software-engineer base wage approximately $130K; fully-loaded cost (benefits, equipment, overhead) approximately $180K-$200K. Used as the calculator default.
§ 02
In Scope
§ 03
Out of Scope
§ 04
Calculation Framework

Team-of-8 baseline

All per-smell dollar bands on the site assume a team of 8 engineers. Scale linearly for other team sizes: a team of 12 multiplies the figure by 1.5x, a team of 4 by 0.5x. The team-of-8 baseline matches the median engineering-team size in DORA, Pluralsight, and GitClear survey populations.
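The linear scaling rule reduces to a one-line function. A minimal sketch; the function and constant names are ours, not the site's:

```typescript
// All published bands assume an 8-engineer team; scale linearly from there.
const BASELINE_TEAM = 8;

function scaleToTeam(teamOf8Cost: number, teamSize: number): number {
  return teamOf8Cost * (teamSize / BASELINE_TEAM);
}

// A $100K band scales to $150K for a team of 12 and $50K for a team of 4.
const teamOf12 = scaleToTeam(100000, 12); // 150000
const teamOf4 = scaleToTeam(100000, 4);   // 50000
```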

Fully-loaded cost

US anchor: $180K-$200K per engineer fully loaded (base + benefits + equipment + overhead). Derived from BLS OEWS median wage ($130K) plus a 1.4-1.5x loaded multiplier consistent with Stripe and DORA survey populations. Calculator overrideable; per-smell badges use $180K.
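The arithmetic behind the anchor, as a sketch; the loaded multiplier is the only free parameter:

```typescript
// Fully-loaded cost = BLS OEWS median base wage x loaded multiplier.
const BLS_MEDIAN_BASE = 130000;

function fullyLoadedCost(baseWage: number, loadedMultiplier: number): number {
  return baseWage * loadedMultiplier;
}

// 1.4x gives $182K, 1.5x gives $195K: the $180K-$200K anchor band.
const lowAnchor = fullyLoadedCost(BLS_MEDIAN_BASE, 1.4);
const highAnchor = fullyLoadedCost(BLS_MEDIAN_BASE, 1.5);
```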

PR review drag

Smell-dense code is reviewed approximately 2.3x slower (CodeScene Code Health research). Formula: team_size x prs_per_week x review_hours x smell_overhead_percent x 52 x hourly_rate. Smell overhead percent defaults to 30 percent (Cisco / SmartBear 2007 reviewer-effectiveness research bounds it at 20-40 percent).
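The formula can be sketched directly. All names are illustrative, and `prsPerWeek` is assumed to be per engineer per week, which the page does not pin down:

```typescript
interface ReviewDragInputs {
  teamSize: number;
  prsPerWeek: number;       // PRs reviewed per engineer per week (assumption)
  reviewHours: number;      // hours spent reviewing each PR
  smellOverheadPct: number; // 0.20-0.40 band; default 0.30
  hourlyRate: number;       // fully-loaded hourly rate
}

// team_size x prs_per_week x review_hours x smell_overhead_percent x 52 x hourly_rate
function annualReviewDrag(i: ReviewDragInputs): number {
  return i.teamSize * i.prsPerWeek * i.reviewHours * i.smellOverheadPct * 52 * i.hourlyRate;
}

// $180K anchor over a 2080-hour year: ~$86.54/hour.
const hourly = 180000 / (52 * 40);
const drag = annualReviewDrag({
  teamSize: 8, prsPerWeek: 3, reviewHours: 1, smellOverheadPct: 0.3, hourlyRate: hourly,
}); // ~$32.4K/year for this input set
```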

Defect-density translation

Rahman 2025 rho values map to expected-value incident cost. A God Class correlation of rho=0.38 means smell-dense modules carry approximately 1.5-3x the defect rate of clean ones (consistent with the Bird 2011 ownership effect). Multiply by base incident frequency, average incident hours, and responding-engineer count to get the annual expected-value cost contribution.
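A sketch of the expected-value translation. Whether to charge a smell for all incidents in smell-dense modules or only the excess over the clean-code baseline is a modelling choice the text leaves open; this sketch, with names of our own choosing, charges only the excess:

```typescript
// Annual expected-value incident cost attributable to a smell.
// defectMultiplier sits in the 1.5-3x band implied by the rho values.
function annualIncidentCost(
  baseIncidentsPerYear: number, // incident frequency on clean modules
  defectMultiplier: number,     // 1.5-3x for smell-dense modules
  avgIncidentHours: number,
  respondingEngineers: number,
  hourlyRate: number,
): number {
  const excessIncidents = baseIncidentsPerYear * (defectMultiplier - 1);
  return excessIncidents * avgIncidentHours * respondingEngineers * hourlyRate;
}

// 10 baseline incidents/year, 2x multiplier, 4-hour incidents, 3 responders.
const cost = annualIncidentCost(10, 2, 4, 3, 180000 / 2080); // ~$10.4K/year
```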

Onboarding drag

Pluralsight 2024: 2-4 weeks to first PR on clean, 6-12 weeks on smell-dense. Use the 4-8 week delta. Cost: hires_per_year x extra_ramp_weeks x (fully_loaded_cost / 52). For a 3-hire-per-year team at the US anchor, onboarding drag alone is approximately $42K-$84K per year before any other smell cost.
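The onboarding formula as code, with the worked example from the $180K anchor reproduced:

```typescript
// hires_per_year x extra_ramp_weeks x (fully_loaded_cost / 52)
function annualOnboardingDrag(
  hiresPerYear: number,
  extraRampWeeks: number,  // the 4-8 week delta from Pluralsight 2024
  fullyLoadedCost: number,
): number {
  return hiresPerYear * extraRampWeeks * (fullyLoadedCost / 52);
}

// 3 hires/year at $180K: 4 extra weeks -> ~$41.5K, 8 extra weeks -> ~$83K.
const lowEnd = annualOnboardingDrag(3, 4, 180000);
const highEnd = annualOnboardingDrag(3, 8, 180000);
```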

Severity multiplier

Three bands: cosmetic (0.6x), structural (1.0x baseline), critical (2.5x). Calibrated against the Mäntylä smell taxonomy. God Class is the canonical critical smell; Lazy Class is the canonical cosmetic one; the bulk of the catalog sits in the structural band. See /cost-model for the full multiplier table.
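The band table as a lookup, applied to a structural-baseline cost. The dollar figure in the example is illustrative, not from the site:

```typescript
// Severity multipliers applied to the structural (1.0x) baseline cost.
const SEVERITY_MULTIPLIER = {
  cosmetic: 0.6,
  structural: 1.0,
  critical: 2.5,
} as const;

type Severity = keyof typeof SEVERITY_MULTIPLIER;

function applySeverity(structuralBaseline: number, severity: Severity): number {
  return structuralBaseline * SEVERITY_MULTIPLIER[severity];
}

// A critical smell (e.g. God Class) at a hypothetical $20K structural
// baseline scores $50K; a cosmetic one (e.g. Lazy Class) scores $12K.
const godClassBand = applySeverity(20000, "critical");
const lazyClassBand = applySeverity(20000, "cosmetic");
```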

§ 05
Refresh Cadence

Numbers, source URLs, and tool pricing get a first-business-week pass each month. LAST_VERIFIED_DATE in src/lib/schema.ts rolls forward when the pass finishes. The layout footer date, every per-page Article schema dateModified, the badge at the top of /about and /methodology, and every lastVerified page-shell prop all read from that single constant. One change rolls the entire site's freshness signal forward.
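The single-constant pattern can be sketched like this. This is a hypothetical shape, not the real src/lib/schema.ts, and the date value is a placeholder:

```typescript
// One constant drives every freshness signal on the site.
const LAST_VERIFIED_DATE = "2026-05-01"; // placeholder value

// Per-page Article schema derives dateModified from the constant.
const articleSchema = (slug: string) => ({
  "@type": "Article",
  url: `https://codesmellcost.com/${slug}`,
  dateModified: LAST_VERIFIED_DATE,
});

// The layout footer derives its year from the same constant, so one
// edit rolls the whole site's freshness signal forward.
const footerYear = new Date(LAST_VERIFIED_DATE).getFullYear();
```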

Out-of-cycle refresh triggers:

§ 06
Limitations
§ 07
Corrections

Found a number that does not check out, a citation that does not resolve, or a smell missing from the catalog? Email [email protected]. Corrections turn around within 5 business days. The corrected version ships with the next monthly pass and rolls LAST_VERIFIED_DATE forward.
