The Coding Standards That Separate Confident Teams from Anxious Ones

Series: The Modern SDLC · Post 5 of 17 ← Post 4: Modern Agile · Post 6: Testing Strategy →

There are two kinds of codebases. The first kind you can navigate confidently — you can find what you're looking for, understand what you find, change it without fear, and trust that the tests will tell you if something breaks. The second kind you tiptoe through — every change feels like it might detonate something, nothing is quite where you expect it, and the only safe move is to add code alongside the existing code rather than replace it.

The difference between these two codebases is rarely talent. It's almost always standards — whether the team has them, whether they're automated, and whether they're applied consistently.

Standards don't slow teams down. The absence of standards does. Debugging a problem caused by an undocumented assumption, reverse-engineering what a function does because its name doesn't reveal its intent, spending twenty minutes in a code review discussing formatting that a linter should have caught: this is where the time goes. Standards eliminate that category of waste and free the time for work that actually matters.

This post covers the practices that make a codebase a place engineers are confident working in, rather than a place they're anxious about touching.

The one thing to remember

Code is read far more often than it is written. Every decision about how to write code should optimise for the next person who reads it — which might be you in six months.

The measure of good code isn't whether it works today. It's whether a competent engineer who didn't write it can understand what it does, why it does it, and how to change it safely — without asking anyone.

Readable code over clever code

The instinct toward clever code is understandable. Elegant solutions to hard problems are satisfying. A one-liner that replaces twelve lines feels like progress. But cleverness has a cost: the next engineer who reads it — including the author six months later — has to reconstruct the reasoning before they can safely change anything.

Readability is something you can check. A function is readable if its name reveals what it does without reading its body. A variable is readable if its name makes its purpose obvious in context. A block of code is readable if a competent engineer can understand it without comments.

Naming is the single most important dimension of readable code. A function called getUsersWithExpiredSubscriptions() tells you everything you need before you read a line of its implementation. A function called getUsers2() tells you nothing. The discipline of finding the right name — specific enough to be unambiguous, concise enough to be usable — forces the author to understand what the code actually does, which is itself a valuable exercise.

Function size is the second most important dimension. A function should do one thing. Not "one thing plus some validation." Not "one thing and also handle the error case." One thing. The practical test: if you need a comment to explain what a block inside a function does, that block wants to be its own function with a descriptive name. When functions do one thing, they're easy to test, easy to understand, and easy to compose.
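
As a rough sketch of that practical test: the block below would otherwise sit inside a larger function under a comment like "work out the discount". Pulling it out gives it a name instead. The Order fields and the 10% loyalty rule are invented for illustration.

```python
from dataclasses import dataclass

LOYALTY_DISCOUNT_PERCENT = 10  # hypothetical business rule, for illustration only


@dataclass
class Order:
    subtotal_cents: int
    is_loyalty_member: bool


def loyalty_discount_cents(order: Order) -> int:
    """Return the discount this order qualifies for, in cents."""
    if order.is_loyalty_member:
        return order.subtotal_cents * LOYALTY_DISCOUNT_PERCENT // 100
    return 0


def order_total_cents(order: Order) -> int:
    """Return what the customer pays; the helper's name replaces the comment."""
    return order.subtotal_cents - loyalty_discount_cents(order)
```

Each function can now be tested on its own, and order_total_cents reads as a single sentence.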

Magic numbers are readability debt. if retries > 3 is a mystery. if retries > MAX_RETRY_ATTEMPTS is documentation. Every naked literal in a codebase is a question — what does this number mean, why this value, where else is it used? Constants answer those questions at the point of definition and make the intent clear wherever the constant is used.
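
A minimal sketch of the same comparison with the literal replaced by a named constant. The function name should_give_up is made up for illustration, and the comment beside the constant is where the "why this value" answer would live.

```python
# Defined once; the reason for choosing 3 would be recorded here.
MAX_RETRY_ATTEMPTS = 3


def should_give_up(retries: int) -> bool:
    """Return True once the caller has used up its retry budget."""
    return retries > MAX_RETRY_ATTEMPTS
```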

Abstraction consistency matters more than it gets credit for. A function that books a flight, constructs raw SQL, and sends an email is doing three things at wildly different levels of abstraction. The result is code that's unpredictable to navigate — you never know what level you're operating at until you're already deep inside a function. Keep abstraction levels consistent within a function. Either everything is high-level orchestration or everything is low-level implementation, not both.
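
One way this might look in practice, using the flight-booking example. Everything here is a hypothetical in-memory stand-in: in a real system the SQL and the email construction would sit inside the two lower-level methods, and book_flight would stay at the orchestration level.

```python
from dataclasses import dataclass, field


@dataclass
class BookingSystem:
    """In-memory stand-in used only to illustrate the layering."""
    reservations: list[tuple[str, str]] = field(default_factory=list)
    sent_emails: list[str] = field(default_factory=list)

    def reserve_seat(self, passenger: str, flight: str) -> None:
        # Low-level detail (for example, raw SQL) would live here, and only here.
        self.reservations.append((passenger, flight))

    def send_confirmation(self, passenger: str, flight: str) -> None:
        # Low-level detail (for example, building the email) would live here.
        self.sent_emails.append(f"{passenger}: confirmed on {flight}")

    def book_flight(self, passenger: str, flight: str) -> None:
        # High-level orchestration only: every line is one step at the same level.
        self.reserve_seat(passenger, flight)
        self.send_confirmation(passenger, flight)
```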

TDD: a design discipline as much as a testing one

Test-Driven Development gets filed under "testing practices" but its most important benefit is in design. Writing the test before the implementation forces you to think about the interface before you've committed to the internals — and that pressure almost always produces better interfaces.

When you write the test first, you confront questions that implementation-first development lets you avoid: How should this function be called? What should it return? What happens when the input is invalid? What does the calling code actually need, as opposed to what's convenient to expose? These are design questions, and answering them before writing the implementation produces designs that serve the caller rather than the implementer.

The red-green-refactor cycle is the core of TDD: write a failing test, write the minimum code to make it pass, refactor with confidence. The cycle keeps code lean — you only write code that's required to pass a test — and covered, because coverage is a byproduct of the process rather than a goal you chase after the fact.
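
A small sketch of one pass through the cycle, laid out in the order the code would be written. The function is_valid_username and its rules are hypothetical, chosen only because they are easy to follow.

```python
# Step 1 (red): the tests come first. Run at this point, they fail,
# because is_valid_username does not exist yet.
def test_accepts_simple_username():
    assert is_valid_username("ada_lovelace")


def test_rejects_empty_username():
    assert not is_valid_username("")


def test_rejects_username_with_spaces():
    assert not is_valid_username("ada lovelace")


# Step 2 (green): the minimum implementation that makes all three pass.
def is_valid_username(name: str) -> bool:
    return bool(name) and " " not in name


# Step 3 (refactor): with the tests green, the body can be reshaped
# freely; the tests report immediately if behaviour changes.
```

A test runner such as pytest would collect the three test functions by their test_ prefix. Note that the implementation contains nothing the tests didn't ask for, which is the "lean" part of the cycle.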

The common failure mode of TDD is applying it universally. TDD is high-value for business logic, algorithms, data transformations, and anything with complex branching. It's low-value for UI rendering, glue code with no logic, and exploratory prototypes where the interface isn't yet defined. Applying TDD to everything produces friction where it isn't warranted. Applying it selectively to the code that benefits most is more sustainable.

BDD (Behaviour-Driven Development) extends TDD by writing specifications in natural language — Given/When/Then — that both engineers and non-engineers can read. The specification becomes an executable test. The benefit is alignment: product managers, QAs, and engineers can all read the specification and agree on whether the behaviour is correct. Tools like Cucumber, SpecFlow, and Behave implement this in various languages. Worth adopting when the gap between what product specifies and what engineering builds is a recurring source of problems.

Code review: where standards get reinforced

Code review serves two purposes, and teams that focus only on the first one get half the value.

The first purpose is catching defects — logic errors, edge cases, security issues, performance problems. The second is spreading knowledge — making sure that more than one person understands how a piece of the system works, that the team's standards are applied consistently, and that engineers learn from each other's approaches.

PR size is the most important lever. Small PRs get thorough reviews. Large PRs get rubber-stamped. The correlation is strong and consistent across every team and codebase I've seen. A PR that can be reviewed in twenty minutes gets read carefully. A PR with 800 lines changed gets a scan and an LGTM, because nothing else is practical.

Target under 400 lines changed per PR. Anything over 800 should be split. Use stacked PRs for large features — a series of small PRs each building on the last, each reviewable in isolation.

What reviewers should focus on: logic correctness, edge cases and error handling, security implications, naming and clarity, test coverage, API design, performance implications. What reviewers should not focus on: formatting, import ordering, anything a linter or formatter catches automatically. If a reviewer is leaving formatting comments, the automated toolchain isn't doing its job.

Comment conventions reduce ambiguity. A simple prefix system: nit: for optional style suggestions the author can take or leave, question: for seeking understanding without mandating a change, blocking: for issues that must be addressed before merge. This removes the guesswork about whether feedback is mandatory — everyone knows which comments require action and which are suggestions.

The author's responsibilities are as important as the reviewer's. Write a clear PR description that provides context — why this change exists, what problem it solves, how to test it. Link the ticket. Add screenshots for UI changes. The reviewer shouldn't have to reconstruct the reason for the change from the diff alone.

Review SLA matters too. Unreviewed PRs block flow and create context-switching costs when the author has to reload the work. A team norm of first review within four business hours — not four days — keeps momentum and respects the author's time.

Documentation as code: docs that live with the code

Documentation that lives in a separate wiki decays. It falls out of sync with the code, becomes wrong in subtle ways, and stops being trusted. Engineers stop updating it because it's friction, and stop reading it because it's unreliable.

Documentation committed alongside code solves this. It gets reviewed in the same PR as the change it documents. It's updated when the code changes. It's versioned alongside the code it describes. It's findable in the repository rather than in a separate system that may or may not be maintained.

The README has one job: answer three questions. What does this project do? How do I run it locally? How do I run the tests? Everything else is secondary. A README that requires twenty minutes to read before a new engineer can get started is a README that's substituting for missing toolchain setup. Fix the toolchain; keep the README short.

Architecture Decision Records (ADRs), kept in /docs/adr/, were covered in Post 2 but deserve repeating here: they're the most underused documentation practice in engineering. Write them when significant technical decisions are made. Future engineers will thank you when they're trying to understand why the system works the way it does rather than some other way that seemed obvious to them.

Inline comments explain why, not what. The code already shows what it does. A comment that says // validates the user adds nothing. A comment that says // must validate before persisting — GDPR Art. 6 requires lawful basis before processing personal data saves hours of confusion. Comments document the non-obvious: a workaround for a known bug, a constraint imposed by an external system, a legal or compliance requirement that shaped a decision.
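
Here is roughly how the second comment might sit in code. The SignupRequest type, its has_lawful_basis field, and the list used as storage are all assumptions made for the sketch; only the comment's wording comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class SignupRequest:
    email: str
    has_lawful_basis: bool  # hypothetical field, e.g. recorded consent


def save_signup(request: SignupRequest, store: list[str]) -> None:
    # Must validate before persisting: GDPR Art. 6 requires a lawful basis
    # before processing personal data.
    if not request.has_lawful_basis:
        raise PermissionError("no lawful basis recorded for this signup")
    store.append(request.email)
```

Without the comment, the ordering looks arbitrary and a well-meaning refactor could move the check after the write. With it, the constraint travels with the code.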

API documentation belongs in the code. For REST APIs, that means an OpenAPI specification, generated from code annotations where possible and committed to the repository. The result is a single source of truth that is hard to let drift from the implementation without someone noticing.

The PR template enforces documentation habits. A checkbox in the PR template — "have you updated or added documentation for this change?" — makes the question explicit rather than assumed. It doesn't guarantee documentation happens, but it makes skipping it a visible decision rather than an invisible omission.

SOLID and design principles: heuristics for navigating complexity

Design principles get a bad reputation in some engineering circles — rightly, when they're applied dogmatically rather than thoughtfully. They're not rules to follow in every situation. They're heuristics for recognising when code is heading somewhere painful.

Single responsibility is the most practically useful. A class, module, or function with a single reason to change is a class, module, or function that's easy to understand in isolation. When you find yourself editing a file for three different kinds of reasons — a business logic change, a data format change, and a UI change — that's a signal the file has too many responsibilities.

Open/closed means new behaviour should be addable without modifying existing code. Composition and interfaces over modification. This is the principle that makes systems extensible without being fragile — you can add a new payment provider without touching the code that handles all existing payment providers.

Dependency inversion means depending on abstractions rather than concretions. Your order service shouldn't import a specific email client — it should call a Notifier interface that happens to be implemented by an email client today and could be implemented by something else tomorrow. This is what makes code testable (inject a mock Notifier in tests) and what prevents tight coupling between modules.
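
A minimal sketch of the Notifier idea, assuming Python's typing.Protocol as the way to express the interface. OrderService, place_order, and RecordingNotifier are illustrative names, and the order persistence step is left out.

```python
from typing import Protocol


class Notifier(Protocol):
    """The abstraction the order service depends on."""
    def notify(self, recipient: str, message: str) -> None: ...


class OrderService:
    def __init__(self, notifier: Notifier) -> None:
        # The service never learns whether this is email, SMS, or a test double.
        self._notifier = notifier

    def place_order(self, customer: str, item: str) -> None:
        # Persisting the order is omitted from this sketch.
        self._notifier.notify(customer, f"Order received: {item}")


class RecordingNotifier:
    """Test double: records messages instead of sending anything."""
    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def notify(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))
```

In a test, you would construct OrderService(RecordingNotifier()) and assert on the sent list. In production, an email-backed class with the same notify method would be passed in instead, with no change to OrderService.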

DRY (Don't Repeat Yourself) — applied carefully. Duplication is a problem. But premature abstraction is worse than duplication. Wait until you see the same pattern three times before abstracting it. The first time: write it. The second time: notice the pattern. The third time: abstract it. An abstraction written from one use case is usually wrong for the second use case in a way that only becomes apparent when the second use case arrives.

YAGNI (You Aren't Gonna Need It) is the hardest principle to follow because it runs against the engineering instinct to build for future requirements. Extensibility hooks for requirements that don't exist, abstraction layers for scale that hasn't arrived, configuration options for use cases that haven't been requested — these feel responsible. They are, in most cases, complexity added speculatively that becomes dead weight or, worse, a constraint on the design that the eventual real requirement doesn't fit.

Feature flags: decouple deployment from release

Feature flags are often thought of as a product management tool for A/B testing and gradual rollouts. They're also one of the most important engineering practices for maintaining a healthy codebase and a calm deployment process.

The deployment/release distinction is the key insight: deploying code and releasing a feature are separate events. With feature flags, code can be merged to main, deployed to production, and running on every server while being invisible to every user. The feature is released when the flag is turned on, independently of the deployment.

This enables trunk-based development at scale. Engineers can merge incomplete work to main without it affecting users. Large features are developed incrementally, with each increment deployed behind a flag. If something goes wrong in production, the feature can be disabled by flipping the flag, typically within seconds and with no redeployment required.
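
To make the deploy/release split concrete, here is a deliberately small sketch. The FeatureFlags class is an in-memory stand-in (a real setup would read from a flag service or config store), and the flag name new_checkout is invented.

```python
class FeatureFlags:
    """In-memory flag store, for illustration only."""
    def __init__(self) -> None:
        self._enabled: set[str] = set()

    def enable(self, name: str) -> None:
        self._enabled.add(name)

    def disable(self, name: str) -> None:
        self._enabled.discard(name)

    def is_enabled(self, name: str) -> bool:
        return name in self._enabled


def checkout_page(flags: FeatureFlags) -> str:
    # Both code paths are deployed; the flag decides which one users see.
    if flags.is_enabled("new_checkout"):
        return "new checkout"
    return "existing checkout"
```

Both branches ship in the same deployment. Releasing the feature is the call to enable, and rolling it back is the call to disable.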

Flag hygiene is where most teams fail. Flags that outlive their feature accumulate. A codebase with fifty active feature flags has at least fifty conditional branches to reason about, fifty on/off states to test (and far more combinations if the flags interact), and fifty places where something could go wrong. Every flag needs an owner and an expiry plan. When a feature is fully released, the flag comes out. When a flag is no longer needed, the flag comes out. Quarterly flag audits, where the team reviews every active flag and asks whether it still needs to exist, are worth building into the team's rhythm.
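
An owner and an expiry plan can be as simple as two extra fields recorded when the flag is created. The sketch below assumes a hypothetical FlagRecord type and shows the query an audit might start from; the names and dates in any real registry would be your own.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FlagRecord:
    name: str
    owner: str        # the person or team responsible for removing the flag
    remove_by: date   # the expiry plan, agreed when the flag is created


def flags_past_expiry(flags: list[FlagRecord], today: date) -> list[FlagRecord]:
    """Return the flags whose agreed removal date has already passed."""
    return [flag for flag in flags if flag.remove_by < today]
```

Running something like this in CI or before each audit turns "we should clean up flags" into a concrete list with a name next to every entry.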

The definition of done: a social contract

The Definition of Done is the team's shared standard for what "finished" means. It's not a checklist an individual applies to their own work. It's a collective agreement about what quality looks like, enforced by the team.

A Definition of Done that works:

[ ] Acceptance criteria met and verified
[ ] Unit and integration tests written and passing
[ ] No new linter warnings introduced
[ ] Code reviewed and approved
[ ] Documentation updated (README, API docs, ADR if applicable)
[ ] Feature flagged if incomplete or high-risk
[ ] Deployed to staging and smoke-tested
[ ] Ticket updated and linked to PR
[ ] No secrets or credentials in code

The value isn't in the list — it's in the agreement. When a PR is submitted without tests, the response isn't "the process says you need tests." It's "we agreed as a team that this is what done means." That shift from process compliance to collective ownership is what makes standards actually stick.

Post this in your team's channel. Add it to your PR template. The first few times it catches something that would otherwise have shipped, the team will start applying it naturally rather than mechanically.

What goes wrong when standards are absent or inconsistent

The quality ratchet. Once code quality starts declining — more complexity, less coverage, more undocumented assumptions — it accelerates. Each new piece of code is written in the context of the existing code around it. Low-quality context produces low-quality additions. The codebase degrades faster than it was built.

The hero engineer dependency. When standards are tribal rather than documented and automated, the engineer who knows "how things are done here" becomes a bottleneck for every review and every architectural decision. When they leave, the knowledge leaves.

Review theatre. Code reviews on a codebase without standards are mostly about preferences. One reviewer likes one pattern, another prefers a different one, and the author is caught in the middle. Without agreed standards, reviews generate friction without producing consistency.

The untouchable module. Every long-lived codebase has one: a module that's known to be poorly structured, poorly tested, and critical to the system. Nobody wants to touch it. Features get built around it rather than through it. It accumulates workarounds until it's a genuine engineering risk.

If you do one thing from this post

Write your team's Definition of Done. Not alone — with the team, in a meeting, with everyone contributing.

Ask: what do we all agree must be true before a piece of work is considered complete? Write down the answers. Put the result in your PR template.

The conversation itself is valuable — it surfaces different assumptions about what "done" means that are currently resolving themselves in production. The document that comes out of it is a standard the whole team owns rather than a rule imposed from above.

Next up: Post 6 — The Testing Trophy: Why You're Probably Writing the Wrong Tests

← Post 4: Modern Agile: What Actually Works vs What's Just Ceremony
