Sample report · For illustration only
File №: PD-001
Subject: Series-C SaaS, ~180 engineers
Audit: Product delivery
Filed: MMXXVI

Product delivery audit

The subject's lead time from commit to production grew from a median of 18 hours to 4.5 days over 18 months. Feature throughput, as reported by product, dropped by roughly half. Leadership commissioned this audit to identify root causes and propose a sequenced path back to fast delivery.

01 / Subject & scope

Where we looked and where we did not.

The subject is a Series-C SaaS platform with two product lines, ~180 engineers, and a seven-year-old Ruby monolith backed by a small constellation of supporting services. We were commissioned by the VP of Engineering with a single question: why has delivery slowed, and what can we do about it?

The audit covered the twelve teams contributing to the primary product (~90 engineers). The data platform, ML infrastructure, and internal tools teams were explicitly excluded — they have their own delivery model and were not part of the reported slowdown.

We were given full read access to source repositories, CI/CD telemetry, incident records, and Linear. Interviews were conducted under the studio's standard confidentiality terms; no individual is identifiable in this report.

02 / Methodology

How we gathered evidence.

Over four weeks we:

  • Sat in on fourteen hours of team rituals — standups, planning, retros — across six teams.
  • Reviewed ninety days of CI/CD telemetry: build duration, flakiness, deploy frequency, merge-queue wait times.
  • Shadowed four engineers through a full feature delivery cycle, from spec to production.
  • Conducted twenty-three confidential interviews across engineers, engineering managers, product managers, and SREs.
  • Read the past sixty days of incident post-mortems.
  • Audited the deploy pipeline, branching strategy, and inter-service dependency graph.

Our intent was not to benchmark this organization against industry numbers. It was to understand the specific delivery friction inside this system, made of these people, working on this codebase.

03 / Findings

What we observed.

A. The merge queue is the bottleneck, not the build.

Median CI duration is forty-seven minutes, well within tolerable bounds for a codebase of this size. But PRs wait a median of fourteen hours in the merge queue on peak weekdays. The wait is driven by test flakiness (a median of 3.2 retries to land a PR), not by build duration. Engineers report routinely batching unrelated changes into one PR to avoid a second merge cycle. The larger PRs are harder to review, which delays the queue further.

B. Code review is the slowest stage of delivery.

Median time-to-first-review is nineteen hours. Median time-to-merge is 3.5 days. Senior engineers (L5+) are reviewing roughly 80% of PRs across the entire organization. Review depth varies wildly: some PRs receive line-by-line scrutiny, others are rubber-stamped within minutes. No team has documented review standards or a stated review rotation.

C. Product specs arrive too late and too thin to enable parallel work.

In a sample of forty-five recently shipped features, 60% arrived in engineering as a one-to-three-paragraph Linear ticket with no acceptance criteria, no mockups, and no API surface defined. Engineers spend a median of four days per feature in what we call spec resolution: asynchronous conversation with PM and design before coding begins. Seven of the twelve teams have introduced their own ad-hoc "tech spec" practice to compensate, each in a different format.

D. Ownership boundaries are unclear at the seams.

The monolith has no CODEOWNERS file. Cross-team PRs (changes touching two or more teams' areas) make up 35% of all PRs and have a median time-to-merge of eight days. Engineers describe a "no one wants to be on the hook for this" pattern, especially around shared models and the authentication layer.
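
To make the cross-team definition concrete, here is a minimal Python sketch of how a PR could be classified from its changed files. It is an illustration, not the method used in the audit. The path prefixes and team names in AREA_OWNERS are hypothetical, since the monolith has no ownership map today.

```python
from typing import Dict, List, Set

# Hypothetical mapping from path prefix to owning team. The audit found no
# CODEOWNERS file, so these paths and team names are illustrative only.
AREA_OWNERS: Dict[str, str] = {
    "app/models/billing/": "billing",
    "app/models/shared/": "platform",
    "app/controllers/checkout/": "checkout",
    "lib/auth/": "identity",
}


def teams_touched(changed_files: List[str]) -> Set[str]:
    """Return the set of teams whose areas a PR's changed files fall under."""
    teams: Set[str] = set()
    for path in changed_files:
        # Longest matching prefix wins, so a nested area overrides its parent.
        matches = [prefix for prefix in AREA_OWNERS if path.startswith(prefix)]
        if matches:
            teams.add(AREA_OWNERS[max(matches, key=len)])
    return teams


def is_cross_team(changed_files: List[str]) -> bool:
    """A PR is cross-team when it touches two or more teams' areas."""
    return len(teams_touched(changed_files)) >= 2


def cross_team_share(prs: List[List[str]]) -> float:
    """Fraction of PRs that are cross-team (0.0 for an empty list)."""
    if not prs:
        return 0.0
    return sum(is_cross_team(files) for files in prs) / len(prs)
```

Files that match no prefix are ignored here. In practice, unowned paths are exactly the seams this finding describes, so a real version would probably report them separately.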

04 / Diagnosis

What is actually happening.

The slowdown is not a technical problem. The codebase is workable. CI is acceptable. The team is talented and motivated. The slowdown is structural.

The four findings above compound:

  • A thin spec produces a large PR.
  • A large PR receives slow review.
  • Slow review meets flaky tests in the merge queue.
  • Cross-team changes wait for owners who do not exist.

The same feature might pass through all four of these delays before reaching production. Leadership has been hearing about test flakiness for months and has correctly identified it as a problem, but it is downstream of the others. Fixing only the tests would yield perhaps a 20% reduction in lead time. Fixing the upstream constraints could yield three times that.

The team is not slow. The system around the team is slow. The same engineers, given a clearer system, would be three times faster.

05 / Recommendations

What we would change, in priority order.

1. Establish a written product-spec contract.

Specs must include acceptance criteria, sketched UX, and API surface before they enter engineering's queue. Define explicitly what "ready to start" means. This is the largest single lever in the report.
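
One way to make "ready to start" explicit is a small automated check on each spec before it enters the queue. The Python sketch below is a hedged illustration: the Spec fields and function names are hypothetical and do not correspond to any Linear schema. They only mirror the three requirements named above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Spec:
    """Hypothetical shape of a product spec; field names are illustrative."""
    title: str
    acceptance_criteria: List[str] = field(default_factory=list)
    ux_sketch_links: List[str] = field(default_factory=list)
    api_surface: str = ""


def missing_for_ready(spec: Spec) -> List[str]:
    """List which of the three required elements the spec still lacks."""
    missing: List[str] = []
    if not spec.acceptance_criteria:
        missing.append("acceptance criteria")
    if not spec.ux_sketch_links:
        missing.append("sketched UX")
    if not spec.api_surface.strip():
        missing.append("API surface")
    return missing


def is_ready_to_start(spec: Spec) -> bool:
    """A spec is ready only when nothing required is missing."""
    return not missing_for_ready(spec)
```

Returning the list of missing elements, rather than a bare yes or no, gives the PM something actionable when a spec is sent back.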

2. Decompose code review.

Move from senior-as-bottleneck to team-owned review rotations. Publish review standards per team. Use CODEOWNERS to route, not to gate.

3. Address flakiness with quarantine, not heroics.

Create a flake registry. Quarantine on the third retry. Assign each quarantined test to its team with a seven-day fix-or-delete window. Stop rewarding the "I rerun until it passes" pattern.
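
The quarantine rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any existing tooling: the class and field names are hypothetical, and we assume retries are counted cumulatively per test rather than per PR.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional

QUARANTINE_AT_RETRY = 3         # quarantine on the third retry
FIX_WINDOW = timedelta(days=7)  # seven-day fix-or-delete window


@dataclass
class FlakeRecord:
    """One registry entry; names are hypothetical."""
    test_id: str
    owning_team: str
    retries: int = 0
    quarantined_on: Optional[date] = None

    @property
    def deadline(self) -> Optional[date]:
        """Last day of the fix-or-delete window, or None if not quarantined."""
        if self.quarantined_on is None:
            return None
        return self.quarantined_on + FIX_WINDOW


class FlakeRegistry:
    def __init__(self) -> None:
        self._records: Dict[str, FlakeRecord] = {}

    def record_retry(self, test_id: str, owning_team: str, today: date) -> FlakeRecord:
        """Count a retry and quarantine the test once it reaches the threshold."""
        rec = self._records.setdefault(test_id, FlakeRecord(test_id, owning_team))
        rec.retries += 1
        if rec.retries >= QUARANTINE_AT_RETRY and rec.quarantined_on is None:
            rec.quarantined_on = today
        return rec

    def overdue(self, today: date) -> List[FlakeRecord]:
        """Quarantined tests whose fix-or-delete window has passed."""
        return [
            r for r in self._records.values()
            if r.deadline is not None and today > r.deadline
        ]
```

The overdue list is what the owning team would act on: each entry is either fixed and released from quarantine, or deleted.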

4. Designate codeowners for the shared layers.

Authentication, shared models, and the API gateway need named owning teams. Cross-team PRs route to owners. Owners merge.

5. Measure cycle time per team. Do not set targets.

Publish a small dashboard showing each team's commit-to-production time. Resist the urge to set goals. The metric exists to surface problems, not to manage people. Targets corrupt the signal.
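
The dashboard needs little more than a per-team median. The Python sketch below shows one way to compute it. It assumes each change is available as a (team, committed_at, deployed_at) record, a hypothetical shape rather than the format of the subject's CI/CD telemetry.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (team, committed_at, deployed_at).
Change = Tuple[str, datetime, datetime]


def median_cycle_time_hours(changes: Iterable[Change]) -> Dict[str, float]:
    """Median commit-to-production time per team, in hours."""
    by_team: Dict[str, List[float]] = defaultdict(list)
    for team, committed_at, deployed_at in changes:
        elapsed = deployed_at - committed_at
        by_team[team].append(elapsed.total_seconds() / 3600)
    # Sort by team name so the dashboard output is stable between runs.
    return {team: median(hours) for team, hours in sorted(by_team.items())}
```

The median is used rather than the mean so that a single long-stalled change does not dominate a team's number.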

06 / The first ninety days

A sequenced plan.

The plan below assumes leadership commits to all five recommendations.

Weeks 1–2.

Draft the spec contract with two volunteer PMs. Pilot in two teams. Define codeowners for shared layers and merge the CODEOWNERS file.

Weeks 3–4.

Roll the spec contract to all teams. Publish the first cycle-time dashboard. Begin the flake registry; quarantine the worst offenders.

Weeks 5–8.

Establish review rotations team-by-team. Senior engineers shift from primary reviewer to escalation reviewer.

Weeks 9–12.

Measure. Reassess. The spec contract and review rotations should show measurable cycle-time improvement by week eight. If they do not, the diagnosis was wrong; pause the rollout and re-investigate.

We expect a 40–60% reduction in median lead time within ninety days if all five recommendations are pursued, and a 20–30% reduction if only the first three are. We would rather promise less and observe more.
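
For a sense of scale, the percentages above can be applied to the current 4.5-day median lead time. The Python sketch below is only arithmetic on figures already in this report. It assumes the reductions apply directly to the median, and the function and scenario names are illustrative.

```python
from typing import Dict, Tuple

BASELINE_MEDIAN_DAYS = 4.5  # current median commit-to-production lead time

# (low, high) expected reduction in median lead time, as fractions.
SCENARIOS: Dict[str, Tuple[float, float]] = {
    "all five recommendations": (0.40, 0.60),
    "first three only": (0.20, 0.30),
}


def projected_range(
    baseline_days: float, low_reduction: float, high_reduction: float
) -> Tuple[float, float]:
    """Return (best case, worst case) median lead time in days."""
    best = baseline_days * (1.0 - high_reduction)
    worst = baseline_days * (1.0 - low_reduction)
    return round(best, 2), round(worst, 2)


if __name__ == "__main__":
    for name, (low, high) in SCENARIOS.items():
        best, worst = projected_range(BASELINE_MEDIAN_DAYS, low, high)
        print(f"{name}: {best} to {worst} days")
```

Under these assumptions, pursuing all five recommendations would bring the median to roughly 1.8 to 2.7 days. Pursuing only the first three would bring it to roughly 3.15 to 3.6 days.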