January 15, 2026

What I Learned Building GenAI Tooling for Engineering Teams

GenAI
Claude API
Engineering Leadership
AI

When I was tasked with designing internal AI components to measure engineering productivity, my first instinct was to reach for metrics: PRs merged, commits per day, lines of code. That's almost always the wrong place to start.

The real challenge with developer productivity tooling isn't technical — it's organizational. Engineering teams are rightfully skeptical of surveillance-flavored measurement. Any system that feels like monitoring will be gamed, resented, or ignored.

The Right Frame: Insights, Not Surveillance

The framing that worked for us was simple: this tooling exists to surface opportunities, not to evaluate individuals. We were building for the team lead and the engineering director, not for performance reviews. That distinction matters enormously in how you design what gets surfaced and how.

Using the Claude API, we built components that could reason over aggregated activity data and generate narrative summaries — not scorecards. The output was things like: "This team's review cycle has lengthened over the past two sprints — likely a capacity issue worth investigating," rather than "Engineer X has low throughput."

What Worked

Language models are genuinely good at synthesizing noisy, multi-dimensional signals into coherent narratives. That's exactly what engineering metrics are. The key is being intentional about what you feed them, and rigorously testing the outputs for bias and accuracy before anyone relies on them.

We iterated heavily on prompts and context window design. The difference between a useful insight and a hallucinated one often came down to how we framed the data in the prompt and what guardrails we put around the response.

What I'd Do Differently

Start with the consumer's questions, not the data you have. We spent too long designing around available signals rather than asking the people who'd use the tool what decisions they were actually trying to make. Build backwards from the insight, not forward from the data.

GenAI tooling for internal use is one of the highest-leverage investments an engineering org can make right now. The barrier to building useful internal tools has collapsed. What's left is knowing what to build.