The Quiet Death of the AI Product Manager: Why Letting Developers Own the Full AI Decision Stack Is a Governance Disaster Waiting to Happen


Something subtle but significant has been happening inside engineering organizations over the past 18 months, and almost nobody in leadership is talking about it loudly enough. The AI Product Manager, a role that was supposed to be the connective tissue between business intent and algorithmic output, is quietly being hollowed out. Not through layoffs. Not through org chart redesigns. Through irrelevance.

In 2026, the most productive AI teams are shipping faster than ever. They are integrating foundation models, building agentic pipelines, and deploying inference layers directly into production with a velocity that would have seemed reckless just two years ago. And in many of these teams, the person nominally responsible for "the product" has been reduced to writing Confluence pages that nobody reads while the engineers make every meaningful decision about model selection, prompt architecture, output filtering, and risk tolerance.

This is being celebrated in a lot of corners of the internet. I think it is one of the most dangerous structural trends in the software industry right now.

How We Got Here: The Velocity Trap

To understand why this is happening, you have to appreciate just how radically the AI development loop has compressed. In the pre-LLM era, building an intelligent feature meant a multi-month cycle: data collection, model training, evaluation, deployment, monitoring. That cycle demanded coordination. It demanded someone whose entire job was to translate user needs into model requirements and translate model limitations back to stakeholders. The PM role had structural necessity built into it.

That cycle is largely gone now. A senior engineer in 2026 can spin up an agentic workflow using a frontier model API, wire it into a product surface, instrument it with basic evals, and ship it to a canary cohort in a single sprint. The feedback loop is so tight that traditional product discovery processes feel like bureaucratic drag. And when a PM shows up to a sprint planning meeting asking for a PRD review or a risk assessment sign-off, the engineering team is not being malicious when they roll their eyes. They genuinely cannot see what value is being added.

So the PM gets sidelined. First informally, then structurally. Engineers start owning the roadmap conversations directly. They choose which models to use. They decide what the fallback behavior is when confidence is low. They determine what gets logged, what gets evaluated, and what constitutes "good enough" for production. The PM becomes a scheduler and a stakeholder communicator, not a decision-maker.

And the metrics look great. Shipping velocity goes up. Time-to-feature drops. The engineering org gets praised for moving fast. Everyone wins.

Until they do not.

What Engineers Are Exceptionally Good At (And What They Are Not)

Let me be direct: I have enormous respect for the engineers driving this shift. The developers building AI systems in 2026 are, on average, more technically sophisticated about model behavior than any generation of engineers before them. They understand token budgets, context window tradeoffs, hallucination patterns, and latency-accuracy tradeoffs in ways that most product managers frankly do not.

But technical sophistication is not the same as decision authority over the full product stack. And here is where the structural problem lives.

When an engineer decides which model to use for a customer-facing summarization feature, they are making a technical judgment. But they are also, whether they realize it or not, making decisions about:

  • Data residency and privacy exposure: Which provider is processing user data, under what terms, in which jurisdiction?
  • Fairness and output bias: Has anyone evaluated whether this model performs equitably across the demographic range of the actual user base?
  • Regulatory compliance: Does this deployment pattern satisfy the EU AI Act's transparency requirements, or the emerging US federal AI accountability frameworks now taking shape in 2026?
  • Business risk tolerance: What is the acceptable error rate for this use case, and who in the organization has explicitly signed off on that threshold?
  • User trust and expectations: What does the user believe is happening when this feature runs? Is that belief accurate?

These are not engineering questions. They are governance questions. And in the developer-led AI model, they are being answered implicitly, by default, through the choices embedded in a pull request that a PM never reviewed and a legal team never saw.

The "Full AI Decision Stack": A Map of What Is Going Unreviewed

It is worth being specific about what "owning the full AI decision stack" actually means in practice, because the scope of it is genuinely alarming when you lay it out clearly.

Model Selection

Which foundation model powers a feature is not a neutral technical choice. It determines cost structure, capability ceiling, vendor dependency, and the inherited biases of the training data. In many teams today, this decision is made by whoever writes the integration code first, subject to a quick Slack approval from a tech lead. There is no formal evaluation framework, no documented rationale, and no revisit schedule.

Prompt Architecture and System Instructions

The system prompt is, in a very real sense, the policy layer of an AI feature. It encodes what the system is allowed to say, how it frames information, what it refuses to do, and what persona it adopts. In developer-led teams, this is treated as an implementation detail. In a well-governed organization, it should be treated like a terms-of-service document: reviewed by legal, aligned with brand and ethics guidelines, version-controlled with explicit ownership.
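A minimal sketch of what "treating the system prompt like a terms-of-service document" could look like in practice. Everything here (the `PromptPolicy` schema, the field names, the reviewer labels) is an illustrative assumption, not an existing tool:

```python
from dataclasses import dataclass

# Hypothetical sketch: the system prompt as a versioned policy artifact
# with an explicit owner and recorded sign-offs, rather than an inline
# string buried in the integration code.
@dataclass(frozen=True)
class PromptPolicy:
    prompt_id: str          # stable identifier, e.g. "support-assistant"
    version: str            # bumped on every wording change
    owner: str              # named individual accountable for the content
    text: str               # the system prompt itself
    approvals: tuple = ()   # reviewers who signed off, e.g. ("legal", "brand")

    def is_deployable(self, required: set) -> bool:
        """A prompt ships only when every required reviewer has signed off."""
        return required.issubset(set(self.approvals))

policy = PromptPolicy(
    prompt_id="support-assistant",
    version="2.3.0",
    owner="jdoe",
    text="You are a support assistant. Do not give legal or medical advice.",
    approvals=("legal", "brand"),
)

print(policy.is_deployable({"legal", "brand"}))      # all required sign-offs present
print(policy.is_deployable({"legal", "ai-ethics"}))  # blocked: missing a sign-off
```

The point is not the specific schema but the shift it encodes: a prompt change becomes a reviewable, attributable event rather than a silent commit.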

Output Filtering and Safety Thresholds

What happens when the model generates something problematic? Who decided what "problematic" means? In most developer-led deployments, this is handled by a combination of provider-side content filters and whatever the engineer thought to add during implementation. There is rarely a documented policy, a cross-functional review, or a defined escalation path.
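To make this concrete, here is a hedged sketch of a documented output policy with a defined escalation path. The category names, actions, and owner teams are illustrative assumptions; the shape matters more than the specifics:

```python
# Hypothetical sketch: an explicit, written output policy instead of
# ad-hoc filters added during implementation. Each flagged category
# carries a decision AND a named owner, so "who decided what
# 'problematic' means" has a documented answer.
BLOCK, ESCALATE, ALLOW = "block", "escalate", "allow"

OUTPUT_POLICY = {
    "self_harm":      {"action": BLOCK,    "owner": "trust-and-safety"},
    "medical_advice": {"action": ESCALATE, "owner": "legal"},
    "pii_leak":       {"action": BLOCK,    "owner": "privacy"},
}

def apply_policy(flags: list[str]) -> tuple[str, list[str]]:
    """Return the most severe action across all flagged categories,
    plus the owners who must be notified."""
    severity = {ALLOW: 0, ESCALATE: 1, BLOCK: 2}
    action, notify = ALLOW, []
    for flag in flags:
        rule = OUTPUT_POLICY.get(flag)
        if rule is None:
            continue  # unknown flags fall through; a real policy would log them
        if severity[rule["action"]] > severity[action]:
            action = rule["action"]
        if rule["action"] != ALLOW:
            notify.append(rule["owner"])
    return action, notify

print(apply_policy(["medical_advice"]))              # ('escalate', ['legal'])
print(apply_policy(["pii_leak", "medical_advice"]))  # most severe action wins
```

Once the policy is a data structure rather than scattered `if` statements, it can be reviewed by legal, diffed in version control, and audited after an incident.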

Evaluation and Monitoring Criteria

What does "working correctly" mean for this AI feature? Most engineering teams have some form of automated evals, but the choice of what to evaluate, what thresholds trigger a rollback, and what user experience signals constitute a failure is almost always made unilaterally by the engineering team. Product, legal, and customer success teams rarely have visibility into these frameworks until something goes wrong.

Deprecation and Model Upgrade Decisions

When a new model version is released, who decides whether to upgrade, and on what timeline? In developer-led teams, this is often driven by engineering preference or cost optimization. The user-facing behavioral changes that come with model upgrades are frequently undocumented and uncommunicated.

The Governance Debt Is Already Accumulating

Here is the uncomfortable truth: the governance debt being accumulated right now in developer-led AI organizations is not theoretical. It is already sitting in production, compounding quietly.

Consider what is already happening across the industry. Enterprises are discovering that AI features shipped by their engineering teams are processing customer data through third-party model APIs in ways that violate their own data processing agreements. Legal teams are finding that AI-generated outputs have been attributed to the company in ways that create liability exposure. Accessibility and fairness advocates are surfacing evidence that AI features deployed without demographic evaluation are producing systematically worse outcomes for minority user groups. Customer trust is eroding in cases where users discover that the "AI assistant" they were interacting with was operating under system instructions designed to steer them toward upsell opportunities.


None of these failures happened because engineers were careless or malicious. They happened because the organizational structures that were supposed to catch these issues (the PM layer, the design ethics review, the cross-functional sign-off process) had been functionally bypassed in the name of shipping velocity.

The PM Role Did Not Die Because It Was Useless. It Died Because It Failed to Evolve.

I want to be honest about the other side of this argument, because it is important. The AI PM role is being sidelined partly because a significant number of AI PMs have not kept pace with the technical complexity of the systems they are supposed to govern.

A product manager who cannot read an eval report, who does not understand the difference between a retrieval-augmented generation system and a fine-tuned model, who cannot have a substantive conversation about context window management or output determinism, is genuinely not equipped to govern modern AI product decisions. Engineers are not wrong to route around them.

The failure mode here is not "engineers are too powerful." It is "product management as a discipline has not produced enough people who can operate at the intersection of technical depth and governance rigor." The PM role needs to evolve into something closer to an AI Systems Steward: someone who combines product intuition with model literacy, governance expertise, and cross-functional authority.

That person does not exist in most organizations yet. And until they do, the vacuum will continue to be filled by engineers making decisions they were never meant to make alone.

What a Governance-Ready AI Development Culture Actually Looks Like

The answer is not to slow down engineering teams. Velocity is genuinely valuable, and any governance framework that kills it will simply be ignored. The answer is to build governance into the development loop, not around it.

Here is what that looks like in practice:

  • AI Decision Records (ADRs for AI): Borrowing from the Architecture Decision Record tradition, every significant AI decision (model selection, system prompt policy, eval framework design) should have a lightweight, structured document that captures the decision, the alternatives considered, the risks acknowledged, and the owner. This creates an audit trail without creating a bottleneck.
  • Cross-functional AI review as a sprint ritual, not a gate: Rather than a heavyweight sign-off process that blocks shipping, a 30-minute weekly sync between engineering, product, legal, and a designated AI ethics reviewer can catch governance issues before they reach production without meaningfully impacting velocity.
  • Explicit risk tiers for AI features: Not every AI feature carries the same governance burden. A grammar suggestion feature and a medical information summarizer should not go through the same review process. Organizations need a tiered risk classification system that routes features to the appropriate level of scrutiny automatically.
  • PM-as-governance-owner, not feature-owner: In an AI-native organization, the most valuable thing a PM can do is not manage the backlog. It is own the governance artifacts: the model policy documentation, the fairness evaluation reports, the regulatory compliance mapping. This repositions the PM as essential infrastructure rather than process overhead.
  • Model change management as a first-class process: Every model upgrade, provider change, or system prompt modification should go through a defined change management process with documented behavioral testing, stakeholder notification, and a rollback plan. This is standard practice for database migrations. It should be standard for AI systems too.
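The first practice above, AI Decision Records, can be sketched as a small structured artifact. The schema below is an illustrative assumption borrowing the shape of a conventional ADR; nothing here is a standard:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical sketch of an "AI Decision Record": lightweight enough to
# write in minutes, structured enough to audit later.
@dataclass
class AIDecisionRecord:
    title: str
    decision: str
    alternatives: list[str]   # what was considered and rejected
    risks: list[str]          # risks acknowledged at decision time
    owner: str                # named individual, not a team alias
    decided_on: date
    revisit_by: date          # a revisit schedule, so decisions do not fossilize

adr = AIDecisionRecord(
    title="Model selection for ticket summarization",
    decision="Use vendor A's mid-tier model behind our API gateway",
    alternatives=["vendor B frontier model", "self-hosted open-weights model"],
    risks=["vendor lock-in", "no demographic fairness eval completed yet"],
    owner="jdoe",
    decided_on=date(2026, 3, 2),
    revisit_by=date(2026, 9, 2),
)

print(adr.title)
```

A record like this is the audit trail the model-selection section argued was missing: the decision, the alternatives, the acknowledged risks, and a date by which someone must look at it again.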

The Stakes Are Higher Than Most Leaders Realize

The regulatory environment in 2026 is not the same as it was two years ago. The EU AI Act's enforcement mechanisms are now fully operational. US federal agencies are actively developing sector-specific AI accountability requirements. Enterprise customers are beginning to include AI governance clauses in procurement contracts. The era of "ship fast and fix the governance later" is closing, and organizations that have structurally eliminated the oversight layer are going to find themselves exposed at exactly the moment when exposure is most costly.

Beyond regulation, there is the deeper issue of trust. The companies that will win the next decade of AI-native product development are not the ones that shipped the most features the fastest in 2025 and 2026. They are the ones that built user trust at scale, that deployed AI systems that behaved consistently with user expectations, that could demonstrate accountability when things went wrong. That kind of trust is not built by engineering teams alone. It requires the organizational discipline that good governance provides.

Conclusion: Speed Without Oversight Is Not a Competitive Advantage. It Is a Liability.

The developer-led AI model is producing real velocity gains right now. I am not disputing that. But velocity without governance is not a sustainable competitive advantage. It is a liability that has not been called in yet.

The quiet death of the AI PM is not a sign of organizational maturity. It is a sign of a discipline that failed to evolve fast enough, and of leadership that mistook the absence of friction for the presence of quality. The engineers filling that vacuum are not to blame. They are doing their jobs, and doing them well, within a structure that has asked them to carry a burden they were never designed to carry alone.

The organizations that figure this out first, that rebuild the PM function around governance rigor rather than feature ownership, that treat AI decision-making as a cross-functional responsibility rather than an engineering implementation detail, will have a structural advantage that no amount of raw shipping velocity can overcome.

The rest will be very fast, very busy, and very exposed when the bill comes due.