The Independent AI Resource · @cleoops7

ChatGPT for Doctors Just Hit a $12 Billion Valuation. The Vertical AI Thesis Is Winning.

OpenEvidence raised $250M at a $12 billion valuation, growing 12x year-over-year. The case it makes for specialized AI over general-purpose tools is one every business evaluating AI strategy should understand.

OpenEvidence raised $250 million at a $12 billion valuation in March 2026. It is growing 12 times year-over-year. It does one thing: answer clinical questions for physicians, accurately, with citations, in a format that holds up to clinical scrutiny.

It does not help you write emails. It does not summarize meeting notes. It does not generate social media copy. It answers clinical questions.

The valuation and the growth rate are evidence for a thesis that has been argued for three years but is only now accumulating the financial data to back it up: vertical AI beats horizontal AI for professional applications.

Why General-Purpose AI Keeps Falling Short for Professionals

The promise of general-purpose AI — ChatGPT, Claude, Gemini — is genuine. These tools are useful across an enormous range of tasks, require no specialized training, and are available to anyone with a subscription. For general productivity work, exploration, and drafting, they are genuinely transformative.

For professional applications where errors have consequential costs, they have a structural problem. General-purpose models are trained to be useful across all domains, which means their knowledge of any specific domain is broad but not deep. In medical applications, "broad but not deep" means the model will answer questions confidently, with appropriate-sounding citations, while occasionally being wrong about things that matter — drug interactions, dosing protocols, contraindications for specific patient populations.

Clinicians know this. The result is that general-purpose AI tools see minimal adoption for clinical decision support specifically, even as the same clinicians use ChatGPT for administrative tasks without hesitation. The trust problem is not about AI generally. It is about whether a specific tool can be trusted for a specific high-stakes purpose.

OpenEvidence solved this by making the problem smaller and the solution deeper. Rather than building a general medical AI, it built a clinical question-answering system trained on peer-reviewed medical literature, validated by clinicians, with every answer traceable to a specific source. Physicians can verify the reasoning. The citations are real and current. The error rate on domain-specific questions is measurably lower than that of general-purpose alternatives.

The Numbers Behind the Thesis

A $12 billion valuation requires some context to interpret. This is not a pre-revenue startup being valued on potential. OpenEvidence is growing 12x year-over-year, which implies a revenue trajectory that justifies the valuation within a credible projection window.
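To see why fast growth makes a headline multiple less alarming than it first appears, the compression logic can be sketched numerically. The $50M starting ARR and the growth-decay schedule below are hypothetical assumptions for illustration only; the only figures taken from this article are the $12 billion valuation and the 12x year-over-year growth rate.

```python
# Back-of-envelope sketch: how fast growth compresses a revenue multiple.
# ASSUMPTIONS (not reported figures): $50M current ARR, growth halving
# each year as the company matures. Only the $12B valuation and the 12x
# growth rate come from the article.
valuation = 12_000_000_000
arr = 50_000_000          # assumed current annual recurring revenue
growth = 12.0             # 12x year-over-year, decaying each year

for year in range(4):
    multiple = valuation / arr
    print(f"year {year}: ARR ${arr / 1e6:,.0f}M -> {multiple:.0f}x revenue multiple")
    arr *= growth
    growth = max(growth / 2, 1.5)  # assume growth halves annually, floored at 1.5x
```

Under these assumed numbers, a 240x multiple today becomes roughly 20x after one year of 12x growth and low single digits within three, which is the "credible projection window" argument in compact form.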

For comparison: this places OpenEvidence above the most recently reported valuations of several well-known general-purpose AI companies that serve markets many times larger than US clinical medicine. A specialized tool serving a defined professional audience is worth more — at least on a per-revenue-dollar basis — than many horizontal tools competing across every possible use case.

The reason is unit economics. Enterprise software sold to professionals with a clear ROI case has better margins, lower churn, and higher expansion revenue than consumer or broad enterprise SaaS. A physician who uses OpenEvidence for clinical decision support will not switch to a competitor because it is slightly cheaper. The switching cost — in trust, in workflow integration, in the risk of using something less validated — is too high.

What This Means for AI Strategy

The OpenEvidence case is a clean illustration of a strategic choice every organization building on AI must make: tool or platform.

General-purpose AI is a platform. It is broad, flexible, and relatively cheap. It is the right choice for tasks that are varied, lower-stakes, and do not require domain-specific accuracy.

Vertical AI is a tool. It is narrow, deep, and more expensive to build or buy. It is the right choice for tasks that are repetitive, high-stakes, and require accuracy that general-purpose models cannot consistently provide.

Most organizations defaulting to "we use ChatGPT" are making a platform choice. That is appropriate for a large portion of their AI-assisted work. The question is whether they have identified the specific applications where a vertical tool would outperform the platform — and whether those applications are significant enough to justify the investment.

In medicine, that question has a clear answer. Legal research, financial analysis, engineering design review, and regulatory compliance have similarly clear answers. The pattern is consistent: wherever domain-specific inaccuracy is a professional liability — where being wrong has measurable consequences — vertical AI outperforms horizontal.

Which Verticals Are Next

OpenEvidence's trajectory suggests several categories where the vertical AI opportunity is comparable:

Legal research and document review. The same accuracy problem applies: a general-purpose model will summarize case law, but a model trained specifically on legal precedent and validated by attorneys will do it with the precision that billable work requires.

Financial analysis and compliance. Regulatory filings, audit work, and compliance documentation require the kind of source-traceable, auditable output that general-purpose AI does not reliably provide. A vertical model with financial regulation as its training domain can.

Engineering design and safety review. The pattern repeats: broad knowledge is useful for ideation, but design decisions with safety implications require domain accuracy that generalist tools cannot guarantee.

Ad tech and programmatic advertising. Less obvious but real: campaign optimization, inventory quality assessment, and audience segment validation all have accuracy thresholds that matter. A tool built specifically for programmatic decision-making, trained on ad tech data and validated by practitioners, would command the same premium OpenEvidence commands in medicine.

The OpenEvidence valuation is not a medical story. It is a market signal: professionals will pay substantially more for AI that is accurate in their domain than for AI that is generally capable. The businesses that build or identify vertical AI for their specific context — rather than defaulting to the most capable general-purpose tool — are building on a better foundation.

The $12 billion number is the evidence.
