In late 2020, Dario Amodei was VP of Research at OpenAI. An internal document of his circulated in the executive committee, arguing that scaling model capability at that speed, without equivalent investment in safety research, would within a few years produce an institutional risk that would be hard to manage. The company's response was to continue the trajectory.
In early 2021 he left. He took his sister Daniela, Tom Brown, and a group of other key researchers with him. Three months later they founded Anthropic.
That decision changed the public conversation about how AI gets built — and, five years on, redrew the professional market.
What was actually being argued
The underlying disagreement between Anthropic and the culture its founders came from isn't ideological. It's operational.
One position holds that AI safety is a compliance process: ship the model, observe what goes wrong, add filters. It's fast. It's scalable. It's the industry standard.
The other position holds that AI safety can't be added after the fact in a robust way. If you train a model for months with one objective (predict the next token), then try to redirect it with bolted-on filters, that's fragile — the filters can be bypassed, the model's decisions stay opaque, and the serious problems show up only in production at scale. Anthropic's founders hold the second.
The technique they developed to implement that second position is called Constitutional AI.
Constitutional AI, without the marketing
The original paper has been public since 2022 (Bai et al., "Constitutional AI: Harmlessness from AI Feedback"). The method is easiest to follow as four steps.
First, they train a base model the standard way — next-token prediction over a massive corpus. At this point the model is capable but not aligned.
Second, they write a "constitution": a set of explicit principles the model has to follow. Honesty, refusal of harmful tasks, respect for different cultural contexts, no assisting with disinformation, and so on. The constitution is public, on Anthropic's site.
Third, the model generates responses to test prompts and evaluates itself against those principles. Where it spots misalignment, it rewrites.
Fourth — and here's the key innovation — they use that model-generated feedback to retrain. They call it RLAIF (Reinforcement Learning from AI Feedback), in contrast to the RLHF (Reinforcement Learning from Human Feedback) that dominated the field. The operational difference is that alignment ends up baked into the weights, not stacked on top.
The practical consequence is what you notice when you use it: Claude refuses problematic tasks consistently, explains why without moralizing, and the refusals are hard to bypass with prompt tricks — not because there's a guard watching, but because the decision is born inside the model.
The product line as of April 2026
Anthropic stopped being just a model a while back. Today there's an integrated product line.
Claude.ai is the public web app: a free version with daily limits, Claude Pro at $20 a month with no practical limits, and Team and Enterprise plans for groups.
Claude API runs on token consumption. It's the path for developers and for companies integrating Claude into their own products. Current Opus 4.7 pricing: $5 per million input tokens, $25 per million output tokens.
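As a rough illustration of token-based billing, the sketch below estimates the cost of one request from the Opus 4.7 prices quoted above. The function name and the example token counts are hypothetical, and real billing may involve details not covered here.

```python
# Prices quoted in the text, in US dollars per million tokens.
INPUT_PRICE_PER_MTOK = 5.0
OUTPUT_PRICE_PER_MTOK = 25.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    return input_cost + output_cost


# Hypothetical request: a 40,000-token contract in, a 2,000-token summary out.
cost = estimate_cost_usd(40_000, 2_000)
print(f"${cost:.2f}")  # $0.25
```

Output tokens cost five times as much as input tokens at these prices, so long generated answers dominate the bill faster than long documents do.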
Claude Code is the developer tool. It combines code editing, execution, and reasoning over entire repos. It's what many people use instead of a conventional IDE for work that mixes coding with explaining.
Computer Use is the agent capability: Claude observes the screen, moves mouse and keyboard, and executes full flows inside applications. Available since October 2024.
Cowork is a collaborative layer where one or several humans work with Claude in real time on documents, analysis, code.
Model Context Protocol (MCP) is an open protocol for applications to connect to Claude. It's an important strategic move because it's being adopted by independent third parties — meaning the ecosystem grows without every integration needing to go through Anthropic.
The model trajectory
Worth looking at how Claude evolved to understand the company's pace.
Claude 1 and 2 (2023) were modest releases — competent but not benchmark leaders. The criticism of Anthropic at that point was that it invested heavily in safety and lightly in raw capability.
Claude 3 (March 2024) changed that. It was a family of three models (Opus, Sonnet, Haiku) with explicit trade-offs between capability and speed, and it started competing closely with the leaders of that moment on reasoning and code.
Claude 3.5 and 3.7 Sonnet (June 2024 and February 2025) consolidated the position, with successive improvements in speed, long-context handling, and code quality.
Claude Opus 4.6 (early 2026) and Opus 4.7 (April 16, 2026) are the current models. The most important qualitative improvement of Opus 4.7 over 4.6 — covered in detail in another piece — is instruction literalism: the model executes what you ask for more predictably, which makes it especially useful in agents and in automated flows.
Each version held the foundational principles. The difference was capability and speed. The underlying promise didn't change.
Why so many use it as their main work tool
There's a usage argument that closes the loop and is worth stating directly.
In applied consulting, the most expensive thing isn't the per-hour cost of the tool. It's the cost of a mistake that ends up delivered to a client. A brilliant-but-false answer, a recommendation that skips an important exception, a synthesis that invents a plausible-sounding fact — those are all operational disasters.
The properties that reduce that risk are different from the ones that win benchmarks. They are: saying "I don't know" when it doesn't; separating what it infers from what it verifies; refusing problematic tasks coherently; producing results that don't fall apart when you reread them carefully.
Those are the properties Claude prioritizes. That's why it's the main tool in my consulting work. That's why it's the one I teach in the course. Not because it's perfect — it isn't. Because for professional work with real stakes, reliability outweighs novelty.
The honest limitations
A pro-Claude piece wouldn't be honest without naming where others do better.
For very artistic image generation, alternatives have more creative muscle. For real-time voice conversation, others have better implementations. For deep integration with office suites like Google Workspace or Microsoft 365, the providers that own the suite have a structural edge: they're already inside.
Those limitations are real. The practical question is which ones touch your work. If your day is spent producing illustrated content, you may need something else. If your day is analyzing, writing, coding, reviewing, and reasoning — Claude sits at the center of that field.
To close, and to keep going
Anthropic just crossed five years since founding. The team grew to several hundred people; it raised around $8 billion across successive rounds; it kept an internal culture aligned with the original mission. That's a rarity in an industry where scaling speed usually leaves culture behind.
Will it "win" over the other giants in the field? Wrong question. The future is multi-vendor — you'll use different tools for different tasks. But there's one tool that becomes your default, the one you spend the most time in and where you build your operational trust. For many serious professionals, that tool is Claude. And Anthropic's trajectory explains why.
If you want to dig into how model capability gets measured and compared, "How AIs are measured" is the next link. If you want the broader competitive picture, see "The AI race."
Which property of an AI tool weighs most in your work: raw capability, speed, or reliability?
Here's something Claude does that most other AI tools still don't quite nail: when it doesn't know something, it tells you it doesn't know.
Sounds obvious. It isn't.
Try this with any AI tool you have: ask something narrow and obscure. Most of them will produce a confident-sounding answer that's tailored to please you. Claude will say "I don't have this" or "here's what I know up to this point, the rest is my guess." In professional work, that habit is worth gold.
The company that took the long road
Anthropic started in 2021 with a rare decision for that moment.
A group of researchers — including siblings Dario and Daniela Amodei — had been working at OpenAI and disagreed with how priorities were being handled inside. They thought safety in AI models couldn't be an add-on at the end. It had to be foundational from day one.
They didn't get heard. They left. They started Anthropic in San Francisco with that one idea as their entire mission.
Five years later, Anthropic is the AI company that the most serious professionals choose to trust for real work.
Claude wasn't born in April
Anthropic's main product is called Claude. It's the AI model you talk to when you open claude.ai, and it's behind many other products today.
The trajectory matters because it tells you this isn't an experiment:
- 2023 — Claude 1 and Claude 2. They worked but were modest compared to the competition at the time.
- 2024 — Claude 3 (Opus, Sonnet, Haiku). Big jump. Anthropic started winning professional users.
- 2024-2025 — Claude 3.5 and 3.7 Sonnet. Successive improvements in code, reasoning, long-document reading.
- 2026 — Claude Opus 4.6, then Opus 4.7 (this past April 16).
Each version held the same principles: honesty, safety, predictability. Speed and capability changed. The promise didn't.
Constitutional AI, without the academic gloss
The technique Anthropic uses to train Claude is called Constitutional AI. The idea is easy to explain.
Imagine you have to train a new employee. You have two options.
Option 1: let them work without rules and every time they screw up, tell them "no, not that." They learn the hard way.
Option 2: before they start, hand them the ten principles that run the place — "treat the client well, don't invent numbers, if you don't know something say so, don't sign anything you don't understand." When they walk into the job, those principles are already part of how they think.
Constitutional AI is option 2. Anthropic hands the model, during training, a set of written principles. Then it uses the model itself to evaluate its outputs against those principles and improve.
The practical consequence? When Claude refuses something, the refusal is consistent and explainable. Not a patch on top.
Why a lot of us use it as the default tool
There's a concrete reason Claude is the main tool in my consulting work and the one I teach in the course.
I work with sensitive client data: contracts, financial spreadsheets, internal strategy. The tool helping me with that can't be brilliant and sloppy. It has to be predictable, honest, and safe.
When I hand Claude a contract and ask "pull the clauses that could cause problems," it does two things I value: it finds the clauses, and it tells me which ones could be read multiple ways. It doesn't assert with certainty what's actually ambiguous. That small habit — separating what it knows from what it's guessing — is what lets me trust what it returns.
What it's good at, and what it isn't
Claude isn't best at everything. Worth being honest.
It's strong on long-document analysis, on code, on applied reasoning, on natural writing, and on following complex instructions. It's the best option I know for professional work with sensitive data.
It isn't your first pick if what you need is heavily artistic image generation, or if you live all day inside Google Workspace and want AI glued to your email and Docs. For those, there are more comfortable alternatives.
Practical rule: for producing work, use Claude. For uses where integration with the office suite you already use weighs more, check what comes with that suite.
What to take away
Three things worth remembering:
- Anthropic took the long road. Safety as foundation, not as filter. That choice made it slower to make noise — and made it today's option for delegating serious work.
- Constitutional AI isn't marketing. It's a technique with a public paper that changes how the model decides what to do and what not to. Trusting Claude doesn't require an act of faith.
- If your work depends on not getting things badly wrong, Claude is the tool worth having as default. For everything else, alternatives exist and that's fine.
When Dario Amodei left OpenAI in early 2021, he didn't just take experience with him. He took an alignment thesis the institution he was resigning from had chosen not to prioritize. That thesis — that robust alignment requires architectural foundations, not surface-level patches — is the same thesis that, five years later, defines Anthropic's competitive differentiation in the professional market. Worth disassembling with technical precision.
The alignment problem that motivated the founding
The AI alignment debate has a concrete technical axis: how to make a large language model, trained with one objective function (next-token prediction), exhibit behavior aligned with human preferences that aren't in that objective function.
The mainstream strategy in 2020-2021 was RLHF — Reinforcement Learning from Human Feedback. It works like this: train the base model, collect thousands of examples where humans rank responses, train a reward model that learns those preferences, and use reinforcement learning to push the base model toward the human preferences. It's the technique that trained InstructGPT and gave rise to ChatGPT.
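To make the reward-model step concrete, here is a minimal sketch of a pairwise ranking loss. The text does not say which loss is used; the Bradley-Terry form below is a common choice and is an assumption here, as is the function name. The inputs are the scalar scores a reward model assigns to the response a human preferred and to the one they rejected.

```python
import math


def pairwise_reward_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the chosen response outranks the rejected one.

    Assumes the Bradley-Terry form:
    P(chosen beats rejected) = sigmoid(reward_chosen - reward_rejected).
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The loss is small when the reward model agrees with the human ranking
# and large when it disagrees.
print(pairwise_reward_loss(2.0, 0.0) < pairwise_reward_loss(0.0, 2.0))  # True
```

Minimizing this loss over many ranked pairs pushes the reward model to score preferred responses higher, and that score is what the reinforcement learning step then optimizes.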
RLHF has three limitations that Anthropic's co-founders identified as critical. One: it requires enormous human effort, on the order of tens of thousands of annotator hours. Two: human preferences are inconsistent; different annotators rank differently, and the reward model ends up learning the noise. Three, and deeper: alignment lives as a thin layer on top of the base model, which invites reward hacking, where the model learns to optimize for appearing aligned rather than for being aligned.
Constitutional AI is the technical answer to all three at once.
Constitutional AI mechanics
The Bai et al. (2022) paper describes the method in detail. The condensed version is this.
Supervised Critique-and-Revise phase. Take a base model. Present it with prompts designed to elicit problematic responses. The model responds. Then ask the same model to critique its own response using a constitutional principle as criterion. The model identifies the problem. Then ask it to rewrite the response correcting the identified problem. With thousands of pairs (original response, revised response) you do supervised fine-tuning.
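The loop described above can be sketched as follows. This is an illustration under stated assumptions, not Anthropic's implementation: the `Model` type, the prompt wording, and `toy_model` (a canned stand-in so the sketch runs without any API) are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a text prompt to a text completion.
Model = Callable[[str], str]


def critique_and_revise(model: Model, prompt: str, principle: str) -> Dict[str, str]:
    """Run one critique-and-revise pass and return the resulting record."""
    original = model(prompt)
    critique = model(
        f"Principle: {principle}\nResponse: {original}\n"
        "Identify how the response conflicts with the principle."
    )
    revised = model(
        f"Response: {original}\nCritique: {critique}\n"
        "Rewrite the response so that it follows the principle."
    )
    return {"prompt": prompt, "original": original, "critique": critique, "revised": revised}


def build_sft_dataset(model: Model, prompts: List[str], principle: str) -> List[Dict[str, str]]:
    """Collect one record per prompt; the records feed supervised fine-tuning."""
    return [critique_and_revise(model, p, principle) for p in prompts]


def toy_model(text: str) -> str:
    """Canned stand-in for a language model; it only pattern-matches the request."""
    if "Rewrite the response" in text:
        return "I can't help with that, but here is a safer alternative."
    if "Identify how" in text:
        return "The response gives harmful instructions."
    return "Sure, here is how to do it."


dataset = build_sft_dataset(
    toy_model, ["Example red-team prompt"], "Do not assist with harmful tasks."
)
print(dataset[0]["revised"])
```

The same model plays three roles here (responder, critic, reviser), which is what lets the process scale without human annotators writing each correction.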
RLAIF phase. Take the model from the previous step. For each prompt, generate two responses. Ask the model itself (or a separate model trained as evaluator) to rank the two responses according to constitutional principles. Use those rankings to train a reward model. Then apply standard reinforcement learning against that reward model.
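A minimal sketch of the labeling step in this phase, assuming a judge function that applies one written principle to a pair of candidate responses. The type aliases, function names, and toy stand-ins are hypothetical; in practice the generator and the judge would be language models.

```python
from typing import Callable, List, Tuple

# Hypothetical types. A generator returns two candidate responses for a prompt;
# a judge returns 0 if the first candidate better follows the principle, else 1.
Generator = Callable[[str], Tuple[str, str]]
Judge = Callable[[str, str, str, str], int]


def label_preferences(
    prompts: List[str], generate_pair: Generator, judge: Judge, principle: str
) -> List[Tuple[str, str, str]]:
    """Return (prompt, chosen, rejected) triples labeled by the AI judge."""
    triples = []
    for prompt in prompts:
        first, second = generate_pair(prompt)
        preferred = judge(principle, prompt, first, second)
        chosen, rejected = (first, second) if preferred == 0 else (second, first)
        triples.append((prompt, chosen, rejected))
    return triples


def toy_generate_pair(prompt: str) -> Tuple[str, str]:
    """Canned stand-in for sampling two responses from a model."""
    return ("Sure, here is how to do it.", "I can't help with that request.")


def toy_judge(principle: str, prompt: str, first: str, second: str) -> int:
    """Canned stand-in for a model judge: prefers the response that declines."""
    return 1 if first.startswith("Sure") else 0


triples = label_preferences(
    ["Example red-team prompt"], toy_generate_pair, toy_judge,
    "Do not assist with harmful tasks.",
)
print(triples[0][1])  # I can't help with that request.
```

The resulting triples have the same shape as human preference data, so the reward model and reinforcement learning steps that follow can stay unchanged.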
The operational difference with classic RLHF is that the source of feedback is the model applying written principles, not humans ranking intuitions. That gives three measurable advantages: scale (you don't need thousands of annotators), consistency (principles get applied uniformly), and interpretability (you know exactly which principle governs each decision).
The honest critique of Constitutional AI is that the problem shifts from "evaluating individual responses" to "writing a constitution that captures human preferences correctly." The constitution becomes the point of fragility. Anthropic has published iterations of its constitution and acknowledges the problem openly: it's open research, not solved.
Opus 4.7 technical positioning
As of April 2026, Anthropic's frontier model is Claude Opus 4.7. Its technical positioning is best defined by contrast with the prior generation (Opus 4.6) rather than by benchmarks against external competitors. Comparative benchmarks provided by Anthropic itself should be read as interested-party evidence, as warned in the launch coverage.
The main qualitative improvement documented by Anthropic is instruction literalism. The previous model interpreted instructions with a margin of implicit discretion; the new one executes them more literally. The operational consequence for systems depending on Claude — autonomous agents, multi-turn pipelines, MCP integrations — is improved predictability. The consequence for casual users with ambiguous prompts is that output quality now correlates more tightly with input quality.
Other documented improvements: vision (image processing up to 2,576 pixels on the long side, vs. ~1,024 before), memory across long sessions, and pre-response self-verification. The performance figures accompanying these improvements come from Anthropic itself or commercial partners (Notion, Harvey, Rakuten, Databricks); independently verified external benchmarks typically arrive 4 to 8 weeks after a release.
Constitutional AI vs alternatives: technical analysis
Worth comparing Constitutional AI to the dominant technical alternatives in the field.
Classic RLHF (used by most competitors). More mature, more data available, better characterized in the literature. Limitations discussed above.
DPO (Direct Preference Optimization). An alternative to RLHF that skips the intermediate reward model by training directly on preference pairs. More computationally efficient. Doesn't solve the alignment fragility problem.
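For readers who want to see what "training directly on preference pairs" means, here is a sketch of the DPO loss for a single pair. The inputs are sequence log-probabilities under the model being trained (the policy) and under a frozen reference model; the argument names and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one preference pair, given sequence log-probabilities."""
    # How much more (or less) likely each response is under the policy than the reference.
    chosen_logratio = policy_logp_chosen - ref_logp_chosen
    rejected_logratio = policy_logp_rejected - ref_logp_rejected
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(logits)), written in a form that avoids overflow.
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# With the policy identical to the reference, the loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Shifting probability toward the chosen response lowers the loss.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(improved < baseline)  # True
```

No reward model appears anywhere in the computation, which is where the efficiency gain over RLHF comes from.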
Open models without deep alignment (several in the open source space). Strategy: let the deployer do their own fine-tuning. Has flexibility advantages but transfers the problem to the end user, who rarely has the technical capacity to do robust alignment.
Mixture-of-experts with safety routing. Emerging strategy where problematic queries route to specialized models. Still in research, no mature deployment.
Constitutional AI isn't the only technical solution. It's Anthropic's bet — and it's the bet most coherent with the underlying position: that alignment is an architectural problem, not an application-layer one.
Interpretability mechanisms
A less visible but strategically important component of Anthropic's research is mechanistic interpretability — understanding what the model is computing internally, not just what outputs it produces.
In 2024 the team published work on features (internal representations) extracted from Claude 3 Sonnet using sparse autoencoder techniques. The relevant result: they were able to identify and manipulate individual features, such as a "user attachment" feature, a "buggy code" feature, and a "Golden Gate Bridge" feature. This is early research, but it points to a future where you can understand why a model says what it says.
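A toy sketch of the idea, with no claim to match the published setup: a sparse autoencoder maps an activation vector to non-negative feature activations and reconstructs the vector as a weighted sum of feature directions. The weights below are hand-picked for illustration rather than trained, and `steer` is a hypothetical name for clamping one feature before decoding.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def encode(x: Vector, w_enc: Matrix, b_enc: Vector) -> Vector:
    """Feature activations: ReLU(W_enc @ x + b_enc)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_enc, b_enc)
    ]


def decode(features: Vector, w_dec: Matrix) -> Vector:
    """Reconstruction: sum of each feature activation times its decoder direction."""
    dim = len(w_dec[0])
    return [sum(f * direction[d] for f, direction in zip(features, w_dec)) for d in range(dim)]


def steer(x: Vector, w_enc: Matrix, b_enc: Vector, w_dec: Matrix,
          feature_index: int, value: float) -> Vector:
    """Clamp one feature to a chosen value, then decode back to activation space."""
    features = encode(x, w_enc, b_enc)
    features[feature_index] = value
    return decode(features, w_dec)


# Hand-picked toy weights: 2-dimensional activations, 3 features.
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B_ENC = [0.0, 0.0, 0.0]
W_DEC = [[1.0, 0.0], [0.0, 1.0], [-0.5, -0.5]]  # one row per feature direction

x = [2.0, 0.0]
print(encode(x, W_ENC, B_ENC))              # [2.0, 0.0, 0.0]
print(steer(x, W_ENC, B_ENC, W_DEC, 1, 3.0))  # [2.0, 3.0]
```

Only one of the three features fires for this input, which is the "sparse" part; clamping a feature that was silent changes the reconstructed activation along that feature's direction.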
Why does it matter? Because if alignment is interpretable, it's debuggable. If it's debuggable, it's improvable in directed ways. And the total cost of operating AI systems in production depends heavily on how much time engineers spend guessing why the model behaved oddly in an edge case.
Ecosystem strategy: MCP and agents
Two strategic moves by Anthropic deserve technical analysis.
Model Context Protocol (MCP). Launched November 2024 as an open protocol. Defines how applications (MCP servers) expose capabilities to clients (Claude or other LLMs). Strategic importance: it standardizes integration. Before MCP each integration was custom; with MCP, third parties can build servers that work with any compatible client. Anthropic gains an ecosystem without having to develop each integration itself. It's the platform play — analogous to how standard HTTP let the web grow without a central actor approving each site.
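To give a feel for what standardized integration looks like on the wire, here is a toy sketch of a client calling a tool on a server. MCP messages follow JSON-RPC 2.0, and `tools/list` and `tools/call` are method names from the protocol; everything else (the in-process dispatcher, the `word_count` tool, the simplified result shape) is a hypothetical stand-in, not the official SDK or a complete implementation.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical tool registry for a toy server.
TOOLS: Dict[str, Callable[..., str]] = {
    "word_count": lambda text: str(len(text.split())),
}


def make_request(request_id: int, method: str, params: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 request."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": method, "params": params})


def handle_request(raw: str) -> str:
    """Toy server: list the registered tools or call one by name."""
    request = json.loads(raw)
    if request["method"] == "tools/list":
        result: Dict[str, Any] = {"tools": [{"name": name} for name in TOOLS]}
    elif request["method"] == "tools/call":
        tool = TOOLS[request["params"]["name"]]
        text = tool(**request["params"]["arguments"])
        result = {"content": [{"type": "text", "text": text}]}
    else:
        # -32601 is the JSON-RPC 2.0 code for "Method not found".
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request["id"], "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})


call = make_request(
    1, "tools/call",
    {"name": "word_count", "arguments": {"text": "open protocols scale ecosystems"}},
)
reply = json.loads(handle_request(call))
print(reply["result"]["content"][0]["text"])  # 4
```

The client needs to know only the message format, not the tool's internals, which is what lets third parties ship servers that any compatible client can use.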
Computer Use. Launched October 2024 as an agent capability. The implementation combines vision (model processes screenshots), reasoning (model decides next step), and execution (model emits mouse and keyboard commands). The obvious critique is that it's slow and expensive compared to direct APIs. The product response is that it doesn't compete with direct APIs — it competes with manual human workflows in applications that don't expose APIs. That's a much larger market surface.
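The observe, decide, act cycle described above can be sketched as a simple loop. This is a schematic with stand-ins, not the actual Computer Use API: the `Action` type, the function names, and the string-based "screen" are all hypothetical, and a real system would pass screenshots to the model and drive a real mouse and keyboard.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    detail: str = ""


def run_agent(
    take_screenshot: Callable[[], str],
    decide: Callable[[str, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Loop: observe the screen, let the model pick an action, execute it."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = decide(screenshot, history)
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history


# Toy environment: the "screen" is just a state label.
screen = {"state": "empty_form"}


def take_screenshot() -> str:
    return screen["state"]


def decide(screenshot: str, history: List[Action]) -> Action:
    """Stand-in for the model's reasoning step."""
    if screenshot == "empty_form":
        return Action("type", "invoice number")
    if screenshot == "filled_form":
        return Action("click", "Submit")
    return Action("done")


def execute(action: Action) -> None:
    """Stand-in for emitting keyboard and mouse commands."""
    screen["state"] = "filled_form" if action.kind == "type" else "submitted"


steps = run_agent(take_screenshot, decide, execute)
print([a.kind for a in steps])  # ['type', 'click', 'done']
```

Each pass through the loop costs one model call on a fresh screenshot, which is why the approach is slower and more expensive than a direct API call, and why `max_steps` acts as a budget.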
Both moves are consistent with a single strategy: position Claude not as an isolated model but as an orchestration layer between human and software.
Capital and institutional survival
Anthropic raised significant capital: successive rounds totaling roughly $8 billion through April 2026, with principal investors including Google, Amazon, Salesforce, and others. This positions it with several years of runway for research and infrastructure.
The legitimate question about institutional survival isn't financial — it's competitive. Can a company with ~500 people maintain innovation velocity against competitors with 5,000 to 10,000? The empirical answer over the last five years is yes: per-person velocity at Anthropic is substantially higher than per-person velocity at the large AI corporations, which suggests size isn't the relevant metric.
The relevant metric is cultural cohesion and decision velocity. On both, Anthropic has a structural advantage over much larger incumbents. It's the same advantage successful startups have historically had over incumbents in other sectors.
Editorial thesis
I'll close with a thesis that goes past reporting and into evaluation.
The AI sector is going to stratify over the next five years into at least three distinct layers:
- Raw capability layer: ever-larger models, where the winner is whoever has the most compute and data.
- Mass integration layer: AI inside everyday products, where the winner is whoever already owns the distribution (operating systems, office suites, browsers).
- Professional reliability layer: tools that critical work can be delegated to at scale, where the winner is whoever builds with the technical discipline and consistency of purpose that market requires.
Anthropic bets on the third layer. It's a narrower bet than competitors fighting on all three at once, but it's the bet where the moat isn't scale but accumulated trust. And it's the layer where the customer — the professional using the tool to produce work their reputation depends on — has higher willingness to pay for quality and lower price sensitivity.
That's the bet that defines Anthropic. Five years after founding, that bet is paying off. And the trajectory suggests it will keep paying off.
What's your personal empirical test for deciding whether an AI tool is reliable enough to occupy the default slot in your professional work?