When people talk about AI today, they almost always blur two different things into one. They say “I use ChatGPT” when they actually mean “I use the ChatGPT client to talk to an OpenAI model.” They say “Claude told me…” when they mean “I used the Claude app to query the Claude Sonnet model.” Same with Gemini from Google, Copilot from Microsoft, and every other consumer-facing AI brand.
This shorthand is harmless in conversation. It becomes a real problem when IT leaders use the same shorthand to plan budgets, evaluate vendors, and build an organizational AI strategy.
The client and the model are two separate products. They get sold together, priced together, and marketed together — but they have different cost curves, different lock-in profiles, and different futures. If you treat them as one thing, you’re going to make decisions that look fine today and hurt you in twelve months.
Here’s why the distinction matters, and what to do about it.
Table of Contents
- The two layers, plainly
- What the client actually gives you
- The pricing trap
- Why this matters strategically
- What to do this quarter
The two layers, plainly
Think of it like email. Outlook is a client. SMTP and Exchange are the underlying protocols and servers. You can run Outlook against Exchange, Gmail, or anything else that speaks the right protocols. The client gives you the interface, the workflows, the integrations. The server does the actual work of moving mail.
AI works the same way:
- The model does the inference — GPT-5, Claude Opus 4.7, Gemini 3, Llama, and so on. These are the underlying engines.
- The client is what wraps the model in a usable product — chat history, file upload, memory, productivity integrations, agent tooling, MCP support, organizational controls. When you subscribe to Claude Pro or ChatGPT Plus, you’re buying a bundle: access to specific models and the client experience that vendor built around them. The bundle is convenient, which is exactly why most people never separate the two in their head.
What the client actually gives you
If the model is the engine, the client is everything else that makes the engine useful at work. The major AI clients today all compete on roughly the same axes:
Memory and context. The ability to remember past conversations, user preferences, and organizational facts across sessions. This is a client feature. The model itself is stateless.
Productivity integrations. Connectors into Microsoft 365, Google Workspace, SharePoint, OneDrive, Gmail, AWS services, and so on. Anthropic, OpenAI, and Google have each been racing to build these out. None of them are properties of the model — they’re properties of the client.
Tool use and MCP. Model Context Protocol is a relatively new open standard that lets AI clients call external tools in a consistent way. Vendors publish MCP servers for their own services (GitHub, Asana, Slack, and many others), and the developer community publishes thousands more on GitHub. MCP support is a client capability. A model can be trained to use tools well, but the plumbing that exposes real tools to it lives in the client.
Surface area. Vendors are no longer just shipping a chat window. Anthropic’s Claude for Excel, Word, and PowerPoint add-ins put Claude directly inside Microsoft 365 apps. Visual Studio Code has matured into a serious AI client in its own right, with first-party extensions from OpenAI, Anthropic, Google, and Microsoft’s own GitHub Copilot. The same underlying model can show up in five different clients, each tuned for a different job.
This last point is the one most people miss. The model you use and the place you use it from are independent choices.
The pricing trap
Here’s where the client-versus-model distinction stops being academic and starts costing real money.
Most AI clients today are sold as per-user, per-month subscriptions. Claude Pro, ChatGPT Plus, Copilot Pro — flat fee, generous-enough usage caps, predictable invoice. That predictability is the whole reason these plans dominate at the consumer and small-team level.
But the model layer underneath those clients is priced by tokens. Every input character, every output character, every cached context — all metered. The vendor is paying token rates to the model provider (or absorbing them internally) while charging you a flat rate. When models get heavier, when agents take multi-step actions, when context windows grow — that gap widens.
Something has to give. And it’s starting to.
GitHub announced that on June 1, 2026, Copilot plans transition from premium-request-based billing to a usage-based model called GitHub AI Credits, with 1 AI Credit equaling $0.01. Per-seat prices stay the same — Copilot Pro at $10/month with $10 in credits, Pro+ at $39/month with $39 in credits, Business at $19/user, Enterprise at $39/user. Code completions and Next Edit Suggestions stay unlimited. But Copilot Chat, the CLI, the cloud agent, Spaces, and third-party coding agents now draw from your credit pool based on actual token consumption.
For light users this changes nothing. For developers running multi-hour agentic coding sessions against frontier models, the math gets uncomfortable. Anthropic’s Opus 4.7, for example, jumps from a 7.5x request multiplier to a 27x multiplier for annual-plan subscribers who stay on the legacy request-based billing. Heavy agentic users on monthly plans will simply burn through their included credits faster and either cap their usage or pay overage.
GitHub is the first major AI client to make this shift publicly. They will not be the last. Anyone building an AI strategy on the assumption that flat-rate seats will stay flat-rate forever is building on sand.
Why this matters strategically
Now we can answer the question this post opened with: why does it matter that IT leaders separate the client from the model?
Three reasons, and they compound.
Procurement and cost control. When you treat “Claude” or “Copilot” as a single thing, you can only evaluate it as a single line item. When you separate them, you start asking sharper questions. Which models does this client expose, and at what cost? Can I bring my own model via the API? What happens to my bill if the vendor switches to metered pricing — am I structurally a heavy user or a light one? Are my power users on the right client for their workload, or am I paying for a productivity client when what they really need is API access?
Architecture and lock-in. Clients lock you in far more than models do. Models are increasingly substitutable — most serious applications can swap GPT-5 for Claude Opus or Gemini 3 with modest re-prompting. But the client is where your memory, your integrations, your agents, your team workflows, and your data residency settings all live. Moving off a client is hard. Moving between models is comparatively easy. Plan accordingly: invest in clients (or build your own) where switching cost is acceptable, and stay model-agnostic where you can.
Organizational AI roadmap. The pattern I see in mid-market and enterprise IT is leaders rolling out “ChatGPT for everyone” or “Copilot for everyone” as if that single decision settles the AI question for the year. It doesn’t. A finance team that lives in Excel needs the Claude or Copilot Excel add-in. A development team needs Copilot or Claude Code in their IDE. A research team might be better served by direct API access through a custom internal client. Different surfaces, possibly different models underneath, all coordinated under one strategy. The companies that get the most leverage from AI in the next two years will be the ones that explicitly choose clients per workload rather than picking one bundle and calling it done.
What to do this quarter
A practical starting point:
- Inventory what you’re actually paying for. For each AI subscription in your org, list the client, the underlying model(s) it exposes, the pricing model, and what would happen to your bill if it shifted to usage-based.
- Tag your users by workload, not by license. Knowledge workers, developers, analysts, and executives all use AI differently. A single license type for everyone is almost always wrong.
- Test direct API access for at least one workload. Even running a small internal tool against the Anthropic or OpenAI API will teach your team more about real cost structures than another year of flat-rate seats.
- Watch the pricing announcements. GitHub’s shift is the canary. When the next major vendor follows, you want to already understand which of your use cases are heavy and which are light. The vendors aren’t going to make this distinction for you. Their marketing depends on the bundle. But the bill, eventually, will make the distinction obvious. Better to see it coming.