The five cloud cost optimisation mistakes quietly costing enterprises millions

Every six months, somewhere in the building, a finance leader looks at the cloud bill and asks the obvious question: why is this still going up?

What follows is usually a quarter of activity. A FinOps tool gets bought. A team gets stood up. Someone sends a list of 400 unused EBS volumes to a Slack channel. A few thousand dollars get clawed back. Everyone calls it a win.

Six months later, the bill is bigger again.

The reason this pattern repeats — across companies of every size, on every cloud — is that "cost optimisation" has become a tactical exercise pretending to be a strategic one. The visible savings are real. They're just rounding errors compared to the structural decisions nobody is questioning.

Here are the five mistakes we see most often, ranked by how much they tend to cost.

1. Optimising line items instead of architecture

The fastest way to make a cloud bill look smaller is to right-size instances, delete orphaned resources, and buy reserved capacity for whatever you happen to be running. That's the floor of cost optimisation, not the ceiling.

The structural costs — the ones that determine whether your three-year bill is $10M or $40M — were locked in at design time. Did you pick a managed database when a fleet of self-managed nodes would have done? Are you paying premium egress charges because the data plane and control plane are in different regions? Is your event pipeline running on a service that bills per invocation, processing volumes that would have made batch jobs trivial?
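To make the per-invocation question concrete, here is a minimal sketch of a break-even calculation between a pay-per-event design and a batch design with a flat monthly cost. The `PricingModel` type, the function names, and every price in the usage note are hypothetical placeholders for illustration, not vendor rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingModel:
    """Hypothetical monthly pricing for one way of running a pipeline."""
    fixed_monthly: float       # flat cost that does not depend on volume
    per_million_events: float  # variable cost per one million events


def monthly_cost(model: PricingModel, events_per_month: float) -> float:
    """Total monthly cost at a given event volume."""
    return model.fixed_monthly + model.per_million_events * events_per_month / 1_000_000


def break_even_events(per_event: PricingModel, batch: PricingModel) -> float:
    """Monthly event volume above which the batch design is cheaper.

    Returns infinity if the batch design never becomes cheaper.
    """
    rate_gap = per_event.per_million_events - batch.per_million_events
    fixed_gap = batch.fixed_monthly - per_event.fixed_monthly
    if rate_gap <= 0:
        return float("inf")
    return max(0.0, fixed_gap / rate_gap * 1_000_000)
```

With placeholder figures of $20 per million events for the per-invocation service and $1,500/month plus $0.50 per million events for batch, `break_even_events` lands at roughly 77 million events a month. At 500 million events, `monthly_cost` gives $10,000 against $1,750. The actual numbers will differ, but the shape of the comparison is the point: it is a design-time question, and no dashboard flags the more expensive option as waste.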

"You can't right-size your way out of an architecture that was wrong from the start."

None of these show up in a cost-explorer dashboard as "waste." They look like normal usage. But they're the reason the optimisation work always feels like emptying the Atlantic with a bucket.

2. Treating reserved capacity like a procurement decision

Commitments (Reserved Instances, Savings Plans, Committed Use Discounts, whatever the vendor calls them this year) are among the highest-leverage tools in the cloud economic model. Used well, they can cut 30–60% off compute spend. Used badly, they lock you into outdated workloads or the wrong regions for years.

The mistake we see most: treating commitments as a finance/procurement workflow rather than an architectural one. Someone in FP&A buys a three-year all-upfront commitment based on last quarter's run rate, the engineering team migrates that workload to a different instance family in month four, and the commitment becomes a sunk cost the rest of the team has to design around.

The right model is the inverse: engineering owns the commitment portfolio, finance provides the capital and the discipline. You commit to workload patterns you have high conviction about, not to instance types.
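The arithmetic behind the month-four scenario can be sketched in a few lines. This is a simplified model under stated assumptions: the commitment is all-upfront, it covers exactly one workload, and it cannot be exchanged, resold, or applied to anything else once that workload moves. The function names and the figures in the usage note are illustrative.

```python
def break_even_months(term_months: int, discount: float) -> float:
    """Months of real usage needed before a commitment beats on-demand."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return term_months * (1 - discount)


def commitment_outcome(
    on_demand_monthly: float,
    term_months: int,
    discount: float,
    months_used: int,
) -> float:
    """Saving (positive) or loss (negative) versus paying on-demand
    only for the months the workload actually ran on the committed capacity."""
    committed_cost = on_demand_monthly * term_months * (1 - discount)
    on_demand_cost = on_demand_monthly * min(months_used, term_months)
    return on_demand_cost - committed_cost
```

For a workload that would cost $10,000/month on demand, a 36-month commitment at a 50% discount breaks even at 18 months of use. Run it for the full term and `commitment_outcome` returns a $180,000 saving. Migrate away in month four and it returns a $140,000 loss. That gap is why conviction about the workload pattern matters more than the headline discount.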

3. Confusing FinOps tooling with FinOps practice

The tools are good. CloudHealth, Apptio Cloudability, native AWS/Azure/GCP cost tools, the open-source projects — all of them surface real waste, all of them produce credible-looking dashboards. None of them, on their own, will reduce your cloud bill.

FinOps is a practice, not a product. The savings come from people having uncomfortable conversations: a platform engineer telling a product team that their feature isn't worth what it costs to run. A CTO telling a board that they're going to accept slightly worse latency to halve a regional egress bill. Those conversations don't happen because there's a dashboard. They happen because someone owns the outcome and has the authority to enforce it.

The pattern we see: the FinOps initiatives that actually compound share a single named senior leader who reports cloud unit economics to the executive team monthly, and whose performance is measured on it. Tools follow. Tools never lead.

4. Letting "we'll optimise it later" become permanent

"We'll optimise after launch" is a perfectly reasonable thing to say while shipping. It becomes pathological when the architecture choices made for speed-to-market harden into the production reality that a 200-person engineering org now depends on.

By the time anyone has the political capital to revisit those choices, the cost of changing them — re-platforming a critical system, breaking client contracts that assumed a certain architecture — is several multiples of the cumulative waste. So the workaround becomes the system, and the system becomes the budget line.

The fix isn't to optimise earlier. It's to price the technical debt at the point of incurring it. When a team chooses speed over efficiency, that decision should come with a quantified estimate of the recurring cost, an owner, and a review date. Most of the time, the answer will still be "ship it" — and that's fine. The point is that the bill is on the books.
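One way to put the bill on the books is a small register of these decisions. The sketch below is a hypothetical structure, not a standard one: the field names, the `CostDebtEntry` type, and the whole-month accrual rule are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CostDebtEntry:
    """One speed-over-efficiency decision, priced when it was made."""
    decision: str
    owner: str
    extra_monthly_cost: float  # estimated recurring cost of the shortcut
    incurred_on: date
    review_on: date

    def accrued_cost(self, as_of: date) -> float:
        """Estimated cumulative cost, counted in whole calendar months."""
        months = (as_of.year - self.incurred_on.year) * 12 + (
            as_of.month - self.incurred_on.month
        )
        return max(0, months) * self.extra_monthly_cost


def overdue_reviews(register: list[CostDebtEntry], as_of: date) -> list[CostDebtEntry]:
    """Entries whose review date has arrived or passed."""
    return [entry for entry in register if entry.review_on <= as_of]
```

An entry recorded in January with an estimated $4,000/month penalty and a July review date shows $24,000 accrued by late July and appears in `overdue_reviews`. The decision may still be "keep it", but it is now a visible decision with an owner, not a default.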

5. Ignoring the AI cost curve until it's too late

Every cloud cost playbook written before 2023 has a hole in it. AI workloads — training, inference, agentic systems with their fanned-out tool calls and retries — don't follow the same economic patterns as traditional cloud compute, and a lot of FinOps practices haven't caught up.

A few specific traps:

Per-token pricing where the bill scales with usage rather than with anything you provisioned. A pilot that costs $200/month can become a production line item of $80,000/month at the same per-call cost, purely on a 400x rise in call volume, and look "in line" the entire way up.
Agentic systems with no cost ceilings. An agent that can call tools recursively can rack up an unbounded bill in a single user session if the loop logic isn't constrained.
GPU commitments locked in before the model strategy is locked in. Buying multi-year reserved GPU capacity in 2026 against a model architecture you'll have replaced by 2027 is the new long-term mainframe lease.
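The second trap, agents with no cost ceiling, is the easiest to illustrate. The sketch below shows a per-session budget guard that refuses the next call before a spend or call-count limit is crossed. `SessionBudget`, `BudgetExceeded`, and `run_agent` are hypothetical names, and the loop is a stand-in for a real agent: it assumes the cost of each call can be estimated before the call is made.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the next call would cross a session limit."""


class SessionBudget:
    def __init__(self, max_cost_usd: float, max_calls: int) -> None:
        self.max_cost_usd = max_cost_usd
        self.max_calls = max_calls
        self.spent_usd = 0.0
        self.calls = 0

    def charge(self, cost_usd: float) -> None:
        """Record one model or tool call, or raise before a limit is crossed."""
        if self.calls + 1 > self.max_calls:
            raise BudgetExceeded("call limit reached")
        if self.spent_usd + cost_usd > self.max_cost_usd:
            raise BudgetExceeded("spend limit reached")
        self.calls += 1
        self.spent_usd += cost_usd


def run_agent(step_costs: list[float], budget: SessionBudget) -> int:
    """Stand-in agent loop: each item is the estimated cost of the next call.

    Returns how many calls completed before the budget stopped the session.
    """
    completed = 0
    try:
        for cost in step_costs:
            budget.charge(cost)  # check the ceiling before doing the work
            completed += 1
    except BudgetExceeded:
        pass  # a real system would log this and return a partial result
    return completed
```

With a $1.00 ceiling and calls estimated at $0.30 each, the loop stops after three calls however many more the agent wanted to make. Checking before the call, not after, is the design choice that keeps a runaway loop from overshooting the limit.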

The discipline here is the same as classical FinOps — observability, ownership, unit economics — but the variables are moving faster. We treat AI cost governance as a separate practice with its own playbook, not a footnote on the FinOps deck.

So what actually works?

The optimisation programmes we've seen deliver durable, compounding savings — the kind that show up on a P&L two years later, not just in a slide deck — share four characteristics:

Architecture review precedes line-item review. Before you tune, you ask whether the workload should exist in this shape at all.
Senior ownership with teeth. One named executive, monthly reporting, performance tied to unit economics.
Engineering-owned commitments. The teams that run the workloads decide what to commit to, with finance providing the capital and the rigour.
AI workloads governed separately. Different cost curves, different controls, different cadence.

None of this is glamorous. None of it makes a good keynote slide. But it's the difference between a cost optimisation project and a cost-optimised company.

If your last optimisation effort delivered a number that felt smaller than the noise in your bill, this is probably why.