Why Your AWS Bill Keeps Growing (And How to Fix It)

April 2, 2026 · 7 min read

Tags: AWS, Development Cost, DevOps, Software Development, Software Architecture

You open Cost Explorer and the trend is clear: up and to the right, every single month. You've set billing alerts. You've read the AWS docs. You've maybe even upgraded your Support plan to unlock the full set of Trusted Advisor checks. But the bill keeps climbing.

Here's the thing most cost optimization guides won't tell you: by the time your AWS bill becomes a problem, the decisions that caused it were made months or years ago. The infrastructure is just faithfully executing the architecture you gave it.

According to Flexera's 2025 State of the Cloud Report, organizations waste 30-40% of their cloud spend on idle, oversized, or underutilized resources. Global cloud waste is on track to hit $44.5 billion in 2025. That's not a monitoring problem. That's an architecture problem.

These are the six decisions that cause it.

1. You Lifted and Shifted Instead of Re-Architecting

The fastest migration is almost never the cheapest one over time.

When teams move workloads from on-premises to EC2, they typically carry over the same sizing assumptions they used for physical servers. On-prem servers are provisioned for peak load because you buy hardware upfront. On AWS, you pay per hour. Running a server sized for Black Friday traffic on a quiet Tuesday in March is pure waste.

A production workload running on an oversized m6i.4xlarge when an m6i.xlarge would handle 95% of your traffic costs roughly $400 more per month per instance, before you factor in EBS, data transfer, and related services. Multiply that across your fleet and you're paying a premium for comfort, not performance.
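The arithmetic behind that figure is easy to reproduce. The sketch below is a rough illustration only: the hourly rates are assumed us-east-1 Linux on-demand prices (check the current pricing page for your region), and `monthly_oversizing_cost` is a hypothetical helper, not an AWS API.

```python
HOURS_PER_MONTH = 730  # average hours in a month

# Assumed us-east-1 Linux on-demand rates in USD per hour; verify before relying on them.
ASSUMED_HOURLY_RATE = {
    "m6i.xlarge": 0.192,
    "m6i.4xlarge": 0.768,
}


def monthly_oversizing_cost(current: str, right_sized: str, instances: int = 1) -> float:
    """Extra monthly compute spend from running `current` instead of `right_sized`."""
    delta_per_hour = ASSUMED_HOURLY_RATE[current] - ASSUMED_HOURLY_RATE[right_sized]
    return round(delta_per_hour * HOURS_PER_MONTH * instances, 2)


if __name__ == "__main__":
    # One oversized instance, then a ten-instance fleet with the same mistake.
    print(monthly_oversizing_cost("m6i.4xlarge", "m6i.xlarge"))
    print(monthly_oversizing_cost("m6i.4xlarge", "m6i.xlarge", instances=10))
```

Under these assumed rates a single instance lands a little above $400 a month, and the number scales linearly with fleet size.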

The fix isn't always a full re-architecture. Start with AWS Compute Optimizer. Run it for 14 days. Most teams find 20-30% in savings on compute alone just by right-sizing.

2. Your Data Is Traveling Farther Than It Needs To

Data transfer costs are one of the most underestimated line items on any AWS bill. For data-heavy platforms, they can represent 25-35% of total monthly spend.

The problem compounds quickly when your architecture wasn't designed with data locality in mind. A common example: application servers in us-west-2 querying a database in us-east-1 because the database was set up first and nobody thought twice about it. That cross-region traffic costs $0.02 per GB. At 100TB per month, that's $2,000 in transfer fees before you've done anything useful with the data.

Subtler but just as damaging: workloads in private subnets reaching AWS services through a NAT Gateway instead of VPC endpoints. A NAT Gateway charges $0.045 per GB processed, on top of its hourly fee. Routing S3 and DynamoDB traffic through NAT when a free VPC gateway endpoint would do the same job is one of the most common unnecessary charges in AWS accounts.
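To put numbers on your own data flows, a small calculator is enough. This is a minimal sketch using the per-GB rates quoted above; it assumes decimal terabytes (1 TB = 1,000 GB) and ignores the NAT Gateway's hourly charge. The function names are illustrative.

```python
GB_PER_TB = 1000  # decimal terabytes, an assumption that matches the figures above

# Per-GB rates quoted in this article, in USD.
CROSS_REGION_PER_GB = 0.02
NAT_PROCESSING_PER_GB = 0.045
GATEWAY_ENDPOINT_PER_GB = 0.0  # S3 and DynamoDB gateway endpoints carry no data charge


def monthly_transfer_cost(tb_per_month: float, rate_per_gb: float) -> float:
    """Monthly cost of moving `tb_per_month` terabytes at a flat per-GB rate."""
    return round(tb_per_month * GB_PER_TB * rate_per_gb, 2)


def nat_to_endpoint_savings(tb_per_month: float) -> float:
    """Monthly data-processing charge avoided by moving S3/DynamoDB traffic off NAT."""
    via_nat = monthly_transfer_cost(tb_per_month, NAT_PROCESSING_PER_GB)
    via_endpoint = monthly_transfer_cost(tb_per_month, GATEWAY_ENDPOINT_PER_GB)
    return round(via_nat - via_endpoint, 2)


if __name__ == "__main__":
    print(monthly_transfer_cost(100, CROSS_REGION_PER_GB))  # the cross-region example
    print(nat_to_endpoint_savings(10))                      # 10 TB of S3 traffic via NAT
```

Feed it the volumes from your own flow map; the point is to see which hop dominates before deciding what to move.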

Before you add another caching layer, map your data flows. Where does the data originate, where does it go, and what does it cross on the way?

3. You're Using On-Demand Pricing for Predictable Workloads

On-demand is the right pricing model for unpredictable workloads. Most production workloads are not unpredictable.

If your API servers run 24/7, if your RDS instance is always on, if your background job workers run on a consistent schedule, you're paying the on-demand premium for resources that qualify for 30-60% discounts through Savings Plans or Reserved Instances.

The hesitation is usually: "What if we need to change the instance type?" AWS Compute Savings Plans solved that. You commit to a dollar amount of compute per hour, not to a specific instance type or region. You get up to 66% off on-demand rates and keep flexibility.

The rule of thumb: if a resource has been running for more than 30 days and you expect it to keep running, it's a Savings Plan candidate. For development and staging environments, the answer is the opposite: shut them down outside business hours with AWS Instance Scheduler. Cutting a t3.large dev box from 730 hours a month to 200 removes 530 billable hours. At the us-east-1 Linux on-demand rate of about $0.083 per hour, that saves roughly $44 per instance per month. Across 20 dev environments, that's about $10,600 a year.

4. CloudWatch Logs Is Storing Everything Forever

Observability is important. Retaining 18 months of verbose debug logs in CloudWatch at $0.03 per GB per month is not observability. It's expensive archiving.

Most teams set up CloudWatch logging once during development, enable verbose logging for debugging, and never revisit it. Those logs keep ingesting and storing. CloudWatch log ingestion costs $0.50 per GB. For a system that generates 50GB of logs per day, that's $25 per day, or roughly $750 per month, in ingestion alone before storage.

This is fixable with two changes. First, set a log retention policy on every log group. 7 days for debug, 30 days for application logs, 90 days for security and audit logs is a reasonable starting point. Second, use structured logging and log levels properly. A production system that's still logging at DEBUG level is generating 5-10x the log volume it needs to.
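The retention half of that fix can be scripted. The sketch below assumes a hypothetical naming convention in which the tier ("debug", "audit", or "security") appears as a path segment of the log group name; anything else is treated as an application log. The `client` argument is expected to behave like a boto3 CloudWatch Logs client, whose `put_retention_policy` call takes `logGroupName` and `retentionInDays`.

```python
# Retention tiers from this article, in days.
RETENTION_DAYS = {"debug": 7, "application": 30, "audit": 90}


def classify(log_group_name: str) -> str:
    """Map a log group name to a tier using an assumed naming convention."""
    segments = log_group_name.lower().strip("/").split("/")
    if "debug" in segments:
        return "debug"
    if "audit" in segments or "security" in segments:
        return "audit"
    return "application"


def apply_retention(client, log_group_names):
    """Set a retention policy on every named log group; return what was applied."""
    applied = {}
    for name in log_group_names:
        days = RETENTION_DAYS[classify(name)]
        # Same keyword arguments as boto3's logs.put_retention_policy.
        client.put_retention_policy(logGroupName=name, retentionInDays=days)
        applied[name] = days
    return applied
```

In real use you would pass `boto3.client("logs")` and collect the names from its `describe_log_groups` paginator. Taking the client as a parameter also makes it easy to dry-run the classification against a stub first.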

AWS introduced tiered ingestion pricing for Lambda logs in May 2025, but the more important fix is architectural: decide what you actually need to keep and for how long before you start writing logs, not after the bill arrives.

5. Your Storage Has No Lifecycle Policy

S3 looks cheap at $0.023 per GB in Standard storage. That number is misleading when you're storing application logs, database backups, user uploads, and build artifacts at the same tier.

A 1TB bucket with versioning enabled can silently become 5TB over time as old versions accumulate. If that bucket holds infrequently accessed backup archives, you're paying Standard storage rates for data you'll access maybe once a year.

The fix is S3 storage class lifecycle policies. Lifecycle rules act on object age, so a typical rule moves objects to Standard-IA at $0.0125 per GB once they are 30 days old, then to Glacier Instant Retrieval at $0.004 per GB after 90 days. Deep Archive costs $0.00099 per GB. Compared with Standard, Deep Archive is roughly a 95% reduction on the cost of data you're keeping for compliance but never touching.
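As a rough sketch, a rule like that can be written as the dictionary boto3's `put_bucket_lifecycle_configuration` expects. Note that lifecycle transitions are driven by object age, not by last access. The 30- and 90-day steps come from the text above; the `backups/` prefix, the 365-day Deep Archive step, and the 90-day cleanup of noncurrent versions are assumptions to adapt, and `build_lifecycle_configuration` is a hypothetical helper.

```python
import json


def build_lifecycle_configuration(prefix: str = "backups/") -> dict:
    """Build an S3 lifecycle configuration that tiers objects down as they age."""
    return {
        "Rules": [
            {
                "ID": "tier-down-cold-objects",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    # Assumption: archive after a year; pick what compliance requires.
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # Assumption: stop old versions from piling up on versioned buckets.
                "NoncurrentVersionExpiration": {"NoncurrentDays": 90},
            }
        ]
    }


if __name__ == "__main__":
    # Review the JSON before applying it with
    # s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config).
    print(json.dumps(build_lifecycle_configuration(), indent=2))
```

Scoping the rule to a prefix keeps frequently read objects, such as user uploads, out of the colder tiers, where retrieval fees and minimum storage durations would eat the savings.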

Run S3 Storage Class Analysis on your largest buckets for at least 30 days. It shows which prefixes and tagged object groups are rarely accessed and are therefore candidates for cheaper storage tiers, and then you can automate the transitions with a lifecycle rule.

6. Nobody Owns the Cost

This is the one the technical articles skip. Architecture alone isn't the whole story.

AWS cost growth often isn't a technical failure. It's an organizational one. When nobody on the team has explicit responsibility for cloud costs, every engineer makes locally reasonable decisions that collectively produce a spiraling bill. Spin up a dev environment to test something, forget to shut it down. Create an RDS snapshot for safety, never clean up old ones. Add detailed logging for a bug investigation, never remove it.

The fix here is simple but requires discipline. Tag everything with owner, environment, and cost center. Enable AWS Cost Anomaly Detection so unusual spend surfaces within hours, not at the end of the month. Set per-team budget alerts at 80% and 100% of expected spend. And review the Cost Explorer top-10 services in your weekly engineering sync, not just when the finance team emails.
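Two of those checks are simple enough to sketch in a few lines. The example below assumes tags have already been pulled into plain dictionaries (for instance from a resource inventory export), and the exact tag keys, including the `cost-center` spelling, are an assumption. Both helpers are hypothetical, not AWS APIs.

```python
REQUIRED_TAGS = ("owner", "environment", "cost-center")  # key spelling is an assumption


def missing_tags(resource_tags: dict[str, str]) -> list[str]:
    """Return the required tag keys that are absent or blank on a resource."""
    present = {key.lower() for key, value in resource_tags.items() if value.strip()}
    return [tag for tag in REQUIRED_TAGS if tag not in present]


def budget_alerts(expected_spend: float, actual_spend: float,
                  thresholds: tuple[float, ...] = (0.8, 1.0)) -> list[float]:
    """Return the budget thresholds (as fractions of expected spend) already crossed."""
    return [t for t in thresholds if actual_spend >= expected_spend * t]


if __name__ == "__main__":
    print(missing_tags({"Owner": "platform-team", "Environment": "staging"}))
    print(budget_alerts(expected_spend=10_000, actual_spend=8_500))
```

In practice AWS Budgets and tag policies do this for you. A script like this is mostly useful as a CI check that fails a deploy when a resource definition is missing an owner.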

At NUS Technology, when we take on platform modernization projects, we consistently find that the biggest cost wins come from combining governance changes with architecture fixes simultaneously. You can optimize an instance type, but if the process that created the oversized instance hasn't changed, it'll be back in three months. Both problems have to be solved at once.

What the Compounding Effect Actually Looks Like

These six issues don't exist in isolation. They compound. A lift-and-shift migration (issue 1) into a multi-region setup without traffic modeling (issue 2) with on-demand pricing (issue 3) and no log lifecycle policies (issue 4) can turn a $5,000/month AWS bill into $18,000/month by the time a team reaches growth stage, without adding a single new user-facing feature.

We see this pattern regularly with teams that come to us after scaling. The infrastructure grew with the product, but the architecture decisions from year one never got revisited. Our complex system integration work often includes an infrastructure audit as a first step because the cost profile of the system usually tells you a lot about the technical debt you're about to walk into.

Fixing it doesn't require a full re-architecture. Start with the top two line items in your Cost Explorer, trace them back to the architectural decision that created them, and fix at that level. The tactical fixes matter, but the only way to stop the cycle is to change the design decision that produced the cost.

Frequently Asked Questions

How do I quickly find out what's driving my AWS bill?

Open AWS Cost Explorer and filter by service for the last 3 months. Sort by cost. Your top 3 services typically account for 70-80% of spend. Then drill into each one and filter by usage type to see exactly which resource category is largest. For most web applications, EC2, RDS, and data transfer will dominate. That's where you start.
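The same breakdown is available programmatically. The sketch below assumes you already hold the response from the Cost Explorer `get_cost_and_usage` call, requested with `Metrics=["UnblendedCost"]` and `GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}]`; it only aggregates that response, so it runs without credentials. The helper names are illustrative.

```python
from collections import defaultdict


def cost_by_service(response: dict) -> dict[str, float]:
    """Sum UnblendedCost per service across every period in the response."""
    totals: dict[str, float] = defaultdict(float)
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            # Cost Explorer returns amounts as strings.
            totals[service] += float(group["Metrics"]["UnblendedCost"]["Amount"])
    return dict(totals)


def top_services(response: dict, n: int = 3) -> list[tuple[str, float, float]]:
    """Return (service, cost, share of total) for the n most expensive services."""
    totals = cost_by_service(response)
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
    return [
        (service, round(cost, 2), round(cost / grand_total, 3) if grand_total else 0.0)
        for service, cost in ranked
    ]
```

With boto3, the response would come from `boto3.client("ce").get_cost_and_usage(...)` with a three-month `TimePeriod` and `Granularity="MONTHLY"`. If the top three shares add up to well under 70%, your spend is more fragmented than usual and the audit will take longer.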

Are Reserved Instances or Savings Plans better for reducing compute costs?

Savings Plans are more flexible and the better default choice for most teams. With a Compute Savings Plan, you commit to a specific dollar amount of compute per hour across any instance type, OS, or region, rather than committing to a specific instance configuration. If your workload mix changes, the plan adapts. Reserved Instances still make sense for steady RDS workloads where the instance type is unlikely to change.

Does moving to serverless (Lambda) always reduce costs?

No. Lambda is cheaper for sporadic, event-driven workloads with low invocation volume. For workloads running near-constantly, containers on ECS or a right-sized EC2 instance behind a Savings Plan will almost always cost less. The mistake teams make is choosing Lambda for architectural reasons and then discovering the cost profile at scale is worse than a traditional server would have been.
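A back-of-the-envelope comparison makes the crossover visible. This is a simplified sketch, not a full cost model: the Lambda prices are assumed us-east-1 x86 list prices with the free tier ignored, the always-on alternative is reduced to one flat monthly figure you supply, and data transfer, API Gateway, and operational costs are left out.

```python
# Assumed us-east-1 x86 Lambda list prices in USD; the free tier is ignored.
LAMBDA_PER_MILLION_REQUESTS = 0.20
LAMBDA_PER_GB_SECOND = 0.0000166667


def cost_per_invocation(avg_duration_ms: float, memory_mb: int) -> float:
    """Request charge plus compute charge for a single invocation."""
    gb_seconds = (avg_duration_ms / 1000) * (memory_mb / 1024)
    return LAMBDA_PER_MILLION_REQUESTS / 1_000_000 + gb_seconds * LAMBDA_PER_GB_SECOND


def lambda_monthly_cost(invocations: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimated monthly Lambda bill for a steady invocation volume."""
    return round(invocations * cost_per_invocation(avg_duration_ms, memory_mb), 2)


def break_even_invocations(flat_monthly_cost: float, avg_duration_ms: float,
                           memory_mb: int) -> int:
    """Monthly invocations above which Lambda costs more than the flat alternative."""
    return int(flat_monthly_cost / cost_per_invocation(avg_duration_ms, memory_mb))


if __name__ == "__main__":
    # Hypothetical workload: 200 ms at 1 GB, compared with a $140/month always-on server.
    print(lambda_monthly_cost(1_000_000, 200, 1024))
    print(break_even_invocations(140.0, 200, 1024))
```

The break-even only means something if the flat option can actually serve that volume, so treat the result as a prompt to measure, not a verdict.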

How much can I realistically save by fixing these architecture issues?

Most teams running AWS without deliberate cost governance are overspending by 30-40% according to Flexera's 2025 research. In practice, teams that address right-sizing, storage lifecycle policies, and pricing model selection simultaneously typically reduce their bills by 25-50% without any reduction in capability. The highest-impact single change is usually switching from on-demand to Savings Plans for baseline compute, combined with shutting down non-production environments outside business hours.

Conclusion

Your AWS bill isn't growing because AWS is expensive. It's growing because six architectural decisions from the past are compounding quietly in the background.

Logging policies, data routing, pricing models, storage tiers, instance sizing, and cost ownership all interact. Fix one without the others and you'll slow the growth but not stop it. Fix all six and you reclaim the 30-40% of spend that's delivering no value to your users.

If you're unsure where to start, the most useful first step is a structured infrastructure audit that connects your bill line items to the decisions that produced them. If you want to talk through what that looks like for your stack, reach out to the NUS Technology team and we can walk through your current setup.