Best Practices for EC2 Rightsizing
A practical guide to EC2 rightsizing within the same instance family: how much data you need, what thresholds to use, and the guardrails that keep you out of trouble.

Introduction
If you're looking to reduce EC2 costs, downsizing oversized instances is the lowest-risk place to start: no architecture changes, no migration, no AWS cost optimization tool required, just a smaller instance type within the same family.
This post covers how we think about EC2 rightsizing: how much data you need, what to measure, and the guardrails that prevent bad surprises.
Slightly overprovisioned is fine
The goal of EC2 rightsizing isn't to run your instances hot. Some headroom is good. It absorbs traffic spikes, deploy day load, and unplanned surges. The goal is to find instances with way too much headroom and bring them down one notch.
Going down one size within the same family typically halves CPU and memory. An m6i.xlarge (4 vCPUs, 16 GB) becomes an m6i.large (2 vCPUs, 8 GB). There are exceptions, so always check the specific types you're working with, but halving is a useful mental model. It also explains why the thresholds below are what they are: you need enough room for the workload to fit comfortably after losing half its resources.
Pre-flight checks
Before looking at metrics, rule out instances that can't be downsized. Skip anything with instance store volumes (that ephemeral storage gets destroyed on resize) and anything already at the smallest size in its family.
How much data you need
Use at least 30 days of CloudWatch data, preferably 60. You can look back up to 365 days to catch seasonal patterns, but anything older than a year is stale. Make sure the instance actually ran for at least 95% of the observation window so the data represents a real workload, not an instance that was mostly idle.
If you care about short spikes, enable detailed monitoring (1-minute intervals). Basic monitoring uses 5-minute intervals and can smooth over brief peaks.
Avoid acting on data from the first 14 days after a major deploy or a recent upsize. Those metrics won't be representative.
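The 95% uptime requirement can be checked mechanically by comparing how many CloudWatch datapoints came back against how many the window should contain. This is a sketch under assumed names; the function and thresholds are ours, not part of any AWS SDK.

```python
from datetime import timedelta

def sufficient_coverage(datapoint_count: int, window: timedelta,
                        period_seconds: int = 300, min_ratio: float = 0.95) -> bool:
    """Did the instance actually run for >= 95% of the observation window?

    datapoint_count: CloudWatch datapoints returned for the window
    period_seconds: 300 for basic monitoring, 60 for detailed monitoring
    """
    expected = window.total_seconds() / period_seconds
    return datapoint_count / expected >= min_ratio

# 60 days of 5-minute datapoints = 17,280 expected
print(sufficient_coverage(17000, timedelta(days=60)))  # True (~98% coverage)
print(sufficient_coverage(9000, timedelta(days=60)))   # False (~52%)
```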
CPU
Use P99.5 CPU utilization as your primary signal. Most AWS rightsizing advice focuses on averages or P95, but these can miss the peaks that actually cause problems after a downsize. If P99.5 is below 40 percent across the observation window, the instance is a reasonable candidate.
Why P99.5 specifically? It captures nearly all real usage while filtering out the handful of one-off spikes from restarts or deploys that don't reflect actual load. It's less noisy than raw max (which overreacts to blips) but doesn't smooth over the peaks that matter.
Since going down a size roughly doubles utilization, an instance sitting at 40% P99.5 will peak around 80% on the smaller instance. The average will typically be much lower, so there's still headroom for normal variation.
It's worth tracking the average alongside P99.5. The average gives you a feel for typical load, which is useful context when reviewing a change, but it shouldn't be what drives the decision.
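A simple nearest-rank percentile is enough to apply the P99.5 rule, and a synthetic series shows why it beats raw max: a handful of deploy-time spikes dominate the max but barely move P99.5. The data below is made up for illustration.

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: small, dependency-free, fine for threshold checks."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

# Synthetic 5-minute CPU samples: steady ~25%, a few one-off spikes to 90%.
cpu = [25.0] * 990 + [35.0] * 5 + [90.0] * 5

p995 = percentile(cpu, 99.5)
print(f"P99.5 = {p995}%, raw max = {max(cpu)}%")    # P99.5 = 35.0%, raw max = 90.0%
print("candidate" if p995 < 40.0 else "keep size")  # candidate: ~70% projected peak after downsize
```

Raw max would veto this instance at 90%; P99.5 correctly ignores the five spike samples and clears it at 35%.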
Memory
The same threshold applies to memory: P99.5 below 40 percent.
EC2 doesn't publish memory usage by default. You need an agent like the CloudWatch Agent to get this data. If you don't have it, you can still downsize based on CPU alone, but you should be aware that you're flying partially blind. A smaller instance means less RAM, and there's no metric telling you whether that matters.
Even when P99.5 looks fine, watch for sustained swapping (swap in/out activity or consistently high swap used). Swapping means the instance is already under memory pressure regardless of what the percentage says.
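Both memory conditions can be combined into one gate. This sketch assumes the CloudWatch Agent is publishing memory and swap metrics, and the "more than 5% of intervals" cutoff for "sustained" swapping is our assumption, not an AWS guideline.

```python
def memory_downsize_ok(mem_p995: float, swap_in_bytes: list[float],
                       threshold: float = 40.0) -> bool:
    """Memory-side check: P99.5 under the threshold AND no sustained swapping.
    'Sustained' = swap activity in more than 5% of intervals (assumed cutoff)."""
    swapping_intervals = sum(1 for v in swap_in_bytes if v > 0)
    sustained_swap = swapping_intervals > len(swap_in_bytes) * 0.05
    return mem_p995 < threshold and not sustained_swap

print(memory_downsize_ok(32.0, [0.0] * 100))                 # True
print(memory_downsize_ok(32.0, [0.0] * 80 + [4096.0] * 20))  # False: swapping despite low %
```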
Seasonal spike detection
CPU and memory thresholds tell you whether the instance is oversized right now. But some workloads have predictable peaks that only show up at certain times of year: end of quarter traffic, holiday surges, annual batch jobs, or recurring monthly spikes.
If you have 12 months of CloudWatch data, check whether any recurring peak would push CPU or memory past the threshold on a smaller instance. If it would, hold off on the downsize. If you have less than 12 months, just be aware of the blind spot and check again after you've accumulated more history.
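The seasonal check reduces to scanning per-month peaks against the same threshold. The data and function name below are illustrative; per-month P99.5 values would come from your monitoring history.

```python
def risky_months(monthly_peak_p995: dict[str, float],
                 threshold: float = 40.0) -> list[str]:
    """Return months whose P99.5 peak would breach the threshold -- i.e. would
    roughly double past safe headroom on a smaller instance. Empty list means
    no seasonal blocker. Needs ~12 months of history to be meaningful."""
    return [month for month, peak in monthly_peak_p995.items() if peak >= threshold]

peaks = {"Jan": 22.0, "Feb": 21.0, "Mar": 24.0, "Nov": 55.0, "Dec": 58.0}
print(risky_months(peaks))  # ['Nov', 'Dec'] -> hold off on the downsize
```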
Disk and network guardrails
A smaller instance can come with lower network and EBS ceilings.
For EBS, watch for saturation using EBS volume metrics (AWS/EBS), not just CPU. Look at sustained high throughput or IOPS and any persistent queueing. If you're on gp2/st1/sc1, check BurstBalance for burst credit depletion. Verify that the target instance type's EBS throughput limit is comfortably above your observed peak throughput.
For network, check NetworkIn/NetworkOut and packet rates for sustained high utilization.
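The throughput guardrail is a one-line comparison once you've looked up the target type's limits. The 80% margin below is our assumption of "comfortably above", not an AWS recommendation, and the numbers are placeholders.

```python
def throughput_headroom_ok(observed_peak_mbps: float, target_limit_mbps: float,
                           margin: float = 0.8) -> bool:
    """Check the target type's EBS (or network) ceiling against observed peak.
    'Comfortably above' is taken as peak <= 80% of the limit (assumed margin)."""
    return observed_peak_mbps <= target_limit_mbps * margin

# Placeholder numbers -- look up the real limits for your target instance type.
print(throughput_headroom_ok(observed_peak_mbps=400.0, target_limit_mbps=650.0))  # True
print(throughput_headroom_ok(observed_peak_mbps=600.0, target_limit_mbps=650.0))  # False
```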
Burstable instances (T family)
This section only applies to T family instances (t3, t3a, t4g, etc.). If your instances aren't T family, skip ahead.
T instances accumulate CPU credits during low usage and spend them during bursts. Before downsizing a T instance, confirm CPUCreditBalance isn't trending toward zero and that CPUSurplusCreditCharged stays at zero across the lookback. If a workload lives on constant burst credits, it's not a downsizing candidate.
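The two burstable conditions can be checked like this. The trend test is a crude first-half vs second-half average, a sketch rather than a forecast; the 50% collapse cutoff is an assumption.

```python
def t_family_ok(credit_balance: list[float], surplus_charged: list[float]) -> bool:
    """Burstable check over the lookback: no surplus credits charged anywhere,
    and CPUCreditBalance not collapsing toward zero."""
    if any(v > 0 for v in surplus_charged):
        return False  # workload already spends credits it doesn't earn
    half = len(credit_balance) // 2
    early = sum(credit_balance[:half]) / half
    late = sum(credit_balance[half:]) / (len(credit_balance) - half)
    return late >= early * 0.5  # assumed cutoff for "trending toward zero"

print(t_family_ok([500.0] * 10, [0.0] * 10))                  # True: stable balance
print(t_family_ok([500, 400, 300, 200, 100, 50], [0.0] * 6))  # False: draining
```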
Rollout and rollback
Treat a downsize like any other production change: one size down at a time, in a low-risk window. Watch CPU and memory right after to make sure the smaller instance is handling the load.
Have a rollback plan ready. If you manage infrastructure with Terraform, that's just reverting the instance type in code and applying. Keep the revert PR ready before you ship the downsize so you can move fast if something looks wrong. Don't stack multiple changes (deploys, config changes, downsizes) in the same window.
If any single check in this post fails, keep the current size and look again next cycle.
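The all-or-nothing rule can be expressed as a tiny gate over the individual checks; the names below are just labels for the checks described in this post.

```python
def decide(checks: dict[str, bool]) -> str:
    """All-or-nothing gate: every check must pass or the instance keeps its size."""
    failed = [name for name, ok in checks.items() if not ok]
    return "downsize one step" if not failed else f"keep size (failed: {', '.join(failed)})"

print(decide({"cpu": True, "memory": True, "seasonal": True, "ebs": True, "network": True}))
print(decide({"cpu": True, "memory": False, "seasonal": True, "ebs": True, "network": True}))
```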
Conclusion
EC2 cost optimization doesn't need to be complicated. Get at least 30 days of data, check that CPU and memory P99.5 are both under 40 percent, glance at a year of history for seasonal spikes, and verify your disk and network limits. That covers most of what any EC2 rightsizing tool should be doing under the hood.
One size down within the same family is usually enough to capture real savings without breaking anything. If you're managing infrastructure with Terraform, that makes the whole process even simpler: Terraform cost optimization is just changing an instance type in code, reviewing the metrics, and merging. Check back monthly as your monitoring improves.
Automate Your EC2 Rightsizing
Infralyst continuously runs every check you just learned. When it finds savings, you're one click from a ready-to-merge Terraform PR.
Start free with 3 PRs
No credit card required · Read-only IAM role · Your team reviews and merges every change