The Importance of Capacity Planning for EBS Volumes in AWS

Ashan Fernando - Feb 1 '18 - Dev Community

Provisioning an EC2 instance with an EBS volume for your applications is quite straightforward in AWS. However, have you ever considered whether you are provisioning more or less storage capacity than your applications actually require?

This is one of the main places where over-provisioning happens. It's common to provision a higher amount of capacity or throughput just to be on the safe side, only to run into problems when the application workload changes. Therefore, proper capacity planning that considers factors such as volume type, size, throughput, latency, automation, and cost is really important.

In addition, other factors such as snapshot scheduling and pre-warming also affect AWS EBS performance and need to be considered in capacity planning.

One of the main challenges applications face with poor capacity planning is handling peak loads, as well as handling failures. With the automation tools available in AWS and an understanding of the limitations, it is possible to plan for these situations without over-provisioning the storage, which costs more.

Storage Volume Type and Size

This is one of the main areas where wrong selections are made, since AWS EBS provides multiple options for the underlying physical storage device type and storage capacity. In the past, it was important to plan the storage size up front, because increasing it required restarting the EC2 instance, which could result in downtime. With the current generation of AWS EBS volumes, you can increase the size and change the volume type or IOPS without detaching the volume. It is possible to do the same for detached volumes as well.

Another important action you can take is to automate the associated volume modifications using CloudFormation, the AWS CLI, or the AWS SDKs, so that you can provision the required configuration optimally and only when it is needed.
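For example, a volume's size and type can be changed in place with a couple of SDK calls. Here is a minimal sketch using boto3 (the AWS SDK for Python); the volume ID, region, and target values are placeholders, not recommendations:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Grow the volume and switch it to Provisioned IOPS SSD without detaching it.
ec2.modify_volume(
    VolumeId="vol-0123456789abcdef0",  # placeholder volume ID
    Size=500,             # new size in GiB (size can only be increased)
    VolumeType="io1",     # Provisioned IOPS SSD
    Iops=5000,            # provisioned IOPS for the io1 volume
)

# Volume modifications are asynchronous; poll the state until it completes.
resp = ec2.describe_volumes_modifications(
    VolumeIds=["vol-0123456789abcdef0"]
)
print(resp["VolumesModifications"][0]["ModificationState"])
```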

Storage IOPS

It is equally important to plan for the required throughput. Most of the time I see EC2 configurations go ahead with the default General Purpose SSD, which is fine unless you know you need something else. However, there are cost benefits in selecting HDD storage over SSD for sequential-access scenarios and throughput-oriented workloads, which should be considered while planning the throughput. On the other hand, if you are dealing with applications with high performance and IOPS requirements, it is important to load test the application beforehand and understand the IOPS requirements for the different kinds of workloads, so that adequate automation and capacity can be provisioned to run the applications efficiently.
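As an illustration, a Throughput Optimized HDD (st1) volume for a large sequential workload can be created with boto3 like this; the availability zone and size are placeholder values:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Throughput Optimized HDD (st1) is priced lower per GB than SSD and suits
# large sequential workloads such as log processing or big data scans.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",  # placeholder AZ
    Size=500,                        # size in GiB
    VolumeType="st1",
)
print(volume["VolumeId"])
```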

Storage Latency

This is an area many forget to consider, especially for high-performance applications. One of the important aspects of managing latency is managing the queue length, which is the number of pending I/O requests for an EBS device. You need to plan the queue depth based on the latency and I/O size your application requires.
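As a rough rule of thumb, Little's Law ties these quantities together: the average queue length equals the IOPS rate multiplied by the average latency. A small illustrative calculation (the target numbers here are assumptions, not recommendations):

```python
# Little's Law: average queue length = IOPS x average latency (in seconds).
target_iops = 3000        # desired sustained IOPS (illustrative)
avg_latency_ms = 1.0      # average I/O latency the application can tolerate

queue_depth = target_iops * (avg_latency_ms / 1000.0)
print(f"Keep roughly {queue_depth:.0f} I/Os in flight")  # ~3 in this example
```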

Another thing many are unaware of is that when an EBS volume is created from a snapshot, accessing its blocks for the first time can cause unexpected latency, because each block is pulled from Amazon S3 on first read. An application that requires full performance from the first read needs the volume to be pre-warmed by reading each block before it goes to production, which is achieved by initializing the EBS volume.
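On Linux this initialization is typically done with tools such as dd or fio; purely as an illustration, the same idea in Python looks like the sketch below (the device path is hypothetical, and reading a raw block device requires root privileges):

```python
# Read every block of the attached volume once, so each block is pulled down
# from S3 before production traffic depends on it.
DEVICE = "/dev/xvdf"     # hypothetical device name of the attached EBS volume
CHUNK = 1024 * 1024      # read in 1 MiB chunks

with open(DEVICE, "rb") as dev:
    while dev.read(CHUNK):  # read() returns b"" at the end of the device
        pass
```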

Automation

This is a crucial area to focus on. Knowing the capabilities and limitations of automating EBS capacity and provisioning can save a lot of money otherwise lost to over- and under-provisioning of EBS volumes.

A good place to start figuring out the automation capabilities is to go through the EBS volume features in the AWS web console and understand which of the underlying configurations can be automated and which require manual interaction. In addition, looking at the EBS metrics in CloudWatch provides great insight into what can be monitored and used to trigger automation, which might require writing custom logic with the AWS SDK in a Lambda function, or periodic monitoring and provisioning triggers.
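For instance, per-volume metrics such as VolumeQueueLength or BurstBalance can be pulled from CloudWatch and used as automation triggers. A minimal boto3 sketch, with a placeholder volume ID:

```python
import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Fetch the average queue length over the last hour for one EBS volume.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EBS",
    MetricName="VolumeQueueLength",
    Dimensions=[{"Name": "VolumeId", "Value": "vol-0123456789abcdef0"}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,              # 5-minute data points
    Statistics=["Average"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])
```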

Costs

Provisioning the extreme limits with the highest throughput available is something anyone can do. It takes careful planning and effort to identify the required capacity and provision accordingly. This is the most important area to evaluate in AWS EBS capacity planning because, in the end, it boils down to application performance and how efficiently the storage is utilized, which is what reduces the costs.

I frequently use the AWS Calculator to come up with rough estimates after selecting the EBS volume size, type, and IOPS configuration. It is also important to understand that AWS costs change from region to region and with usage. So selecting the right region, while having the necessary configuration for automated tasks, can save you a considerable amount of money if done properly. This includes scheduling snapshots at the required intervals and increasing storage size based on predictive analysis, applying the changes at the right time.
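For example, snapshot scheduling can be a small Lambda function invoked by a CloudWatch Events schedule. A minimal sketch, assuming a placeholder volume ID:

```python
import boto3
from datetime import datetime

ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    """Triggered on a CloudWatch Events schedule (e.g. nightly).
    The volume ID below is a placeholder for illustration."""
    snapshot = ec2.create_snapshot(
        VolumeId="vol-0123456789abcdef0",
        Description="Scheduled snapshot " + datetime.utcnow().isoformat(),
    )
    return snapshot["SnapshotId"]
```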
