So, who actually owns your Data in AWS?
After some thought-provoking conversations at a recent industry event, I wanted to help demystify data ownership in AWS at a high level, and talk about who owns your data once it is in the AWS public cloud. First things first: just because Amazon Web Services is a company run out of the USA, that does not mean your data is stored in the USA (we'll get to this later) or owned by Amazon Web Services.
Starting with some context first: the AWS Shared Responsibility Model
AWS have a framework to help everyone using AWS understand where the responsibilities for security lie. This is called the “Shared Responsibility Model”. In a nutshell, AWS are responsible for the security “of the cloud” (the AWS global network, hardware security, physical data centre security, network security, etc.) and you as the customer are responsible for the security “in the cloud”: your RDS and EC2 instances, and any other resources you deploy, need to be appropriately secured. If you don’t secure them, you might be opening yourself up to a potential world of hurt.
But, what about the AWS Global Services?
This is a good point. I won’t go into huge depth, but it is correct that AWS hosts a handful of core services globally, with the control plane itself hosted in Virginia (us-east-1): think IAM, AWS Support, CloudFront, and S3, for some examples. A good example of this can be seen during a rare service outage, in which you might be unable to reach the Support control plane in the Management Console but can still lodge a support ticket by going in via another AWS region. Or there could be issues with a service where you cannot access it, but you have no outages or performance degradation in the underlying service stack you’re running.
Encryption is there for a reason, use it
The underlying encryption service for AWS is KMS (Key Management Service). Together with TLS for connections, this lets AWS support encryption at rest and in transit. S3 server-side encryption can use KMS keys (SSE-KMS) or keys that S3 manages for you (SSE-S3). KMS keys are symmetric by default (asymmetric keys are also supported), can be re-used and shared, and you can use separate keys for separate purposes (think SOLID principles). As a basic starting point, if we’re talking about a simple three-tier web application (ALB > EC2 > RDS) with load-balancer, compute, and data layers, use an up-to-date TLS security policy (the latest ciphers) on the ALB, and encrypt your RDS instances and backing EBS storage using KMS. This is a really simple way to further secure your data and to keep it protected from unauthorised access.
Depending on your needs, you could also use CloudHSM to manage and access your encryption keys on FIPS-validated hardware, using customer-controlled, single-tenant HSM instances that run in your own Virtual Private Cloud (VPC). This gives you full control of your encryption keys on certified hardware.
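To make the "encrypt your RDS instances" advice a bit more concrete, here is a minimal sketch of how you might check for unencrypted databases. It assumes instance descriptions in the shape that boto3's `describe_db_instances` call returns (a list of dicts with `DBInstanceIdentifier` and `StorageEncrypted` fields). The function name `find_unencrypted`, the sample instances, and the account ID are all made up for illustration.

```python
from typing import Any, Dict, List


def find_unencrypted(db_instances: List[Dict[str, Any]]) -> List[str]:
    """Return the identifiers of RDS instances whose storage is not encrypted."""
    return [
        db["DBInstanceIdentifier"]
        for db in db_instances
        # Treat a missing StorageEncrypted field as "not encrypted".
        if not db.get("StorageEncrypted", False)
    ]


# Hypothetical sample data, shaped like describe_db_instances()["DBInstances"].
sample = [
    {
        "DBInstanceIdentifier": "orders-db",
        "StorageEncrypted": True,
        "KmsKeyId": "arn:aws:kms:eu-west-1:111122223333:key/example",
    },
    {"DBInstanceIdentifier": "legacy-db", "StorageEncrypted": False},
]

print(find_unencrypted(sample))  # ['legacy-db']
```

Against a real account you could pass in `boto3.client("rds").describe_db_instances()["DBInstances"]` instead of the sample list, bearing in mind that large accounts will need pagination.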
Where is my data stored in AWS?
Your data is stored in the AWS data centres that make up each availability zone of the AWS region in which you deploy it (depending on your deployment model). No data is shipped unnecessarily to the USA, unless you make it that way. For example, CloudFront distributions by design will cache your selected data across the AWS global network's points of presence (the edge locations closest to your end-users) for the best performance and lowest latency.
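One rough way to see where your resources live is to look at their ARNs, which follow the format `arn:partition:service:region:account-id:resource`. The sketch below flags anything outside a set of regions you have approved. The function names, the sample ARNs, and the allowed-region set are hypothetical. Global services such as IAM have an empty region field in their ARNs, so the sketch skips them rather than flagging them.

```python
from typing import Iterable, List, Set


def region_of(arn: str) -> str:
    """Extract the region field from an ARN ('' for global services)."""
    parts = arn.split(":", 5)
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return parts[3]


def outside_allowed_regions(arns: Iterable[str], allowed: Set[str]) -> List[str]:
    """Return the ARNs whose region is set and not in the allowed set."""
    flagged = []
    for arn in arns:
        region = region_of(arn)
        # An empty region means a global service (e.g. IAM); skip it.
        if region and region not in allowed:
            flagged.append(arn)
    return flagged


# Hypothetical resources; the account ID is a placeholder.
arns = [
    "arn:aws:rds:eu-west-1:111122223333:db:orders-db",
    "arn:aws:ec2:us-east-1:111122223333:volume/vol-0123456789abcdef0",
    "arn:aws:iam::111122223333:role/app-role",
]

print(outside_allowed_regions(arns, {"eu-west-1"}))
# ['arn:aws:ec2:us-east-1:111122223333:volume/vol-0123456789abcdef0']
```

This only tells you where a resource was created. It says nothing about copies you have chosen to make elsewhere, such as content cached at CloudFront points of presence.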
What information can AWS see?
I’ll take the classic example again of support ticket lodgement, where you can see this in action for yourself. AWS Support can only see, at a high level, the resources deployed in an account. They cannot see, read, or access your data. This is why you get asked to provide further debugging information, e.g. a HAR file (HTTP archive file): that information is unavailable to AWS support staff.
So, there you have it. A very high-level overview of where your data is stored in AWS!