Aaron Bedra gave a fantastic talk when he was with Jemurai where he laid out some of the most important areas of thinking around security when using AWS. One of the things I loved about it was that he gave a nice checklist for each of several high-level areas. This helped me to get my head around the AWS security ecosystem much better, and it’s helped several of our clients out since then.
But once I started sharing my notes from watching the talk, it became obvious that this topic deserved a home in text format on the web, to help more folks the way it’s helped me. Happily, Aaron agreed, so I’m sharing this joint post with him today!
The cloud can be a huge win for security. More and more companies are understanding that security in the cloud looks different from security in a data center, but when using the cloud well, the benefits for security are fantastic. Let’s explore cloud security using automation, IAM, network design, encryption, auditing, and continuous integration as our main areas to consider.
There are lots of great things about automation, but for security purposes, one of the biggest is that automation provides the ability to review, analyze, and audit actions on the platform. Humans clicking buttons, on the other hand, are a recipe for security issues. So for cloud platforms like AWS, web console access should generally be limited to exploration and learning. Ideally it is never used at all, and you might even page the security team when someone logs into the web console. And along with the security wins, automation helps teams to go faster!
- All infrastructure is recorded as code (e.g. Terraform, CloudFormation, Ansible, etc.)
- All infrastructure changes are made by an automated tool
- Console logins are restricted (perhaps to a handful of admins)
- Teams are educated and empowered to make necessary changes with automation
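One way to act on the restricted-console item is to watch CloudTrail for console sign-ins. Here is a minimal sketch, assuming the CloudTrail records have already been parsed into Python dicts. CloudTrail records web console sign-ins as `ConsoleLogin` events from `signin.amazonaws.com`; the allowlist, the account ID, and the function name are hypothetical and only for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical allowlist: the handful of admins permitted to use the console.
APPROVED_CONSOLE_USERS = {
    "arn:aws:iam::111122223333:user/break-glass-admin",
}


def find_unexpected_console_logins(events: Iterable[Dict]) -> List[Dict]:
    """Return console sign-in events made by principals outside the allowlist."""
    unexpected = []
    for event in events:
        # Only web console sign-ins are of interest here.
        if event.get("eventSource") != "signin.amazonaws.com":
            continue
        if event.get("eventName") != "ConsoleLogin":
            continue
        arn = event.get("userIdentity", {}).get("arn", "")
        if arn not in APPROVED_CONSOLE_USERS:
            unexpected.append(event)
    return unexpected
```

Anything this returns could be forwarded to whatever paging or chat tool your team already uses.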
IAM is one of the core AWS security services. It provides the ability to scope access control exactly the way that your organization wants it, with users and permissions. Mistakes here can hand an attacker control over everything, from anywhere in the world. This is a high-risk area, and we need to treat it seriously.
If your organization has a directory service like Active Directory, you should go ahead and use that as your system of record to simplify onboarding and offboarding. And if not, you’ll want to make sure to set up strong account requirements (strong passwords, MFA required, including AWS in the offboarding checklist, etc.).
Treat the root account with the utmost caution, avoiding it altogether whenever possible. In fact, we can and should go further, and assume roles using temporary STS tokens to limit the scope of the credentials we use, rather than using IAM users’ permissions directly. Tools like AWS Vault can help to make the right way the easy way.
- Any existing directory (e.g. Active Directory) is replicated into AWS and used as the system of record
- Root account usage is limited to the tasks that require the root account
- Root account has MFA enabled
- Root account has no access keys (if possible)
- Users have no permissions beyond the ability to use STS to assume roles (side effect: less console usage!)
- MFA is enabled for all human users
- MFA required to access privileged roles
- Users are trained and given tools to make role assumption seamless and easy
- Users have no inline IAM policies
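To make the "MFA required to access privileged roles" item concrete, here is a minimal sketch of one common pattern: a role trust policy that only lets principals from a given account assume the role when they authenticated with MFA. `aws:MultiFactorAuthPresent` is a standard IAM condition key; the account ID and the function name are hypothetical, and your own setup may need a narrower principal.

```python
import json
from typing import Dict


def mfa_required_trust_policy(account_id: str) -> Dict:
    """Build a role trust policy that allows sts:AssumeRole only with MFA."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Trust principals in this account; IAM policies on the
                # users decide who may actually call sts:AssumeRole.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # Deny-by-omission: the request must carry MFA context.
                "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
            }
        ],
    }


# Hypothetical account ID, for illustration only.
policy = mfa_required_trust_policy("111122223333")
print(json.dumps(policy, indent=2))
```

Generating the document in code like this keeps it reviewable alongside the rest of your infrastructure definitions.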
While IAM says what users and services can do, network design using VPCs lets us dictate how hosts and services can communicate from a networking perspective. There are lots of variables here depending on the use case and company, but there are a few principles to guide network design.
First, we want boundaries, and the VPC is the top-level boundary that AWS provides. You can segment networks to isolate environments and scope data into sensitive and non-sensitive areas. There should be only one way in to manage hosts in a VPC, via something like a bastion host or a VPN. It’s also worth trying out AWS Systems Manager Session Manager, a managed service that uses IAM credentials rather than SSH to authenticate.
- Everything is deployed inside a VPC
- Flow logs are enabled and monitored
- Everything has a security group attached
- Any security group that allows access from 0.0.0.0/0 (any host on the Internet) has a detailed description and justification (perhaps using tags)
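The last checklist item lends itself to an automated check. The sketch below assumes its input has the shape of the `SecurityGroups` list returned by EC2's `DescribeSecurityGroups` API (for example via boto3's `describe_security_groups`), and it treats a rule-level `Description` as the justification. Checking `::/0` as well as `0.0.0.0/0` is an addition of ours, and the function name is hypothetical.

```python
from typing import Dict, List

# "Any host on the Internet", for IPv4 and (our addition) IPv6.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def unjustified_open_ingress(security_groups: List[Dict]) -> List[str]:
    """Return IDs of groups with world-open ingress rules lacking a description."""
    flagged = []
    for group in security_groups:
        for permission in group.get("IpPermissions", []):
            ranges = [
                (r.get("CidrIp"), r.get("Description"))
                for r in permission.get("IpRanges", [])
            ]
            ranges += [
                (r.get("CidrIpv6"), r.get("Description"))
                for r in permission.get("Ipv6Ranges", [])
            ]
            if any(
                cidr in OPEN_CIDRS and not (desc or "").strip()
                for cidr, desc in ranges
            ):
                flagged.append(group["GroupId"])
                break  # One offending rule is enough to flag the group.
    return flagged
```

If your team records justifications in tags instead, the same loop could inspect the group's `Tags` list rather than rule descriptions.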
Encryption is difficult, and we should avoid implementing it ourselves whenever possible. Our choices can both reduce our efforts and make us safer! AWS provides some rich encryption capabilities, including KMS for key management, which should be our default. All keys should originate in KMS, and you can then attach a given key to database instances and anywhere else you need encryption.
Use KMS for everything you can! You can’t use KMS for literally everything. For example, KMS encrypts at most 4 KB of data per request, so it isn’t meant for encrypting large documents directly. But it can generate data encryption keys for you to use on larger data. This means some cryptography may find its way into your own code if you have specialized needs beyond the default stance of “attach a KMS key directly to an AWS service.”
- All master keys / key encryption keys are stored with KMS
- All KMS keys have the rotation option enabled
- All AWS services that store data should use KMS
- Data encryption keys are generated using KMS master keys and stored encrypted
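The rotation item is easy to verify automatically. The following sketch assumes it is handed a boto3 KMS client (for example `boto3.client("kms")`) and uses its `list_keys` and `get_key_rotation_status` calls; the function name is hypothetical. It does not handle keys for which the status call raises an error, such as keys the caller is not allowed to inspect, so treat it as a starting point.

```python
from typing import List


def keys_without_rotation(kms_client) -> List[str]:
    """Return the IDs of KMS keys whose automatic rotation is not enabled."""
    missing = []
    kwargs = {}
    while True:
        page = kms_client.list_keys(**kwargs)
        for key in page.get("Keys", []):
            status = kms_client.get_key_rotation_status(KeyId=key["KeyId"])
            if not status.get("KeyRotationEnabled", False):
                missing.append(key["KeyId"])
        # list_keys is paginated: follow NextMarker until Truncated is false.
        if not page.get("Truncated"):
            break
        kwargs = {"Marker": page["NextMarker"]}
    return missing
```

Because the client is passed in, the function can be exercised with a small fake in tests before it ever runs against a real account.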
While we all do our best to make good security choices, it’s best to have someone or something else checking our work. Getting things to 100% is hard, but there are some great tools out there for improving our AWS account’s security. In the open-source world, you might try running ScoutSuite and seeing what you find. You might find something surprising!
CloudTrail and CloudWatch are absolute necessities for security. Definitely enable CloudTrail for active regions (or even all regions), and set up CloudWatch alerts that trigger on big-ticket items. Enable activity logging and create actionable responses to potentially bad actions: things like logging into the root account, creating users, adding administrative roles, too many KMS decrypt events, etc. These kinds of actions should trigger alerts in tools like Slack, PagerDuty, VictorOps, etc. AWS’s GuardDuty service can be really helpful here in monitoring for malicious activity and taking action.
- Configuration analysis is performed at least daily, if not for every change
- CloudTrail is enabled for all active regions (or for all regions!)
- CloudWatch metrics and alarms are implemented for major violation cases
- GuardDuty is enabled
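The "major violation cases" above can be expressed as simple rules over CloudTrail records. This is a rough sketch, assuming records already parsed into dicts with the standard CloudTrail fields (`eventSource`, `eventName`, `userIdentity`, `requestParameters`). The function name, the alert wording, and the default threshold are our own choices, and in practice you may prefer CloudWatch metric filters or GuardDuty over hand-rolled code.

```python
from collections import Counter
from typing import Dict, Iterable, List

ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"
ATTACH_EVENTS = {"AttachUserPolicy", "AttachRolePolicy", "AttachGroupPolicy"}


def find_alerts(events: Iterable[Dict], decrypt_threshold: int = 100) -> List[str]:
    """Return human-readable alerts for a batch of CloudTrail records."""
    alerts = []
    decrypts = Counter()
    for event in events:
        identity = event.get("userIdentity", {})
        name = event.get("eventName")
        source = event.get("eventSource")
        if identity.get("type") == "Root":
            alerts.append(f"root account activity: {name}")
        if source == "iam.amazonaws.com" and name == "CreateUser":
            alerts.append("IAM user created")
        if source == "iam.amazonaws.com" and name in ATTACH_EVENTS:
            params = event.get("requestParameters") or {}
            if params.get("policyArn") == ADMIN_POLICY_ARN:
                alerts.append(f"administrator policy attached via {name}")
        if source == "kms.amazonaws.com" and name == "Decrypt":
            decrypts[identity.get("arn", "unknown")] += 1
    # Report principals whose Decrypt count exceeds the (arbitrary) threshold.
    for arn, count in sorted(decrypts.items()):
        if count > decrypt_threshold:
            alerts.append(f"{count} KMS Decrypt calls by {arn}")
    return alerts
```

Each returned string could then be posted to whichever alerting channel the team watches.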
Automation is great, like we talked about earlier. Because our infrastructure is defined in code, we can hook auditing up into a continuous integration pipeline. Many software developers are familiar with using continuous integration for testing and packaging purposes, and using it for security and infrastructure is a great idea too.
Similarly to other CI usage, you can implement checks and controls for pull requests and require them to pass before deploying or merging code. You can create sign-off policies that run automatically in the pull request flow, to increase feedback and flow, in keeping with the DevOps ethos.
- All infrastructure changes trigger configuration audits
- Critical issues found in CI trigger an immediate response
- Use the CI pipeline to create a sign-off process that allows teams to move faster
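A pull request check along these lines can be quite small. The sketch below assumes some audit tool has already produced a list of findings, each a dict with `severity`, `rule`, and `resource` keys. That format is hypothetical, so adapt it to whatever your scanner actually emits.

```python
import sys
from typing import Dict, List

# Severities that should fail the build (assumed labels, not a standard).
BLOCKING_SEVERITIES = {"critical"}


def blocking_findings(findings: List[Dict]) -> List[Dict]:
    """Return the findings severe enough to block a merge or deploy."""
    return [
        f for f in findings
        if str(f.get("severity", "")).lower() in BLOCKING_SEVERITIES
    ]


def gate(findings: List[Dict]) -> int:
    """Print blocking findings to stderr and return a CI exit code."""
    blockers = blocking_findings(findings)
    for finding in blockers:
        print(
            f"BLOCKING [{finding['severity']}] "
            f"{finding.get('rule', 'unknown rule')}: "
            f"{finding.get('resource', 'unknown resource')}",
            file=sys.stderr,
        )
    return 1 if blockers else 0
```

In a pipeline step you would load the findings and call `sys.exit(gate(findings))`, so that a non-zero exit code fails the check and blocks the merge.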
The security benefits of being on cloud providers like AWS are massive, and you can get a lot out of defining your infrastructure as data in these ways. Even in regulated and highly compliance-oriented environments, working well in this ecosystem can help us to move fast without breaking things.
Hopefully these insights and checklists will be as useful for you as they have been for us!