Let’s admit it: if you’re in cloud security, you’re in a race with thousands of attackers! You want to move fast and fix risky misconfigurations before an attacker exploits them. To win this race, you need to be fully equipped. With attackers leaning heavily on automation to detect vulnerabilities, you need to beat them at their own game.
CloudHealth Secure State gives you the tools that help increase your chances of winning: real-time visibility, detection, and a platform to monitor security and compliance risks. In this blog post, I want to talk about a new addition to this toolkit, a unique remediation approach to help you scale security and accelerate the response needed to protect cloud assets.
You need a better approach to automate cloud security
As the list of public cloud services, configuration settings, and the dependencies between them grows exponentially, it has become humanly impossible for developers to know all the configuration security best practices or to monitor the impact a simple configuration drift can have over time. This is the primary reason why misconfigurations are highly prevalent and the leading cause of security breaches in the cloud.
For configuration monitoring, security teams often turn to tools that periodically scan the cloud for changes. The volume of alerts and false positives from such tools can be overwhelming. As if internal problems weren’t enough, the attackers you compete with are getting extremely sophisticated. With advanced scripts and automated tools, it takes them just a few minutes to detect and exploit an accidental misconfiguration. CloudHealth Secure State provides the real-time visibility, detection, and risk prioritization capabilities you need to better understand security risks.
However, to counter external threats, you need to complement detection with an even better response plan. Executing a successful remediation strategy to fix misconfigurations requires strong collaboration between developers and the rest of your team. Everyone needs to build a shared understanding of the deployment context and the security policies to align on remediation steps that don’t disrupt application availability. Scaling security also means you’ll need to uniquely address the needs of all your different teams and applications.
Very quickly, your small security team could be under pressure to fix tens of thousands of misconfigurations. Doing so manually, or using a custom solution built on cloud-native services, can be very challenging. In practice, most security teams struggle to make progress with remediations because they’re concerned about introducing disruptive application changes and about trusting SaaS tools that require overly permissive policies to automate changes.
Introducing a scalable, in-account remediation approach
Today, I am pleased to share that CloudHealth Secure State is offering a unique remediation approach as a beta feature to help automate actions across cloud environments. A key aspect of this approach is its cloud permissions control policy, which enables you to manage and remediate misconfigurations with CloudHealth Secure State while the service itself keeps only read-only (least-privilege) access to your cloud accounts.
The architectural design that makes this feasible consists of a remediation worker that you deploy in your cloud accounts to execute actions and a set of remediation jobs defined in that worker image. Only the workers deployed by you have limited write privileges to your cloud accounts. This design ensures that only you own the execution of remediations in your cloud environment and that CloudHealth Secure State cannot inadvertently make changes to your resources.
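The division of privileges described above can be sketched roughly as follows. This is an illustrative model only, not CloudHealth Secure State’s actual implementation: `RemediationRequest`, `Worker`, and the job name are hypothetical, and the "cloud account" is a plain dictionary standing in for real resources. The point it shows is that the service side only names a job and a target, while the worker you deploy holds the write access and refuses anything not defined in its own image.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical stand-in for a cloud account: resource id -> configuration.
Account = Dict[str, Dict[str, object]]
Job = Callable[[Dict[str, object]], None]


@dataclass(frozen=True)
class RemediationRequest:
    """What the read-only service sends: a job name and a target, never credentials."""
    job_name: str
    resource_id: str


@dataclass
class Worker:
    """Runs inside your account and is the only component allowed to write."""
    account: Account
    jobs: Dict[str, Job] = field(default_factory=dict)

    def register(self, name: str, job: Job) -> None:
        self.jobs[name] = job

    def execute(self, request: RemediationRequest) -> str:
        # Only jobs defined in this worker are runnable; anything else is refused.
        job = self.jobs.get(request.job_name)
        if job is None:
            return "rejected: unknown job"
        resource = self.account.get(request.resource_id)
        if resource is None:
            return "rejected: unknown resource"
        job(resource)
        return "done"


def enable_bucket_encryption(resource: Dict[str, object]) -> None:
    """Example job: mark the (simulated) bucket as encrypted."""
    resource["encrypted"] = True


account: Account = {"bucket-1": {"type": "s3_bucket", "encrypted": False}}
worker = Worker(account)
worker.register("enable_bucket_encryption", enable_bucket_encryption)

ok = worker.execute(RemediationRequest("enable_bucket_encryption", "bucket-1"))
refused = worker.execute(RemediationRequest("delete_everything", "bucket-1"))
print(ok, "|", refused)
```

In this sketch, a request for a job the worker does not know about changes nothing, which mirrors the idea that you own the execution of every remediation.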
Besides the security benefits baked into the architecture, the entire remediation workflow is designed to help you build trust with your DevOps teams and scale security actions. CloudHealth Secure State’s comprehensive remediation capabilities enable you to:
- Leverage pre-defined remediation jobs or create custom ones to address the needs of different teams (SOC, GRC, Vulnerability Management, DevOps, IT Operations, etc.)
- Target remediations to quickly resolve existing misconfigurations
- Publish remediations to enable DevOps teams and address misconfigurations where you need additional application context
- Build guardrails to proactively auto-remediate new misconfigurations at real-time speed
No matter how remediations are triggered, you can maintain centralized visibility into the progress of all remediations and changes made to your cloud resources.
A step-by-step demonstration of remediations in CloudHealth Secure State
To start resolving misconfigurations detected by CloudHealth Secure State, you begin by creating worker groups that execute actions within your cloud accounts. The service continuously monitors the health of all workers to ensure they’re ready to execute actions when needed. Once workers are set up, you can quickly create new remediations to address misconfigurations detected by the various security and compliance rules available within CloudHealth Secure State.
Rather than apply remediations broadly, sometimes you’ll need to target them to specific resources. For example, you may want to fix issues in your production environment before addressing other parts of your infrastructure. Remediations enable you to granularly target jobs based on conditions that include cloud accounts, regions, and resource tags.
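As a rough illustration of this kind of targeting, the sketch below filters resources by account, region, and tags. The `Resource` and `Target` types, the field names, and the sample account IDs are assumptions made for the example; they are not the product’s data model. An empty condition is treated as "match everything", which is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Resource:
    resource_id: str
    account: str
    region: str
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Target:
    """Hypothetical targeting conditions; an empty condition matches anything."""
    accounts: FrozenSet[str] = frozenset()
    regions: FrozenSet[str] = frozenset()
    tags: Dict[str, str] = field(default_factory=dict)

    def matches(self, resource: Resource) -> bool:
        if self.accounts and resource.account not in self.accounts:
            return False
        if self.regions and resource.region not in self.regions:
            return False
        # Every required tag must be present with the expected value.
        return all(resource.tags.get(k) == v for k, v in self.tags.items())


def select(target: Target, resources: List[Resource]) -> List[Resource]:
    return [r for r in resources if target.matches(r)]


resources = [
    Resource("i-prod-1", "111111111111", "us-east-1", {"env": "production"}),
    Resource("i-dev-1", "111111111111", "us-east-1", {"env": "dev"}),
    Resource("i-prod-2", "222222222222", "eu-west-1", {"env": "production"}),
]

production_us = Target(regions=frozenset({"us-east-1"}), tags={"env": "production"})
selected_ids = [r.resource_id for r in select(production_us, resources)]
print(selected_ids)  # ['i-prod-1']
```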
Next, the setup gives you the flexibility to decide the type of misconfigurations you want to resolve—either new or existing. By enabling auto-remediation during the setup, you can ensure that the service proactively remediates all new misconfigurations associated with a particular rule, as soon as they are detected. To address existing misconfigurations, you can simply leave auto-remediation disabled and manually trigger remediation jobs for resources that need to be corrected.
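The difference between the two modes can be modeled in a few lines. This is a simplified sketch under stated assumptions: the `Remediation` class, its `auto_remediate` flag, and the rule name are hypothetical, and "fixing" a resource is reduced to moving its ID between two lists.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Remediation:
    rule: str
    auto_remediate: bool = False
    pending: List[str] = field(default_factory=list)  # waiting for a manual trigger
    fixed: List[str] = field(default_factory=list)

    def on_finding(self, resource_id: str) -> None:
        """Called when the rule detects a misconfigured resource."""
        if self.auto_remediate:
            self.fixed.append(resource_id)    # guardrail: fix as soon as detected
        else:
            self.pending.append(resource_id)  # leave the decision to a person

    def trigger(self, resource_id: str) -> bool:
        """Manually remediate one pending resource; returns False if it is not pending."""
        if resource_id not in self.pending:
            return False
        self.pending.remove(resource_id)
        self.fixed.append(resource_id)
        return True


guardrail = Remediation(rule="s3-bucket-unencrypted", auto_remediate=True)
guardrail.on_finding("bucket-new")

manual = Remediation(rule="s3-bucket-unencrypted")
manual.on_finding("bucket-old")
triggered = manual.trigger("bucket-old")

print(guardrail.fixed, manual.fixed, triggered)
```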
If you’re not sure about the impact of a particular action on the application environment, you can also delegate decisions by publishing remediations for use by DevOps teams. These teams can then either initiate remediations from the service GUI or programmatically execute them via externally defined APIs.
To get you started with remediations, the service also provides a list of out-of-the-box actions that fix common misconfigurations such as unencrypted S3 buckets or instances with SSH or RDP ports open to the internet. If you don’t see a relevant job in the pre-defined list, you can quickly define a custom job in the worker image.
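To give a feel for what a job that closes SSH or RDP exposure might do, here is a minimal sketch. It operates on plain dictionaries rather than a real cloud API, and the rule shape (`cidr`, `from_port`, `to_port`) and function names are assumptions chosen for illustration; a real custom job would use whatever interface the worker image expects.

```python
from typing import Dict, List, Union

Rule = Dict[str, Union[str, int]]

OPEN_CIDRS = {"0.0.0.0/0", "::/0"}  # "anyone on the internet" for IPv4 and IPv6
ADMIN_PORTS = (22, 3389)            # SSH and RDP


def is_risky(rule: Rule) -> bool:
    """True if the rule exposes SSH or RDP to the whole internet."""
    if rule["cidr"] not in OPEN_CIDRS:
        return False
    return any(rule["from_port"] <= port <= rule["to_port"] for port in ADMIN_PORTS)


def close_admin_ports(rules: List[Rule]) -> List[Rule]:
    """Return the inbound rules with the risky ones removed."""
    return [rule for rule in rules if not is_risky(rule)]


inbound = [
    {"cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443},     # public HTTPS: kept
    {"cidr": "0.0.0.0/0", "from_port": 22, "to_port": 22},       # public SSH: removed
    {"cidr": "10.0.0.0/8", "from_port": 3389, "to_port": 3389},  # internal RDP: kept
    {"cidr": "::/0", "from_port": 3000, "to_port": 4000},        # range covers 3389: removed
]

remaining = close_admin_ports(inbound)
print([(r["cidr"], r["from_port"]) for r in remaining])
# [('0.0.0.0/0', 443), ('10.0.0.0/8', 3389)]
```

Note that this sketch drops an entire rule when its port range includes 22 or 3389, which may also close other ports in that range. That kind of side effect is exactly why testing a job in a limited environment first is worthwhile.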
Once remediations have been triggered, the service also enables you to monitor their progress and audit historic configuration changes for tracking and troubleshooting.
As you can see in these workflows, we’ve designed the solution to help you preview misconfiguration context, test remediations in a particular environment, and gradually gain the trust and confidence needed to scale remediations across clouds and teams.
Recommendations for operationalizing your remediation strategy
Based on engagements with multiple customers, we’ve defined some recommendations for building a scalable remediation approach:
- Define requirements – Your organization might be subject to compliance requirements such as GDPR, HIPAA, PCI, etc. Besides regulatory requirements, it’s always a best practice to compare your cloud security posture against the CIS Foundations Benchmarks. You should selectively enable controls that are relevant to your organization and review these periodically.
- Build guardrails – Security guardrails are a great way to help protect your cloud assets without slowing down developers. Start by identifying misconfigurations that must be fixed, auto-remediate them, and then build guardrails to manage ongoing enforcement. Example use cases include enabling AWS CloudTrail logs or restricting Azure NSGs that leave the Remote Desktop port open to the internet.
- Ease into automation – As you begin to embrace automation, it’s OK to start slowly. Pick a team, a few cloud accounts, and a limited set of services in the beginning. Automatically remediate misconfigurations you discover and gradually expand this process to include additional accounts, services, and teams. Allow for exceptions wherever your automated security policies don’t make sense.
- Empower your developers – In the cloud, security truly is a distributed responsibility—so give your developers access to the security tools you use. Allow them to proactively verify configurations and use your remediations to fix misconfigurations. This is critical for educating them on best practices and scaling efficiently beyond a small cloud security team.
- Iterate and optimize – You must continuously look for patterns in the list of open misconfigurations and identify remediations you can automate to save time and reduce risk. At the same time, recognize that the cloud is complex. No matter how much you want to automate, there will always be exceptions, decisions, and processes that will not be easily automated. Just continue to iterate and optimize.
Accelerate cloud security
With CloudHealth Secure State’s real-time detection and remediation capabilities, you can proactively mitigate risks across cloud environments. To talk to an expert on cloud security and compliance best practices, or request a free CloudHealth Secure State trial, click here.
You can also learn more about how to implement a successful cloud security posture in our in-depth eBook: Top 10 Best Practices for Cloud Security Posture Management