As the popularity of cloud computing increases, so does the number of businesses collecting, storing, and analyzing data for better business insights. Our five best practices for Google Cloud data governance can help businesses manage data in the cloud without losing the benefits of cloud computing.
For many businesses operating in Google Cloud, data is their most valuable asset. However, because data is collected and stored so quickly, tracking and managing it can be very difficult. If businesses are unable to track and manage data, they face risks to data security, data integrity, and data compliance.
Who's responsible for tracking and managing Google Cloud data?
One of the questions businesses need to address with regard to Google Cloud data governance is who's responsible for tracking and managing data. Making each team or department responsible for its own data security, data integrity, and data compliance is fraught with cost, performance, and security risks. Responsibility should therefore be delegated to a Cloud Center of Excellence.
A Cloud Center of Excellence is a multi-functional team responsible for developing a framework for the business's cloud operations. Its objectives are to achieve continuous cost optimization, govern the cloud using standard KPIs across the whole business, and proactively manage cloud risk.
Google Cloud data governance consists of a) the rules that enable businesses to keep on top of their data, and b) the measures to enforce the rules. However, if the rules and enforcement measures are too stringent, they can stifle innovation and have a negative impact on the flexibility, efficiency, and strategic value benefits of cloud computing.
Our five best practices for Google Cloud data governance should help businesses find a balance between managing data and ensuring it's accessible when required.
1. Ensure you have total visibility of data
Without a holistic view of data and its sources, it can be difficult to know what data you have, where data originated from, and what data is in the public domain that shouldn’t be. For this reason, it’s important to identify any “shadow” Line of Business IT within the business and, if it exists, to integrate it with authorized IT activity.
2. Implement a universal labeling policy
To classify and organize data, a universal labeling policy, in which all assets are labeled in the same format, is essential. Businesses operating in a multicloud environment should take care to ensure the labels used in Google Cloud (where labels may contain only lowercase letters, numbers, underscores, and dashes) follow the same format as the tags used in AWS or Azure.
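As a rough illustration of keeping one format across clouds, the sketch below converts AWS- or Azure-style tags into lowercase labels. The function names are hypothetical, and the checks (lowercase letters, digits, underscores, and dashes; at most 63 characters; keys starting with a letter) are a simplified reading of Google Cloud's label requirements, so verify them against the current documentation before relying on them.

```python
import re

# Simplified label rules assumed for this sketch: lowercase letters, digits,
# underscores, and dashes; at most 63 characters; keys start with a letter.
_KEY_RE = re.compile(r"[a-z][a-z0-9_-]{0,62}")
_VALUE_RE = re.compile(r"[a-z0-9_-]{0,63}")


def to_gcp_label(text: str) -> str:
    """Lowercase a tag key or value and replace unsupported characters with '-'."""
    cleaned = re.sub(r"[^a-z0-9_-]", "-", text.strip().lower())
    return cleaned[:63]


def normalize_tags(tags: dict[str, str]) -> dict[str, str]:
    """Convert AWS/Azure-style tags into labels that pass the checks above."""
    labels = {}
    for key, value in tags.items():
        new_key, new_value = to_gcp_label(key), to_gcp_label(value)
        if not _KEY_RE.fullmatch(new_key) or not _VALUE_RE.fullmatch(new_value):
            raise ValueError(f"cannot convert tag {key!r}={value!r} to a label")
        labels[new_key] = new_value
    return labels


print(normalize_tags({"CostCenter": "Finance-EU", "Owner": "Data Team"}))
# {'costcenter': 'finance-eu', 'owner': 'data-team'}
```

Two different tags can normalize to the same label key (for example "Cost Center" and "cost-center"), so a real policy would also need a rule for resolving such collisions.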
3. Apply least privilege access controls
Least privilege access controls restrict access rights for users, accounts, and processes to only those resources absolutely required to perform routine, legitimate activities. For data in Google Cloud, businesses can set up owner and reader privileges at the project and dataset levels to help control access to data.
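One way to spot grants that go beyond least privilege is to scan an exported IAM policy for broad roles and public members. The sketch below assumes the policy has already been exported as a dictionary with the usual "bindings" list; the function name, the example policy, and the choice of which roles count as "too broad" are assumptions for illustration, not a recommendation from Google or CloudHealth.

```python
# Roles and members treated as too broad in this sketch (an assumption; adjust as needed).
BROAD_ROLES = {"roles/owner", "roles/editor"}
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def find_risky_bindings(policy: dict) -> list[tuple[str, str]]:
    """Return (role, member) pairs that grant a broad role or public access."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        for member in binding.get("members", []):
            if role in BROAD_ROLES or member in PUBLIC_MEMBERS:
                findings.append((role, member))
    return findings


# Hypothetical policy used only to exercise the function.
example_policy = {
    "bindings": [
        {"role": "roles/owner", "members": ["user:alice@example.com"]},
        {"role": "roles/bigquery.dataViewer", "members": ["group:analysts@example.com"]},
        {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
    ]
}

for role, member in find_risky_bindings(example_policy):
    print(f"review: {member} holds {role}")
# review: user:alice@example.com holds roles/owner
# review: allUsers holds roles/storage.objectViewer
```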
4. Enable data access audit logs
To avoid potential data loss through security incidents, fraudulent activity, and operational problems, it’s important to enable Data Access audit logs and configure IAM policies so that individual users cannot disable the audit logs. The audit logs should be collected and stored securely in a limited-access storage volume for analysis when required.
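Whether Data Access audit logs are enabled can be read from the "auditConfigs" section of an IAM policy. The following is a minimal sketch that reports which Data Access log types are missing for a service; the function name and example policy are hypothetical, and the sketch deliberately ignores exempted members, which a fuller check would also need to flag.

```python
# The three Data Access audit log types that can be enabled in an IAM policy.
DATA_ACCESS_LOG_TYPES = {"ADMIN_READ", "DATA_READ", "DATA_WRITE"}


def missing_data_access_logs(policy: dict, service: str = "allServices") -> set[str]:
    """Return the Data Access log types not enabled for the given service."""
    enabled = set()
    for config in policy.get("auditConfigs", []):
        # A config for "allServices" also applies to every individual service.
        if config.get("service") in (service, "allServices"):
            for log_config in config.get("auditLogConfigs", []):
                enabled.add(log_config.get("logType"))
    return DATA_ACCESS_LOG_TYPES - enabled


# Hypothetical policy in which write logging was never switched on.
audit_example = {
    "auditConfigs": [
        {
            "service": "allServices",
            "auditLogConfigs": [{"logType": "ADMIN_READ"}, {"logType": "DATA_READ"}],
        }
    ]
}

print(sorted(missing_data_access_logs(audit_example)))
# ['DATA_WRITE']
```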
5. Encrypt sensitive data
One of the advantages of implementing a universal labeling policy is that it’s easier to identify and encrypt sensitive data. Together with total visibility, this avoids the need to encrypt everything and the potential performance problems associated with total encryption. With Google Cloud’s Data Loss Prevention API, you can also de-identify, mask, or tokenize sensitive data.
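To make the difference between masking and tokenizing concrete, here is a small local sketch using only Python's standard library. It is not the Data Loss Prevention API and does not call any Google Cloud service; the function names and the HMAC-based token scheme are assumptions chosen purely for illustration.

```python
import hashlib
import hmac


def mask(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `visible` characters of a value."""
    if visible <= 0:
        return mask_char * len(value)
    return mask_char * max(len(value) - visible, 0) + value[-visible:]


def tokenize(value: str, key: bytes) -> str:
    """Replace a value with a keyed, deterministic token (HMAC-SHA256, hex)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


print(mask("4111111111111111"))
# ************1111

# The same input and key always yield the same token, so records can still be joined.
demo_key = b"demo-key-not-for-production"
print(tokenize("alice@example.com", demo_key) == tokenize("alice@example.com", demo_key))
# True
```

Masking keeps part of the value readable for humans, while tokenizing replaces it entirely but preserves equality, which is why the two are often applied to different fields.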
CloudHealth and Google Cloud data governance
Although it’s not difficult to develop Google Cloud data governance policies, it can be difficult to enforce compliance with them. It only takes a misspelled label or misconfigured IAM policy for data to “escape” and be exposed to risk or corruption. CloudHealth is an excellent solution for enforcing Google Cloud data governance as the platform monitors cloud environments around the clock and alerts you to any violations—or potential violations—of data governance policies.
The key to balanced, yet effective enforcement of your Google Cloud data governance policies is to first use the CloudHealth platform to unite data from all sources. Then take advantage of CloudHealth’s policy-driven automation capabilities to monitor your Google Cloud and alert you to events such as:
- Labels that don’t conform to your universal labeling policy.
- Users with more access to data than they should have.
- Disabled Data Access audit logs and insecurely stored audit logs.
- Publicly-accessible storage volumes and unencrypted data.
For an in-depth guide to building a successful cloud operations and governance practice, see our whitepaper: Building a Successful Cloud Operations and Governance Practice
And for more specific information on optimizing your Google Cloud environment, see our eBook: How to Build Long-Term Success in GCP