Coconut King of Salesforce Sandbox – Data Anonymization and GDPR

Updated: Oct 21, 2020


Once upon a time, there was a king who really liked Coconuts. In his Coconut-mania, he decided to have his palace security designed like a Coconut. The border wall and security were an impenetrable shell, but once inside, everything was accessible.


It worked until one night his Vizier decided to revolt. While the King was asleep, the Vizier simply walked into the royal chamber, deposed him and declared himself the new king.


“The Coconut challenge of a hardened Production Salesforce Org and soft Sandboxes is a real vulnerability…and sadly frequent!”


The model of fortifying one layer and leaving the internals less secure is often referred to as a “Coconut” model in today’s IT security parlance.


Most Salesforce customers do a fairly good job of locking down their Production org with Salesforce’s extensive and granular security controls, ranging from IP whitelisting, SSO, and multi-factor authentication to Shield. But the soft core of the Coconut is the Sandbox.





IT often needs to make Sandboxes easily accessible to reduce friction around user acceptance testing, training, trying out production support fixes, etc.


Consequently, sandboxes are often configured with less restrictive controls, both in terms of security settings as well as access to third-party contractors, offshore development teams and so on.


Ultimately, the security and privacy of your customers’ personal data is only as good as its weakest link.

Even worse, with each sandbox refresh, the customer data continues to leak from Production. This sweetens the pot even more for an unsavory Vizier to walk in one night, and take it all away.


Unfortunately, as this is real personal data, the impact of a Sandbox breach is no less severe than that of a Production breach, both in terms of a brand’s trust and regulatory fines. GDPR, CCPA, and other regulations have clear guidelines for data breach reporting.


“A straightforward way to ensure security of customer data across environments is to anonymize it in the Sandbox.”

This ensures that even if a bad guy gets access to a sandbox, they will only get scrambled data. Here is a perspective on the options for doing this, which vary in cost and complexity.


Note: This article was written before Salesforce introduced Data Mask. From an options perspective, the comparison is still valid, as both Cloud Compliance and Salesforce Data Mask technically follow the same approach.



Option 1: Automated Anonymization on Refresh with Salesforce’s SandboxPostCopy


Salesforce’s “SandboxPostCopy” Apex interface lets you run Apex code automatically after a Sandbox is created or refreshed. It is the most direct way to ensure that once a Sandbox is refreshed, Apex code to anonymize data will be executed. As long as the class is specified each time a Sandbox is created or refreshed, this ensures that all Sandboxes are anonymized, every time!




So, how do you use it? You can write your own Apex classes to anonymize data once the Sandbox is refreshed, but this is not trivial. As new objects and fields are added, the code has to be updated, tested, and actively maintained, incurring additional cost and effort.
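To make the do-it-yourself route concrete, here is a minimal sketch of what a hand-written version might look like. The class names ScrambleContactsOnRefresh and ScrambleContactsBatch are hypothetical, and the choice of Contact fields is only an assumption for illustration; a real org would need one such routine for every object and field that holds personal data. Keep in mind that the post-copy script runs as an automated system user, so you would want to verify it has access to the objects and fields it touches.

```apex
// Hypothetical example: each top-level class lives in its own file.
global class ScrambleContactsOnRefresh implements SandboxPostCopy {
    global void runApexClass(SandboxContext context) {
        System.debug('Anonymizing sandbox: ' + context.sandboxName());
        // Batch Apex processes records in chunks so that large
        // data volumes stay within governor limits.
        Database.executeBatch(new ScrambleContactsBatch(), 200);
    }
}

// Hypothetical batch class that overwrites personal fields on Contact.
global class ScrambleContactsBatch implements Database.Batchable<SObject> {
    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Contact');
    }

    global void execute(Database.BatchableContext bc, List<SObject> scope) {
        for (SObject record : scope) {
            Contact c = (Contact) record;
            // The record Id gives each Contact a unique, non-personal token.
            String token = String.valueOf(c.Id);
            c.FirstName = 'Test';
            c.LastName = 'Contact ' + token;
            c.Email = token + '@example.invalid';
            c.Phone = null;
        }
        // allOrNone = false: one failing record does not roll back the chunk.
        Database.update(scope, false);
    }

    global void finish(Database.BatchableContext bc) {
        System.debug('Contact anonymization batch finished');
    }
}
```

Even this small sketch leaves out Leads, custom objects, free-text fields, attachments, and field history, which is exactly where the maintenance effort comes from.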


“Programming good anonymization logic that sufficiently and reliably anonymizes data can be tricky.”

Cloud Compliance addresses this issue by extending its robust de-identification capabilities to Sandbox refresh. Essentially, you create a “De-identification Mapping” with a simple declarative wizard and call it from your “SandboxPostCopy” class in four simple steps.


Step 1: Create a de-identification mapping for any Standard or Custom Object in a few clicks with an easy-to-use, metadata-driven interface.





Step 2: Create a new “Data Retention” mapping and reference the “De-identification” mapping created earlier.




Step 3: Once the de-identification mapping is ready, Cloud Compliance will generate the “SandboxPostCopy” code.





Step 4: Add the code to your “SandboxPostCopy” Class, and Cloud Compliance will anonymize Sandboxes on create and refresh.


global class PrepareMySandbox implements SandboxPostCopy {
    global void runApexClass(SandboxContext context) {
        System.debug('Beginning Cloud Compliance Anonymization');

        // Each call below executes Sandbox anonymization
        // for a given object mapping.

        // Anonymize Contacts
        pccc_dm.CloudComplianceSandboxDeIdentify.execute('a0B18000005iCCfEAM');

        // Anonymize Leads
        pccc_dm.CloudComplianceSandboxDeIdentify.execute('a0B18000005iCCkEAM');
    }
}

Now, whenever a Sandbox is refreshed, Salesforce will automatically run this class in post-processing to ensure that customer data is always secure.
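Like any Apex class, the post-copy class needs test coverage before it can be deployed. A minimal sketch of such a test is shown here, using Salesforce’s Test.testSandboxPostCopyScript method to simulate a refresh. The test class name PrepareMySandboxTest and the sandbox name 'MySandbox' are hypothetical, and the sketch assumes the Cloud Compliance package is installed and that its execute call can run inside a test context, which you would need to confirm for your org.

```apex
@isTest
private class PrepareMySandboxTest {
    @isTest
    static void runsPostCopyScript() {
        Test.startTest();
        // Simulates the post-copy step. The current org Id stands in for
        // both the organization Id and the sandbox Id.
        Test.testSandboxPostCopyScript(
            new PrepareMySandbox(),
            UserInfo.getOrganizationId(),
            UserInfo.getOrganizationId(),
            'MySandbox' // hypothetical sandbox name
        );
        Test.stopTest();
        // Add assertions on the anonymized records that fit your mappings.
    }
}
```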


Option 2: Manual/Self-initiated Anonymization and Refresh with Extract → Anonymize → Load


A number of ETL and Archiving vendors offer anonymization capabilities. However, this can require extracting Salesforce data, de-identifying it, and then loading it into Sandboxes.


By design, this requires a separate effort for data refresh and metadata refresh. It also puts the operational onus of when/how to run it on you, the customer.


Finally, most of these tools are archiving and ETL products first, and their native ability to remove attachments, delete field history, and handle other such tasks varies.


Discuss your specific GDPR/CCPA use cases with the author of this article. https://calendly.com/plumcloudlabs/

Conclusion: Anonymized Sandboxes are not only good data hygiene; when combined with other processes, they can be a valuable safeguard against data breaches and help ensure personal data privacy and CCPA/GDPR compliance.
