Cloud Compliance

Advanced Configuration: Dealing with Large Data Volumes (LDV)

Got millions of records in your Salesforce Org? You’ll need to chunk them into smaller data sets.

Example: There are 20 million contacts. It could take tens of hours to mask them, and the masking operation can also encounter SOQL time-outs. The optimal approach is to slice them into five data sets of 4 million records each. All five data sets can then be processed in parallel, which also prevents SOQL time-out exceptions.

DataMasker’s ‘Filter Criteria’ can be used to slice large data sets into smaller ones. A Filter Criteria contains the WHERE clause of a SOQL query.
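For example, here is a minimal sketch of five Filter Criteria that slice the 20 million contacts by creation date. They are written as Apex strings purely for readability; the CreatedDate field and the year boundaries are assumptions, so use any selective (ideally indexed) field that splits your data roughly evenly:

// Illustrative Filter Criteria (SOQL WHERE clauses), one per DataMasker data set.
// CreatedDate and the year boundaries are placeholders, not values from your org.
List<String> filterCriteria = new List<String>{
    'CreatedDate < 2019-01-01T00:00:00Z',
    'CreatedDate >= 2019-01-01T00:00:00Z AND CreatedDate < 2020-01-01T00:00:00Z',
    'CreatedDate >= 2020-01-01T00:00:00Z AND CreatedDate < 2021-01-01T00:00:00Z',
    'CreatedDate >= 2021-01-01T00:00:00Z AND CreatedDate < 2022-01-01T00:00:00Z',
    'CreatedDate >= 2022-01-01T00:00:00Z'
};

Each entry would then be used as the Filter Criteria of one of the five data sets.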

The challenge with using the WHERE clause in this situation is that when you attempt to run the SOQL query on a chunk of records, it can time out. This makes it difficult to gauge the right number of records to mask.

[Screenshot: Error message displaying 'communication failure' due to SOQL query timeout in Salesforce]

Method 1: Use Postman

This is where Postman comes in. Postman is an API platform for building and using APIs. It simplifies each step of the API lifecycle and streamlines collaboration, helping you create better APIs faster.

Below are the steps to use Postman:

Step 1: Refer to the article on downloading Postman and connecting it to your Salesforce Org.

Step 2: Once the connection is made, you can submit simple query batch jobs using the Bulk API.

This makes it easier to get the record count, as the Bulk API job will not time out. For this, refer to the following article, which is linked here. (A sketch of the underlying Bulk API request appears at the end of this method.)

Step 3: Once a job is submitted, go to ‘Navigate’ -> ‘Salesforce Dashboard Setup’ -> ‘Bulk Data Load Jobs.’

[Screenshot: Bulk Data Load Jobs interface showing a successful job submission]

Now, you can see the status of the job and prevent your masking from being impacted by time-outs.

[Screenshot: Salesforce's 'Monitor Bulk Data Load Jobs' page, highlighting a completed job]
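As a point of reference, Step 2 boils down to submitting a Bulk API query job. The sketch below shows one way to do that (the Bulk API 2.0 query endpoint) as an anonymous Apex callout, so the endpoint and JSON body you would configure in Postman are concrete. The API version (v58.0) is an assumption, the linked article may use a different flavor of the Bulk API, and running this from Apex may require a Remote Site Setting or Named Credential for your org's domain:

// Sketch only: the same Bulk API 2.0 query-job POST you would send from Postman.
HttpRequest req = new HttpRequest();
req.setEndpoint(URL.getOrgDomainUrl().toExternalForm() + '/services/data/v58.0/jobs/query');
req.setMethod('POST');
req.setHeader('Authorization', 'Bearer ' + UserInfo.getSessionId());
req.setHeader('Content-Type', 'application/json');
req.setBody('{"operation":"query","query":"select id from Contact"}');
HttpResponse res = new Http().send(req);
System.debug(res.getBody()); // the response contains the job Id and its current state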

Method 2: Use Workbench

As noted above, SOQL time-outs make it difficult to gauge the right number of records to mask. Here, Workbench comes to the rescue.

Workbench is a free, web-based suite of tools that lets Salesforce administrators and developers interact with their orgs through the Force.com APIs, including running SOQL queries and executing anonymous Apex.

Enable the ability to execute Bulk API queries through an exposed DataMasker API

pcldm.DM_DataMaskingService.createBulkQueryJob('____');

For example:

pcldm.DM_DataMaskingService.createBulkQueryJob('select id from Contact limit 3000000');
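You can also include the WHERE clause of a planned Filter Criteria to check how many records fall into a single slice before masking it; the CreatedDate boundary below is an assumption used only for illustration:

// Illustrative: submit a Bulk API query job for one planned slice only.
// The WHERE clause should match the Filter Criteria you intend to configure in DataMasker.
pcldm.DM_DataMaskingService.createBulkQueryJob('select id from Contact where CreatedDate < 2020-01-01T00:00:00Z');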

Go to Workbench, log in with your Salesforce org, and click on ‘Apex Execute.’


Click on ‘Execute’.

[Screenshot: Entering Apex code for the bulk query job in Salesforce Workbench under the Apex Execute tab]

After the code executes, Workbench displays the result.

[Screenshot: Result of executing Apex code in Salesforce Workbench, showing successful execution]

The user then has to check the job in ‘Bulk Data Load Jobs.’

[Screenshot: Salesforce Bulk Data Load Jobs screen showing successful job completion for 3 million Contact records]

Method 3: DataMasker API

Enable the ability to execute Bulk API queries through an exposed DataMasker API via the Developer Console

pcldm.DM_DataMaskingService.createBulkQueryJob('select id from employee__c limit 2000000');

Click on ‘Execute’.

[Screenshot: Apex code for creating a bulk query job, with 'select id from employee__c limit 2000000' highlighted]
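If you have already drafted your Filter Criteria, you can also submit one query job per slice in a single run, so each slice's record count appears as its own entry under ‘Bulk Data Load Jobs.’ A sketch, assuming two CreatedDate-based slices on employee__c:

// Illustrative only: one Bulk API query job per planned slice.
// Replace the WHERE clauses with the Filter Criteria you plan to use in DataMasker.
for (String criterion : new List<String>{
        'CreatedDate < 2021-01-01T00:00:00Z',
        'CreatedDate >= 2021-01-01T00:00:00Z'
}) {
    pcldm.DM_DataMaskingService.createBulkQueryJob('select id from employee__c where ' + criterion);
}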

The user then has to check the jobs in ‘Bulk Data Load Jobs.’

[Screenshot: Setup showing Bulk Data Load Jobs with a highlighted job processing 3,000,000 Contact records]

Need Help?

If you have questions about this documentation, please contact our support team.