Technical Deep Dive

How Sandbox DataMasker Works

Field-level rules. Object relationship integrity. Email suppression. Automatic execution on every sandbox refresh. Here is exactly what happens during a DataMasker run.

The Masking Pipeline

From sandbox refresh trigger to developer access

1

Trigger

CI/CD or sandbox refresh

A sandbox create, refresh, or CI/CD post-deployment hook triggers DataMasker automatically. No manual step or ticket required.

2

Sandbox Refresh

Salesforce copies production data

Salesforce copies production records into the sandbox org. DataMasker waits for the refresh to complete before beginning masking; users are blocked from access during this window.

3

Data Masking

PII replaced, automations muted

DataMasker executes all configured masking rules across every in-scope object simultaneously, using Batch Apex jobs.

  • Mask or delete sensitive data (names, SSNs, email addresses, phone numbers, financial amounts)
  • Remove files, Chatter posts, tasks, and other content records
  • Mute automations: Flows, Process Builders, and Apex triggers are suppressed to prevent outbound emails or callouts
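
The batch-style processing described above can be pictured as splitting an object's records into fixed-size chunks and applying a masking rule to each chunk. The Python sketch below is illustrative only: DataMasker itself runs as Apex, and the chunk size of 200 (the default Batch Apex scope size), the `SSN__c` field, and the `mask_record` helper are assumptions, not DataMasker internals.

```python
from typing import Callable, Iterator


def chunks(records: list[dict], size: int = 200) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def mask_record(record: dict) -> dict:
    """Hypothetical rule: replace the SSN field with a fixed placeholder."""
    return {**record, "SSN__c": "000-00-0000"}


def run_batches(records: list[dict], mask: Callable[[dict], dict], size: int = 200):
    """Apply `mask` chunk by chunk; return (masked_records, batch_count)."""
    masked, batch_count = [], 0
    for batch in chunks(records, size):
        masked.extend(mask(r) for r in batch)
        batch_count += 1
    return masked, batch_count
```

With 450 input records and the default chunk size, `run_batches` processes three chunks (200, 200, and 50 records).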

4

Post-Refresh Automation

Environment configured for safe use

After masking completes, DataMasker runs environment cleanup steps that make the sandbox functional without PII risk.

  • Update Remote Site Settings and custom settings to point to non-production endpoints
  • Update User email addresses, removing the .invalid suffix that Salesforce appends on refresh
  • Update other settings and metadata to match sandbox configuration
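
Two of the cleanup steps above are simple string transformations, sketched here in Python for illustration. The function names and the host mapping are hypothetical; DataMasker's own implementation runs as Apex and is not shown in this page.

```python
from urllib.parse import urlsplit, urlunsplit


def restore_sandbox_email(email: str) -> str:
    """Remove a trailing '.invalid' suffix, if present."""
    suffix = ".invalid"
    return email[: -len(suffix)] if email.endswith(suffix) else email


def repoint_endpoint(url: str, host_map: dict[str, str]) -> str:
    """Swap a production host for its non-production counterpart.

    `host_map` is an assumed lookup of production host -> sandbox host;
    URLs whose host is not in the map are returned unchanged.
    """
    parts = urlsplit(url)
    new_host = host_map.get(parts.netloc, parts.netloc)
    return urlunsplit(parts._replace(netloc=new_host))
```

For example, `restore_sandbox_email("dev@example.com.invalid")` yields `dev@example.com`, and an address without the suffix passes through untouched.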

5

Ready for Use

Users notified, sandbox unlocked

Developers, QA engineers, and contractors receive access notifications. The sandbox contains realistic-format data with no real personal information.
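
The five stages above run strictly in order, and user access opens only at the end. A minimal Python sketch of that ordering follows; the `Stage` and `Pipeline` names are hypothetical and exist only to illustrate the gating, not to describe DataMasker's code.

```python
from enum import IntEnum


class Stage(IntEnum):
    TRIGGER = 1
    SANDBOX_REFRESH = 2
    DATA_MASKING = 3
    POST_REFRESH_AUTOMATION = 4
    READY_FOR_USE = 5


class Pipeline:
    """Tracks completed stages and refuses out-of-order completion."""

    def __init__(self) -> None:
        self.completed: list[Stage] = []

    def complete(self, stage: Stage) -> None:
        if len(self.completed) == len(Stage):
            raise ValueError("pipeline already finished")
        expected = Stage(len(self.completed) + 1)
        if stage is not expected:
            raise ValueError(f"expected {expected.name}, got {stage.name}")
        self.completed.append(stage)

    @property
    def users_can_log_in(self) -> bool:
        # Access opens only after every earlier stage, including masking, is done.
        return Stage.READY_FOR_USE in self.completed
```

Because `complete` rejects any stage other than the next expected one, `users_can_log_in` cannot become true while masking or post-refresh automation is still pending.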

Under the Hood

What DataMasker does in each phase

01

Configure field-level masking rules

Using DataMasker's point-and-click rule builder, you configure masking instructions for every field that contains personal data. Standard Salesforce objects come with pre-built templates: Contact, Lead, Account, Case, and Activity fields are pre-mapped. Custom objects and custom fields are added manually or discovered via the Personal Data Discovery integration.

Masking types available: realistic name substitution, email redirect to safe domain, phone number format replacement, SSN/ID format masking, date range shuffling, financial amount range replacement, full anonymization, and custom regex patterns.
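
To make a few of these masking types concrete, here is a small Python sketch of email redirection, format-preserving digit replacement (usable for phone numbers and SSNs), and amount range replacement. It is an illustration under stated assumptions: the safe domain, the function names, and the use of a seeded random generator are hypothetical and do not describe DataMasker's actual algorithms.

```python
import random
import re

SAFE_DOMAIN = "example.invalid"  # assumed non-routable domain for masked mail


def redirect_email(record_id: str) -> str:
    """Email redirect: build a unique address on the safe domain from the record id."""
    return f"masked+{record_id}@{SAFE_DOMAIN}"


def replace_digits(value: str, rng: random.Random) -> str:
    """Format-preserving replacement: redraw each digit, keep punctuation and spacing."""
    return re.sub(r"\d", lambda _: str(rng.randint(0, 9)), value)


def replace_amount(low: float, high: float, rng: random.Random) -> float:
    """Range replacement: draw a new amount between `low` and `high`."""
    return round(rng.uniform(low, high), 2)
```

Passing `"(415) 555-0132"` to `replace_digits` returns a value with the same parentheses, space, and hyphen but different digits drawn at random, so validation rules that check the format still pass.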

02

Rules execute automatically on sandbox refresh

When a sandbox refresh completes, DataMasker's post-refresh automation triggers immediately, before users are granted access to the refreshed environment. No manual step. No post-refresh script. No ticket required.

The timing matters: users cannot access the sandbox until masking is complete. This eliminates the exposure window between refresh completion and masking execution that exists with manual processes.

03

Relational integrity maintained across objects

DataMasker masks related objects as a consistent set. When a Contact's name is replaced with 'James Richardson', every related Activity, Case, and custom record that stores that name is updated with the same substitute. Queries that join across objects return coherent results.

This is critical for testing scenarios involving cross-object workflows, roll-up summary fields, and Salesforce Flow automation. The masked environment behaves identically to production.
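
One common way to get this kind of consistency is deterministic substitution: derive the replacement from a keyed hash of the original value, so the same input always maps to the same output wherever it appears. The Python sketch below illustrates that general technique only. This page does not state how DataMasker achieves consistency, and the name lists, secret key, and function name are all hypothetical.

```python
import hashlib
import hmac

FIRST_NAMES = ["James", "Maria", "Wei", "Aisha", "Lucas", "Priya"]
LAST_NAMES = ["Richardson", "Okafor", "Lindgren", "Tanaka", "Moreau", "Castillo"]


def substitute_name(original: str, secret: bytes) -> str:
    """Map a real name to a stable substitute using HMAC-SHA256.

    The same (original, secret) pair always yields the same substitute,
    so copies of the name on related records stay in sync.
    """
    digest = hmac.new(secret, original.encode("utf-8"), hashlib.sha256).digest()
    first = FIRST_NAMES[digest[0] % len(FIRST_NAMES)]
    last = LAST_NAMES[digest[1] % len(LAST_NAMES)]
    return f"{first} {last}"


def mask_field(records: list[dict], field: str, secret: bytes) -> list[dict]:
    """Return copies of `records` with `field` replaced by its substitute."""
    return [{**r, field: substitute_name(r[field], secret)} for r in records]
```

Masking a Contact's `Name` and a Case's stored contact name with the same secret produces the same substitute on both records. With lists this short, different real names can collide on one substitute; a production scheme would need a much larger name pool.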

04

Email and callout suppression during masking

Salesforce automations (Process Builders, Flows, Apex triggers) fire when records are updated. During masking, these automations would send real emails to real customers if not suppressed. DataMasker mutes outbound email and external callouts for the duration of the masking run.

This prevents the scenario where a masking job triggers a 'Your account has been updated' email to 50,000 real customers because a Flow fired on a masked Contact update.
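
The "mute for the duration of the run" behavior maps naturally onto a scoped guard: record the current settings, switch them off, run the job, then restore them even if the job fails. The Python sketch below shows that pattern. The `OrgSettings` class and its attributes are hypothetical stand-ins for org configuration, and the mechanism DataMasker actually uses is not described in this page.

```python
from contextlib import contextmanager


class OrgSettings:
    """Hypothetical stand-in for the org settings relevant to masking."""

    def __init__(self) -> None:
        self.email_deliverability = "System email only"
        self.automations_enabled = True


@contextmanager
def automations_muted(org: OrgSettings):
    """Disable outbound email and automations, restoring prior values on exit."""
    previous = (org.email_deliverability, org.automations_enabled)
    org.email_deliverability = "No access"
    org.automations_enabled = False
    try:
        yield org
    finally:
        # Runs on success and on failure, so a crashed job cannot leave the org muted.
        org.email_deliverability, org.automations_enabled = previous
```

A masking job would then run inside `with automations_muted(org): ...`, so every record update made by the job happens while email and automations are off.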

05

Sandbox available with masked data

After masking completes, developers and contractors gain access to a sandbox with realistic-format data that maintains all the structural properties of production (same volume, same relationships, same field distributions) but with no real personal data present.

Agentforce and Einstein AI models trained in the masked sandbox receive data with the same statistical distributions as production, preserving model accuracy without exposing real personal information.

Architecture

100% inside your Salesforce org

DataMasker is a managed package. Every component (configuration UI, masking logic, and APIs) runs inside your Salesforce org as native Apex. No external infrastructure. No data leaves your environment.

Salesforce Org

DataMasker Managed Package

Mapping and Configuration UI

Lightning / Visualforce

Declarative rule builder for configuring masking instructions per field, per object. No Apex required for standard configurations.

Masking and Deletion Logic

Custom Objects + Custom Settings

Masking rules, object mappings, and job configurations are stored natively in Salesforce Custom Objects and Custom Settings: queryable, auditable, and backed up with your org metadata.
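
Because rules live as ordinary records, they can be read and grouped like any other data. The Python sketch below models a rule as one record per object and field, then groups rules by object, roughly what a query over a rules table would return. The `MaskingRule` shape and the sample rules are assumptions for illustration, not DataMasker's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingRule:
    """Hypothetical rule record: one masking instruction per object field."""
    object_name: str
    field_name: str
    masking_type: str


RULES = [
    MaskingRule("Contact", "LastName", "realistic name substitution"),
    MaskingRule("Contact", "Email", "email redirect to safe domain"),
    MaskingRule("Lead", "Phone", "phone number format replacement"),
]


def rules_by_object(rules: list[MaskingRule]) -> dict[str, list[MaskingRule]]:
    """Group rules by the object they apply to, preserving input order."""
    grouped: dict[str, list[MaskingRule]] = defaultdict(list)
    for rule in rules:
        grouped[rule.object_name].append(rule)
    return dict(grouped)
```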

Automation Processes + APIs

Apex

All masking execution, scheduling, and DevOps API endpoints run as managed Apex code within your org. No external compute. No outbound data movement.

1

Salesforce Tooling API and Batch interfaces used with managed Apex code for masking process execution and automation

2

Salesforce Metadata API used for masking configuration mapping alongside managed code

3

REST API endpoints built with managed Apex using Salesforce standard patterns, callable from Copado, Gearset, GitLab CI, and any CI/CD tool
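
A CI/CD job could call such an endpoint with a plain authenticated HTTP request. The Python sketch below only builds the request. Custom Apex REST services are served under `/services/apexrest/` on the org's instance URL, but the `datamasker/v1/jobs` path, the `jobName` payload field, and the function name are hypothetical; consult the DataMasker API documentation for the real contract.

```python
import json
import urllib.request


def build_masking_request(
    instance_url: str, access_token: str, job_name: str
) -> urllib.request.Request:
    """Build (but do not send) a POST that asks the org to start a masking job."""
    # Hypothetical endpoint path under the standard Apex REST prefix.
    url = f"{instance_url.rstrip('/')}/services/apexrest/datamasker/v1/jobs"
    body = json.dumps({"jobName": job_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

A pipeline step would pass the result to `urllib.request.urlopen` and fail the build on a non-2xx response, using an OAuth access token issued for the sandbox org.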

Your Options

DataMasker vs the alternatives

How DataMasker compares to Salesforce Data Mask, DIY Apex, and external ETL tools.

Dimension | DataMasker | Salesforce Data Mask | DIY (custom Apex) | External ETL
Data security | Secure: data never leaves Salesforce | Secure: data never leaves Salesforce | Uncertain: depends on implementation | Not secure: data leaves Salesforce
Development effort | Hours: declarative + minimal Apex | Hours: declarative + minimal Apex | Weeks/months: all custom Apex | Days/weeks: tool-dependent
Maintenance effort | Hours: declarative config changes | Hours: declarative | Days: custom code updates | Hours to weeks: tool-dependent
Performance | Up to 1M records/object/hour | Up to 200K records/object/hour | Uncertain | 10K to 10M+ records/hour (tool-dependent)
Cost | Free tier + paid at 5% of ACV | 10% of Annual Contract Value | Development + ongoing maintenance | Licensing + dev + maintenance
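
The per-object throughput figures in the comparison translate into rough best-case run times. The short Python calculation below works one example; the 12-million-record object is a hypothetical size, and because both rates are quoted as "up to", the results are lower bounds rather than predictions.

```python
def hours_to_mask(record_count: int, records_per_hour: int) -> float:
    """Best-case hours to mask one object at a constant throughput."""
    return record_count / records_per_hour


LARGEST_OBJECT = 12_000_000  # hypothetical record count for one object

datamasker_hours = hours_to_mask(LARGEST_OBJECT, 1_000_000)  # 12.0 hours
data_mask_hours = hours_to_mask(LARGEST_OBJECT, 200_000)     # 60.0 hours
ratio = data_mask_hours / datamasker_hours                   # 5.0
```

The ratio of 5 matches the 1M versus 200K records/object/hour figures quoted for the two products.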

Technical Specifications

Processing speed: Up to 1M records/object/hour
Maximum tested: 99M records (24 hours)
Salesforce Data Mask: 200K records/object/hour (5x slower)
Sandbox types: Full Copy, Partial Copy, Developer
Object types: All standard + custom Salesforce objects
Formula fields: Supported (unlike Salesforce Data Mask)
Picklist fields: Supported
Email suppression: Yes (during masking run)
External callout suppression: Yes (during masking run)
DevOps API: REST API: Copado, Gearset, Flosum, GitLab
Deployment method: Managed package via AppExchange
Implementation time: 3 weeks typical

See DataMasker run on a live sandbox refresh

30-minute technical demo. We walk through field-level masking rules, relational integrity, and DevOps integration on a real Salesforce org.