Technical Deep Dive
How Sandbox DataMasker Works
Field-level rules. Object relationship integrity. Email suppression. Automatic execution on every sandbox refresh. Here is exactly what happens during a DataMasker run.
The Masking Pipeline
From sandbox refresh trigger to developer access
Trigger
CI/CD or sandbox refresh
A sandbox create, refresh, or CI/CD post-deployment hook triggers DataMasker automatically. No manual step or ticket required.
Sandbox Refresh
Salesforce copies production data
Salesforce copies production records into the sandbox org. DataMasker waits for the refresh to complete before beginning masking; users are blocked from access during this window.
Data Masking
PII replaced, automations muted
DataMasker executes all configured masking rules across the in-scope objects in parallel, using Batch Apex jobs.
- Mask or delete sensitive data (names, SSNs, email addresses, phone numbers, financial amounts)
- Remove files, Chatter posts, tasks, and other content records
- Mute automations: Flows, Process Builders, and Apex triggers are suppressed to prevent outbound emails or callouts
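The batch execution described above can be pictured as splitting an object's records into fixed-size chunks and masking one chunk at a time. The following is a minimal sketch of that idea in Python, not DataMasker's actual Apex code: `chunk_records`, `mask_in_batches`, and the `mask_chunk` callback are hypothetical names. The chunk size of 200 mirrors the default scope size of Batch Apex.

```python
from typing import Callable, Iterator, List, Sequence

# Batch Apex hands records to a job in chunks ("scopes"); 200 is the platform default.
DEFAULT_SCOPE_SIZE = 200


def chunk_records(record_ids: Sequence[str],
                  scope_size: int = DEFAULT_SCOPE_SIZE) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most scope_size record IDs."""
    if scope_size < 1:
        raise ValueError("scope_size must be positive")
    for start in range(0, len(record_ids), scope_size):
        yield list(record_ids[start:start + scope_size])


def mask_in_batches(record_ids: Sequence[str],
                    mask_chunk: Callable[[List[str]], None],
                    scope_size: int = DEFAULT_SCOPE_SIZE) -> int:
    """Apply mask_chunk (a hypothetical masking callback) to each chunk.

    Returns the number of chunks processed.
    """
    batches = 0
    for chunk in chunk_records(record_ids, scope_size):
        mask_chunk(chunk)
        batches += 1
    return batches
```

Processing in chunks keeps each unit of work small enough to stay within per-transaction platform limits, which is the usual reason to reach for batch jobs on large objects.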
Post-Refresh Automation
Environment configured for safe use
After masking completes, DataMasker runs environment cleanup steps that make the sandbox functional without PII risk.
- Update Remote Site Settings and custom settings to point to non-production endpoints
- Update User email addresses, removing the .invalid suffix that Salesforce appends on refresh
- Update other settings and metadata to match sandbox configuration
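As an illustration of the User email step above: on refresh, Salesforce appends `.invalid` to user email addresses in the sandbox (for example `dev@example.com.invalid`). A minimal sketch of the reversal, assuming a hypothetical helper name `restore_sandbox_email`, could look like this:

```python
INVALID_SUFFIX = ".invalid"


def restore_sandbox_email(email: str) -> str:
    """Strip the trailing '.invalid' that a sandbox refresh appends.

    Addresses without the suffix are returned unchanged.
    """
    if email.lower().endswith(INVALID_SUFFIX):
        return email[: -len(INVALID_SUFFIX)]
    return email
```

This should only ever be applied to internal User records, never to masked customer email fields.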
Ready for Use
Users notified, sandbox unlocked
Developers, QA engineers, and contractors receive access notifications. The sandbox contains realistic-format data with no real personal information.
Under the Hood
What DataMasker does in each phase
Configure field-level masking rules
Using DataMasker's point-and-click rule builder, you configure masking instructions for every field that contains personal data. Standard Salesforce objects come with pre-built templates: Contact, Lead, Account, Case, and Activity fields are pre-mapped. Custom objects and custom fields are added manually or discovered via the Personal Data Discovery integration.
Masking types available: realistic name substitution, email redirect to safe domain, phone number format replacement, SSN/ID format masking, date range shuffling, financial amount range replacement, full anonymization, and custom regex patterns.
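To make two of the masking types listed above concrete, here is a minimal Python sketch of an email redirect and a format-preserving digit replacement (usable for phone numbers or SSN-style identifiers). The function names and the `example.com` safe domain are assumptions for illustration; they do not describe DataMasker's internal implementation.

```python
import random


def redirect_email(email: str, safe_domain: str = "example.com") -> str:
    """Keep the local part but point the address at a safe domain.

    Note: the local part can itself be personal data (e.g. 'jane.doe'),
    so a stricter rule would replace it as well.
    """
    local, _, _ = email.partition("@")
    return f"{local}@{safe_domain}"


def replace_digits(value: str, rng: random.Random) -> str:
    """Replace every digit with a random digit, keeping length and punctuation.

    '(415) 555-0132' keeps its '(xxx) xxx-xxxx' shape; '123-45-6789'
    keeps its 'xxx-xx-xxxx' shape.
    """
    return "".join(
        str(rng.randrange(10)) if ch.isdigit() else ch
        for ch in value
    )
```

Passing in a `random.Random` instance rather than using the module-level generator makes a run reproducible when a seed is supplied, which helps when debugging a rule.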
Rules execute automatically on sandbox refresh
When a sandbox refresh completes, DataMasker's post-refresh automation triggers immediately, before users are granted access to the refreshed environment. No manual step. No post-refresh script. No ticket required.
The timing matters: users cannot access the sandbox until masking is complete. This eliminates the exposure window between refresh completion and masking execution that exists with manual processes.
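The ordering guarantee described above amounts to a small state machine in which login is only possible in the final state. The sketch below is illustrative only; `Stage` and `SandboxGate` are hypothetical names, and the stages simply mirror the pipeline steps on this page.

```python
from enum import Enum


class Stage(Enum):
    REFRESHING = 1    # Salesforce is copying production data
    MASKING = 2       # masking rules are running
    POST_REFRESH = 3  # endpoints and settings are being updated
    READY = 4         # users may log in


class SandboxGate:
    """Tracks the pipeline stage and only admits users once it is READY."""

    def __init__(self) -> None:
        self.stage = Stage.REFRESHING

    def advance(self) -> None:
        """Move to the next stage; stages cannot be skipped."""
        if self.stage is Stage.READY:
            raise RuntimeError("pipeline already complete")
        self.stage = Stage(self.stage.value + 1)

    def can_login(self) -> bool:
        return self.stage is Stage.READY
```

Because `advance` only ever moves one step forward, there is no path to `READY` that bypasses `MASKING`.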
Relational integrity maintained across objects
DataMasker masks related objects as a consistent set. When a Contact's name is replaced with 'James Richardson', every related Activity, Case, and custom record that stores that name is updated with the same substitute. Queries that join across objects return coherent results.
This is critical for testing scenarios involving cross-object workflows, roll-up summary fields, and Salesforce Flow automation: related records in the masked environment still line up the way they do in production.
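One common way to get this kind of cross-object consistency is deterministic substitution: the replacement is derived from the original value with a keyed hash, so the same input always yields the same substitute wherever it appears. The sketch below shows that technique under stated assumptions. It is not a description of how DataMasker does it; the name lists, `substitute_name`, `mask_field`, and the `Contact_Name__c` field are all hypothetical.

```python
import hashlib
import hmac
from typing import Dict

# Small illustrative pools; a real rule would draw from much larger lists.
FIRST_NAMES = ["James", "Olivia", "Daniel", "Priya", "Marcus", "Elena"]
LAST_NAMES = ["Richardson", "Bennett", "Okafor", "Lindqvist", "Moreau", "Tanaka"]


def substitute_name(original: str, secret: str) -> str:
    """Map an original name to a substitute, deterministically per secret."""
    digest = hmac.new(secret.encode("utf-8"),
                      original.encode("utf-8"),
                      hashlib.sha256).digest()
    first = FIRST_NAMES[digest[0] % len(FIRST_NAMES)]
    last = LAST_NAMES[digest[1] % len(LAST_NAMES)]
    return f"{first} {last}"


def mask_field(record: Dict[str, str], field: str, secret: str) -> Dict[str, str]:
    """Return a copy of record with one name field substituted."""
    masked = dict(record)
    masked[field] = substitute_name(record[field], secret)
    return masked
```

With the same secret, a Contact's `Name` and a Case's copy of that name are masked to the same value without the two records being processed together. An alternative design is a lookup table keyed by record Id, which avoids giving two different people with the same name the same substitute.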
Email and callout suppression during masking
Salesforce automations (Process Builders, Flows, Apex triggers) fire when records are updated. During masking, these automations would send real emails to real customers if not suppressed. DataMasker mutes outbound email and external callouts for the duration of the masking run.
This prevents the scenario where a masking job triggers a 'Your account has been updated' email to 50,000 real customers because a Flow fired on a masked Contact update.
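The suppression described above can be modeled as a bypass flag that every automation checks before acting, switched on for the duration of the masking run and restored afterwards. The following Python sketch illustrates the pattern only; `AutomationSwitch` and `on_contact_update` are hypothetical stand-ins for a mute setting and a Flow or trigger, not DataMasker components.

```python
from contextlib import contextmanager
from typing import Iterator, List


class AutomationSwitch:
    """Holds the mute flag and records emails that would have been sent."""

    def __init__(self) -> None:
        self.muted = False
        self.outbox: List[str] = []

    @contextmanager
    def mute(self) -> Iterator[None]:
        """Mute automations inside the with-block, then restore the prior state."""
        previous = self.muted
        self.muted = True
        try:
            yield
        finally:
            # Restored even if the masking job raises part-way through.
            self.muted = previous


def on_contact_update(switch: AutomationSwitch, contact_email: str) -> None:
    """Stand-in for a Flow or trigger that emails the customer on update."""
    if switch.muted:
        return
    switch.outbox.append(contact_email)
```

Restoring the flag in a `finally` block matters: a masking job that fails midway should not leave automations permanently disabled in the sandbox.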
Sandbox available with masked data
After masking completes, developers and contractors gain access to a sandbox with realistic-format data that maintains all the structural properties of production (same volume, same relationships, same field distributions) but with no real personal data present.
Agentforce and Einstein AI models trained in the masked sandbox receive data with the same statistical distributions as production, preserving model accuracy without exposing real personal information.
Architecture
100% inside your Salesforce org
DataMasker is a managed package. Every component (configuration UI, masking logic, and APIs) runs natively inside your Salesforce org. No external infrastructure. No data leaves your environment.
Salesforce Org
DataMasker Managed Package
Mapping and Configuration UI
Lightning / Visualforce
Declarative rule builder for configuring masking instructions per field, per object. No Apex required for standard configurations.
Masking and Deletion Logic
Custom Objects + Custom Settings
Masking rules, object mappings, and job configurations stored natively in Salesforce Custom Objects and Custom Settings: queryable, auditable, and backed up with your org metadata.
Automation Processes + APIs
Apex
All masking execution, scheduling, and DevOps API endpoints run as managed Apex code within your org. No external compute. No outbound data movement.
- Salesforce Tooling API and Batch interfaces are used with managed Apex code for masking process execution and automation
- Salesforce Metadata API is used for masking configuration mapping alongside managed code
- REST API endpoints are built with managed Apex using Salesforce standard patterns, callable from Copado, Gearset, GitLab CI, and any CI/CD tool
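For a sense of what a CI/CD call to such an endpoint might look like, here is a minimal Python sketch that builds (but does not send) an authenticated request. Apex REST services are exposed under `/services/apexrest/`; everything after that prefix, including the `datamasker/v1/run` path and the `configuration` body field, is a made-up placeholder. Consult the product's API documentation for the real endpoint and payload.

```python
import json
import urllib.request


def build_masking_request(instance_url: str,
                          access_token: str,
                          config_name: str) -> urllib.request.Request:
    """Build a POST request that would ask the org to start a masking run."""
    # HYPOTHETICAL path: only the /services/apexrest/ prefix is standard.
    url = f"{instance_url.rstrip('/')}/services/apexrest/datamasker/v1/run"
    # HYPOTHETICAL payload shape.
    body = json.dumps({"configuration": config_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

In a pipeline job, the request would be sent with `urllib.request.urlopen(request)` after obtaining an OAuth access token for the sandbox, with the token supplied from the CI tool's secret store rather than hard-coded.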
Your Options
DataMasker vs the alternatives
How DataMasker compares to Salesforce Data Mask, DIY Apex, and external ETL tools.
| Dimension | DataMasker | Salesforce Data Mask | DIY (custom Apex) | External ETL |
|---|---|---|---|---|
| Data security | Secure: data never leaves Salesforce | Secure: data never leaves Salesforce | Uncertain: depends on implementation | Not secure: data leaves Salesforce |
| Development effort | Hours: declarative + minimal Apex | Hours: declarative + minimal Apex | Weeks/months: all custom Apex | Days/weeks: tool-dependent |
| Maintenance effort | Hours: declarative config changes | Hours: declarative | Days: custom code updates | Hours to weeks: tool-dependent |
| Performance | Up to 1M records/object/hour | Up to 200K records/object/hour | Uncertain | 10K to 10M+ records/hour (tool-dependent) |
| Cost | Free tier + paid at 5% of ACV | 10% of Annual Contract Value | Development + ongoing maintenance | Licensing + dev + maintenance |
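The performance row above can be turned into a rough planning estimate. Because both figures are quoted as "up to", the result is a best-case lower bound on masking time for a single object, not a prediction. The helper name `min_hours` is an assumption for illustration.

```python
import math

# Peak rates quoted in the comparison table (records per object per hour).
DATAMASKER_RATE = 1_000_000
DATA_MASK_RATE = 200_000


def min_hours(record_count: int, records_per_hour: int) -> int:
    """Best-case whole hours needed to mask one object at the quoted peak rate."""
    if records_per_hour <= 0:
        raise ValueError("records_per_hour must be positive")
    return math.ceil(record_count / records_per_hour)
```

For a 5,000,000-record object, `min_hours(5_000_000, DATAMASKER_RATE)` gives 5 hours and `min_hours(5_000_000, DATA_MASK_RATE)` gives 25 hours, which is the 5x ratio between the two quoted rates. Actual run times depend on field count, automation load, and org limits.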
Technical Specifications
| Specification | Value |
|---|---|
| Processing speed | Up to 1M records/object/hour |
| Maximum tested | 99M records (24 hours) |
| Salesforce Data Mask (for comparison) | 200K records/object/hour (5x slower) |
| Sandbox types | Full Copy, Partial Copy, Developer |
| Object types | All standard + custom Salesforce objects |
| Formula fields | Supported (unlike Salesforce Data Mask) |
| Picklist fields | Supported |
| Email suppression | Yes (during masking run) |
| External callout suppression | Yes (during masking run) |
| DevOps API | REST API: Copado, Gearset, Flosum, GitLab |
| Deployment method | Managed package via AppExchange |
| Implementation time | 3 weeks typical |
See DataMasker run on a live sandbox refresh
30-minute technical demo. We walk through field-level masking rules, relational integrity, and DevOps integration on a real Salesforce org.