
Salesforce Connector

The Salesforce connector provides full bidirectional CRM integration -- OAuth authentication, bulk and incremental sync of Contacts, Accounts, and Opportunities, configurable field mapping with custom field discovery, a three-stage data quality pipeline, and score writeback.

Prerequisites

Before connecting Salesforce, you need:

  1. A Salesforce Connected App with OAuth 2.0 enabled
  2. The api and refresh_token OAuth scopes granted
  3. The environment variables listed below configured in your deployment

Environment Variables

| Variable | Description |
| --- | --- |
| SALESFORCE_CLIENT_ID | OAuth Connected App client ID |
| SALESFORCE_CLIENT_SECRET | OAuth Connected App client secret |
| SALESFORCE_CALLBACK_URL | OAuth redirect URI (e.g., https://app.example.com/api/connectors/salesforce/callback) |
| CONNECTOR_ENCRYPTION_KEY | 32-byte hex string (64 hex characters) for AES-256-GCM token encryption |

Authentication

The Salesforce connector uses the OAuth 2.0 Web Server Flow with CSRF-safe state parameters.

OAuth Flow

CSRF Protection

The OAuth state parameter is signed with HMAC-SHA256 using CLERK_SECRET_KEY. The payload format is tenantId:connectorId, and verification uses constant-time buffer comparison to prevent timing attacks.
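The signing and verification described above can be sketched roughly as follows. This is a minimal illustration using Node's built-in crypto module, not the connector's actual code: the helper names signState and verifyState, the "payload.signature" wire format, and the hex encoding of the signature are all assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: signs "tenantId:connectorId" and appends the hex signature.
// The "payload.signature" layout is an assumption made for this sketch.
function signState(tenantId: string, connectorId: string, secret: string): string {
  const payload = `${tenantId}:${connectorId}`;
  const signature = createHmac("sha256", secret).update(payload).digest("hex");
  return `${payload}.${signature}`;
}

// Hypothetical helper: returns the parsed payload, or null if verification fails.
function verifyState(
  state: string,
  secret: string,
): { tenantId: string; connectorId: string } | null {
  const dot = state.lastIndexOf(".");
  if (dot < 0) return null;
  const payload = state.slice(0, dot);
  const given = Buffer.from(state.slice(dot + 1), "hex");
  const expected = createHmac("sha256", secret).update(payload).digest();
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  const sep = payload.indexOf(":");
  if (sep < 0) return null;
  return { tenantId: payload.slice(0, sep), connectorId: payload.slice(sep + 1) };
}
```

Comparing raw digest buffers with timingSafeEqual, rather than comparing strings with ===, is what keeps the check constant-time.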

Token Encryption

All Salesforce tokens are encrypted at rest using AES-256-GCM:

  • Algorithm: aes-256-gcm
  • IV length: 12 bytes (randomly generated per encryption)
  • Auth tag length: 16 bytes
  • Storage format: base64(iv + tag + ciphertext)
  • Key: CONNECTOR_ENCRYPTION_KEY environment variable (32-byte hex)
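As an illustration of the storage format listed above, the following sketch encrypts and decrypts a token with Node's built-in crypto module. The function names (parseKey, encryptToken, decryptToken) are hypothetical and the connector's real implementation may differ; only the algorithm, lengths, and base64(iv + tag + ciphertext) layout come from the list.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_LENGTH = 12;  // bytes, random per encryption
const TAG_LENGTH = 16; // bytes, GCM auth tag

// Hypothetical helper: validates the 64-character hex key and returns 32 raw bytes.
function parseKey(hexKey: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hexKey)) {
    throw new Error("CONNECTOR_ENCRYPTION_KEY must be exactly 64 hex characters");
  }
  return Buffer.from(hexKey, "hex");
}

// Produces base64(iv + tag + ciphertext).
function encryptToken(plaintext: string, hexKey: string): string {
  const iv = randomBytes(IV_LENGTH);
  const cipher = createCipheriv("aes-256-gcm", parseKey(hexKey), iv, {
    authTagLength: TAG_LENGTH,
  });
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString("base64");
}

// Reverses encryptToken; final() throws if the tag does not match the data.
function decryptToken(encoded: string, hexKey: string): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, IV_LENGTH);
  const tag = raw.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = raw.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = createDecipheriv("aes-256-gcm", parseKey(hexKey), iv, {
    authTagLength: TAG_LENGTH,
  });
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because the IV is random, encrypting the same token twice yields different stored values, while a modified ciphertext or tag fails authentication on decrypt.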

Token refresh is automatic -- when an access token expires (default 2-hour TTL), the connector transparently uses the refresh token to obtain a new one.

Key management

The CONNECTOR_ENCRYPTION_KEY must be exactly 64 hex characters (32 bytes). Rotating this key requires re-encrypting all stored tokens.

Sync Pipeline

The connector supports two sync modes: initial (full) and incremental (change-data-capture).

Initial Sync

The initial sync fetches all records using SOQL queries, processed in three sequential phases:

Records are fetched in configurable batch sizes (default: 200) using an async generator that handles Salesforce's query pagination (nextRecordsUrl).
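A batching async generator of this kind might look like the sketch below. The QueryClient interface is a hypothetical stand-in for whatever connection object the connector uses; its query and queryMore methods and the fetchBatches name are assumptions. The result shape (records, done, nextRecordsUrl) mirrors Salesforce's REST query response.

```typescript
// Result shape modeled on Salesforce REST query responses.
interface QueryResult<T> {
  records: T[];
  done: boolean;
  nextRecordsUrl?: string;
}

// Hypothetical client interface; a real connection object may differ.
interface QueryClient {
  query<T>(soql: string): Promise<QueryResult<T>>;
  queryMore<T>(nextRecordsUrl: string): Promise<QueryResult<T>>;
}

// Yields records in fixed-size batches, following nextRecordsUrl until done.
async function* fetchBatches<T>(
  client: QueryClient,
  soql: string,
  batchSize = 200,
): AsyncGenerator<T[]> {
  let buffer: T[] = [];
  let page = await client.query<T>(soql);
  for (;;) {
    buffer.push(...page.records);
    // Emit every full batch currently held in the buffer.
    while (buffer.length >= batchSize) {
      yield buffer.slice(0, batchSize);
      buffer = buffer.slice(batchSize);
    }
    if (page.done || !page.nextRecordsUrl) break;
    page = await client.queryMore<T>(page.nextRecordsUrl);
  }
  // Emit the final partial batch, if any.
  if (buffer.length > 0) yield buffer;
}
```

A caller would consume it with for await (const batch of fetchBatches(client, soql)), so only one page of results needs to be in memory at a time.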

Default SOQL Fields

| Entity | Default Fields |
| --- | --- |
| Contact | Id, Email, FirstName, LastName, Title, Phone, AccountId |
| Account | Id, Name, Website, Industry, NumberOfEmployees, AnnualRevenue |
| Opportunity | Id, Name, StageName, CloseDate, AccountId, Amount |

Custom fields

Admins can add custom SOQL fields through the field mapping UI. The Metadata API discovers all available custom fields on Contact, Account, and Opportunity objects.

Incremental Sync

Incremental sync uses Salesforce's SystemModstamp field for change detection. Each sync:

  1. Appends WHERE SystemModstamp > {lastCursor} to SOQL queries
  2. Ensures SystemModstamp is included in the SELECT clause
  3. Queries modified records with standard query()
  4. Queries deleted records with queryAll() (includes soft-deleted records with IsDeleted = true)
  5. Tracks the maximum SystemModstamp across all results as the new cursor

If no cursor exists (first run), the connector falls back to the initial full sync.
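The query construction and cursor tracking in the steps above could be sketched as follows. The function names are hypothetical, and the sketch assumes the base query has no existing WHERE clause and that the cursor is stored as an ISO 8601 string, which SOQL accepts as an unquoted datetime literal.

```typescript
// Hypothetical helper: builds the incremental SOQL query for one object.
// Assumes the field list comes from trusted configuration, not user input.
function buildIncrementalQuery(
  fields: string[],
  objectName: string,
  lastCursor: string,
): string {
  const hasStamp = fields.some((f) => f.toLowerCase() === "systemmodstamp");
  const select = hasStamp ? fields : [...fields, "SystemModstamp"];
  // SOQL datetime literals are written without quotes.
  return `SELECT ${select.join(", ")} FROM ${objectName} WHERE SystemModstamp > ${lastCursor}`;
}

// Hypothetical helper: returns the latest SystemModstamp seen, or the old cursor.
function nextCursor(
  records: Array<{ SystemModstamp: string }>,
  lastCursor: string,
): string {
  let max = lastCursor;
  for (const record of records) {
    if (Date.parse(record.SystemModstamp) > Date.parse(max)) max = record.SystemModstamp;
  }
  return max;
}
```

Keeping the old cursor when a sync returns no records means the next run simply re-asks the same question, so an empty result never moves the cursor backward.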

Why SystemModstamp over Streaming API?

SystemModstamp polling provides reliable catch-up after downtime, works within standard API limits, and avoids the complexity of managing streaming subscriptions across tenants.

Data Transformation

Raw Salesforce records are transformed to canonical entities using pure functions in sync/transform.ts.

Transform Pipeline

  1. Zod validation -- raw records are parsed through Salesforce-specific Zod schemas (SalesforceContactSchema, SalesforceAccountSchema, SalesforceOpportunitySchema). Schemas use .passthrough() to preserve custom fields.
  2. Custom field mapping -- if the tenant has custom mappings, applyFieldMappings() handles dot-notation paths, compound fields (Name, Address), and per-field transforms.
  3. Default mapping -- without custom mappings, built-in transforms normalize emails to lowercase, trim whitespace, extract domains from URLs, and coerce numeric strings.

Field Mapping Features

| Feature | Description |
| --- | --- |
| Dot-notation paths | Account.Industry extracts nested values |
| Compound field flattening | Salesforce compound Name/Address fields are expanded |
| Per-field transforms | lowercase, uppercase, trim, none, custom |
| Required field validation | Throws if a required source field is missing |
| Duplicate target detection | Warns on conflicting mappings |
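To make several of these features concrete, here is a simplified sketch covering dot-notation paths, the built-in string transforms, required field validation, and duplicate target warnings. It is not the connector's applyFieldMappings(): the FieldMapping shape and the mapRecordSketch name are assumptions, and compound field flattening and custom transforms are left out.

```typescript
type Transform = "lowercase" | "uppercase" | "trim" | "none";

// Hypothetical mapping shape for this sketch.
interface FieldMapping {
  source: string;        // dot-notation path into the raw record
  target: string;        // canonical field name
  transform?: Transform;
  required?: boolean;
}

// Walks a dot-notation path such as "Account.Industry".
function getPath(record: Record<string, unknown>, path: string): unknown {
  let current: unknown = record;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Applies a string transform; non-string values pass through unchanged.
function applyTransform(value: unknown, transform: Transform = "none"): unknown {
  if (typeof value !== "string") return value;
  switch (transform) {
    case "lowercase": return value.toLowerCase();
    case "uppercase": return value.toUpperCase();
    case "trim": return value.trim();
    default: return value;
  }
}

function mapRecordSketch(
  record: Record<string, unknown>,
  mappings: FieldMapping[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  const seenTargets = new Set<string>();
  for (const m of mappings) {
    const value = getPath(record, m.source);
    if (value === undefined || value === null) {
      if (m.required) throw new Error(`Missing required source field: ${m.source}`);
      continue;
    }
    // Later mappings overwrite earlier ones; the warning surfaces the conflict.
    if (seenTargets.has(m.target)) console.warn(`Conflicting mappings for target: ${m.target}`);
    seenTargets.add(m.target);
    out[m.target] = applyTransform(value, m.transform);
  }
  return out;
}
```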

Quality Pipeline

Every batch passes through a three-stage quality pipeline before reaching the database:

Stage 1: Required Field Gate

Validates that essential fields are present:

| Entity | Requirement |
| --- | --- |
| Person | Must have externalId AND (email OR firstName + lastName) |
| Account | Must have externalId AND name |
| Opportunity | Must have externalId, name, AND accountExternalId |

Records that fail validation are excluded with detailed failure reasons.
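For the Person rule, a gate along these lines would do the job. This is a sketch under assumptions: the PersonInput and GateFailure shapes and the function names are hypothetical, "firstName + lastName" is read as requiring both names, and whitespace-only values are treated as missing. Account and Opportunity checks would follow the same pattern.

```typescript
interface PersonInput {
  externalId?: string;
  email?: string;
  firstName?: string;
  lastName?: string;
}

// Hypothetical failure shape; the connector's QualityFailure type may differ.
interface GateFailure {
  externalId?: string;
  reason: string;
}

// Treats undefined, empty, and whitespace-only strings as missing (an assumption).
const present = (value?: string): boolean =>
  typeof value === "string" && value.trim() !== "";

// Returns null when the person passes, otherwise a human-readable reason.
function personFailureReason(p: PersonInput): string | null {
  if (!present(p.externalId)) return "missing externalId";
  if (present(p.email)) return null;
  if (present(p.firstName) && present(p.lastName)) return null;
  return "missing email and firstName + lastName";
}

function requiredFieldGate(
  persons: PersonInput[],
): { passed: PersonInput[]; failures: GateFailure[] } {
  const passed: PersonInput[] = [];
  const failures: GateFailure[] = [];
  for (const p of persons) {
    const reason = personFailureReason(p);
    if (reason === null) passed.push(p);
    else failures.push({ externalId: p.externalId, reason });
  }
  return { passed, failures };
}
```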

Stage 2: Deduplication Gate

  • Removes records with duplicate externalId values
  • For persons: also deduplicates by normalized email (case-insensitive, trimmed)
  • Tracks duplicatesRemoved count in stats
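One way to implement this stage for persons is a single pass with two sets, sketched below. The dedupePersons name is hypothetical, and keeping the first occurrence of each duplicate is an assumption; the source does not say which copy wins.

```typescript
// Removes persons whose externalId or normalized email was already seen.
// First occurrence wins (an assumption made for this sketch).
function dedupePersons<T extends { externalId: string; email?: string }>(
  records: T[],
): { records: T[]; duplicatesRemoved: number } {
  const seenIds = new Set<string>();
  const seenEmails = new Set<string>();
  const kept: T[] = [];
  for (const record of records) {
    // Normalize the email the same way the stage describes: trimmed, case-insensitive.
    const email = record.email?.trim().toLowerCase();
    const duplicateId = seenIds.has(record.externalId);
    const duplicateEmail = email ? seenEmails.has(email) : false;
    if (duplicateId || duplicateEmail) continue;
    seenIds.add(record.externalId);
    if (email) seenEmails.add(email);
    kept.push(record);
  }
  return { records: kept, duplicatesRemoved: records.length - kept.length };
}
```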

Stage 3: Type Coercion Gate

  • Coerces numeric strings in employeeCount, annualRevenue, and amount to numbers
  • Parses date strings in closeDate and occurredAt to Date objects
  • Lowercases and trims email fields
  • Trims all remaining string fields
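The coercion rules above can be sketched as one function over a flat record. The coerceRecord name is hypothetical, and two behaviors here are assumptions: values that cannot be parsed are left unchanged rather than dropped, and only a field literally named email is treated as an email field.

```typescript
const NUMERIC_FIELDS = new Set(["employeeCount", "annualRevenue", "amount"]);
const DATE_FIELDS = new Set(["closeDate", "occurredAt"]);

// Returns a coerced copy of the record; non-string values pass through unchanged.
function coerceRecord(record: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(record)) {
    if (typeof value !== "string") {
      out[key] = value;
      continue;
    }
    const text = value.trim();
    if (NUMERIC_FIELDS.has(key)) {
      const n = Number(text);
      // Keep the original value when it is not a finite number.
      out[key] = text !== "" && Number.isFinite(n) ? n : value;
    } else if (DATE_FIELDS.has(key)) {
      const d = new Date(text);
      // Keep the original value when the date cannot be parsed.
      out[key] = Number.isNaN(d.getTime()) ? value : d;
    } else if (key === "email") {
      out[key] = text.toLowerCase();
    } else {
      out[key] = text;
    }
  }
  return out;
}
```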

The pipeline returns a QualityResult with both the clean records and comprehensive stats:

interface QualityStats {
  inputCount: number;
  passedCount: number;
  failedCount: number;
  duplicatesRemoved: number;
  failures: QualityFailure[];
}

Custom Field Discovery

The Metadata API module (metadata.ts) discovers available fields on Salesforce objects:

  • describeObject(conn, objectName) -- returns all fields with name, label, type, and updateability
  • getWriteableFields(conn, objectName) -- filters to only updateable fields (used for writeback target selection)
  • Compound field types (address, location) are automatically filtered out
  • Results are sorted: standard fields first, then custom fields, alphabetically by label
  • A per-process describe cache avoids redundant API calls
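The filtering, ordering, and caching behavior listed above might be sketched like this. The FieldInfo shape borrows the updateable and custom property names from Salesforce describe results, but prepareFields, describeCached, and the injected load function are hypothetical. The cache is keyed by object name only, which a multi-tenant deployment would likely need to extend with an org identifier.

```typescript
interface FieldInfo {
  name: string;
  label: string;
  type: string;
  updateable: boolean;
  custom: boolean;
}

const COMPOUND_TYPES = new Set(["address", "location"]);

// Drops compound types, then sorts standard fields first and by label within each group.
function prepareFields(fields: FieldInfo[]): FieldInfo[] {
  return fields
    .filter((f) => !COMPOUND_TYPES.has(f.type))
    .sort((a, b) => Number(a.custom) - Number(b.custom) || a.label.localeCompare(b.label));
}

// Per-process cache of in-flight and completed describe calls.
const describeCache = new Map<string, Promise<FieldInfo[]>>();

// `load` stands in for the real describe call (hypothetical signature).
function describeCached(
  objectName: string,
  load: (name: string) => Promise<FieldInfo[]>,
): Promise<FieldInfo[]> {
  let cached = describeCache.get(objectName);
  if (!cached) {
    cached = load(objectName).then(prepareFields);
    // Evict failed lookups so a transient error is not cached forever.
    cached.catch(() => describeCache.delete(objectName));
    describeCache.set(objectName, cached);
  }
  return cached;
}
```

Caching the promise rather than the resolved value means concurrent callers for the same object share one API call.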

Score Writeback

The writeback engine pushes computed scores back to Salesforce custom fields.

Writeback Fields

| Canonical Field | Salesforce Target | Type |
| --- | --- | --- |
| fitScore | Custom number field | Parsed to float |
| engagementScore | Custom number field | Parsed to float |
| combinedScore | Custom number field | Parsed to float |
| tier | Custom text field | String value |
| scoredAt | Custom datetime field | ISO 8601 string |
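As a rough illustration of how one record's update payload could be assembled from these fields, consider the sketch below. The ScoreRecord shape, the buildWritebackPayload name, and the example target names such as Fit_Score__c are all hypothetical; real targets come from the admin's field mapping.

```typescript
interface ScoreRecord {
  fitScore?: number | string;
  engagementScore?: number | string;
  combinedScore?: number | string;
  tier?: string;
  scoredAt?: Date | string;
}

// Maps canonical score fields to Salesforce field API names (admin-configured).
type WritebackFieldMap = Partial<Record<keyof ScoreRecord, string>>;

const NUMERIC_SCORES = ["fitScore", "engagementScore", "combinedScore"] as const;

// Hypothetical helper: builds the update payload for one Salesforce record.
function buildWritebackPayload(
  salesforceId: string,
  scores: ScoreRecord,
  fieldMap: WritebackFieldMap,
): Record<string, unknown> {
  const payload: Record<string, unknown> = { Id: salesforceId };
  for (const field of NUMERIC_SCORES) {
    const target = fieldMap[field];
    const raw = scores[field];
    if (!target || raw === undefined) continue;
    const parsed = parseFloat(String(raw));
    // Skip values that do not parse instead of writing NaN.
    if (!Number.isNaN(parsed)) payload[target] = parsed;
  }
  if (fieldMap.tier && scores.tier !== undefined) {
    payload[fieldMap.tier] = String(scores.tier);
  }
  if (fieldMap.scoredAt && scores.scoredAt !== undefined) {
    // toISOString() throws on an invalid date, which surfaces bad input early.
    payload[fieldMap.scoredAt] = new Date(scores.scoredAt).toISOString();
  }
  return payload;
}
```

Fields without a mapping are simply omitted, so a tenant that maps only fitScore writes back only that field.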

Writeback Process

  1. Admin maps canonical score fields to Salesforce custom fields via the field mapping UI
  2. The writeback engine builds update payloads for each scored record
  3. Records are processed in batches of 200 (the Salesforce composite API limit)
  4. Success and failure are tracked per record, with detailed error messages
  5. Writeback orchestration runs as a Temporal workflow for reliability

interface WritebackResult {
  entityType: string;
  total: number;
  success: number;
  failed: number;
  errors: Array<{ entityId: string; error: string }>;
}

Writeback entity types

Writeback supports Contact (person), Account, and Opportunity entity types. The target Salesforce object is resolved from the SALESFORCE_ENTITY_OBJECTS mapping.
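The batching and per-record tracking in the writeback process can be sketched as a loop that slices payloads into groups of 200 and folds each outcome into a WritebackResult. The UpdateOutcome shape, the injected updateBatch function, and the writebackInBatches name are assumptions; the sketch also leaves out the Temporal orchestration and any retry logic.

```typescript
interface WritebackResult {
  entityType: string;
  total: number;
  success: number;
  failed: number;
  errors: Array<{ entityId: string; error: string }>;
}

// Hypothetical per-record outcome returned by the update call.
interface UpdateOutcome {
  id: string;
  success: boolean;
  error?: string;
}

// `updateBatch` stands in for the real Salesforce update call (hypothetical).
async function writebackInBatches(
  entityType: string,
  payloads: Array<{ Id: string }>,
  updateBatch: (batch: Array<{ Id: string }>) => Promise<UpdateOutcome[]>,
  batchSize = 200,
): Promise<WritebackResult> {
  const result: WritebackResult = {
    entityType,
    total: payloads.length,
    success: 0,
    failed: 0,
    errors: [],
  };
  for (let i = 0; i < payloads.length; i += batchSize) {
    const outcomes = await updateBatch(payloads.slice(i, i + batchSize));
    for (const outcome of outcomes) {
      if (outcome.success) {
        result.success += 1;
      } else {
        result.failed += 1;
        result.errors.push({ entityId: outcome.id, error: outcome.error ?? "unknown error" });
      }
    }
  }
  return result;
}
```

Because failures are recorded per record, one rejected update does not hide the successes in the same batch.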

Setup Steps

  1. Create a Salesforce Connected App with OAuth 2.0 enabled and the api and refresh_token scopes
  2. Set the callback URL to https://your-domain.com/api/connectors/salesforce/callback
  3. Configure environment variables (SALESFORCE_CLIENT_ID, SALESFORCE_CLIENT_SECRET, SALESFORCE_CALLBACK_URL, CONNECTOR_ENCRYPTION_KEY)
  4. Navigate to Settings > Connectors in the GTM Clarity dashboard
  5. Click "Connect Salesforce" to start the OAuth flow
  6. Review default field mappings and customize if needed
  7. Trigger initial sync -- the system will automatically switch to incremental syncs afterward
  8. Configure writeback (optional) -- map score fields to your Salesforce custom fields

Troubleshooting

| Issue | Cause | Resolution |
| --- | --- | --- |
| Token exchange fails | Invalid client credentials or callback URL | Verify Connected App settings match env vars |
| Sync returns zero records | SOQL fields not accessible | Check field-level security in Salesforce |
| Quality pipeline drops records | Missing required fields | Review failure reasons in sync stats |
| Writeback batch errors | Custom field not on target object | Use metadata discovery to verify field exists |
| HMAC verification fails | State parameter tampered | Ensure CLERK_SECRET_KEY matches across instances |