Salesforce Connector

The Salesforce connector provides full bidirectional CRM integration -- OAuth authentication, bulk and incremental sync of Contacts, Accounts, and Opportunities, configurable field mapping with custom field discovery, a three-stage data quality pipeline, and score writeback.

Prerequisites

Before connecting Salesforce, you need:

A Salesforce Connected App with OAuth 2.0 enabled
The api and refresh_token OAuth scopes granted
The environment variables listed below configured in your deployment

Environment Variables

Variable	Description
`SALESFORCE_CLIENT_ID`	OAuth Connected App client ID
`SALESFORCE_CLIENT_SECRET`	OAuth Connected App client secret
`SALESFORCE_CALLBACK_URL`	OAuth redirect URI (e.g., `https://app.example.com/api/connectors/salesforce/callback`)
`CONNECTOR_ENCRYPTION_KEY`	32-byte hex string (64 hex characters) for AES-256-GCM token encryption

Authentication

The Salesforce connector uses the OAuth 2.0 Web Server Flow with CSRF-safe state parameters.

OAuth Flow

CSRF Protection

The OAuth state parameter is signed with HMAC-SHA256 using CLERK_SECRET_KEY. The payload format is tenantId:connectorId, and verification uses constant-time buffer comparison to prevent timing attacks.
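The signing and verification described above can be sketched with Node's built-in crypto module. This is an illustrative sketch, not the connector's actual code: the function names signState and verifyState, the secret argument (standing in for CLERK_SECRET_KEY), and the base64url(payload).hexSignature wire format are assumptions. Only the tenantId:connectorId payload, HMAC-SHA256, and the constant-time comparison come from the description.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: signs "tenantId:connectorId" with HMAC-SHA256.
// Assumed wire format: base64url(payload) + "." + hex signature.
function signState(tenantId: string, connectorId: string, secret: string): string {
  const payload = `${tenantId}:${connectorId}`;
  const signature = createHmac("sha256", secret).update(payload).digest("hex");
  return `${Buffer.from(payload).toString("base64url")}.${signature}`;
}

// Hypothetical helper: returns the parsed payload, or null if the state
// is malformed or the signature does not match.
function verifyState(
  state: string,
  secret: string,
): { tenantId: string; connectorId: string } | null {
  const [encoded, signature] = state.split(".");
  if (!encoded || !signature) return null;

  const payload = Buffer.from(encoded, "base64url").toString("utf8");
  const expected = createHmac("sha256", secret).update(payload).digest();
  const actual = Buffer.from(signature, "hex");

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    return null;
  }

  const separator = payload.indexOf(":");
  if (separator < 0) return null;
  return {
    tenantId: payload.slice(0, separator),
    connectorId: payload.slice(separator + 1),
  };
}
```

Because the signature covers the whole payload, a state value issued for one tenant cannot be replayed with a different tenantId without failing verification.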

Token Encryption

All Salesforce tokens are encrypted at rest using AES-256-GCM:

Algorithm: aes-256-gcm
IV length: 12 bytes (randomly generated per encryption)
Auth tag length: 16 bytes
Storage format: base64(iv + tag + ciphertext)
Key: CONNECTOR_ENCRYPTION_KEY environment variable (32-byte hex)
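
As a rough illustration of the storage format listed above, the following sketch encrypts and decrypts a token with Node's crypto module. The function names encryptToken and decryptToken are hypothetical; the algorithm, IV length, tag length, and base64(iv + tag + ciphertext) layout follow the list.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_LENGTH = 12; // bytes, freshly generated for every encryption
const TAG_LENGTH = 16; // bytes, the default GCM auth tag size in Node

// Hypothetical helper: returns base64(iv + tag + ciphertext).
function encryptToken(plaintext: string, keyHex: string): string {
  const key = Buffer.from(keyHex, "hex"); // 64 hex chars -> 32 bytes
  const iv = randomBytes(IV_LENGTH);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // only valid after final()
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

// Hypothetical helper: splits the stored blob and authenticates it.
// decipher.final() throws if the tag does not match (wrong key or tampering).
function decryptToken(stored: string, keyHex: string): string {
  const key = Buffer.from(keyHex, "hex");
  const raw = Buffer.from(stored, "base64");
  const iv = raw.subarray(0, IV_LENGTH);
  const tag = raw.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = raw.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Since the IV is random per call, encrypting the same token twice produces different stored values.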

Token refresh is automatic -- when an access token expires (default 2-hour TTL), the connector transparently uses the refresh token to obtain a new one.

Key management

The CONNECTOR_ENCRYPTION_KEY must be exactly 64 hex characters (32 bytes). Rotating this key requires re-encrypting all stored tokens.
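
One way to enforce the length requirement is to validate the key at startup so a misconfigured deployment fails fast. The helper below is a sketch under that assumption; parseEncryptionKey is a hypothetical name and the connector may validate the key differently.

```typescript
// Hypothetical startup check: accepts exactly 64 hex characters and
// returns the decoded 32-byte key, otherwise throws.
function parseEncryptionKey(value: string | undefined): Uint8Array {
  if (!value || !/^[0-9a-fA-F]{64}$/.test(value)) {
    throw new Error("CONNECTOR_ENCRYPTION_KEY must be exactly 64 hex characters (32 bytes)");
  }
  const bytes = new Uint8Array(32);
  for (let i = 0; i < 32; i++) {
    bytes[i] = parseInt(value.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}
```

A 64-character key can be generated with, for example, `openssl rand -hex 32`.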

Sync Pipeline

The connector supports two sync modes: initial (full) and incremental (change-data-capture).

Initial Sync

The initial sync fetches all records using SOQL queries, processed in three sequential phases:

Records are fetched in configurable batch sizes (default: 200) using an async generator that handles Salesforce's query pagination (nextRecordsUrl).
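
The pagination loop might look roughly like the sketch below. The QueryClient interface and the fetchInBatches name are assumptions made to keep the example self-contained; the done and nextRecordsUrl fields mirror the shape of a Salesforce query response.

```typescript
// Minimal shape of a Salesforce query response page.
interface QueryPage<T> {
  records: T[];
  done: boolean;
  nextRecordsUrl?: string;
}

// Hypothetical client abstraction; a real connector would wrap its
// Salesforce SDK or REST calls behind something similar.
interface QueryClient {
  query<T>(soql: string): Promise<QueryPage<T>>;
  queryMore<T>(nextRecordsUrl: string): Promise<QueryPage<T>>;
}

// Yields records in fixed-size batches, following nextRecordsUrl until
// Salesforce reports the result set as done.
async function* fetchInBatches<T>(
  client: QueryClient,
  soql: string,
  batchSize = 200,
): AsyncGenerator<T[]> {
  let page = await client.query<T>(soql);
  let buffer: T[] = [];
  while (true) {
    buffer.push(...page.records);
    while (buffer.length >= batchSize) {
      yield buffer.slice(0, batchSize);
      buffer = buffer.slice(batchSize);
    }
    if (page.done || !page.nextRecordsUrl) break;
    page = await client.queryMore<T>(page.nextRecordsUrl);
  }
  if (buffer.length > 0) yield buffer; // final partial batch
}
```

Consumers iterate with `for await (const batch of fetchInBatches(client, soql))`, so only one page and one batch need to be held in memory at a time.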

Default SOQL Fields

Entity	Default Fields
Contact	`Id`, `Email`, `FirstName`, `LastName`, `Title`, `Phone`, `AccountId`
Account	`Id`, `Name`, `Website`, `Industry`, `NumberOfEmployees`, `AnnualRevenue`
Opportunity	`Id`, `Name`, `StageName`, `CloseDate`, `AccountId`, `Amount`

Custom fields

Admins can add custom SOQL fields through the field mapping UI. The Metadata API discovers all available custom fields on Contact, Account, and Opportunity objects.

Incremental Sync

Incremental sync uses Salesforce's SystemModstamp field for change detection. Each sync:

Appends WHERE SystemModstamp > {lastCursor} to SOQL queries
Ensures SystemModstamp is included in the SELECT clause
Queries modified records with standard query()
Queries deleted records with queryAll() (includes soft-deleted records with IsDeleted = true)
Tracks the maximum SystemModstamp across all results as the new cursor

If no cursor exists (first run), the connector falls back to the initial full sync.
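
The query construction and cursor tracking can be sketched as follows. The function names buildIncrementalQuery and nextCursor are hypothetical, and the sketch assumes the cursor is stored as an ISO 8601 string that Date.parse accepts. SOQL datetime literals are written without quotes.

```typescript
// Hypothetical helper: builds the SOQL for one object. With no cursor it
// returns the plain query, which corresponds to a full sync.
function buildIncrementalQuery(
  objectName: string,
  fields: string[],
  lastCursor: string | null,
): string {
  // Make sure SystemModstamp is selected so the next cursor can be computed.
  const select = fields.includes("SystemModstamp") ? fields : [...fields, "SystemModstamp"];
  const base = `SELECT ${select.join(", ")} FROM ${objectName}`;
  return lastCursor ? `${base} WHERE SystemModstamp > ${lastCursor}` : base;
}

// Hypothetical helper: the new cursor is the largest SystemModstamp seen,
// never moving backwards from the previous cursor.
function nextCursor(
  records: Array<{ SystemModstamp: string }>,
  previous: string | null,
): string | null {
  let max = previous;
  for (const record of records) {
    if (max === null || Date.parse(record.SystemModstamp) > Date.parse(max)) {
      max = record.SystemModstamp;
    }
  }
  return max;
}
```

The same WHERE clause would be passed to both query() and queryAll(); the results of both feed nextCursor so that deletions also advance the cursor.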

Why SystemModstamp over Streaming API?

SystemModstamp polling provides reliable catch-up after downtime, works within standard API limits, and avoids the complexity of managing streaming subscriptions across tenants.

Data Transformation

Raw Salesforce records are transformed to canonical entities using pure functions in sync/transform.ts.

Transform Pipeline

Zod validation -- raw records are parsed through Salesforce-specific Zod schemas (SalesforceContactSchema, SalesforceAccountSchema, SalesforceOpportunitySchema). Schemas use .passthrough() to preserve custom fields.
Custom field mapping -- if the tenant has custom mappings, applyFieldMappings() handles dot-notation paths, compound fields (Name, Address), and per-field transforms.
Default mapping -- without custom mappings, built-in transforms normalize emails to lowercase, trim whitespace, extract domains from URLs, and coerce numeric strings.

Field Mapping Features

Feature	Description
Dot-notation paths	`Account.Industry` extracts nested values
Compound field flattening	Salesforce compound Name/Address fields are expanded
Per-field transforms	`lowercase`, `uppercase`, `trim`, `none`, `custom`
Required field validation	Throws if a required source field is missing
Duplicate target detection	Warns on conflicting mappings
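
To make the dot-notation and per-field transform behavior concrete, here is a simplified sketch. It is not the real applyFieldMappings(), whose signature is not shown in this document; the FieldMapping shape and the mapRecord name are assumptions, and the custom transform, compound field flattening, and duplicate target detection are left out.

```typescript
type Transform = "lowercase" | "uppercase" | "trim" | "none";

// Hypothetical mapping shape for illustration only.
interface FieldMapping {
  source: string; // may be a dot-notation path such as "Account.Industry"
  target: string;
  transform: Transform;
  required?: boolean;
}

// Walks a dot-notation path; returns undefined if any segment is missing.
function getPath(record: Record<string, unknown>, path: string): unknown {
  let current: unknown = record;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// String transforms only apply to string values; other types pass through.
function applyTransform(value: unknown, transform: Transform): unknown {
  if (typeof value !== "string") return value;
  switch (transform) {
    case "lowercase": return value.toLowerCase();
    case "uppercase": return value.toUpperCase();
    case "trim": return value.trim();
    case "none": return value;
  }
}

function mapRecord(
  record: Record<string, unknown>,
  mappings: FieldMapping[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const mapping of mappings) {
    const value = getPath(record, mapping.source);
    if (value === undefined || value === null) {
      if (mapping.required) {
        throw new Error(`Missing required source field: ${mapping.source}`);
      }
      continue;
    }
    out[mapping.target] = applyTransform(value, mapping.transform);
  }
  return out;
}
```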

Quality Pipeline

Every batch passes through a three-stage quality pipeline before reaching the database:

Stage 1: Required Field Gate

Validates that essential fields are present:

Entity	Requirement
Person	Must have `externalId` AND (`email` OR `firstName` + `lastName`)
Account	Must have `externalId` AND `name`
Opportunity	Must have `externalId`, `name`, AND `accountExternalId`

Records that fail validation are excluded with detailed failure reasons.
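
The Person rule from the table could be expressed as in the sketch below. The PersonRecord and GateFailure shapes, the requiredFieldGate name, and the rule that blank strings count as missing are assumptions for illustration; the actual QualityFailure type is not shown in this document.

```typescript
interface PersonRecord {
  externalId?: string;
  email?: string;
  firstName?: string;
  lastName?: string;
}

// Hypothetical failure shape; stands in for QualityFailure.
interface GateFailure {
  externalId?: string;
  reason: string;
}

// Assumption: whitespace-only strings are treated as missing.
function hasValue(value: string | undefined): boolean {
  return typeof value === "string" && value.trim().length > 0;
}

// Person must have externalId AND (email OR firstName + lastName).
function requiredFieldGate(persons: PersonRecord[]): {
  passed: PersonRecord[];
  failures: GateFailure[];
} {
  const passed: PersonRecord[] = [];
  const failures: GateFailure[] = [];
  for (const person of persons) {
    if (!hasValue(person.externalId)) {
      failures.push({ reason: "missing externalId" });
    } else if (
      !hasValue(person.email) &&
      !(hasValue(person.firstName) && hasValue(person.lastName))
    ) {
      failures.push({
        externalId: person.externalId,
        reason: "missing email and firstName + lastName",
      });
    } else {
      passed.push(person);
    }
  }
  return { passed, failures };
}
```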

Stage 2: Deduplication Gate

Removes records with duplicate externalId values
For persons: also deduplicates by normalized email (case-insensitive, trimmed)
Tracks duplicatesRemoved count in stats
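
A minimal sketch of the person deduplication step follows. The dedupePersons name is hypothetical, and keeping the first occurrence of each duplicate is an assumption; the document does not say which copy wins.

```typescript
interface DedupPerson {
  externalId: string; // guaranteed present after the required field gate
  email?: string;
}

// Removes records whose externalId or normalized email was already seen.
// Assumption: the first occurrence is kept.
function dedupePersons<T extends DedupPerson>(persons: T[]): {
  unique: T[];
  duplicatesRemoved: number;
} {
  const seenIds = new Set<string>();
  const seenEmails = new Set<string>();
  const unique: T[] = [];
  let duplicatesRemoved = 0;

  for (const person of persons) {
    // Normalize for comparison only; the stored value is left untouched here.
    const email = person.email?.trim().toLowerCase();
    const isDuplicate =
      seenIds.has(person.externalId) || (!!email && seenEmails.has(email));
    if (isDuplicate) {
      duplicatesRemoved++;
      continue;
    }
    seenIds.add(person.externalId);
    if (email) seenEmails.add(email);
    unique.push(person);
  }
  return { unique, duplicatesRemoved };
}
```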

Stage 3: Type Coercion Gate

Coerces numeric strings in employeeCount, annualRevenue, and amount to numbers
Parses date strings in closeDate and occurredAt to Date objects
Lowercases and trims email fields
Trims all remaining string fields
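
The coercion rules above can be sketched as a single pass over a flat record. The coerceRecord name is hypothetical, and leaving unparseable values as trimmed strings (instead of failing the record) is an assumption.

```typescript
const NUMERIC_FIELDS = ["employeeCount", "annualRevenue", "amount"];
const DATE_FIELDS = ["closeDate", "occurredAt"];

// Applies the Stage 3 rules to string values; non-strings pass through.
function coerceRecord(record: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(record)) {
    if (typeof value !== "string") {
      out[key] = value;
      continue;
    }
    const trimmed = value.trim();
    if (NUMERIC_FIELDS.includes(key)) {
      const parsed = Number(trimmed);
      // Assumption: values that are not numeric stay as trimmed strings.
      out[key] = trimmed !== "" && Number.isFinite(parsed) ? parsed : trimmed;
    } else if (DATE_FIELDS.includes(key)) {
      const date = new Date(trimmed);
      out[key] = Number.isNaN(date.getTime()) ? trimmed : date;
    } else if (key === "email") {
      out[key] = trimmed.toLowerCase();
    } else {
      out[key] = trimmed;
    }
  }
  return out;
}
```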

The pipeline returns a QualityResult with both the clean records and comprehensive stats:

interface QualityStats {
  inputCount: number;
  passedCount: number;
  failedCount: number;
  duplicatesRemoved: number;
  failures: QualityFailure[];
}

Custom Field Discovery

The Metadata API module (metadata.ts) discovers available fields on Salesforce objects:

describeObject(conn, objectName) -- returns all fields with name, label, type, and updateability
getWriteableFields(conn, objectName) -- filters to only updateable fields (used for writeback target selection)
Compound field types (address, location) are automatically filtered out
Results are sorted: standard fields first, then custom fields, alphabetically by label
A per-process describe cache avoids redundant API calls
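
The filtering, ordering, and caching behavior might be implemented along these lines. This is a sketch: prepareFields, describeCached, and the fetchFields callback are hypothetical names, and the custom flag on each field is assumed to come from the Salesforce describe result.

```typescript
interface FieldInfo {
  name: string;
  label: string;
  type: string;
  updateable: boolean;
  custom: boolean; // assumed to be taken from the describe result
}

const COMPOUND_TYPES = new Set(["address", "location"]);

// Drops compound types, then sorts standard fields before custom fields,
// alphabetically by label within each group. The input is not mutated.
function prepareFields(fields: FieldInfo[]): FieldInfo[] {
  return fields
    .filter((field) => !COMPOUND_TYPES.has(field.type))
    .sort(
      (a, b) =>
        Number(a.custom) - Number(b.custom) || a.label.localeCompare(b.label),
    );
}

// Per-process cache keyed by object name. Caching the promise means
// concurrent callers share a single in-flight describe call.
const describeCache = new Map<string, Promise<FieldInfo[]>>();

function describeCached(
  objectName: string,
  fetchFields: (objectName: string) => Promise<FieldInfo[]>,
): Promise<FieldInfo[]> {
  let cached = describeCache.get(objectName);
  if (!cached) {
    cached = fetchFields(objectName).then(prepareFields);
    // Do not keep failed lookups in the cache.
    cached.catch(() => describeCache.delete(objectName));
    describeCache.set(objectName, cached);
  }
  return cached;
}
```

A per-process cache like this is never invalidated, so fields added in Salesforce after the first describe would only appear after a restart unless the cache is cleared explicitly.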

Score Writeback

The writeback engine pushes computed scores back to Salesforce custom fields.

Writeback Fields

Canonical Field	Salesforce Target	Type
`fitScore`	Custom number field	Parsed to float
`engagementScore`	Custom number field	Parsed to float
`combinedScore`	Custom number field	Parsed to float
`tier`	Custom text field	String value
`scoredAt`	Custom datetime field	ISO 8601 string
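
As an illustration of the type handling in the table, the sketch below builds one update payload. The buildUpdatePayload name, the ScoredRecord and WritebackMapping shapes, the use of externalId as the Salesforce record Id, and target names such as Fit_Score__c are all assumptions; real target fields come from the admin's mapping.

```typescript
type ScoreField = "fitScore" | "engagementScore" | "combinedScore";

// Hypothetical shapes for illustration only.
interface ScoredRecord {
  externalId: string; // assumed to hold the Salesforce record Id
  fitScore?: string | number;
  engagementScore?: string | number;
  combinedScore?: string | number;
  tier?: string;
  scoredAt?: Date | string;
}

// Canonical field -> Salesforce custom field API name.
type WritebackMapping = Partial<Record<ScoreField | "tier" | "scoredAt", string>>;

function buildUpdatePayload(
  record: ScoredRecord,
  mapping: WritebackMapping,
): Record<string, unknown> {
  const payload: Record<string, unknown> = { Id: record.externalId };
  const scoreFields: ScoreField[] = ["fitScore", "engagementScore", "combinedScore"];

  for (const field of scoreFields) {
    const target = mapping[field];
    const raw = record[field];
    if (!target || raw === undefined) continue;
    const parsed = parseFloat(String(raw)); // scores are parsed to float
    if (!Number.isNaN(parsed)) payload[target] = parsed;
  }
  if (mapping.tier && record.tier !== undefined) {
    payload[mapping.tier] = record.tier; // plain string value
  }
  if (mapping.scoredAt && record.scoredAt !== undefined) {
    payload[mapping.scoredAt] = new Date(record.scoredAt).toISOString(); // ISO 8601
  }
  return payload;
}
```

Fields without a mapping are simply omitted, so an admin can write back only a subset of the scores.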

Writeback Process

Admin maps canonical score fields to Salesforce custom fields via the field mapping UI
The writeback engine builds update payloads for each scored record
Records are processed in batches of 200 (the per-request record limit of the Salesforce composite sObject Collections resource)
Per-record success/failure tracking with detailed error messages
Writeback orchestration runs as a Temporal workflow for reliability

interface WritebackResult {
  entityType: string;
  total: number;
  success: number;
  failed: number;
  errors: Array<{ entityId: string; error: string }>;
}
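
Batching and per-record tracking could be combined as in the following sketch, which aggregates outcomes into the WritebackResult shape above. The writeBack and chunk names, the UpdatePayload and BatchOutcome types, and the sendBatch callback (standing in for the actual Salesforce update call) are assumptions, and the Temporal orchestration is not shown.

```typescript
interface WritebackResult {
  entityType: string;
  total: number;
  success: number;
  failed: number;
  errors: Array<{ entityId: string; error: string }>;
}

// Hypothetical shapes for illustration only.
interface UpdatePayload {
  Id: string;
  [field: string]: unknown;
}
type BatchOutcome = Array<{ id: string; success: boolean; error?: string }>;

// Splits a list into consecutive slices of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Sends payloads in batches and tallies per-record results.
async function writeBack(
  entityType: string,
  payloads: UpdatePayload[],
  sendBatch: (batch: UpdatePayload[]) => Promise<BatchOutcome>,
  batchSize = 200,
): Promise<WritebackResult> {
  const result: WritebackResult = {
    entityType,
    total: payloads.length,
    success: 0,
    failed: 0,
    errors: [],
  };
  for (const batch of chunk(payloads, batchSize)) {
    for (const outcome of await sendBatch(batch)) {
      if (outcome.success) {
        result.success++;
      } else {
        result.failed++;
        result.errors.push({
          entityId: outcome.id,
          error: outcome.error ?? "unknown error",
        });
      }
    }
  }
  return result;
}
```

A failure on one record does not abort the batch in this sketch; it is recorded in errors and processing continues.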

Writeback entity types

Writeback supports Contact (person), Account, and Opportunity entity types. The target Salesforce object is resolved from the SALESFORCE_ENTITY_OBJECTS mapping.

Setup Steps

Create a Salesforce Connected App with OAuth 2.0 enabled and the api and refresh_token scopes
Set the callback URL to https://your-domain.com/api/connectors/salesforce/callback
Configure environment variables (SALESFORCE_CLIENT_ID, SALESFORCE_CLIENT_SECRET, SALESFORCE_CALLBACK_URL, CONNECTOR_ENCRYPTION_KEY)
Navigate to Settings > Connectors in the GTM Clarity dashboard
Click "Connect Salesforce" to start the OAuth flow
Review default field mappings and customize if needed
Trigger initial sync -- the system will automatically switch to incremental syncs afterward
Configure writeback (optional) -- map score fields to your Salesforce custom fields

Troubleshooting

Issue	Cause	Resolution
Token exchange fails	Invalid client credentials or callback URL	Verify Connected App settings match env vars
Sync returns zero records	SOQL fields not accessible	Check field-level security in Salesforce
Quality pipeline drops records	Missing required fields	Review failure reasons in sync stats
Writeback batch errors	Custom field not on target object	Use metadata discovery to verify field exists
HMAC verification fails	State parameter was tampered with, or the signing key differs between instances	Ensure `CLERK_SECRET_KEY` matches across instances
