Last verified: 2026-03-08 Target: apps/sync-worker

Sync Worker Architecture

The sync-worker is a minimal Express.js service deployed on Google Cloud Run that mirrors video flows from the production Redis instance to the staging Redis instance. It is triggered by Cloud Scheduler every 4 hours and also exposes an endpoint for on-demand per-flow re-sync.

Purpose

Staging environments need realistic production data for testing. The sync-worker copies flow configs and metadata from production Redis → staging Redis, while protecting staging-native flows from being overwritten.

Module Overview

File Responsibilities

File	Responsibility
`src/index.ts`	Express app bootstrap, Sentry init, route definitions (`/health`, `/sync`, `/sync/flow`), process signal handlers
`src/sync.ts`	Core sync logic: full bulk sync (`sync()`) and per-flow sync (`syncSpecificFlow()`)
`src/env.ts`	Zod environment schema with lazy singleton via `createServiceEnvironment`
`src/logger.ts`	Lazy singleton Logger wrapping `@repo/core/server/logger`
`src/types.d.ts`	Node type reference (`/// <reference types="node" />`)

Primary Data Flow — Scheduled Sync

Integration Map

Endpoints

Method	Path	Purpose	Auth
`GET`	`/health`	Health check (Cloud Run lifecycle)	None
`POST`	`/sync`	Bulk sync all flows from source → dest	None (feature flag gate)
`POST`	`/sync/flow`	Re-sync a specific flow by ID	None (feature flag gate)
`GET`	`/api/sentry-error`	Sentry smoke test (manual use)	None

Feature Flag Guard

Both /sync and /sync/flow check ENABLE_FLOW_SYNC === "true" and return HTTP 403 if disabled. The flag defaults to "false" to prevent accidental sync on new deployments.
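The guard can be sketched as a small pure function. This is a minimal illustration and not the code in src/index.ts: the function name checkFlowSyncEnabled, the GuardResult shape, and the error message are all hypothetical, and the real routes presumably perform this check inside an Express handler.

```typescript
// Hypothetical result shape so the sketch runs without Express installed.
interface GuardResult {
  status: number;
  body: { error: string };
}

// Returns a 403 result when the flag is off, or null when the request may proceed.
function checkFlowSyncEnabled(
  env: Record<string, string | undefined>
): GuardResult | null {
  // An unset flag falls back to "false", matching the documented default.
  const flag = env.ENABLE_FLOW_SYNC ?? "false";

  // Strict string comparison: only the exact value "true" enables sync.
  if (flag !== "true") {
    return { status: 403, body: { error: "Flow sync is disabled" } };
  }
  return null;
}
```

Because the comparison is strict, values such as "TRUE" or "1" leave sync disabled in this sketch.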

Local Environment Behavior

When ENVIRONMENT=local, createApp returns early before registering /sync and /sync/flow, so only /health and /api/sentry-error are available locally. This is by design: sync requires live cloud Redis instances.
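The early return can be pictured with the sketch below, which lists the routes that would exist for each ENVIRONMENT value. The function routesFor is hypothetical and only models the registration order described above; the actual createApp registers Express handlers rather than returning strings.

```typescript
type RuntimeEnvironment = "local" | "cloud";

// Hypothetical model of createApp: returns the routes registered for an environment.
function routesFor(environment: RuntimeEnvironment): string[] {
  // These routes are available in every environment.
  const routes = ["GET /health", "GET /api/sentry-error"];

  if (environment === "local") {
    // Early return: the sync routes are never registered locally.
    return routes;
  }

  routes.push("POST /sync", "POST /sync/flow");
  return routes;
}
```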

Redis Data Model

clients             (HASH)
├── {flowId}  → "1"     # All known flows; value is always "1"
└── {flowId}  → "1"     # ...

client:{flowId}     (HASH)
├── clerkOrganizationId → "{clerk_org_id}"
├── isSyncCopy          → "true" | (absent)
├── flowConfigHistory   → (excluded from sync)
└── ...all other flow fields

The sync logic reads and writes these two hash structures:

clients — index of all flow IDs
client:{flowId} — all fields for a specific flow
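As a rough sketch of how code might address these two structures, the helpers below build the key names shown above. The names CLIENTS_INDEX_KEY, flowKey, and flowIdsFromIndex are illustrative and are not taken from src/sync.ts.

```typescript
// Key of the index hash that lists every known flow.
const CLIENTS_INDEX_KEY = "clients";

// A flow hash as returned by a Redis HGETALL: field name to string value.
type FlowHash = Record<string, string>;

// Builds the per-flow hash key, e.g. "client:abc123".
function flowKey(flowId: string): string {
  return `client:${flowId}`;
}

// Flow IDs are the field names of the index hash; the values are always "1".
function flowIdsFromIndex(index: FlowHash): string[] {
  return Object.keys(index);
}
```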

Sync Strategy Details

What Gets Copied

All fields in client:{flowId} are copied except:

flowConfigHistory — production version history should not pollute staging
isSyncCopy — explicitly set by the worker on every upsert
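The copy rule above can be sketched as a pure transformation on the source hash. This is an illustration under assumptions: buildSyncPayload and EXCLUDED_FIELDS are hypothetical names, and the real worker may assemble the payload differently.

```typescript
// Fields that are never taken from the source hash.
const EXCLUDED_FIELDS = new Set(["flowConfigHistory", "isSyncCopy"]);

// Builds the hash to write to the destination from a source flow hash.
function buildSyncPayload(
  source: Record<string, string>
): Record<string, string> {
  const payload: Record<string, string> = {};

  for (const [field, value] of Object.entries(source)) {
    if (!EXCLUDED_FIELDS.has(field)) {
      payload[field] = value;
    }
  }

  // The worker sets this marker itself on every upsert.
  payload.isSyncCopy = "true";
  return payload;
}
```

Dropping the source's isSyncCopy before setting it again means the marker in the destination never depends on what production happens to contain.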

Protection Mechanism

The isSyncCopy flag determines whether a flow can be overwritten:

Flow State in Dest	Action
Not present	Create new (set `isSyncCopy: "true"`)
Present, `isSyncCopy: "true"`	Overwrite with latest source data
Present, no `isSyncCopy` (or `"false"`)	Skip — staging-native flow, do not touch

This ensures staging flows created manually for testing are never overwritten.
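The decision table can be expressed as a small function. The sketch below is not the worker's code: decideSyncAction is a hypothetical name, and it assumes that a missing destination flow shows up as null or as an empty hash, which is how common Redis clients report HGETALL on a nonexistent key.

```typescript
type SyncAction = "create" | "overwrite" | "skip";

// Decides what to do with a flow given its current hash in the destination.
function decideSyncAction(dest: Record<string, string> | null): SyncAction {
  // Assumption: null or an empty hash means the flow is not present.
  if (dest === null || Object.keys(dest).length === 0) {
    return "create";
  }

  // Only flows previously written by the worker may be overwritten.
  return dest.isSyncCopy === "true" ? "overwrite" : "skip";
}
```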

Hello Org Filtering

Flows belonging to the internal hello/demo Clerk organization are excluded from the sync via isAnyHelloOrganization(clerkOrganizationId). These flows exist in production for onboarding demos but should never appear in staging.
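A sketch of the filtering step follows. The real isAnyHelloOrganization is not reproduced here, so the predicate is passed in as a parameter; excludeHelloFlows and FlowSummary are hypothetical names, and keeping flows that have no clerkOrganizationId is an assumption rather than documented behavior.

```typescript
interface FlowSummary {
  flowId: string;
  clerkOrganizationId: string | undefined;
}

// Drops flows whose organization matches the supplied hello-org predicate.
function excludeHelloFlows(
  flows: FlowSummary[],
  isHelloOrg: (orgId: string) => boolean
): FlowSummary[] {
  return flows.filter((flow) => {
    // Assumption: flows without an organization ID are kept.
    if (flow.clerkOrganizationId === undefined) {
      return true;
    }
    return !isHelloOrg(flow.clerkOrganizationId);
  });
}
```

In use, the predicate argument would be the real isAnyHelloOrganization; a stand-in such as (id) => id === "org_hello_example" is enough to exercise the filter.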

Environment Variables

Variable	Required	Default	Purpose
`ENVIRONMENT`	Yes	—	`local` \| `cloud`; disables sync endpoints in local
`DEPLOYMENT_ENVIRONMENT`	Yes	—	`local` \| `preview` \| `staging` \| `prod`
`PORT`	No	`3004`	HTTP listener port (Cloud Run uses `8080`)
`SOURCE_PROCESSED_VIDEO_DATA_KV_URL`	In cloud	—	Production Redis URL (read-only)
`DEST_PROCESSED_VIDEO_DATA_KV_URL`	In cloud	—	Staging Redis URL (write target)
`ENABLE_FLOW_SYNC`	No	`"false"`	Feature flag; must be `"true"` for sync endpoints to work
`SENTRY_DSN`	No	—	Enables Sentry error reporting (cloud only)
`BUILD_SHA`	No	—	Git commit SHA (injected by CI)
`BUILD_REF`	No	—	Git ref (injected by CI)
`BUILD_TIME`	No	—	ISO 8601 build timestamp (injected by CI)
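The "In cloud" requirement for the two Redis URLs is a conditional rule, which the sketch below makes explicit. It is a dependency-free stand-in for the Zod schema in src/env.ts, not a copy of it: parseSyncWorkerEnv and SyncWorkerEnv are hypothetical names, and only a subset of the variables in the table is covered.

```typescript
interface SyncWorkerEnv {
  ENVIRONMENT: "local" | "cloud";
  PORT: number;
  ENABLE_FLOW_SYNC: string;
  SOURCE_PROCESSED_VIDEO_DATA_KV_URL: string | undefined;
  DEST_PROCESSED_VIDEO_DATA_KV_URL: string | undefined;
}

// Validates raw environment values and applies the documented defaults.
function parseSyncWorkerEnv(
  raw: Record<string, string | undefined>
): SyncWorkerEnv {
  const environment = raw.ENVIRONMENT;
  if (environment !== "local" && environment !== "cloud") {
    throw new Error("ENVIRONMENT must be 'local' or 'cloud'");
  }

  const source = raw.SOURCE_PROCESSED_VIDEO_DATA_KV_URL;
  const dest = raw.DEST_PROCESSED_VIDEO_DATA_KV_URL;

  // Both Redis URLs are required only when running in the cloud.
  if (environment === "cloud" && (!source || !dest)) {
    throw new Error("Both Redis URLs are required when ENVIRONMENT=cloud");
  }

  return {
    ENVIRONMENT: environment,
    PORT: Number(raw.PORT ?? "3004"),
    ENABLE_FLOW_SYNC: raw.ENABLE_FLOW_SYNC ?? "false",
    SOURCE_PROCESSED_VIDEO_DATA_KV_URL: source,
    DEST_PROCESSED_VIDEO_DATA_KV_URL: dest,
  };
}
```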

Sentry Integration

Sentry is initialized in src/index.ts via Sentry.init(), but only in cloud environments and only when a DSN is set. After init, registerSentryProviders(Sentry) hooks the logger so that error() and warn() calls are captured automatically.

Config:

Setting	Value
`sampleRate`	`1.0` (100% of errors)
`tracesSampleRate`	`0.2` (20% of requests traced)
`profilesSampleRate`	Via `nodeProfilingIntegration()`
Tags set	`service`, `runtime`, `environment`, `build_sha`, `build_ref`, `build_time`

See error-logging-and-sentry.md for the provider registration pattern.
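The init condition and the two sample rates can be sketched as a function that either produces options or declines to initialize. This is a simplified illustration: buildSentryOptions and SentryOptionsSketch are hypothetical names, and the profiling integration and tags from the table are left out because their exact configuration is not shown here.

```typescript
// Subset of Sentry init options covered by this sketch.
interface SentryOptionsSketch {
  dsn: string;
  sampleRate: number;
  tracesSampleRate: number;
}

// Returns init options, or null when Sentry should not be initialized.
function buildSentryOptions(
  env: Record<string, string | undefined>
): SentryOptionsSketch | null {
  const dsn = env.SENTRY_DSN;

  // Cloud only, and a DSN is required.
  if (env.ENVIRONMENT !== "cloud" || !dsn) {
    return null;
  }

  return {
    dsn,
    sampleRate: 1.0, // report 100% of errors
    tracesSampleRate: 0.2, // trace 20% of requests
  };
}
```

A caller would pass the non-null result to Sentry.init() and skip initialization otherwise.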

Deployment

Environment	Deployed?	Role
Production	✅	Triggers sync to staging
Preview	✅	Can trigger sync
Staging	❌	Destination — never runs sync-worker itself
Local	N/A	Sync endpoints disabled; health check available

The service is deployed via Cloud Run and triggered by Cloud Scheduler every 4 hours. The enable_sync_worker Terraform variable controls whether it is deployed per environment.

See README.md and terraform/SYNC_WORKER_DEPLOYMENT.md for operational details.

Build & Bundling

The build uses tsup with skipNodeModulesBundle: true. The app's own code is bundled (relative imports are resolved), but everything in node_modules, including @repo/*, is externalized. Because those packages are resolved at runtime rather than bundled, they must provide compiled dist/ output.
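A tsup configuration along these lines might look like the sketch below. Only the entry file and skipNodeModulesBundle come from this document; the format, platform, target, and clean settings are assumptions chosen for a Node 22 service and may differ from the real tsup.config.ts.

```typescript
import { defineConfig } from "tsup";

export default defineConfig({
  // Service entry point (see File Responsibilities).
  entry: ["src/index.ts"],
  // Bundle only the app's own code; leave all node_modules imports external.
  skipNodeModulesBundle: true,
  // Assumed settings, not confirmed by this document.
  format: ["esm"],
  platform: "node",
  target: "node22",
  clean: true,
});
```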

The Docker build uses turbo prune sync-worker --docker to produce minimal multi-stage builds on node:22-slim. See internal-packages-and-docker.md.