
docs/architecture/sync-worker-architecture.md

Last verified: 2026-03-08 Target: apps/sync-worker

Sync Worker Architecture

The sync-worker is a minimal Express.js service deployed on Google Cloud Run that mirrors video flows from the production Redis instance to the staging Redis instance. It is triggered by Cloud Scheduler every 4 hours and also exposes an endpoint for on-demand per-flow re-sync.

Purpose

Staging environments need realistic production data for testing. The sync-worker copies flow configs and metadata from production Redis → staging Redis, while protecting staging-native flows from being overwritten.


Module Overview

File Responsibilities

| File | Responsibility |
| --- | --- |
| src/index.ts | Express app bootstrap, Sentry init, route definitions (/health, /sync, /sync/flow), process signal handlers |
| src/sync.ts | Core sync logic: full bulk sync (sync()) and per-flow sync (syncSpecificFlow()) |
| src/env.ts | Zod environment schema with lazy singleton via createServiceEnvironment |
| src/logger.ts | Lazy singleton Logger wrapping @repo/core/server/logger |
| src/types.d.ts | Node type reference (/// <reference types="node" />) |
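
The lazy-singleton pattern mentioned for src/env.ts and src/logger.ts can be sketched roughly as follows. This is only an illustration of the idea, not the actual createServiceEnvironment implementation; lazySingleton, getEnv, and factoryCalls are hypothetical names.

```typescript
// Hypothetical helper: defers construction until first use, then caches the result.
function lazySingleton<T>(factory: () => T): () => T {
  let instance: T | undefined;
  let created = false;
  return () => {
    if (!created) {
      instance = factory();
      created = true;
    }
    return instance as T;
  };
}

// Example usage: the factory runs once, however many times getEnv() is called.
let factoryCalls = 0;
const getEnv = lazySingleton(() => {
  factoryCalls += 1;
  return { PORT: 3004 };
});
```

Deferring construction this way means importing the module has no side effects; validation or logger setup only happens when something first asks for the value.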

Primary Data Flow — Scheduled Sync


Integration Map


Endpoints

| Method | Path | Purpose | Auth |
| --- | --- | --- | --- |
| GET | /health | Health check (Cloud Run lifecycle) | None |
| POST | /sync | Bulk sync all flows from source → dest | None (feature flag gate) |
| POST | /sync/flow | Re-sync a specific flow by ID | None (feature flag gate) |
| GET | /api/sentry-error | Sentry smoke test (manual use) | None |

Feature Flag Guard

Both /sync and /sync/flow check ENABLE_FLOW_SYNC === "true" and return HTTP 403 if disabled. The flag defaults to "false" to prevent accidental sync on new deployments.
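
A minimal sketch of the guard logic, written as a pure function so it does not depend on Express. The name checkFlowSyncEnabled, the GuardResult type, and the error message are assumptions for illustration; only the strict comparison against "true" and the 403 status come from the description above.

```typescript
// Either the request may proceed, or it is rejected with HTTP 403.
type GuardResult =
  | { allowed: true }
  | { allowed: false; status: 403; message: string };

// Only the exact string "true" enables sync; any other value
// (including undefined) is treated as disabled.
function checkFlowSyncEnabled(enableFlowSync: string | undefined): GuardResult {
  if (enableFlowSync === "true") {
    return { allowed: true };
  }
  // Hypothetical message text; the real response body is not documented here.
  return { allowed: false, status: 403, message: "Flow sync is disabled" };
}
```

A route handler would call this first and respond with the returned status when allowed is false.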

Local Environment Behavior

When ENVIRONMENT=local, the routes for /sync and /sync/flow are not registered at all (early return from createApp). Only /health and /api/sentry-error are available locally. This is by design — sync requires live cloud Redis instances.
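
The early-return behavior can be illustrated with a small sketch that lists which routes end up registered. registeredRoutes is a hypothetical stand-in for the real createApp, which registers Express handlers rather than returning strings.

```typescript
type Environment = "local" | "cloud";

// Returns the routes that would be registered for a given ENVIRONMENT value.
function registeredRoutes(environment: Environment): string[] {
  const routes = ["GET /health", "GET /api/sentry-error"];
  if (environment === "local") {
    // Early return: the sync routes are never registered locally.
    return routes;
  }
  routes.push("POST /sync", "POST /sync/flow");
  return routes;
}
```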


Redis Data Model

clients             (HASH)
├── {flowId}  → "1"     # All known flows; value is always "1"
└── {flowId}  → "1"     # ...

client:{flowId}     (HASH)
├── clerkOrganizationId → "{clerk_org_id}"
├── isSyncCopy          → "true" | (absent)
├── flowConfigHistory   → (excluded from sync)
└── ...all other flow fields

The sync logic reads and writes these two hash structures:

  • clients — index of all flow IDs
  • client:{flowId} — all fields for a specific flow
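
As a rough illustration, the two structures could be typed like this in TypeScript. CLIENTS_INDEX_KEY, flowKey, and FlowHash are names invented for this sketch; the actual code in src/sync.ts may model them differently.

```typescript
// Key of the index hash that lists every known flow ID.
const CLIENTS_INDEX_KEY = "clients";

// Builds the key of the per-flow hash, e.g. "client:abc123".
function flowKey(flowId: string): string {
  return `client:${flowId}`;
}

// Redis hash values are strings, so every field is a string when present.
interface FlowHash {
  clerkOrganizationId?: string;
  isSyncCopy?: string; // "true" on synced copies, absent on staging-native flows
  flowConfigHistory?: string; // present in production, excluded from sync
  [field: string]: string | undefined;
}

// Example in-memory shape of the index and one flow (illustrative values only).
const exampleIndex: Record<string, string> = { flowA: "1" };
const exampleFlow: FlowHash = {
  clerkOrganizationId: "org_example",
  isSyncCopy: "true",
};
```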

Sync Strategy Details

What Gets Copied

All fields in client:{flowId} are copied except:

  • flowConfigHistory — production version history should not pollute staging
  • isSyncCopy — explicitly set by the worker on every upsert
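
The exclusion rules above amount to a small transformation over the source hash. The following is a sketch under the assumption that the hash has already been read into a plain object; buildSyncPayload and EXCLUDED_FIELDS are hypothetical names.

```typescript
type FlowFields = Record<string, string>;

// Fields that are never copied verbatim from the source hash.
const EXCLUDED_FIELDS = ["flowConfigHistory", "isSyncCopy"];

// Produces the fields to write to the destination hash without mutating the input.
function buildSyncPayload(source: FlowFields): FlowFields {
  const payload: FlowFields = {};
  for (const field of Object.keys(source)) {
    const value = source[field];
    if (value === undefined || EXCLUDED_FIELDS.indexOf(field) !== -1) {
      continue;
    }
    payload[field] = value;
  }
  // The worker sets this marker itself on every upsert.
  payload["isSyncCopy"] = "true";
  return payload;
}
```

Dropping the source value of isSyncCopy and then setting it explicitly means the destination copy is always marked as a sync copy, whatever the source hash contained.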

Protection Mechanism

The isSyncCopy flag determines whether a flow can be overwritten:

| Flow State in Dest | Action |
| --- | --- |
| Not present | Create new (set isSyncCopy: "true") |
| Present, isSyncCopy: "true" | Overwrite with latest source data |
| Present, no isSyncCopy (or "false") | Skip: staging-native flow, do not touch |

This ensures staging flows created manually for testing are never overwritten.
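
The decision table can be expressed as a single function. This is a sketch, assuming the caller passes null when client:{flowId} does not exist in the destination; decideSyncAction and SyncAction are hypothetical names.

```typescript
type SyncAction = "create" | "overwrite" | "skip";

// destFields is null when the flow hash does not exist in the destination.
function decideSyncAction(destFields: Record<string, string> | null): SyncAction {
  if (destFields === null) {
    return "create";
  }
  if (destFields["isSyncCopy"] === "true") {
    return "overwrite";
  }
  // Missing or any other value (e.g. "false"): staging-native flow, leave it alone.
  return "skip";
}
```

Treating every value other than the exact string "true" as staging-native errs on the side of not overwriting data.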

Hello Org Filtering

Flows belonging to the internal hello/demo Clerk organization are filtered via isAnyHelloOrganization(clerkOrganizationId). These flows exist in production for onboarding demos but should never appear in staging.
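
A sketch of how the filter might be applied to a list of candidate flows. The predicate is injected because the logic of isAnyHelloOrganization is not shown in this document; isHelloOrgStub and the organization ID below are placeholders invented for the example.

```typescript
interface FlowSummary {
  flowId: string;
  clerkOrganizationId: string | undefined;
}

// Keeps only flows whose organization is not a hello/demo organization.
function filterSyncableFlows(
  flows: FlowSummary[],
  isHelloOrg: (orgId: string | undefined) => boolean,
): FlowSummary[] {
  return flows.filter((flow) => !isHelloOrg(flow.clerkOrganizationId));
}

// Stand-in predicate for illustration only; not the real isAnyHelloOrganization.
const HYPOTHETICAL_HELLO_ORG_IDS = ["org_hello_example"];
const isHelloOrgStub = (orgId: string | undefined): boolean =>
  orgId !== undefined && HYPOTHETICAL_HELLO_ORG_IDS.indexOf(orgId) !== -1;
```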


Environment Variables

| Variable | Required | Default | Purpose |
| --- | --- | --- | --- |
| ENVIRONMENT | Yes |  | local \| cloud; disables sync endpoints in local |
| DEPLOYMENT_ENVIRONMENT | Yes |  | local \| preview \| staging \| prod |
| PORT | No | 3004 | HTTP listener port (Cloud Run uses 8080) |
| SOURCE_PROCESSED_VIDEO_DATA_KV_URL | In cloud |  | Production Redis URL (read-only) |
| DEST_PROCESSED_VIDEO_DATA_KV_URL | In cloud |  | Staging Redis URL (write target) |
| ENABLE_FLOW_SYNC | No | "false" | Feature flag; must be "true" for sync endpoints to work |
| SENTRY_DSN | No |  | Enables Sentry error reporting (cloud only) |
| BUILD_SHA | No |  | Git commit SHA (injected by CI) |
| BUILD_REF | No |  | Git ref (injected by CI) |
| BUILD_TIME | No |  | ISO 8601 build timestamp (injected by CI) |
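
To make the defaults and the "In cloud" requirement concrete, here is a hand-written validation sketch. The real service uses a Zod schema via createServiceEnvironment; this dependency-free version only approximates the rules in the table, and parseEnv, oneOf, and SyncWorkerEnv are hypothetical names.

```typescript
type RawEnv = Record<string, string | undefined>;

interface SyncWorkerEnv {
  ENVIRONMENT: "local" | "cloud";
  DEPLOYMENT_ENVIRONMENT: "local" | "preview" | "staging" | "prod";
  PORT: number;
  ENABLE_FLOW_SYNC: string;
  SOURCE_PROCESSED_VIDEO_DATA_KV_URL: string | undefined;
  DEST_PROCESSED_VIDEO_DATA_KV_URL: string | undefined;
}

// Returns the value if it is one of the allowed literals, otherwise throws.
function oneOf<T extends string>(name: string, value: string | undefined, allowed: readonly T[]): T {
  const match = allowed.find((candidate) => candidate === value);
  if (match === undefined) {
    throw new Error(`${name} must be one of: ${allowed.join(", ")}`);
  }
  return match;
}

function parseEnv(raw: RawEnv): SyncWorkerEnv {
  const environment = oneOf("ENVIRONMENT", raw["ENVIRONMENT"], ["local", "cloud"] as const);
  const deployment = oneOf(
    "DEPLOYMENT_ENVIRONMENT",
    raw["DEPLOYMENT_ENVIRONMENT"],
    ["local", "preview", "staging", "prod"] as const,
  );
  const port = Number(raw["PORT"] ?? "3004"); // default from the table
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("PORT must be a positive integer");
  }
  const source = raw["SOURCE_PROCESSED_VIDEO_DATA_KV_URL"];
  const dest = raw["DEST_PROCESSED_VIDEO_DATA_KV_URL"];
  // Both Redis URLs are only required when running in the cloud.
  if (environment === "cloud" && (!source || !dest)) {
    throw new Error("Redis URLs are required when ENVIRONMENT=cloud");
  }
  return {
    ENVIRONMENT: environment,
    DEPLOYMENT_ENVIRONMENT: deployment,
    PORT: port,
    ENABLE_FLOW_SYNC: raw["ENABLE_FLOW_SYNC"] ?? "false", // off unless explicitly enabled
    SOURCE_PROCESSED_VIDEO_DATA_KV_URL: source,
    DEST_PROCESSED_VIDEO_DATA_KV_URL: dest,
  };
}
```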

Sentry Integration

Sentry is initialized in src/index.ts via Sentry.init() (cloud only, DSN required). After init, registerSentryProviders(Sentry) hooks the logger so that error() and warn() calls are captured by Sentry automatically.

Config:

| Setting | Value |
| --- | --- |
| sampleRate | 1.0 (100% of errors) |
| tracesSampleRate | 0.2 (20% of requests traced) |
| profilesSampleRate | Via nodeProfilingIntegration() |
| Tags set | service, runtime, environment, build_sha, build_ref, build_time |

See error-logging-and-sentry.md for the provider registration pattern.


Deployment

| Environment | Deployed? | Role |
| --- | --- | --- |
| Production |  | Triggers sync to staging |
| Preview |  | Can trigger sync |
| Staging |  | Destination; never runs sync-worker itself |
| Local | N/A | Sync endpoints disabled; health check available |

The service is deployed via Cloud Run and triggered by Cloud Scheduler every 4 hours. The enable_sync_worker Terraform variable controls whether it is deployed per environment.

See README.md and terraform/SYNC_WORKER_DEPLOYMENT.md for operational details.


Build & Bundling

The build uses tsup with skipNodeModulesBundle: true. The app's own code is bundled (relative imports are resolved), but everything under node_modules, including @repo/* packages, is externalized. Those packages must therefore have compiled dist/ output.
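
A tsup.config.ts along these lines would produce that behavior. Only skipNodeModulesBundle comes from this document; the entry path is an assumption based on the file list above, and the real config may set further options.

```typescript
// tsup.config.ts (sketch). tsup accepts a default-exported options object.
const config = {
  entry: ["src/index.ts"], // assumed entry point
  skipNodeModulesBundle: true, // externalize everything resolved from node_modules
};

export default config;
```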

Docker build uses turbo prune sync-worker --docker for minimal multi-stage builds on node:22-slim. See internal-packages-and-docker.md.


Related Documentation