
docs/architecture/video-processor-storage-paths.md

Video Processor Storage Paths

This document explains where files for question responses and final video responses are stored once they pass through the video processing pipeline, and how those paths are constructed.

TL;DR

  • Raw question recordings are uploaded first under branded-video-flow/videos/... (before video-processor).
  • Raw final videos are uploaded first under library/finalVideos/... (before video-processor).
  • The video processor does not re-upload the original input file.
  • The video processor writes only derived artifacts (audio, thumbnails, optional transcoded MP4, optional HLS), all under video-processor/... prefixes.

1) Question responses

End-to-end flow

Stored paths (question response)

Raw input (before video-processor):

  • branded-video-flow/videos/{videoFlowId}/{sessionId}/{questionId}.{mp4|webm}
  • The upload route uses addRandomSuffix: true, so the real filename can become, for example, question1-<random>.mp4.

Derived files written by video-processor:

  • Audio:
    • logical path: audio/{videoFlowId}/{sessionId}/{questionId}.wav
    • actual stored key: video-processor/audio/audio/{videoFlowId}/{sessionId}/{questionId}.wav
  • Static thumbnail(s):
    • logical path: videos/{videoFlowId}/{sessionId}/{questionId}-thumbnail.jpg
    • actual stored key: video-processor/video/videos/{videoFlowId}/{sessionId}/{questionId}-thumbnail-<random>.jpg
  • Animated thumbnail (optional):
    • logical path: videos/{videoFlowId}/{sessionId}/{questionId}-animated-thumbnail.gif
    • actual stored key: video-processor/video/videos/{videoFlowId}/{sessionId}/{questionId}-animated-thumbnail-<random>.gif
  • Transcoded MP4 (only when needed, optional):
    • logical path: videos/{videoFlowId}/{sessionId}/{questionId}-transcoded.mp4
    • actual stored key: video-processor/video/videos/{videoFlowId}/{sessionId}/{questionId}-transcoded-<random>.mp4
  • HLS package (optional):
    • logical base: videos/{videoFlowId}/{sessionId}/{questionId}/hls/...
    • actual stored base: video-processor/video/videos/{videoFlowId}/{sessionId}/{questionId}/hls/...
    • includes master.m3u8, rendition playlists, and .ts segments.

2) Final video responses

End-to-end flow

Stored paths (final video response)

Raw input (before video-processor):

  • library/finalVideos/{videoFlowId}/{sessionId}/{id}-v{version}.{ext}
  • The upload route uses addRandomSuffix: true, so the stored filename may include a random suffix.

Derived files written by video-processor use the final video id as questionId:

  • Audio:
    • video-processor/audio/audio/{videoFlowId}/{sessionId}/{finalVideoId}.wav
  • Thumbnail(s):
    • video-processor/video/videos/{videoFlowId}/{sessionId}/{finalVideoId}-thumbnail-<random>.jpg
  • Animated thumbnail (optional):
    • video-processor/video/videos/{videoFlowId}/{sessionId}/{finalVideoId}-animated-thumbnail-<random>.gif
  • Transcoded MP4 (optional):
    • video-processor/video/videos/{videoFlowId}/{sessionId}/{finalVideoId}-transcoded-<random>.mp4
  • HLS (optional):
    • video-processor/video/videos/{videoFlowId}/{sessionId}/{finalVideoId}/hls/...

3) How paths are constructed

Core generators

@repo/blob/path-generation.ts constructs logical paths:

  • generateVideoPath(...):
    • videos/{videoFlowId}/{sessionId}/{questionId}{-suffix}.{ext}
  • generateAudioPath(...):
    • audio/{videoFlowId}/{sessionId}/{questionId}.wav
  • generateFinalVideoPath(...):
    • library/finalVideos/{videoFlowId}/{sessionId}/{id}-v{version}.{ext}
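The three templates above can be sketched as plain string builders. This is a minimal illustration only: the function names (sketchVideoPath, sketchAudioPath, sketchFinalVideoPath), the ResponseRef type, and the parameter order are assumptions, not the actual signatures in @repo/blob/path-generation.ts, and segment sanitization is left out.

```typescript
// Illustrative stand-ins for the generators in @repo/blob/path-generation.ts.
// The parameter shapes below are assumptions; the real signatures may differ.
interface ResponseRef {
  videoFlowId: string;
  sessionId: string;
  questionId: string;
}

// videos/{videoFlowId}/{sessionId}/{questionId}{-suffix}.{ext}
function sketchVideoPath(ref: ResponseRef, ext: string, suffix?: string): string {
  const tail = suffix ? `-${suffix}` : "";
  return `videos/${ref.videoFlowId}/${ref.sessionId}/${ref.questionId}${tail}.${ext}`;
}

// audio/{videoFlowId}/{sessionId}/{questionId}.wav
function sketchAudioPath(ref: ResponseRef): string {
  return `audio/${ref.videoFlowId}/${ref.sessionId}/${ref.questionId}.wav`;
}

// library/finalVideos/{videoFlowId}/{sessionId}/{id}-v{version}.{ext}
function sketchFinalVideoPath(
  videoFlowId: string,
  sessionId: string,
  id: string,
  version: number,
  ext: string
): string {
  return `library/finalVideos/${videoFlowId}/${sessionId}/${id}-v${version}.${ext}`;
}

// Example identifiers are made up for illustration.
const exampleRef: ResponseRef = { videoFlowId: "flow1", sessionId: "sess1", questionId: "question1" };
const thumbnailPath = sketchVideoPath(exampleRef, "jpg", "thumbnail");
// thumbnailPath is "videos/flow1/sess1/question1-thumbnail.jpg"
```

These are logical paths only; the stored key additionally carries the client prefix and, for most uploads, a random suffix.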

Sanitization rules

For generated segments (videoFlowId, sessionId, questionId, id):

  • lowercased
  • diacritics removed
  • invalid chars replaced with -
  • repeated - collapsed
  • leading/trailing - trimmed

So the logical keys from path generators are normalized.
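The rules above can be sketched as a single normalization function. The name sanitizeSegmentSketch is hypothetical, and the set of characters treated as valid (a-z, 0-9, and -) is an assumption; the real generator may allow more.

```typescript
// A minimal sketch of the sanitization rules, applied in the order listed above.
// ASSUMPTION: only a-z, 0-9 and "-" count as valid characters.
function sanitizeSegmentSketch(raw: string): string {
  return raw
    .toLowerCase()                    // lowercased
    .normalize("NFD")                 // split letters from their combining marks
    .replace(/[\u0300-\u036f]/g, "")  // diacritics removed
    .replace(/[^a-z0-9-]/g, "-")      // invalid chars replaced with -
    .replace(/-+/g, "-")              // repeated - collapsed
    .replace(/^-+|-+$/g, "");         // leading/trailing - trimmed
}

const sanitizedExample = sanitizeSegmentSketch("  My Flow__01 ");
// sanitizedExample is "my-flow-01"
```

Because every segment passes through the same normalization, two identifiers that differ only in case or accents would map to the same logical key under this sketch.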

Prefix layering (why keys start with video-processor/...)

BlobClient prepends basePrefix from client config:

  • video client prefix: video-processor/video
  • audio client prefix: video-processor/audio

Then it appends the logical path produced by the generators.

So:

  • videos/... becomes video-processor/video/videos/...
  • audio/... becomes video-processor/audio/audio/...
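The doubled segments (video/videos, audio/audio) fall out of simple concatenation, as the sketch below shows. resolveKeySketch is a hypothetical stand-in for what BlobClient does internally; the slash trimming is an assumption added for robustness.

```typescript
// Prefixes taken from the client config described above.
const VIDEO_PREFIX = "video-processor/video";
const AUDIO_PREFIX = "video-processor/audio";

// Hypothetical stand-in for BlobClient key resolution: basePrefix + "/" + logical path.
// Trimming stray slashes is an assumption, not confirmed BlobClient behavior.
function resolveKeySketch(basePrefix: string, logicalPath: string): string {
  const prefix = basePrefix.replace(/\/+$/, "");
  const path = logicalPath.replace(/^\/+/, "");
  return `${prefix}/${path}`;
}

const videoKey = resolveKeySketch(VIDEO_PREFIX, "videos/flow1/sess1/question1-thumbnail.jpg");
// videoKey is "video-processor/video/videos/flow1/sess1/question1-thumbnail.jpg"

const audioKey = resolveKeySketch(AUDIO_PREFIX, "audio/flow1/sess1/question1.wav");
// audioKey is "video-processor/audio/audio/flow1/sess1/question1.wav"
```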

Random suffix behavior

  • Upload routes (handleUpload) enforce addRandomSuffix: true.
  • Thumbnail and transcoded uploads from video-processor also use a random suffix.
  • Extracted audio intentionally uses a deterministic path (addRandomSuffix: false) to keep one stable WAV per (videoFlowId, sessionId, questionId).
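The effect of addRandomSuffix on the stored filename can be sketched as follows. This is an illustration based on the key shapes shown earlier in this document (suffix placed before the extension); the helper names and the way the random string is generated are assumptions, not the storage SDK's actual implementation.

```typescript
// Inserts "-<suffix>" before the file extension, matching the key shapes
// documented above (e.g. question1-thumbnail-<random>.jpg).
function applyRandomSuffixSketch(path: string, suffix: string): string {
  const slash = path.lastIndexOf("/");
  const dot = path.lastIndexOf(".");
  // No extension in the final segment: append the suffix at the end.
  if (dot <= slash + 1) return `${path}-${suffix}`;
  return `${path.slice(0, dot)}-${suffix}${path.slice(dot)}`;
}

// Hypothetical suffix generator; the real suffix format is not specified here.
function makeSuffixSketch(): string {
  return Math.random().toString(36).slice(2, 10);
}

// Hypothetical wrapper showing the two modes described above.
function storedPathSketch(logicalPath: string, addRandomSuffix: boolean): string {
  return addRandomSuffix ? applyRandomSuffixSketch(logicalPath, makeSuffixSketch()) : logicalPath;
}

// Audio stays deterministic, so a re-run maps to the same WAV path.
const audioStored = storedPathSketch("audio/flow1/sess1/question1.wav", false);
// audioStored is "audio/flow1/sess1/question1.wav"

// Thumbnails get a fresh suffix on every upload.
const thumbStored = storedPathSketch("videos/flow1/sess1/question1-thumbnail.jpg", true);
```

One consequence of this split: callers can compute the audio key from the identifiers alone, while thumbnail and transcoded keys have to be read back from the upload result.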

4) Important behavior notes

  • The video processor reads its input by URL and writes only derivatives; by default it does not duplicate the raw input file.
  • Optional branches:
    • transcoding runs only when format detection requires it
    • HLS generation is non-fatal, so the HLS package may be absent
  • On failure, video-processor attempts to clean up derivatives it has already uploaded.