Nowhere Plans
Apr 28, 2026

trebek

GitHub · PyPI

What you're missing

Jeopardy! is a ritual for you. You want to keep up with it, or to train for the day you walk onto the Alex Trebek Stage. You learn best through repetition and quizzes, and you need structured notes to study from, but there's no easy way to get them out of the vast library of Jeopardy! video content.

Why you need this

You want to own your personal Jeopardy! knowledge base, quiz yourself on the categories you want to study, and look up any answer and its context in seconds. There are a few options for this, but none of them are great.

Trebek is a crash-immune pipeline for extracting structured data from Jeopardy! episodes. It handles the messy parts, such as inaccurate transcriptions and episodes split across two files, so you can focus on studying.

How do I get this?

Prerequisites: Python 3.11+ and a free Gemini API key. An NVIDIA GPU is recommended for WhisperX transcription. Docker is the easiest path since it bundles all GPU dependencies for you.
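Before installing, it can help to confirm the prerequisites above on your machine. The following is a minimal sketch and not part of trebek itself: the helper names `read_env_key` and `check_prerequisites` are hypothetical, and it assumes the key lives in a plain KEY=VALUE `.env` file and that a working NVIDIA driver exposes `nvidia-smi` on your PATH.

```python
from __future__ import annotations

import shutil
import sys
from pathlib import Path


def read_env_key(env_path: Path, name: str) -> str | None:
    """Return the value of `name` from a simple KEY=VALUE .env file, if set."""
    if not env_path.is_file():
        return None
    for line in env_path.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip('"').strip("'") or None
    return None


def check_prerequisites(env_path: Path = Path(".env")) -> dict[str, bool]:
    """Report which prerequisites appear to be satisfied (hypothetical helper)."""
    return {
        "python_3_11_plus": sys.version_info >= (3, 11),
        "gemini_api_key_set": read_env_key(env_path, "GEMINI_API_KEY") is not None,
        # The GPU is recommended, not required, so treat this as advisory.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        "docker_found": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for check, ok in check_prerequisites().items():
        print(f"{check}: {'ok' if ok else 'missing'}")
```

A missing `nvidia_smi_found` or `docker_found` entry is not necessarily a blocker; it only tells you which of the install paths below is likely to work.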

Hybrid mode (recommended)

The lightweight CLI runs natively while GPU workloads are delegated to the official Docker image:

bash
pip install trebek
echo "GEMINI_API_KEY=<your_key_here>" > .env
trebek run \
  --input-dir </path/to/your/videos> \
  --docker

Full Docker deployment

bash
git clone https://github.com/arvarik/trebek.git && cd trebek
cp .env.example .env
# Set GEMINI_API_KEY
docker compose up -d

The pipeline is fully crash-immune. Kill it at any point and it resumes exactly where it left off via SQLite-backed state tracking. Run trebek stats for a live dashboard of pipeline health, cost tracking, and stage timing.
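To make the resume behaviour concrete, here is a minimal sketch of SQLite-backed state tracking in general. It is an illustration under stated assumptions, not trebek's actual schema or code: the stage names, the `stage_state` table, and the `open_state` and `run_pipeline` functions are all hypothetical.

```python
import sqlite3
from typing import Callable, Dict, List, Tuple

# Hypothetical stage names; the real pipeline's stages may differ.
STAGES = ["transcribe", "extract", "verify"]


def open_state(path: str) -> sqlite3.Connection:
    """Open (or create) the state database that records finished stages."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stage_state ("
        " episode TEXT NOT NULL,"
        " stage TEXT NOT NULL,"
        " PRIMARY KEY (episode, stage))"
    )
    conn.commit()
    return conn


def run_pipeline(
    conn: sqlite3.Connection,
    episodes: List[str],
    handlers: Dict[str, Callable[[str], None]],
) -> List[Tuple[str, str]]:
    """Run every unfinished (episode, stage) pair and return what was executed."""
    executed = []
    for episode in episodes:
        for stage in STAGES:
            done = conn.execute(
                "SELECT 1 FROM stage_state WHERE episode = ? AND stage = ?",
                (episode, stage),
            ).fetchone()
            if done:
                continue  # finished in an earlier run, so skip it
            handlers[stage](episode)
            # Record completion only after the stage succeeds, so a crash
            # mid-stage leaves no row and the stage is retried on restart.
            conn.execute(
                "INSERT INTO stage_state (episode, stage) VALUES (?, ?)",
                (episode, stage),
            )
            conn.commit()
            executed.append((episode, stage))
    return executed
```

The design choice that matters is committing one row per completed stage: if the process is killed partway through, the next run re-queries the table and starts at the first stage with no row. This only works cleanly when each stage is safe to repeat, since a stage interrupted halfway is run again from its start.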