A Python-powered podcast production automation system for transcription, subtitles, B-roll sourcing, OpenAI Vision review, refill loops, review boards, and edit cut maps.
Project Snapshot
Technical Footprint
BEIA Media Suite is a Python-powered podcast production automation system built to support my BEIA Journal media workflow.
The strongest part of the project is the automation layer inside 06_SCRIPTS. I built scripts that help move an episode from raw recording into usable production assets: extracted audio, normalised audio, transcripts, SRT subtitles, B-roll manifests, sourced B-roll clips, AI-assisted visual review reports, reject folders, refill manifests, review boards, and edit cut maps.
The transcription script uses FFmpeg to extract and normalise audio, then uses Whisper large-v3 to generate English transcripts and SRT subtitle files. This means raw video or audio can be turned into structured written material for editing, captions, clips, and repurposing.
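As an illustration only, the two halves of this step can be sketched as a command builder for FFmpeg and a converter from transcript segments to SRT text. This is not the actual script: the function names, the 16 kHz mono settings, and the choice of the loudnorm filter are assumptions, and the segment shape (start, end, text) is assumed to match what the openai-whisper package returns from transcribe.

```python
from pathlib import Path


def build_ffmpeg_command(source, target):
    """Return an FFmpeg argument list that extracts and normalises audio.

    Assumed settings: drop video (-vn), mono (-ac 1), 16 kHz (-ar 16000),
    EBU R128 loudness normalisation (-af loudnorm).
    """
    return [
        "ffmpeg", "-y",
        "-i", str(Path(source)),
        "-vn", "-ac", "1", "-ar", "16000",
        "-af", "loudnorm",
        str(Path(target)),
    ]


def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Turn a list of {'start', 'end', 'text'} dicts into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

The command list would typically be passed to subprocess.run, and the segments would come from the transcription model's output for the normalised file.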
The B-roll system starts with episode theme maps. extract_queries.py reads the theme map, finds section headers and search terms, maps them into canonical BEIA sections like Cold Open, Question, Scene, Discovery, Emotional Honesty, Integration, Scripture, Mirror, and Outro, then creates JSON manifests and episode-specific B-roll folders.
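A minimal sketch of the theme-map-to-manifest idea follows. The real theme map format is not shown here, so this assumes a simple text layout where lines starting with "#" are section headers and lines starting with "-" are search terms. The function names and the manifest shape are hypothetical.

```python
import json

# Canonical BEIA sections, as listed in the description above.
CANONICAL_SECTIONS = [
    "Cold Open", "Question", "Scene", "Discovery", "Emotional Honesty",
    "Integration", "Scripture", "Mirror", "Outro",
]


def canonical_section(header):
    """Map a free-form header to a canonical section name, or None."""
    lowered = header.lower()
    for name in CANONICAL_SECTIONS:
        if name.lower() in lowered:
            return name
    return None


def build_manifest(theme_map_text, episode):
    """Collect search terms under each recognised section header."""
    manifest = {"episode": episode, "sections": {}}
    current = None
    for raw in theme_map_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            # Unrecognised headers reset the section, so their terms are skipped.
            current = canonical_section(line.lstrip("#").strip())
        elif line.startswith("-") and current:
            term = line.lstrip("-").strip()
            if term:
                manifest["sections"].setdefault(current, []).append(term)
    return manifest


if __name__ == "__main__":
    sample = "# 1. Cold Open\n- misty forest\n# Outro\n- sunset over hills"
    print(json.dumps(build_manifest(sample, "ep01"), indent=2))
```

Keeping the manifest as plain JSON means later stages (fetching, review, refill) can read the same file without knowing how the theme map was written.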
The sourcing layer uses the Pexels Videos API. beia_broll_fetcher.py takes the manifest, decorates search queries with a nature/no-people guardrail, fetches clips in parallel, avoids previously rejected or already-used videos, downloads MP4 files into the right episode folders, and writes download metadata for later review.
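The query decoration, exclusion, and parallel fetch described above could look roughly like this. It is a sketch under assumptions, not the code in beia_broll_fetcher.py: the guardrail wording is invented, the search callable is injected (in practice it might wrap a call to the Pexels video search endpoint), and each video is assumed to be a dict with an id and a video_files list whose entries carry file_type and link.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical guardrail text appended to every search query.
GUARDRAIL = "nature landscape no people"


def decorate_query(query):
    """Append the nature/no-people guardrail to a raw search term."""
    return f"{query.strip()} {GUARDRAIL}"


def select_candidates(videos, excluded_ids, limit=1):
    """Pick up to `limit` MP4 links, skipping rejected or already-used ids."""
    picked = []
    for video in videos:
        if video["id"] in excluded_ids:
            continue
        mp4s = [f for f in video.get("video_files", [])
                if f.get("file_type") == "video/mp4"]
        if mp4s:
            picked.append({"id": video["id"], "url": mp4s[0]["link"]})
        if len(picked) == limit:
            break
    return picked


def fetch_all(queries, search, excluded_ids, workers=4):
    """Run `search` for every decorated query in parallel threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda q: search(decorate_query(q)), queries)
        return {q: select_candidates(videos, excluded_ids)
                for q, videos in zip(queries, results)}
```

Injecting the search function keeps the selection logic testable without network access or an API key.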
The review layer uses FFmpeg/ffprobe to sample frames from each clip and OpenAI Vision to classify whether the clip should be kept, kept pending manual review, or rejected. It checks for visible people, food, indoor settings, man-made objects, text, or logos, then writes JSON and CSV review reports and can move rejected clips into _reject folders.
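To make the three-way outcome concrete, here is a small sketch of frame sampling and the keep / manual review / reject decision. It assumes the vision model's answer has already been reduced to one evidence level per category ("none", "possible", or "clear"); that shape, the category keys, and the function names are hypothetical and not taken from the actual review script.

```python
# Categories checked by the review, using hypothetical key names.
CATEGORIES = ("people", "food", "indoor", "man_made", "text", "logos")


def sample_timestamps(duration, frames=3):
    """Return evenly spaced timestamps (seconds) inside a clip."""
    step = duration / (frames + 1)
    return [round(step * i, 3) for i in range(1, frames + 1)]


def decide(findings):
    """Reject only on clear evidence; uncertain clips go to manual review."""
    levels = [findings.get(category, "none") for category in CATEGORIES]
    if "clear" in levels:
        return "reject"
    if "possible" in levels:
        return "manual_review"
    return "keep"
```

For example, a clip with a possible person in one frame and nothing else flagged would be held for manual review rather than rejected outright.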
The system also includes a refill loop. If a query does not have an accepted clip, beia_refill_loop.py creates a missing report, builds a smaller refill manifest, fetches replacements, runs the review again, and repeats the cycle until the missing slots are filled or the maximum rounds are reached.
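The control flow of the refill cycle can be sketched as a bounded loop. This is an illustration under assumptions rather than the contents of beia_refill_loop.py: fetch and review are injected callables standing in for the sourcing and review stages, and the history list plays the role of the missing report.

```python
def refill_loop(queries, accepted, fetch, review, max_rounds=3):
    """Fetch and review replacements until every query has an accepted clip.

    `fetch(query)` returns candidate clips; `review(clip)` returns a verdict
    string. Stops early when nothing is missing, or after `max_rounds`.
    """
    accepted = dict(accepted)  # copy so the caller's mapping is untouched
    history = []
    for round_no in range(1, max_rounds + 1):
        missing = [q for q in queries if q not in accepted]
        if not missing:
            break
        history.append({"round": round_no, "missing": missing})
        for query in missing:
            for clip in fetch(query):
                if review(clip) == "keep":
                    accepted[query] = clip
                    break
    return accepted, history
```

Capping the number of rounds matters because some queries may never return an acceptable clip, and the loop should end with a report instead of retrying forever.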
For editing, beia_cut_engine.py parses SRT files and a blueprint JSON file to assemble timeline segments and export a STEP 5 FCP-style timeline map. This connects transcripts and episode structure back into the edit process.
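A simplified sketch of that idea: parse SRT cues into timed entries, then use a blueprint to group cue ranges into labelled timeline segments. The blueprint shape used here (a "segments" list with label, from_cue, and to_cue) is an assumption for illustration, since the real blueprint format and the STEP 5 export layout are not shown.

```python
import re

TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")


def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    hours, minutes, secs, millis = (int(p) for p in TIME.match(stamp.strip()).groups())
    return hours * 3600 + minutes * 60 + secs + millis / 1000


def parse_srt(text):
    """Return {cue_number: {'start', 'end', 'text'}} from SRT text."""
    cues = {}
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip malformed blocks
        start, end = lines[1].split("-->")
        cues[int(lines[0].strip())] = {
            "start": to_seconds(start),
            "end": to_seconds(end),
            "text": " ".join(lines[2:]),
        }
    return cues


def build_cut_map(cues, blueprint):
    """Turn blueprint cue ranges into labelled timeline segments."""
    return [
        {
            "label": seg["label"],
            "start": cues[seg["from_cue"]]["start"],
            "end": cues[seg["to_cue"]]["end"],
        }
        for seg in blueprint["segments"]
    ]
```

The resulting list of labelled start and end times is the kind of structure that could then be written out as a timeline map for the editor.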
This project is not a finished publishing platform yet. It does not yet demonstrate upload automation or final delivery into the export folders. Its real value is the production automation system: turning a messy podcast process into a repeatable pipeline for transcripts, subtitles, visual sourcing, AI review, edit planning, and short-form content preparation.
I built the Python automation layer inside 06_SCRIPTS, including tools for transcription, subtitle generation, B-roll query extraction, Pexels video sourcing, OpenAI Vision review, reject handling, refill loops, review boards, and cut-map generation.
I also designed the supporting production structure around raw recordings, synced media, edit planning, assets, exports, and scripts so that the automation had a clear media workflow to operate inside.
The main technical contribution was turning a messy manual podcast process into a repeatable pipeline.
One of the main challenges was reducing the repeated manual work behind podcast production. Recording the episode is only the beginning. After that comes audio extraction, audio normalisation, transcription, subtitles, B-roll planning, visual sourcing, clip review, rejected assets, refill searches, and edit mapping.
Another challenge was making B-roll selection systematic. Instead of randomly searching for footage, I built a theme-map-to-manifest process so each episode’s visuals could follow the emotional and editorial structure of the episode.
The OpenAI Vision review layer also needed careful rules. The system was built to reject clips only when there was clear visible evidence of things that did not fit the nature-only direction, such as people, food, indoor scenes, man-made objects, text, or logos.
A current limitation is that the system supports production preparation more than final publishing. It helps create transcripts, subtitles, B-roll assets, review reports, refill manifests, review boards, and cut maps, but upload automation and final export discipline are not fully proven yet.
BEIA Media Suite reduces the manual workload behind producing BEIA podcast episodes.
It helps turn raw recordings into production-ready assets: transcripts, subtitles, B-roll manifests, downloaded clips, review reports, rejected clip folders, refill manifests, review boards, and edit maps.
As a portfolio project, it shows my ability to build automation around a real creative workflow. This is not just file organisation. It is a working media automation toolkit that connects FFmpeg, Whisper, Pexels, OpenAI Vision, JSON manifests, CSV reports, and editorial planning.
This project taught me that media production can be engineered like a pipeline. Each stage has inputs, outputs, validation, reports, and handoffs.
I also learnt that automation works best when it supports the creative process rather than replacing it. The scripts do not make the final edit for me, but they remove a lot of repeated manual work around transcription, B-roll sourcing, visual review, and edit planning.
The biggest learning was that creative systems need both editorial structure and operational discipline. The theme maps provide the story structure, while the scripts turn that structure into manifests, folders, clips, review reports, refill loops, and cut maps.