Automated Video Tutorial Creation with OpenAI and OBS
Generate AI narration for videos using OpenAI GPT-4 Vision and OpenAI TTS, then record professional screencasts with OBS Studio.
Overview
This module provides a complete video tutorial creation workflow:
- OpenAI GPT-4 Vision - Analyzes video frames to generate contextual narration scripts
- OpenAI TTS - Converts scripts to natural-sounding speech with HD quality
- OBS Studio - Records the final screencast over the generated narration
Sample Video: See a completed tutorial created with this workflow: Earnings Dashboard Tutorial
Prerequisites
OpenAI API Key
export OPENAI_API_KEY='your-openai-api-key'
FFmpeg (for frame extraction)
- Already set up at /tmp/ffmpeg-7.0.2-amd64-static/ffmpeg
Python Package
pip install openai
Quick Start
Using Pre-extracted Frames
from src.openai_vision import OpenAIScriptGenerator
from openai import OpenAI
import glob
# Get frames
frame_paths = sorted(glob.glob('src/polly/frames/*.jpg'))
# Generate script from frames
script_gen = OpenAIScriptGenerator()
script = script_gen.generate_from_frames(
frame_paths=frame_paths,
duration_seconds=30,
style='professional'
)
# Convert to audio with OpenAI TTS
client = OpenAI()
response = client.audio.speech.create(
model='tts-1-hd',
voice='onyx',
input=script
)
response.write_to_file('output.mp3')
Complete Workflow (Extract + Analyze + Generate)
from src.openai_vision import VideoNarrationEngine
engine = VideoNarrationEngine()
# Automatically: extract frames → analyze → generate audio
engine.process_video_with_frames(
video_path='video.mov',
output_audio_path='narration.mp3',
num_frames=5,
style='professional'
)
Examples
Run the example script:
# Set your OpenAI API key
export OPENAI_API_KEY='sk-...'
# Run example with existing frames
python -m src.openai_vision.example
API Reference
OpenAIScriptGenerator
script_gen = OpenAIScriptGenerator(api_key='sk-...')
# Analyze frames
script = script_gen.generate_from_frames(
frame_paths=['frame1.jpg', 'frame2.jpg'],
duration_seconds=30,
style='professional' # or 'casual', 'tutorial', 'educational'
)
# Generate from description
script = script_gen.generate_from_description(
video_description="Tutorial showing...",
duration_seconds=60,
style='tutorial'
)
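Under the hood, `generate_from_frames` has to package the sampled frames into a vision request. The helper below is an illustrative sketch (not the actual implementation) of how frames, already base64-encoded, could be assembled into a Chat Completions payload with image inputs; the function name and prompt wording are assumptions.

```python
def build_vision_messages(frames_b64, duration_seconds, style):
    """Assemble a chat payload asking GPT-4 Vision for a narration script.

    `frames_b64` is a list of base64-encoded JPEG strings; the prompt text
    and structure here are illustrative, not the module's actual internals.
    """
    content = [{
        "type": "text",
        "text": (
            f"Write a {style} narration script for a {duration_seconds}-second "
            "video. The attached frames are evenly spaced samples from it."
        ),
    }]
    for b64 in frames_b64:
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
        })
    return [{"role": "user", "content": content}]

# Usage (assuming an OpenAI client and a vision-capable model):
# client.chat.completions.create(
#     model="gpt-4o",
#     messages=build_vision_messages(frames_b64, 30, "professional"),
# )
```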
VideoNarrationEngine
engine = VideoNarrationEngine(
openai_api_key='sk-...',
tts_voice='onyx' # alloy, echo, fable, onyx, nova, shimmer
)
# From frames
engine.process_video_with_frames(
video_path='video.mp4',
output_audio_path='narration.mp3',
num_frames=5,
style='professional',
voice='onyx'
)
# From description
engine.process_video_with_description(
video_path='video.mp4',
description="This video shows...",
output_audio_path='narration.mp3',
style='tutorial',
voice='onyx'
)
Narration Styles
- professional - Clear, authoritative, business-like
- casual - Friendly, conversational, relaxed
- tutorial - Patient, instructive, educational
- educational - Warm, engaging, teaching
- friendly - Upbeat, approachable, enthusiastic
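Internally, a style name presumably maps to tone instructions embedded in the prompt. A minimal sketch of such a mapping (the strings here simply echo the descriptions above; the module's real prompts may differ):

```python
# Hypothetical style-to-tone mapping; the actual prompt text used by
# OpenAIScriptGenerator may differ.
STYLE_PROMPTS = {
    "professional": "Clear, authoritative, business-like.",
    "casual": "Friendly, conversational, relaxed.",
    "tutorial": "Patient, instructive, educational.",
    "educational": "Warm, engaging, teaching.",
    "friendly": "Upbeat, approachable, enthusiastic.",
}

def style_instruction(style):
    """Return tone instructions, falling back to 'professional' for unknown names."""
    return STYLE_PROMPTS.get(style, STYLE_PROMPTS["professional"])
```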
Cost Comparison
| Service | Approximate Cost |
|---|---|
| OpenAI GPT-4 Vision | ~$0.01-0.02 per analysis |
| OpenAI TTS HD | ~$0.03 per 1,000 characters |
| Total | ~$0.02-0.05 per video |
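For budgeting, a rough per-video estimate can be computed from the script length. The helper below simply mirrors the figures in the table (actual OpenAI pricing may change; check the current price list):

```python
def estimate_cost(script_chars, vision_cost=0.015, tts_per_1k_chars=0.03):
    """Rough per-video cost: one Vision analysis plus HD TTS priced per 1,000 characters.

    Defaults mirror the table above; they are approximations, not live pricing.
    """
    return vision_cost + (script_chars / 1000) * tts_per_1k_chars

# A ~1,000-character script lands in the middle of the table's total range:
# estimate_cost(1000) -> 0.045
```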
Troubleshooting
OpenAI API Key Error
# Set environment variable
export OPENAI_API_KEY='your-key-here'
# Or pass directly
script_gen = OpenAIScriptGenerator(api_key='sk-...')
FFmpeg Not Found
Install the static FFmpeg binary (no sudo required):
# Download static FFmpeg binary
cd /tmp
curl -O https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz
# Extract
tar -xf ffmpeg-release-amd64-static.tar.xz
# Verify installation
/tmp/ffmpeg-7.0.2-amd64-static/ffmpeg -version
The static binary will be available at /tmp/ffmpeg-7.0.2-amd64-static/ffmpeg.
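Once the binary is in place, frame extraction for the vision step can be driven from Python. The helper below is a sketch (the function name is an assumption, and the video's duration must be known up front) that builds an `ffmpeg` command sampling evenly spaced frames via the `fps` filter:

```python
import subprocess
from pathlib import Path

# Static binary path from the setup above.
FFMPEG = "/tmp/ffmpeg-7.0.2-amd64-static/ffmpeg"

def frame_extract_cmd(video_path, out_dir, num_frames=5, duration_seconds=30.0):
    """Build an ffmpeg command that samples `num_frames` evenly spaced frames.

    Sampling rate = frames wanted / video duration, so 5 frames over 30 s
    means one frame roughly every 6 seconds.
    """
    fps = num_frames / duration_seconds
    return [
        FFMPEG, "-i", str(video_path),
        "-vf", f"fps={fps:.4f}",      # sample at the computed rate
        "-frames:v", str(num_frames),  # stop after num_frames images
        str(Path(out_dir) / "frame_%03d.jpg"),
    ]

# Usage:
# subprocess.run(frame_extract_cmd("video.mov", "frames/"), check=True)
```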
Recording Your Screencast with OBS
After generating your narration audio, use OBS Studio to create the final screencast:
1. Install OBS Studio
macOS:
# Download from official website
open https://obsproject.com/download
# Or if you have Homebrew:
brew install --cask obs
Linux:
# Ubuntu/Debian
sudo apt install obs-studio
# Or use a static binary if one is already available
2. Configure OBS
Open OBS Studio and create a new scene
Add Screen Capture:
- Click "+" under Sources
- Select “Screen Capture” (Linux) or “macOS Screen Capture” (Mac)
- Choose your display
- Click OK
Add Audio Source:
- Click "+" under Sources
- Select “Media Source”
- Check “Local File”
- Browse to your narration file: src/openai-vision/earnings_tutorial_narration.mp3
- Uncheck “Loop”
- Click OK
Configure Audio Mixer:
- In the Audio Mixer panel (bottom), mute your microphone
- Only keep the Media Source audio enabled
- Adjust volume levels as needed
3. Grant Permissions (macOS)
- Open System Settings → Privacy & Security
- Scroll to Screen Recording
- Toggle ON for “OBS”
- Restart OBS if prompted
4. Record Your Screencast
- Prepare your browser with the earnings dashboard open
- In OBS, click “Start Recording”
- The audio will play automatically - follow along with your screen actions
- When the audio ends, click “Stop Recording”
5. Find Your Recording
- Default location: ~/Videos/ (macOS/Linux)
- Format: .mkv or .mp4 (configure in Settings → Output)
Tips for Better Results
- Practice first: Run through the screens 2-3 times before recording
- Timing: The narration is ~83 seconds; plan your actions accordingly
- Audio quality: Use the generated audio directly - no microphone noise
- Output settings: Settings → Output → Recording Quality: “High Quality, Medium File Size”
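If you regenerate the narration, you can estimate its runtime before recording rather than timing the mp3 by hand. This is a rough rule-of-thumb helper (the characters-per-second rate is an assumption; verify against the actual audio):

```python
def estimate_narration_seconds(script, chars_per_second=15.0):
    """Rough runtime estimate for TTS narration.

    English TTS speech averages roughly 15 characters/second -- an
    assumption to sanity-check against the generated mp3, not a guarantee.
    """
    return len(script) / chars_per_second
```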
Integration with Existing Code
This module provides a complete OpenAI solution for video narration. Use it when:
- You want a unified OpenAI workflow (Vision + TTS)
- AWS Bedrock has permission issues
- You prefer OpenAI’s GPT-4 Vision quality
- You want faster setup with just an API key
Available OpenAI TTS Voices
- alloy - Neutral, balanced
- echo - Clear, professional male
- fable - Warm, expressive
- onyx - Deep, authoritative male (recommended for tutorials)
- nova - Friendly female
- shimmer - Soft, engaging female
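To audition the voices side by side, you can generate one short sample per voice. The sketch below builds the request parameters (the output filenames are an arbitrary choice); the commented loop then feeds each one to the same `client.audio.speech.create` call used earlier:

```python
VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]

def voice_samples(text, model="tts-1-hd"):
    """Build one TTS request per voice for side-by-side comparison.

    The sample_<voice>.mp3 filenames are illustrative, not required.
    """
    return [
        {"model": model, "voice": v, "input": text, "path": f"sample_{v}.mp3"}
        for v in VOICES
    ]

# Usage (assuming an OpenAI client):
# for req in voice_samples("Welcome to the earnings dashboard."):
#     client.audio.speech.create(
#         model=req["model"], voice=req["voice"], input=req["input"]
#     ).write_to_file(req["path"])
```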
