Automated Video Tutorial Creation with OpenAI and OBS
Generate AI narration for videos using OpenAI GPT-4 Vision and OpenAI TTS, then record professional screencasts with OBS Studio.
Overview
This module provides a complete video tutorial creation workflow:
- OpenAI GPT-4 Vision - Analyzes video frames to generate contextual narration scripts
- OpenAI TTS - Converts scripts to natural-sounding speech with HD quality
- OBS Studio - Records the final screencast over the generated narration
Sample Video: See a completed tutorial created with this workflow: Earnings Dashboard Tutorial
Prerequisites
OpenAI API Key
export OPENAI_API_KEY='your-openai-api-key'
FFmpeg (for frame extraction)
- Already set up at /tmp/ffmpeg-7.0.2-amd64-static/ffmpeg
Python Package
pip install openai
Quick Start
Using Pre-extracted Frames
from src.openai_vision import OpenAIScriptGenerator
from openai import OpenAI
import glob
# Get frames
frame_paths = sorted(glob.glob('src/polly/frames/*.jpg'))
# Generate script from frames
script_gen = OpenAIScriptGenerator()
script = script_gen.generate_from_frames(
frame_paths=frame_paths,
duration_seconds=30,
style='professional'
)
# Convert to audio with OpenAI TTS
client = OpenAI()
response = client.audio.speech.create(
model='tts-1-hd',
voice='onyx',
input=script
)
response.write_to_file('output.mp3')
Complete Workflow (Extract + Analyze + Generate)
from src.openai_vision import VideoNarrationEngine
engine = VideoNarrationEngine()
# Automatically: extract frames → analyze → generate audio
engine.process_video_with_frames(
video_path='video.mov',
output_audio_path='narration.mp3',
num_frames=5,
style='professional'
)
Examples
Run the example script:
# Set your OpenAI API key
export OPENAI_API_KEY='sk-...'
# Run example with existing frames
python -m src.openai_vision.example
API Reference
OpenAIScriptGenerator
script_gen = OpenAIScriptGenerator(api_key='sk-...')
# Analyze frames
script = script_gen.generate_from_frames(
frame_paths=['frame1.jpg', 'frame2.jpg'],
duration_seconds=30,
style='professional' # or 'casual', 'tutorial', 'educational'
)
# Generate from description
script = script_gen.generate_from_description(
video_description="Tutorial showing...",
duration_seconds=60,
style='tutorial'
)
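Under the hood, `generate_from_frames` has to package the sampled frames into a vision request. The helper below is an illustrative sketch (not the actual implementation) of how frames, already base64-encoded, could be assembled into a Chat Completions payload with image inputs; the function name and prompt wording are assumptions.

```python
def build_vision_messages(frames_b64, duration_seconds, style):
    """Assemble a chat payload asking GPT-4 Vision for a narration script.

    `frames_b64` is a list of base64-encoded JPEG strings; the prompt text
    and structure here are illustrative, not the module's actual internals.
    """
    content = [{
        "type": "text",
        "text": (
            f"Write a {style} narration script for a {duration_seconds}-second "
            "video. The attached frames are evenly spaced samples from it."
        ),
    }]
    for b64 in frames_b64:
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
        })
    return [{"role": "user", "content": content}]

# Usage (assuming an OpenAI client and a vision-capable model):
# client.chat.completions.create(
#     model="gpt-4o",
#     messages=build_vision_messages(frames_b64, 30, "professional"),
# )
```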
VideoNarrationEngine
engine = VideoNarrationEngine(
openai_api_key='sk-...',
tts_voice='onyx' # alloy, echo, fable, onyx, nova, shimmer
)
# From frames
engine.process_video_with_frames(
video_path='video.mp4',
output_audio_path='narration.mp3',
num_frames=5,
style='professional',
voice='onyx'
)
# From description
engine.process_video_with_description(
video_path='video.mp4',
description="This video shows...",
output_audio_path='narration.mp3',
style='tutorial',
voice='onyx'
)
Narration Styles
- professional - Clear, authoritative, business-like
- casual - Friendly, conversational, relaxed
- tutorial - Patient, instructive, educational
- educational - Warm, engaging, teaching
- friendly - Upbeat, approachable, enthusiastic
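Internally, a style name presumably maps to tone instructions embedded in the prompt. A minimal sketch of such a mapping (the strings here simply echo the descriptions above; the module's real prompts may differ):

```python
# Hypothetical style-to-tone mapping; the actual prompt text used by
# OpenAIScriptGenerator may differ.
STYLE_PROMPTS = {
    "professional": "Clear, authoritative, business-like.",
    "casual": "Friendly, conversational, relaxed.",
    "tutorial": "Patient, instructive, educational.",
    "educational": "Warm, engaging, teaching.",
    "friendly": "Upbeat, approachable, enthusiastic.",
}

def style_instruction(style):
    """Return tone instructions, falling back to 'professional' for unknown names."""
    return STYLE_PROMPTS.get(style, STYLE_PROMPTS["professional"])
```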
Cost Comparison
| Service | Approximate Cost |
|---|---|
| OpenAI GPT-4 Vision | ~$0.01-0.02 per analysis |
| OpenAI TTS HD | ~$0.03 per 1,000 characters |
| Total | ~$0.02-0.05 per video |
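For budgeting, a rough per-video estimate can be computed from the script length. The helper below simply mirrors the figures in the table (actual OpenAI pricing may change; check the current price list):

```python
def estimate_cost(script_chars, vision_cost=0.015, tts_per_1k_chars=0.03):
    """Rough per-video cost: one Vision analysis plus HD TTS priced per 1,000 characters.

    Defaults mirror the table above; they are approximations, not live pricing.
    """
    return vision_cost + (script_chars / 1000) * tts_per_1k_chars

# A ~1,000-character script lands in the middle of the table's total range:
# estimate_cost(1000) -> 0.045
```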
Troubleshooting
OpenAI API Key Error
# Set environment variable
export OPENAI_API_KEY='your-key-here'
# Or pass directly
script_gen = OpenAIScriptGenerator(api_key='sk-...')
FFmpeg Not Found
Install the static FFmpeg binary (no sudo required):
# Download static FFmpeg binary
cd /tmp
curl -O https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz
# Extract
tar -xf ffmpeg-release-amd64-static.tar.xz
# Verify installation
/tmp/ffmpeg-7.0.2-amd64-static/ffmpeg -version
The static binary will be available at /tmp/ffmpeg-7.0.2-amd64-static/ffmpeg.
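Once the binary is in place, frame extraction for the vision step can be driven from Python. The helper below is a sketch (the function name is an assumption, and the video's duration must be known up front) that builds an `ffmpeg` command sampling evenly spaced frames via the `fps` filter:

```python
import subprocess
from pathlib import Path

# Static binary path from the setup above.
FFMPEG = "/tmp/ffmpeg-7.0.2-amd64-static/ffmpeg"

def frame_extract_cmd(video_path, out_dir, num_frames=5, duration_seconds=30.0):
    """Build an ffmpeg command that samples `num_frames` evenly spaced frames.

    Sampling rate = frames wanted / video duration, so 5 frames over 30 s
    means one frame roughly every 6 seconds.
    """
    fps = num_frames / duration_seconds
    return [
        FFMPEG, "-i", str(video_path),
        "-vf", f"fps={fps:.4f}",      # sample at the computed rate
        "-frames:v", str(num_frames),  # stop after num_frames images
        str(Path(out_dir) / "frame_%03d.jpg"),
    ]

# Usage:
# subprocess.run(frame_extract_cmd("video.mov", "frames/"), check=True)
```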
Recording Your Screencast with OBS
After generating your narration audio, use OBS Studio to create the final screencast:
1. Install OBS Studio
macOS:
# Download from official website
open https://obsproject.com/download
# Or if you have Homebrew:
brew install --cask obs
Linux:
# Ubuntu/Debian
sudo apt install obs-studio
# Or use a static binary if one is already available
2. Configure OBS
Open OBS Studio and create a new scene
Add Screen Capture:
- Click "+" under Sources
- Select “Screen Capture” (Linux) or “macOS Screen Capture” (Mac)
- Choose your display
- Click OK
Add Audio Source:
- Click "+" under Sources
- Select “Media Source”
- Check “Local File”
- Browse to your narration file: src/openai-vision/earnings_tutorial_narration.mp3
- Uncheck “Loop”
- Click OK
Configure Audio Mixer:
- In the Audio Mixer panel (bottom), mute your microphone
- Only keep the Media Source audio enabled
- Adjust volume levels as needed
3. Grant Permissions (macOS)
- Open System Settings → Privacy & Security
- Scroll to Screen Recording
- Toggle ON for “OBS”
- Restart OBS if prompted
4. Record Your Screencast
- Prepare your browser with the earnings dashboard open
- In OBS, click “Start Recording”
- The audio will play automatically - follow along with your screen actions
- When the audio ends, click “Stop Recording”
5. Find Your Recording
- Default location: ~/Videos/ (macOS/Linux)
- Format: .mkv or .mp4 (configure in Settings → Output)
Tips for Better Results
- Practice first: Run through the screens 2-3 times before recording
- Timing: The narration is ~83 seconds; plan your actions accordingly
- Audio quality: Use the generated audio directly - no microphone noise
- Output settings: Settings → Output → Recording Quality: “High Quality, Medium File Size”
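If you regenerate the narration, you can estimate its runtime before recording rather than timing the mp3 by hand. This is a rough rule-of-thumb helper (the characters-per-second rate is an assumption; verify against the actual audio):

```python
def estimate_narration_seconds(script, chars_per_second=15.0):
    """Rough runtime estimate for TTS narration.

    English TTS speech averages roughly 15 characters/second -- an
    assumption to sanity-check against the generated mp3, not a guarantee.
    """
    return len(script) / chars_per_second
```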
Integration with Existing Code
This module provides a complete OpenAI solution for video narration. Use it when:
- You want a unified OpenAI workflow (Vision + TTS)
- AWS Bedrock has permission issues
- You prefer OpenAI’s GPT-4 Vision quality
- You want faster setup with just an API key
Available OpenAI TTS Voices
- alloy - Neutral, balanced
- echo - Clear, professional male
- fable - Warm, expressive
- onyx - Deep, authoritative male (recommended for tutorials)
- nova - Friendly female
- shimmer - Soft, engaging female
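To audition the voices side by side, you can generate one short sample per voice. The sketch below builds the request parameters (the output filenames are an arbitrary choice); the commented loop then feeds each one to the same `client.audio.speech.create` call used earlier:

```python
VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]

def voice_samples(text, model="tts-1-hd"):
    """Build one TTS request per voice for side-by-side comparison.

    The sample_<voice>.mp3 filenames are illustrative, not required.
    """
    return [
        {"model": model, "voice": v, "input": text, "path": f"sample_{v}.mp3"}
        for v in VOICES
    ]

# Usage (assuming an OpenAI client):
# for req in voice_samples("Welcome to the earnings dashboard."):
#     client.audio.speech.create(
#         model=req["model"], voice=req["voice"], input=req["input"]
#     ).write_to_file(req["path"])
```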
