Automated Video Tutorial Creation with OpenAI and OBS

Generate AI narration for videos using OpenAI GPT-4 Vision and OpenAI TTS, then record professional screencasts with OBS Studio.

Overview

This module provides a complete video tutorial creation workflow:

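  • OpenAI GPT-4 Vision - Analyzes video frames to generate contextual narration scripts
  • OpenAI TTS - Converts scripts to natural-sounding speech using the HD model (tts-1-hd)
  • OBS Studio - Records the final screencast with the generated narration as its audio track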

Sample Video: See a completed tutorial created with this workflow: Earnings Dashboard Tutorial

Prerequisites

  1. OpenAI API Key

    export OPENAI_API_KEY='your-openai-api-key'
    
  2. FFmpeg (for frame extraction)

    • Already set up at /tmp/ffmpeg-7.0.2-amd64-static/ffmpeg
  3. Python Package

    pip install openai
    

Quick Start

Using Pre-extracted Frames

from src.openai_vision import OpenAIScriptGenerator
from openai import OpenAI
import glob

# Get frames
frame_paths = sorted(glob.glob('src/polly/frames/*.jpg'))

# Generate script from frames
script_gen = OpenAIScriptGenerator()
script = script_gen.generate_from_frames(
    frame_paths=frame_paths,
    duration_seconds=30,
    style='professional'
)

# Convert to audio with OpenAI TTS
client = OpenAI()
response = client.audio.speech.create(
    model='tts-1-hd',
    voice='onyx',
    input=script
)
response.write_to_file('output.mp3')
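
The OpenAI speech endpoint limits each request to 4,096 input characters, so a long script may need to be split before the TTS call above. The helper below is a minimal sketch of one way to do that at sentence boundaries; `split_script` and `MAX_TTS_CHARS` are hypothetical names for illustration and are not part of this module.

```python
import re

MAX_TTS_CHARS = 4096  # per-request input limit of the speech endpoint


def split_script(script: str, limit: int = MAX_TTS_CHARS) -> list[str]:
    """Split a script into chunks of at most `limit` characters,
    breaking between sentences where possible."""
    sentences = re.split(r'(?<=[.!?])\s+', script.strip())
    chunks, current = [], ''
    for sentence in sentences:
        # Hard-wrap a single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ''
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f'{current} {sentence}'.strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `client.audio.speech.create(...)` in turn; the resulting MP3 parts would still need to be joined, for example with FFmpeg.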

Complete Workflow (Extract + Analyze + Generate)

from src.openai_vision import VideoNarrationEngine

engine = VideoNarrationEngine()

# Automatically: extract frames → analyze → generate audio
engine.process_video_with_frames(
    video_path='video.mov',
    output_audio_path='narration.mp3',
    num_frames=5,
    style='professional'
)
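
How `VideoNarrationEngine` chooses its frames is not documented here. As a rough sketch of the extraction step, assuming frames are sampled at evenly spaced points, the timestamps and the matching FFmpeg command could be built as follows. Both function names are hypothetical.

```python
def frame_timestamps(duration_seconds: float, num_frames: int) -> list[float]:
    """Return timestamps at the midpoints of `num_frames` equal segments."""
    if num_frames < 1 or duration_seconds <= 0:
        raise ValueError('num_frames must be >= 1 and duration must be positive')
    step = duration_seconds / num_frames
    return [round(step * (i + 0.5), 3) for i in range(num_frames)]


def ffmpeg_frame_command(ffmpeg: str, video_path: str,
                         timestamp: float, output_path: str) -> list[str]:
    """Build an FFmpeg argument list that grabs one JPEG frame at `timestamp`."""
    return [
        ffmpeg, '-y',            # overwrite the output file if it exists
        '-ss', str(timestamp),   # seek before decoding (fast input seek)
        '-i', video_path,
        '-frames:v', '1',        # write exactly one video frame
        '-q:v', '2',             # high JPEG quality
        output_path,
    ]
```

The argument list can be passed to `subprocess.run(cmd, check=True)`. Sampling midpoints rather than segment starts avoids grabbing a black or transitional first frame.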

Examples

Run the example script:

# Set your OpenAI API key
export OPENAI_API_KEY='sk-...'

# Run example with existing frames
python -m src.openai-vision.example

API Reference

OpenAIScriptGenerator

script_gen = OpenAIScriptGenerator(api_key='sk-...')

# Analyze frames
script = script_gen.generate_from_frames(
    frame_paths=['frame1.jpg', 'frame2.jpg'],
    duration_seconds=30,
    style='professional'  # or 'casual', 'tutorial', 'educational', 'friendly'
)

# Generate from description
script = script_gen.generate_from_description(
    video_description="Tutorial showing...",
    duration_seconds=60,
    style='tutorial'
)
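
For readers curious what a frame-based request might look like underneath, the sketch below builds a Chat Completions message that carries frames as base64 data URLs. This is an illustration of the general vision request format, not the module's actual implementation; `image_part` and `build_vision_messages` are hypothetical names, and the prompt wording is invented.

```python
import base64
from pathlib import Path


def image_part(jpeg_bytes: bytes) -> dict:
    """Wrap raw JPEG bytes as an image_url content part (base64 data URL)."""
    encoded = base64.b64encode(jpeg_bytes).decode('ascii')
    return {
        'type': 'image_url',
        'image_url': {'url': f'data:image/jpeg;base64,{encoded}'},
    }


def build_vision_messages(frame_paths: list[str],
                          duration_seconds: int, style: str) -> list[dict]:
    """Build one user message: a text instruction followed by every frame."""
    prompt = (
        f'Write a {style} narration script of about {duration_seconds} '
        'seconds for the video these frames were taken from.'
    )
    content = [{'type': 'text', 'text': prompt}]
    for path in frame_paths:
        content.append(image_part(Path(path).read_bytes()))
    return [{'role': 'user', 'content': content}]
```

The returned list can be passed as `messages=` to `client.chat.completions.create(...)` with a vision-capable model of your choice.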

VideoNarrationEngine

engine = VideoNarrationEngine(
    openai_api_key='sk-...',
    tts_voice='onyx'  # alloy, echo, fable, onyx, nova, shimmer
)

# From frames
engine.process_video_with_frames(
    video_path='video.mp4',
    output_audio_path='narration.mp3',
    num_frames=5,
    style='professional',
    voice='onyx'
)

# From description
engine.process_video_with_description(
    video_path='video.mp4',
    description="This video shows...",
    output_audio_path='narration.mp3',
    style='tutorial',
    voice='onyx'
)

Narration Styles

  • professional - Clear, authoritative, business-like
  • casual - Friendly, conversational, relaxed
  • tutorial - Patient, instructive, educational
  • educational - Warm, engaging, teaching
  • friendly - Upbeat, approachable, enthusiastic
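
To illustrate how a style name and a target duration could be turned into a prompt instruction, here is a small sketch. The tone descriptions come from the list above; the 150 words-per-minute speaking rate and both function names are assumptions, not values taken from the module.

```python
STYLE_TONES = {
    'professional': 'clear, authoritative, business-like',
    'casual': 'friendly, conversational, relaxed',
    'tutorial': 'patient, instructive, educational',
    'educational': 'warm, engaging, teaching',
    'friendly': 'upbeat, approachable, enthusiastic',
}

WORDS_PER_MINUTE = 150  # assumed average narration pace


def word_budget(duration_seconds: float, wpm: int = WORDS_PER_MINUTE) -> int:
    """Approximate number of words that fit in the given duration."""
    return round(duration_seconds * wpm / 60)


def style_instruction(style: str, duration_seconds: float) -> str:
    """Return a one-line instruction describing tone and length."""
    if style not in STYLE_TONES:
        raise ValueError(f'Unknown style {style!r}; expected one of {sorted(STYLE_TONES)}')
    return (
        f'Use a {STYLE_TONES[style]} tone and keep the script to about '
        f'{word_budget(duration_seconds)} words.'
    )
```

A word budget tends to be a more reliable length control than asking a model for "30 seconds of narration" directly.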

Cost Comparison

Service                Cost per Request
OpenAI GPT-4 Vision    ~$0.01-0.02
OpenAI TTS HD          ~$0.03 per 1,000 characters
Total                  ~$0.02-0.05
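
As a worked example of the arithmetic behind the total, the sketch below combines the two figures from the table. The rates are copied from the table and may not match current pricing; `estimate_cost` is a hypothetical helper.

```python
TTS_HD_USD_PER_1K_CHARS = 0.03    # from the table above
VISION_USD_RANGE = (0.01, 0.02)   # from the table above


def estimate_cost(script_chars: int) -> tuple[float, float]:
    """Return a (low, high) USD estimate for one vision request plus TTS."""
    tts = script_chars / 1000 * TTS_HD_USD_PER_1K_CHARS
    low = round(VISION_USD_RANGE[0] + tts, 4)
    high = round(VISION_USD_RANGE[1] + tts, 4)
    return low, high
```

For a 1,000-character script this gives roughly $0.04 to $0.05, so the table's total corresponds to scripts of up to about 1,000 characters.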

Troubleshooting

OpenAI API Key Error

# Set environment variable
export OPENAI_API_KEY='your-key-here'

# Or pass directly
script_gen = OpenAIScriptGenerator(api_key='sk-...')
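
To fail early with a readable message instead of an API error, the key lookup can be checked before any client is created. This is a minimal sketch; `resolve_api_key` is a hypothetical helper, and whether the module performs a similar check internally is not stated here.

```python
import os


def resolve_api_key(explicit_key=None, env=None) -> str:
    """Return the explicit key if given, else OPENAI_API_KEY from the environment."""
    env = os.environ if env is None else env
    key = (explicit_key or env.get('OPENAI_API_KEY', '')).strip()
    if not key:
        raise RuntimeError(
            'No OpenAI API key found. Set OPENAI_API_KEY or pass api_key=...'
        )
    return key
```

The result can be passed straight through, for example `OpenAIScriptGenerator(api_key=resolve_api_key())`.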

FFmpeg Not Found

Install the static FFmpeg binary (no sudo required):

# Download static FFmpeg binary
cd /tmp
curl -O https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz

# Extract
tar -xf ffmpeg-release-amd64-static.tar.xz

# Verify installation
/tmp/ffmpeg-7.0.2-amd64-static/ffmpeg -version

The extracted directory name includes the release version. For the 7.0.2 release, the static binary will be available at /tmp/ffmpeg-7.0.2-amd64-static/ffmpeg.
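
Because the extracted directory name changes with each FFmpeg release, hard-coding the path is fragile. The sketch below looks on PATH first and then falls back to a glob over /tmp. `find_ffmpeg` is a hypothetical helper, and the glob pattern assumes the directory layout shown above.

```python
import glob
import shutil


def find_ffmpeg(static_glob: str = '/tmp/ffmpeg-*-amd64-static/ffmpeg',
                which=shutil.which):
    """Return a path to an ffmpeg binary, or None if none is found."""
    on_path = which('ffmpeg')
    if on_path:
        return on_path
    # Plain string sort: good enough for a single extracted release,
    # but it does not compare version numbers properly.
    matches = sorted(glob.glob(static_glob))
    return matches[-1] if matches else None
```

If the function returns None, follow the download steps above and call it again.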

Recording Your Screencast with OBS

After generating your narration audio, use OBS Studio to create the final screencast:

1. Install OBS Studio

macOS:

# Download from official website
open https://obsproject.com/download

# Or if you have Homebrew:
brew install --cask obs

Linux:

# Ubuntu/Debian
sudo apt install obs-studio

# Or use a static binary if one is already available

2. Configure OBS

  1. Open OBS Studio and create a new scene

  2. Add Screen Capture:

    • Click "+" under Sources
    • Select “Screen Capture” (Linux) or “macOS Screen Capture” (Mac)
    • Choose your display
    • Click OK
  3. Add Audio Source:

    • Click "+" under Sources
    • Select “Media Source”
    • Check “Local File”
    • Browse to your narration file: src/openai-vision/earnings_tutorial_narration.mp3
    • Uncheck “Loop”
    • Click OK
  4. Configure Audio Mixer:

    • In the Audio Mixer panel (bottom), mute your microphone
    • Only keep the Media Source audio enabled
    • Adjust volume levels as needed

3. Grant Permissions (macOS)

  1. Open System Settings → Privacy & Security
  2. Scroll to Screen Recording
  3. Toggle ON for “OBS”
  4. Restart OBS if prompted

4. Record Your Screencast

  1. Prepare your browser with the earnings dashboard open
  2. In OBS, click “Start Recording”
  3. The audio will play automatically - follow along with your screen actions
  4. When the audio ends, click “Stop Recording”

5. Find Your Recording

  • Default location: ~/Movies/ on macOS; on Linux, check Settings → Output → Recording Path (often your home directory)
  • Format: .mkv or .mp4 (configure in Settings → Output)

Tips for Better Results

  • Practice first: Run through the screens 2-3 times before recording
  • Timing: The narration runs about 83 seconds, so plan your on-screen actions to fit
  • Audio quality: Use the generated audio directly - no microphone noise
  • Output settings: Settings → Output → Recording Quality: “High Quality, Medium File Size”
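
To help with the timing tip, a rough cue sheet can be derived from the script text. The sketch below assumes each sentence takes time proportional to its character count, which is only an approximation of real TTS pacing; `cue_sheet` is a hypothetical helper.

```python
import re


def cue_sheet(script: str, total_seconds: float) -> list[tuple[float, str]]:
    """Estimate when each sentence starts, in seconds from the beginning."""
    sentences = [s for s in re.split(r'(?<=[.!?])\s+', script.strip()) if s]
    total_chars = sum(len(s) for s in sentences)
    cues, elapsed = [], 0.0
    for sentence in sentences:
        cues.append((round(elapsed, 1), sentence))
        # Advance by this sentence's share of the total duration.
        elapsed += total_seconds * len(sentence) / total_chars
    return cues
```

Printing the cues next to your browser while practicing gives a sense of which screen should be visible at each point in the narration.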

Integration with Existing Code

This module provides a complete OpenAI solution for video narration. Use it when:

  • You want a unified OpenAI workflow (Vision + TTS)
  • AWS Bedrock has permission issues
  • You prefer OpenAI’s GPT-4 Vision quality
  • You want faster setup with just an API key

Available OpenAI TTS Voices

  • alloy - Neutral, balanced
  • echo - Clear, professional male
  • fable - Warm, expressive
  • onyx - Deep, authoritative male (recommended for tutorials)
  • nova - Friendly female
  • shimmer - Soft, engaging female
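
To compare voices before committing to one, a short sample can be rendered with each. The sketch below reuses the `client.audio.speech.create(...)` and `write_to_file(...)` calls from the Quick Start; `audition_voices` and the output file names are hypothetical, and each voice costs one TTS request.

```python
from pathlib import Path

VOICES = ('alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer')


def audition_voices(client, text: str, out_dir: str = '.',
                    model: str = 'tts-1-hd', voices=VOICES) -> list[str]:
    """Render `text` once per voice and return the paths of the MP3 files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for voice in voices:
        path = out / f'sample_{voice}.mp3'
        response = client.audio.speech.create(model=model, voice=voice, input=text)
        response.write_to_file(str(path))
        written.append(str(path))
    return written
```

Typical usage would be `audition_voices(OpenAI(), 'Welcome to the earnings dashboard.')`, after which you can listen to each sample and pass the preferred name as `voice=` or `tts_voice=`.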