Audio Generation Production Guide

Overview

This guide covers the complete audio generation workflow for Convergence Protocol: all 100 narrator sayings + 19 sound effects/UI sounds.

Estimated time: 1-2 hours (mostly waiting for API)
Estimated cost: ~$1-2 USD
Status: 15/100 sayings generated; 19/19 sound effects pending

Prerequisites

1. ElevenLabs Account & API Key

Required to generate narrator voices and sound effects.

Setup:

Sign up at https://elevenlabs.io (free account includes $5 monthly credit)
Go to Account → API Keys and copy your API key
Store securely (never commit to git)
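
One way to keep the key out of source files is to read it from the environment at startup and fail with a clear message when it is missing. The sketch below is illustrative only: `load_api_key` is a hypothetical helper, not a function confirmed to exist in the project scripts, and it assumes the variable name `ELEVENLABS_API_KEY` used throughout this guide.

```python
import os


def load_api_key(env=None):
    """Return the ElevenLabs API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ELEVENLABS_API_KEY is not set; export it in your shell before running."
        )
    return key


# Demo with a stand-in mapping instead of the real environment.
print(load_api_key({"ELEVENLABS_API_KEY": "demo-key"}))
```

Because the key is only ever read from the environment, nothing sensitive ends up in a file that could be committed by accident.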

Test your key:

curl -X GET https://api.elevenlabs.io/v1/user \
  -H "xi-api-key: YOUR_KEY_HERE"

A valid key returns JSON with account details; an invalid key returns a 401 error.

2. Python 3.7+

python3 --version  # Should be 3.7 or higher

3. Required Python Libraries

pip install requests

Or from the project directory:

pip install -r requirements.txt  # if file exists

Quick Start (5 minutes)

Step 1: Set environment variable

export ELEVENLABS_API_KEY="your-actual-api-key"

Step 2: Generate all 100 narrator sayings

cd convergence-protocol
python3 scripts/generate_sayings_batch.py

Expected output:

============================================================
ElevenLabs TTS Batch Generator - Narrator Sayings
============================================================
Output: /path/to/assets/audio/sayings

MOLOCH (5 sayings)
----------------------------------------
  → moloch_01_tragedy.opus... ✓
  ⊘ moloch_02_participation.opus (exists)
  ...

Generated: 85/100
Skipped:   15/100
Failed:    0/100
============================================================

Total .opus files in sayings/: 100

Step 3: Generate sound effects (door/ambient/UI)

python3 generate_audio.py

Expected output:

🎵 ElevenLabs Audio Generation for Convergence Protocol

Generating sound effects...
✓ Generated: assets/audio/doors/metal_open.wav
✓ Generated: assets/audio/doors/metal_close.wav
...

Generating narration...
✓ Generated narration: assets/audio/narration/welcome.wav
...

✅ Audio generation complete!
Files saved to: /full/path/to/assets/audio

Detailed Walkthrough

Option 1: Python Script (Recommended)

Recommended because it’s cross-platform, portable, and doesn’t require external CLI tools.

Command:

export ELEVENLABS_API_KEY="your-key"
python3 scripts/generate_sayings_batch.py

What it does:

Reads all 100 sayings from an internal dictionary
Skips files that already exist (safe to re-run)
Generates missing .opus files in assets/audio/sayings/
Reports: generated, skipped, failed counts
Takes ~15-30 minutes for full batch (depends on API rate limits)
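
The skip-if-exists behaviour described above can be sketched roughly as follows. This is not the actual code of `scripts/generate_sayings_batch.py`; the function name `generate_missing`, the shape of the `sayings` dictionary, and the `synthesize` callable (a stand-in for the real API call) are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def generate_missing(sayings, out_dir, synthesize):
    """Write one .opus file per saying, skipping files that already exist.

    sayings: dict mapping file stem -> text (hypothetical shape).
    synthesize: callable(text) -> bytes; stands in for the real API call.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    counts = {"generated": 0, "skipped": 0, "failed": 0}
    for stem, text in sayings.items():
        target = out_dir / f"{stem}.opus"
        if target.exists():
            counts["skipped"] += 1  # safe to re-run: existing files are left alone
            continue
        try:
            # synthesize() runs before the file is opened, so a failure leaves no file behind
            target.write_bytes(synthesize(text))
            counts["generated"] += 1
        except Exception as exc:
            print(f"failed: {target.name}: {exc}")
            counts["failed"] += 1
    return counts


# Demo with a fake synthesizer and a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = {"moloch_01_tragedy": "first", "moloch_02_participation": "second"}
    fake = lambda text: text.encode("utf-8")
    print(generate_missing(demo, tmp, fake))  # first run generates both files
    print(generate_missing(demo, tmp, fake))  # second run skips both files
```

Keeping the API call behind a callable is a design choice for this sketch: it lets the loop be exercised without network access or credits.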

Troubleshooting:

ModuleNotFoundError: No module named 'requests' → Run pip install requests
ELEVENLABS_API_KEY: not found → Set environment variable first
401 Unauthorized → Check API key is correct
429 Too Many Requests → API rate limit; script will retry (waits 60s)
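
For readers curious what "retry after waiting 60s" might look like, here is a minimal sketch. It is an assumption about the general pattern, not a copy of the project's retry logic: `RateLimited`, `call_with_retry`, and the `max_attempts` default are hypothetical names and values introduced for this example.

```python
import time


class RateLimited(Exception):
    """Raised by the caller's request function on HTTP 429 (hypothetical)."""


def call_with_retry(request_fn, max_attempts=3, wait_seconds=60, sleep=time.sleep):
    """Call request_fn(); on RateLimited, wait and try again up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except RateLimited:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            sleep(wait_seconds)


# Demo: fail twice, then succeed; waits are recorded instead of slept.
state = {"calls": 0}
recorded_waits = []


def flaky_request():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RateLimited()
    return "ok"


print(call_with_retry(flaky_request, sleep=recorded_waits.append), recorded_waits)
```

Passing `sleep` as a parameter keeps the demo instant; in real use the default `time.sleep` would apply the 60-second wait.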

Option 2: Bash Script with openclaw CLI

Use this only if you have the openclaw command installed and configured.

Command:

export ELEVENLABS_API_KEY="your-key"
./scripts/generate_remaining_sayings_portable.sh

Note: This script is less reliable. Use the Python script instead.

Option 3: Manual Generation (Not Recommended)

Generate sayings one-by-one using curl:

curl -X POST https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "You will pay. The only question is when, and how much.",
    "model_id": "eleven_monolingual_v1",
    "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
  }' \
  --output assets/audio/sayings/technical_04_pay.opus

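Don’t do this for the full set; the Python script is much faster for 100 sayings. Also note that this endpoint returns MP3 by default, so a file saved this way may not be Opus-encoded despite its .opus extension unless an Opus output format is requested.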

Sound Effects Generation

After narrator sayings are complete, generate door/ambient/UI sounds:

Command:

export ELEVENLABS_API_KEY="your-key"
python3 generate_audio.py

What it does:

Generates 8 door sound effects (open, close, locked, hover states)
Generates 4 ambient sounds (hallway loop, void drone, neon buzz, footsteps)
Generates 3 UI confirmation sounds (check-in, complete, transition)
Generates 3 narration snippets (welcome, check-in, complete)
Total: 19 files, ~5-10 minutes

Output structure:

assets/audio/
├── sayings/           (100 .opus files)
├── doors/             (8 .wav files)
├── ambient/           (4 .wav files)
├── ui/                (3 .wav files)
└── narration/         (3 .wav files)

File Verification

After generation, verify all files exist:

# Count narrator sayings
ls -1 assets/audio/sayings/*.opus | wc -l
# Should output: 100
 
# Count all audio files
find assets/audio -type f \( -name "*.opus" -o -name "*.wav" \) | wc -l
# Should output: 118 (100 sayings + 8 doors + 4 ambient + 3 UI + 3 narration)
 
# List missing files (if any)
python3 << 'EOF'
from pathlib import Path
 
required = {
    "sayings": 100,
    "doors": 8,
    "ambient": 4,
    "ui": 3,
    "narration": 3
}
 
for subdir, count in required.items():
    path = Path(f"assets/audio/{subdir}")
    if path.exists():
        files = list(path.glob("*.*"))
        status = "✓" if len(files) == count else "✗"
        print(f"{status} {subdir}: {len(files)}/{count}")
    else:
        print(f"✗ {subdir}: 0/{count} (directory missing)")
EOF

Cost Estimation

Narrator Sayings (100 total)

Average: 15 words per saying
Average: ~100 characters per saying
ElevenLabs TTS pricing: ~$0.03 per 1000 characters
Total: 100 × 100 / 1000 × $0.03 = $0.30

Sound Effects (19 total)

Average: 3-5 seconds per sound
ElevenLabs Sound Effects pricing: ~$0.008 per second
Total: 19 × 4 seconds × $0.008 ≈ $0.61

Grand Total

~$1.00 USD (leaves plenty of room in free tier)
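
The arithmetic above can be reproduced with a few lines of Python, which also makes it easy to re-run with different assumptions. The rates are the approximate figures quoted in this guide, not official pricing, and the function names are hypothetical.

```python
def tts_cost(num_sayings=100, chars_per_saying=100, usd_per_1000_chars=0.03):
    """Estimated text-to-speech cost in USD, using this guide's approximate rate."""
    return num_sayings * chars_per_saying / 1000 * usd_per_1000_chars


def sfx_cost(num_sounds=19, seconds_per_sound=4, usd_per_second=0.008):
    """Estimated sound-effect cost in USD, using this guide's approximate rate."""
    return num_sounds * seconds_per_sound * usd_per_second


sayings = tts_cost()
effects = sfx_cost()
print(f"sayings: ${sayings:.2f}")            # sayings: $0.30
print(f"effects: ${effects:.2f}")            # effects: $0.61
print(f"total:   ${sayings + effects:.2f}")  # total:   $0.91
```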

Integration with Experiences

After audio generation is complete, the next step is wiring narrator sayings into all 40 experiences.

Current Status

✅ Done (3 experiences):

thucydides-trap.html
normal_accidents.html
technical_debt.html

⏳ Pending (37 experiences):

All others
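
A rough way to see which experiences are still pending is to scan the HTML files for the narrator hook. The sketch below assumes that a wired page contains the string `initNarrator` and that all experience pages sit in a single directory; both are assumptions, and `split_by_wiring` and the `experiences` directory name are hypothetical.

```python
from pathlib import Path


def split_by_wiring(experience_dir, marker="initNarrator"):
    """Return (wired, pending) lists of .html file names in experience_dir.

    A page counts as wired if its text contains `marker` (an assumed heuristic).
    """
    wired, pending = [], []
    for page in sorted(Path(experience_dir).glob("*.html")):
        text = page.read_text(encoding="utf-8", errors="replace")
        (wired if marker in text else pending).append(page.name)
    return wired, pending


if __name__ == "__main__":
    # "experiences" is a placeholder path; point it at the real directory.
    wired, pending = split_by_wiring("experiences")
    print(f"wired: {len(wired)}, pending: {len(pending)}")
    for name in pending:
        print(f"  pending: {name}")
```

A string match like this can give false positives (for example, a commented-out call), so treat the output as a checklist to review rather than a definitive status.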

How to Wire an Experience

Copy the narrator card HTML/CSS from technical_debt.html (lines ~1-60 CSS + ~100 HTML)
Add the SAYINGS array with 5 sayings for that experience
Call initNarrator() when conclusion screen appears
Test audio playback

Example (for moloch.html):

const SAYINGS = [
  {
    text: "The tragedy is not that we compete...",
    audio: "../assets/audio/sayings/moloch_01_tragedy.opus",
  },
  {
    text: "Moloch does not want your victory...",
    audio: "../assets/audio/sayings/moloch_02_participation.opus",
  },
  // ... etc
]
 
// In conclusion handler:
function showConclusion() {
  // ... show conclusion screen
  setTimeout(() => initNarrator(), 1000)
}

Testing Audio Playback

After files are generated, test in browser:

Open index.html (Motel lobby)
  - Should hear ambient hallway sound after loading
  - Mute button (“♪ AUDIO”) should toggle sound on/off
Hover over doors
  - Available door: ethereal chime
  - Preview door: digital glitch
  - Locked door: warning drone
Click a door
  - Should hear door open/close sound
  - Door should navigate to experience
Complete an experience
  - Conclusion should show narrator card
  - Click play button to hear narrator voice
  - Use dots to navigate between 5 sayings

Troubleshooting

API Errors

401 Unauthorized

Error: xi-api-key invalid
Fix: Check ELEVENLABS_API_KEY is set correctly

429 Too Many Requests

Error: Rate limit exceeded
Fix: Wait 60 seconds; script retries automatically

503 Service Unavailable

Error: ElevenLabs API down
Fix: Try again in 5 minutes
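
If you wrap the API calls in your own tooling, the three cases above can be turned into a small lookup so that error output points straight at the fix. This is only a sketch: the advice strings are taken from this guide, while `API_ERROR_ADVICE` and `advice_for_status` are hypothetical names and the fallback message is an assumption.

```python
# Advice text is taken from this guide's troubleshooting entries.
API_ERROR_ADVICE = {
    401: "Check ELEVENLABS_API_KEY is set correctly",
    429: "Wait 60 seconds; script retries automatically",
    503: "Try again in 5 minutes",
}


def advice_for_status(status_code):
    """Return the guide's suggested fix for an HTTP status, or a generic fallback."""
    return API_ERROR_ADVICE.get(
        status_code, f"Unexpected status {status_code}; review the console output"
    )


for code in (401, 429, 503, 500):
    print(code, "->", advice_for_status(code))
```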

File Errors

Files exist but won’t play

Check browser console for CORS errors
Serve via local server (python3 -m http.server 8000), not file:// protocol
Check file paths in narrator cards (should be ../assets/audio/...)

Missing .opus files

Run python3 scripts/generate_sayings_batch.py again
Check API key and quota

Files generated but AudioManager won’t load them

Check file exists: ls assets/audio/doors/metal_open.wav
Check file size > 0: du assets/audio/doors/metal_open.wav
Check browser console for fetch errors
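
Checking sizes one file at a time gets tedious, so the size check can be applied to the whole audio tree at once. The following is a minimal sketch, assuming the `assets/audio` layout shown earlier and the `.opus`/`.wav` extensions used in this guide; `find_empty_audio` is a hypothetical helper.

```python
from pathlib import Path

AUDIO_SUFFIXES = {".opus", ".wav"}


def find_empty_audio(root):
    """Return sorted relative paths of zero-byte .opus/.wav files under root."""
    root = Path(root)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES and p.stat().st_size == 0
    )


if __name__ == "__main__":
    empty = find_empty_audio("assets/audio")
    if empty:
        print("Zero-byte audio files (delete and regenerate):")
        for name in empty:
            print(f"  {name}")
    else:
        print("No zero-byte audio files found")
```

Since the generation scripts skip files that already exist, an empty file would need to be deleted before re-running them.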

Next Steps

✅ Set up ElevenLabs account (5 min)
✅ Generate narrator sayings (30 min) → python3 scripts/generate_sayings_batch.py
✅ Generate sound effects (10 min) → python3 generate_audio.py
🔲 Wire narrator into 37 experiences (2-3 hours)
🔲 Test audio playback in browser (1 hour)
🔲 Polish audio levels (30 min)
🔲 Implement challenge mode (2-4 hours) — separate from audio

Reference Files

scripts/generate_sayings_batch.py — Batch generate all 100 narrator sayings
scripts/generate_remaining_sayings_portable.sh — Alternative bash script
generate_audio.py — Generate door/ambient/UI sound effects
assets/audio/README.md — Audio setup documentation
Clue_Design_Document.xlsx — Challenge mode theme framework (separate)
AUDIO_GENERATION_STATUS.md — Current status and issues

Support

If you encounter issues:

Check AUDIO_GENERATION_STATUS.md for known issues
Review console output (look for error messages)
Verify file paths are relative to project root
Check ElevenLabs account status (quota, API key validity)
Try re-running scripts (safe; skips existing files)

Last Updated: February 2026
Status: Ready for production audio generation
