Audio Generation Production Guide

Overview

This guide covers the complete audio generation workflow for Convergence Protocol: all 100 narrator sayings + 19 sound effects/UI sounds.

Estimated time: 1-2 hours (mostly waiting for the API)
Estimated cost: ~$1-2 USD
Status: 15/100 sayings generated; 19/19 sound effects pending


Prerequisites

1. ElevenLabs Account & API Key

Required to generate narrator voices and sound effects.

Setup:

  1. Sign up at https://elevenlabs.io (free account includes $5 monthly credit)
  2. Go to Account → API Keys and copy your API key
  3. Store it securely (never commit it to git)

Test your key:

curl -X GET https://api.elevenlabs.io/v1/user \
  -H "xi-api-key: YOUR_KEY_HERE"

A valid key returns JSON with account details. A 401 error means the key was rejected.
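The same check can be scripted. The following is a minimal sketch using only the standard library, not code from the project; the function name check_api_key and the injectable opener parameter are assumptions made for illustration and offline testing.

```python
import json
import urllib.error
import urllib.request

USER_URL = "https://api.elevenlabs.io/v1/user"  # same endpoint as the curl test


def check_api_key(api_key, opener=urllib.request.urlopen):
    """Return (ok, detail) for the key; `opener` is injectable for offline tests."""
    request = urllib.request.Request(USER_URL, headers={"xi-api-key": api_key})
    try:
        with opener(request, timeout=30) as response:
            return True, json.load(response)
    except urllib.error.HTTPError as err:
        # 401 means the key was rejected; other codes are reported unchanged.
        return False, f"HTTP {err.code}"
```

Calling check_api_key(os.environ["ELEVENLABS_API_KEY"]) after exporting the key should return (True, {...}) for a working key and (False, "HTTP 401") for a rejected one.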

2. Python 3.7+

python3 --version  # Should be 3.7 or higher

3. Required Python Libraries

pip install requests

Or from the project directory:

pip install -r requirements.txt  # if file exists

Quick Start (5 minutes)

Step 1: Set environment variable

export ELEVENLABS_API_KEY="your-actual-api-key"

Step 2: Generate all 100 narrator sayings

cd convergence-protocol
python3 scripts/generate_sayings_batch.py

Expected output:

============================================================
ElevenLabs TTS Batch Generator - Narrator Sayings
============================================================
Output: /path/to/assets/audio/sayings

MOLOCH (5 sayings)
----------------------------------------
  → moloch_01_tragedy.opus... ✓
  ⊘ moloch_02_participation.opus (exists)
  ...

Generated: 85/100
Skipped:   15/100
Failed:    0/100
============================================================

Total .opus files in sayings/: 100

Step 3: Generate sound effects (door/ambient/UI)

python3 generate_audio.py

Expected output:

🎵 ElevenLabs Audio Generation for Convergence Protocol

Generating sound effects...
✓ Generated: assets/audio/doors/metal_open.wav
✓ Generated: assets/audio/doors/metal_close.wav
...

Generating narration...
✓ Generated narration: assets/audio/narration/welcome.wav
...

✅ Audio generation complete!
Files saved to: /full/path/to/assets/audio

Detailed Walkthrough

Option 1: Python Script (Recommended)

Recommended because it's cross-platform, portable, and doesn't require external CLI tools.

Command:

export ELEVENLABS_API_KEY="your-key"
python3 scripts/generate_sayings_batch.py

What it does:

  • Reads all 100 sayings from internal dictionary
  • Skips files that already exist (safe to re-run)
  • Generates missing .opus files in assets/audio/sayings/
  • Reports: generated, skipped, failed counts
  • Takes ~15-30 minutes for full batch (depends on API rate limits)
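The skip-and-report behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of generate_sayings_batch.py: the sayings dictionary shape (file name to text) and the synthesize callable that stands in for the API call are both hypothetical.

```python
from pathlib import Path


def generate_batch(sayings, out_dir, synthesize):
    """Generate only the missing files and return summary counts.

    sayings: dict mapping output file name -> text (hypothetical shape).
    synthesize: callable(text) -> bytes, standing in for the TTS request.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    counts = {"generated": 0, "skipped": 0, "failed": 0}
    for name, text in sayings.items():
        target = out_dir / name
        if target.exists():
            counts["skipped"] += 1  # already on disk, so re-running is safe
            continue
        try:
            audio = synthesize(text)
            target.write_bytes(audio)
            counts["generated"] += 1
        except Exception as err:  # keep going so one failure does not stop the batch
            print(f"failed: {name}: {err}")
            counts["failed"] += 1
    return counts
```

Because existing files are skipped before any request is made, a re-run after a partial failure only spends API quota on the files that are still missing.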

Troubleshooting:

  • ModuleNotFoundError: No module named 'requests' → Run pip install requests
  • ELEVENLABS_API_KEY: not found → Set the environment variable first
  • 401 Unauthorized → Check that the API key is correct
  • 429 Too Many Requests → API rate limit; the script will retry (waits 60s)
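The wait-and-retry behavior for 429 responses can be sketched like this. It is a generic pattern, not the script's actual implementation; the RateLimited exception, the with_retry helper, and the attempt limit of 3 are assumptions introduced for the example.

```python
import time


class RateLimited(Exception):
    """Hypothetical error the request function raises on HTTP 429."""


def with_retry(call, max_attempts=3, wait_seconds=60, sleep=time.sleep):
    """Run call(); on RateLimited, wait and retry up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            sleep(wait_seconds)
```

The sleep parameter is injectable so the logic can be exercised without actually waiting 60 seconds.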

Option 2: Bash Script with openclaw CLI

Use this only if you have the openclaw command installed and configured.

Command:

export ELEVENLABS_API_KEY="your-key"
./scripts/generate_remaining_sayings_portable.sh

Note: This script is less reliable than the Python script. Prefer the Python script.


Option 3: Manual curl (Not Recommended)

Generate sayings one by one using curl:

curl -X POST https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "You will pay. The only question is when, and how much.",
    "model_id": "eleven_monolingual_v1",
    "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
  }' \
  --output assets/audio/sayings/technical_04_pay.opus

Avoid this for the full batch; the Python script is much faster for 100 sayings.
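For reference, the same request can be assembled in Python with the standard library. This is a sketch that mirrors the curl command field for field; the helper name build_tts_request is an assumption, and the function only builds the request without sending it.

```python
import json
import urllib.request

TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(api_key, voice_id, text):
    """Build (but do not send) the same POST request as the curl command."""
    payload = {
        "text": text,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    }
    return urllib.request.Request(
        TTS_URL.format(voice_id=voice_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen and writing the response body to a file would reproduce what the curl command does with --output.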


Sound Effects Generation

After narrator sayings are complete, generate door/ambient/UI sounds:

Command:

export ELEVENLABS_API_KEY="your-key"
python3 generate_audio.py

What it does:

  • Generates 8 door sound effects (open, close, locked, hover states)
  • Generates 4 ambient sounds (hallway loop, void drone, neon buzz, footsteps)
  • Generates 3 UI confirmation sounds (check-in, complete, transition)
  • Generates 3 narration snippets (welcome, check-in, complete)
  • Total: 19 files, ~5-10 minutes

Output structure:

assets/audio/
├── sayings/           (100 .opus files)
├── doors/             (8 .wav files)
├── ambient/           (4 .wav files)
├── ui/                (3 .wav files)
└── narration/         (3 .wav files)

File Verification

After generation, verify all files exist:

# Count narrator sayings
ls -1 assets/audio/sayings/*.opus | wc -l
# Should output: 100
 
# Count all audio files
find assets/audio -type f \( -name "*.opus" -o -name "*.wav" \) | wc -l
# Should output: 118 per the output structure above (100 sayings + 8 doors + 4 ambient + 3 UI + 3 narration)
 
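# Report per-directory file counts against the expected totals
python3 << 'EOF'
from pathlib import Path

# Expected number of audio files in each subdirectory of assets/audio
required = {
    "sayings": 100,
    "doors": 8,
    "ambient": 4,
    "ui": 3,
    "narration": 3,
}
AUDIO_SUFFIXES = {".opus", ".wav"}

for subdir, count in required.items():
    path = Path("assets/audio") / subdir
    if path.exists():
        # Count only audio files so stray files (e.g. .DS_Store) are ignored
        files = [f for f in path.iterdir() if f.suffix in AUDIO_SUFFIXES]
        status = "✓" if len(files) == count else "✗"
        print(f"{status} {subdir}: {len(files)}/{count}")
    else:
        print(f"✗ {subdir}: 0/{count} (directory missing)")
EOF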

Cost Estimation

Narrator Sayings (100 total)

  • Average: 15 words per saying
  • Average: ~100 characters per saying
  • ElevenLabs TTS pricing: ~$0.03 per 1000 characters
  • Total: 100 × 100 / 1000 × $0.03 ≈ $0.30

Sound Effects (19 total)

  • Average: 3-5 seconds per sound
  • ElevenLabs Sound Effects pricing: ~$0.008 per second
  • Total: 19 × 4 × $0.008 ≈ $0.60

Grand Total

~$1.00 USD (leaves plenty of room in free tier)
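The estimate can be reproduced from the figures listed above. The prices are the approximate rates quoted in this guide, not values fetched from ElevenLabs, and the 4-second duration is the midpoint of the 3-5 second range.

```python
SAYINGS = 100
CHARS_PER_SAYING = 100
TTS_USD_PER_1000_CHARS = 0.03   # approximate rate quoted in this guide

SOUNDS = 19
SECONDS_PER_SOUND = 4           # midpoint of the 3-5 second range
SFX_USD_PER_SECOND = 0.008      # approximate rate quoted in this guide

tts_cost = SAYINGS * CHARS_PER_SAYING / 1000 * TTS_USD_PER_1000_CHARS
sfx_cost = SOUNDS * SECONDS_PER_SOUND * SFX_USD_PER_SECOND
total = tts_cost + sfx_cost

print(f"TTS:   ${tts_cost:.2f}")   # TTS:   $0.30
print(f"SFX:   ${sfx_cost:.2f}")   # SFX:   $0.61
print(f"Total: ${total:.2f}")      # Total: $0.91
```

Rounded up, this gives the ~$1.00 grand total.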


Integration with Experiences

After audio generation is complete, the next step is wiring narrator sayings into all 40 experiences.

Current Status

✅ Done (3 experiences):

  • thucydides-trap.html
  • normal_accidents.html
  • technical_debt.html

⏳ Pending (37 experiences):

  • All others

How to Wire an Experience

  1. Copy the narrator card HTML/CSS from technical_debt.html (lines ~1-60 CSS + ~100 HTML)
  2. Add the SAYINGS array with 5 sayings for that experience
  3. Call initNarrator() when conclusion screen appears
  4. Test audio playback

Example (for moloch.html):

const SAYINGS = [
  {
    text: "The tragedy is not that we compete...",
    audio: "../assets/audio/sayings/moloch_01_tragedy.opus",
  },
  {
    text: "Moloch does not want your victory...",
    audio: "../assets/audio/sayings/moloch_02_participation.opus",
  },
  // ... etc
]
 
// In conclusion handler:
function showConclusion() {
  // ... show conclusion screen
  setTimeout(() => initNarrator(), 1000)
}
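To track which experiences still need wiring, a small script can list the HTML files that never call initNarrator. This is a sketch resting on two assumptions: that every wired page contains the literal text initNarrator(, and that the experience pages live together in one directory whose path you pass in (the guide does not name it).

```python
from pathlib import Path


def unwired_experiences(experience_dir, marker="initNarrator("):
    """Return names of HTML files that never mention the marker string."""
    missing = []
    for page in sorted(Path(experience_dir).glob("*.html")):
        if marker not in page.read_text(encoding="utf-8", errors="replace"):
            missing.append(page.name)
    return missing
```

A text search like this only shows that the call is present, not that the narrator card works, so browser testing is still needed.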

Testing Audio Playback

After files are generated, test in browser:

  1. Open index.html (Motel lobby)

    • Should hear ambient hallway sound after loading
    • Mute button ("♪ AUDIO") should toggle sound on/off
  2. Hover over doors

    • Available door: ethereal chime
    • Preview door: digital glitch
    • Locked door: warning drone
  3. Click a door

    • Should hear door open/close sound
    • Door should navigate to experience
  4. Complete an experience

    • Conclusion should show narrator card
    • Click play button to hear narrator voice
    • Use dots to navigate between 5 sayings

Troubleshooting

API Errors

401 Unauthorized

Error: xi-api-key invalid
Fix: Check ELEVENLABS_API_KEY is set correctly

429 Too Many Requests

Error: Rate limit exceeded
Fix: Wait 60 seconds; script retries automatically

503 Service Unavailable

Error: ElevenLabs API down
Fix: Try again in 5 minutes
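If you wrap the API calls in your own tooling, the fixes above can be surfaced next to the status code. The mapping below is a sketch that restates this section; explain_status is a hypothetical helper, not part of either project script.

```python
API_ERROR_HINTS = {
    401: "Check that ELEVENLABS_API_KEY is set correctly.",
    429: "Rate limit exceeded; wait 60 seconds and retry.",
    503: "ElevenLabs API is unavailable; try again in 5 minutes.",
}


def explain_status(status_code):
    """Translate an HTTP status code into the fix suggested in this guide."""
    if 200 <= status_code < 300:
        return "OK"
    return API_ERROR_HINTS.get(
        status_code, f"Unexpected HTTP {status_code}; see console output."
    )
```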

File Errors

Files exist but won't play

  • Check browser console for CORS errors
  • Serve via a local server (python3 -m http.server 8000), not the file:// protocol
  • Check file paths in narrator cards (should be ../assets/audio/...)
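Path mistakes in a narrator card can be caught before opening the browser. The sketch below assumes the card uses entries of the form audio: "..." as in the SAYINGS example, and resolves each path relative to the HTML file; the function name and the regular expression are illustrative.

```python
import re
from pathlib import Path

# Matches entries such as: audio: "../assets/audio/sayings/moloch_01_tragedy.opus"
AUDIO_REF = re.compile(r'audio:\s*"([^"]+)"')


def broken_audio_refs(html_path):
    """Return audio paths referenced in the page that do not exist on disk."""
    page = Path(html_path)
    refs = AUDIO_REF.findall(page.read_text(encoding="utf-8", errors="replace"))
    return [ref for ref in refs if not (page.parent / ref).resolve().is_file()]
```

An empty list means every referenced file was found relative to the page.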

Missing .opus files

  • Run python3 scripts/generate_sayings_batch.py again
  • Check API key and quota

Files generated but AudioManager won't load them

  • Check file exists: ls assets/audio/doors/metal_open.wav
  • Check file size > 0: du assets/audio/doors/metal_open.wav
  • Check browser console for fetch errors
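The existence and size checks can be run over many files at once. A minimal sketch, with missing_or_empty as a hypothetical helper name:

```python
from pathlib import Path


def missing_or_empty(paths):
    """Split paths into those that are absent and those that are zero bytes."""
    missing, empty = [], []
    for raw in paths:
        path = Path(raw)
        if not path.is_file():
            missing.append(str(path))
        elif path.stat().st_size == 0:
            empty.append(str(path))
    return missing, empty
```

For example, missing_or_empty(["assets/audio/doors/metal_open.wav"]) returns two lists: files to regenerate because they are absent, and files to regenerate because generation wrote nothing.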

Next Steps

  1. ✅ Set up ElevenLabs account (5 min)
  2. ✅ Generate narrator sayings (30 min) → python3 scripts/generate_sayings_batch.py
  3. ✅ Generate sound effects (10 min) → python3 generate_audio.py
  4. 🔲 Wire narrator into 37 experiences (2-3 hours)
  5. 🔲 Test audio playback in browser (1 hour)
  6. 🔲 Polish audio levels (30 min)
  7. 🔲 Implement challenge mode (2-4 hours; separate from audio)

Reference Files

  • scripts/generate_sayings_batch.py: Batch generate all 100 narrator sayings
  • scripts/generate_remaining_sayings_portable.sh: Alternative bash script
  • generate_audio.py: Generate door/ambient/UI sound effects
  • assets/audio/README.md: Audio setup documentation
  • Clue_Design_Document.xlsx: Challenge mode theme framework (separate)
  • AUDIO_GENERATION_STATUS.md: Current status and issues

Support

If you encounter issues:

  1. Check AUDIO_GENERATION_STATUS.md for known issues
  2. Review console output (look for error messages)
  3. Verify file paths are relative to project root
  4. Check ElevenLabs account status (quota, API key validity)
  5. Try re-running scripts (safe; skips existing files)

Last Updated: February 2026
Status: Ready for production audio generation