THE INVERTER CYCLE

Voice Clone Audiobook Production Plan

Using Author’s Voice Clone (They Can All Bird Method)


Date: March 13, 2026
Production Method: AI Voice Clone (Author’s Voice)
Target: Complete trilogy narration
Estimated Timeline: 4-6 weeks


PRODUCTION OVERVIEW

Following the successful “They Can All Bird” production model, we will use the author’s voice clone to narrate all three books of The Inverter Cycle. This approach provides:

  • Consistency: Same voice across all 21 chapters
  • Cost savings: No narrator fees ($2,200-$4,400)
  • Authorial authenticity: The author’s own voice telling their story
  • Control: Direct oversight of every recording decision
  • Speed: No scheduling conflicts, record on demand

VOICE CLONE SETUP

Technical Requirements

Voice Clone Platform: [To be specified - likely ElevenLabs, Play.ht, or similar]

Required Samples:

  • 30-60 minutes of clean author voice recording
  • Varied emotional range (neutral, excited, somber)
  • Consistent recording environment
  • No background noise, music, or effects

Character Voice Differentiation: Because a single voice clone narrates every perspective, differentiation will be achieved through:

  1. Performance (pace, emphasis, emotional tone)
  2. Post-processing (subtle EQ shifts for different perspectives)
  3. Narrative structure (clear section breaks between perspectives)

RECORDING WORKFLOW

Phase 1: Manuscript Preparation (Week 1)

Segmentation: Break each book into recording-sized chunks:

  • WILDFLOWER: 9 chapters × 5 perspectives = 45 segments
  • TALLY: 4 chapters × 5 perspectives = 20 segments
  • COGITO: 8 chapters × 5 perspectives = 40 segments
  • TOTAL: 105 recording segments
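
The segment arithmetic above can be checked with a few lines of Python. This is only a sanity-check sketch; the dictionary and variable names are illustrative and not part of any production tooling.

```python
# Chapter counts per book, taken from the segmentation list above.
CHAPTERS = {"WILDFLOWER": 9, "TALLY": 4, "COGITO": 8}
PERSPECTIVES_PER_CHAPTER = 5

# One recording segment per (chapter, perspective) pair.
segments = {book: n * PERSPECTIVES_PER_CHAPTER for book, n in CHAPTERS.items()}
total_chapters = sum(CHAPTERS.values())
total_segments = sum(segments.values())

print(segments)          # per-book segment counts
print(total_chapters)    # chapters across the trilogy
print(total_segments)    # segments across the trilogy
```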

File Naming Convention:

INV_WF_C01P01_ThePlay.wav
INV_WF_C01P02_HelenaJournal.wav
INV_WF_C01P03_Institutional.wav
INV_WF_C01P04_Frame.wav
INV_WF_C01P05_Network.wav
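
A small helper can generate names in this pattern so that no segment is mistyped by hand. The sketch below is an assumption-laden illustration: only the code "WF" appears in this plan, so the codes "TL" and "CG" for TALLY and COGITO are placeholders, and the function name is hypothetical.

```python
# Book codes. Only "WF" is taken from the plan; "TL" and "CG" are
# placeholder assumptions to be replaced with the real codes.
BOOK_CODES = {"WILDFLOWER": "WF", "TALLY": "TL", "COGITO": "CG"}


def segment_filename(book: str, chapter: int, perspective: int, label: str) -> str:
    """Build a name such as INV_WF_C01P01_ThePlay.wav."""
    code = BOOK_CODES[book]
    # Zero-pad chapter and perspective numbers to two digits.
    return f"INV_{code}_C{chapter:02d}P{perspective:02d}_{label}.wav"


print(segment_filename("WILDFLOWER", 1, 1, "ThePlay"))  # INV_WF_C01P01_ThePlay.wav
```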

Script Preparation:

  • Remove markdown formatting
  • Add pronunciation guides inline
  • Mark pause points [PAUSE 2 SEC]
  • Note emotional beats [SOMBER] [EXCITED] [TENSE]
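
The script preparation steps could be partly automated. The following is a minimal sketch, assuming the manuscript uses "#" headings, asterisk emphasis, and backticks, and that markers follow the bracketed forms shown above; the function names are hypothetical and real manuscripts may need more rules.

```python
import re

# Matches [PAUSE 2 SEC] style pause markers and single-word tone markers like [SOMBER].
MARKER = re.compile(r"\[(PAUSE (\d+) SEC|[A-Z]+)\]")


def strip_markdown(text: str) -> str:
    """Remove heading hashes, asterisk emphasis, and backticks; keep bracket markers."""
    text = re.sub(r"^#{1,6}[ \t]*", "", text, flags=re.MULTILINE)  # headings
    text = re.sub(r"\*{1,2}|`", "", text)                          # emphasis, code ticks
    return text


def find_markers(text: str) -> list:
    """Return ("pause", seconds) or ("tone", NAME) tuples in order of appearance."""
    found = []
    for match in MARKER.finditer(text):
        if match.group(2):
            found.append(("pause", int(match.group(2))))
        else:
            found.append(("tone", match.group(1)))
    return found


sample = "## Chapter 1\nShe **ran**. [PAUSE 2 SEC] [SOMBER] It was *over*."
print(strip_markdown(sample))
print(find_markers(sample))
```

Underscore emphasis is deliberately left alone here, since stripping underscores blindly could damage names or identifiers in the text.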

Phase 2: Recording (Weeks 2-4)

Recording Schedule:

Week 2: WILDFLOWER (45 segments)

  • Day 1: Chapters 1-3 (15 segments)
  • Day 2: Chapters 4-6 (15 segments)
  • Day 3: Chapters 7-9 (15 segments)

Week 3: TALLY (20 segments)

  • Day 1: Chapters 1-2 (10 segments)
  • Day 2: Chapters 3-4 (10 segments)

Week 4: COGITO (40 segments)

  • Day 1: Chapters 1-2 (10 segments)
  • Day 2: Chapters 3-4 (10 segments)
  • Day 3: Chapters 5-6 (10 segments)
  • Day 4: Chapters 7-8 (10 segments)

Daily Recording Target: 5-8 segments per day
Estimated time per segment: 10-15 minutes of finished audio


Phase 3: Post-Production (Week 5)

For Each Segment:

  1. Voice Clone Generation

    • Input script into voice clone platform
    • Generate initial audio
    • Review for pronunciation errors
  2. Editing

    • Remove generation artifacts
    • Adjust pacing (add/remove pauses)
    • Fix mispronunciations (re-generate if needed)
  3. Perspective Differentiation (Subtle)

    • THE PLAY: Neutral EQ, clear diction
    • Character Interior: Slightly warmer EQ, intimate
    • Institutional: Slightly cooler EQ, formal
    • Frame (Nick): Slightly aged/rougher (if possible), reflective
    • The Network: Slightly ethereal (light reverb, space)
  4. Leveling

    • Target: -20dB RMS (inside the ACX window of -23dB to -18dB)
    • Peak: -3dB maximum
    • Consistent volume across all segments
  5. Chapter Assembly

    • Combine 5 perspectives into single chapter files
    • Add chapter transitions [CHAPTER 2: THE 3 AM LAB]
    • Smooth crossfades between perspectives

Phase 4: Mastering & QC (Week 6)

Quality Control:

  • No mispronunciations
  • Consistent pacing
  • Proper perspective differentiation
  • Clean edits (no clicks/pops)
  • ACX-compliant levels

Final Mastering:

  • Book-level volume consistency
  • Chapter markers embedded
  • Metadata added
  • Cover art attached

CHARACTER VOICE GUIDE

Because a single voice clone narrates every character, differentiation comes from performance choices:

Nick Bottom (Frame Narrator)

  • Tone: Reflective, slightly melancholic, warm
  • Pace: Slower, measured, storytelling rhythm
  • Emotional quality: Witness, holder of memory

Helena Voss (WILDFLOWER)

  • Tone: Intense, precise, occasionally fragmented
  • Pace: Faster when excited, halting when uncertain
  • Emotional quality: Brilliance bordering on mania

Ana Rao (TALLY)

  • Tone: Academic, analytical, warming over time
  • Pace: Measured, gaining passion as she commits to The Tally
  • Emotional quality: Curiosity becoming conviction

Keisha Williams (TALLY)

  • Tone: Grounded, warm, authoritative
  • Pace: Steady, unhurried, confident
  • Emotional quality: Wisdom without formal education
  • CRITICAL: Authentic Chicago South Side cadence (research recordings)

Maya Voss (COGITO)

  • Tone: British-influenced (Oxford), scientific, intense like her mother
  • Pace: Precise, accelerating when excited
  • Emotional quality: Carrying legacy, finding her own path

Aunty Ngaire (COGITO)

  • Tone: Wise, patient, authoritative but gentle
  • Pace: Deliberate, space between words
  • Emotional quality: Ancient knowledge, acceptance
  • NOTE: Requires cultural sensitivity. Portray her as an intellectual equal, not a mystical figure

PRONUNCIATION GUIDE

Create a master pronunciation reference:

Scientific Terms

  • FMO complex: “eff-em-oh”
  • Cryptophyte: “KRIP-toe-fite”
  • Phycocyanin: “fye-ko-SYE-ah-nin”
  • Decoherence: “dee-koh-HEER-ens”

Names

  • Helena: “heh-LAY-nah” (German) or “HEH-leh-nah” (anglicized)
  • Voss: “Voss” (short o, rhymes with “boss”)
  • Keisha: “KEE-shah”
  • Ana: “AH-nah”
  • Maya: “MY-ah”

Indigenous Australian Terms

  • Yawuru: “YAH-roo” (confirm with consultation)
  • Bugarrigarra: “boo-gar-ree-GAR-rah” (consultation required)
  • Minyirr: “min-YEAR”
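
One way to carry the master reference into the scripts is to annotate the first occurrence of each term with its respelling. This is a hedged sketch only: the plan does not say whether the chosen platform expects inline respellings, a pronunciation dictionary, or some other mechanism, and the function name and the subset of terms below are illustrative.

```python
import re

# A subset of the guide above; terms still awaiting consultation are left out.
PRONUNCIATIONS = {
    "Cryptophyte": "KRIP-toe-fite",
    "Phycocyanin": "fye-ko-SYE-ah-nin",
    "Keisha": "KEE-shah",
}


def annotate_first_use(text: str, guide: dict) -> str:
    """Append "(respelling)" after the first occurrence of each term, ignoring case."""
    for term, respelling in guide.items():
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        # A function replacement keeps the original casing of the matched word.
        text = pattern.sub(lambda m, r=respelling: f"{m.group(0)} ({r})", text, count=1)
    return text


print(annotate_first_use("Keisha studied the cryptophyte.", PRONUNCIATIONS))
```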

TECHNICAL SPECIFICATIONS

Recording Format

  • Source: Voice clone AI generation
  • Format: WAV, 48kHz/24-bit
  • Delivery: MP3, 44.1kHz/192kbps

ACX Compliance Checklist

  • RMS level: -23dB to -18dB
  • Peak level: -3dB maximum
  • Noise floor: -60dB or lower
  • File format: MP3
  • Bit rate: 192kbps or higher
  • Sample rate: 44.1kHz
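
The level items in this checklist can be verified numerically before upload. The sketch below assumes samples are already decoded to floats in the range -1.0 to 1.0 and uses ACX's published RMS window of -23dB to -18dB with a -3dB peak ceiling as defaults. It does not measure noise floor, which needs a silent passage to evaluate, and the function names are hypothetical.

```python
import math


def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def acx_level_check(samples, rms_min=-23.0, rms_max=-18.0, peak_max=-3.0):
    """Report whether RMS and peak fall inside the assumed ACX limits."""
    rms, peak = rms_db(samples), peak_db(samples)
    return {
        "rms_db": rms,
        "peak_db": peak,
        "rms_ok": rms_min <= rms <= rms_max,
        "peak_ok": peak <= peak_max,
    }
```

The ACX Check plugin mentioned under Software Recommendations performs the same kind of measurement inside Audacity; a script like this is mainly useful for batch-checking many exported files.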

Mastering Chain

  1. EQ: Gentle high-pass at 80Hz, subtle presence boost at 3-5kHz
  2. Compression: Light (2:1 ratio), preserve natural dynamics
  3. Limiting: Ceiling at -3dB
  4. Normalization: Target -20dB RMS
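
The normalization step is simple decibel arithmetic: the gain to apply is the target RMS minus the measured RMS, and the peak moves by the same amount. The sketch below works one hypothetical example (the measured values are invented for illustration) and shows why peaks are worth re-checking after gain is applied.

```python
def normalization_gain_db(measured_rms_db, target_rms_db=-20.0):
    """Gain in dB needed to move the measured RMS to the target."""
    return target_rms_db - measured_rms_db


def peak_after_gain_db(measured_peak_db, gain_db):
    """Applying gain shifts the peak by the same number of dB."""
    return measured_peak_db + gain_db


def db_to_linear(gain_db):
    """Convert a dB gain to the linear factor applied to each sample."""
    return 10 ** (gain_db / 20)


# Hypothetical segment: RMS at -26dB, peak at -8dB.
gain = normalization_gain_db(-26.0)          # +6dB brings RMS to -20dB
new_peak = peak_after_gain_db(-8.0, gain)    # peak rises to -2dB
exceeds_ceiling = new_peak > -3.0            # True: above the -3dB ceiling
```

In this example the normalized peak lands above the -3dB ceiling, so a final peak check (or a limiter pass after the gain change) would be needed for that segment.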

POST-PROCESSING WORKFLOW

Software Recommendations

  • DAW: Audacity (free) or Reaper ($60)
  • Leveling: ACX Check plugin (Audacity)
  • Batch processing: Reaper scripts or Adobe Audition

Processing Steps Per File

  1. Import: WAV from voice clone
  2. Edit: Remove artifacts, adjust pacing
  3. EQ: Subtle perspective differentiation
  4. Compression: Light leveling
  5. Limiter: Prevent clipping
  6. Normalize: -20dB RMS target
  7. Export: MP3, 192kbps

Batch Processing

Create templates for each perspective type:

  • The_Play_template.RPP
  • Character_Interior_template.RPP
  • Institutional_template.RPP
  • Frame_template.RPP
  • Network_template.RPP

Apply consistent processing to all segments of each type.
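
Template selection can be driven by the perspective number in each file name. The sketch below assumes the P01-P05 order shown in the file naming examples (The Play, character interior, Institutional, Frame, Network) holds for every chapter; the function name is hypothetical, and how the template is then applied depends on the DAW.

```python
import re

# Perspective number -> template file, following the naming examples in this plan.
TEMPLATES = {
    1: "The_Play_template.RPP",
    2: "Character_Interior_template.RPP",
    3: "Institutional_template.RPP",
    4: "Frame_template.RPP",
    5: "Network_template.RPP",
}

SEGMENT_NAME = re.compile(r"^INV_[A-Z]+_C(\d{2})P(\d{2})_\w+\.wav$")


def template_for(filename: str) -> str:
    """Pick the processing template for a segment from its perspective number."""
    match = SEGMENT_NAME.match(filename)
    if match is None:
        raise ValueError(f"Unexpected segment name: {filename}")
    return TEMPLATES[int(match.group(2))]


print(template_for("INV_WF_C01P04_Frame.wav"))  # Frame_template.RPP
```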


QUALITY CONTROL CHECKLIST

Per Segment

  • No mispronunciations
  • Natural pacing (not robotic)
  • Appropriate emotional tone
  • Clean beginning/end (no clicks)
  • Consistent volume

Per Chapter

  • All 5 perspectives present
  • Smooth transitions between perspectives
  • Chapter title clearly stated
  • Consistent perspective differentiation

Per Book

  • All chapters complete
  • Opening/closing credits included
  • Consistent master volume
  • Chapter markers embedded
  • Metadata complete

TIMELINE SUMMARY

Week  Activity                            Output
1     Manuscript prep, script formatting  105 recording scripts
2     Record WILDFLOWER                   9 chapters, ~3.5 hours
3     Record TALLY                        4 chapters, ~2.7 hours
4     Record COGITO                       8 chapters, ~2.6 hours
5     Post-production                     Edited, mastered files
6     QC, assembly, upload                Final audiobook files

Total Timeline: 6 weeks from start to publication


BUDGET (Voice Clone Method)

Costs

Item                      Cost
Voice clone platform      $50-100/month
Post-production software  $0-60 (Audacity free, Reaper $60)
Cover art (3 books)       $300-900
ACX upload fees           $0
TOTAL                     $350-1,060

Savings vs. Professional Narrator

  • Professional narrator (trilogy): $2,200-$4,400
  • Voice clone method: $350-$1,060
  • Savings: up to about $3,500
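
The budget arithmetic can be reproduced as follows. This sketch assumes a single month of platform fees (which is what the stated total implies), a software cost of $0-60 based on the Audacity and Reaper prices listed under Software Recommendations, and the narrator range quoted in the Production Overview.

```python
# (low, high) cost estimates in dollars.
costs = {
    "Voice clone platform (one month)": (50, 100),
    "Post-production software": (0, 60),   # assumed: Audacity free, Reaper $60
    "Cover art (3 books)": (300, 900),
    "ACX upload fees": (0, 0),
}
narrator_low, narrator_high = 2200, 4400

total_low = sum(low for low, _ in costs.values())
total_high = sum(high for _, high in costs.values())

# Compare the upper estimates of both approaches.
saving_high = narrator_high - total_high
saving_pct = (1 - total_high / narrator_high) * 100

print(total_low, total_high)   # 350 1060
print(saving_high)             # 3340
print(round(saving_pct))       # 76
```

If the platform subscription runs for two billing months across the six-week schedule, the upper total rises accordingly.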

ADVANTAGES OF VOICE CLONE METHOD

  1. Consistency: Same voice across 21 chapters
  2. Cost: Roughly 75% cheaper than a professional narrator (comparing upper estimates)
  3. Control: Direct oversight of every creative choice
  4. Speed: No scheduling, record on demand
  5. Authenticity: Author’s own voice telling their story
  6. Revision: Easy to re-record segments if needed

CHALLENGES & MITIGATIONS

Challenge                     Mitigation
Robotic/unnatural sound       Careful prompt engineering, multiple takes
Pronunciation errors          Phonetic spelling, custom pronunciation guides
No character differentiation  Performance variation, subtle post-processing
Emotional range limitations   Select best takes, accept limitations
Technical artifacts           Careful editing, quality threshold for re-generation

FINAL DELIVERABLES

Per Book

  • Opening credits (title, author, narrator)
  • Chapter files (individually and combined)
  • Closing credits
  • Cover art (2400×2400px)
  • Metadata file

Trilogy Package

  • 3 complete audiobooks
  • Consistent branding/intro
  • Series metadata
  • Marketing materials

NEXT IMMEDIATE STEPS

  1. Confirm voice clone platform access
  2. Prepare recording scripts (remove markdown, add pronunciation guides)
  3. Test voice clone with sample passages from each book
  4. Refine prompts for optimal emotional range
  5. Begin WILDFLOWER Chapter 1 recording

The pattern continues.
Your voice will carry it.
Never null.