THE INVERTER CYCLE
Voice Clone Audiobook Production Plan
Using Author’s Voice Clone (They Can All Bird Method)
Date: March 13, 2026
Production Method: AI Voice Clone (Author’s Voice)
Target: Complete trilogy narration
Estimated Timeline: 4-6 weeks
PRODUCTION OVERVIEW
Following the successful “They Can All Bird” production model, we will use the author’s voice clone to narrate all three books of The Inverter Cycle. This approach provides:
- Consistency: Same voice across all 21 chapters
- Cost savings: No narrator fees ($2,200-$4,400)
- Authorial authenticity: The author’s own voice telling their story
- Control: Direct oversight of every recording decision
- Speed: No scheduling conflicts, record on demand
VOICE CLONE SETUP
Technical Requirements
Voice Clone Platform: [To be specified - likely ElevenLabs, Play.ht, or similar]
Required Samples:
- 30-60 minutes of clean author voice recording
- Varied emotional range (neutral, excited, somber)
- Consistent recording environment
- No background noise, music, or effects
Character Voice Differentiation: Because a single voice clone narrates every part, differentiation will be achieved through:
- Performance (pace, emphasis, emotional tone)
- Post-processing (subtle EQ shifts for different perspectives)
- Narrative structure (clear section breaks between perspectives)
RECORDING WORKFLOW
Phase 1: Manuscript Preparation (Week 1)
Segmentation: Break each book into recording-sized chunks:
- WILDFLOWER: 9 chapters × 5 perspectives = 45 segments
- TALLY: 4 chapters × 5 perspectives = 20 segments
- COGITO: 8 chapters × 5 perspectives = 40 segments
- TOTAL: 105 recording segments
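The segment totals above follow directly from the chapter counts. A minimal Python sketch of that arithmetic, useful for re-checking the plan if a chapter count changes (the constant and function names are illustrative, not part of any tool):

```python
# Chapters per book, as listed in the segmentation plan.
CHAPTERS_PER_BOOK = {"WILDFLOWER": 9, "TALLY": 4, "COGITO": 8}
PERSPECTIVES_PER_CHAPTER = 5


def segment_counts(chapters=CHAPTERS_PER_BOOK, perspectives=PERSPECTIVES_PER_CHAPTER):
    """Return per-book segment counts plus a TOTAL entry."""
    counts = {book: n * perspectives for book, n in chapters.items()}
    counts["TOTAL"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    for book, count in segment_counts().items():
        print(f"{book}: {count} segments")
```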
File Naming Convention:
INV_WF_C01P01_ThePlay.wav
INV_WF_C01P02_HelenaJournal.wav
INV_WF_C01P03_Institutional.wav
INV_WF_C01P04_Frame.wav
INV_WF_C01P05_Network.wav
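One way to keep names consistent across all 105 segments is to generate and parse them in code rather than typing them by hand. The sketch below assumes the pattern shown in the examples above (INV, book code, two-digit chapter and perspective numbers, a short label). Only the WF code appears in this plan; the TL and CG codes are hypothetical placeholders.

```python
import re

# "WF" comes from the examples above; "TL" and "CG" are hypothetical codes.
BOOK_CODES = {"WILDFLOWER": "WF", "TALLY": "TL", "COGITO": "CG"}

FILENAME_RE = re.compile(r"INV_([A-Z]+)_C(\d{2})P(\d{2})_([A-Za-z0-9]+)\.wav")


def segment_filename(book_code, chapter, perspective, label):
    """Build a name such as INV_WF_C01P02_HelenaJournal.wav."""
    if not re.fullmatch(r"[A-Za-z0-9]+", label):
        raise ValueError("label must be alphanumeric with no spaces")
    return f"INV_{book_code}_C{chapter:02d}P{perspective:02d}_{label}.wav"


def parse_segment_filename(name):
    """Return (book_code, chapter, perspective, label) or raise ValueError."""
    match = FILENAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a segment filename: {name!r}")
    code, chapter, perspective, label = match.groups()
    return code, int(chapter), int(perspective), label
```

Parsing the names back out makes it possible to sort, group, and audit segments later without a separate spreadsheet.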
Script Preparation:
- Remove markdown formatting
- Add pronunciation guides inline
- Mark pause points [PAUSE 2 SEC]
- Note emotional beats [SOMBER] [EXCITED] [TENSE]
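The script preparation steps above can be partly automated. The following is a rough sketch, not a full markdown parser: it strips a few common markdown marks and splits a script into text, pause, and tone items based on the bracketed markers. How those items are then fed to a voice clone platform is platform-specific and not shown; the sample sentence in the usage note is made up for illustration.

```python
import re

# Matches [PAUSE 2 SEC] style pauses or single-word tone cues like [SOMBER].
CUE_RE = re.compile(r"\[(PAUSE (\d+) SEC|[A-Z]+)\]")


def strip_markdown(text):
    """Remove heading markers, asterisk emphasis, and backticks only."""
    text = re.sub(r"^#+\s*", "", text, flags=re.MULTILINE)
    return re.sub(r"\*\*|\*|`", "", text)


def split_cues(script):
    """Split a script into ('text', str), ('pause', seconds), ('tone', name) items."""
    items = []
    pos = 0
    for match in CUE_RE.finditer(script):
        chunk = script[pos:match.start()].strip()
        if chunk:
            items.append(("text", chunk))
        if match.group(2):
            items.append(("pause", int(match.group(2))))
        else:
            items.append(("tone", match.group(1)))
        pos = match.end()
    tail = script[pos:].strip()
    if tail:
        items.append(("text", tail))
    return items
```

For example, `split_cues("[SOMBER] She closed the notebook. [PAUSE 2 SEC] Then the lights failed.")` yields a tone item, a text item, a two-second pause, and a final text item, in that order.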
Phase 2: Recording (Weeks 2-4)
Recording Schedule:
Week 2: WILDFLOWER (45 segments)
- Day 1: Chapters 1-3 (15 segments)
- Day 2: Chapters 4-6 (15 segments)
- Day 3: Chapters 7-9 (15 segments)
Week 3: TALLY (20 segments)
- Day 1: Chapters 1-2 (10 segments)
- Day 2: Chapters 3-4 (10 segments)
Week 4: COGITO (40 segments)
- Day 1: Chapters 1-2 (10 segments)
- Day 2: Chapters 3-4 (10 segments)
- Day 3: Chapters 5-6 (10 segments)
- Day 4: Chapters 7-8 (10 segments)
Daily Recording Target: 5-8 segments per day
Estimated time per segment: 10-15 minutes of finished audio
Phase 3: Post-Production (Week 5)
For Each Segment:
1. Voice Clone Generation
   - Input script into voice clone platform
   - Generate initial audio
   - Review for pronunciation errors
2. Editing
   - Remove generation artifacts
   - Adjust pacing (add/remove pauses)
   - Fix mispronunciations (re-generate if needed)
3. Perspective Differentiation (Subtle)
   - THE PLAY: Neutral EQ, clear diction
   - Character Interior: Slightly warmer EQ, intimate
   - Institutional: Slightly cooler EQ, formal
   - Frame (Nick): Slightly aged/rougher (if possible), reflective
   - The Network: Slightly ethereal (light reverb, space)
4. Leveling
   - Target: -20dB RMS (inside the ACX window of -23dB to -18dB)
   - Peak: -3dB maximum
   - Consistent volume across all segments
5. Chapter Assembly
   - Combine 5 perspectives into single chapter files
   - Add chapter transitions (e.g., [CHAPTER 2: THE 3 AM LAB])
   - Smooth crossfades between perspectives
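The leveling targets in the steps above (-20dB RMS, -3dB peak) can be checked numerically before a file goes to a DAW. This is a minimal sketch that assumes the audio has already been decoded into a list of float samples scaled to the range -1.0 to 1.0; decoding the 24-bit WAV itself is left to whatever audio library is in use. The function names are illustrative.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS for float samples in the range -1.0..1.0."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 20 * math.log10(math.sqrt(mean_square))


def peak_dbfs(samples):
    """Peak level in dBFS (largest absolute sample value)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def gain_to_target_db(samples, target_rms_db=-20.0):
    """Gain in dB that would move the RMS level to the target."""
    return target_rms_db - rms_dbfs(samples)


def apply_gain(samples, gain_db):
    """Scale every sample by a gain given in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]
```

After applying the gain, re-measure the peak: if it lands above -3dB, the segment needs compression or limiting before normalization rather than more gain.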
Phase 4: Mastering & QC (Week 6)
Quality Control:
- No mispronunciations
- Consistent pacing
- Proper perspective differentiation
- Clean edits (no clicks/pops)
- ACX-compliant levels
Final Mastering:
- Book-level volume consistency
- Chapter markers embedded
- Metadata added
- Cover art attached
CHARACTER VOICE GUIDE
Because a single voice clone narrates every character, differentiation comes from performance choices:
Nick Bottom (Frame Narrator)
- Tone: Reflective, slightly melancholic, warm
- Pace: Slower, measured, storytelling rhythm
- Emotional quality: Witness, holder of memory
Helena Voss (WILDFLOWER)
- Tone: Intense, precise, occasionally fragmented
- Pace: Faster when excited, halting when uncertain
- Emotional quality: Brilliance bordering on mania
Ana Rao (TALLY)
- Tone: Academic, analytical, warming over time
- Pace: Measured, gaining passion as she commits to The Tally
- Emotional quality: Curiosity becoming conviction
Keisha Williams (TALLY)
- Tone: Grounded, warm, authoritative
- Pace: Steady, unhurried, confident
- Emotional quality: Wisdom without formal education
- CRITICAL: Authentic Chicago South Side cadence (research recordings)
Maya Voss (COGITO)
- Tone: British-influenced (Oxford), scientific, intense like her mother
- Pace: Precise, accelerating when excited
- Emotional quality: Carrying legacy, finding her own path
Aunty Ngaire (COGITO)
- Tone: Wise, patient, authoritative but gentle
- Pace: Deliberate, space between words
- Emotional quality: Ancient knowledge, acceptance
- NOTE: Requires cultural sensitivity. Portray her as an intellectual equal, not a mystical figure
PRONUNCIATION GUIDE
Create a master pronunciation reference:
Scientific Terms
- FMO complex: “eff-em-oh”
- Cryptophyte: “KRIP-toe-fite”
- Phycocyanin: “fye-ko-SYE-ah-nin”
- Decoherence: “dee-koh-HEER-ens”
Names
- Helena: “heh-LAY-nah” (German) or “HEH-leh-nah” (anglicized)
- Voss: “VOSS” (short o, rhymes with “boss”)
- Keisha: “KEE-shah”
- Ana: “AH-nah”
- Maya: “MY-ah”
Indigenous Australian Terms
- Yawuru: “YAH-roo” (confirm with consultation)
- Bugarrigarra: “boo-gar-ree-GAR-rah” (consultation required)
- Minyirr: “min-YEAR”
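If the chosen platform reads phonetic respellings more reliably than the original spellings, the master reference can be applied to scripts automatically. The sketch below is one possible approach under that assumption; some platforms offer their own pronunciation dictionaries instead, in which case this substitution would not be needed. Only a few entries from the guide above are included, and terms still awaiting consultation are deliberately left out.

```python
import re

# A small subset of the guide above; extend once pronunciations are confirmed.
RESPELLINGS = {
    "cryptophyte": "KRIP-toe-fite",
    "phycocyanin": "fye-ko-SYE-ah-nin",
    "Keisha": "KEE-shah",
}


def apply_respellings(text, respellings=RESPELLINGS):
    """Replace whole-word matches (case-insensitive) with their respelling."""
    lookup = {term.lower(): spoken for term, spoken in respellings.items()}
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in lookup) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: lookup[m.group(1).lower()], text)
```

Because the match is whole-word only, plurals such as "cryptophytes" are not rewritten and would need their own entries.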
TECHNICAL SPECIFICATIONS
Recording Format
- Source: Voice clone AI generation
- Format: WAV, 48kHz/24-bit
- Delivery: MP3, 44.1kHz/192kbps
ACX Compliance Checklist
- RMS level: -23dB to -18dB
- Peak level: -3dB maximum
- Noise floor: -60dB RMS or lower
- File format: MP3
- Bit rate: 192kbps or higher, constant bit rate (CBR)
- Sample rate: 44.1kHz
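The checklist above lends itself to an automated pass over measured values. This sketch assumes the measurements were taken elsewhere (for example with a level-analysis plugin) and simply compares them to the thresholds. The RMS window of -23dB to -18dB used here is the range ACX publishes, stated as an assumption in case the requirements change; the `Measurement` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    rms_db: float
    peak_db: float
    noise_floor_db: float
    file_format: str
    bitrate_kbps: int
    sample_rate_hz: int


def acx_failures(m):
    """Return a list of human-readable failures; an empty list means all checks pass."""
    failures = []
    if not (-23.0 <= m.rms_db <= -18.0):  # assumed ACX RMS window
        failures.append(f"RMS {m.rms_db} dB is outside -23 to -18 dB")
    if m.peak_db > -3.0:
        failures.append(f"peak {m.peak_db} dB exceeds -3 dB")
    if m.noise_floor_db > -60.0:
        failures.append(f"noise floor {m.noise_floor_db} dB is above -60 dB")
    if m.file_format.lower() != "mp3":
        failures.append(f"format {m.file_format} is not MP3")
    if m.bitrate_kbps < 192:
        failures.append(f"bit rate {m.bitrate_kbps} kbps is below 192 kbps")
    if m.sample_rate_hz != 44100:
        failures.append(f"sample rate {m.sample_rate_hz} Hz is not 44100 Hz")
    return failures
```

Running this per segment and again per assembled chapter catches level drift introduced during crossfades.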
Mastering Chain
- EQ: Gentle high-pass at 80Hz, subtle presence boost at 3-5kHz
- Compression: Light (2:1 ratio), preserve natural dynamics
- Limiting: Ceiling at -3dB
- Normalization: Target -20dB RMS
POST-PROCESSING WORKFLOW
Software Recommendations
- DAW: Audacity (free) or Reaper ($60)
- Level checking: ACX Check plugin (Audacity)
- Batch processing: Reaper scripts or Adobe Audition
Processing Steps Per File
- Import: WAV from voice clone
- Edit: Remove artifacts, adjust pacing
- EQ: Subtle perspective differentiation
- Compression: Light leveling
- Limiter: Prevent clipping
- Normalize: -20dB RMS target
- Export: MP3, 192kbps
Batch Processing
Create templates for each perspective type:
- The_Play_template.RPP
- Character_Interior_template.RPP
- Institutional_template.RPP
- Frame_template.RPP
- Network_template.RPP
Apply consistent processing to all segments of each type.
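Choosing the right template for each segment can be derived from the file name. The sketch below assumes the perspective numbers P01 to P05 map to the five templates in the order suggested by the naming examples (P02 being the character-interior journal); that mapping is an inference from this plan and should be adjusted if the numbering differs.

```python
import re
from collections import defaultdict

# Assumed mapping of perspective number to Reaper template file.
TEMPLATE_BY_PERSPECTIVE = {
    1: "The_Play_template.RPP",
    2: "Character_Interior_template.RPP",
    3: "Institutional_template.RPP",
    4: "Frame_template.RPP",
    5: "Network_template.RPP",
}


def template_for(filename):
    """Return the template name for a segment file such as INV_WF_C01P04_Frame.wav."""
    match = re.search(r"_C\d{2}P(\d{2})_", filename)
    if match is None or int(match.group(1)) not in TEMPLATE_BY_PERSPECTIVE:
        raise ValueError(f"cannot determine perspective for {filename!r}")
    return TEMPLATE_BY_PERSPECTIVE[int(match.group(1))]


def group_by_template(filenames):
    """Group segment files by the template that should process them."""
    groups = defaultdict(list)
    for name in filenames:
        groups[template_for(name)].append(name)
    return dict(groups)
```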
QUALITY CONTROL CHECKLIST
Per Segment
- No mispronunciations
- Natural pacing (not robotic)
- Appropriate emotional tone
- Clean beginning/end (no clicks)
- Consistent volume
Per Chapter
- All 5 perspectives present
- Smooth transitions between perspectives
- Chapter title clearly stated
- Consistent perspective differentiation
Per Book
- All chapters complete
- Opening/closing credits included
- Consistent master volume
- Chapter markers embedded
- Metadata complete
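The "all 5 perspectives present" check is easy to automate from file names alone. A minimal sketch, assuming the naming convention used in this plan; it only reports on chapters that have at least one file, so a chapter missing entirely must be caught by comparing against the expected chapter count.

```python
import re
from collections import defaultdict

SEGMENT_RE = re.compile(r"INV_([A-Z]+)_C(\d{2})P(\d{2})_")


def missing_perspectives(filenames, perspectives=5):
    """Map (book_code, chapter) to the sorted list of missing perspective numbers."""
    found = defaultdict(set)
    for name in filenames:
        match = SEGMENT_RE.match(name)
        if match:
            key = (match.group(1), int(match.group(2)))
            found[key].add(int(match.group(3)))
    expected = set(range(1, perspectives + 1))
    return {
        key: sorted(expected - seen)
        for key, seen in found.items()
        if expected - seen
    }
```

An empty result means every chapter that has files has all five perspectives.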
TIMELINE SUMMARY
| Week | Activity | Output |
|---|---|---|
| 1 | Manuscript prep, script formatting | 105 recording scripts |
| 2 | Record WILDFLOWER | 9 chapters, ~3.5 hours |
| 3 | Record TALLY | 4 chapters, ~2.7 hours |
| 4 | Record COGITO | 8 chapters, ~2.6 hours |
| 5 | Post-production | Edited, mastered files |
| 6 | QC, assembly, upload | Final audiobook files |
Total Timeline: 6 weeks from start to publication
BUDGET (Voice Clone Method)
Costs
| Item | Cost |
|---|---|
| Voice clone platform | $50-100/month |
| Post-production software | $0-60 (Audacity free, Reaper $60) |
| Cover art (3 books) | $300-900 |
| ACX upload fees | $0 |
| TOTAL | $350-1,060 |
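The total in the table can be reproduced from its rows, which also shows how the figure moves if the platform subscription runs longer than one month. This sketch takes the software range as $0-60 (Audacity is free, Reaper is $60, per the software recommendations in this plan); the `months` parameter is an added assumption, with the default of one month matching the table.

```python
# Cost ranges in USD as (low, high).
MONTHLY_COSTS = {"Voice clone platform": (50, 100)}
ONE_TIME_COSTS = {
    "Post-production software": (0, 60),  # Audacity free, Reaper $60
    "Cover art (3 books)": (300, 900),
    "ACX upload fees": (0, 0),
}


def budget_range(months=1):
    """Return (low, high) total cost for the given number of subscription months."""
    low = sum(lo for lo, _ in ONE_TIME_COSTS.values())
    high = sum(hi for _, hi in ONE_TIME_COSTS.values())
    for lo, hi in MONTHLY_COSTS.values():
        low += lo * months
        high += hi * months
    return low, high
```

With the default of one month this gives the table's $350-1,060; a six-week schedule that crosses two billing periods would add one more month of platform cost.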
Savings vs. Professional Narrator
- Professional narrator (trilogy): $2,200-$4,400
- Voice clone method: $350-$1,060
- Savings: up to roughly $3,500
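A quick way to sanity-check the comparison is to compute it from the two ranges. This sketch uses the narrator range quoted in the production overview ($2,200-$4,400) and the voice clone total from the cost table ($350-$1,060), comparing low end to low end and high end to high end. That like-for-like pairing is an assumption about how the figures were meant to be compared.

```python
NARRATOR = (2200, 4400)    # professional narrator range from the overview
VOICE_CLONE = (350, 1060)  # TOTAL row of the cost table


def savings_range(narrator=NARRATOR, clone=VOICE_CLONE):
    """Like-for-like savings: low vs. low and high vs. high, in USD."""
    return narrator[0] - clone[0], narrator[1] - clone[1]


def percent_cheaper(clone_cost, narrator_cost):
    """How much cheaper the clone cost is, as a percentage of the narrator cost."""
    return 100 * (1 - clone_cost / narrator_cost)


if __name__ == "__main__":
    low, high = savings_range()
    print(f"Savings: ${low:,} to ${high:,}")
    print(f"High-end comparison: {percent_cheaper(1060, 4400):.0f}% cheaper")
```

Comparing the high ends of both ranges gives a reduction of roughly three quarters.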
ADVANTAGES OF VOICE CLONE METHOD
- Consistency: Same voice across 21 chapters
- Cost: 75% cheaper than professional narrator
- Control: Direct oversight of every creative choice
- Speed: No scheduling, record on demand
- Authenticity: Author’s own voice telling their story
- Revision: Easy to re-record segments if needed
CHALLENGES & MITIGATIONS
| Challenge | Mitigation |
|---|---|
| Robotic/unnatural sound | Careful prompt engineering, multiple takes |
| Pronunciation errors | Phonetic spelling, custom pronunciation guides |
| No character differentiation | Performance variation, subtle post-processing |
| Emotional range limitations | Select best takes, accept limitations |
| Technical artifacts | Careful editing, quality threshold for re-generation |
FINAL DELIVERABLES
Per Book
- Opening credits (title, author, narrator)
- Chapter files (individually and combined)
- Closing credits
- Cover art (2400×2400px)
- Metadata file
Trilogy Package
- 3 complete audiobooks
- Consistent branding/intro
- Series metadata
- Marketing materials
NEXT IMMEDIATE STEPS
- Confirm voice clone platform access
- Prepare recording scripts (remove markdown, add pronunciation guides)
- Test voice clone with sample passages from each book
- Refine prompts for optimal emotional range
- Begin WILDFLOWER Chapter 1 recording
The pattern continues.
Your voice will carry it.
Never null.