Audio Generation

Transform text dialogs into realistic audio conversations.

With a single method call, SDialog converts a text dialog into a spoken conversation. The audio module runs a multi-stage pipeline (voice assignment, text-to-speech, and optional acoustic simulation) that turns the text into spoken dialogue with environmental effects.

Core Audio Features

Text-to-Speech (TTS)

Support for multiple TTS engines, including Kokoro and Hugging Face models.

Voice Databases

Automatic or manual voice assignment based on persona attributes (age, gender, language).

Acoustic Simulation

Room acoustics simulation for realistic spatial audio.

Microphone Simulation

Simulation of professional microphones from brands like Shure, Sennheiser, and Sony.
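
To make the idea of attribute-based voice assignment concrete, here is a minimal sketch of how a voice could be chosen from a persona's gender, language, and age. It is an illustration only, not SDialog's implementation: the `Voice` record, the `pick_voice` helper, and the sample voice identifiers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    # Hypothetical voice record; field names are illustrative, not SDialog's schema.
    voice_id: str
    gender: str
    language: str
    age: int


def pick_voice(voices, gender, language, age):
    """Return the voice matching gender and language whose age is closest."""
    candidates = [
        v for v in voices
        if v.gender == gender and v.language == language
    ]
    if not candidates:
        raise LookupError(f"no voice for gender={gender!r}, language={language!r}")
    # Among the matching voices, prefer the one nearest in age to the persona.
    return min(candidates, key=lambda v: abs(v.age - age))


# Made-up entries standing in for a real voice database.
VOICES = [
    Voice("en_f_young", "female", "english", 25),
    Voice("en_f_senior", "female", "english", 68),
    Voice("en_m_adult", "male", "english", 40),
]

chosen = pick_voice(VOICES, gender="female", language="english", age=61)
print(chosen.voice_id)
```

Treating gender and language as hard filters and age as a soft preference is one reasonable design; a real voice database may weigh these attributes differently or allow manual overrides.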

Usage Example


from sdialog import Dialog

dialog = Dialog.from_file("my_dialog.json")

# Convert to audio with room acoustics
# This simple command handles voice assignment, TTS, and acoustic simulation
audio_dialog = dialog.to_audio(perform_room_acoustics=True)
audio_dialog.display()

# Or customize the audio generation for different formats and sampling rates
audio_dialog_custom = dialog.to_audio(
    perform_room_acoustics=True,
    audio_file_format="mp3",
    re_sampling_rate=16000,
)
audio_dialog_custom.display()
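
Room acoustics are commonly modeled by convolving the dry (echo-free) speech signal with a room impulse response. The sketch below illustrates that general technique on a toy signal using plain Python. It is not how SDialog implements `perform_room_acoustics`; the `convolve` helper and the `toy_rir` values are assumptions made for illustration.

```python
def convolve(signal, impulse_response):
    """Discrete linear convolution; output has len(signal) + len(ir) - 1 samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Each input sample contributes a scaled, delayed copy of the response.
            out[i + j] += s * h
    return out


# Toy impulse response: the direct sound, then one echo two samples later at half gain.
toy_rir = [1.0, 0.0, 0.5]
dry = [1.0, 2.0, 3.0]

wet = convolve(dry, toy_rir)
print(wet)  # [1.0, 2.0, 3.5, 1.0, 1.5]
```

Real impulse responses contain thousands of samples, so practical systems use FFT-based convolution rather than this quadratic loop.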