stream_voice.py

Source: sunholo/senses/stream_voice.py

Functions

setup_tts_subparser(subparsers)

Sets up an argparse subparser for the 'tts' command.

Args: subparsers: The subparsers object from argparse.ArgumentParser().

tts_command(args)

Executes the TTS command based on parsed arguments.

Args: args: The parsed command-line arguments.
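
As a rough illustration of how these two functions usually fit together in an argparse-based CLI, the sketch below registers a `tts` subcommand and dispatches to its handler through `set_defaults(func=...)`. The `text` and `--language` arguments, the handler body, and the `build_parser` and `run` helpers are assumptions made for this example. The real arguments are defined in sunholo/senses/stream_voice.py and may differ.

```python
import argparse


def tts_command(args):
    # Hypothetical handler body: the real function drives StreamingTTS.
    return f"speak[{args.language}]: {args.text}"


def setup_tts_subparser(subparsers):
    # Register the 'tts' subcommand. Both arguments are illustrative assumptions.
    tts_parser = subparsers.add_parser("tts", help="Convert text to speech")
    tts_parser.add_argument("text", help="Text to speak")
    tts_parser.add_argument("--language", default="en-US", help="BCP-47 language code")
    # Attach the handler so the caller can dispatch with args.func(args).
    tts_parser.set_defaults(func=tts_command)


def build_parser():
    # Hypothetical top-level parser that owns the subparsers object.
    parser = argparse.ArgumentParser(prog="example-cli")
    subparsers = parser.add_subparsers(dest="command", required=True)
    setup_tts_subparser(subparsers)
    return parser


def run(argv):
    # Parse the arguments and call whichever handler the subcommand registered.
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Storing the handler with `set_defaults` keeps the top-level entry point free of per-command branching: each subcommand module only needs to expose its own setup function.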

Classes

StreamingTTS

Example usage

    import time

    from sunholo.senses.stream_voice import StreamingTTS


    def sample_text_stream():
        sentences = [
            "Hello, this is a test of streaming text to speech.",
            "Each sentence will be converted to audio separately.",
            "This allows for lower latency in long-form text to speech conversion.",
        ]
        for sentence in sentences:
            yield sentence
            time.sleep(0.5)  # Simulate delay between text chunks


    # Initialize and run
    tts = StreamingTTS()
    tts.process_text_stream(sample_text_stream())
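
The method list for this class mentions a queue of audio chunks that a player drains continuously. One common way to arrange that is a producer/consumer pair, sketched below with the standard library only. This is not the class's actual implementation: `run_pipeline`, `synthesize`, and `play` are hypothetical names, and the real class talks to Google Cloud TTS and an audio device instead of plain callables.

```python
import queue
import threading


def run_pipeline(text_generator, synthesize, play):
    """Convert text chunks with synthesize() and play them on a worker thread."""
    audio_queue = queue.Queue()
    sentinel = object()  # marks the end of the stream

    def player():
        # Consumer: drain the queue until the sentinel arrives.
        while True:
            chunk = audio_queue.get()
            if chunk is sentinel:
                break
            play(chunk)

    worker = threading.Thread(target=player, daemon=True)
    worker.start()

    # Producer: convert each text chunk as soon as it arrives.
    for text in text_generator:
        audio_queue.put(synthesize(text))

    # Signal completion and wait for playback of the queued chunks to finish.
    audio_queue.put(sentinel)
    worker.join()
```

Because playback runs on its own thread, synthesis of the next sentence can start while the previous one is still playing, which is where the latency benefit of streaming comes from.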

  • __del__(self)

    • Cleanup method to ensure stream is closed.
  • __init__(self)

    • Initialize self. See help(type(self)) for accurate signature.
  • _apply_fade(self, audio: numpy.ndarray, fade_duration: float, fade_in: bool = True, fade_out: bool = True) -> numpy.ndarray

    • Apply fade in/out to audio with specified duration.
  • _initialize_audio_device(self)

    • Initialize audio device with proper settings.
  • _make_fade(self, length: int, fade_type: str = 'l') -> numpy.ndarray

    • Generate a fade curve of specified length and type.
  • _play_audio_chunk(self, audio_chunk: numpy.ndarray, is_final_chunk: bool = False)

    • Play a single audio chunk with proper device handling.
  • audio_player(self)

    • Continuously play audio chunks from the queue.
  • generate_audio_stream(self, text)

    • Generate a stream of audio data from a text chunk. Returns audio in WAV format for streaming.

Args: text (str): Text to convert to speech

Yields: bytes: WAV-formatted audio data
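
To make "WAV-formatted audio data" concrete, the sketch below wraps raw PCM samples in a WAV container using only the standard library and yields one payload per chunk. It is an illustration under stated assumptions, not the method's actual code: `pcm_to_wav_bytes` and `wav_chunks` are hypothetical helpers, and the 24 kHz, mono, 16-bit defaults are assumed values.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 24000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw little-endian PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample, 2 means 16-bit
        wav_file.setframerate(sample_rate)   # assumed default, not from the source
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def wav_chunks(pcm_chunks, **wav_params):
    # Yield one self-contained WAV payload per PCM chunk.
    for pcm in pcm_chunks:
        yield pcm_to_wav_bytes(pcm, **wav_params)
```

Each yielded payload carries its own header, so a consumer can decode any chunk without having seen the earlier ones.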

  • process_text_stream(self, text_generator)

    • Process incoming text stream and convert to audio.
  • save_to_file(self, text_generator, output_path)

    • Save the audio to a WAV file with minimal fading.
  • set_language(self, language_code: str)

    • Set the language for text-to-speech conversion.

Args: language_code: Language code in BCP-47 format (e.g., 'en-US', 'es-ES', 'fr-FR')

  • set_voice(self, voice_name: str)
    • Set the voice for text-to-speech conversion.

Args: voice_name: Name of the voice to use

  • set_voice_gender(self, gender: str)
    • Set the voice gender for text-to-speech conversion.

Args: gender: One of 'NEUTRAL', 'MALE', or 'FEMALE'

  • text_to_audio(self, text)
    • Convert text chunk to audio bytes using Google Cloud TTS.
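
The fade helpers listed above (`_make_fade` and `_apply_fade`) exist to avoid clicks at chunk boundaries. A minimal NumPy sketch of the idea follows. It assumes that the `'l'` fade type means a linear ramp, adds a half-cosine `'h'` type purely for illustration, and takes a `sample_rate` parameter that the documented signatures do not show, so treat it as an approximation of the technique and not as the library's code.

```python
import numpy as np


def make_fade(length: int, fade_type: str = "l") -> np.ndarray:
    """Return a fade-in curve rising from 0 to 1 over `length` samples."""
    ramp = np.linspace(0.0, 1.0, length)
    if fade_type == "l":  # assumed to mean linear
        return ramp
    if fade_type == "h":  # half-cosine, a hypothetical alternative shape
        return 0.5 * (1.0 - np.cos(np.pi * ramp))
    raise ValueError(f"unknown fade type: {fade_type!r}")


def apply_fade(audio, fade_duration, sample_rate=24000,
               fade_in=True, fade_out=True) -> np.ndarray:
    """Apply fades of `fade_duration` seconds to a mono float array."""
    out = np.asarray(audio, dtype=np.float64).copy()
    # Never let the two fades overlap on very short chunks.
    n = min(int(fade_duration * sample_rate), len(out) // 2)
    if n == 0:
        return out
    curve = make_fade(n)
    if fade_in:
        out[:n] *= curve
    if fade_out:
        out[-n:] *= curve[::-1]  # the mirrored curve falls from 1 to 0
    return out
```

Calling `apply_fade(chunk, 0.01)` on each synthesized chunk would taper the first and last 10 ms, leaving the middle of the chunk untouched.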

Copyright © Holosun ApS 2024