stream_voice.py
Source: sunholo/senses/stream_voice.py
Functions
setup_tts_subparser(subparsers)
Sets up an argparse subparser for the 'tts' command.
Args: subparsers: The subparsers object from argparse.ArgumentParser().
tts_command(args)
Executes the TTS command based on parsed arguments.
Args: args: The parsed command-line arguments.
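The two functions above follow the usual argparse subcommand pattern: one function registers the subparser, and a handler receives the parsed arguments. The sketch below illustrates that pattern only. The names `setup_demo_tts_subparser`, `demo_tts_command`, and `build_parser`, and the `text` and `--language` arguments, are hypothetical and not taken from sunholo's actual CLI.

```python
import argparse


def setup_demo_tts_subparser(subparsers):
    """Register a hypothetical 'tts' subcommand (names and flags are illustrative)."""
    parser = subparsers.add_parser("tts", help="Convert text to speech")
    parser.add_argument("text", help="Text to speak")
    parser.add_argument("--language", default="en-US", help="BCP-47 language code")
    # Route the parsed arguments to the handler below.
    parser.set_defaults(func=demo_tts_command)
    return parser


def demo_tts_command(args):
    """Stand-in handler: describe what would be spoken instead of playing audio."""
    return f"[{args.language}] {args.text}"


def build_parser():
    """Build a top-level parser that requires a subcommand."""
    parser = argparse.ArgumentParser(prog="demo-cli")
    subparsers = parser.add_subparsers(dest="command", required=True)
    setup_demo_tts_subparser(subparsers)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["tts", "Hello there", "--language", "fr-FR"])
    print(args.func(args))
```

Storing the handler with `set_defaults(func=...)` lets the entry point dispatch with `args.func(args)` without knowing which subcommand was chosen.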
Classes
StreamingTTS
Example usage
    import time

    def sample_text_stream():
        sentences = [
            "Hello, this is a test of streaming text to speech.",
            "Each sentence will be converted to audio separately.",
            "This allows for lower latency in long-form text to speech conversion.",
        ]
        for sentence in sentences:
            yield sentence
            time.sleep(0.5)  # Simulate delay between text chunks

    # Initialize and run
    tts = StreamingTTS()
    tts.process_text_stream(sample_text_stream())
-
__del__(self)
- Cleanup method to ensure stream is closed.
-
__init__(self)
- Initialize self. See help(type(self)) for accurate signature.
-
_apply_fade(self, audio: numpy.ndarray, fade_duration: float, fade_in: bool = True, fade_out: bool = True) -> numpy.ndarray
- Apply fade in/out to audio with specified duration.
-
_initialize_audio_device(self)
- Initialize audio device with proper settings.
-
_make_fade(self, length: int, fade_type: str = 'l') -> numpy.ndarray
- Generate a fade curve of specified length and type.
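A fade of this kind can be sketched as a gain ramp multiplied into the edges of a chunk. The version below is a minimal illustration, not the class's implementation. It uses plain Python lists instead of numpy arrays to stay dependency-free, takes `sample_rate` as an explicit parameter, assumes `'l'` means a linear ramp, and adds a hypothetical `'c'` (raised cosine) type for comparison.

```python
import math


def make_fade(length, fade_type="l"):
    """Return `length` gain values rising from 0.0 to 1.0.

    'l' is assumed to mean linear; 'c' (raised cosine) is a hypothetical extra.
    """
    if length <= 0:
        return []
    if length == 1:
        return [1.0]
    ramp = [i / (length - 1) for i in range(length)]
    if fade_type == "l":
        return ramp
    if fade_type == "c":
        return [0.5 - 0.5 * math.cos(math.pi * x) for x in ramp]
    raise ValueError(f"unknown fade_type: {fade_type!r}")


def apply_fade(samples, sample_rate, fade_duration, fade_in=True, fade_out=True):
    """Return a copy of `samples` with the edges scaled by a linear fade."""
    out = list(samples)
    # Cap the fade at half the chunk so fade-in and fade-out never overlap.
    n = min(int(sample_rate * fade_duration), len(out) // 2)
    curve = make_fade(n)
    for i, gain in enumerate(curve):
        if fade_in:
            out[i] *= gain
        if fade_out:
            # Mirror the same ramp onto the tail of the chunk.
            out[len(out) - 1 - i] *= gain
    return out
```

Fading chunk edges is a common way to avoid audible clicks where one chunk stops and the next begins.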
-
_play_audio_chunk(self, audio_chunk: numpy.ndarray, is_final_chunk: bool = False)
- Play a single audio chunk with proper device handling.
-
audio_player(self)
- Continuously play audio chunks from the queue.
-
generate_audio_stream(self, text)
- Generate a stream of audio data from a text chunk. Returns audio in WAV format for streaming.
Args: text (str): Text to convert to speech
Yields: bytes: WAV-formatted audio data
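One way to produce WAV-formatted bytes per chunk is to wrap raw PCM samples in a WAV container with the standard library's `wave` module. The sketch below assumes 16-bit mono PCM at 24 kHz. Those parameters and the helper names `pcm_to_wav_bytes` and `generate_wav_chunks` are illustrative assumptions, not details confirmed by the source.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=24000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the result as bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def generate_wav_chunks(pcm_chunks, **wav_params):
    """Yield one self-contained WAV payload per PCM chunk."""
    for pcm in pcm_chunks:
        yield pcm_to_wav_bytes(pcm, **wav_params)
```

Because each yielded payload carries its own header, a consumer can decode every chunk independently of the others.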
-
process_text_stream(self, text_generator)
- Process incoming text stream and convert to audio.
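Pairing a text-processing method with a queue-draining player suggests a producer/consumer design. The sketch below shows that pattern with `queue.Queue` and a worker thread. It is an assumption about the general shape, not the class's code: the `synthesize` callback, the `played` list, and the sentinel object are hypothetical stand-ins for the real TTS call and audio device.

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the stream


def player(audio_queue, played):
    """Consumer: drain the queue until the sentinel arrives."""
    while True:
        chunk = audio_queue.get()
        if chunk is _SENTINEL:
            break
        played.append(chunk)  # a real player would write to the audio device


def process_text_stream(text_generator, synthesize):
    """Producer: synthesize each text chunk and hand it to the player thread."""
    audio_queue = queue.Queue()
    played = []
    worker = threading.Thread(target=player, args=(audio_queue, played))
    worker.start()
    for text in text_generator:
        if text.strip():  # skip empty or whitespace-only chunks
            audio_queue.put(synthesize(text))
    audio_queue.put(_SENTINEL)  # tell the player no more audio is coming
    worker.join()
    return played
```

Running playback on its own thread lets synthesis of the next sentence overlap with playback of the current one, which is where the latency benefit of streaming comes from.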
-
save_to_file(self, text_generator, output_path)
- Save the audio to a WAV file with minimal fading.
-
set_language(self, language_code: str)
- Set the language for text-to-speech conversion.
Args: language_code: Language code in BCP-47 format (e.g., 'en-US', 'es-ES', 'fr-FR')
-
set_voice(self, voice_name: str)
- Set the voice for text-to-speech conversion.
Args: voice_name: Name of the voice to use.
-
set_voice_gender(self, gender: str)
- Set the voice gender for text-to-speech conversion.
Args: gender: One of 'NEUTRAL', 'MALE', or 'FEMALE'
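Since only three gender values are accepted, a caller may want to validate input before passing it on. The helper below is a hypothetical sketch of such a check. `normalize_gender` and `VALID_GENDERS` are not part of the documented class, and whether the real method accepts lowercase input is not stated in the source.

```python
VALID_GENDERS = ("NEUTRAL", "MALE", "FEMALE")


def normalize_gender(gender):
    """Return the upper-cased gender, or raise ValueError if it is not supported."""
    normalized = gender.strip().upper()
    if normalized not in VALID_GENDERS:
        raise ValueError(
            f"gender must be one of {', '.join(VALID_GENDERS)}; got {gender!r}"
        )
    return normalized
```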
-
text_to_audio(self, text)
- Convert text chunk to audio bytes using Google Cloud TTS.