EdgeEngine.get_stream_info() is not right #342
Open
Description
In TextToAudioStream, when the engine is EdgeEngine, the format returned by self.engine.get_stream_info() equals 66536, which is not right. Please fix this. The value is read in _on_audio_chunk:
def _on_audio_chunk(self, chunk):
    """
    Postprocessing of single chunks of audio data.
    This method is called for each chunk of audio data processed. It first determines the audio stream format.
    If the format is `pyaudio.paFloat32`, we convert to paInt16.
    Args:
        chunk (bytes): The audio data chunk to be processed.
    """
    format, channels, sample_rate = self.engine.get_stream_info()
    if format == pyaudio.paFloat32:
        audio_data = np.frombuffer(chunk, dtype=np.float32)
        audio_data = np.int16(audio_data * 32767)
        chunk = audio_data.tobytes()
    else:
        # For other formats, convert to numpy array for RMS calculation
        audio_data = np.frombuffer(chunk, dtype=np.int16)
    # Calculate RMS value for lip sync
    if len(audio_data) > 0:
        # Normalize to float32 for RMS calculation
        if format == pyaudio.paFloat32:
            # audio_data is already converted above
            normalized_data = audio_data.astype(np.float32) / 32767.0
        else:
            # For int16 format
            normalized_data = audio_data.astype(np.float32) / 32767.0
        # Calculate RMS
        self.current_rms = np.sqrt(np.mean(np.square(normalized_data)))
    else:
        self.current_rms = 0.0
    if self.output_wavfile and self.wf:
        if self._is_engine_mpeg():
            self.wf.write(chunk)
        else:
            self.wf.writeframes(chunk)
    if self.chunk_callback:
        self.chunk_callback(chunk)
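The handler above treats every format other than paFloat32 as int16 PCM, so any unexpected format value sends the chunk bytes straight into the int16 RMS path. Below is a minimal sketch of a guard that computes the RMS only for formats it recognizes. The helper names `describe_format` and `rms_of_chunk` are hypothetical and not part of the library. The constants mirror PortAudio's sample-format values (the ones PyAudio exposes as `pyaudio.paFloat32`, `pyaudio.paInt16`, and `pyaudio.paCustomFormat`) so the sketch runs without PyAudio, and it uses the standard-library `array` module rather than NumPy to stay dependency-free. Note that the reported value 66536 matches none of these constants, while 65536 is `paCustomFormat`. Whether the report meant 65536 is an assumption.

```python
import math
from array import array

# PortAudio sample-format constants, mirrored here so the sketch runs
# without PyAudio installed.
PA_FLOAT32 = 0x1            # pyaudio.paFloat32
PA_INT16 = 0x8              # pyaudio.paInt16
PA_CUSTOM_FORMAT = 0x10000  # pyaudio.paCustomFormat (65536)

_FORMAT_NAMES = {
    PA_FLOAT32: "paFloat32",
    PA_INT16: "paInt16",
    PA_CUSTOM_FORMAT: "paCustomFormat",
}


def describe_format(fmt):
    """Hypothetical helper: name a format value, or 'unknown'."""
    return _FORMAT_NAMES.get(fmt, "unknown")


def rms_of_chunk(chunk, fmt):
    """Hypothetical helper: normalized RMS of a raw PCM chunk.

    Returns None when the format is not raw float32 or int16 PCM,
    instead of reinterpreting the bytes as int16.
    """
    if fmt == PA_FLOAT32:
        samples = array("f")
        samples.frombytes(chunk)
        values = list(samples)
    elif fmt == PA_INT16:
        samples = array("h")
        samples.frombytes(chunk)
        values = [s / 32767.0 for s in samples]
    else:
        # Custom or unknown format: the bytes are not known to be PCM.
        return None
    if not values:
        return 0.0
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    print(describe_format(66536))
    print(describe_format(65536))
    print(rms_of_chunk(array("h", [32767, -32767]).tobytes(), PA_INT16))
```

With a guard like this, the caller could leave `current_rms` unchanged (or set it to 0.0) whenever `None` comes back, which is a design choice for the maintainers rather than something the issue specifies.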