About the Python SDK speaker

Dominic · March 26

Hello there,

Following the GitHub example below, I am receiving the user's audio input as bytes from the Python SDK speaker.

https://github.com/daily-co/daily-python/blob/793e1f8ff02b0ebfd49cc1110805b0abc53dd6b2/demos/vad/native_vad.py

To check that the audio data is correct, I tried to convert the collected data into an MP3 file and send it in an STT request, but the MP3 generation fails with an error.

I built the complete audio data by concatenating the bytes received while the status was speaking.

Is there a way to get the user's audio input as-is through the Python SDK?

Thanks

Dominic

aconchillo · March 26

Hi @Dominic. The bytes received from a virtual speaker are raw PCM S16 (signed 16-bit) samples, not MP3. Does your STT service accept PCM (or WAV)? This example shows how to send audio to Google STT: https://github.com/daily-co/daily-python/blob/main/demos/google/google_speech_to_text.py
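
If your STT service takes WAV, one option is to wrap the concatenated PCM bytes in a WAV container with Python's standard `wave` module. The sketch below is only an illustration: the function name `pcm_s16_to_wav_bytes` is hypothetical, and the 16000 Hz mono defaults are assumptions that must match whatever sample rate and channel count your virtual speaker was created with.

```python
import io
import wave


def pcm_s16_to_wav_bytes(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw PCM S16 samples in a WAV container and return the file bytes.

    sample_rate and channels are assumptions; use the values your
    virtual speaker was actually configured with.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # S16 means 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


# Usage sketch: `chunks` stands in for the byte chunks collected while speaking.
# chunks = [...]
# with open("speech.wav", "wb") as f:
#     f.write(pcm_s16_to_wav_bytes(b"".join(chunks)))
```

The resulting file should play in a normal audio player, which is a quick way to confirm the captured audio is correct before sending it to STT.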
