Deepgram transcription integration: possibility of better "full sentence" understanding?

zarally · October 2023

I am using daily.co's integration with Deepgram to transcribe audio. It's pretty awesome, but I get a lot of incomplete sentences when I use the check I found in the documentation to determine whether a snippet is "complete" or not:

if (e.fromId === 'transcription' && e.data?.is_final) {

Below is an example of a transcription from an audio recording about Michel Foucault:

content: 'Michel Fu was a French twentieth century philosopher and his historian.'
content: 'Who spent his career for forensic criticizing the power of the modern board'
content: 'our capitalist state, including its beliefs, law courts, prison doctors,'
content: 'psychiatrists. His goal was to work out nothing less than how'
content: 'power work and then to change it in the direction'
content: 'of a Marxist an utopia. Though he spent most of his life in libraries reasons'
content: 'seminar rooms. He was a committed revolutionary figure.'
content: 'He met with enormous popularity in a lead to Parisian intellectual circles.'
content: 'Jean paul Sa admired him deeply, and he still maintains a wide following'
content: 'among young people studying at university in the prosperous corners of the world.'
content: 'His background, which he was extremely reluctant ever talk about and tried to prevent'
content: 'journalists from investigating at all costs was very privileged.'
content: 'Both his parents were inordinate rich',
content: 'coming from a long line of successful surgeons in Port in West'
content: 'for France. His father dr Paul',
content: 'came to represent all the Michel would hate about Bourgeois France.'

So two questions, really:

Is there updated documentation for this integration? Or are there any known knobs I can turn other than e.data?.is_final to try to get more "complete" sentences here?
Deepgram has a new model called nova2 in the works. Is it on Daily's roadmap to integrate with that? Is there an ETA?

Thanks!

kwindla · October 2023

I believe you can use nova-2 by passing model and tier to startTranscription().

    "model": "2-ea",
    "tier": "nova",

But I'll ask my colleague Corey to weigh in to confirm that.

Regarding the best options for complete phrase endpointing, we're working with Deepgram to give you more control here.

zarally · October 2023

Thanks - that sounds great. I can also batch phrases up on my side a bit more before sending them on. I will try adding model & tier to see if that makes a difference. Any idea how to confirm from the response whether Deepgram is actually using that model?

Appreciate the feedback!
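
For reference, the client-side batching mentioned above could look roughly like the sketch below. It is only an illustration: `createSentenceBuffer` is a hypothetical helper (not part of daily-js), it assumes the snippets arrive with punctuation, and the `text` field name in the commented wiring is an assumption about the message payload.

```javascript
// Hypothetical helper (not part of daily-js): joins finalized snippets and
// calls onSentence each time sentence-ending punctuation is seen.
function createSentenceBuffer(onSentence) {
  let pending = "";
  return {
    // Add one finalized snippet and emit every sentence it completes.
    push(text) {
      pending = `${pending} ${text}`.trim();
      // A sentence ends at ., ! or ? followed by whitespace or end of buffer.
      const terminator = /[.!?]+(?=\s|$)/g;
      let start = 0;
      while (terminator.exec(pending) !== null) {
        onSentence(pending.slice(start, terminator.lastIndex).trim());
        start = terminator.lastIndex;
      }
      // Keep the unfinished tail for the next snippet.
      pending = pending.slice(start).trim();
    },
    // Emit whatever is left, e.g. when transcription stops.
    flush() {
      if (pending) onSentence(pending);
      pending = "";
    },
  };
}

// Assumed wiring, reusing the check from the first post. The `text` field
// name is an assumption about the message payload:
// const buffer = createSentenceBuffer((sentence) => console.log(sentence));
// callFrame.on("app-message", (e) => {
//   if (e.fromId === "transcription" && e.data?.is_final) buffer.push(e.data.text);
// });
```

This only regroups text that has already been transcribed. It does not change where Deepgram cuts the audio, and an abbreviation ending in a period would still split a sentence early.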

rajneesh · October 2023

You can listen for the `transcription-started` event; it returns the `model` used for transcription.
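
A minimal sketch of that check, assuming `call` is a Daily call object or call frame exposing `on()`. `watchTranscriptionModel` is a hypothetical helper name, and only the `model` field mentioned above is read from the event.

```javascript
// Hypothetical helper: reports the model named in the
// "transcription-started" event payload.
function watchTranscriptionModel(call, onModel) {
  call.on("transcription-started", (ev) => {
    // `model` is the field mentioned in this thread; other fields are not assumed.
    onModel(ev.model);
  });
}

// Assumed usage with a real call frame:
// watchTranscriptionModel(callFrame, (model) => console.log("model in use:", model));
// callFrame.startTranscription({ model: "2-ea", tier: "nova" });
```

Registering the listener before calling startTranscription() avoids missing the event.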

JeanRooy · November 2023

Hi, I have the same desire to transcribe longer or complete sentences instead of snippets.

I read the Deepgram docs, and it looks like we should use endpointing and interim results for this.

https://developers.deepgram.com/docs/understand-endpointing-interim-results

I tried passing endpointing=500 (the default is 10 milliseconds, per the Deepgram docs) and interim_results=true:

callFrame.startTranscription({
  model: "2-ea",
  tier: "nova",
  endpointing: "500",
  interim_results: "true",
});

https://developers.deepgram.com/docs/endpointing

However, the POST request in the Deepgram logs still shows endpointing=false, and interim_results=true is not included:

POST /v1/listen?punctuate=true&endpointing=false&language=en&model=2-ea&tier=nova&profanity_filter=false&times=false

Can Daily please add more of the options from the Deepgram API docs to the startTranscription() configuration?
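
To make this kind of comparison repeatable, the logged request line can be checked against the options that were passed. The sketch below is an illustration only: `diffDeepgramParams` is a hypothetical helper, and it assumes the log line contains the request path with its query string, as in the example above.

```javascript
// Hypothetical helper: compares the query string of a logged request line
// with the parameters we expected to be forwarded.
function diffDeepgramParams(requestLine, expected) {
  // Pick the token that carries the query string, e.g. "/v1/listen?punctuate=true&..."
  const target = requestLine.trim().split(/\s+/).find((t) => t.includes("?")) || "";
  const actual = new URLSearchParams(target.slice(target.indexOf("?") + 1));
  const mismatches = {};
  for (const [key, want] of Object.entries(expected)) {
    const got = actual.get(key); // null when the parameter is absent
    if (got !== String(want)) mismatches[key] = { want: String(want), got };
  }
  return mismatches;
}

const logged =
  "POST /v1/listen?punctuate=true&endpointing=false&language=en&model=2-ea&tier=nova&profanity_filter=false&times=false";
const result = diffDeepgramParams(logged, {
  model: "2-ea",
  tier: "nova",
  endpointing: "500",
  interim_results: "true",
});
console.log(result);
```

For the request logged above, this reports `endpointing` (wanted "500", got "false") and `interim_results` (wanted "true", got null), while `model` and `tier` match.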

kcimc · November 2023

I requested that the endpointing and interim_results parameters be added to the Python SDK, on GitHub:

https://github.com/daily-co/daily-python/issues/11

I can also confirm that "2-ea"/"nova" works for me.

zarally · November 2023

+1 for this - endpointing would be super helpful - I echoed this request in the daily-react repo:

https://github.com/daily-co/daily-react/issues/26

mark_at_daily · November 2023

Thanks @kcimc and @zarally. We are planning to expand the parameter support in all of our SDKs to include these two items plus others.

zarally · November 2023

Thanks @mark_at_daily - any ETA on that? Sorry to be a pest, the endpointing param would be a big help to me.

star_900 · November 2023

+1 for this. I would love to know when this will be updated in daily-js, since someone in the daily-python repo mentioned it will be coming in a few weeks:

https://github.com/daily-co/daily-python/issues/11

mark_at_daily · November 2023

Hi all, apologies for the delay. We've been working on this and are targeting a release by the end of the week (Dec 1st).
