🎯 Objective: Learn how to record audio in the browser and send it to cloud machine learning services in order to transform speech to text.
📖 TO DO:
- Grab the recording from index.html, send it to GCloud Speech-to-Text, receive the text, then show it
- Grab the recording from index.html, send it to AWS Transcribe, receive the text, then show it
- Do a small refactor and use Vue 3 for the interface that shows the notes.
Run
npm i
npm run start:dev
to launch index.html and go to http://localhost:3000/*
The snippet for getting microphone permission and starting/stopping the recording was taken from: CodePen
The approach for correctly sending the Blob to the backend was taken from here: Fetch API upload file
The documentation I used to recall how to use the Multer interceptor from Express and the FileInterceptor from Nest: File Upload Nest
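As a rough illustration of the upload step described above, the recorded Blob can be wrapped in a FormData and posted with fetch. This is only a sketch: the field name `file`, the file name `recording.webm` and the `/upload` endpoint are assumptions made for the example, not names taken from this project. The field name has to match whatever the backend's file interceptor expects.

```typescript
// Wrap a recorded audio Blob in multipart form data.
// NOTE: "file" and "recording.webm" are hypothetical defaults for this sketch.
function buildUploadForm(
  recording: Blob,
  fieldName = "file",
  fileName = "recording.webm",
): FormData {
  const form = new FormData();
  // The third argument sets the file name the server sees for this part.
  form.append(fieldName, recording, fileName);
  return form;
}

// POST the form to the backend. "/upload" is an assumed endpoint.
// No Content-Type header is set on purpose: fetch adds the multipart
// boundary itself when the body is a FormData.
async function uploadRecording(recording: Blob, url = "/upload"): Promise<Response> {
  return fetch(url, { method: "POST", body: buildUploadForm(recording) });
}
```

In the browser, `uploadRecording` would be called with the Blob assembled from the recorder's data chunks once recording stops.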
You must have a GCloud account with billing enabled. Enter here: Enable API
Then go to Cloud Speech-to-Text API
You should then see that the API is enabled.

Also remember the prices:
Free Tier: up to 60 minutes/month
Level 1: USD 0.024/minute
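To get a feel for what these prices mean in practice, here is a small estimate helper. It is a sketch under a simplifying assumption: the free minutes are subtracted from the monthly total and everything above them is charged at the Level 1 rate. The constants are the values quoted above and may change, so check the current pricing page before relying on them.

```typescript
// Prices quoted in this README; treat them as sample values.
const FREE_MINUTES_PER_MONTH = 60;
const USD_PER_MINUTE = 0.024;

// Estimate the monthly bill for a given number of billed minutes.
// Assumption: only the minutes above the free tier are charged.
function estimateMonthlyCostUsd(billedMinutes: number): number {
  const paidMinutes = Math.max(0, billedMinutes - FREE_MINUTES_PER_MONTH);
  return paidMinutes * USD_PER_MINUTE;
}
```

Note that the input should be billed minutes, not audio minutes, since the service may bill more time than the audio actually lasts.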
To test locally, export the credentials path as an environment variable:
export GOOGLE_APPLICATION_CREDENTIALS=/home/eddyalien/webpage-9a255-c6e8df3d5c82.json
For this test I used the 'es-MX' language. The response of the API is the following:
{
  "results": [
    {
      "alternatives": [
        {
          "words": [],
          "transcript": "Okay vamos a probar esto no tiene más de 5 segundos",
          "confidence": 0.9308170676231384
        }
      ],
      "channelTag": 0,
      "resultEndTime": {
        "seconds": "5",
        "nanos": 50000000
      },
      "languageCode": "es-us"
    }
  ],
  "totalBilledTime": {
    "seconds": "15",
    "nanos": 0
  }
}
As far as I can see, the minimum GCloud bills is 15 seconds, even if the audio is only 5 seconds long, for example.
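To show the text and compare audio length against billed time, the response above can be reduced to a transcript and two durations. The following is a minimal sketch: the interface and function names are made up for this example and only cover the fields that appear in the response shown above.

```typescript
// Only the fields used below, shaped after the response shown above.
interface DurationLike {
  seconds: string;
  nanos: number;
}

interface RecognizeResponseLike {
  results: {
    alternatives: { transcript: string; confidence: number }[];
    resultEndTime?: DurationLike;
  }[];
  totalBilledTime?: DurationLike;
}

// Join the top alternative of every result into one transcript.
function extractTranscript(response: RecognizeResponseLike): string {
  return response.results
    .map((result) => result.alternatives[0]?.transcript ?? "")
    .filter((text) => text.length > 0)
    .join(" ");
}

// Convert a { seconds, nanos } duration into seconds as a number.
// "seconds" arrives as a string in the JSON, so it is parsed first.
function durationToSeconds(duration: DurationLike): number {
  return Number(duration.seconds) + duration.nanos / 1e9;
}
```

Applied to the response above, `extractTranscript` returns the Spanish sentence, and `durationToSeconds` lets the backend log `resultEndTime` next to `totalBilledTime` to see the gap between audio length and billed time.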
Finally, the Google Cloud documentation is a little bit spread across many places. The most useful reference for fixing my issues was: google.cloud.speech.v1
And a short but useful introduction: Converting speech to text with Node.js
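For orientation in the v1 reference, a recognize request for the 'es-MX' test could be assembled roughly like this. It is a sketch, not this project's code: the `WEBM_OPUS` encoding and the 48000 Hz sample rate are assumptions about what the browser recorder produces, and `buildRecognizeRequest` is a hypothetical helper name.

```typescript
// Plain-object shape of a google.cloud.speech.v1 recognize request,
// limited to the fields used in this sketch.
interface RecognizeRequestLike {
  config: {
    encoding: string;
    sampleRateHertz: number;
    languageCode: string;
  };
  audio: { content: string };
}

// Build a request from the raw bytes of the uploaded recording.
function buildRecognizeRequest(
  audioBytes: Uint8Array,
  languageCode = "es-MX",
): RecognizeRequestLike {
  return {
    config: {
      encoding: "WEBM_OPUS", // assumption: recorder output is WebM/Opus
      sampleRateHertz: 48000, // assumption: must match the real recording
      languageCode,
    },
    // Inline audio is sent as a base64-encoded string.
    audio: { content: Buffer.from(audioBytes).toString("base64") },
  };
}
```

With the official `@google-cloud/speech` Node.js client, an object of this shape is what would be passed to the client's `recognize` method. If the encoding or sample rate does not match the uploaded file, the API is likely to reject the request or return empty results.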
