STT Cloud Learning

🎯 Objective: Learn how to record audio in the browser and send it to cloud machine learning services to convert speech to text.

📖 TO DO:

Grab the recording from index.html, send it to GCloud Speech-to-Text, receive the text, and show it.
Grab the recording from index.html, send it to AWS Transcribe, receive the text, and show it.
Do a small refactor and use Vue 3 for the interface that shows the notes.

Getting Started

Run

npm i
npm run start:dev

to launch index.html, then open http://localhost:3000/ in the browser.

The snippet for getting microphone permission and starting/stopping the recording was taken from: CodePen

The approach for correctly sending the Blob to the backend was taken from: Fetch API upload file
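As a rough illustration of that approach, the sketch below wraps the recorded Blob in a FormData body and posts it with fetch. The field name `audio`, the file name `recording.webm`, and the `/speech/gcloud` path are assumptions made for this example, not names taken from this repository; `fetchImpl` is a hypothetical parameter that only exists to make the function easy to test.

```typescript
// Sketch only: field name, file name, and endpoint path are assumptions.
type FetchLike = (
  url: string,
  init: { method: string; body: FormData },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function buildAudioForm(recording: Blob, fieldName = "audio"): FormData {
  const form = new FormData();
  // The third argument gives the part a file name, so the backend sees a file upload.
  form.append(fieldName, recording, "recording.webm");
  return form;
}

async function uploadRecording(
  recording: Blob,
  url = "/speech/gcloud",
  fetchImpl: FetchLike = (u, init) => fetch(u, init),
): Promise<unknown> {
  // Content-Type is left unset on purpose: fetch adds the multipart boundary itself.
  const response = await fetchImpl(url, {
    method: "POST",
    body: buildAudioForm(recording),
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.json();
}
```

On the backend, the field name used here has to match the one given to the file interceptor.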

For a reminder of how to use the Multer interceptor from Express and the FileInterceptor from Nest, see: File Upload Nest

Enabling GCloud API

You must have a GCloud account with billing enabled. Enter here: Enable API

Then go to Cloud Speech-to-Text API

You should then see that the API is enabled.

Also remember the prices:
Free tier: up to 60 minutes/month
Level 1: USD 0.024/minute

Create ServiceAccount and get a JSON Credential

To test locally, export the path to the JSON credential as an environment variable:

export GOOGLE_APPLICATION_CREDENTIALS=/home/eddyalien/webpage-9a255-c6e8df3d5c82.json
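The Google client libraries pick this variable up through Application Default Credentials. A small startup check can make a missing or mistyped path fail early with a readable message. The helper below is a hypothetical sketch, not code from this repository.

```typescript
import { existsSync } from "node:fs";

// Hypothetical helper: returns the credential path, or throws before any API call is made.
function requireGoogleCredentials(
  env: Record<string, string | undefined> = process.env,
): string {
  const path = env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!path) {
    throw new Error("GOOGLE_APPLICATION_CREDENTIALS is not set");
  }
  if (!existsSync(path)) {
    throw new Error(`Credential file not found: ${path}`);
  }
  return path;
}
```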

For this test I used the 'es-MX' language. The API response is the following:

{
   "results":[
      {
         "alternatives":[
            {
               "words":[
                  
               ],
               "transcript":"Okay vamos a probar esto no tiene más de 5 segundos",
               "confidence":0.9308170676231384
            }
         ],
         "channelTag":0,
         "resultEndTime":{
            "seconds":"5",
            "nanos":50000000
         },
         "languageCode":"es-us"
      }
   ],
   "totalBilledTime":{
      "seconds":"15",
      "nanos":0
   }
}
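To show the text in the interface, only the transcript of each result is needed. The sketch below assumes the response has the shape shown above and takes the first alternative of every result, which the API lists as the most probable one. The interface and function names are made up for this example.

```typescript
// Types mirror only the fields visible in the response above.
interface SpeechAlternative {
  transcript: string;
  confidence: number;
}

interface SpeechResult {
  alternatives: SpeechAlternative[];
  languageCode?: string;
}

interface RecognizeResponse {
  results: SpeechResult[];
}

// Joins the top alternative of every result into a single string.
function extractTranscript(response: RecognizeResponse): string {
  return response.results
    .map((result) => result.alternatives[0]?.transcript.trim() ?? "")
    .filter((text) => text.length > 0)
    .join(" ");
}
```

Longer recordings can come back as several results, which is why the transcripts are joined rather than reading only `results[0]`.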

As can be seen from totalBilledTime, the minimum GCloud bills is 15 seconds, even when the audio lasts only 5 seconds, as in this example.
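A rough cost estimate can be derived from `totalBilledTime` and the Level 1 price quoted earlier (USD 0.024/minute). The sketch below ignores the free tier and assumes the duration fields arrive as in the response above (`seconds` as a string, `nanos` as a number). The function names are hypothetical.

```typescript
// Duration as it appears in the response: seconds is a string, nanos a number.
interface Duration {
  seconds: string;
  nanos: number;
}

// Level 1 price quoted in this README; adjust if pricing changes.
const USD_PER_MINUTE = 0.024;

function durationToSeconds(duration: Duration): number {
  return Number(duration.seconds) + duration.nanos / 1e9;
}

// Estimated charge for one request, ignoring the free tier.
function estimateCostUsd(
  totalBilledTime: Duration,
  usdPerMinute: number = USD_PER_MINUTE,
): number {
  return (durationToSeconds(totalBilledTime) / 60) * usdPerMinute;
}
```

For the response above, 15 billed seconds is a quarter of a minute, so the estimate comes to about USD 0.006 for an audio clip of roughly 5 seconds.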

Finally, the Google Cloud documentation is spread across many places. The most useful reference for fixing my issues was: google.cloud.speech.v1

And a short but useful introduction: Converting speech to text with Node.js
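For reference, a v1 recognize request for the 'es-MX' test could look like the sketch below. The encoding `WEBM_OPUS` and the 48000 Hz sample rate are assumptions about what the browser recorder produces, not values confirmed by this repository, and `buildRecognizeRequest` is a hypothetical helper name.

```typescript
// Only the fields needed for a short, synchronous recognition request.
interface RecognizeRequest {
  config: {
    encoding: string;
    sampleRateHertz: number;
    languageCode: string;
  };
  audio: { content: string };
}

function buildRecognizeRequest(
  audioBytes: Uint8Array,
  languageCode = "es-MX",
): RecognizeRequest {
  return {
    config: {
      encoding: "WEBM_OPUS", // assumption: the recorder emits WebM/Opus
      sampleRateHertz: 48000, // assumption: typical Opus sample rate
      languageCode,
    },
    // The API expects the audio bytes base64-encoded.
    audio: { content: Buffer.from(audioBytes).toString("base64") },
  };
}
```

With the `@google-cloud/speech` package, an object of this shape would typically be passed to the client's `recognize` method.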
