GitHub - aws-samples/amazon-transcribe-output-word-document: An Amazon Transcribe demo to produce a Microsoft Word document containing the turn-by-turn transcription of the audio. This will include additional metadata depending upon the options selected, such as caller sentiment, category identification and issue detection

Convert ASR JSON To Word Document

Overview

This Python3 application will process the results of a synchronous Amazon Transcribe job, in either Transcribe standard mode or Transcribe Call Analytics (TCA) mode, or Amazon Bedrock Data Automation Audio (BDA) job and will turn it into a Microsoft Word document that contains a turn-by-turn transcript from each speaker. It can handle processing a local JSON output file, or it can dynamically query the Amazon Transcribe service to download the JSON. It works in one of two different modes:

Transcribe Standard Mode - this can optionally call Amazon Comprehend to generate sentiment for each turn of the conversation. This mode can handle either speaker-separated or channel-separated audio files
Transcribe Call Analytics Mode - using the Call Analytics feature of Amazon Transcribe, the Word document will present all of the analytical data in either a tabular or graphical form
BDA Audio Mode - using the standard BDA Audio project, and optionally the contact center custom blueprint. The standard project includes general transcript-related features, but the custom blueprint includes additional call-level features

Features

The following table summarise which features are available in each mode. It should be noted that some missing items in BDA Audio can be done with a non-standard custom blueprint. Not all features are supported by this demo, such as Dominant Language Detection, and please note that any entries highlighted with ****** have additional numbered caveats listed below the table.

Feature	Transcribe	Call Analytics	BDA Audio
Standard Call Characteristics
Job information	✅	✅	❌
Word-level confidence scores	✅	✅	❌
Word-level timings	✅	✅	✅
Sentiment Analysis
Call-level sentiment [1]	❌	✅	✅**
Speaker sentiment trend [1] [2]	Via Comprehend	✅**	✅**
Turn-level sentiment	Via Comprehend	✅	Via Comprehend
Turn-level sentiment scores	Via Comprehend	❌	Via Comprehend
Call Characteristics
Call issue detection [1] [3]	❌	✅**	✅**
Call non-talk ("silent") time	❌	✅	❌
Category detection	❌	✅	✅
Entity detection [1]	❌	❌	✅**
Generative call summarisation [3] [4]	❌	✅**	✅
Named speaker identification	❌	✅	❌
PII identification [3] [4]	✅**	✅**	✅
PII redaction [4]	✅**	✅**	✅
Speaker interruptions	❌	✅	❌
Speaker talk time	❌	✅	❌
Speaker volume	❌	✅	❌
Topic detection	❌	❌	✅
Toxicity detection	✅	❌	✅

[1] This feature in BDA Audio is only possibly via a custom blueprint

[2] Speaker sentiment trend in Transcribe Call Analytics only provides data points for each quarter of the call per speaker

[3] This feature in Transcribe or Transcribe Call Analytics is only available on synchronous streams

[4] This feature in Transcribe or Transcribe Call Analytics is not available in all supported languages, and may not be available in all Amazon Transcribe-supporting AWS Regions

Usage

Prerequisites

This application relies upon three external python libraries, which you will need to install onto the system that you wise to deploy this application to. They are as follows:

python-docx
scipy
matplotlib

These should all be installed using the relevant tool for your target platform - typically this would be via pip, the Python package manager, but could be via yum, apt-get or something else. Please consult your platform's Python documentation for more information.

Additionally, as the Python code will call APIs in Amazon Transcribe and, optionally, Amazon Comprehend, the target platform will need to have access to AWS access keys or an IAM role that gives access to the following API calls:

Amazon Transcribe - GetTranscriptionJob() and GetCallAnalyticsJob()
Amazon Comprehend - DetectSentiment()

Sample files

The repository contains the following sample files in the sample-data folder:

example-call.wav - an example two-channel call audio file
example-call.json - the result from Amazon Transcribe when the example audio file is processed in Call Analytics mode
example-call.docx - the output document generated by this application against a completed Amazon Transcribe Call Analytics job using the example audio file. The next section describes this file structure in more detail
example-call-redacted.wav - the example call with all PII redacted, which can be output by Call Analytics if you enable PII and request that results are delivered to your own S3 bucket

Mode-specific output

Transcribe & Call Analytics Bedrock Data Automation Audio

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
images		images
python		python
sample-data		sample-data
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README-bda.md		README-bda.md
README-transcribe.md		README-transcribe.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Convert ASR JSON To Word Document

Overview

Features

Usage

Prerequisites

Sample files

Mode-specific output

Security

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Convert ASR JSON To Word Document

Overview

Features

Usage

Prerequisites

Sample files

Mode-specific output

Security

License

About

Topics

Resources

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages