JSON Transcript Format
tim-contexta edited this page Apr 16, 2019 · 23 revisions
Transcripts returned by the /transcripts/:transcriptId service have the following format:
{
"name": "02e0dd49e6dec6d360ef1718fe374ee7", // Internal file name
"overall_conf": "0.94", // Average confidence score*
"path": "02e0dd49e6dec6d360ef1718fe374ee7.wav", // Internal file name (with extension)
"word_count": "107", // Total number of words in the
// conversation
"SpeakerList": [ // An array of speakers found in
// the conversation
{
"spkid": "spk1" // Speaker id referenced by the
// speech segments
},
{
"spkid": "spk2"
}
],
"SegmentList": [ // An array of speech segments
// (sentences). Each segment
// should be punctuated after
// the last word.
{
"spkid": "spk1", // The speaker id for this speech
// segment.
"question": "False", // Attribute to signal if this
// speech segment contains a
// question. Segments with this
// attribute = True should be
// punctuated with a ? instead of
// a .
"words": [ // An array of words said in this
// speech segment
{
"text": "goedemorgen", // The text of the word.
"conf": "1.00", // Confidence score*
"dur": "0.75", // Duration of the word in seconds
"stime": "2.23" // Start time of the word in seconds
// since the beginning of the audio.
},
{
"text": "inpakt",
"conf": "0.50",
"dur": "0.45",
"stime": "3.10"
},
{
"text": "energie",
"conf": "0.77",
"dur": "0.39",
"stime": "3.55"
}
]
},
{
"spkid": "spk2",
"question": "True",
"words": [
{
"text": "goedemiddag",
"conf": "0.97",
"dur": "0.54",
"stime": "7.24"
},
{
"text": "waar",
"conf": "1.00",
"dur": "0.18",
"stime": "7.84"
}
]
}
]
}
*More info about confidence scores
A subscription to Contexta Analyzer will also make metadata fields visible.
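As a rough illustration of how a client might consume this format, the sketch below parses a trimmed copy of the sample above and rebuilds punctuated sentences per speaker. It assumes the service returns plain JSON without the explanatory `//` comments, and that all values arrive as strings (as in the sample), so `question` is compared against `"True"` and `conf` is converted with `float`. The helper names `render_segments` and `low_confidence_words`, and the 0.6 threshold, are hypothetical and not part of the service.

```python
import json

# Trimmed copy of the documented sample, without the // comments
# (assumption: the real response is plain JSON).
SAMPLE = """
{
  "overall_conf": "0.94",
  "SpeakerList": [{"spkid": "spk1"}, {"spkid": "spk2"}],
  "SegmentList": [
    {"spkid": "spk1", "question": "False", "words": [
      {"text": "goedemorgen", "conf": "1.00", "dur": "0.75", "stime": "2.23"},
      {"text": "inpakt", "conf": "0.50", "dur": "0.45", "stime": "3.10"},
      {"text": "energie", "conf": "0.77", "dur": "0.39", "stime": "3.55"}
    ]},
    {"spkid": "spk2", "question": "True", "words": [
      {"text": "goedemiddag", "conf": "0.97", "dur": "0.54", "stime": "7.24"},
      {"text": "waar", "conf": "1.00", "dur": "0.18", "stime": "7.84"}
    ]}
  ]
}
"""


def render_segments(transcript):
    """Return one 'speaker: sentence' line per speech segment."""
    lines = []
    for segment in transcript["SegmentList"]:
        text = " ".join(word["text"] for word in segment["words"])
        # "question" is the string "True" or "False", not a JSON boolean.
        mark = "?" if segment["question"] == "True" else "."
        lines.append(f'{segment["spkid"]}: {text}{mark}')
    return lines


def low_confidence_words(transcript, threshold=0.6):
    """Return the text of words whose confidence is below the threshold.

    The threshold is an illustrative value, not a documented default.
    """
    return [
        word["text"]
        for segment in transcript["SegmentList"]
        for word in segment["words"]
        if float(word["conf"]) < threshold  # scores are strings in the payload
    ]


transcript = json.loads(SAMPLE)
for line in render_segments(transcript):
    print(line)
print(low_confidence_words(transcript))
```

Because numeric fields are string-encoded, converting them at the point of use (as done for `conf` here) keeps the parsed structure identical to what the service returns.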