Add logic to convert to srt #1

Utsal20 · 2020-01-23T05:22:09Z

Things left to do:

Modify logic to read and write to/from an AWS bucket
Make the implementation more robust
Possibly add tests

ramnanib2 · 2020-01-29T08:30:55Z

transcribe.py

 		logger.info("Starting Transcription of Video File: %s" % video_file)
 		comment = transcribe_video_file(video_file)
 		output[video_file] = comment
+		convert_transcribe_to_srt(video_file)


It would be nice to do the post-processing of comments on a best-effort basis inside a try/except block, so that if the post-processing of a single transcript fails we can still move on and complete the rest.
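A minimal sketch of what this best-effort loop might look like. The helper name `convert_all_best_effort` and its `convert` parameter are hypothetical; in the PR the callable would presumably be `convert_transcribe_to_srt`.

```python
import logging

logger = logging.getLogger(__name__)


def convert_all_best_effort(video_files, convert):
    """Run `convert` on every file; log failures and keep going.

    `convert` is any callable taking a video file name (hypothetical
    stand-in for convert_transcribe_to_srt). Returns the files that failed.
    """
    failed = []
    for video_file in video_files:
        try:
            convert(video_file)
        except Exception:
            # logger.exception records the traceback for later debugging.
            logger.exception("Conversion to srt failed for video file: %s", video_file)
            failed.append(video_file)
    return failed
```

Returning the list of failed files lets the caller report or retry them after the whole batch has run.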

ramnanib2 · 2020-01-29T08:32:44Z

transcribe.py

+			else:
+				end = format_time(items[len(items)-1]['end_time'])
+
+		with open(transcript_file_name_from_video_file_name(video_file).replace('.json', '.srt'), 'w', encoding='utf-8') as f:


Ideally we'd like to create one new srt file and one new txt file containing the top-ranked transcript.
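One possible shape for this, as a sketch only: derive both output names from the transcript JSON path and write the two files side by side. `output_paths` and `write_outputs` are hypothetical helpers, and `os.path.splitext` is swapped in for the `.replace('.json', '.srt')` call so that a `.json` substring elsewhere in the path is left alone.

```python
import os


def output_paths(transcript_json_path):
    """Return (srt_path, txt_path) next to the transcript JSON file."""
    base, _ext = os.path.splitext(transcript_json_path)
    return base + ".srt", base + ".txt"


def write_outputs(transcript_json_path, srt_text, plain_text):
    """Write the srt and txt renderings of the same transcript."""
    srt_path, txt_path = output_paths(transcript_json_path)
    with open(srt_path, "w", encoding="utf-8") as f:
        f.write(srt_text)
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write(plain_text)
    return srt_path, txt_path
```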

ramnanib2 · 2020-01-29T08:33:50Z

transcribe.py

+	logger.info("Conversion to srt started for video file: %s" % video_file)
+	with open(transcript_file_name_from_video_file_name(video_file), encoding='utf-8') as f:
+		raw = json.load(f)
+		items = raw['results']['items']


If `results` is null or empty, or `items` is null or empty, log that and move on.
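A sketch of such a guard, assuming the JSON layout the diff implies (`raw['results']['items']`). `load_items` is a hypothetical helper, not part of the PR; it returns an empty list so the caller can skip the file.

```python
import json
import logging

logger = logging.getLogger(__name__)


def load_items(transcript_path):
    """Return the transcript's items, or [] if results/items are missing or empty."""
    with open(transcript_path, encoding="utf-8") as f:
        raw = json.load(f)

    results = raw.get("results") if isinstance(raw, dict) else None
    if not results:
        logger.warning("No results in transcript, skipping: %s", transcript_path)
        return []

    items = results.get("items")
    if not items:
        logger.warning("No items in transcript, skipping: %s", transcript_path)
        return []

    return items
```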

ramnanib2 · 2020-01-29T08:40:11Z

transcribe.py

+			start = format_time(current)
+			if token['type'] == 'punctuation':
+				next_line = next_line[0:-1] + token['alternatives'][0]['content']
+				end = format_time(items[counter - 1]['end_time'])


If the punctuation is at the beginning of the transcript, won't this throw an index-out-of-bounds exception?
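A small illustration of the edge case, assuming `counter` is the zero-based index of the current token in `items` (the diff does not show how it is maintained). In Python, index `-1` wraps around to the last element instead of raising, so the more likely symptom is a wrong end time, or a `KeyError` if that last token has no `end_time`. The token dictionaries and `previous_end_time` below are made up for the example.

```python
items = [
    {"type": "punctuation", "alternatives": [{"content": ","}]},
    {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
     "alternatives": [{"content": "hello"}]},
]
counter = 0      # the first token is punctuation
next_line = ""   # nothing has been accumulated yet

# Slicing an empty string is safe and yields an empty string.
trimmed = next_line[0:-1]

# Index -1 does not raise; it silently selects the LAST element.
previous = items[counter - 1]


def previous_end_time(items, counter):
    """Return the end_time of the token before `counter`, or None if unavailable."""
    if counter == 0:
        return None
    return items[counter - 1].get("end_time")
```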

ramnanib2 · 2020-01-29T08:48:10Z

transcribe.py

+				next_line = token['alternatives'][0]['content'] + ' '
+				current = float(token['start_time'])
+			else:
+				next_line += token['alternatives'][0]['content'] + ' '


It seems like any token with end_time - start_time > 5.0 will be written to the srt file. However, I don't see how a sequence of tokens with smaller individual time spans gets strung together into a single sentence?
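One way the grouping could work, as a sketch rather than the PR's actual logic: accumulate words into a cue and close the cue once the next word starts 5 seconds or more after the cue began. The token layout (`type`, `start_time`, `end_time`, `alternatives[0]['content']`) is taken from the diff; `group_into_cues`, `to_srt`, and this version of `format_time` are assumptions.

```python
def format_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(float(seconds) * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, secs, millis)


def group_into_cues(items, max_span=5.0):
    """Group tokens into (start, end, text) cues of roughly max_span seconds."""
    cues = []
    words, start, end = [], None, None
    for token in items:
        content = token["alternatives"][0]["content"]
        if token["type"] == "punctuation":
            if words:
                words[-1] += content  # attach to the previous word
            continue                  # leading punctuation is dropped
        token_start = float(token["start_time"])
        if words and token_start - start >= max_span:
            cues.append((start, end, " ".join(words)))
            words = []
        if not words:
            start = token_start
        words.append(content)
        end = float(token["end_time"])
    if words:
        cues.append((start, end, " ".join(words)))
    return cues


def to_srt(cues):
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, 1):
        blocks.append("%d\n%s --> %s\n%s\n"
                      % (number, format_time(start), format_time(end), text))
    return "\n".join(blocks)
```

Because punctuation is attached to the previous word only when one exists, this version also sidesteps the leading-punctuation case without indexing `items` by position.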

ramnanib2

Great work! Thanks for making the changes. Some minor comments. I think the logic within the items loop can be simplified a little.

add logic to convert to srt

9a7b1f6

Utsal20 requested a review from ramnanib2 January 23, 2020 05:22

ramnanib2 reviewed Jan 29, 2020
