Description
This diarization doesn't compare favorably with Whisper. I'm wondering whether I'm missing something in the call parameters or elsewhere.
On this two-minute video (https://www.youtube.com/watch?v=xbyEs7DJshw&ab_channel=HipronarySchool%23Callcenter),
speechbox finds only 14 speaker transitions, as shown below (versus somewhere between 37 and 45). The code used is pretty trivial but is attached for context.
--
speaker text
0 SPEAKER_00 Hello, can you take a picture for a spot in h...
1 SPEAKER_01 How don't I have any?
2 SPEAKER_00 Yes, it's ATK 0804949. Okay, just let me veri...
3 SPEAKER_01 You did? For the ninth time, the only is not ...
4 SPEAKER_00 Okay, sir. I totally understand your situatio...
5 SPEAKER_01 Okay, yeah, yeah, we usually then our brother...
6 SPEAKER_00 Could you take a look design the boss and ver...
7 SPEAKER_01 But they're not a cable in the dogs. Okay, le...
8 SPEAKER_00 Sores are my mistake current in system, the e...
9 SPEAKER_01 Okay, but if you have a lot of trouble going ...
10 SPEAKER_00 If you want anything already wrong from us, w...
11 SPEAKER_01 No, I don't need different deals, guys. Thank...
12 SPEAKER_00 For doing your doctor, I will just cut your f...
13 SPEAKER_01 Yeah, yeah, I'll write right now.
14 SPEAKER_00 How night is?
timestamp
0 (0.0, 13.2)
1 (13.2, 14.7)
2 (14.7, 53.0)
3 (53.0, 57.0)
4 (57.0, 66.8)
5 (66.8, 71.16)
6 (72.0, 76.8)
7 (77.28, 80.72)
8 (82.56, 101.76)
9 (101.76, 107.0)
10 (107.0, 111.0)
11 (111.0, 114.0)
12 (114.0, 117.0)
13 (117.0, 119.0)
14 (119.0, 120.0)
sharpenspeechbrain.py.zip
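For anyone trying to reproduce the count, here is a minimal sketch that recomputes the number of speaker transitions from the table above and flags unusually long segments, which is where missed turns are most likely hiding. The `segments` list is copied by hand from the output above; the helper names and the 15-second threshold are my own assumptions, not part of speechbox.

```python
# Segments transcribed from the output above: (speaker, start_s, end_s).
segments = [
    ("SPEAKER_00", 0.0, 13.2),
    ("SPEAKER_01", 13.2, 14.7),
    ("SPEAKER_00", 14.7, 53.0),
    ("SPEAKER_01", 53.0, 57.0),
    ("SPEAKER_00", 57.0, 66.8),
    ("SPEAKER_01", 66.8, 71.16),
    ("SPEAKER_00", 72.0, 76.8),
    ("SPEAKER_01", 77.28, 80.72),
    ("SPEAKER_00", 82.56, 101.76),
    ("SPEAKER_01", 101.76, 107.0),
    ("SPEAKER_00", 107.0, 111.0),
    ("SPEAKER_01", 111.0, 114.0),
    ("SPEAKER_00", 114.0, 117.0),
    ("SPEAKER_01", 117.0, 119.0),
    ("SPEAKER_00", 119.0, 120.0),
]


def count_transitions(segs):
    """Count adjacent segment pairs whose speaker label differs."""
    return sum(1 for prev, cur in zip(segs, segs[1:]) if prev[0] != cur[0])


def long_segments(segs, min_duration=15.0):
    """Return (index, duration) for segments at least min_duration seconds long.

    The threshold is an arbitrary, hypothetical cutoff for this example.
    """
    return [
        (i, round(end - start, 2))
        for i, (_, start, end) in enumerate(segs)
        if end - start >= min_duration
    ]


print(count_transitions(segments))  # 14
print(long_segments(segments))      # [(2, 38.3), (8, 19.2)]
```

Segments 2 and 8 together cover almost a minute of a two-minute call, so if the real conversation has 37 to 45 turns, most of the missing ones are probably merged into those two spans.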