@DomhnallBoyle

Hi, I created this pull request to add word log-likelihood scores to the output.

I've used this in other projects, and hopefully it will be useful to someone else too.

Thanks

@sourcery-ai sourcery-ai bot left a comment

PR Type: Enhancement

PR Summary: The pull request introduces an enhancement to the existing functionality by adding log-likelihood scores to the output of word and phone alignments. This change allows for a more detailed analysis of the alignment results by including the log-likelihood scores alongside the start and end times of phones within words.

Decision: Comment

📝 Type: 'Enhancement' - not supported yet.
  • Sourcery currently only approves 'Typo fix' PRs.
✅ Issue addressed: this change correctly addresses the issue or implements the desired feature.
No details provided.
✅ Small diff: the diff is small enough to approve with confidence.
No details provided.

General suggestions:

  • Consider adding error handling or validation for the new log-likelihood score extraction to prevent potential runtime errors due to unexpected data formats. This could improve the robustness of the feature and ensure consistent behavior across different datasets.
  • Given the addition of log-likelihood scores, it might be beneficial to update any related documentation or examples to demonstrate how to use and interpret these new scores. This could help users take full advantage of the new feature.

# Append this phone to the latest word (sub-)list
ph = lines[j].split()[2]
log_likelihood = float(lines[j].split()[3])

suggestion (llm): Reading the log-likelihood from lines[j].split()[3] without first checking that the split produced at least four fields will raise an IndexError if a line is ever not in the expected format, and float() will raise a ValueError if that field is not numeric. Consider validating the line before indexing into it.
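One way the check suggested above could look is sketched below. This is only an illustration, not the PR's code: the helper name `parse_phone_line` is hypothetical, and the only assumption about the line format is the one visible in the snippet, namely that the phone label is the third whitespace-separated field and the log-likelihood is the fourth.

```python
from typing import Optional, Tuple


def parse_phone_line(line: str) -> Optional[Tuple[str, float]]:
    """Return (phone, log_likelihood) for one alignment line, or None if malformed.

    Hypothetical helper. Assumes the phone is field index 2 and the
    log-likelihood is field index 3, as in the snippet under review.
    """
    fields = line.split()  # split once and reuse the result

    # Guard against short lines, which would otherwise raise IndexError.
    if len(fields) < 4:
        return None

    # Guard against a non-numeric score, which would otherwise raise ValueError.
    try:
        log_likelihood = float(fields[3])
    except ValueError:
        return None

    return fields[2], log_likelihood
```

The caller can then decide whether a `None` result should skip the line or abort with a clearer error message that names the offending line. Splitting once also avoids calling `lines[j].split()` twice per line.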
