Are there steps I need to take to test this out on novel input? Say I have videos of my own that I would like to generate summaries for.