Create automated tests to determine model performance.
- Test should measure performance via websocket to a running
node-server and whisper-service instance and via directly instantiation of model in python.
- Should measure latency, accuracy, and other relevant stats of returned transcriptions