Excellent work!
I find the data selection setting for the 665K instruction dataset of LLaVA 1.5 to be very challenging: LLaVA 1.5 appears to be a significantly underfit model, which makes the data selection less meaningful. Could you instead use a stronger pre-trained model and a larger data pool, so that data selection becomes more meaningful in this context?