Instantiating a LinearMeasurement object appears to be very slow on GPU:
https://github.com/ryan112358/private-pgm/blob/57554e604733a52f6dd217bd67feef0aa7a7435b/src/mbi/marginal_loss.py#L74-L80
I tried removing the converter, and the code still seems to run fine and is actually faster. My guess is that eagerly issuing many small, separate transfers to GPU memory at construction time is inefficient.
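A rough way to check this guess is a micro-benchmark like the sketch below. It is not taken from the library: `transfer_each`, `transfer_once`, and `timed` are hypothetical helpers, the array count and size are arbitrary, and I am assuming the converter behaves roughly like `jnp.asarray` applied to each measurement. If JAX is not installed, the script falls back to NumPy so it still runs, but the timings are then meaningless.

```python
import time

import numpy as np

try:
    import jax
    import jax.numpy as jnp

    to_device = jnp.asarray           # copies a host array to the default device
    wait = jax.block_until_ready      # JAX dispatch is async; wait before stopping the clock
except (ImportError, AttributeError):
    to_device = np.asarray            # fallback only so the sketch runs without JAX
    wait = lambda x: x


def transfer_each(host_arrays):
    """One conversion per array (assumed to mimic a per-object converter)."""
    return [to_device(a) for a in host_arrays]


def transfer_once(host_arrays):
    """Concatenate on the host first, then convert a single time."""
    return to_device(np.concatenate(host_arrays))


def timed(fn, host_arrays):
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    out = fn(host_arrays)
    wait(out)
    return out, time.perf_counter() - start


# Arbitrary stand-in for many small measurement vectors.
rng = np.random.default_rng(0)
host_arrays = [rng.normal(size=16).astype(np.float32) for _ in range(1000)]

wait(to_device(host_arrays[0]))  # warm-up so backend start-up is not timed

each, t_each = timed(transfer_each, host_arrays)
once, t_once = timed(transfer_once, host_arrays)

print(f"one call per array: {t_each:.4f} s")
print(f"single call:        {t_once:.4f} s")
```

If the per-array variant is much slower on a GPU machine, that would support the idea that the cost comes from the number of transfers and not from the amount of data. On a CPU-only backend any gap mostly reflects per-call dispatch overhead.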