First attempt at wrapping ABCpy - work-in-progress #4
LoryPack wants to merge 9 commits into sbi-benchmark:main
Conversation
Hi, first off, this is great, thanks a lot for the PR! To answer your questions:
To compare algorithms we mostly use 2-sample tests, e.g. C2ST. Let's say we run rejection ABC at a simulation budget of 1k simulations and use a quantile of 0.1, so we end up with 100 top samples. Typically, we compute C2STs against 10k reference posterior samples and would like to compute C2ST on a balanced dataset (i.e., containing as many reference as approximate posterior samples). We could either resample from the population of 100 samples to obtain 10k samples, or fit a KDE to obtain more samples -- in the manuscript's appendix, we compare both and found that doing a KDE fit slightly improves performance for some tasks. So it'd be great to have the KDE option here as well. (Side note: We also checked computing C2STs using 1k instead of 10k samples -- it did not make much of a difference in terms of overall results -- we opted for 10k to be on the safe side.)
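The resampling-vs-KDE choice described above can be sketched as follows. The sample arrays and sizes are hypothetical stand-ins, and SciPy's `gaussian_kde` is used here for illustration rather than the benchmark's own KDE code:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical stand-in: 100 accepted ABC samples of a 2-D parameter,
# to be inflated to 10k samples for a balanced C2ST.
abc_samples = rng.normal(size=(100, 2))
num_eval = 10_000

# Option 1: resample with replacement from the accepted population.
idx = rng.integers(0, len(abc_samples), size=num_eval)
resampled = abc_samples[idx]

# Option 2: fit a KDE to the accepted samples and draw new ones.
kde = gaussian_kde(abc_samples.T)        # gaussian_kde expects (dim, n)
kde_samples = kde.resample(num_eval).T   # back to (n, dim)

assert resampled.shape == kde_samples.shape == (num_eval, 2)
```

Either 10k set can then be compared against the 10k reference posterior samples with a 2-sample test.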
We only used quantiles / num_top_samples in the manuscript in the end, but I agree it would be nice to have in case it is easy to do.
This sounds very useful to me -- it would allow exploring whether splitting the budget to do multi-simulations per parameter is helpful on a given task.
SASS and LRA were for some experiments we report in the appendix, it would be very nice to have but I think it's totally fine if they are not supported in the first version. Let me know if more questions come up, or something was unclear in my explanations above. Best, Jan-Matthis
Hi Jan-Matthis, Thanks for your reply, that was very clear. I'll add the KDE, eps, and the multiple simulations per parameter value in the next iteration, and update once that is done. I'll skip the SASS and LRA for now. Thanks for your help.
Hello, I have finally had time to work on this (sorry for the long gap since my last update). As suggested, I have done the following:
Just as a quick remark, I noticed that some trials raise the following warning when selecting the kernel bandwidth for KDE with cross-validation:
Please let me know if there is anything you'd like to be changed in some way.
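The PR does not show the bandwidth-selection code, but a cross-validated bandwidth search of the kind described might look like the following sketch. The use of scikit-learn's `GridSearchCV` over `KernelDensity`, the data, and the grid are all assumptions for illustration; warnings of this kind often appear when the best bandwidth lands on the edge of the searched grid:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(42)
samples = rng.normal(size=(100, 2))  # hypothetical top ABC samples

# Cross-validate the KDE bandwidth by maximizing held-out log-likelihood.
grid = GridSearchCV(
    KernelDensity(kernel="gaussian"),
    {"bandwidth": np.logspace(-1, 1, 20)},
    cv=5,
)
grid.fit(samples)
print(grid.best_params_["bandwidth"])
```

If the selected bandwidth is the smallest or largest value in the grid, widening the grid is usually the first thing to try.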
jan-matthis
left a comment
Hi, thanks a lot, great progress!
I have done a quick first pass over the PR and left a few comments here and there. Perhaps @janfb can have a look as well?
Regarding the KDE error, could you post a short code snippet to reproduce it (ideally with a fixed random seed)? We'll have a closer look.
Thanks for that, I will work on incorporating the updates you suggested. Once this is OK for the simple Rejection ABC, I am thinking of wrapping some of the other ABC algorithms which we have in ABCpy, as well as the Synthetic Likelihood ones. The issue, however, is that for some ABC algorithms you cannot know a priori how many simulations they will require; I will therefore start by wrapping the ones which have a fixed simulation budget.
Remove random seed and small fix
janfb
left a comment
Looks great! I added a comment and a question.
| ) | ||
| journal_standard_ABC = sampler.sample( | ||
| [[np.array(observation)]], | ||
| n_samples=num_simulations // num_simulations_per_param, |
The `n_samples` is a fixed budget, right? There is in principle no way it can be exceeded, and if it were, we would get a `SimulationBudgetExceeded` exception because of the `max_calls` passed above, no?
Basically, in ABCpy Rejection ABC fixes a number of samples with distance from the observation below a given threshold, and simulates from the model until that number of samples is accepted. As a workaround to get a fixed simulation budget, I used an extremely large epsilon so that all simulations are accepted. The results are then post-processed and the ones with smaller distance are kept.
I realize it's not the cleanest implementation ever, but it should work here. As we rarely use Rejection ABC for practical purposes in ABCpy, we never improved the implementation to allow a fixed simulation budget.
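As a rough illustration of this large-epsilon workaround (not ABCpy's actual implementation; the prior, simulator, and budget below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta):
    # Hypothetical model: Gaussian observation with unknown mean.
    return rng.normal(theta, 1.0)

observation = 0.5
num_simulations = 1_000
quantile = 0.1

# "Accept everything" pass: with an effectively infinite epsilon every
# simulation is kept, so exactly num_simulations model calls are made.
thetas = rng.uniform(-5, 5, size=num_simulations)  # prior draws
sims = np.array([simulator(t) for t in thetas])
distances = np.abs(sims - observation)

# Post-processing: keep only the draws whose distance falls below the
# requested quantile, mimicking rejection ABC with a tuned epsilon.
eps = np.quantile(distances, quantile)
posterior_samples = thetas[distances <= eps]

print(len(posterior_samples))  # ~ quantile * num_simulations
```

This keeps the simulation budget fixed while still producing the top-quantile accepted samples.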
Hello,
I thought I'd open this pull request to keep track of my progress and ask for feedback (even if I don't think this is ready to merge yet).
I've done my first attempt at wrapping ABCpy. For now I've wrapped the Rejection ABC algorithm (I know you have that already, but it was the easiest one to do).
It seems to work (I've added a `try_ABCpy.py` file if you want to experiment with it), but I have made quite some simplifications as I did not know how to do some things. Precisely:

- `num_simulations` is the total number of simulations which you allow, and `num_samples` is the returned number of posterior samples; if I am not wrong, you generate those with KDE from the ABC samples (at least this is what happens in the pyABC wrap). In my wrap of ABCpy, I have for now run Rej-ABC for `num_simulations` times, and then simply returned as posterior samples the ones obtained when considering the quantile of distances defined by `quantile`, or the `num_top_samples` ones. I am therefore not returning the specified `num_samples`. Do you suggest using your KDE code in this case as well? I believe it should be easily doable; I just did not know what the precise aim of it was.
- `num_top_samples`, `quantile`, `eps`: I have not yet used the `eps` one, but will add code for that soon.

Please let me know what you think of my attempt.