Speed up null fitting for categorical X's via Fisher scoring + line search #175
base: main
Hi @svteichman -- I think this is ready for your feedback and edits. A "wish list", not all essential
Please let me know your questions -- and thank you!!!!

Very exciting! I'll take a look and get started on integrating this into the codebase.

This PR is superseded by PR #179, and can be closed once that is reviewed and merged (although we may want to remember someday that this PR is the place where Amy derived the second derivative of the pseudohuber function!!)
The goal here is not a generally faster fitting routine, but major speedups for the common case of categorical covariates and the pseudo-Huber median g(.).
It turns out that #90 is impossible (there is no closed form for the pseudo-Huber median g(.)), but we can still speed things up.
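For readers who want the algebra behind the "no closed form" remark, here is a sketch. It assumes the pseudo-Huber median is defined as the minimizer of a sum of pseudo-Huber losses with smoothing parameter \delta. The symbols h_\delta, \delta, b_j, and c are illustrative and not taken from this PR, and the package's exact parameterization may differ.

```latex
\begin{align*}
h_\delta(x)   &= \delta^2\left(\sqrt{1 + (x/\delta)^2} - 1\right), \\
h_\delta'(x)  &= \frac{x}{\sqrt{1 + (x/\delta)^2}}, \\
h_\delta''(x) &= \left(1 + (x/\delta)^2\right)^{-3/2}, \\
g(b) &= \arg\min_{c} \sum_{j=1}^{J} h_\delta(b_j - c)
\quad\Longleftrightarrow\quad
\sum_{j=1}^{J} \frac{b_j - g(b)}{\sqrt{1 + \left((b_j - g(b))/\delta\right)^2}} = 0 .
\end{align*}
```

Under this assumed definition, g(b) is only given implicitly by the stationarity condition on the right, which in general cannot be solved for g(b) in closed form. Because h_\delta'' is strictly positive, the objective is strictly convex in c, so an iterative solver can still find g(b) reliably.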
I'm seeing perhaps a 1000x speedup (basically instantaneous fitting), but I haven't tested it broadly over a range of p and J. (n doesn't matter here.)
In the one case I investigated where estimates differ from fit_null, this approach had higher likelihood, and it strictly enforces the g constraints (i.e., constraint_tol is trivially 0).
What's it actually doing? Fisher scoring (in the free parameters) + a line search to ensure the likelihood is increasing at each step.
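To make the idea concrete, here is a minimal sketch of Fisher scoring with a step-halving line search. It is not the code in this PR: it uses a toy Poisson log-linear model with one binary covariate and no g constraint, and every name in it (`log_lik`, `fisher_direction`, `fit_fisher_line_search`) is hypothetical.

```python
import math


def log_lik(beta, x, y):
    """Poisson log-likelihood, up to a constant, for log(mu_i) = beta0 + beta1 * x_i."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = beta[0] + beta[1] * xi
        total += yi * eta - math.exp(eta)
    return total


def fisher_direction(beta, x, y):
    """Return the Fisher scoring direction I(beta)^{-1} U(beta) for the toy model."""
    u0 = u1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        mu = math.exp(beta[0] + beta[1] * xi)
        u0 += yi - mu            # score: X^T (y - mu)
        u1 += (yi - mu) * xi
        i00 += mu                # expected information: X^T diag(mu) X
        i01 += mu * xi
        i11 += mu * xi * xi
    det = i00 * i11 - i01 * i01
    # Explicit inverse of the 2x2 information matrix applied to the score.
    return ((i11 * u0 - i01 * u1) / det, (i00 * u1 - i01 * u0) / det)


def fit_fisher_line_search(x, y, max_iter=100, tol=1e-10):
    """Fisher scoring where each step is halved until the likelihood does not decrease."""
    beta = (0.0, 0.0)
    ll = log_lik(beta, x, y)
    history = [ll]
    for _ in range(max_iter):
        d = fisher_direction(beta, x, y)
        step = 1.0
        while True:
            cand = (beta[0] + step * d[0], beta[1] + step * d[1])
            cand_ll = log_lik(cand, x, y)
            if cand_ll >= ll or step < 1e-8:
                break
            step /= 2.0          # backtrack: the full step overshot
        if cand_ll < ll:
            break                # no non-decreasing step found; stop here
        converged = cand_ll - ll < tol
        beta, ll = cand, cand_ll
        history.append(ll)
        if converged:
            break
    return beta, history


# Toy data: two groups (x = 0 and x = 1), hypothetical counts.
x = [0, 0, 0, 1, 1, 1]
y = [2, 3, 4, 8, 10, 12]
beta_hat, ll_history = fit_fisher_line_search(x, y)
print(beta_hat, len(ll_history))
```

On this toy data the first full Fisher step overshoots badly, so the halving loop is actually exercised before the iteration settles down. The unconstrained toy model happens to have a closed-form answer (the log group means), which makes it easy to check the sketch; the constrained problem discussed in this PR does not.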
Lots of ChatGPT here, which is fine because I know what I want.
Not ready for review, but opening to give some updates to @svteichman
Next steps (AW) - taken from my notes in fit_null_discrete