Added binomial model #79
Conversation
Codecov Report ❌ Patch coverage is

```
@@            Coverage Diff             @@
##             main      #79       +/-   ##
===========================================
- Coverage   99.18%   72.42%   -26.76%
===========================================
  Files          11       20        +9
  Lines         369      486      +117
===========================================
- Hits          366      352       -14
- Misses          3      134      +131
```
src/binomial_model.jl
Outdated
```julia
function obj(x)
    mul!(Ax, A', x)
    # Stable computation of log(1 + exp(z)) - b * z
    return sum(@. (Ax > 0) * ((1 - b) * Ax + log(1 + exp(-Ax))) + (Ax <= 0) * (log(1 + exp(Ax)) - b * Ax))
end
```
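(For reference, the two branches above implement the usual stable softplus identity: for z > 0, log(1 + exp(z)) = z + log(1 + exp(-z)), hence log(1 + exp(z)) - b*z = (1 - b)*z + log(1 + exp(-z)). A standalone helper in that style, a sketch of the `softplus` the later hunks call, could be:)

```julia
# Numerically stable softplus: for z > 0, log(1 + exp(z)) = z + log1p(exp(-z)),
# which avoids overflow in exp(z); for z <= 0, log1p(exp(z)) is already safe.
softplus(z) = z > 0 ? z + log1p(exp(-z)) : log1p(exp(z))
```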
I’m a bit confused. If this is a nonlinear least-squares problem, the objective should be resid!(r, x), and the gradient should be the vector g that results from jactv!(g, x, r).
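(In NLS terms, the pieces would relate as follows; a sketch using the PR's `resid!`/`jactv!` names:)

```julia
# For a nonlinear least-squares model with residual F:
#   f(x)  = ½ ‖F(x)‖²
#   ∇f(x) = J(x)' * F(x),  with J the Jacobian of F
resid!(r, x)       # r .= F(x)
jactv!(g, x, r)    # g .= J(x)' * r, i.e. the gradient of f at x
```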
src/binomial_model.jl
Outdated
```julia
nlp_model, nls_model = binomial_model(A, b)
```

Return an instance of an `NLPModel` representing the binomial logistic regression problem, and an `NLSModel` representing the system of equations formed by its gradient.
I understand now, but it’s a bit confusing because the two models don’t represent the same problem. I suggest returning the NLPModel only. For that, you only need to implement obj and grad!. If I understand correctly, the current jacv! and jactv! are actually hprod! (the product of the Hessian with a vector), which is why they are the same.
It would be easy to write a generic wrapper to minimize
@saravkin I think what you want to do is the following:
```julia
function binomial_model(A, b)
    # preallocate storage, etc...
    function obj(x)
        # return sum(exp(...))
    end
    function grad!(g, x)
        # ... what you put in `resid!()`
        # fill `g` with the gradient at `x`
    end
    function hprod!(hv, x, v; obj_weight = 1)
        # ... what you put in `jacv!()`
        # fill `hv` with the product of the Hessian at `x` with `v`
    end
    return NLPModel(x0, obj, grad = grad!, hprod = hprod!, meta_args = (name = "Binomial",))
end
```

Your problem isn't a least-squares problem, so you only return the NLPModel.
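(A hypothetical usage sketch of that pattern; it assumes ManualNLPModels.jl for the `NLPModel` constructor and JSOSolvers.jl for the solve, neither of which is part of this PR:)

```julia
using ManualNLPModels, JSOSolvers

A = randn(5, 100)                # 5 features × 100 samples
b = Float64.(rand(0:1, 100))     # binary labels
nlp = binomial_model(A, b)       # NLPModel exposing obj, grad!, hprod!

stats = trunk(nlp)               # trunk only needs obj/grad!/hprod!
stats.solution, stats.objective
```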
Just updated the binomial model and added a test that shows it works as expected.
dpo left a comment
Many thanks! I made a number of suggestions to make your code more generic, so it works with more than just Float64; e.g., it should work in Float32 or Float128.
Also, there's no need to commit the Manifest.toml.
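(For instance, a quick check of the genericity these suggestions aim for might look like this sketch; it assumes the suggestions below are applied, and uses the real `obj` from NLPModels.jl:)

```julia
using NLPModels

A32 = rand(Float32, 5, 100)
b32 = Float32.(rand(0:1, 100))
nlp32 = binomial_model(A32, b32)

# With `similar`-based buffers and type-generic literals, everything
# should stay in Float32 instead of silently promoting to Float64:
obj(nlp32, zeros(Float32, 5)) isa Float32
```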
```
Minimize
    f(x) = sum( log(1 + exp(a_i' * x)) - b_i * (a_i' * x) )
where b_i ∈ {0, 1} and `A` is (m features) × (n samples).
Equivalently, z = A' * x (length n).
```
Not sure I understand this last sentence. There's no z in the description above.
```julia
where b_i ∈ {0, 1} and `A` is (m features) × (n samples).
Equivalently, z = A' * x (length n).
"""
function binomial_model(A, b)
```
Suggested change:
```diff
- function binomial_model(A, b)
+ function binomial_model(A::AbstractMatrix{T}, b::AbstractVector{T}) where T <: Real
```
```julia
eltype(A) <: Real || throw(ArgumentError("A must have a real element type"))
eltype(b) <: Real || throw(ArgumentError("b must have a real element type"))
```
Suggested change (the typed signature above already enforces these checks through dispatch):
```diff
- eltype(A) <: Real || throw(ArgumentError("A must have a real element type"))
- eltype(b) <: Real || throw(ArgumentError("b must have a real element type"))
```
```julia
Ax = zeros(Float64, n)       # z = A' * x
p = zeros(Float64, n)        # sigmoid(z)
w = zeros(Float64, n)        # p*(1-p)
tmp_n = zeros(Float64, n)    # sample-space buffer
tmp_v = zeros(Float64, n)    # sample-space buffer
```
Suggested change:
```diff
- Ax = zeros(Float64, n)       # z = A' * x
- p = zeros(Float64, n)        # sigmoid(z)
- w = zeros(Float64, n)        # p*(1-p)
- tmp_n = zeros(Float64, n)    # sample-space buffer
- tmp_v = zeros(Float64, n)    # sample-space buffer
+ Ax = similar(b)       # z = A' * x
+ p = similar(b)        # sigmoid(z)
+ w = similar(b)        # p*(1-p)
+ tmp_n = similar(b)    # sample-space buffer
+ tmp_v = similar(b)    # sample-space buffer
```
```julia
s = 0.0
@inbounds @simd for i in 1:n
    zi = Ax[i]
    s += softplus(zi) - Float64(b[i]) * zi
```
Suggested change:
```diff
- s = 0.0
+ s = zero(T)
  @inbounds @simd for i in 1:n
      zi = Ax[i]
-     s += softplus(zi) - Float64(b[i]) * zi
+     s += softplus(zi) - b[i] * zi
```
```julia
@inbounds @simd for i in 1:n
    p[i] = sigmoid(Ax[i])
    tmp_n[i] = p[i] - Float64(b[i])
```
Suggested change:
```diff
- tmp_n[i] = p[i] - Float64(b[i])
+ tmp_n[i] = p[i] - b[i]
```
```julia
@inbounds @simd for i in 1:n
    pi = sigmoid(Ax[i])
    w[i] = pi * (1.0 - pi)
```
Suggested change:
```diff
- w[i] = pi * (1.0 - pi)
+ w[i] = pi * (1 - pi)
```
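(The integer literal matters for type stability; a minimal illustration, assuming `pi` is, say, a Float32:)

```julia
pi32 = 0.5f0
typeof(pi32 * (1 - pi32))     # Float32: the literal 1 adapts to pi32's type
typeof(pi32 * (1.0 - pi32))   # Float64: the 1.0 literal forces promotion
```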
```julia
@. tmp_v *= w        # tmp_v .= w .* tmp_v
mul!(hv, A, tmp_v)   # hv = A * tmp_v

if obj_weight != 1.0
```
Suggested change:
```diff
- if obj_weight != 1.0
+ if obj_weight != 1
```
```julia
    return hv
end

x0 = zeros(Float64, m)
```
Suggested change:
```diff
- x0 = zeros(Float64, m)
+ x0 = fill!(similar(b, m), zero(T))
```
I think a basic binomial classifier is good to have. It is not a nonlinear least-squares problem, but it provides function, gradient, and Hessian.
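(A hypothetical derivative check for that claim; `obj`, `grad`, and `hprod` are the standard NLPModels.jl API, and `binomial_model` is this PR's constructor:)

```julia
using NLPModels, LinearAlgebra

A = randn(4, 50)
b = Float64.(rand(0:1, 50))
nlp = binomial_model(A, b)

x, v = randn(4), randn(4)
g  = grad(nlp, x)        # should equal A * (sigmoid.(A' * x) .- b)
hv = hprod(nlp, x, v)    # should equal A * (w .* (A' * v)), w = p .* (1 .- p)

# Directional finite-difference check on the gradient:
h = 1e-6
fd = (obj(nlp, x + h * v) - obj(nlp, x)) / h
isapprox(dot(g, v), fd; rtol = 1e-4)
```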