GLM to SQL

What's nice about logistic regression is that the scoring can be done in a simple SQL query: it's just a weighted sum of the features plus an intercept, transformed with a sigmoid. But how do you persuade RapidMiner to generate the SQL?

Assuming that your model is generated with the Generalized Linear Model operator, pass the trained model to an Execute Script operator and paste the contents of glm2sql.java into it. The script generates the core of the SQL, which may look like:

select "ID"
      , 1/(1 + exp(-(-7.2194 + "ADRESS_COUNT" + "ADRESS_TYPE_LAST = temporal"))) as "PREDICTED_PROBABILITY"
from (
  select "ID"
        , 0.0688 * "ADRESS_COUNT" as "ADRESS_COUNT"  -- Example of a numeric feature
        , case when "ADRESS_TYPE_LAST" = 'temporal' then 0.7613 else 0 end as "ADRESS_TYPE_LAST = temporal"  -- Example of a nominal feature
  from "MAINSAMPLE"
) t1
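
To sanity-check the generated query, the same arithmetic can be reproduced outside the database for a single row. The following is only a minimal Java sketch and is not part of glm2sql.java: the class and method names are hypothetical, and the coefficients are simply copied from the example query above.

```java
public class ScoreCheck {
    // Coefficients copied from the example query above.
    static final double INTERCEPT = -7.2194;
    static final double ADRESS_COUNT_WEIGHT = 0.0688;
    static final double ADRESS_TYPE_LAST_TEMPORAL_WEIGHT = 0.7613;

    // Logistic (sigmoid) function, the inverse of the logit link.
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Mirrors the SQL: the inner select computes one contribution per feature,
    // the outer select adds the intercept and applies the sigmoid.
    static double predictedProbability(double adressCount, String adressTypeLast) {
        double numericTerm = ADRESS_COUNT_WEIGHT * adressCount;
        // Like the SQL "case when", any other value (including null) contributes 0.
        double nominalTerm = "temporal".equals(adressTypeLast)
                ? ADRESS_TYPE_LAST_TEMPORAL_WEIGHT
                : 0.0;
        return sigmoid(INTERCEPT + numericTerm + nominalTerm);
    }

    public static void main(String[] args) {
        // Two hypothetical rows that differ only in the nominal feature.
        System.out.println(predictedProbability(3, "temporal"));
        System.out.println(predictedProbability(3, "permanent"));
    }
}
```

Comparing these values with the "PREDICTED_PROBABILITY" column for the same inputs should reveal any mismatch in coefficients or quoting.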

Supported features

  1. Linear and logistic regression (with the logit link)
  2. Numerical and nominal features
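
How the two model types and the two feature types map to SQL can be illustrated with a small sketch. This is not the code in glm2sql.java: the class and method names below are hypothetical, and for brevity it builds one flat expression instead of a nested query. It assumes that a numeric feature becomes `coefficient * column`, a nominal level becomes a `case when` term, and that linear regression uses the plain sum while logistic regression wraps it in a sigmoid.

```java
import java.util.Arrays;
import java.util.List;

public class GlmSqlSketch {
    // Quote an SQL identifier, doubling any embedded double quotes.
    static String quoteIdent(String name) {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    // Numeric feature: the coefficient multiplies the column value.
    static String numericTerm(String column, double coefficient) {
        return coefficient + " * " + quoteIdent(column);
    }

    // Nominal feature: the coefficient applies only when the column equals the level.
    static String nominalTerm(String column, String level, double coefficient) {
        String literal = "'" + level.replace("'", "''") + "'";
        return "case when " + quoteIdent(column) + " = " + literal
                + " then " + coefficient + " else 0 end";
    }

    // Linear regression returns the sum itself (identity link);
    // logistic regression wraps the sum in a sigmoid (inverse of the logit link).
    static String scoreExpression(double intercept, List<String> terms, boolean logistic) {
        StringBuilder linear = new StringBuilder(Double.toString(intercept));
        for (String term : terms) {
            linear.append(" + ").append(term);
        }
        return logistic ? "1/(1 + exp(-(" + linear + ")))" : linear.toString();
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList(
                numericTerm("ADRESS_COUNT", 0.0688),
                nominalTerm("ADRESS_TYPE_LAST", "temporal", 0.7613));
        System.out.println(scoreExpression(-7.2194, terms, true));   // logistic
        System.out.println(scoreExpression(-7.2194, terms, false));  // linear
    }
}
```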

GLM to Table

If you want to extract the coefficients and store them in the database (e.g. for a dashboard), use glm2table.java, which returns an ExampleSet.

Contribution

Pull requests are welcome.