Add glum support #13

Draft
s3alfisc wants to merge 3 commits into main from glum

s3alfisc commented Nov 23, 2025

Closes #9.

import pandas as pd
import sklearn.model_selection  # load the submodule explicitly so sklearn.model_selection resolves below
from sklearn.datasets import fetch_openml
from glum import GeneralizedLinearRegressor, GeneralizedLinearRegressorCV
import maketables as mt

house_data = fetch_openml(name="house_sales", version=3, as_frame=True)

# Use only select features
X = house_data.data[
    [
        "bedrooms",
        "bathrooms",
        "sqft_living",
        "floors",
        "waterfront",
        "view",
        "condition",
        "grade",
        "yr_built",
    ]
].copy()

# Targets
y = house_data.target

X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    X, y, test_size=0.3, random_state=5
)

glm = GeneralizedLinearRegressor(family="normal", alpha=0.1, l1_ratio=1)
glm.fit(X_train, y_train)
# Cluster-robust covariance, clustered on the number of bedrooms
glm.covariance_matrix(
    X=X_train,
    y=y_train,
    store_covariance_matrix=True,
    clusters=pd.factorize(X_train.bedrooms)[0],
)

mt.ETable(
    [glm],
    caption="GLM FIT via GLUM"
)
[Image: screenshot attached to the PR description]

s3alfisc marked this pull request as draft November 23, 2025 13:33
        return "y"

    def fixef_string(self, model: Any) -> str | None:
        """GLUM doesn't typically have fixed effects notation."""

Any categorical variable passed to Glum will be treated as fixed effects. The resulting column name will follow the formatting argument categorical_format.

I'm not sure if that changes anything here, since we don't have a fixed way to describe fixed effects.
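
To make the naming point concrete, here is a minimal sketch in plain Python (it does not call glum) of how coefficient names built from a format string could be mapped back to the categorical variable they came from. The format string `{name}[{category}]` is assumed for illustration, and `make_name` and `group_by_categorical` are hypothetical helpers, not part of glum or maketables.

```python
import re
from collections import defaultdict

# Assumed format, for illustration only; in glum the actual format is
# whatever was passed as `categorical_format`.
CATEGORICAL_FORMAT = "{name}[{category}]"


def make_name(name, category, fmt=CATEGORICAL_FORMAT):
    """Build a coefficient name for one level of a categorical variable."""
    return fmt.format(name=name, category=category)


def group_by_categorical(coef_names, fmt=CATEGORICAL_FORMAT):
    """Split coefficient names into categorical levels and other terms.

    Returns (levels, other): `levels` maps each categorical variable to
    the list of its levels; `other` lists names that do not match `fmt`.
    """
    # Turn the format string into a regex with two named groups.
    pattern = re.escape(fmt)
    pattern = pattern.replace(r"\{name\}", r"(?P<name>.+?)")
    pattern = pattern.replace(r"\{category\}", r"(?P<category>.+)")

    levels = defaultdict(list)
    other = []
    for coef in coef_names:
        match = re.fullmatch(pattern, coef)
        if match:
            levels[match["name"]].append(match["category"])
        else:
            other.append(coef)
    return dict(levels), other


names = ["sqft_living", make_name("bedrooms", 3), make_name("bedrooms", 4), "grade"]
levels, other = group_by_categorical(names)
print(levels)
print(other)
```

Something along these lines would only work if the format string is known at table-building time, which is part of why there is no fixed way to describe the fixed effects here.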

Member Author

This was poor phrasing on my end - the docstring should rather read "GLUM does not sweep out fixed effects during estimation and therefore fixed effects are (efficiently) estimated as 'regular' coefficients". This is in contrast to pyfixest, where we use FWL (Frisch-Waugh-Lovell) to sweep out the fixed effects and need to document this somehow in the regression table.
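
As a small illustration of the two approaches, the NumPy sketch below fits the same linear model twice on simulated data: once with the group dummies estimated as regular coefficients, and once after demeaning `x` and `y` within groups (the FWL step). It uses neither glum nor pyfixest; the data, the variable names, and the helper `demean_by_group` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_groups = 500, 5

# Balanced group labels, so every group is non-empty.
group = rng.permutation(np.arange(n) % n_groups)
group_effect = rng.normal(size=n_groups)
x = rng.normal(size=n) + 0.5 * group  # x is correlated with the groups
y = 2.0 * x + group_effect[group] + rng.normal(scale=0.1, size=n)

# (a) Fixed effects as regular coefficients: regress y on x plus one
# dummy per group (no separate intercept).
dummies = np.eye(n_groups)[group]
design = np.column_stack([x, dummies])
beta_dummies = np.linalg.lstsq(design, y, rcond=None)[0]


def demean_by_group(values, group, n_groups):
    """Subtract the within-group mean from each observation."""
    sums = np.bincount(group, weights=values, minlength=n_groups)
    counts = np.bincount(group, minlength=n_groups)
    return values - (sums / counts)[group]


# (b) FWL: sweep out the group dummies first, then regress the
# demeaned y on the demeaned x.
x_within = demean_by_group(x, group, n_groups)
y_within = demean_by_group(y, group, n_groups)
beta_fwl = (x_within @ y_within) / (x_within @ x_within)

# The slope on x agrees; only (a) also reports the group coefficients.
print(np.isclose(beta_dummies[0], beta_fwl))
```

In (a) the group coefficients show up in the coefficient vector and hence as rows in the table, while in (b) they never appear, which is why a table built from (b) needs a separate note about the fixed effects.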

Member Author

Tried to clarify it with 10102c8.

Member Author

And thanks for the feedback / taking a look!
