How to include structural zeros? #152
Open
Labels: enhancement, question
Description
What's the preferred way to model structural zeros in a Formula?
Assume the following toy example: I have a table of counts

| | e | f |
|---|---|---|
| a | 1 | 0 |
| b | 2 | 3 |
| c | 4 | 0 |

given as a pandas DataFrame as follows:

    import pandas as pd
    from formulaic import Formula

    df = pd.DataFrame(
        data={
            'F1': ['a', 'a', 'b', 'b', 'c', 'c'],
            'F2': ['e', 'f', 'e', 'f', 'e', 'f'],
            'n':  [  1,   0,   2,   3,   4,   0],
        })

The combinations (a, f) and (c, f) are structural zeros. If I build the model matrices for n ~ C(F1):C(F2) on that data as follows

    y, X = Formula('n ~ C(F1):C(F2)').get_model_matrix(df, ensure_full_rank=False)

then the corresponding variables C(F1)[T.a]:C(F2)[T.f] and C(F1)[T.c]:C(F2)[T.f] are columns of X. Is there a way to remove these parameters already in the formula? Is there another concept in formulaic to deal with this type of constraint?
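For reference, this is the post-processing workaround I would like to avoid. It is only a sketch, not a formulaic feature. The helper `drop_structural_zeros` and the `structural_zeros` set are hypothetical names, and the indicator matrix is built with `pd.get_dummies` as a stand-in so the snippet runs without formulaic. I assume, but have not verified, that the same helper works on the `X` returned by `get_model_matrix`, since it only relies on DataFrame-style indexing and does not depend on column names.

```python
import pandas as pd

df = pd.DataFrame({
    "F1": ["a", "a", "b", "b", "c", "c"],
    "F2": ["e", "f", "e", "f", "e", "f"],
    "n": [1, 0, 2, 3, 4, 0],
})

# Hypothetical input: the cells known a priori to be impossible.
structural_zeros = {("a", "f"), ("c", "f")}


def drop_structural_zeros(df, X, zeros, factors=("F1", "F2")):
    """Drop the rows and indicator columns that belong to structural-zero cells.

    Returns the reduced matrix and the labels of the dropped columns.
    """
    # Mark the observations that fall into a structural-zero cell.
    is_zero_row = pd.Series(
        [cell in zeros for cell in zip(*(df[f] for f in factors))],
        index=df.index,
    )
    # A column is dropped only if it is nonzero in structural-zero rows
    # and nowhere else, so intercepts and main effects are kept.
    active_in_zero_rows = (X.loc[is_zero_row] != 0).any(axis=0)
    active_elsewhere = (X.loc[~is_zero_row] != 0).any(axis=0)
    mask = (active_in_zero_rows & ~active_elsewhere).to_numpy()
    drop_cols = X.columns[mask]
    return X.loc[~is_zero_row].drop(columns=drop_cols), drop_cols


# Stand-in for the interaction matrix: one indicator column per F1:F2 cell.
X = pd.get_dummies(df["F1"] + ":" + df["F2"]).astype(int)

X_reduced, dropped = drop_structural_zeros(df, X, structural_zeros)
print(list(dropped))
print(X_reduced.shape)
```

The response has to be subset to the same rows, e.g. `df.loc[X_reduced.index, "n"]`, before fitting. Having to do this by hand after the model matrix is built is exactly what I would prefer to express in the formula itself.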