problem "fit" with current relation #1339
Unanswered
lepelletieralexandre
asked this question in
Potential bug
Can you please tell me which version you are using? I am unable to reproduce your error on the latest branch.

from verticapy import vDataFrame
from verticapy.machine_learning.vertica.linear_model import LogisticRegression
from verticapy._utils._sql._sys import _executeSQL

# Create sample data
df = vDataFrame({
    "survived": [0, 1, 1, 0, 1],
    "age": [22, 38, 26, 35, 28],
    "fare": [7.25, 71.28, 7.92, 53.1, 8.05]
})

# Push to Vertica DB with a long name that includes "SELECTION"
long_table_name = "TVUL_STAT_titanic_CHAID_INT_SANS_NA_SELECTION_OK_DEMO_2"
df.to_db(long_table_name)

# Define the Logistic Regression model
model = LogisticRegression("titanic_model_selection_bug_2")

# Fit the model (this would also trigger the same splitting internally)
try:
    model.fit(input_relation=df, X=["age", "fare"], y="survived")
except Exception as e:
    print("Expected error during fit():")
    print(e)
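
For reference, one way to report the installed version when answering the question above is to read it from the package metadata. This is a minimal sketch using only the Python standard library; the helper name `installed_version` is made up for illustration and is not part of VerticaPy.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or a marker if it is absent.

    Hypothetical helper for illustration; not a VerticaPy function.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


# Print the version string to paste into the bug report.
print("verticapy:", installed_version("verticapy"))
```

Including this string in the report makes it easier to tell whether the behavior differs between a released version and the latest branch.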
-
Hello,
There is a bug when I use the "fit" method with the "LogisticRegression" model.
I get this error:
MissingRelation: Severity: ERROR, Message: Relation "SCHEMA.TVUL_STAT_titanic_CHAID_INT_SANS_NA_ SELECT ION_OK_DECOUP_MAIN_TRAIN_TEST_train_DUMMIES_MODELE_titanic_test" does not exist, Sqlstate: 42V01, Routine: throwRelationDoesNotExist,
Two spaces have been added around "SELECT" inside the word "SELECTION", one before it and one after it, so the relation name no longer matches the real table.
I found where the problem comes from.
In the "fit" method here: https://github.com/vertica/VerticaPy/blob/master/verticapy/machine_learning/vertica/base.py , line 2354, you use "self.input_relation = input_relation.current_relation()". But, by default, the "reindent" parameter of the "current_relation" function is True.
At line 432 here -> https://github.com/vertica/VerticaPy/blob/master/verticapy/core/vdataframe/_sys.py#L432 , you call the "indent_vpy_sql" function, which is defined at line 1219 here -> https://github.com/vertica/VerticaPy/blob/master/verticapy/_utils/_sql/_format.py , where spaces are added around the word "SELECT".
I think that when the "fit" function is used, the "reindent" parameter of "current_relation" needs to be False. More generally, every function in the code that calls "current_relation" to capture "input_relation" needs to pass False for the "reindent" parameter;
for example, line 7837 for class Unsupervised(VerticaModel).
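
To illustrate the failure mode described above, here is a minimal sketch. It is not VerticaPy's actual code: `naive_indent` and `current_relation_sketch` are hypothetical stand-ins, and the keyword list is an assumption. It only shows how padding every occurrence of a SQL keyword can corrupt an identifier that happens to contain that keyword, and how skipping the reindent step leaves the name intact.

```python
# Hypothetical keyword list; the real formatter may handle other keywords.
KEYWORDS = ("SELECT", "FROM", "WHERE")


def naive_indent(sql: str) -> str:
    """Pad every keyword occurrence with spaces, even inside identifiers."""
    for keyword in KEYWORDS:
        sql = sql.replace(keyword, f" {keyword} ")
    return sql


def current_relation_sketch(relation: str, reindent: bool = True) -> str:
    """Toy stand-in for a method that optionally reformats the relation text."""
    return naive_indent(relation) if reindent else relation


name = "TVUL_STAT_titanic_CHAID_INT_SANS_NA_SELECTION_OK"

# With reindent enabled, "SELECTION" is split the same way as in the error.
print(current_relation_sketch(name))

# With reindent disabled, the relation name is returned unchanged.
print(current_relation_sketch(name, reindent=False))
```

In this sketch the first call yields a name containing "NA_ SELECT ION_OK", which matches the pattern in the error message, while the second call returns the original name.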
Can you fix this problem?
Best regards,
Alexandre