Classification experiments and logging #50

Open
wehrfabi wants to merge 53 commits into heidmic:main from wehrfabi:main

Conversation

@wehrfabi

No description provided.

@heidmic heidmic requested review from RomanSraj and heidmic March 12, 2025 09:15
Owner

@heidmic heidmic left a comment

A large number of changes is needed to reach an acceptable standard for merging.

.gitignore Outdated
@@ -1,7 +1,13 @@
# Requirements
requirements.txt
Owner

should not be ignored

.gitignore Outdated
**/mlruns/**
mlruns/**
mlruns*
**/onlineruns/**
Owner

remove this

.gitignore Outdated
*/output_json/**

# Tex
*.tex
Owner

remove

.gitignore Outdated
*.tex

# Jupyter
.ipynb_checkpoints/
Owner

remove

*/output/**

# Json
output_json/**
Owner

Check whether this can be removed.

Owner

what does this do?

Author

Creates a .json file of the logging results using suprb.json.dump (selected .json files are used in latex_rule.py).

Owner

What is this used for?

Author

Runs alternative classifiers on the datasets and produces a .csv of complexity results for the decision tree.

Author

uses RandomForestClassifier instead of RandomForestRegressor

Owner

There is far too much duplication in this folder; this should be a single script.

Owner

Keep the old version.

Collaborator

@RomanSraj RomanSraj left a comment

We shouldn't have so many different sbatches. My suggestion: have one sbatch file that accepts parameters, and a bash file listing all the different evaluations we want to run. That way we can comment in/out the ones we want to test and won't forget how we ran evaluations before. Currently these many files just pollute the repo.
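A minimal sketch of that layout (the file names `run_experiment.sbatch` and `run_all.sh`, the SBATCH resource values, the `classification.py` entry point, and the `--problem` flag are all assumptions for illustration, not files from this PR):

```
#!/bin/bash
# run_experiment.sbatch -- single parameterised sbatch file (assumed name).
# Usage: sbatch run_experiment.sbatch <script.py> <dataset>
#SBATCH --job-name=suprb-eval
#SBATCH --time=24:00:00
#SBATCH --mem=8G

python "$1" --problem "$2"
```

```
#!/bin/bash
# run_all.sh -- every evaluation listed once; comment lines in/out as needed.
sbatch run_experiment.sbatch classification.py iris
# sbatch run_experiment.sbatch classification.py wine
```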

"""Always use f1 and accuracy for evaluation on classification tasks."""

if scoring is None:
scoring = set()
Collaborator

This should probably generate a warning that no scoring has been chosen

scoring = set(scoring)
else:
scoring = {scoring}
scoring.update({'f1', 'accuracy'})
Collaborator

Wouldn't it be better to have f1 and accuracy as default values for scoring and give the user the option to add these together with whatever scoring they want as opposed to adding f1 and accuracy every time?

def check_scoring(scoring):
"""Always use R^2 and MSE for evaluation."""
def check_regression_scoring(scoring):
"""Always use R^2 and MSE for evaluation on regression tasks."""
Collaborator

This should probably generate a warning that no scoring has been chosen

Wouldn't it be better to have f1 and accuracy as default values for scoring and give the user the option to add these together with whatever scoring they want as opposed to adding f1 and accuracy every time?
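Both suggestions combined could look like this (a sketch only: `check_classification_scoring` mirrors the classification snippet under review, and the warning text is an assumption, not the PR's actual code):

```python
import warnings

def check_classification_scoring(scoring=None):
    """Normalise `scoring` into a set of metric names.

    Sketch of the reviewers' suggestions: warn when no scoring was
    chosen, and treat 'f1' and 'accuracy' as defaults that any
    user-supplied metrics are added to.
    """
    if scoring is None:
        # Warn instead of silently falling back to the defaults.
        warnings.warn("No scoring chosen; using only the defaults "
                      "'f1' and 'accuracy'.")
        scoring = set()
    elif isinstance(scoring, str):
        scoring = {scoring}
    else:
        scoring = set(scoring)
    scoring.update({'f1', 'accuracy'})
    return scoring
```

The same pattern would apply to the regression variant with 'r2' and 'neg_mean_squared_error' as the defaults.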

# Performs model swapping on a trained SupRB estimator and evaluates it
def __init__(
self,
dummy_estimator: BaseEstimator,
Collaborator

Why is this called dummy_estimator?

y_train, y_test = self.y[train_index], self.y[test_index]
estimator = self.trained_estimators[i]
estimator.model_swap_fit(self.local_model, X_train, y_train)
estimator.logger_ = DefaultLogger()
Collaborator

Shouldn't this be given when creating an evaluation and not be hardcoded to DefaultLogger?
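One way to follow this suggestion (a sketch under assumed names: `DefaultLogger` here is a stand-in for suprb's class, and `ModelSwapEvaluation` with its `logger_factory` parameter is hypothetical, not this PR's API):

```python
class DefaultLogger:
    """Stand-in for suprb's DefaultLogger, for illustration only."""

class ModelSwapEvaluation:
    """Sketch: accept a logger factory at construction time instead of
    hardcoding DefaultLogger inside the evaluation loop."""

    def __init__(self, trained_estimators, logger_factory=DefaultLogger):
        self.trained_estimators = trained_estimators
        self.logger_factory = logger_factory

    def attach_loggers(self):
        # Each estimator gets a fresh logger from the caller-chosen factory.
        for estimator in self.trained_estimators:
            estimator.logger_ = self.logger_factory()
```

A caller that wants a different logger passes its class (or any zero-argument callable) as `logger_factory`; the default behaviour stays the same as the hardcoded version.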

Comment on lines +84 to +86
#mixing = mixing_model.ErrorExperienceClassification()
#matching_type=rule.matching.BinaryBound()
#fitness = rule.fitness.PseudoAccuracy()
Collaborator

Remove

Comment on lines +128 to +131
#params.rule_generation__init__fitness = getattr(MeanInit, "fitness")()
#if isinstance(params.rule_generation__init__fitness, rule.fitness.VolumeWu):
# params.rule_generation__init__fitness__alpha = trial.suggest_float(
# 'rule_generation__init__fitness__alpha', 0.01, 0.2)
Collaborator

Remove

Comment on lines +147 to +149
#params.solution_composition__init__mixing__filter_subpopulation__rule_amount = 4
#params.solution_composition__init__mixing__experience_weight = 1.0
#params.solution_composition__init__mixing = mixing
Collaborator

Remove

Comment on lines +154 to +156
# Upper and lower bound clip the experience into a given range
# params.solution_composition__init__mixing__experience_calculation__lower_bound = trial.suggest_float(
# 'solution_composition__init__mixing__experience_calculation__lower_bound', 0, 10)
Collaborator

Remove

return getattr(datasets, method_name)(**kwargs)


#@click.command()
Collaborator

Why is click commented out here?

3 participants