Classification experiments and logging #50
Conversation
heidmic left a comment:
A large number of changes is needed to reach an acceptable standard for merging.
.gitignore (outdated)

    @@ -1,7 +1,13 @@
    # Requirements
    requirements.txt
.gitignore (outdated)

    **/mlruns/**
    mlruns/**
    mlruns*
    **/onlineruns/**
.gitignore (outdated)

    */output_json/**

    # Tex
    *.tex
.gitignore (outdated)

    *.tex

    # Jupyter
    .ipynb_checkpoints/
    */output/**

    # Json
    output_json/**
Creates a .json file of the logging results using suprb.json.dump (selected .json files are used in latex_rule.py).
Runs alternative classifiers on the datasets and produces a .csv of complexity results for dt.
Effectively duplicates https://github.com/heidmic/suprb-experimentation/blob/main/runs/comparisons/rf.py.
Uses RandomForestClassifier instead of RandomForestRegressor.
slurm/class_tuning.sbatch (outdated)

Way too many duplications in this folder; this should be a single script.
RomanSraj left a comment:
We shouldn't have so many different sbatch files. My suggestion: keep a single sbatch file that accepts parameters, plus a bash file listing all the evaluations we want to run. That way we can comment the ones we want to test in or out, and we won't forget how we ran earlier evaluations. Currently, these many files just pollute the repo.
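A rough sketch of that layout, assuming SLURM; the file name, directives, and example script path are illustrative, not taken from this repo:

```shell
#!/bin/bash
# slurm/run.sbatch -- one parameterised sbatch replacing the per-experiment copies
#SBATCH --job-name=suprb-eval
#SBATCH --ntasks=1
#SBATCH --time=24:00:00

# Everything after `sbatch run.sbatch` is forwarded to the experiment script,
# e.g.: sbatch run.sbatch runs/comparisons/rf.py
srun python "$@"
```

A companion bash file could then hold one `sbatch run.sbatch ...` line per evaluation, with runs commented in or out rather than deleted, so the exact invocation of earlier evaluations is preserved in the repo.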
    """Always use f1 and accuracy for evaluation on classification tasks."""

    if scoring is None:
        scoring = set()
This should probably generate a warning that no scoring has been chosen
        scoring = set(scoring)
    else:
        scoring = {scoring}
    scoring.update({'f1', 'accuracy'})
Wouldn't it be better to have f1 and accuracy as default values for scoring and give the user the option to add these together with whatever scoring they want as opposed to adding f1 and accuracy every time?
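A minimal sketch combining both review suggestions (warn when no scoring is chosen, and treat f1/accuracy as defaults the caller can opt out of); the function name and the `include_defaults` flag are illustrative, not the actual SupRB API:

```python
import warnings


def check_classification_scoring(scoring=None, include_defaults=True):
    """Build the scoring set, with 'f1' and 'accuracy' as opt-out defaults."""
    defaults = {'f1', 'accuracy'}
    if scoring is None:
        # Review suggestion: make the silent fallback explicit.
        warnings.warn("No scoring chosen; using only the defaults {'f1', 'accuracy'}.")
        return set(defaults)
    scoring = {scoring} if isinstance(scoring, str) else set(scoring)
    if include_defaults:
        # Caller-supplied metrics extend the defaults instead of replacing them.
        scoring |= defaults
    return scoring
```

Callers who really do not want f1/accuracy appended every time can pass `include_defaults=False`, which addresses the second comment without changing the default behaviour of the diff.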
    - def check_scoring(scoring):
    -     """Always use R^2 and MSE for evaluation."""
    + def check_regression_scoring(scoring):
    +     """Always use R^2 and MSE for evaluation on regression tasks."""
    # Performs model swapping on a trained SupRB estimator and evaluates it
    def __init__(
        self,
        dummy_estimator: BaseEstimator,
Why is this called dummy_estimator?
    y_train, y_test = self.y[train_index], self.y[test_index]
    estimator = self.trained_estimators[i]
    estimator.model_swap_fit(self.local_model, X_train, y_train)
    estimator.logger_ = DefaultLogger()
Shouldn't this be given when creating an evaluation and not be hardcoded to DefaultLogger?
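One way to address this, sketched with illustrative names (only `DefaultLogger` and `logger_` come from the diff): inject a logger factory through the evaluation's constructor and instantiate it per estimator, instead of hardcoding `DefaultLogger()` inside the loop.

```python
class DefaultLogger:
    """Stand-in for the project's DefaultLogger; illustrative only."""


class SwapEvaluation:
    """Hypothetical evaluation wrapper showing logger injection."""

    def __init__(self, trained_estimators, logger_factory=DefaultLogger):
        # logger_factory is any zero-argument callable returning a logger,
        # so callers choose the logger when creating the evaluation.
        self.trained_estimators = trained_estimators
        self.logger_factory = logger_factory

    def attach_loggers(self):
        for estimator in self.trained_estimators:
            # Fresh logger per estimator, replacing the hardcoded
            # `estimator.logger_ = DefaultLogger()` from the diff.
            estimator.logger_ = self.logger_factory()
```

With this shape, `SwapEvaluation(estimators, logger_factory=MlflowLogger)` (or any other logger class) works without touching the evaluation code.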
    #mixing = mixing_model.ErrorExperienceClassification()
    #matching_type = rule.matching.BinaryBound()
    #fitness = rule.fitness.PseudoAccuracy()

    #params.rule_generation__init__fitness = getattr(MeanInit, "fitness")()
    #if isinstance(params.rule_generation__init__fitness, rule.fitness.VolumeWu):
    #    params.rule_generation__init__fitness__alpha = trial.suggest_float(
    #        'rule_generation__init__fitness__alpha', 0.01, 0.2)

    #params.solution_composition__init__mixing__filter_subpopulation__rule_amount = 4
    #params.solution_composition__init__mixing__experience_weight = 1.0
    #params.solution_composition__init__mixing = mixing

    # Upper and lower bound clip the experience into a given range
    # params.solution_composition__init__mixing__experience_calculation__lower_bound = trial.suggest_float(
    #     'solution_composition__init__mixing__experience_calculation__lower_bound', 0, 10)
    return getattr(datasets, method_name)(**kwargs)

    #@click.command()
Why is click commented out here?