[ENH]: Calculate F1 following the recommendations in Forman et al.

### Which feature do you want to include?

For extreme class-imbalance scenarios (1 to 5% positive samples), F1 should **not** be computed in each fold and then averaged. Instead, the true positives and false positives should be counted in each fold and summed across folds, and a single F1 score computed from the totals (the false negatives follow from the known number of positive samples). This avoids the bias introduced by averaging per-fold F1 scores, which can also be undefined if a test fold contains no positive samples.
Both estimates converge when the classes are balanced.

[Forman et al.](https://www.kdd.org/exploration_files/v12-1-p49-forman-sigkdd.pdf)

### How do you imagine this integrated in julearn?

Retain the true positives and false positives for each fold, and then compute a single F1 score from the totals across all folds.
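
To make the difference between the two estimates explicit, they can be written side by side. This is only a sketch: the fold index $k$, the number of folds $K$, and the per-fold counts $TP_k$, $FP_k$, $FN_k$ are notation introduced here, not taken from julearn or from the paper.

```math
F_1^{\text{pooled}} = \frac{2 \sum_{k=1}^{K} TP_k}{2 \sum_{k=1}^{K} TP_k + \sum_{k=1}^{K} FP_k + \sum_{k=1}^{K} FN_k},
\qquad
F_1^{\text{avg}} = \frac{1}{K} \sum_{k=1}^{K} \frac{2\, TP_k}{2\, TP_k + FP_k + FN_k}
```

The false negatives do not need to be stored separately, since $FN_k = P_k - TP_k$, where $P_k$ is the number of positive samples in test fold $k$. A single fold with $TP_k = FP_k = FN_k = 0$ makes $F_1^{\text{avg}}$ undefined, while $F_1^{\text{pooled}}$ stays defined as long as any fold contributes a nonzero count.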


### Do you have a sample code that implements this outside of julearn?

```shell

```
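
No sample code was attached, so the following is only a minimal pure-Python sketch of the idea, not julearn's API. The helper names (`fold_counts`, `f1_from_counts`, `pooled_f1`, `mean_fold_f1`) and the toy folds are hypothetical, and the positive class is assumed to be labelled `1`.

```python
import math
from typing import List, Sequence, Tuple

# One test fold, given as (y_true, y_pred). Hypothetical alias for this sketch.
Fold = Tuple[Sequence[int], Sequence[int]]


def fold_counts(y_true: Sequence[int], y_pred: Sequence[int],
                positive: int = 1) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) for a single test fold."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, fp, fn


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); NaN when all three counts are zero."""
    denom = 2 * tp + fp + fn
    return math.nan if denom == 0 else 2 * tp / denom


def pooled_f1(folds: List[Fold]) -> float:
    """Sum the counts over all folds, then compute one F1 score."""
    counts = [fold_counts(y_true, y_pred) for y_true, y_pred in folds]
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f1_from_counts(tp, fp, fn)


def mean_fold_f1(folds: List[Fold]) -> float:
    """Compute F1 in each fold and average (the approach to avoid)."""
    scores = [f1_from_counts(*fold_counts(y_true, y_pred))
              for y_true, y_pred in folds]
    return sum(scores) / len(scores)


# Toy folds, invented for illustration only.
folds: List[Fold] = [
    ([1, 0, 0, 0, 0], [1, 1, 0, 0, 0]),  # TP=1, FP=1, FN=0
    ([0, 0, 0, 0, 0], [0, 0, 0, 0, 0]),  # no positives, none predicted
    ([1, 0, 0, 0, 0], [0, 0, 0, 0, 0]),  # TP=0, FP=0, FN=1
]

print(pooled_f1(folds))     # 0.5
print(mean_fold_f1(folds))  # nan, because the second fold has no defined F1
```

In the toy data the second fold has no positive samples and no positive predictions, so its per-fold F1 is undefined and the average becomes NaN, whereas the pooled score is still computable from the summed counts.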

### Anything else to say?

_No response_

[ENH]: Calculate F1 following the recommendations in Forman et al. #289