Constrained Recall Objective has weird interaction with LightGBM early stopping criteria #11

@AndreFCruz

Description

  • When optimizing for recall (minimizing FNR), only label-positive samples contribute to the loss and its gradient;
  • However, passing a gradient of exactly zero for all label negatives leads to unexpected behavior in the GBM::Train function (see the reproducible example);
  • So, for now, we scale down the gradient of all label negatives by multiplying it by a tiny positive number: see label_negative_weight in ConstrainedRecallObjective::GetGradients;
    • This shouldn't be needed, but it seems to work around the issue for now with no unintended consequences, since the resulting gradient is very small.
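
The workaround described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual ConstrainedRecallObjective::GetGradients code: the function name GetGradientsSketch, the constant kLabelNegativeWeight and its value, and the use of a cross-entropy proxy loss are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for label_negative_weight; the value actually used
// in the project is not stated in this issue.
constexpr double kLabelNegativeWeight = 1e-6;

// Sketch only. Assumes a cross-entropy proxy loss on the raw scores.
// Label positives get the full gradient and hessian; label negatives get
// the same quantities scaled by a tiny positive weight instead of zero.
void GetGradientsSketch(const std::vector<double>& scores,
                        const std::vector<int>& labels,
                        std::vector<double>* gradients,
                        std::vector<double>* hessians) {
  assert(scores.size() == labels.size());
  gradients->assign(scores.size(), 0.0);
  hessians->assign(scores.size(), 0.0);
  for (std::size_t i = 0; i < scores.size(); ++i) {
    const double p = 1.0 / (1.0 + std::exp(-scores[i]));  // sigmoid(score)
    if (labels[i] == 1) {
      // Label positive: cross-entropy gradient (p - y) with y = 1.
      (*gradients)[i] = p - 1.0;
      (*hessians)[i] = p * (1.0 - p);
    } else {
      // Label negative: would be exactly zero for a pure recall objective;
      // here it is the cross-entropy gradient (p - 0) scaled way down.
      (*gradients)[i] = kLabelNegativeWeight * p;
      (*hessians)[i] = kLabelNegativeWeight * p * (1.0 - p);
    }
  }
}
```

Removing the else branch from this sketch leaves every label negative with a gradient and hessian of exactly zero, which corresponds to the setup in the reproducible example.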

Reproducible example

  1. Remove the else clause in ConstrainedRecallObjective::GetGradients. It handles label-negative samples and, in theory, should not be needed when optimizing for recall;
  2. Compile and run, then observe the "-inf split" messages, which can cause training to stop too early.

Metadata

Labels

  • S effort (T-shirt effort weighing: S)
  • bug (Something isn't working)
  • low priority (Nice to have but not crucial)
