This repository was archived by the owner on Oct 24, 2022. It is now read-only.

Create ML model to eliminate false positives and increase accuracy #3

@pdparchitect

No matter how well we tune the regular expressions, the method will always be subject to false positives. One effective way to reduce the noise is to use an ML model for filtering (not detection).
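To make the filtering (not detection) idea concrete, here is a minimal sketch of a post-filter stage. The match shape, `filterMatches`, `placeholderScorer`, and the 0.5 threshold are all hypothetical names and values chosen for illustration; the project's actual detector output may look different, and the scorer stands in for a trained model.

```javascript
// Assumed shape of a detector match (hypothetical):
//   { detector: 'generic-secret', value: '...' }

// Keep only the matches that the scorer does not consider likely false positives.
// `scoreFalsePositive` is any function that returns a number in [0, 1].
function filterMatches(matches, scoreFalsePositive, threshold = 0.5) {
  return matches.filter((match) => scoreFalsePositive(match) < threshold);
}

// Stand-in scorer for illustration only: it treats a value made of a single
// repeated character as noise. A trained model would replace this function.
function placeholderScorer(match) {
  return new Set(match.value).size <= 1 ? 1 : 0;
}

const matches = [
  { detector: 'generic-secret', value: 'xxxxxxxxxxxxxxxx' },
  { detector: 'generic-secret', value: 'f3A9q1Zt7LmP0sVe' },
];

// The regular expressions still do the detection; the scorer only drops results.
const kept = filterMatches(matches, placeholderScorer);
```

Keeping the scorer as a plain function means the detectors stay unchanged and the model can be swapped or disabled per detector.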

To build the ML model, the following steps are required:

  1. Download a large body of content known to produce false positives (JS files and other source code).
  2. Run the current set of detectors to extract leaks (the generic secrets set is most suitable).
  3. Use brain.js or an equivalent framework to train a model to spot the false positives.
  4. Compile the model.
  5. Use the model to filter results from problematic detectors (again, the generic secrets set is most suitable).
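As a rough sketch of how steps 2 and 3 could connect, the snippet below turns extracted values into numeric training rows. The feature set (entropy, digit ratio, letter ratio, length) and the helper names `shannonEntropy`, `toFeatures`, and `toTrainingRow` are assumptions made for illustration, not something the issue specifies. The rows use the `{ input, output }` array shape that brain.js's `NeuralNetwork.train` accepts; the labels here are assigned by hand.

```javascript
// Shannon entropy of a string, in bits per character.
function shannonEntropy(text) {
  const chars = Array.from(text);
  if (chars.length === 0) return 0;
  const counts = new Map();
  for (const ch of chars) counts.set(ch, (counts.get(ch) || 0) + 1);
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Map a candidate value to features scaled into [0, 1].
// The choice of features and scaling constants is an assumption.
function toFeatures(value) {
  const chars = Array.from(value);
  const n = chars.length || 1;
  const ratio = (re) => chars.filter((ch) => re.test(ch)).length / n;
  return [
    Math.min(shannonEntropy(value) / 8, 1), // entropy, capped at 8 bits
    ratio(/[0-9]/),                         // share of digits
    ratio(/[A-Za-z]/),                      // share of ASCII letters
    Math.min(chars.length / 64, 1),         // length, capped at 64 chars
  ];
}

// Build one training row; output 1 means "false positive".
function toTrainingRow(value, isFalsePositive) {
  return { input: toFeatures(value), output: [isFalsePositive ? 1 : 0] };
}

// Hand-labelled examples for illustration only.
const rows = [
  toTrainingRow('aaaaaaaaaaaaaaaa', true),
  toTrainingRow('f3A9q1Zt7LmP0sVe', false),
];
```

Matches extracted from a corpus of known false positives would all carry the label 1, so a classifier would presumably also need examples of real secrets labelled 0; the issue does not say where those would come from.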

Metadata

Labels: enhancement (New feature or request)
