[Feature] Implementation of Logistic Regression ML Model for Confidence Score #66

@ikeschmack

Description

This feature introduces the full implementation of the logistic regression–based machine learning model used to generate an AI-confidence score for uploaded images. With the OpenCV feature extractor and C2PATool metadata integration already completed, this ticket focuses on building the complete ML pipeline for training, exporting, loading, and performing inference using logistic regression.

The logistic regression model will be trained offline in Python using a dataset of labeled images (AI-generated vs. non-AI). Once trained, the model weights will be exported to a JSON file that the Java backend loads at runtime. The API will then compute a final confidence score (0–1) for each uploaded image using:

  • OpenCV visual features
  • C2PA metadata
  • Logistic regression weight vector + bias
  • Standard logistic sigmoid function

This ticket ensures the backend service can perform inference deterministically and efficiently, with no Python dependency at runtime.
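
For reference, the scoring rule implied by the list above can be written out as follows. This is a sketch under one assumption not stated in the ticket: that the OpenCV features and any C2PA-derived values are concatenated into a single feature vector x of length n, matching the weight vector w.

```latex
z = \mathbf{w}\cdot\mathbf{x} + b = \sum_{i=1}^{n} w_i x_i + b,
\qquad
p = \sigma(z) = \frac{1}{1 + e^{-z}}

% Illustrative numbers only (not from a trained model):
% w = (0.5, -0.25), x = (2, 4), b = 0.1
z = 0.5 \cdot 2 + (-0.25) \cdot 4 + 0.1 = 0.1,
\qquad
p = \frac{1}{1 + e^{-0.1}} \approx 0.525
```

Because the sigmoid maps every real z into the open interval (0, 1), p can be reported directly as the confidence score.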


Objectives

  • Implement Java-based logistic regression inference using:
    • Pre-computed weight vector
    • Bias term
    • Sigmoid probability output
  • Add a ModelLoader utility for reading trained model weights (model.json) at application startup.
  • Add a LogisticRegressionService that:
    • Accepts feature vectors generated by FeatureExtractor
    • Computes z = w·x + b
    • Generates final probability via sigmoid(z)
    • Returns the result as the confidence score (the sigmoid output already lies in 0–1)
  • Define a stable model.json schema for imported model weights.
  • Integrate ML pipeline into AnalyzeService to produce final AI likelihood score.
  • Provide scaffolding for dataset generation (CSV) for ML training (if not already implemented).
  • Ensure the implementation supports future retraining without code changes.
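
The inference objective above could be sketched roughly as follows. Only the class name LogisticRegressionService comes from this ticket; the constructor, the `score` and `sigmoid` method names, and the length check are assumptions made for illustration, not the final design.

```java
import java.util.Objects;

/** Hypothetical sketch of the inference step; method names are assumptions. */
final class LogisticRegressionService {
    private final double[] weights;
    private final double bias;

    LogisticRegressionService(double[] weights, double bias) {
        // Defensive copy so later changes to the caller's array cannot alter the model.
        this.weights = Objects.requireNonNull(weights, "weights").clone();
        this.bias = bias;
    }

    /** Logistic sigmoid, written so exp() is never called on a large positive argument. */
    static double sigmoid(double z) {
        if (z >= 0) {
            return 1.0 / (1.0 + Math.exp(-z));
        }
        double e = Math.exp(z);
        return e / (1.0 + e);
    }

    /** Computes sigmoid(w·x + b) for a single feature vector. */
    double score(double[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException(
                "Expected " + weights.length + " features but got " + features.length);
        }
        double z = bias;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * features[i];
        }
        return sigmoid(z);
    }
}
```

Since the weights and bias are passed in rather than hard-coded, a retrained model only needs a new model.json, provided the feature order produced by FeatureExtractor stays the same. The length check is there to fail fast if the two ever drift apart.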

Expected Output Structure

model.json

{
  "weights": [ ... numeric array ... ],
  "bias": 0.12345
}
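
A minimal sketch of how a ModelLoader might read this schema is shown below. To stay within the Java standard library it uses regular expressions, which only works for the flat two-field layout above; a real implementation would more likely use a JSON library already on the classpath. The nested `Model` holder and the `load`/`parse` method names are assumptions, not part of the ticket.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical loader for the flat model.json schema (weights array + bias). */
final class ModelLoader {
    // Matches a JSON number such as -1, 0.12345 or 3.2e-4.
    private static final String NUMBER = "-?\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?";
    private static final Pattern WEIGHTS =
        Pattern.compile("\"weights\"\\s*:\\s*\\[([^\\]]*)\\]");
    private static final Pattern BIAS =
        Pattern.compile("\"bias\"\\s*:\\s*(" + NUMBER + ")");

    /** Simple holder for the two fields in the schema (assumed name). */
    static final class Model {
        final double[] weights;
        final double bias;

        Model(double[] weights, double bias) {
            this.weights = weights;
            this.bias = bias;
        }
    }

    /** Reads and parses the file once, e.g. at application startup. */
    static Model load(Path path) throws IOException {
        return parse(Files.readString(path));
    }

    /** Parses the JSON text; throws IllegalArgumentException on a malformed model. */
    static Model parse(String json) {
        Matcher w = WEIGHTS.matcher(json);
        Matcher b = BIAS.matcher(json);
        if (!w.find() || !b.find()) {
            throw new IllegalArgumentException("model.json must contain \"weights\" and \"bias\"");
        }
        String body = w.group(1).trim();
        if (body.isEmpty()) {
            throw new IllegalArgumentException("\"weights\" must not be empty");
        }
        String[] parts = body.split(",");
        double[] weights = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // NumberFormatException (an IllegalArgumentException) signals a bad entry.
            weights[i] = Double.parseDouble(parts[i].trim());
        }
        return new Model(weights, Double.parseDouble(b.group(1)));
    }
}
```

Failing at startup on a missing or empty field keeps a bad export from silently producing scores later.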

Inference Output (Java)

{
  "confidenceScore": 0.8731,
  "c2paUsed": true,
  "modelVersion": "v1"
}
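
One way the backend could represent this payload is sketched below. The class name `InferenceResult` and its `toJson` method are hypothetical; the three field names are taken from the structure above. The sketch assumes `modelVersion` is a simple tag such as "v1" with no characters that need JSON escaping, and in practice serialization would probably be delegated to whatever JSON library the API already uses.

```java
import java.util.Locale;

/** Hypothetical value object mirroring the inference output fields. */
final class InferenceResult {
    final double confidenceScore;
    final boolean c2paUsed;
    final String modelVersion;

    InferenceResult(double confidenceScore, boolean c2paUsed, String modelVersion) {
        // The score is a probability, so reject anything outside [0, 1] (including NaN).
        if (Double.isNaN(confidenceScore) || confidenceScore < 0.0 || confidenceScore > 1.0) {
            throw new IllegalArgumentException("confidenceScore must be in [0, 1]");
        }
        this.confidenceScore = confidenceScore;
        this.c2paUsed = c2paUsed;
        this.modelVersion = modelVersion;
    }

    /** Hand-rolled JSON with four decimal places; assumes modelVersion needs no escaping. */
    String toJson() {
        return String.format(Locale.ROOT,
            "{\"confidenceScore\": %.4f, \"c2paUsed\": %b, \"modelVersion\": \"%s\"}",
            confidenceScore, c2paUsed, modelVersion);
    }
}
```

Using `Locale.ROOT` keeps the decimal separator a dot regardless of the server's default locale, which matters for deterministic output.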
