Ticket Title
[Feature] Implementation of Logistic Regression ML Model for Confidence Score
Description
This feature introduces the full implementation of the logistic regression–based machine learning model used to generate an AI-confidence score for uploaded images. With the OpenCV feature extractor and C2PATool metadata integration already completed, this ticket focuses on building the complete ML pipeline for training, exporting, loading, and performing inference using logistic regression.
The logistic regression model will be trained offline in Python using a dataset of labeled images (AI-generated vs. non-AI). Once trained, the model weights will be exported to a JSON file that the Java backend loads at runtime. The API will then compute a final confidence score (0–1) for each uploaded image using:
- OpenCV visual features
- C2PA metadata
- Logistic regression weight vector + bias
- Standard logistic sigmoid function
This ticket ensures the backend service can perform inference deterministically, efficiently, and without requiring Python during runtime.
Objectives
- Implement Java-based logistic regression inference using:
  - Pre-computed weight vector
  - Bias term
  - Sigmoid probability output
- Add a ModelLoader utility for reading trained model weights (model.json) at application startup.
- Add a LogisticRegressionService that:
  - Accepts feature vectors generated by FeatureExtractor
  - Computes z = w·x + b
  - Generates final probability via sigmoid(z)
  - Normalizes output into a confidence score (0–1)
- Define a stable model.json schema for imported model weights.
- Integrate ML pipeline into AnalyzeService to produce final AI likelihood score.
- Provide scaffolding for dataset generation (CSV) for ML training (if not already implemented).
- Ensure the implementation supports future retraining without code changes.
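The inference objectives could be sketched roughly as follows. This is a minimal illustration and not the final implementation: the class name LogisticRegressionService comes from this ticket, while the constructor signature and the method names score and sigmoid are assumptions. Because sigmoid(z) already lies between 0 and 1, the sketch treats it directly as the confidence score and applies no further normalization.

```java
/** Sketch of logistic regression inference: score = sigmoid(w·x + b). */
final class LogisticRegressionService {
    private final double[] weights;
    private final double bias;

    // Hypothetical constructor: weights and bias would come from model.json.
    LogisticRegressionService(double[] weights, double bias) {
        this.weights = weights.clone(); // defensive copy keeps the model immutable
        this.bias = bias;
    }

    /** Sigmoid written so that Math.exp is only ever called on a non-positive value. */
    static double sigmoid(double z) {
        if (z >= 0) {
            return 1.0 / (1.0 + Math.exp(-z));
        }
        double e = Math.exp(z);
        return e / (1.0 + e);
    }

    /** Returns the confidence score in [0, 1] for one feature vector. */
    double score(double[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException(
                "Expected " + weights.length + " features but got " + features.length);
        }
        double z = bias;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * features[i]; // accumulate the dot product w·x
        }
        return sigmoid(z);
    }
}
```

Rejecting a feature vector whose length differs from the weight vector makes a mismatch between FeatureExtractor and a retrained model fail loudly instead of producing a silently wrong score.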
Expected Output Structure
model.json
{
"weights": [ ... numeric array ... ],
"bias": 0.12345
}
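A loader for this schema might look like the sketch below. It is an assumption-laden illustration: the Model class and the parse and load methods are hypothetical names, and the regex-based parsing only handles the flat two-field layout shown above so that the example runs on the standard library alone. A real implementation would more likely use whichever JSON library the backend already depends on.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for the parsed contents of model.json. */
final class Model {
    final double[] weights;
    final double bias;

    Model(double[] weights, double bias) {
        this.weights = weights;
        this.bias = bias;
    }
}

/** Sketch of a loader for the flat {"weights": [...], "bias": n} schema only. */
final class ModelLoader {
    private static final String NUMBER = "-?\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?";
    private static final Pattern WEIGHTS =
        Pattern.compile("\"weights\"\\s*:\\s*\\[([^\\]]*)\\]");
    private static final Pattern BIAS =
        Pattern.compile("\"bias\"\\s*:\\s*(" + NUMBER + ")");

    /** Parses the JSON text; throws IllegalArgumentException on a malformed model. */
    static Model parse(String json) {
        Matcher w = WEIGHTS.matcher(json);
        Matcher b = BIAS.matcher(json);
        if (!w.find() || !b.find()) {
            throw new IllegalArgumentException(
                "model.json must contain \"weights\" and \"bias\"");
        }
        String body = w.group(1).trim();
        if (body.isEmpty()) {
            throw new IllegalArgumentException("\"weights\" must not be empty");
        }
        String[] parts = body.split(",");
        double[] weights = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // NumberFormatException is a subclass of IllegalArgumentException
            weights[i] = Double.parseDouble(parts[i].trim());
        }
        return new Model(weights, Double.parseDouble(b.group(1)));
    }

    /** Reads the file once, intended to be called at application startup. */
    static Model load(Path path) throws IOException {
        return parse(new String(Files.readAllBytes(path), StandardCharsets.UTF_8));
    }
}
```

Because the weights live entirely in the file, retraining only requires replacing model.json, provided the number and order of weights still match the feature vector.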
### **Inference Output (Java)**
{
"confidenceScore": 0.8731,
"c2paUsed": true/false,
"modelVersion": "v1"
}
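One way to represent this response on the Java side is a small immutable value class, sketched below. The class name InferenceResult and its toJson method are hypothetical, and the hand-written formatting assumes modelVersion is a simple string such as "v1" that needs no JSON escaping. In the actual service the web framework's serializer would probably produce the JSON.

```java
import java.util.Locale;

/** Hypothetical value object mirroring the inference output fields above. */
final class InferenceResult {
    final double confidenceScore;
    final boolean c2paUsed;
    final String modelVersion;

    InferenceResult(double confidenceScore, boolean c2paUsed, String modelVersion) {
        if (confidenceScore < 0.0 || confidenceScore > 1.0) {
            throw new IllegalArgumentException("confidenceScore must be in [0, 1]");
        }
        this.confidenceScore = confidenceScore;
        this.c2paUsed = c2paUsed;
        this.modelVersion = modelVersion;
    }

    /** Formats the result with four decimals; Locale.ROOT keeps '.' as the separator. */
    String toJson() {
        return String.format(Locale.ROOT,
            "{\"confidenceScore\": %.4f, \"c2paUsed\": %b, \"modelVersion\": \"%s\"}",
            confidenceScore, c2paUsed, modelVersion);
    }
}
```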