This document describes how raw audio data is collected, processed, and used to train and serve the machine failure prediction models in MechaPulse.
- Overview
- Data Collection
- Feature Engineering
- Model Development Notebooks
- Model Training
- Model Serialization
- Model Serving
- Dataset Schema
Raw Audio → Feature Extraction → Labelled Dataset → Model Training → Serialized Model → REST API
The pipeline follows a classical supervised learning workflow:
- Audio samples are recorded from machines in different states (normal / fault conditions).
- Statistical and spectral features are extracted from each sample.
- Features and labels are saved as CSV datasets.
- Jupyter notebooks develop and evaluate candidate models.
- The best model is serialized to a `.pkl` file.
- The FastAPI backend loads the `.pkl` file and serves predictions via HTTP.
Audio is captured in two ways depending on the deployment stage:
| Method | Where | Details |
|---|---|---|
| Desktop microphone | Raspberry Pi / PC | sounddevice library, 48 kHz, mono, 5-second windows |
| SD card logging | ESP32 (future) | Raw WAV files saved to SD card for offline labelling |
Each recording session produces a 5-second WAV file (`test.wav`) that is immediately processed for features.
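The shape of one capture window can be sketched as follows. This is an illustration, not the recorder itself: a synthetic sine stands in for the `sounddevice` microphone capture, but the sample rate, channel count, and duration match the documented settings.

```python
import numpy as np
from scipy.io import wavfile

S_RATE = 48_000          # sample rate used by the desktop recorder
DURATION = 5             # seconds per window

# Synthetic 5-second mono signal standing in for a microphone capture
# (a real session would fill this buffer via the sounddevice library).
t = np.linspace(0, DURATION, S_RATE * DURATION, endpoint=False)
signal = (0.3 * np.sin(2 * np.pi * 440 * t)).astype(np.float32)

# Persist the window as a WAV file, mirroring the recorder's output
wavfile.write("test.wav", S_RATE, signal)

# Reload it exactly as the feature-extraction step would
rate, window = wavfile.read("test.wav")
```

A 48 kHz, 5-second mono window is a 240,000-sample array, which is what the feature-extraction step below consumes.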
All datasets are stored in notebooks/data/:
| File | Description | Target |
|---|---|---|
| `anomaly_detection_training.csv` | Unlabelled samples for unsupervised anomaly detection | — |
| `failure_classification_training.csv` | Multi-class labelled samples (fault types) | Fault class label |
| `failure_classification_training_v2.csv` | Updated multi-class dataset | Fault class label |
| `sdcard_failure_classification.csv` | Samples collected via SD card on ESP32 | Fault class label |
Each 5-second audio window is transformed into an 8-dimensional feature vector:
| Feature | Formula | Description |
|---|---|---|
| RMS | `√(mean(x²))` | Energy of the signal |
| Mean | `mean(abs(x))` | Mean absolute amplitude |
| MA1 | `max(abs(FFT))` | Largest magnitude in the one-sided spectrum |
| MA2 | `second_max(abs(FFT))` | Second-largest spectral magnitude |
| MA3 | `third_max(abs(FFT))` | Third-largest spectral magnitude |
| F1 | `freq[argmax(abs(FFT))]` | Frequency of the largest spectral peak |
| F2 | `freq[arg_second_max(abs(FFT))]` | Frequency of the second peak |
| F3 | `freq[arg_third_max(abs(FFT))]` | Frequency of the third peak |
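The whole table can be condensed into one helper. This is a sketch: the `extract_features` name is illustrative, and ties between equal spectral magnitudes are not handled specially.

```python
import numpy as np
import scipy.fftpack as fftpk

def extract_features(signal, s_rate):
    """Return the 8-feature vector [RMS, Mean, MA1, MA2, MA3, F1, F2, F3]."""
    signal = np.asarray(signal, dtype=float)
    rms = np.sqrt(np.mean(signal ** 2))
    mean_abs = np.mean(np.abs(signal))

    # One-sided magnitude spectrum and its matching frequency axis
    spectrum = np.abs(fftpk.fft(signal))
    half = len(spectrum) // 2
    spectrum = spectrum[:half]
    freqs = fftpk.fftfreq(len(signal), 1.0 / s_rate)[:half]

    # Indices of the three largest spectral peaks, in descending order
    top3 = np.argsort(spectrum)[::-1][:3]
    ma1, ma2, ma3 = spectrum[top3]
    f1, f2, f3 = freqs[top3]
    return [rms, mean_abs, ma1, ma2, ma3, f1, f2, f3]
```

For a pure 440 Hz sine sampled at 48 kHz, F1 comes out at 440 Hz and RMS at `amplitude/√2`, which is a quick sanity check for the extraction.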
FFT computation:
```python
import numpy as np
import scipy.fftpack as fftpk

# One-sided magnitude spectrum
FFT_full = np.abs(fftpk.fft(signal))
half = len(FFT_full) // 2
FFT = FFT_full[:half]
freqs = fftpk.fftfreq(len(FFT_full), 1.0 / s_rate)[:half]

# Three largest spectral peaks (magnitudes and matching frequencies)
top3 = np.argsort(FFT)[::-1][:3]
ma1, ma2, ma3 = FFT[top3]
f1, f2, f3 = freqs[top3]
```
File: notebooks/data_collection.ipynb
Records audio samples, extracts features, assigns labels, and appends rows to a CSV training file.
Workflow:
- Configure the machine state label (e.g., `0` for normal, `1` for fault).
- Run the recording cell to capture a 5-second window.
- Features are extracted and a new row is appended to the target CSV.
- Repeat for each machine state to build a balanced dataset.
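The append step in this workflow can be sketched with the standard-library `csv` module; the function name and file path are illustrative, and the column names follow the training data format documented below.

```python
import csv
import os

def append_row(csv_path, features, label):
    """Append one labelled 8-feature row, writing the header on first use."""
    header = ["RMS", "Mean", "MA1", "MA2", "MA3", "F1", "F2", "F3", "label"]
    new_file = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(header)
        writer.writerow(list(features) + [label])

# One normal-state sample (label 0); values are illustrative
append_row("demo_training.csv", [120.4, 98.1, 5430, 3210, 1890, 440, 880, 1320], 0)
```

Appending one row per recording keeps the notebook re-runnable: each capture cell execution grows the dataset without clobbering earlier sessions.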
File: notebooks/anomaly_detection.ipynb
Trains an unsupervised anomaly detection model (e.g., Isolation Forest, One-Class SVM) on normal-state data only. Any sample deviating significantly from the normal distribution is flagged as an anomaly.
Workflow:
- Load `anomaly_detection_training.csv`.
- Normalize features.
- Train and tune the anomaly detector.
- Evaluate on held-out normal and fault samples.
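The workflow above can be sketched with scikit-learn's `IsolationForest`. This is a minimal illustration on synthetic data, not the notebook's actual tuning: the Gaussian stand-in features and the `contamination` value are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_normal = rng.normal(0.0, 1.0, size=(200, 8))   # stand-in for normal-state features
X_fault = rng.normal(6.0, 1.0, size=(20, 8))     # clearly shifted "fault" samples

# Fit the scaler and the detector on normal-state data only
scaler = StandardScaler().fit(X_normal)
detector = IsolationForest(contamination=0.05, random_state=0)
detector.fit(scaler.transform(X_normal))

# predict() returns +1 for inliers and -1 for anomalies
pred_fault = detector.predict(scaler.transform(X_fault))
```

Because the detector only ever sees normal-state samples, any fault type deviating from that distribution is flagged, including failure modes absent from the training data.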
File: notebooks/failure_classification.ipynb
Trains a multi-class classifier to distinguish between different fault types (e.g., bearing failure, imbalance, looseness).
Workflow:
- Load `failure_classification_training.csv` or `failure_classification_training_v2.csv`.
- Explore class distributions and feature correlations.
- Train candidate models (Random Forest, SVM, Logistic Regression).
- Evaluate with confusion matrix, precision, recall, F1-score.
- Export the best model to a `.pkl` file.
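The evaluation step can be sketched as below. Synthetic three-class data stands in for the fault-type dataset; the class means and model parameters are illustrative, not the notebook's tuned values.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Three synthetic fault classes with separated feature means
X = np.vstack([rng.normal(m, 1.0, size=(100, 8)) for m in (0.0, 4.0, 8.0)])
y = np.repeat([0, 1, 2], 100)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Per-class view of the errors, plus precision/recall/F1 per fault type
print(confusion_matrix(y_te, clf.predict(X_te)))
print(classification_report(y_te, clf.predict(X_te), digits=3))
```

The confusion matrix makes class-specific weaknesses visible (e.g., one fault type consistently mistaken for another), which aggregate accuracy hides.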
File: notebooks/machine_failure_prediction.ipynb
Trains the primary binary classifier (normal vs. any fault) that is deployed in the production FastAPI server.
Workflow:
- Load and merge labelled datasets.
- Feature selection and engineering.
- Train / test split (stratified).
- Hyperparameter tuning.
- Final evaluation (ROC-AUC, accuracy, F1).
- Serialize to `desktop-app/trained_models/machine_failure_detection_model3.pkl`.
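A compressed sketch of this workflow, using synthetic binary data in place of the merged labelled datasets; the output filename here is a local stand-in, not the production path.

```python
import pickle
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 1.0, size=(150, 8)),    # normal
               rng.normal(3.0, 1.0, size=(150, 8))])   # any fault
y = np.repeat([0, 1], 150)

# Stratified split preserves the normal/fault ratio in both partitions
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# ROC-AUC uses the fault-class probability, not the hard prediction
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

# Serialize for the FastAPI server to load at startup
with open("machine_failure_detection_model_demo.pkl", "wb") as f:
    pickle.dump(model, f)
```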
Supported model types (selectable in the Streamlit Train page and notebooks):
| Model | Library | Notes |
|---|---|---|
| Random Forest | `sklearn.ensemble.RandomForestClassifier` | Default production model |
| SVM | `sklearn.svm.SVC` | Effective on small datasets |
| Logistic Regression | `sklearn.linear_model.LogisticRegression` | Baseline |
Training data format (CSV, header required):
```csv
RMS,Mean,MA1,MA2,MA3,F1,F2,F3,label
120.4,98.1,5430,3210,1890,440,880,1320,0
...
```
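Loading this format into scikit-learn inputs is a one-liner with pandas; the sketch below writes a tiny inline dataset first so it is self-contained (the second row's values are invented for illustration).

```python
import pandas as pd

# Tiny inline dataset in the documented CSV format
csv_text = """RMS,Mean,MA1,MA2,MA3,F1,F2,F3,label
120.4,98.1,5430,3210,1890,440,880,1320,0
250.7,180.3,9870,6540,4120,120,240,360,1
"""
with open("demo.csv", "w") as f:
    f.write(csv_text)

df = pd.read_csv("demo.csv")
X = df.drop(columns="label")   # the 8 feature columns, in order
y = df["label"]                # 0 = normal, 1 = fault
```

Keeping the column order identical between training CSVs and the serving request body matters: the serialized model expects features in exactly this order.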
Trained models are serialized with Python's pickle module:
```python
import pickle

# Save
with open("machine_failure_detection_model3.pkl", "wb") as f:
    pickle.dump(model, f)

# Load (done by the API server at startup)
with open("machine_failure_detection_model3.pkl", "rb") as f:
    model = pickle.load(f)
```
Serialized models are stored in `desktop-app/trained_models/`.
| File | Description |
|---|---|
| `machine_failure_detection_model.pkl` | Earlier iteration |
| `machine_failure_detection_model3.pkl` | Current production model |
At FastAPI server startup, the model is loaded into memory:
```python
import os
import pickle

_model_path = os.path.join(os.path.dirname(__file__),
                           "..", "trained_models",
                           "machine_failure_detection_model3.pkl")
with open(_model_path, "rb") as f:
    model = pickle.load(f)
```
At inference time, the 8-feature vector from the request body is converted to a `pandas.DataFrame` and passed to `model.predict()`:
```python
data_df = pd.DataFrame([input.dict()])
prediction = model.predict(data_df)
return {"prediction": int(prediction[0])}
```
See docs/API_REFERENCE.md for the full endpoint contract.