Real-time human pose action recognition using MediaPipe Pose Landmarker + TensorFlow LSTM.
- Pose keypoint extraction (33 landmarks × 4 values = 132 features/frame)
- Sequence dataset recording from webcam
- LSTM training pipeline for action classification
- Real-time webcam inference with on-frame prediction display
- `rec.py`: real-time recognition script
- `src/Collection.py`: dataset recording script
- `src/detection_model.py`: model training script
- `models/model_67.keras`: trained model output
- `data/`: pose sequence dataset
- `pose_landmarker_lite.task`: MediaPipe pose model asset
The project is currently configured for two actions:
- `67` (for real? yes, the gen-alpha trend ...)
- `idle`
Each sample is saved as a NumPy file in:

```
data/<action>/<sequence>/<frame>.npy
```

- `no_sequences = 20`
- `sequence_length = 60`
- Feature vector per frame: 132 values (33 × (x, y, z, visibility))
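The 132-value feature vector is simply the 33 pose landmarks flattened in order, four values each. A minimal sketch of that flattening (the function name and the zero-vector fallback are illustrative, not taken from the project's source):

```python
import numpy as np

def flatten_pose(landmarks):
    """Flatten 33 pose landmarks into a 132-value feature vector.

    Each landmark contributes (x, y, z, visibility). If no pose was
    detected, return a zero vector so every frame has a fixed shape.
    """
    if not landmarks:
        return np.zeros(33 * 4, dtype=np.float32)
    return np.array(
        [[lm.x, lm.y, lm.z, lm.visibility] for lm in landmarks],
        dtype=np.float32,
    ).flatten()
```

With a MediaPipe Pose Landmarker result, `landmarks` would come from the detection result's pose landmark list; the sketch only assumes objects exposing `x`, `y`, `z`, and `visibility` attributes.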
Install dependencies:
- Python 3.9+

```
numpy
opencv-python
mediapipe
tensorflow
scikit-learn
```
You can directly use the pre-trained model and run:

```
python rec.py
```

But if you wish to collect your own data:

Run:

```
python src/Collection.py
```

Controls:
- Press `s` to start recording a sequence
- Press `q` to quit
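The recording loop boils down to capturing `sequence_length` frames per sequence, flattening each frame's landmarks, and writing one `.npy` per frame under the `data/<action>/<sequence>/<frame>.npy` layout. A sketch of just the saving side (the helper name is an assumption; the directory layout follows the README):

```python
import numpy as np
from pathlib import Path

def save_frame(data_dir, action, sequence, frame_num, keypoints):
    """Save one frame's 132-value keypoint vector following the
    data/<action>/<sequence>/<frame>.npy layout."""
    seq_dir = Path(data_dir) / action / str(sequence)
    seq_dir.mkdir(parents=True, exist_ok=True)
    path = seq_dir / f"{frame_num}.npy"
    np.save(path, keypoints)
    return path
```

In the actual script this would be called once per captured webcam frame, after landmark extraction.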
Run:

```
python src/detection_model.py
```

- Output model is saved to: `models/model_67.keras`
- TensorBoard logs:
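Before training, the per-frame `.npy` files are assembled into an array of shape `(num_samples, sequence_length, 132)` plus an integer label per action. A minimal loading sketch under the layout above (function and variable names are assumptions, not the project's actual code):

```python
import numpy as np
from pathlib import Path

def load_dataset(data_dir, actions, sequence_length=60):
    """Stack data/<action>/<sequence>/<frame>.npy files into
    X of shape (n, sequence_length, 132) and integer labels y."""
    X, y = [], []
    for label, action in enumerate(actions):
        for seq_dir in sorted(Path(data_dir, action).iterdir()):
            frames = [
                np.load(seq_dir / f"{i}.npy")
                for i in range(sequence_length)
            ]
            X.append(np.stack(frames))
            y.append(label)
    return np.array(X), np.array(y)
```

These arrays can then be fed to the LSTM, typically after one-hot encoding `y` and a train/test split with scikit-learn.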
Run:

```
python rec.py
```

Controls:
- Press `q` to quit the detection window
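At inference time the script keeps a rolling window of the most recent frames and only displays a label when the model's top probability clears the threshold. A numpy-only sketch of that decision logic (the model call is stubbed out; names are illustrative):

```python
from collections import deque
import numpy as np

def predict_label(probs, actions, threshold=0.4):
    """Return the action name if the top class probability clears
    the threshold, otherwise None (nothing drawn on the frame)."""
    idx = int(np.argmax(probs))
    return actions[idx] if probs[idx] >= threshold else None

# Rolling window: append one 132-value frame vector per iteration and
# run the model only once the window holds a full 60-frame sequence.
window = deque(maxlen=60)
```

With a full window, `np.expand_dims(np.stack(window), axis=0)` gives the `(1, 60, 132)` batch the LSTM expects.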
- Camera index is set to `1` in both `src/Collection.py` and `rec.py`. If needed, change it to `0` depending on your system.
- Inference threshold is currently `0.4` in `rec.py`.
- Inference is throttled for performance in `rec.py`.
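"Throttled" here means the model is not invoked on every captured frame; a common pattern is to run inference every Nth frame and reuse the last prediction in between. A hedged sketch of that pattern (the class name and interval value are illustrative, not taken from `rec.py`):

```python
class ThrottledPredictor:
    """Wrap a model callable so predict() only runs it every
    `interval` calls, returning the cached result otherwise."""

    def __init__(self, model_fn, interval=5):
        self.model_fn = model_fn
        self.interval = interval
        self.count = 0
        self.last = None

    def predict(self, sequence):
        if self.count % self.interval == 0:
            self.last = self.model_fn(sequence)
        self.count += 1
        return self.last
```

This keeps the display loop smooth because the expensive LSTM forward pass happens on only a fraction of frames, while the cached label is still drawn on every frame.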