A comprehensive implementation of advanced computer vision techniques and video analytics solutions using Python, OpenCV, and deep learning frameworks.
- Overview
- Features
- Installation
- Project Structure
- Modules & Implementations
- Notebook: Image Processing & Digits Classification
- Notebook: Shape and Image Transformations
- Notebook: Edge Detection and Image Segmentation
- Notebook: Histogram Analysis, Equalization & DFT
- Notebook: Image Compression & Deep Learning Classification
- Notebook: Segmentation, Detection & Classification
- Notebook: Blob Detection, Image Enhancement & Classification
- Notebook: SIFT, ORB, Watershed, ResNet & Few-Shot Learning
- Notebook: Stitching, Denoising, GANs & Segmentation Playground
- Notebook: Image Denoising & Video Action Pipeline
This repository contains implementations of advanced computer vision algorithms and video analytics techniques including:
- Image transformations and geometric operations
- Object detection and tracking
- Face recognition and analysis
- Pose estimation
- Video processing and frame analysis
- Motion detection and activity recognition
- Semantic and instance segmentation
- 3D reconstruction
✨ Image Processing
- Geometric transformations (rotation, scaling, translation, shearing, reflection)
- Image filtering and enhancement
- Morphological operations
- Edge detection and contour analysis
✨ Video Analytics
- Real-time video processing
- Frame-level analysis
- Motion tracking
- Temporal analysis
✨ Deep Learning Integration
- Pre-trained model support (YOLO, ResNet, etc.)
- Custom model implementations
- Transfer learning examples
✨ Visualization Tools
- Matplotlib-based visualization
- Real-time plotting
- Annotated frame display
- Python 3.8+
- pip or conda
git clone https://github.com/anshika1279/Computer-Vision-Implementation.git
cd Computer-Vision-Implementation
pip install -r requirements.txt

Key runtime dependencies include:
- Notebook tooling: nbformat, nbconvert, notebook (for cleaning/fixing metadata)
- CV/DL extras: timm (MobileNetV1), ultralytics, torch/torchvision, tensorflow
- README.md: Project overview and instructions.
- requirements.txt: Python dependencies for all notebooks.
- image_processing_and_digits_classification.ipynb: Image resizing/blur demo plus digits classification with multiple models.
- ShapeAndImageTransformations.ipynb: Shape and image transformation examples.
- edge_detection_and_image_segmentation.ipynb: Edge detection operators and image segmentation techniques.
- Histogram_Analysis_Equalization_DFT.ipynb: Histogram analysis, contrast enhancement, and frequency domain transformations.
- image_compression_techniques_DCT_Deep_learning_image_classification.ipynb: DCT compression and CNN-based digit/object classification.
- segmentation_detection_classification.ipynb: Advanced CV pipeline with edge/region segmentation, Hough transform, YOLO/R-CNN detection, and Fashion-MNIST/CIFAR-100 classification.
- blob_detection_image_enhancement_classification.ipynb: Blob detection algorithms (LoG, DoG, DoH), comprehensive image enhancement techniques, and transfer learning with AlexNet/VGG16 on CIFAR-100.
- sift_orb_watershed_resnet_few_shot_learning.ipynb: Feature detection (SIFT, ORB), feature matching (BFMatcher), watershed segmentation, ResNet-18/34 classification on CIFAR-100, and few-shot/one-shot learning with elastic deformation augmentation.
- Stitching_Denoising_GAN_SegmentationPlayground.ipynb: Colab-ready playground covering image stitching (simple/panorama/ORB + pose), inpainting, MNIST denoising autoencoders, GANs (MNIST, CIFAR-10), MobileNet V1/V2/V3 fine-tuning, and notebook metadata fixes for GitHub rendering.
- Image_Denoising_Video_Action_Pipeline.ipynb: Image denoising comparison (Median, Wavelet, Noise2Void U-Net) and video action recognition pipeline (frame extraction, video processing, UCF101 subset with 3D CNN classification).
- Image resizing with multiple interpolation methods and blurring with box, Gaussian, and bilateral filters.
- Digits classification using sklearn digits dataset with Gaussian Naive Bayes, RBF SVM, and Random Forest, including cross-validation and ROC visualization.
- File: image_processing_and_digits_classification.ipynb
- Part 1: Resize (linear, nearest, cubic) and blur (box, Gaussian, bilateral) a local image; expects image.png in the repo root and displays a comparison grid.
- Part 2: Train/evaluate classifiers (Gaussian Naive Bayes, RBF SVM, Random Forest) on sklearn digits with 5-fold CV; prints metrics and shows ROC curves.
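The classification workflow in Part 2 can be sketched as follows — a minimal version using the standard scikit-learn API, not the notebook's exact code:

```python
# Sketch of the digits-classification workflow: three classifiers
# evaluated with 5-fold cross-validation on the sklearn digits dataset.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

models = {
    "Gaussian Naive Bayes": GaussianNB(),
    "RBF SVM": SVC(kernel="rbf", gamma="scale"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```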
- File: ShapeAndImageTransformations.ipynb
- Part 1: 2D rectangle transformations (translate, scale, rotate, reflect, shear, composite) visualized with Matplotlib.
- Part 2: Image transformations on input.jpg using OpenCV (translate, reflect, rotate, scale, crop, shear on x/y) with side-by-side plots.
- Part 3: Additional 2D shape transformations with reusable helpers for translate/scale/rotate/reflect/shear and composite examples.
- File: edge_detection_and_image_segmentation.ipynb
- Edge Detection: Implements multiple edge detection operators including:
- Sobel operator (combined X and Y gradients)
- Prewitt edge detection
- Roberts cross operator
- Canny edge detector
- Image Segmentation: Demonstrates various segmentation techniques:
- Global thresholding (fixed threshold binarization)
- Adaptive thresholding (local neighborhood-based)
- Watershed segmentation with morphological operations
- Preprocessing: Includes color space conversions (BGR→RGB→Grayscale→Binary) and image metrics calculation
- Visualization: Displays all results in a comprehensive grid layout with labeled subplots
- Outputs: Saves processed images (edge maps, segmented regions) for further analysis
- File: Histogram_Analysis_Equalization_DFT.ipynb
- Histogram Analysis: Computes and plots histograms for both grayscale and color (RGB) images
- Individual channel histograms for color images (B, G, R)
- Histogram normalization to probability distributions
- Visualization with matplotlib for histogram analysis
- Contrast Enhancement: Implements histogram equalization for improving image contrast
- Before/after comparison of grayscale images
- Visual quality assessment with text annotations
- Side-by-side display of original and equalized images
- Discrete Fourier Transform (DFT): Frequency domain analysis and transformations
- DFT computation with magnitude spectrum visualization
- Inverse DFT for image reconstruction
- Rotation property verification (45° rotation test)
- Demonstrates spatial vs. frequency domain correspondence
- Compatible with Google Colab: Uses cv2_imshow for Colab environments
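The DFT round trip described above can be sketched with NumPy's FFT (equivalent for this demonstration to the cv2.dft calls a notebook might use):

```python
# Sketch: forward DFT, centered log-magnitude spectrum, and inverse
# DFT reconstruction on a synthetic striped image.
import numpy as np

# Synthetic grayscale image with a vertical stripe pattern.
img = np.tile(np.sin(np.linspace(0, 8 * np.pi, 64)), (64, 1))

# Forward DFT; shift the zero-frequency component to the center and take
# the log-magnitude spectrum for visualization.
F = np.fft.fftshift(np.fft.fft2(img))
magnitude = 20 * np.log(np.abs(F) + 1e-9)

# Inverse DFT reconstructs the image up to floating-point error.
recon = np.fft.ifft2(np.fft.ifftshift(F)).real
```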
- File: image_compression_techniques_DCT_Deep_learning_image_classification.ipynb
- DCT-Based Image Compression: Implements both lossy and lossless compression techniques
- Lossy compression with quantization using JPEG standard quantization matrix
- Lossless compression preserving all DCT coefficients
- Block-wise DCT/IDCT operations (8×8 blocks)
- Compression ratio analysis and file size comparison
- MNIST Digit Classification: CNN implementations for handwritten digit recognition
- Basic 3-layer CNN architecture
- Enhanced CNN with BatchNormalization, Dropout, and L2 regularization
- Data augmentation (rotation, shifts, zoom)
- Learning rate scheduling and early stopping
- CIFAR-10 Classification: Color image classification with CNN
- 10-class object recognition on 32×32 color images
- Similar architecture adapted for RGB inputs
- Model Evaluation: Comprehensive performance metrics
- Classification reports with precision, recall, F1-score
- Confusion matrices with heatmap visualization
- ROC curves and AUC scores
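The block-wise DCT compression described above can be sketched as follows, assuming SciPy's `scipy.fft.dctn`/`idctn` (the notebook may use `cv2.dct` instead):

```python
# Sketch: lossy 8x8 DCT compression with the standard JPEG luminance
# quantization matrix, and the lossless DCT/IDCT round trip.
import numpy as np
from scipy.fft import dctn, idctn

# Standard JPEG luminance quantization matrix.
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
], dtype=float)

def compress_block(block, Q):
    """Lossy path: DCT, quantize, dequantize, inverse DCT on one 8x8 block."""
    coeffs = dctn(block - 128, norm="ortho")
    quant = np.round(coeffs / Q)          # quantization discards detail
    return idctn(quant * Q, norm="ortho") + 128

rng = np.random.default_rng(0)
block = rng.integers(0, 256, (8, 8)).astype(float)
recon = compress_block(block, Q)          # lossy reconstruction
```

Keeping all coefficients unquantized (the lossless path) makes the DCT/IDCT round trip exact up to floating-point error.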
- File: segmentation_detection_classification.ipynb
- Image Segmentation: Multiple segmentation approaches
- Edge-based segmentation using Canny edge detection
- Region-based segmentation with thresholding techniques
- Visualization with matplotlib for result comparison
- Hough Transform: Line detection and feature extraction
- Probabilistic Hough Line Transform for straight line detection
- Configurable parameters for line detection sensitivity
- Visual overlay of detected lines on original images
- Object Detection: State-of-the-art detection models
- YOLOv8: Real-time object detection with ultralytics framework
- Faster R-CNN: Region-based detection with ResNet50-FPN backbone
- Bounding box visualization with confidence scores
- Pre-trained models on COCO dataset for 80+ object classes
- Deep Learning Classification: Multi-dataset CNN training
- Fashion-MNIST: Clothing classification (10 classes, 28×28 grayscale)
- CIFAR-100: Fine-grained object classification (100 classes, 32×32 RGB)
- Custom CNN architectures with Conv2D, MaxPooling, and Dense layers
- 5-epoch training with validation accuracy tracking
- Classification reports with precision, recall, and F1-scores
- Integrated Pipeline: End-to-end processing workflow combining segmentation, detection, and classification
- Dual Environment Support: Compatible with both local (matplotlib) and Google Colab (cv2_imshow) environments
- File: blob_detection_image_enhancement_classification.ipynb
- Blob Detection Algorithms: Implementation of three advanced blob detection methods
- LoG (Laplacian of Gaussian): Scale-space blob detection with adjustable sigma parameters
- DoG (Difference of Gaussian): Efficient approximation of LoG for faster computation
- DoH (Determinant of Hessian): Hessian matrix-based blob detection for feature localization
- Purple region extraction using HSV color masking
- Morphological preprocessing pipeline (erosion, dilation, opening, closing, area operations)
- Handles RGBA images with alpha channel conversion
- Red circle overlay visualization of detected blobs
- Image Enhancement Pipeline: Eight comprehensive image processing techniques
- Brightness & Contrast adjustment with alpha/beta parameters
- Image sharpening using custom convolution kernels
- Denoising with Non-Local Means algorithm
- Color enhancement using PIL ImageEnhance
- Image resizing with interpolation
- Inverse transform (bitwise NOT operation)
- Histogram equalization (grayscale and color via YCrCb)
- LAB color space-based color correction
- Grid visualization (3×3 layout) of all enhancement results
- Transfer Learning Classification: CIFAR-100 fine-grained classification
- AlexNet: Pre-trained on ImageNet, fine-tuned for 100 classes
- VGG16: Deep architecture with 16 layers, adapted for CIFAR-100
- Modified final classifier layers for 100-class output
- SGD optimizer with momentum (lr=0.0001, momentum=0.9)
- Cross-entropy loss function
- Training loop with tqdm progress bars
- Batch size optimized for memory efficiency (batch_size=16)
- Automatic device selection (CUDA/MPS/CPU)
- ImageNet normalization for transfer learning compatibility
- Model Evaluation: Comprehensive accuracy metrics on CIFAR-100 test set
- Multi-Image Processing: Batch processing across multiple test images (p1.jpg, p2.jpg, p3.png, p4.png, p5.jpg)
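The three blob detectors above can be sketched with scikit-image's standard API; the image here is synthetic rather than the notebook's p1.jpg–p5.jpg test files:

```python
# Sketch: LoG, DoG, and DoH blob detection on two synthetic bright blobs.
import numpy as np
from skimage.draw import disk
from skimage.feature import blob_dog, blob_doh, blob_log

# Synthetic image with two bright disks on a dark background.
img = np.zeros((100, 100))
for center in [(30, 30), (70, 60)]:
    rr, cc = disk(center, 8)
    img[rr, cc] = 1.0

# Each detector returns rows of (y, x, sigma); sigma encodes blob scale.
blobs_log = blob_log(img, max_sigma=15, threshold=0.1)
blobs_dog = blob_dog(img, max_sigma=15, threshold=0.1)
blobs_doh = blob_doh(img, max_sigma=15, threshold=0.005)
print(len(blobs_log), len(blobs_dog), len(blobs_doh))
```

DoG approximates LoG at lower cost, which is why the notebook includes both for comparison.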
- File: sift_orb_watershed_resnet_few_shot_learning.ipynb
- Feature Detection & Description: Classical computer vision feature extraction
- SIFT (Scale-Invariant Feature Transform): Keypoint detection and descriptor computation
- ORB (Oriented FAST and Rotated BRIEF): Fast binary descriptor for real-time applications
- Rich keypoint visualization with OpenCV drawing flags
- Multiple test images for robustness evaluation
- Feature Matching: Correspondence finding between image pairs
- BFMatcher (Brute-Force Matcher): Exhaustive descriptor matching
- Hamming distance for ORB binary descriptors
- L2 distance for SIFT float descriptors
- Cross-check validation for bidirectional matching
- Top-50 matches visualization with distance-based sorting
- Watershed Segmentation: Marker-based region segmentation
- Binary thresholding (THRESH_BINARY_INV) for preprocessing
- Custom marker seeds for background/foreground separation
- Watershed algorithm with boundary highlighting (red contours)
- Contour detection with RETR_EXTERNAL mode
- Multi-stage visualization (original → watershed → contours)
- Deep Learning Classification: CIFAR-100 fine-grained recognition
- ResNet-18: Residual network with 18 layers (ImageNet pre-trained)
- ResNet-34: Deeper 34-layer residual architecture
- Transfer learning with modified final FC layer (100 classes)
- SGD optimizer with momentum (lr=0.0001, momentum=0.9)
- Cross-entropy loss with 5-epoch training
- ImageNet normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
- Training progress tracking with tqdm progress bars
- GPU/CPU automatic device selection
- Few-Shot Learning: Meta-learning for low-data scenarios
- Prototypical Networks: Metric learning with prototype computation
- Simple FC encoder (784→256→64 dimensions)
- Episode-based sampling (5-way, 5-shot setup)
- Euclidean distance metric in embedding space
- MNIST Dataset with 80/20 train/test split
- Elastic deformation augmentation for data diversity
- One-Shot Learning: Siamese network for similarity learning
- Siamese CNN: Twin network architecture for pair comparison
- Convolutional feature extractor (Conv2D→ReLU→MaxPool2D×2)
- Fully connected similarity head (128×4×4→256→128)
- Pairwise distance computation with L2 norm
- Proper tensor flattening for CNN-to-FC transition
- Elastic Transform Augmentation: Advanced data augmentation
- Corrected Implementation: Gaussian-smoothed displacement fields
- Alpha scaling for deformation magnitude control
- Sigma parameter for smoothness adjustment
- scipy.ndimage.gaussian_filter for proper elastic deformation
- cv2.remap with bilinear interpolation and reflection borders
- Side-by-side visualization of original vs. deformed images
- Model Evaluation Metrics: Comprehensive performance analysis
- Accuracy, Precision, Recall, F1-score from sklearn.metrics
- Weighted averaging for multi-class scenarios
- Custom Siamese evaluation with pairwise similarity threshold
- Confusion matrix support for detailed error analysis
- Dual Mode Support: Compatible with local and Colab environments
- File: Stitching_Denoising_GAN_SegmentationPlayground.ipynb
- Image Stitching: Simple and panorama modes using OpenCV Stitcher; ORB-based matching with essential matrix pose recovery; visualization of keypoints and matches.
- Inpainting: Mask-based OpenCV inpainting helper for quick cleanup of noisy regions.
- Denoising Autoencoders (MNIST): Two variants (basic and improved with BN/Dropout) for noise+blur restoration; PSNR/SSIM metrics and pixel-wise accuracy.
- GANs: MNIST MLP GAN and CIFAR-10 DCGAN with loss plots and image grid visualization utilities.
- Transfer Learning: MobileNet V1/V2/V3 comparison for dog-breed classification (timm/models), with frozen backbone and classifier fine-tuning.
- Utilities: Notebook metadata cleaning snippets (nbformat) to keep GitHub rendering healthy; Colab upload helpers for stitching inputs.
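A metadata-cleaning pass like the one mentioned above might look like this, using nbformat's read/write API (the exact keys stripped by the notebook may differ; `widgets` is a common culprit for broken GitHub rendering):

```python
# Sketch: strip Colab widget metadata that can break GitHub's
# notebook renderer, then write the notebook back in place.
import nbformat

def clean_notebook(path):
    nb = nbformat.read(path, as_version=4)
    # Remove notebook-level and cell-level "widgets" metadata.
    nb.metadata.pop("widgets", None)
    for cell in nb.cells:
        cell.metadata.pop("widgets", None)
    nbformat.write(nb, path)
```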
- File: Image_Denoising_Video_Action_Pipeline.ipynb
- Task 1: Image Denoising Comparison
- Median Filter Denoising: Channel-wise morphological filtering with disk-shaped structuring elements (size=3), RGB channel processing, PSNR/SSIM/MSE metrics, matplotlib visualization
- Wavelet Denoising: BayesShrink soft thresholding in wavelet domain, adaptive sigma rescaling, per-channel decomposition/reconstruction, quality metrics comparison
- Deep Learning Denoising: Noise2Void-inspired U-Net (Conv2D→MaxPooling→UpSampling), self-supervised patch training (64×64), center pixel masking strategy, 10-epoch training
- Multi-Method Comparison: Quantitative metrics (MSE, PSNR, SSIM) and visual side-by-side evaluation
- Task 2: Video Processing & Action Recognition
- Frame Extraction: Configurable interval sampling, sequential frame storage, progress tracking
- Frame Visualization: 2×5 grid display with BGR→RGB conversion
- Video Operations: Adaptive thresholding (GAUSSIAN_C), Gaussian blur (5×5), Canny edges (100/200), bitwise NOT inversion
- Frame Collage: Grid-based spatial sampling, adaptive sizing, half-resolution assembly
- UCF101 Dataset: 5-class subset (Basketball, Biking, PlayingGuitar, Typing, JumpRope), 10 videos/class, 224×224 frames, 16-frame sequences
- 3D CNN Classification: 3-layer Conv3D (64→128→256 filters), batch normalization, 2×2×2 max pooling, 512-unit FC + 50% dropout, L2 regularization
- Training Pipeline: Data augmentation (±30° rotation/shifts/shear/zoom, horizontal flip), class weight balancing, Adam optimizer (lr=0.0001), 100 epochs, ModelCheckpoint callback
- Evaluation: Test accuracy, per-class precision/recall/F1, confusion matrix, training/validation curves
- Dependencies: PyWavelets, TensorFlow/Keras, scikit-learn, OpenCV, matplotlib
- Colab Ready: Full environment setup with pip installations