Underwater image enhancement training workspace based on a U-Net pipeline.
Core code is organized by responsibility:
- models/: network architectures (for example models/basic_unet.py)
- training/: data pipelines and callback helpers
- losses/: loss definitions
- scripts/: utility and validation scripts
- root entrypoints: train_unet.py, train_sharp.py, resume_sharp.py, streamlit_app.py, video_processor.py
Generated artifacts and experiment outputs stay under logs/, results/, and models/checkpoints/.
Production-default scripts:
- train_unet.py (main training entrypoint)
- train_complete.py (core trainer implementation used by train_unet.py)
- scripts/validate_dataset.py (required pre-check for dataset integrity)
Experimental scripts:
- train_sharp.py (edge-preserving loss experimentation)
- resume_sharp.py (fine-tuning existing checkpoints for sharper outputs)
- sharpen_output.py (post-processing sharpen variants)
- compare_results.py (quality comparison/metrics reporting)
- quick_test_sharp.py (quick smoke checks for sharp pipeline)
- Step 1 completed: centralized runtime config via config.yaml.
- Step 2 completed: dataset validation enforced before training starts.
- Step 3 completed: deterministic dataset download and extract script.
- Environment milestone completed: Python 3.11 venv with dependencies installed.
Progress tracker: see IMPLEMENTATION_TODO.md.
Create and install dependencies in a Python 3.11 virtual environment:
```
py -3.11 -m venv .venv311
.\.venv311\Scripts\python.exe -m pip install --upgrade pip
.\.venv311\Scripts\python.exe -m pip install -r requirements.txt
```

This project now auto-detects GPU at runtime and prints the selected device.
Quick check:
```
.\.venv311\Scripts\python.exe -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

If the printed list is empty on native Windows, use one of these options:
- WSL2 + CUDA (recommended for NVIDIA):

```
wsl
# inside Ubuntu/WSL:
python -m pip install --upgrade pip
python -m pip install tensorflow[and-cuda]
```

- Native Windows legacy stack (NVIDIA CUDA, no WSL):
```
# Requires Python 3.10 + TensorFlow 2.10 + CUDA 11.2 + cuDNN 8.1
py -3.10 -m venv .venv310
.\.venv310\Scripts\python.exe -m pip install --upgrade pip
.\.venv310\Scripts\python.exe -m pip install tensorflow==2.10.*
.\.venv310\Scripts\python.exe -m pip install "numpy<2"
.\.venv310\Scripts\python.exe -m pip install -r requirements.txt --no-deps
```

Note: TensorFlow 2.11+ does not provide native Windows GPU support.
If GPU is still not detected, install CUDA 11.2 + cuDNN 8.1 and ensure their bin paths are in PATH.
Optional runtime flags:

```
$env:USE_GPU = "1"            # 1/0 to enable/disable GPU
$env:GPU_MEMORY_GROWTH = "1"  # avoid pre-allocating all VRAM
$env:MIXED_PRECISION = "1"    # enable mixed_float16 policy
```

- Use only .venv311 for this project.
- Workspace settings pin the interpreter to .venv311/Scripts/python.exe in .vscode/settings.json.
- If VS Code still shows unresolved imports, run Python: Select Interpreter and pick .venv311 once.
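The 1/0 runtime flags above can be read with a small helper like the one below. This is a hedged sketch for illustration: the helper name and the default-enabled behavior are assumptions, not the project's actual parsing code.

```python
import os

def env_flag(name: str, default: str = "1") -> bool:
    """Interpret a "1"/"0" style environment flag.

    Hypothetical helper mirroring how USE_GPU, GPU_MEMORY_GROWTH, and
    MIXED_PRECISION are assumed to be read; the default-on fallback
    is an illustration choice, not confirmed project behavior.
    """
    return os.environ.get(name, default).strip() == "1"
```

For example, `env_flag("USE_GPU")` returns False only when the variable is explicitly set to something other than "1" with this default.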
Download and extract UIEB into data/raw and data/reference:
```
.\.venv311\Scripts\python.exe scripts/download_dataset.py
```

If you already have a local dataset, place paired images in:
- data/raw
- data/reference
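The pairing convention can be spot-checked with a few lines of Python. This is a minimal sketch of the rule scripts/validate_dataset.py is assumed to enforce (same filename stem in both folders); the real validator may check more.

```python
from pathlib import Path

def unpaired_stems(raw_dir: str, ref_dir: str) -> set[str]:
    """Return filename stems present in only one of the two folders.

    Assumes pairing is by identical stem (e.g. data/raw/12.png must
    match data/reference/12.png); an empty result means every image
    has a counterpart.
    """
    raw = {p.stem for p in Path(raw_dir).iterdir() if p.is_file()}
    ref = {p.stem for p in Path(ref_dir).iterdir() if p.is_file()}
    return raw ^ ref  # symmetric difference: stems missing a partner
```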
Then validate:
```
.\.venv311\Scripts\python.exe scripts/validate_dataset.py --strict-names
```

Run default training from config.yaml:

```
.\.venv311\Scripts\python.exe train_unet.py
```

Run a quick smoke test (1 epoch):

```
.\.venv311\Scripts\python.exe -c "from train_unet import main; main({'epochs': 1, 'batch_size': 2})"
```

Install dependencies (if not already installed):

```
.\.venv311\Scripts\python.exe -m pip install -r requirements.txt
```

Run the Streamlit app:

```
.\.venv311\Scripts\python.exe -m streamlit run streamlit_app.py
```

What the app supports:
- Select trained checkpoint from models/checkpoints
- Upload image (jpg, jpeg, png, bmp)
- Run enhancement inference
- Compare original vs enhanced output
- Download enhanced image as PNG
- View run metadata from results/model_registry.json
- Compare two experiment runs with metric deltas and validation curves
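The run-comparison feature boils down to subtracting one run's metric from another's. The sketch below illustrates the idea; the registry layout ({run_id: {"metrics": {...}}}) is an assumption about results/model_registry.json, made purely for illustration.

```python
import json

def metric_delta(registry_path: str, run_a: str, run_b: str, metric: str) -> float:
    """Difference (run_b minus run_a) in a metric between two runs.

    The nested {run_id: {"metrics": {...}}} schema is assumed, not
    taken from the project's actual registry format.
    """
    with open(registry_path) as f:
        registry = json.load(f)
    return registry[run_b]["metrics"][metric] - registry[run_a]["metrics"][metric]
```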
Use video_processor.py for webcam, video file, RTSP stream, and batch folder enhancement.
Install dependencies (if not already installed):

```
.\.venv311\Scripts\python.exe -m pip install -r requirements.txt
```

Webcam mode:

```
.\.venv311\Scripts\python.exe video_processor.py --mode webcam
```

Threaded webcam mode (can improve responsiveness on some systems):

```
.\.venv311\Scripts\python.exe video_processor.py --mode webcam --threaded
```

Video file mode:

```
.\.venv311\Scripts\python.exe video_processor.py --mode video --input input.mp4 --output results/processed_videos/input_enhanced.mp4
```

Disable preview window:

```
.\.venv311\Scripts\python.exe video_processor.py --mode video --input input.mp4 --no-preview
```

RTSP stream mode:

```
.\.venv311\Scripts\python.exe video_processor.py --mode rtsp --input "rtsp://username:password@host:554/stream"
```

Record RTSP output:

```
.\.venv311\Scripts\python.exe video_processor.py --mode rtsp --input "rtsp://..." --output results/processed_videos/rtsp_recording.mp4
```

Batch folder mode:

```
.\.venv311\Scripts\python.exe video_processor.py --mode batch --input-folder videos --output-folder results/processed_videos
```

- By default, the script auto-selects a checkpoint from results/model_registry.json or models/checkpoints.
- To use a specific checkpoint, pass --model models/checkpoints/your_model.keras.
- Use --target-size 128 for higher FPS, or --target-size 256 for balanced quality/speed.
- Use --no-gpu to force CPU mode.
```
.\.venv311\Scripts\python.exe quick_video_test.py
```

This generates a synthetic test clip and writes test_enhanced.mp4.
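A numpy-only sketch of producing such a synthetic clip is shown below. The blue-green moving gradient is an invented stand-in; the actual pattern quick_video_test.py renders, and how it encodes the video, are not documented here.

```python
import numpy as np

def synthetic_underwater_clip(frames: int = 30, height: int = 128, width: int = 128):
    """Generate blue-green tinted uint8 RGB frames with a drifting gradient.

    A hypothetical stand-in for the synthetic test clip; channel weights
    approximate an underwater color cast for illustration only.
    """
    ramp = np.linspace(0.0, 1.0, width, dtype=np.float32)
    clip = []
    for i in range(frames):
        shifted = np.roll(ramp, i)              # horizontal drift per frame
        plane = np.tile(shifted, (height, 1))
        rgb = np.stack([0.2 * plane, 0.6 * plane, 0.9 * plane], axis=-1)
        clip.append((rgb * 255).astype(np.uint8))
    return clip
```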
Default runtime configuration lives in config.yaml and is loaded by utils/config_loader.py.
You can now control augmentation strength from config.yaml:
```yaml
augmentation:
  enabled: true
  profile: standard  # one of: none, light, standard, strong
  flip_prob: 0.5
  vertical_flip_prob: 0.5
  rotate_prob: 1.0
  brightness_prob: 0.5
  brightness_delta: 0.1
  contrast_prob: 0.5
  contrast_lower: 0.8
  contrast_upper: 1.2
```

For A/B tests, keep profile fixed and change only one knob at a time (for example brightness_delta).
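A minimal sketch of how these knobs could drive augmentation of a raw/reference pair is below. Keys mirror the config block above; applying the geometric transforms to both images but the photometric ones (brightness, contrast) to the input only is an assumption, not confirmed project behavior. Images are float32 in [0, 1].

```python
import numpy as np

def augment_pair(raw, ref, cfg, rng):
    """Config-driven augmentation sketch for a raw/reference image pair.

    Assumption: geometric transforms apply to both images, photometric
    ones to the raw input only. rng is a numpy Generator.
    """
    if rng.random() < cfg["flip_prob"]:
        raw, ref = raw[:, ::-1], ref[:, ::-1]          # horizontal flip
    if rng.random() < cfg["vertical_flip_prob"]:
        raw, ref = raw[::-1], ref[::-1]                # vertical flip
    if rng.random() < cfg["rotate_prob"]:
        k = int(rng.integers(1, 4))                    # 90/180/270 degrees
        raw, ref = np.rot90(raw, k), np.rot90(ref, k)
    if rng.random() < cfg["brightness_prob"]:
        d = cfg["brightness_delta"]
        raw = np.clip(raw + rng.uniform(-d, d), 0.0, 1.0)
    if rng.random() < cfg["contrast_prob"]:
        f = rng.uniform(cfg["contrast_lower"], cfg["contrast_upper"])
        raw = np.clip((raw - 0.5) * f + 0.5, 0.0, 1.0)
    return raw, ref
```

Seeding the generator (np.random.default_rng(seed)) keeps A/B comparisons reproducible, which matches the one-knob-at-a-time advice above.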
To override dataset location without editing files:
```
$env:DATA_PATH = "D:\datasets\uieb"
.\.venv311\Scripts\python.exe train_unet.py
```
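The override logic amounts to "environment variable wins over the config file". A sketch, assuming a data_path key and a data fallback (both illustrative guesses about config.yaml and utils/config_loader.py):

```python
import os

def resolve_data_path(config: dict) -> str:
    """DATA_PATH environment override wins over the config file value.

    The "data_path" key and "data" default are assumptions made for
    illustration, not taken from the project's actual loader.
    """
    return os.environ.get("DATA_PATH") or config.get("data_path", "data")
```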