Merged
3 changes: 3 additions & 0 deletions .flake8
@@ -0,0 +1,3 @@
[flake8]
max-line-length = 88
extend-ignore = E501
4 changes: 1 addition & 3 deletions .github/workflows/ci.yml
@@ -4,9 +4,7 @@ on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main


jobs:
  test:
36 changes: 36 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,36 @@
name: Lint

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  lint:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install black flake8 isort

      - name: Run Black
        run: black --check .

      - name: Run Flake8
        run: flake8 .

      - name: Run isort
        run: isort --check-only .
7 changes: 6 additions & 1 deletion .gitignore
@@ -1,10 +1,15 @@
# Ignore Databricks folder
# Ignore Databricks, vscode
.databricks/
.vscode/

# virtual environments
venv*/

# Ignore edgetrain folders
models/
logs/
images/
results/

# Byte-compiled / optimized / DLL files
__pycache__/
2 changes: 2 additions & 0 deletions .isort.cfg
@@ -0,0 +1,2 @@
[settings]
profile = black
15 changes: 15 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,15 @@
repos:
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
      - id: black

  - repo: https://github.com/pycqa/flake8
    rev: 6.1.0
    hooks:
      - id: flake8

  - repo: https://github.com/timothycrosley/isort
    rev: 5.12.0
    hooks:
      - id: isort
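To activate these hooks locally, the standard pre-commit workflow applies (generic `pre-commit` usage, not commands from this repo):

```shell
# Install pre-commit and register the hooks from .pre-commit-config.yaml
pip install pre-commit
pre-commit install

# Optionally run all hooks (black, flake8, isort) against the whole repo once
pre-commit run --all-files
```

After `pre-commit install`, the hooks run automatically on each `git commit`.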
97 changes: 63 additions & 34 deletions README.md
@@ -1,40 +1,44 @@
# EdgeTrain: Automated Resource Adjustment for Efficient Edge AI Training
**Version: 0.1.1-alpha**
**Version: 0.2.0**

EdgeTrain is a Python package designed to dynamically adjust deep learning training parameters and strategies based on CPU and GPU performance. It optimizes the training process by adjusting batch size and learning rate to ensure efficient training without overutilizing or underutilizing available resources. This package is specifically designed to reduce memory usage for model training on edge AI devices, laptops or other setups that have limited memory.
EdgeTrain is a Python package designed to dynamically adjust deep learning training parameters and strategies based on CPU and GPU performance. It optimizes the training process by adjusting batch size and learning rate to ensure efficient training without overutilizing or underutilizing available resources. This package is specifically designed to balance model training performance and memory usage on edge AI devices, laptops or other setups that have limited memory.

## Features

### Automated Resource Adjustment
EdgeTrain currently adjusts the following hyperparameters based on CPU/GPU usage:
- **Batch Size**: Automatically adjusts batch size for better memory optimization based on resource usage.
- **Learning Rate**: Dynamically adjusts the learning rate to improve training efficiency.

These adjustments optimize resource utilization throughout training, enabling efficient use of available resources on edge AI devices.
### Dynamic Resource-Based Training Adjustments
EdgeTrain monitors CPU and GPU usage in real-time and automatically adjusts hyperparameters during training:
- **Batch Size**: Increases or decreases to optimize memory usage.
- **Learning Rate**: Adjusts based on model performance to improve training efficiency.

### Resource Logging & Visualization
EdgeTrain logs critical system metrics (e.g., CPU and GPU usage) and training parameters (batch size, learning rate) for each epoch. The logs enable post-hoc visualization and analysis of:
EdgeTrain logs system performance and training parameters, allowing post-hoc visualization of:
- Resource utilization over time.
- Training parameter adjustments across epochs.
- Correlations between resource usage and model performance.

The built-in **visualization tools** help you understand how system resources are being utilized and how training parameters evolve during training.
The provided visualization tools illustrate how system resources are being utilized and how training parameters evolve during training.

### Customizable
### Customization and control
EdgeTrain is highly customizable. You can easily modify:
- **Resource Adjustment Thresholds**: Set CPU/GPU usage ranges to trigger adjustments.
- **Training Configuration Settings**: Adjust batch size increment, learning rate adjustments, and more.
- Tailor the optimization process to fit various setups, especially on edge devices with limited resources.
- **Fixed Pruning Strategy**: Pruning is applied with a constant ratio and stripped at the end to improve deployment efficiency.

## Release Notes for v0.2.0
This version introduces a **refined adaptive training strategy with a constant pruning ratio**. Key updates:

## Release Notes
Version: 0.1.1-alpha
- Fixed circular import issue in `create_model.py`. Now users should not encounter import errors during initialization.
- **Score Calculation**: This version now computes an **accuracy score** and a **memory score** based on resource usage and model performance.
- **Parameter Prioritization**: Accuracy and memory scores are weighted according to default or user-defined priority weighting schemes to identify a priority order for parameter adjustment. Only the top-priority parameter is adjusted in each epoch.
  - **Batch size priority** is weighted by memory usage.
  - **Learning rate priority** is inversely weighted by accuracy improvement (i.e., it increases if accuracy stagnates).
- **Fixed Pruning Ratio**: Pruning is applied at a constant ratio and stripped at the end of training.
- **Code Quality Improvements**: Added pre-commit hooks and CI linting for consistency.

## Installation
You can install the latest version of EdgeTrain via pip:

```bash
pip install https://github.com/BradleyEdelman/EdgeTrain/releases/download/v0.1.1-alpha/edgetrain-0.1.1a0.tar.gz
pip install https://github.com/BradleyEdelman/EdgeTrain/releases/download/v0.2.0-alpha/edgetrain-0.2.0.tar.gz
```

Alternatively, clone the repository and install manually:
@@ -45,45 +49,66 @@ git clone https://github.com/BradleyEdelman/edgetrain.git

# Checkout the desired version
cd edgetrain
git checkout tags/v0.1.1-alpha
git checkout tags/v0.2.0

# Install the package
pip install .
```

## Usage
## Usage Example
To use EdgeTrain, simply import the package and configure your training environment. Below is an example of using EdgeTrain with a TensorFlow model:
```python
# Import library
import edgetrain

# Example of resource monitoring and training with dynamic adjustments
train_dataset = {'images': train_images, 'labels': train_labels}
history = edgetrain.dynamic_train(train_dataset, epochs=10, batch_size=32, lr=1e-3, log_file="resource_log.csv", dynamic_adjustments=True)
final_model, history = edgetrain.dynamic_train(
    train_dataset,
    epochs=10,
    batch_size=32,
    lr=1e-3,
    log_file="resource_log.csv",
    dynamic_adjustments=True
)

# Plot resource usage, parameter scoring and prioritization, and parameter values over time
edgetrain.log_usage_plot("resource_log.csv")
```

## File Tree
```
EdgeTrain/
├── edgetrain/
│   ├── __init__.py
│   ├── adjust_train_parameters.py
│   ├── calculate_priorities.py
│   ├── calculate_scores.py
│   ├── create_model.py
│   ├── dynamic_train.py
│   ├── edgetrain_folder
│   ├── resource_adjust.py
│   ├── edgetrain_folder.py
│   ├── resource_monitor.py
│   ├── train_visualize.py
│   └── train_visualize.py
├── notebooks/
│   └── EdgeTrain_example.ipynb
├── tests/
│   ├── __init__.py
│   ├── test_adjust_batch_size.py
│   ├── test_adjust_learning_rate.py
│   ├── test_adjust_train_parameters.py
│   ├── test_calculate_priorities.py
│   ├── test_calculate_scores.py
│   ├── test_create_model_tf.py
│   ├── test_log_usage_once.py
│   ├── test_sys_resources.py
│   ├── test_dynamic_train.py
├── example_notebooks/
│   ├── EdgeTrain_example.ipynb
│   └── test_sys_resources.py
├── .github/workflows/
│   ├── ci.yml
│   └── lint.yml
├── .flake8
├── .gitignore
├── .isort.cfg
├── .pre-commit-config.yaml
├── CHANGELOG.md
├── LICENSE
├── README.md
@@ -93,13 +118,17 @@ EdgeTrain/
```

## Contributions
You can contribute by:
Contributions are welcome:
- Reporting bugs or requesting features: [GitHub Issues](https://github.com/BradleyEdelman/edgetrain/issues)
- Improving documentation: help refine explanations and add examples
- Testing: try EdgeTrain with more complex models and datasets in heavily resource-constrained environments


## License
This project is licensed under the MIT License - see the LICENSE file for details.

## Known Limitations (Alpha)
- The package currently supports TensorFlow only. Support for other frameworks, especially lightweight ones is planned for future releases.
- Model pruning and quantization are future features.
- Resource usage thresholds for dynamic adjustments are in the initial phase and may require tuning based on the training setup.

## Known Limitations (v0.2.0)
- Currently supports **TensorFlow only**. Future updates will expand framework support.
- **Gradient accumulation**: Planned for a future release to further optimize memory usage.
- **Resource usage thresholds** are still in an experimental phase and may require fine-tuning.
21 changes: 15 additions & 6 deletions edgetrain/__init__.py
@@ -1,6 +1,15 @@
from .resource_monitor import sys_resources, log_usage_once
from .resource_adjust import adjust_threads, adjust_batch_size, adjust_grad_accum, adjust_learning_rate
from .edgetrain_folder import get_edgetrain_folder
from .train_visualize import log_usage_plot, log_train_time, training_history_plot
from .create_model import create_model_tf, create_model_torch
from .dynamic_train import dynamic_train
__all__ = [
    "adjust_training_parameters",
    "define_priorities",
    "compute_scores",
    "normalize_scores",
    "check_sparsity",
    "create_model_tf",
    "dynamic_train",
    "get_edgetrain_folder",
    "log_usage_once",
    "sys_resources",
    "log_train_time",
    "log_usage_plot",
    "training_history_plot",
]
53 changes: 53 additions & 0 deletions edgetrain/adjust_train_parameters.py
@@ -0,0 +1,53 @@
from edgetrain import sys_resources


def adjust_training_parameters(
    priority_values, batch_size, lr, accuracy_score, resources=None
):
    """
    Adjust the training parameters (batch size, learning rate) based on the highest priority score,
    moving parameters in the opposite direction if resource usage or accuracy trends improve.

    Parameters:
    - priority_values (dict): Dictionary containing priority scores for batch size and learning rate.
    - batch_size (int): Current batch size.
    - lr (float): Current learning rate.
    - accuracy_score (float): Current accuracy score from the latest epoch (0-1).
    - resources (dict, optional): Precomputed resource usage; queried via sys_resources() when omitted.

    Returns:
    - adjusted_batch_size (int): Adjusted batch size.
    - adjusted_lr (float): Adjusted learning rate.
    """

    # Get system resources
    if resources is None:
        resources = sys_resources()

    # Determine which parameter has the highest priority score
    highest_priority = max(priority_values, key=priority_values.get)

    # Adjust the parameter based on system resources and highest priority score
    if highest_priority == "batch_size":
        # Adjust batch size based on memory usage
        if resources["cpu_memory_percent"] > 75 or resources["gpu_memory_percent"] > 75:
            adjusted_batch_size = max(16, batch_size // 2)  # Halve batch size
        elif (
            resources["cpu_memory_percent"] < 50
            and resources["gpu_memory_percent"] < 50
        ):
            adjusted_batch_size = min(128, batch_size * 2)  # Double batch size
        else:
            adjusted_batch_size = batch_size
        adjusted_lr = lr

    elif highest_priority == "learning_rate":
        # Adjust learning rate based on accuracy score
        if accuracy_score < 0.05:  # Example threshold for low accuracy
            adjusted_lr = max(1e-5, lr * 0.5)  # Reduce learning rate
        elif accuracy_score > 0.95:  # Example threshold for high accuracy
            adjusted_lr = min(1e-2, lr * 1.2)  # Slightly increase learning rate
        else:
            adjusted_lr = lr
        adjusted_batch_size = batch_size

    else:
        # Fall back to current values so an unexpected key cannot leave
        # the return variables unbound
        adjusted_batch_size, adjusted_lr = batch_size, lr

    return adjusted_batch_size, adjusted_lr
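A hypothetical usage sketch of the memory-pressure branch. The stand-in function below condenses the batch-size logic above so the snippet runs standalone; the priority and resource numbers are illustrative, not produced by the package.

```python
def adjust_params_sketch(priority_values, batch_size, lr, resources):
    # Condensed stand-in for the batch-size branch of adjust_training_parameters
    if max(priority_values, key=priority_values.get) == "batch_size":
        if resources["cpu_memory_percent"] > 75 or resources["gpu_memory_percent"] > 75:
            return max(16, batch_size // 2), lr  # halve under memory pressure
        if resources["cpu_memory_percent"] < 50 and resources["gpu_memory_percent"] < 50:
            return min(128, batch_size * 2), lr  # double when headroom is ample
    return batch_size, lr


priorities = {"batch_size": 0.32, "learning_rate": 0.18}
resources = {"cpu_memory_percent": 80, "gpu_memory_percent": 40}
new_bs, new_lr = adjust_params_sketch(priorities, batch_size=64, lr=1e-3, resources=resources)
# CPU memory at 80% trips the >75% threshold, so the batch size is halved to 32
# while the learning rate is left unchanged.
```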
33 changes: 33 additions & 0 deletions edgetrain/calculate_priorities.py
@@ -0,0 +1,33 @@
def define_priorities(normalized_scores, user_priorities=None):
    """
    Calculate priority scores for adjustments based on resource usage and accuracy.

    Parameters:
    - normalized_scores (dict): Dictionary containing normalized scores for memory usage and accuracy.
      - memory_score (float): Score indicating memory usage pressure (0-100).
      - accuracy_score (float): Score indicating stagnation in accuracy improvement (0-1).
    - user_priorities (dict, optional): Optional user-defined priorities for resource conservation and accuracy improvement.

    Returns:
    - priority_value (dict): A dictionary of priority scores for batch size and learning rate.
    """

    # Default weights if user priorities are not provided
    default_priorities = {
        "batch_size_adjustment": 0.4,
        "accuracy_improvement": 0.6,
    }

    # Use user-defined priorities if available
    priorities = user_priorities if user_priorities else default_priorities

    # Calculate weighted priority scores
    priority_value = {
        "batch_size": priorities["batch_size_adjustment"]
        * normalized_scores.get("memory_score"),
        "learning_rate": (
            priorities["accuracy_improvement"] * normalized_scores.get("accuracy_score")
        ),
    }

    return priority_value
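A quick worked example of the default weighting, inlined so it runs standalone (the score values here are illustrative, not computed by the package):

```python
# Default weights from define_priorities; example normalized scores.
priorities = {"batch_size_adjustment": 0.4, "accuracy_improvement": 0.6}
normalized_scores = {"memory_score": 0.8, "accuracy_score": 0.3}

# Same weighted products as in define_priorities above.
priority_value = {
    "batch_size": priorities["batch_size_adjustment"] * normalized_scores["memory_score"],
    "learning_rate": priorities["accuracy_improvement"] * normalized_scores["accuracy_score"],
}

# batch_size priority (~0.32) beats learning_rate priority (~0.18), so the
# batch size is the single parameter adjusted this epoch.
top = max(priority_value, key=priority_value.get)
```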