This repository is a companion page for the following research, submitted for revision at the 9th International Workshop on Green and Sustainable Software (GREENS’25):
Authors Blinded for Review. 2025. Energy, Emissions and Performance: Cross-Language and Cross-Algorithm Analysis in Machine Learning. Submitted for revision at the 9th International Workshop on Green and Sustainable Software (GREENS’25).
- Conda (for managing environments)
- MATLAB (with Statistics and Machine Learning Toolbox)
- Java SDK
- R
- pandoc (to convert .Rmd files to HTML; e.g. via Homebrew: brew install pandoc)
- g++ or gcc (or whichever C++ compiler your system uses)
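As a quick sanity check for the list above, the following sketch reports which prerequisite executables are not on your PATH. The executable names are assumptions about a typical installation (for example, `Rscript` for R) and may differ on your system. The script cannot verify that the MATLAB Statistics and Machine Learning Toolbox is installed.

```python
import shutil

# Executable names are assumptions about a typical install; adjust as needed.
REQUIRED_TOOLS = ["conda", "matlab", "java", "Rscript", "pandoc", "g++"]


def find_missing(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisite executables were found on PATH.")
```

A tool reported as missing may still be installed in a location that is not on PATH, so treat the output as a hint rather than a verdict.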
- Make sure you have installed the prerequisites.
- Clone the repository:
git clone https://github.com/Pampaj7/SWAM.git
cd SWAM
- Update the Python interpreter path in the wrapper for every language:
  - Run `which python` in the terminal and copy the resulting path.
  - `rRunner.R`: put your path in `use_python()`.
  - `runAlgorithm.m`: put your path in `pyenv()`.
  - `PythonHandler.java`: put your path in the first parameter of `new ProcessBuilder()`.
  - `pythonLinker.cpp`: put your path in `add_to_sys_path_py()`.
- Ensure you have created conda environments named `cpp` and `sw`; see `allRunner.sh` for more info.
- Navigate to the src directory of the project using your terminal:
cd ../SWAM/src
- Run the execution script:
./allRunner.sh
- Note: this step takes a long time because it runs 30 measurements for every combination. To speed it up, reduce the number of measurements; to skip it entirely, copy our measurements.
- If all prerequisites are met, the program will begin executing. If not, cd into the specific language folder and make sure its dependencies are installed; some languages, such as C++, require system-level libraries.
- Upon completion, a file named `raw_merged_emissions.csv` will be generated in the `data` folder, containing:
  - 30 rows for each unique combination of dataset, algorithm, programming language, and phase (training or testing).
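The 30-rows-per-combination layout described above can be checked with a short script. This is a minimal sketch using only the standard library. The column names `dataset`, `algorithm`, `language`, and `phase` are hypothetical and should be adjusted to the actual header of the CSV.

```python
import csv
from collections import Counter

# Hypothetical column names; adjust to match the actual CSV header.
KEY_COLUMNS = ("dataset", "algorithm", "language", "phase")
EXPECTED_RUNS = 30


def count_runs(rows, key_columns=KEY_COLUMNS):
    """Count rows per unique combination of the key columns."""
    return Counter(tuple(row[col] for col in key_columns) for row in rows)


def incomplete_combinations(rows, expected=EXPECTED_RUNS):
    """Return combinations whose row count differs from the expected number."""
    return {combo: n for combo, n in count_runs(rows).items() if n != expected}


def check_file(path):
    """Read a CSV file and report combinations with an unexpected row count."""
    with open(path, newline="") as handle:
        return incomplete_combinations(csv.DictReader(handle))
```

Calling `check_file("data/raw_merged_emissions.csv")` from the repository root returns an empty dictionary when every combination has exactly 30 rows. If you changed the number of measurements, pass the new value through `expected`.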
After the program completes its execution, you will find all the generated plots in the graphics folder. Additionally, the data folder will contain all the datasets used to create those plots.
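For the interpreter-path step in the setup instructions, an alternative to `which python` is to ask Python itself. The sketch below prints the path of the interpreter that runs it. The helper name `interpreter_path` is illustrative and not part of the repository.

```python
import sys


def interpreter_path():
    """Return the path of the Python interpreter running this script."""
    return sys.executable


if __name__ == "__main__":
    print(interpreter_path())
```

Run it inside the activated conda environment you intend to use, so that the printed path points at that environment's interpreter.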
This is the root directory of the repository. The directory is structured as follows:
.
├── README.md
├── data
│   └── mean_emissions.csv    Final experimental results data
├── plots                     Folder containing plots
├── requirements.txt
└── src
    ├── cpp                   Folder containing cpp files
    ├── java                  Folder containing java files
    ├── matlab                Folder containing matlab files
    ├── processedDataset      Folder containing processedDataset files
    ├── python                Folder containing python files
    ├── R                     Folder containing R files
    └── Utils                 Folder containing a .py used to preprocess the datasets
Java version "20.0.1" 2023-04-18
Gcc version Apple clang version 16.0.0 (clang-1600.0.26.4)
Maven version Apache Maven 3.9.9
Python version 3.10.14
R version 4.3.2
MATLAB version 24.1.0.2653294 (R2024a)