MPCStats
diff --git a/‎mpcstats/.gitignore‎
Lines changed: 5 additions & 0 deletions b/‎mpcstats/.gitignore‎
diff --git a/‎mpcstats/README.md‎
Lines changed: 45 additions & 0 deletions b/‎mpcstats/README.md‎
diff --git a/‎mpcstats/benchmark/.gitignore‎
Lines changed: 9 additions & 0 deletions b/‎mpcstats/benchmark/.gitignore‎
diff --git a/‎mpcstats/benchmark/Makefile‎
Lines changed: 9 additions & 0 deletions b/‎mpcstats/benchmark/Makefile‎
diff --git a/‎mpcstats/benchmark/README.md‎
Lines changed: 103 additions & 0 deletions b/‎mpcstats/benchmark/README.md‎
diff --git a/‎mpcstats/benchmark/benchmarker.py‎
Lines changed: 202 additions & 0 deletions b/‎mpcstats/benchmark/benchmarker.py‎
@@ -0,0 +1,5 @@
+Player-Data/
+logs/
+Programs/
+!Programs/Circuits/
+!Programs/Source/
@@ -0,0 +1,45 @@
+# MPCStats Library
+
+This library lets users write simple Python to compute statistical functions using MPC without interacting with MP-SPDZ directly. See `main.py` for an example; all implemented statistics functions are in `mpcstats_lib.py`.
+
+## Installation
+
+Clone the repo.
+
+```bash
+git clone https://github.com/ZKStats/MP-SPDZ
+cd MP-SPDZ
+```
+
+Install dependencies.
+
+```bash
+make setup
+```
+
+Build the MPC VM for the `semi` protocol.
+
+```bash
+make -j8 semi-party.x
+# Make sure `semi-party.x` exists
+ls semi-party.x
+```
+
+If you're on macOS and see the following linker warning, you can safely ignore it:
+
+```bash
+ld: warning: search path '/usr/local/opt/openssl/lib' not found
+```
+
+## Run Example
+
+```bash
+cd mpcstats
+python main.py
+```
+
+This example shows how data can be manipulated using MPCStats functions. Most of the details are explained in comments in the code.
+
+## Implementation
+
+The statistics operations are implemented in [mpcstats_lib.py](./mpcstats_lib.py). We may add more supported functions in the future, and PRs are welcome!
@@ -0,0 +1,9 @@
+Player-Data/
+Programs/
+logs/
+!Programs/Circuits/
+!Programs/Source/
+computation_defs/*
+!computation_defs/templates/
+*.csv
+HOSTS
@@ -0,0 +1,9 @@
+.PHONY: clean
+
+clean:
+	rm -rf Player-Data
+	rm -rf Programs
+	rm -rf Public-Input
+	rm -rf Schedules
+	rm -rf logs
+
@@ -0,0 +1,103 @@
+# MPCStats Benchmarking
+
+## Preparation
+
+### Protocol definitions
+First specify which protocols to use for benchmarking in:
+`./protocols.py`
+
+For each protocol used, the associated `.x` file needs to be built. To do this, run `./gen_vms.py`.
+
+### Datasets for benchmarking
+Next create the datasets to be used for benchmarking in `./datasets`.
+The datasets should be in CSV format without a header line.
+Datasets whose names start with '_' are ignored.
+
+#### Dataset generation script
+You can use `./gen_dataset.py` to randomly generate datasets for benchmarking purposes.
+
+### Computation definitions
+Lastly, define the computations to be benchmarked in `./computation_defs/templates`.
+
+Executing `./gen_comp_defs.py` creates computation definition instances for every dataset and template combination in the `./computation_defs` directory.
+
+Computation definitions whose names start with '_' are ignored.
+
+### Setting up SSL
+On the party 0 host, in the `mpcstats/benchmark` directory, run:
+
+```
+../../Scripts/setup-ssl.sh 3
+```
+
+Then, copy `Player-Data/P{0,1,2}.{pem,key}` to the other party hosts as explained below.
+
+```
+The certificates must be the same on every host. Also make sure that they are still valid. Certificates generated with `Scripts/setup-ssl.sh` expire after a month.
+```
+
+```bash
+scp pse-eu:'MP-SPDZ/mpcstats/benchmark/Player-Data/*.pem' .
+scp pse-eu:'MP-SPDZ/mpcstats/benchmark/Player-Data/*.key' .
+scp *.pem pse-us:MP-SPDZ/mpcstats/benchmark/Player-Data
+scp *.key pse-us:MP-SPDZ/mpcstats/benchmark/Player-Data
+scp *.pem pse-asia:MP-SPDZ/mpcstats/benchmark/Player-Data
+scp *.key pse-asia:MP-SPDZ/mpcstats/benchmark/Player-Data
+```
+
+Finally, run `c_rehash` on the machines to which the pem/key files were copied:
+
+```
+c_rehash MP-SPDZ/mpcstats/benchmark/Player-Data
+```
+
+
+## Running the benchmark
+Run `./driver.py [scenario ID]` to execute the benchmarks and write the results to stdout as CSV.
+
+To get the list of scenario IDs, run:
+
+```
+./driver.py -h
+```
+
+### Setting up a remote machine
+Assuming an Ubuntu 24.04, x86, 64-bit instance:
+
+- Install necessary libraries
+```
+sudo apt update
+sudo apt-get install -y automake build-essential clang cmake git libboost-all-dev libgmp-dev libntl-dev libsodium-dev libssl-dev libtool python3
+```
+
+- Install MP-SPDZ
+```
+git clone https://github.com/exfinen/MP-SPDZ.git
+cd MP-SPDZ
+git checkout benchmarker
+```
+
+- Copy `*.so` files
+Copy `MP-SPDZ/libFHE.so` and `MP-SPDZ/libSPDZ.so` to the new remote machine
+
+- Copy `*.x` files
+Copy `MP-SPDZ/*.x` to the new remote machine
+
+- Add `*.so` files to the library search path
+Add `export LD_LIBRARY_PATH=<MP-SPDZ directory>` to `.bashrc` or a similar configuration file
+
+### Preparing HOSTS file
+Create `MP-SPDZ/HOSTS` with the following contents:
+
+```
+<Party 0 IP>
+<Party 1 IP>
+<Party 2 IP>
+...
+```
+
+Also make sure that you use the correct party number on each party machine.
+
+For example, if you use party number 1 on the party-2 machine, MPC will not function. Each party has an assigned port number, which is the base port number plus the party number.
+
+In that case, with the default base port number of 5000, party 2 will try to connect to itself using port 5001, but the VM on the party-2 machine is listening on port 5002.
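The port rule described at the end of the benchmark README can be illustrated with a short sketch. It relies only on what the README states (a party's port is the base port plus its party number, with 5000 as the default base) plus one assumption: peers reach party `i` at line `i` of `HOSTS`. The names `listen_port` and `find_mismatches` are hypothetical and are not part of MP-SPDZ or this PR.

```python
DEFAULT_BASE_PORT = 5000  # default base port mentioned in the README


def listen_port(party: int, base_port: int = DEFAULT_BASE_PORT) -> int:
    """Port assigned to a party: base port number + party number."""
    if party < 0:
        raise ValueError("party number must be non-negative")
    return base_port + party


def find_mismatches(
    hosts: list[str],
    started_as: dict[str, int],
    base_port: int = DEFAULT_BASE_PORT,
) -> list[str]:
    """Report hosts whose VM was started with the wrong party number.

    hosts:      lines of the HOSTS file in order (index = party number).
    started_as: host -> party number actually passed when starting the VM.
    """
    problems = []
    for expected_party, host in enumerate(hosts):
        actual_party = started_as[host]
        if actual_party != expected_party:
            problems.append(
                f"{host}: peers use port {listen_port(expected_party, base_port)}, "
                f"but the VM was started for port {listen_port(actual_party, base_port)}"
            )
    return problems
```

A check along these lines could be run before starting a remote benchmark, so that a mistyped party number shows up as a readable message rather than as a connection that never completes.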
@@ -0,0 +1,202 @@
+#!/usr/bin/env python3
+
+from pathlib import Path
+repo_root = Path(__file__).parent.parent.parent
+
+import sys
+sys.path.append(str(repo_root))
+sys.path.append(f'{repo_root}/mpcstats')
+
+from common_lib import benchmark_dir, mpcstats_dir
+
+import argparse
+import subprocess
+import os
+import time
+from typing import List, Literal, Optional
+import json
+from common_lib import read_script 
+from constants import MAX_MEM_USAGE_KB, EXEC_TIME_SEC
+
+# 1. rss (Resident Set Size):
+#   - Measures the physical memory the process is currently using (in RAM).
+#   - This is typically the most relevant for benchmarking how much memory a process is actively using.
+#   - Best for benchmarking real memory usage because it shows the actual RAM being consumed by the process.
+# 2. vsz (Virtual Memory Size):
+#   - Measures the total virtual memory the process is using, including memory that has been swapped out, memory that has been mapped but not used, and shared memory.
+#   - Less relevant for benchmarking real memory usage because it includes parts of memory that aren’t physically loaded into RAM, such as mapped files and swapped-out memory.
+ 
+# TODO generate list from type definition
+MemoryFieldsType = Literal['rss', 'vsz']
+MemoryFields = ['rss', 'vsz']
+
+os.environ['PATH'] += os.pathsep + str(benchmark_dir)
+
+def parse_args():
+    parser = argparse.ArgumentParser(description="Benchmarking Script")
+    parser.add_argument(
+        'protocol',
+        type=str, 
+        help='MPC protocol',
+    )
+    parser.add_argument(
+        'num_parties',
+        type=int,
+        help='Number of participating parties',
+    )
+    parser.add_argument(
+        '--name',
+        type=str,
+        default='computation',
+        help='Name of the computation',
+    )
+    parser.add_argument(
+        '--mem-field',
+        type=str, 
+        choices=MemoryFields,
+        default=MemoryFields[0],
+        help='ps command field to retrieve memory usage',
+    )
+    parser.add_argument(
+        '--edabit',
+        action='store_true',
+        help='Use edaBit',
+    )
+    parser.add_argument(
+        '--mem-get-sleep',
+        type=float, 
+        default=1,
+        help='Time interval (in seconds) to sleep between memory retrievals for execution'
+    )
+    parser.add_argument(
+        '--file',
+        type=str,
+        help='Computation definition file. If not specified, the definition will be read from stdin',
+    )
+    parser.add_argument(
+        '--comp-args',
+        type=str,
+        help='Arguments for `compile.py`',
+    )
+    parser.add_argument(
+        '--remote',
+        type=int,
+        help='Party number in remote execution',
+    )
+    parser.add_argument(
+        '--verbose-compiler',
+        action='store_true',
+        help='Show output from internally called scripts',
+    )
+    parser.add_argument(
+        '--verbose-vm',
+        action='store_true',
+        help='Show output from vm',
+    )
+    return parser.parse_args()
+
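+def exec_ps(pid: int, field: MemoryFieldsType) -> int:
+    # `ps` is only available on POSIX systems; fail loudly instead of returning None
+    if os.name != 'posix':
+        raise NotImplementedError('memory retrieval via `ps` requires a POSIX system')
+    res = subprocess.run(
+        ['ps', '-o', f'{field}=', '-p', str(pid)],
+        stdout=subprocess.PIPE,
+    )
+    return int(res.stdout.decode().strip())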
+
+def gen_compile_cmd(args: argparse.Namespace) -> list[str]:
+    compile_script = benchmark_dir / 'compile.py'
+    opts = []
+    if args.comp_args is not None:
+        opts.extend(args.comp_args.split())
+    if args.name:
+        opts.extend(['--name', args.name])
+    if args.file:
+        opts.extend(['--file', args.file])
+    if args.edabit:
+        opts.append('--edabit')
+    if args.verbose_compiler:
+        opts.append('--verbose')
+
+    return [compile_script] + opts
+
+def gen_executor_cmd(args: argparse.Namespace) -> list[str]:
+    executor_script = benchmark_dir / 'executor.py'
+    opts = []
+    if args.name:
+        opts.extend(['--name', args.name])
+    if args.file:
+        opts.extend(['--file', args.file])
+    if args.remote is not None:
+        opts.extend(['--remote', str(args.remote)])
+    if args.verbose_vm:
+        opts.append('--verbose')
+ 
+    return [executor_script, args.protocol, str(args.num_parties)] + opts
+
+def monitor_mem_usage(proc: subprocess.Popen, mem_field: str, mem_get_sleep: float) -> int:
+    max_mem_usage = 0
+    while proc.poll() is None:  # While the process is running
+        ps_output = subprocess.run(['ps', '-p', str(proc.pid), '-o', f'{mem_field}='], capture_output=True, text=True)
+        # Defensive guard: skip the sample if ps printed nothing (e.g. it could not find the process)
+        sample = ps_output.stdout.strip()
+        if sample:
+            max_mem_usage = max(max_mem_usage, int(sample))
+        time.sleep(mem_get_sleep)
+
+    return max_mem_usage
+
+def read_proc_stdout(proc: subprocess.Popen) -> list[str]:
+    lines = []
+    while True:
+        line = proc.stdout.readline()
+        if not line:
+            break
+        lines.append(line.strip())
+    proc.wait()
+    return lines
+
+def exec_cmd(cmd: list[str], computation_script: str, mem_field: str, mem_get_sleep: float, verbose: bool) -> object:
+    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+    proc.stdin.write(computation_script.encode())
+    proc.stdin.close()
+
+    lines = []
+    beg_time = time.time()
+    try:
+        max_mem_usage = monitor_mem_usage(proc, mem_field, mem_get_sleep) 
+        lines = read_proc_stdout(proc)
+        if len(lines) == 0:
+            return {}
+
+        exec_time = time.time() - beg_time
+
+        script = os.path.splitext(cmd[0].name)[0]
+
+        # print out the script output excluding the last line
+        if verbose:
+            other_lines = [line.decode('utf-8') for line in lines[:-1]]
+            print('\n'.join(other_lines))
+
+        # parse the last line to a json object
+        res = json.loads(lines[-1].decode('utf-8'))
+
+        res[f'{script}_{MAX_MEM_USAGE_KB}'] = max_mem_usage
+        res[f'{script}_{EXEC_TIME_SEC}'] = exec_time
+        return res
+    
+    except Exception as e:
+        print(f'Error occurred while monitoring subprocess: {e}\n{lines}')
+        proc.terminate()
+        raise    
+
+args = parse_args()
+
+# read computation script from file or stdin
+computation_script = read_script(open(args.file) if args.file else None)
+
+# execute compile script
+compile_result = exec_cmd(gen_compile_cmd(args), computation_script, args.mem_field, 0.1, args.verbose_compiler)
+
+# execute executor script
+executor_result = exec_cmd(gen_executor_cmd(args), computation_script, args.mem_field, args.mem_get_sleep, args.verbose_vm)
+
+print(json.dumps([compile_result, executor_result]), end='')
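One thing worth considering in `exec_cmd`: it samples memory until the child exits and only then reads the child's stdout. Because stdout and stderr are pipes, a child that writes more than the OS pipe buffer can hold may block before it exits. A possible variation, sketched below under assumptions, drains stdout on a background thread while sampling. `run_and_sample` and its `sample` callback are hypothetical names that do not exist in this PR; the callback stands in for the `ps` call, and stderr is discarded to keep the sketch short.

```python
import subprocess
import sys
import threading
import time
from typing import Callable


def run_and_sample(
    cmd: list[str],
    stdin_data: bytes,
    sample: Callable[[int], int],
    interval: float = 0.1,
) -> tuple[list[str], int]:
    """Run cmd, feed it stdin_data, and return (stdout lines, peak sample)."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # simplification: stderr is discarded here
    )
    lines: list[str] = []

    def drain() -> None:
        # Read until EOF so the child never blocks on a full stdout pipe.
        for raw in proc.stdout:
            lines.append(raw.decode("utf-8").strip())

    reader = threading.Thread(target=drain, daemon=True)
    reader.start()

    try:
        proc.stdin.write(stdin_data)
        proc.stdin.close()
    except BrokenPipeError:
        pass  # the child exited without reading its stdin

    peak = 0
    while proc.poll() is None:  # sample only while the child is running
        peak = max(peak, sample(proc.pid))
        time.sleep(interval)

    reader.join()  # returns once the reader thread has seen EOF
    proc.stdout.close()
    return lines, peak


if __name__ == "__main__":
    # Hypothetical child that prints a single JSON line as its last output.
    child = [sys.executable, "-c", "print('{\"ok\": true}')"]
    out, peak = run_and_sample(child, b"", sample=lambda pid: 0)
    print(out, peak)
```

Passing the sampler in as a callback keeps the process handling separate from the `ps` invocation, which also makes the loop easy to exercise with a fake sampler.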