This repository was archived by the owner on Mar 25, 2025. It is now read-only.

Commit 4034d2a

Merge pull request #24 from exfinen/mpcstats-again
wip
2 parents 85723e8 + dd88a37 commit 4034d2a

40 files changed

+2893
-0
lines changed

mpcstats/.gitignore

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
Player-Data/
logs/
Programs/
!Programs/Circuits/
!Programs/Source/

mpcstats/README.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# MPCStats Library

This library lets users write simple Python to compute statistics functions using MPC without interacting with MP-SPDZ directly. See `main.py` for an example and `mpcstats_lib.py` for all implemented statistics functions.

## Installation

Clone the repo.

```bash
git clone https://github.com/ZKStats/MP-SPDZ
cd MP-SPDZ
```

Install dependencies.

```bash
make setup
```

Build the MPC VM for the `semi` protocol.

```bash
make -j8 semi-party.x
# Make sure `semi-party.x` exists
ls semi-party.x
```

If you're on macOS and see the following linker warning, you can safely ignore it:

```bash
ld: warning: search path '/usr/local/opt/openssl/lib' not found
```

## Run Example

```bash
cd mpcstats
python main.py
```

This example shows how data can be manipulated using MPCStats functions. Most steps are explained in the code comments.

## Implementation

The statistics operations are implemented in [mpcstats_lib.py](./mpcstats_lib.py). We may add more supported functions in the future; PRs are welcome!

mpcstats/benchmark/.gitignore

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
Player-Data/
Programs/
logs/
!Programs/Circuits/
!Programs/Source/
computation_defs/*
!computation_defs/templates/
*.csv
HOSTS

mpcstats/benchmark/Makefile

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
.PHONY: clean

clean:
	rm -rf Player-Data
	rm -rf Programs
	rm -rf Public-Input
	rm -rf Schedules
	rm -rf logs

mpcstats/benchmark/README.md

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
# MPCStats Benchmarking

## Preparation

### Protocol definitions
First, specify which protocols to use for benchmarking in `./protocols.py`.

For each protocol used, the associated `.x` file needs to be built. To do this, run `./gen_vms.py`.

### Datasets for benchmarking
Next, create the datasets to be used for benchmarking in `./datasets`.
The datasets should be in CSV format without a header line.
Datasets whose names start with '_' are ignored.
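
The dataset rules above can be sketched in a few lines of Python. This is only an illustration: `list_datasets` and `read_dataset` are hypothetical helpers, not functions from this repository, and the file names are made up.

```python
import csv
import tempfile
from pathlib import Path


def list_datasets(datasets_dir: Path) -> list[Path]:
    """Return CSV files in datasets_dir, skipping names that start with '_'."""
    return sorted(
        p for p in datasets_dir.glob('*.csv')
        if not p.name.startswith('_')
    )


def read_dataset(path: Path) -> list[list[str]]:
    """Read a header-less CSV: every row, including the first, is data."""
    with path.open(newline='') as f:
        return list(csv.reader(f))


# Demo on a throwaway directory with one active and one ignored dataset
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    (tmp_dir / 'small.csv').write_text('1,2\n3,4\n')
    (tmp_dir / '_draft.csv').write_text('9,9\n')
    found = list_datasets(tmp_dir)
    dataset_names = [p.name for p in found]
    rows = read_dataset(found[0])

print(dataset_names)
print(rows)
```

Because there is no header line, the first row is returned as ordinary data.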

#### Dataset generation script
You can use `./gen_dataset.py` to randomly generate datasets for benchmarking purposes.
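
For illustration, a random header-less CSV could be produced roughly as follows. This is a sketch under assumptions and not the implementation of `gen_dataset.py`: the function name `write_random_dataset`, its parameters, and the value range 0 to 100 are all hypothetical.

```python
import csv
import random
import tempfile
from pathlib import Path


def write_random_dataset(path: Path, num_rows: int, num_cols: int, seed: int = 0) -> None:
    """Write num_rows x num_cols random integers as CSV, with no header line."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    with path.open('w', newline='') as f:
        writer = csv.writer(f)
        for _ in range(num_rows):
            writer.writerow([rng.randint(0, 100) for _ in range(num_cols)])


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / 'random_10x2.csv'
    write_random_dataset(out, num_rows=10, num_cols=2)
    generated = out.read_text().splitlines()

print(len(generated), 'rows written')
```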

### Computation definitions
Lastly, define the computations to be benchmarked in `./computation_defs/templates`.

Executing `./gen_comp_defs.py` creates computation definition instances for all dataset and template combinations in the `./computation_defs` directory.

Computation definitions whose names start with '_' are ignored.
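
The "all dataset and template combinations" step is a Cartesian product. The sketch below shows the idea only; `instance_names`, the `<template>_<dataset>` naming scheme, and the assumption that the '_' rule is applied to template names are illustrative and not taken from `gen_comp_defs.py`.

```python
from itertools import product


def instance_names(datasets: list[str], templates: list[str]) -> list[str]:
    """Pair every dataset with every template, skipping '_'-prefixed names."""
    active_datasets = [d for d in datasets if not d.startswith('_')]
    active_templates = [t for t in templates if not t.startswith('_')]
    # The '<template>_<dataset>' naming is an illustrative assumption
    return [f'{t}_{d}' for d, t in product(active_datasets, active_templates)]


combos = instance_names(['d100', 'd1000', '_tmp'], ['mean', 'median'])
print(combos)
```

With two active datasets and two templates, four instances are produced, so the number of benchmark runs grows multiplicatively with either list.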

### Setting up SSL
On the party 0 host, in the `mpcstats/benchmark` directory, run:

```
../../Scripts/setup-ssl.sh 3
```

Then, copy `Player-Data/P{0,1,2}.{pem,key}` to the other party hosts as explained below.

```
The certificates should be the same on every host. Also make sure that they are still valid. Certificates generated with `Scripts/setup-ssl.sh` expire after a month.
```

```bash
scp pse-eu:'MP-SPDZ/mpcstats/benchmark/Player-Data/*.pem' .
scp pse-eu:'MP-SPDZ/mpcstats/benchmark/Player-Data/*.key' .
scp *.pem pse-us:MP-SPDZ/mpcstats/benchmark/Player-Data
scp *.key pse-us:MP-SPDZ/mpcstats/benchmark/Player-Data
scp *.pem pse-asia:MP-SPDZ/mpcstats/benchmark/Player-Data
scp *.key pse-asia:MP-SPDZ/mpcstats/benchmark/Player-Data
```

Finally, call `c_rehash` on the machines to which the pem/key files are copied:

```
c_rehash MP-SPDZ/mpcstats/benchmark/Player-Data
```


## Running the benchmark
Execute `./driver.py [scenario ID]` to run the benchmarks and output the results as CSV to stdout.

To get the list of scenario IDs, run:

```
./driver.py -h
```

### Setting up a remote machine
Assuming an Ubuntu 24.04, x86, 64-bit instance:

- Install necessary libraries
```
sudo apt update
sudo apt-get install -y automake build-essential clang cmake git libboost-all-dev libgmp-dev libntl-dev libsodium-dev libssl-dev libtool python3
```

- Install MP-SPDZ
```
git clone https://github.com/exfinen/MP-SPDZ.git
cd MP-SPDZ
git checkout benchmarker
```

- Copy `*.so` files
Copy `MP-SPDZ/libFHE.so` and `MP-SPDZ/libSPDZ.so` to the new remote machine

- Copy `*.x` files
Copy `MP-SPDZ/*.x` to the new remote machine

- Add `*.so` files to the library search path
Add `export LD_LIBRARY_PATH=<MP-SPDZ directory>` to `.bashrc` or a similar configuration file

### Preparing HOSTS file
Create `MP-SPDZ/HOSTS` with the following contents:

```
<Party 0 IP>
<Party 1 IP>
<Party 2 IP>
...
```

Also make sure that you use the correct party number on each party machine.

For example, if you use party number 1 on the party-2 machine, MPC will not function. Each party has an assigned port number that is the base port number + party number.

In such a case, if the base port number is the default 5000, party 2 will try to connect to itself using port 5001, but the VM is listening on port 5002 on the party-2 machine.
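
The port rule can be made concrete with a small sketch. The helpers `listening_port` and `parse_hosts` are hypothetical and the IP addresses are placeholders; only the "base port + party number" rule and the default base port 5000 come from the text above.

```python
BASE_PORT = 5000  # default base port mentioned above


def listening_port(party_number: int, base_port: int = BASE_PORT) -> int:
    """Port assigned to a party: base port + party number."""
    return base_port + party_number


def parse_hosts(text: str) -> list[str]:
    """One host per line; the line index is the party number."""
    return [line.strip() for line in text.splitlines() if line.strip()]


hosts = parse_hosts('10.0.0.1\n10.0.0.2\n10.0.0.3\n')  # placeholder IPs
endpoints = [(host, listening_port(i)) for i, host in enumerate(hosts)]
print(endpoints)

# Misconfiguration: party number 1 is used on the party-2 machine,
# so the two port numbers involved no longer agree.
expected = listening_port(2)  # port assigned to party 2
actual = listening_port(1)    # port implied by the wrong party number
print(expected, actual, expected == actual)
```

Any mismatch between the party number and the machine's position in `HOSTS` shows up as two different port numbers, which is why the parties fail to connect.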

mpcstats/benchmark/benchmarker.py

Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
#!/usr/bin/env python3

from pathlib import Path
repo_root = Path(__file__).parent.parent.parent

import sys
sys.path.append(str(repo_root))
sys.path.append(f'{repo_root}/mpcstats')

from common_lib import benchmark_dir, mpcstats_dir

import argparse
import subprocess
import os
import time
from typing import List, Literal, Optional
import json
from common_lib import read_script
from constants import MAX_MEM_USAGE_KB, EXEC_TIME_SEC

# 1. rss (Resident Set Size):
#   - Measures the physical memory the process is currently using (in RAM).
#   - This is typically the most relevant for benchmarking how much memory a process is actively using.
#   - Best for benchmarking real memory usage because it shows the actual RAM being consumed by the process.
# 2. vsz (Virtual Memory Size):
#   - Measures the total virtual memory the process is using, including memory that has been swapped out, memory that has been mapped but not used, and shared memory.
#   - Less relevant for benchmarking real memory usage because it includes parts of memory that aren’t physically loaded into RAM, such as mapped files and swapped-out memory.

# TODO generate list from type definition
MemoryFieldsType = Literal['rss', 'vsz']
MemoryFields = ['rss', 'vsz']

os.environ['PATH'] += os.pathsep + str(benchmark_dir)

def parse_args():
    parser = argparse.ArgumentParser(description="Benchmarking Script")
    parser.add_argument(
        'protocol',
        type=str,
        help='MPC protocol',
    )
    parser.add_argument(
        'num_parties',
        type=int,
        help='Number of participating parties',
    )
    parser.add_argument(
        '--name',
        type=str,
        default='computation',
        help='Name of the computation',
    )
    parser.add_argument(
        '--mem-field',
        type=str,
        choices=MemoryFields,
        default=MemoryFields[0],
        help='ps command field to retrieve memory usage',
    )
    parser.add_argument(
        '--edabit',
        action='store_true',
        help='Use edaBit',
    )
    parser.add_argument(
        '--mem-get-sleep',
        type=float,
        default=1,
        help='Time interval (in seconds) to sleep between memory retrievals for execution'
    )
    parser.add_argument(
        '--file',
        type=str,
        help='Computation definition file. If not specified, the definition will be read from stdin',
    )
    parser.add_argument(
        '--comp-args',
        type=str,
        help='Arguments for `compile.py`',
    )
    parser.add_argument(
        '--remote',
        type=int,
        help='Party number in remote execution',
    )
    parser.add_argument(
        '--verbose-compiler',
        action='store_true',
        help='Show output from internally called scripts',
    )
    parser.add_argument(
        '--verbose-vm',
        action='store_true',
        help='Show output from vm',
    )
    return parser.parse_args()

def exec_ps(pid: int, field: MemoryFieldsType) -> Optional[int]:
    # Query a single memory field of the process via `ps` (POSIX only)
    if os.name == 'posix':
        res = subprocess.run(
            ['ps', '-o', f'{field}=', '-p', str(pid)],
            stdout=subprocess.PIPE,
        )
        return int(res.stdout.decode().strip())
    # No value is available on non-POSIX systems
    return None

def gen_compile_cmd(args: argparse.Namespace) -> list:
    # The first element stays a Path because exec_cmd derives the script name from it
    compile_script = benchmark_dir / 'compile.py'
    opts = []
    if args.comp_args is not None:
        opts.extend(args.comp_args.split())
    if args.name:
        opts.extend(['--name', args.name])
    if args.file:
        opts.extend(['--file', args.file])
    if args.edabit:
        opts.append('--edabit')
    if args.verbose_compiler:
        opts.append('--verbose')

    return [compile_script] + opts

def gen_executor_cmd(args: argparse.Namespace) -> list:
    # The first element stays a Path because exec_cmd derives the script name from it
    executor_script = benchmark_dir / 'executor.py'
    opts = []
    if args.name:
        opts.extend(['--name', args.name])
    if args.file:
        opts.extend(['--file', args.file])
    if args.remote is not None:
        opts.extend(['--remote', str(args.remote)])
    if args.verbose_vm:
        opts.append('--verbose')

    return [executor_script, args.protocol, str(args.num_parties)] + opts

def monitor_mem_usage(proc: subprocess.Popen, mem_field: str, mem_get_sleep: float) -> int:
    max_mem_usage = 0
    while proc.poll() is None:  # While the process is running
        ps_output = subprocess.run(['ps', '-p', str(proc.pid), '-o', f'{mem_field}='], capture_output=True, text=True)
        mem_str = ps_output.stdout.strip()
        # ps may print nothing if the process is gone by the time it is queried
        if mem_str:
            mem_usage = int(mem_str)
            if mem_usage > max_mem_usage:
                max_mem_usage = mem_usage
        time.sleep(mem_get_sleep)

    return max_mem_usage

def read_proc_stdout(proc: subprocess.Popen) -> list[bytes]:
    # The pipe is opened in binary mode, so each line is a bytes object
    lines = []
    while True:
        line = proc.stdout.readline()
        if not line:
            break
        lines.append(line.strip())
    proc.wait()
    return lines

def exec_cmd(cmd: list, computation_script: str, mem_field: str, mem_get_sleep: float, verbose: bool) -> dict:
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    proc.stdin.write(computation_script.encode())
    proc.stdin.close()

    lines = []
    beg_time = time.time()
    try:
        # NOTE: stdout is read only after the process exits, so a child that
        # writes more than the pipe buffer can hold may block
        max_mem_usage = monitor_mem_usage(proc, mem_field, mem_get_sleep)
        lines = read_proc_stdout(proc)
        if len(lines) == 0:
            return {}

        exec_time = time.time() - beg_time

        # script file name without its extension
        script = os.path.splitext(cmd[0].name)[0]

        # print out the script output excluding the last line
        if verbose:
            other_lines = [line.decode('utf-8') for line in lines[:-1]]
            print('\n'.join(other_lines))

        # parse the last line to a json object
        res = json.loads(lines[-1].decode('utf-8'))

        res[f'{script}_{MAX_MEM_USAGE_KB}'] = max_mem_usage
        res[f'{script}_{EXEC_TIME_SEC}'] = exec_time
        return res

    except Exception as e:
        print(f'Error occurred while monitoring subprocess: {e}\n{lines}')
        proc.terminate()
        raise

args = parse_args()

# read computation script from file or stdin
computation_script = read_script(open(args.file) if args.file else None)

# execute compile script
compile_result = exec_cmd(gen_compile_cmd(args), computation_script, args.mem_field, 0.1, args.verbose_compiler)

# execute executor script
executor_result = exec_cmd(gen_executor_cmd(args), computation_script, args.mem_field, args.mem_get_sleep, args.verbose_vm)

print(json.dumps([compile_result, executor_result]), end='')
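
As a standalone illustration of the sampling approach that `monitor_mem_usage` in `benchmarker.py` takes, the sketch below polls a short-lived child process and keeps the largest `rss` sample. It is not the script's code: `ps_rss_kb` and `peak_mem` are hypothetical names, the reader is passed in as a callable so it can be swapped out, and a missing `ps` binary is treated as "no sample".

```python
import shutil
import subprocess
import sys
import time
from typing import Callable, Optional


def ps_rss_kb(pid: int) -> Optional[int]:
    """Read RSS via `ps`; None if ps is unavailable or prints nothing."""
    if shutil.which('ps') is None:
        return None
    out = subprocess.run(['ps', '-o', 'rss=', '-p', str(pid)],
                         capture_output=True, text=True).stdout.strip()
    return int(out) if out else None


def peak_mem(proc: subprocess.Popen, read_mem: Callable[[int], Optional[int]],
             interval: float) -> int:
    """Poll read_mem until the process exits and keep the largest sample."""
    peak = 0
    while proc.poll() is None:
        sample = read_mem(proc.pid)
        if sample is not None:
            peak = max(peak, sample)
        time.sleep(interval)
    return peak


# The child just sleeps briefly so that a few samples can be taken
child = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(0.3)'])
peak_kb = peak_mem(child, ps_rss_kb, interval=0.05)
print(f'peak rss: {peak_kb} KB, exit code: {child.returncode}')
```

Sampling can miss a peak that occurs between two polls, so a shorter interval trades extra `ps` invocations for a tighter estimate.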
