LearningToSplit/README.txt at master · mayaswissa/LearningToSplit · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
This repository contains the implementation and experimental artifacts for our
work on RL splitting heuristic for neural network verification.
Our approach was implemented on top of Marabou verifier with a reinforcement-learning-based
splitting policy trained using a combination of Demonstration-Guided Q-Learning
(DQfD) and Double DQN. The repository includes all networks, properties, trained
agents, and experiment logs needed to reproduce the results reported in the
paper.

Benchmarks:
1. ACAS-Xu Safety Properties
   - 45 neural networks
   - 4 safety specifications (phi_1–phi_4)
   - 180 total verification queries

2. Local Robustness Properties
   - Based on ACAS-Xu network N1,1
   - 1000 randomly selected inputs
   - Three epsilon values: 0.08, 0.09, 0.1
   - 3000 total verification queries

Running the Code:
The enhanced version of Marabou with RL support is executed with the following
command structure:

    ./Marabou \
        <path_to_network.onnx> \
        <path_to_property.txt> \
        --save-agent-path=<PATH> \
        --DQN-output-file=<RESULTS_DIR> \
        --DQN-mode=<MODE> \
        --pseudo-impact-start

Modes:
- DQN-mode=1 : Train a new agent
- DQN-mode=2 : Evaluate using a trained agent
- DQN-mode=0 : Disable RL (use only baseline heuristics)

Training Parameters:
When running in training mode (DQN-mode=1), the following parameters control the
learning process:

    --DQN-epochs
    --DQN-iters
    --DQN-LR
    --DQN-weight-decay
    --DQN-batch-size
    --DQN-buffer-size
    --DQN-exploration-rate
    --DQfD-lambda-sup
    --DQfD-lambda-decay
    --DQfD-margin
    --DQN-guided-steps
    --DQN-n-examples
    --DQN-demo-examples