- About the Project
- Prerequisites
- System Structure
- Demo System
- Instructions for Collecting Results
- Future Plan
Differential Privacy over SQL (DPSQL) is a system for answering queries under differential privacy.
The file structure is as follows:
project
│
└───config
└───docs
└───Profile
└───src
│ └───algorithm
└───Test
│ └───TPCH
│ └───Graph
└───Sample
./config stores the configuration files the system needs.
./docs stores the reference information users need to work with DPSQL.
./Profile stores the profile information for using Mosek in the system.
./src stores the main source files.
./src/algorithm stores the 3 algorithms we integrated into this system.
./Test stores the queries used in the experiments of the system.
./Sample stores the scripts for setting up the database and collecting experiment results.
Before running this project, please install the following tools:
- PostgreSQL
- Python3
- Cplex
- Mosek (the license is stored under ./Profile)
Please do not install Cplex as a Python dependency, since that version can only handle small datasets. Instead, download the Cplex API and import it into Python by following this instruction.
(We are aware that this link is expired and are working on a substitute solution.)
Here are the dependencies used in the Python programs:
- matplotlib
- numpy
- sys
- os
- collections
- configparser
- math
- psycopg2
- pglast v4.4
- argparser
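Before running the system, it can help to confirm that the third-party packages from the list above are importable. The following is a minimal sketch, not part of DPSQL: the helper name `missing_modules` and the `THIRD_PARTY` list are hypothetical, and the check only looks for the packages, it does not verify the pglast version.

```python
import importlib.util

# Third-party packages from the dependency list; the standard-library
# modules (sys, os, collections, configparser, math) ship with Python.
THIRD_PARTY = ["matplotlib", "numpy", "psycopg2", "pglast"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(THIRD_PARTY)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All third-party packages were found.")
```

Any package reported as missing would need to be installed before running main.py.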
To use this system, the database user must have permission to read the schema of the database.
TODO
To run the system, run main.py. There are seven parameters:
- --d: path to the database initialization file;
- --q: path to the query file;
- --r: path to the private relation file;
- --c: path to the configuration file;
- --o: path to the output file;
- --debug: debug mode for more information;
- --optimal: use the optimal algorithm for SJA queries.
Use --h to get help on the parameters.
For more information about the input files, users can consult here
For the SQL syntax used in this system, users can consult here
Example:
python main.py --d ./config/database.ini --q ./test.txt --r ./test_relation.txt --c ./config/parameter.config --o out.txt
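To illustrate how the parameters above fit together, here is a minimal sketch of how they might be declared with Python's argparse. This is not the actual code of main.py: the function name `build_parser` is hypothetical, and treating --debug and --optimal as on/off switches that take no value is an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the seven documented parameters."""
    parser = argparse.ArgumentParser(description="Sketch of the DPSQL command line")
    parser.add_argument("--d", help="path to database initialization file")
    parser.add_argument("--q", help="path to query file")
    parser.add_argument("--r", help="path to private relation file")
    parser.add_argument("--c", help="path to the configuration file")
    parser.add_argument("--o", help="path to the output file")
    # Assumption: both options below are flags that take no value.
    parser.add_argument("--debug", action="store_true",
                        help="debug mode for more information")
    parser.add_argument("--optimal", action="store_true",
                        help="use the optimal algorithm for SJA queries")
    return parser


# Parse the arguments from the example command instead of sys.argv.
args = build_parser().parse_args([
    "--d", "./config/database.ini",
    "--q", "./test.txt",
    "--r", "./test_relation.txt",
    "--c", "./config/parameter.config",
    "--o", "out.txt",
])
print(args.q, args.o, args.debug, args.optimal)
```

Parsing an explicit list keeps the sketch self-contained; a real entry point would normally call `parse_args()` with no argument so that the command line is read.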
- install the dependencies
- create an empty database in PostgreSQL
- generate tbl data files by using dbgen from the TPCH website and store them in /Sample/data/TPCH
- run the script we provide in /Sample/setupDBTPCH.py
  python setupDBTPCH.py --db databasename
- run the script we provide in /Sample/collectResult.py
  python collectResult.py
- find the result in /Sample/result/TPCH
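The two script invocations in the steps above can be gathered into a small dry-run helper. This is only a sketch: the function name `collection_commands` is hypothetical, and the code prints the commands rather than executing them, since running them requires the database and data files from the earlier steps.

```python
import shlex


def collection_commands(database_name):
    """Return the two documented commands, in order, as argument lists."""
    return [
        ["python", "setupDBTPCH.py", "--db", database_name],
        ["python", "collectResult.py"],
    ]


# Dry run: show the commands that would be executed.
for command in collection_commands("databasename"):
    print(shlex.join(command))
```

To actually run the steps, each argument list could be passed to `subprocess.run` with `check=True` from the directory that contains the scripts.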
- Distinct count query type (projection);
- User interface;
- Better user experience;
- Optimization.