Enhancing Interpretability of Rule-Based Classifiers through Feature Graphs

This repository contains the code associated with the project "Enhancing Interpretability of Rule-Based Classifiers through Feature Graphs." In this project, we propose a comprehensive framework for estimating feature contributions in rule-based systems. Our contributions include a graph-based feature visualization strategy, a novel feature importance metric agnostic to rule-based predictors, and a distance metric for comparing rule sets based on feature contributions.

Repository Structure

This repository contains all the code needed to replicate the experiments, together with the generated synthetic data. The files are organized as follows:

Method Implementation

parsing_rules.py
- Contains functions that parse the rules produced by the following rule-based strategies: decision trees, association rule mining, logic learning machines, and black-box models with rule extraction.
graph_building.py
- Contains functions to compute a variety of rule relevance and feature relevance metrics, and to generate the adjacency matrix for the feature graph, as proposed by our method.
requirements.txt
- Specifies the required Python libraries. Note: The PsyKE library requires Python version <= 3.9.
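
To give a feel for how rule parsing and graph building fit together, here is a minimal sketch. It is not the repository's implementation. It assumes a hypothetical textual rule format (`IF <conditions joined by AND> THEN <label>`), and the names `parse_rule` and `build_adjacency` are illustrative only. The edge weight is a plain (optionally weighted) co-occurrence count, standing in for the rule relevance and feature relevance metrics that `graph_building.py` actually computes.

```python
import re
from itertools import combinations

# Hypothetical rule format, assumed for illustration only:
#   "IF age > 30 AND bmi <= 25.0 THEN class = 1"
CONDITION_RE = re.compile(r"^\s*(\w+)\s*(<=|>=|!=|=|<|>)\s*(\S+)\s*$")


def parse_rule(text):
    """Split a rule string into ([(feature, operator, value), ...], label)."""
    body, label = text.split(" THEN ")
    body = body.strip()
    if body.startswith("IF "):
        body = body[3:]
    conditions = []
    for part in body.split(" AND "):
        match = CONDITION_RE.match(part)
        if match is None:
            raise ValueError(f"cannot parse condition: {part!r}")
        conditions.append(match.groups())
    return conditions, label.strip()


def build_adjacency(rules, weights=None):
    """Return (features, matrix) for a feature co-occurrence graph.

    matrix[i][j] sums the weights of the rules in which features i and j
    appear together; the diagonal sums the weights of the rules that use
    feature i. With weights=None every rule counts as 1.0.
    """
    if weights is None:
        weights = [1.0] * len(rules)
    features = sorted({f for conds, _ in rules for f, _, _ in conds})
    index = {f: i for i, f in enumerate(features)}
    n = len(features)
    matrix = [[0.0] * n for _ in range(n)]
    for (conds, _), w in zip(rules, weights):
        present = sorted({f for f, _, _ in conds})
        for f in present:
            matrix[index[f]][index[f]] += w
        for a, b in combinations(present, 2):
            matrix[index[a]][index[b]] += w
            matrix[index[b]][index[a]] += w
    return features, matrix


rules = [
    parse_rule("IF age > 30 AND bmi <= 25.0 THEN class = 1"),
    parse_rule("IF age <= 30 THEN class = 0"),
]
features, matrix = build_adjacency(rules)
```

Passing a per-rule score (for example support or confidence) as `weights` is one way a rule relevance criterion could enter the graph. How the proposed method combines the two criteria is defined in the code itself.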

Notebooks

Synthetic_experiments_notebook.ipynb
- Contains the code to generate all synthetic experiments and to analyze them.
Benchmark_experiments_notebook.ipynb
- Contains the code to validate the method on benchmark datasets.

Folders

synthetic
- Contains the datasets generated in the investigation.
LLM-outputs
- Contains the rules output from the logic learning machine, computed within the Rulex platform.
feature-selection
- *_imps.csv files: feature importance scores computed with permutation importance, Gini importance, average SHAP values, and the proposed method (using relevance or impurity as the feature relevance criterion, combined with relevance, support, lift, confidence, or equal as the rule relevance criterion);
- *_scores.csv: accuracy of decision trees trained on the top k features (with k ranging from 2 to 12) according to each of the four considered feature importance metrics.
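
The top-k selection behind the score files can be sketched as follows. This is a minimal illustration, not the notebook code. The column layout of the CSV files is not described here, so the sketch takes a plain dictionary of feature name to importance score, and the helper names `top_k_features` and `nested_subsets` are hypothetical. It only builds the feature subsets for k from 2 to 12 and leaves out the decision tree training.

```python
def top_k_features(importances, k):
    """Return the k feature names with the highest importance score.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    ranked = sorted(importances, key=lambda name: (-importances[name], name))
    return ranked[:k]


def nested_subsets(importances, k_values=range(2, 13)):
    """Map each k (2 to 12 by default) to its top-k feature subset."""
    return {k: top_k_features(importances, k) for k in k_values}


# Toy scores, invented for the example.
toy_importances = {"f1": 0.5, "f2": 0.1, "f3": 0.3}
subsets = nested_subsets(toy_importances)
```

Each subset would then be used to train and score a decision tree, once per feature importance metric.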

Benchmark Datasets

The Hill Valley, Hypothyroid, Pixel, and Tokyo datasets were retrieved from the Penn Machine Learning Benchmarks (https://github.com/EpistasisLab/pmlb), while the remaining datasets were retrieved from the UCI Machine Learning Repository (https://github.com/uci-ml-repo/ucimlrepo). Both sources were accessed through their respective Python wrappers.

Dataset	#instances	#features	#binary	#categorical	#continuous	#classes
Zoo	101	17	16	1	0	7
Breast Tissue	106	9	0	0	9	6
Hepatitis	155	19	13	6	0	2
BCW Prognostic	198	34	0	0	34	2
SPECT Heart	267	22	22	0	0	2
Breast Cancer	286	9	0	9	0	2
BCW Diagnostic	569	30	0	0	30	2
Balance Scale	625	4	0	4	0	3
BCW Original	699	9	0	0	9	2
Pima Diabetes	768	8	0	0	8	2
Tokyo	959	44	0	2	42	2
Hill Valley	1212	100	0	0	100	2
Contraceptive	1473	9	0	2	7	3
Car Evaluation	1728	6	0	6	0	4
Pixel	2000	240	0	240	0	10
Hypothyroid	3163	25	17	1	7	2
Waveform	5000	21	0	0	21	3
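
Retrieval through the two Python wrappers mentioned above might look like the sketch below. The helper names `load_pmlb` and `load_uci` are hypothetical and are not part of this repository. The imports are placed inside the functions so that each wrapper is only needed when its loader is called. Both calls download data and therefore need network access.

```python
def load_pmlb(name):
    """Fetch a PMLB dataset as a (features, target) pair of arrays."""
    from pmlb import fetch_data  # pip install pmlb

    return fetch_data(name, return_X_y=True)


def load_uci(name):
    """Fetch a UCI dataset as (features, targets) pandas DataFrames."""
    from ucimlrepo import fetch_ucirepo  # pip install ucimlrepo

    dataset = fetch_ucirepo(name=name)
    return dataset.data.features, dataset.data.targets
```

The `name` argument must match the identifier used by each wrapper, which may differ from the display names in the table above.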
