Behavior-Aware Expression Dissimilarity (BED)

This repository implements the Behavior-Aware Expression Dissimilarity (BED) distance measure from the paper "Quantifying Behavioral Dissimilarity Between Mathematical Expressions". It also contains the reproducible experiments from the paper.

You can find a preprint explaining the dissimilarity and experiments on arXiv: https://arxiv.org/abs/2408.11515

A more streamlined implementation of the measure can be found in the Symbolic Regression Toolkit Python package.

Quickstart Instructions

1. Create a new (conda) environment.
2. Install the package and its dependencies with the command: pip install git+https://github.com/smeznar/BED
3. Use the BED class as in the example below.

from bed import BED

expressions = [["2", "*", "A", "*", "B"], ["C", "*", "B", "-", "A"]]
bed = BED(expressions, [(1, 5), (1, 5)], (0.2, 5), seed=0)
print(bed.calculate_distances())

BED(expressions, x_bounds=None, const_bounds=(0.2, 5), points_sampled=64, consts_sampled=32, expressions2=None, x=None, randomized=False, cutoff_threshold=1e20, default_distance=np.inf, symbol_library=None, seed=None)

Hyperparameter explanation

BED uses the following hyperparameters and their default values:

Parameter	Description	Type	Default Value
expressions	Expressions between which you want to calculate the distance	list[list[str]]
x_bounds	Lower and upper bounds for each variable that occurs in the expressions (Each variable can have a different domain)	list[(float, float)]
const_bounds	Lower and upper bounds for constant values that will be sampled during the computation of the distance	(float, float)	(0.2, 5)
points_sampled	Number of variable values sampled during the computation of the distance	int	64
consts_sampled	Number of constant values sampled during the computation of the distance	int	16
expressions2	If not None, the distance between expressions and expressions2 is computed instead of the distances between each pair of expressions in the expressions parameter	list[list[str]]	None
x	Points on which the distance will be computed. If None, points are sampled instead.	numpy.array	None
randomized	If True, new points and constants are sampled for each calculation of the distance (computationally expensive). If parameter x is not None, this parameter will be ignored.	bool	False
cutoff_threshold	Threshold for the maximum value an expression can evaluate to. Evaluations whose maximum absolute value exceeds this threshold are ignored	float	1e20
default_distance	Distance between two expressions at a point where one expression produces valid evaluations and the other produces invalid ones. Example: the point -1 for the expressions $x$ and $\sqrt{x}$	float	1e10
symbol_library	Library of symbols (SRToolkit.utils.SymbolLibrary) that make up the expressions	SRToolkit.utils.SymbolLibrary	None
seed	Random seed for reproducible results	int	None
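
The interplay of cutoff_threshold and default_distance can be illustrated with a small sketch. This is one plausible reading of the descriptions in the table, not the repository's code: the helper names is_valid, point_distance and mean_abs_diff are hypothetical, the stand-in distance function is not the Wasserstein distance, and returning None when both expressions are invalid is an assumption, since the table does not cover that case.

```python
import math


def is_valid(values, cutoff_threshold):
    """True if every value is finite and its magnitude is within the cutoff."""
    return all(math.isfinite(v) and abs(v) <= cutoff_threshold for v in values)


def mean_abs_diff(a, b):
    """Stand-in distance for this sketch only (not the Wasserstein distance)."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)


def point_distance(values_a, values_b, distance_fn, cutoff_threshold, default_distance):
    """Distance at one point, given each expression's evaluations at that point."""
    valid_a = is_valid(values_a, cutoff_threshold)
    valid_b = is_valid(values_b, cutoff_threshold)
    if valid_a and valid_b:
        return distance_fn(values_a, values_b)
    if valid_a != valid_b:
        # Exactly one expression is invalid here, e.g. x vs sqrt(x) at x = -1.
        return default_distance
    # Assumption: a point where both expressions are invalid is skipped.
    return None


# x and sqrt(x) at the point -1: sqrt(-1) is represented as NaN in this sketch.
mixed = point_distance([-1.0], [float("nan")], mean_abs_diff, 1e20, 1e10)
both_ok = point_distance([1.0, 2.0], [2.0, 4.0], mean_abs_diff, 1e20, 1e10)
both_bad = point_distance([1e30], [float("inf")], mean_abs_diff, 1e20, 1e10)
print(mixed, both_ok, both_bad)
```

The threshold and default distance are passed explicitly so the sketch does not commit to particular defaults.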

Overview of the approach

An overview of the approach is shown below. First, we sample points in the given domain and tuples of free parameter values. We use these to evaluate a given expression and create a description of its behavior (e.g. a matrix whose entries are the expression's values at these points and parameter values). We then use these matrices to calculate the Wasserstein distance between two expressions at each point. The final distance is the mean of these per-point distances. The image below shows an example of how the distance is calculated for two expressions.
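
The steps described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the implementation in this repository: expressions are given as Python callables of one variable and one constant rather than token lists, points and constants are drawn uniformly, both expressions share the same sampled constants, and the names wasserstein_1d, behavior_matrix and sketch_distance are hypothetical. The sample counts in the calls are arbitrary.

```python
import random


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equally sized 1-D samples.

    For equally weighted samples of the same size, this equals the mean
    absolute difference between the sorted values.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equally sized samples")
    return sum(abs(u - v) for u, v in zip(sorted(a), sorted(b))) / len(a)


def behavior_matrix(expr, points, consts):
    """Row i holds expr evaluated at points[i] for every sampled constant."""
    return [[expr(x, c) for c in consts] for x in points]


def sketch_distance(expr_a, expr_b, x_bounds, const_bounds,
                    points_sampled, consts_sampled, seed=None):
    rng = random.Random(seed)
    # Assumption: uniform sampling of points and constants within the bounds.
    points = [rng.uniform(*x_bounds) for _ in range(points_sampled)]
    consts = [rng.uniform(*const_bounds) for _ in range(consts_sampled)]
    rows_a = behavior_matrix(expr_a, points, consts)
    rows_b = behavior_matrix(expr_b, points, consts)
    # One Wasserstein distance per point, then the mean over all points.
    per_point = [wasserstein_1d(ra, rb) for ra, rb in zip(rows_a, rows_b)]
    return sum(per_point) / len(per_point)


def linear(x, c):
    return c * x


def shifted(x, c):
    return c * x + 1.0


d_same = sketch_distance(linear, linear, (1, 5), (0.2, 5), 64, 16, seed=0)
d_shift = sketch_distance(linear, shifted, (1, 5), (0.2, 5), 64, 16, seed=0)
print(d_same, d_shift)
```

In this sketch an expression has distance zero to itself, and shifting an expression by a constant offset of 1 gives a distance of about 1, because every per-point distribution moves by that offset.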
