Skip to content

Conversation

@noahhuck
Copy link
Collaborator

Implement an egraph extractor using Knuth's algorithm to replace Bellman-Ford.

@noahhuck noahhuck requested a review from ajpal November 27, 2025 01:19
@ajpal ajpal requested a review from FTRobbin December 1, 2025 21:36
Copy link
Collaborator

@FTRobbin FTRobbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cannot be a drop-in replacement for the existing extractor, as it does not handle containers correctly. But we can delay the support for it in Poach.

Besides, the implementation makes two copies of the db(!) and can use some further optimizations. However, we can first test how far this version goes in terms of performance and come back to further optimizations later.

src/extract.rs Outdated
self.cost_model.enode_cost(
egraph,
func,
&egglog_bridge::FunctionRow {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to create a new FunctionRow here?

src/extract.rs Outdated
//log::debug!("compute_cost_hyperedge head {} sorts {:?}", head, sorts);
// Relying on .zip to truncate the values
for (value, sort) in row.iter().zip(sorts.iter()) {
if let Some(c) = self.compute_cost_node(egraph, *value, sort) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The complexity here is not quite right in the case of a container.

src/extract.rs Outdated
if row.subsumed {
return;
}
id2enode.insert(next_id, (func.clone(), row.vals.to_vec()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should just be id2enode.push(...) instead of insert

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, this makes copies of entire tables, which is not good for performance.

for (i, eclass) in row.vals.iter().take(row.vals.len() - 1).enumerate() {
let sort = &func_schema.input[i];
if sort.is_eq_sort() || sort.is_container_sort() {
eclass2parents
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This almost copies the entire db a second time.

next_id += 1;
},
);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are making copies of the db, you can actually filter out the rows that correspond to the eclasses we are interested in, instead of just looking at the tables we are interested in.

src/extract.rs Outdated

for &parent_id in eclass2parents.entry(*eclass).or_default().iter() {
remaining_children[parent_id] -= 1;
if remaining_children[parent_id] == 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can ignore eclasses that already have a lower cost here.

}
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this code handles containers correctly. Have you tested it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, containers do not correspond to rows in the tables, so id2enode won't have them, and the costs of containers are never computed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants