Question on Supervised Matching Tutorial

Hi Baak,

First of all, thank you again for your excellent work. This package is by far the best I’ve used for fuzzy matching, especially in terms of speed.

However, I found the tutorial for supervised matching (Notebook 3) quite difficult to follow. Many of the code blocks have no accompanying explanation, which makes the workflow hard to understand. I’ve run into the same issue in some of the other notebooks. I also have not found any good solutions to follow on your website.

The “Quick Run” guide is much clearer, but I’m not sure it provides enough detail for applying supervised learning to my dataset. For example, if I have two columns both named "name" that I want to match, is it sufficient to set supervised_on=True and then split the noised names into training and test sets? I’m not entirely confident in my understanding, so please excuse me, as I'm still fairly new to programming.
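
To make the question concrete, the setup described above can be sketched roughly as follows. This is only an illustration of the idea, not the package's documented workflow: the example records, the helper `split_train_test`, and the `name_col` key are assumptions made for this sketch, and only `supervised_on=True` is taken from the question itself. The sketch stops before any call into the package.

```python
import random

# Hypothetical labelled records: (entity_id, name). The id ties each noised
# name back to the ground-truth entity it is supposed to match.
noised_names = [
    (0, "Acme Holdings Ltd"),
    (0, "Acme Holding Limited"),
    (1, "Blue River Trading"),
    (1, "Blue Rivr Trading Co"),
    (2, "Northwind Logistics"),
    (2, "Northwind Logistic BV"),
    (3, "Sunrise Bakery"),
    (3, "Sunrise Bakry Inc"),
]


def split_train_test(records, test_fraction=0.25, seed=42):
    """Shuffle the records reproducibly and split them into train and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train_names, test_names = split_train_test(noised_names)

# Settings as described in the question. 'supervised_on' comes from the post;
# the 'name_col' key is an assumption and should be checked against the docs.
params = {
    "name_col": "name",
    "supervised_on": True,
}

print(len(train_names), "training names,", len(test_names), "test names")
```

Whether a plain random split like this is enough, or whether the training names need to be prepared differently for the supervised step, is exactly the part I am unsure about.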

By the way, may I ask if you have any plans to update or expand the tutorials? This project has real potential to become the go-to solution for this kind of task.

Thanks again, and I really appreciate your help.

Best regards,
Yifu Li


Question on Supervised Matching Tutorial #36
