syntactic processing #2

In `table_annotator.py`, on line 632, the original column names are processed so that they can be matched against the ontologies of DBpedia and Schema. This is the code that does it:
```
cleaned_table_columns = [
    re.sub(r"[_-]", " ", " ".join(
        re.findall("[0-9,a-z,.,\"#!$%\^&\*;:{}=\-_`~()\n\t\d]+|[A-Z](?:[A-Z]*(?![a-z])|[a-z]*)", col)
    )).lower() for col in table_columns.copy()
]
```

I wonder whether the replacement argument of the `re.sub()` call, currently a space (`" "`), should instead be an empty string (`""`). The pattern inside `findall` already matches `_` and `-`, so when one of them sits between two capitalised words it comes back as a token of its own. `" ".join()` then puts a space on each side of that token while keeping the `_` or `-` itself in the string, and `re.sub(r"[_-]", " ", ...)` turns that character into yet another space.

For example:
`"Team-Name"` is converted into `"team   name"`, with three spaces between `team` and `name`. Is this the desired behaviour, am I missing something, or is this a bug?
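To make the question easier to check, here is a minimal, self-contained sketch that runs the same pattern on a few column names. The pattern is copied from the snippet above (written as a raw string); the helper names `clean` and `clean_collapsed` are hypothetical and do not exist in `table_annotator.py`, and the whitespace-collapsing variant is only one possible alternative, not a confirmed fix.

```python
import re

# Pattern copied from the snippet above, written here as a raw string.
TOKEN_PATTERN = r"[0-9,a-z,.,\"#!$%\^&\*;:{}=\-_`~()\n\t\d]+|[A-Z](?:[A-Z]*(?![a-z])|[a-z]*)"


def clean(col, replacement=" "):
    """Hypothetical helper mirroring the list comprehension for one column name."""
    tokens = re.findall(TOKEN_PATTERN, col)
    return re.sub(r"[_-]", replacement, " ".join(tokens)).lower()


def clean_collapsed(col):
    """Hypothetical alternative: keep the space replacement, then collapse whitespace runs."""
    return re.sub(r"\s+", " ", clean(col)).strip()


if __name__ == "__main__":
    for name in ["Team-Name", "team_name"]:
        print(repr(name))
        print("  current (' '):", repr(clean(name)))
        print("  proposed (''):", repr(clean(name, "")))
        print("  collapsed:    ", repr(clean_collapsed(name)))
```

With this sketch, `"Team-Name"` yields three spaces with the current replacement and two with an empty-string replacement, while an all-lowercase name such as `"team_name"` is matched as a single token, so an empty-string replacement would glue it into `"teamname"`. Collapsing whitespace after the substitution gives `"team name"` in both cases.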
