`ExpressionTransformer` should try to rectify feature type information

Hello Villu,

it's been a while and I hope you're fine. I've come back with more questions.
Let's start with some code:

```py
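import pandas as pd
from sklearn_pandas import DataFrameMapper
from sklearn2pmml.decoration import CategoricalDomain  # only used by the commented-out step below
from sklearn2pmml.preprocessing import ExpressionTransformer, MatchesTransformer

# create some data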
X = pd.DataFrame(
    {
        "numbers": [1, 2, 3, 40, 5],
        "colors": ["yellow ", "blue", "BLACK", "green", "red"],
    }
)

# create a simple mapper
mapper = DataFrameMapper(
    [
        (
            ["colors"],
            [
                # CategoricalDomain(dtype=str),
                ExpressionTransformer("X[0].lower()"),
                MatchesTransformer("green"),
            ],
            {"alias": "color_green"},
        )
    ],
    df_out=True,
    default=False,
)
```
The following pipeline doesn't make much sense from a machine learning point of view, but it shows the issue very well:
```py
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

pmml_pipe = PMMLPipeline(
    [
        ("mapper", mapper)
    ]
)
# fit and transform
pmml_pipe.fit_transform(X)

# export as PMML
sklearn2pmml(pmml_pipe, "output.pmml", with_repr=True)
```

In Python, everything works as expected. The issue is in the generated `output.pmml` file, which contains the following:
```xml
<DataDictionary>
	<DataField name="colors" optype="continuous" dataType="double"/>
</DataDictionary>
```
Given that the input can take an infinite number of possible values, how can I set this data type to "string"?
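
In case it helps with reproducing, here is a minimal sketch for reading the declared field types back out of the exported file. It uses only the standard library; the helper name `data_field_types` and the `SAMPLE` string are made up for illustration, and the namespace in the sample is an assumption about what the export looks like (the helper ignores namespaces, so the exact PMML version should not matter).

```python
import xml.etree.ElementTree as ET

# Illustrative sample mirroring the DataDictionary above.
# The namespace is an assumption; the helper below does not depend on it.
SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <DataDictionary>
    <DataField name="colors" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>"""


def data_field_types(pmml_text):
    """Map each DataField name to its (optype, dataType) pair."""
    root = ET.fromstring(pmml_text)
    fields = {}
    for elem in root.iter():
        # Drop the "{namespace}" prefix so the tag can be compared directly
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "DataField":
            fields[elem.get("name")] = (elem.get("optype"), elem.get("dataType"))
    return fields


print(data_field_types(SAMPLE))  # {'colors': ('continuous', 'double')}
```

For the real export, the same helper can be fed the file contents, e.g. `data_field_types(open("output.pmml", "rb").read())`. What I would like to see for `colors` is a `dataType` of `string` instead of `double`.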


