Apache Spark ML
Examples: main.py
- Feature extractors, transformers and selectors:
  - `feature.Binarizer`
  - `feature.Bucketizer`
  - `feature.ChiSqSelectorModel` (the result of fitting a `feature.ChiSqSelector`)
  - `feature.ColumnPruner`
  - `feature.CountVectorizerModel` (the result of fitting a `feature.CountVectorizer`)
  - `feature.ElementwiseProduct`
  - `feature.IDFModel` (the result of fitting a `feature.IDF`)
  - `feature.ImputerModel` (the result of fitting a `feature.Imputer`)
  - `feature.IndexToString`
  - `feature.Interaction`
  - `feature.MaxAbsScalerModel` (the result of fitting a `feature.MaxAbsScaler`)
  - `feature.MinMaxScalerModel` (the result of fitting a `feature.MinMaxScaler`)
  - `feature.NGram`
  - `feature.OneHotEncoderModel` (the result of fitting a `feature.OneHotEncoder`)
  - `feature.PCAModel` (the result of fitting a `feature.PCA`)
  - `feature.PolynomialExpansion`
  - `feature.QuantileDiscretizer`
  - `feature.RegexTokenizer`
  - `feature.RFormulaModel` (the result of fitting a `feature.RFormula`)
  - `feature.RobustScalerModel` (the result of fitting a `feature.RobustScaler`)
  - `feature.SQLTransformer`
    - Subqueries.
    - Control flow expressions `case when` and `if`.
    - Arithmetic operators `+`, `-`, `*` and `/`.
    - Comparison operators `<`, `<=`, `==`, `>=` and `>`.
    - Logical operators `and`, `or` and `not`.
    - Math functions `abs`, `ceil`, `exp`, `expm1`, `floor`, `hypot`, `ln`, `log10`, `log1p`, `pow` and `rint`.
    - Trigonometric functions `sin`, `asin`, `sinh`, `cos`, `acos`, `cosh`, `tan`, `atan`, `tanh`.
    - Aggregation functions `greatest` and `least`.
    - RegExp functions `regexp_replace` and `rlike`.
    - String functions `char_length`, `character_length`, `concat`, `lcase`, `length`, `lower`, `replace`, `substring`, `trim`, `ucase` and `upper`.
    - Type cast functions `boolean`, `cast`, `double`, `int` and `string`.
    - Value functions `in`, `isnan`, `isnull`, `isnotnull`, `negative` and `positive`.
  - `feature.StandardScalerModel` (the result of fitting a `feature.StandardScaler`)
  - `feature.StopWordsRemover`
  - `feature.StringIndexerModel` (the result of fitting a `feature.StringIndexer`)
  - `feature.Tokenizer`
  - `feature.VectorAssembler`
  - `feature.VectorAttributeRewriter`
  - `feature.VectorIndexerModel` (the result of fitting a `feature.VectorIndexer`)
  - `feature.VectorSizeHint`
  - `feature.VectorSlicer`
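
As an illustration of the `feature.SQLTransformer` subset listed above, the following PySpark sketch builds a statement from listed constructs only (`case when`, comparison and arithmetic operators, `log1p`, `lower`, `trim`, `cast`). The column names (`income`, `age`, `city`) and the helper `build_sql_transformer` are hypothetical, and PySpark is assumed to be installed. It is a sketch under those assumptions, not the project's own example.

```python
# Hypothetical input columns: income (double), age (int), city (string).
# Only constructs from the supported list above are used.
STATEMENT = """
SELECT *,
  log1p(income) AS log_income,
  CASE WHEN age >= 65 THEN 'senior' ELSE 'adult' END AS age_group,
  lower(trim(city)) AS city_norm,
  cast(age AS double) / 100 AS age_scaled
FROM __THIS__
"""


def build_sql_transformer():
    """Return a SQLTransformer for STATEMENT.

    Assumes PySpark is installed and a SparkSession is already active.
    """
    # Imported lazily so the statement can be inspected without Spark.
    from pyspark.ml.feature import SQLTransformer

    return SQLTransformer(statement=STATEMENT)
```

The returned transformer can be used as a regular `Pipeline` stage. Statements that rely on constructs outside the list above may fail to convert.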
- Prediction models:
  - `classification.DecisionTreeClassificationModel`
  - `classification.GBTClassificationModel`
  - `classification.LinearSVCModel`
  - `classification.LogisticRegressionModel`
  - `classification.MultilayerPerceptronClassificationModel`
  - `classification.NaiveBayesModel`
  - `classification.RandomForestClassificationModel`
  - `clustering.KMeansModel`
  - `fpm.FPGrowthModel`
  - `regression.DecisionTreeRegressionModel`
  - `regression.GBTRegressionModel`
  - `regression.GeneralizedLinearRegressionModel`
  - `regression.IsotonicRegressionModel`
  - `regression.LinearRegressionModel`
  - `regression.RandomForestRegressionModel`
- Prediction model chains:
  - `PipelineModel`
  - Referencing the prediction column (`HasPredictionCol#getPredictionCol()`) of earlier clustering, classification and regression models.
  - Referencing the predicted probabilities column (`HasProbabilityCol#getProbabilityCol()`) of earlier classification models.
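
A chain of this kind might look like the following PySpark sketch, in which a classifier consumes the cluster label produced by an earlier `KMeans` stage via `getPredictionCol()`. The column names (`x1`, `x2`, `label`, `cluster`) and the helper `build_chained_pipeline` are hypothetical, PySpark is assumed to be installed, and whether this exact stage combination converts cleanly is an assumption rather than something stated above.

```python
def build_chained_pipeline():
    """Return an unfitted Pipeline whose classifier reuses a clustering prediction.

    Assumes PySpark is installed and a SparkSession is already active.
    Column names x1, x2, label and cluster are hypothetical.
    """
    # Imported lazily so this module can be loaded without Spark.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.feature import VectorAssembler

    # Stage 1: assemble the raw numeric columns for clustering.
    base = VectorAssembler(inputCols=["x1", "x2"], outputCol="base_features")

    # Stage 2: the earlier model; its prediction column is named "cluster".
    kmeans = KMeans(k=3, featuresCol="base_features", predictionCol="cluster")

    # Stage 3: reference the earlier prediction column as an extra feature.
    chained = VectorAssembler(
        inputCols=["x1", "x2", kmeans.getPredictionCol()],
        outputCol="chained_features",
    )

    # Stage 4: the later model, trained on the chained features.
    classifier = LogisticRegression(featuresCol="chained_features", labelCol="label")

    return Pipeline(stages=[base, kmeans, chained, classifier])
```

Calling `build_chained_pipeline().fit(df)` on a DataFrame with those columns yields a `PipelineModel`.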
- Hyperparameter selectors and tuners:
JPMML-SparkML
- Feature decorators and transformers:
  - `org.jpmml.sparkml.feature.CategoricalDomainModel` (the result of fitting a `feature.CategoricalDomain`)
  - `org.jpmml.sparkml.feature.ContinuousDomainModel` (the result of fitting a `feature.ContinuousDomain`)
  - `org.jpmml.sparkml.feature.InvalidCategoryTransformer`
  - `org.jpmml.sparkml.feature.VectorDensifier` (formerly `org.jpmml.sparkml.feature.SparseToDenseTransformer`)
  - `org.jpmml.sparkml.feature.VectorDisassembler`
LightGBM
Examples: main.scala
XGBoost
Examples: main.scala