Debarchan1994/Bucketizer_dynamic

This function bucketizes columns as needed and adds the bucketized values as additional columns to your existing Spark dataframe. There is no limit on the number of columns it can bucketize: pass the column names and their bucketization ranges to the inputCols parameter as a dictionary, with each input column name as the key and its ranges, as a tuple or list, as the value. For example, inputCols = {'my_column': (0, 100, 10)} or {'my_column': [0, 10, 100, 200, 500, 1000]}.
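As a rough illustration of the two accepted value shapes, the sketch below expands an inputCols dictionary into explicit bucket boundaries. It assumes that a 3-tuple means (start, stop, step) with the stop value included, and that a list is already a complete set of boundaries; the repository text does not spell this out. `expand_ranges` and `other_column` are hypothetical names used only for this example and are not part of the function itself.

```python
def expand_ranges(input_cols):
    """Turn each column's range spec into a sorted list of bucket boundaries."""
    boundaries = {}
    for column, spec in input_cols.items():
        if isinstance(spec, tuple) and len(spec) == 3:
            start, stop, step = spec
            # Assumption: the stop value is kept as the last boundary.
            boundaries[column] = list(range(start, stop + 1, step))
        else:
            # A list is treated as explicit boundaries and only sorted.
            boundaries[column] = sorted(spec)
    return boundaries


input_cols = {
    "my_column": (0, 100, 10),
    "other_column": [0, 10, 100, 200, 500, 1000],
}
splits = expand_ranges(input_cols)
print(splits["my_column"])
print(splits["other_column"])
```

Under these assumptions the tuple form yields evenly spaced buckets, while the list form allows buckets of uneven width.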

    Parameters:
                1) df : type(pyspark dataframe)--> The base dataframe
                2) inputCols : type(dictionary)--> Key : names of the columns to bucketize, value : bucketization range (tuple) or list of boundaries
                3) outputCols : type(list)--> Desired names of the final output columns
                4) cat_alternative : type(boolean)--> Defines whether two columns are created for each input column, with the second column indicating the last value of the bucket (example, main column: 10 ... 20, second alternate column: <20), or just one main column (example, main column: 10 ... 20).
                                                    The default value is False; the function creates two columns for each input column only when True is passed.
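To make the cat_alternative behaviour concrete, here is a minimal pure-Python sketch of the labels described above for a single value. It is not the repository's implementation: `bucket_labels` is a hypothetical helper, and it assumes buckets include their lower bound and exclude their upper bound, which the description does not confirm.

```python
from bisect import bisect_right


def bucket_labels(value, boundaries, cat_alternative=False):
    """Return the label(s) for the bucket that contains `value`."""
    # Index of the largest boundary that is <= value.
    idx = bisect_right(boundaries, value) - 1
    # Values outside the boundaries get no label in this sketch.
    if idx < 0 or idx >= len(boundaries) - 1:
        return (None, None) if cat_alternative else (None,)
    lower, upper = boundaries[idx], boundaries[idx + 1]
    main = f"{lower} ... {upper}"
    if cat_alternative:
        # Second label marks the last value of the bucket, e.g. "<20".
        return (main, f"<{upper}")
    return (main,)


boundaries = [0, 10, 20, 30]
print(bucket_labels(15, boundaries))
print(bucket_labels(15, boundaries, cat_alternative=True))
```

With cat_alternative left at False, each input column maps to one label; with True, a second label of the form "<upper bound" is produced alongside it.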
