Little refactor and minor bug #41
change pandas loading engine='c'
# Conflicts:
#   folktables/load_acs.py
…fillna of pandas to solve multiple nan encodings; creating directory for cached data when not available.
Could the proposed fix for multiple NaN encodings also help in the generate_categories function? Currently the code uses a placeholder value (e.g. -99999999999999.0) because a NaN value does not work well as a key in a Python dictionary: each NaN is treated as a different key.
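To illustrate the dictionary issue described in this comment, here is a minimal sketch. The helper `nan_to_sentinel` and the name `SENTINEL` are hypothetical and not part of folktables; only the placeholder value is taken from the comment.

```python
import math

# Two separately created NaN objects never compare equal,
# so a dict stores them as two distinct keys.
a = float("nan")
b = float("nan")
nan_keyed = {a: "first", b: "second"}

# Placeholder value quoted in the comment above.
SENTINEL = -99999999999999.0


def nan_to_sentinel(key, sentinel=SENTINEL):
    """Hypothetical helper: map any float NaN to one fixed sentinel key."""
    if isinstance(key, float) and math.isnan(key):
        return sentinel
    return key


# After normalization, both NaN entries collapse into a single key.
pairs = [(a, "missing"), (b, "missing"), (1.0, "yes")]
codes = {nan_to_sentinel(k): v for k, v in pairs}
```

Here `nan_keyed` ends up with two entries, while `codes` has one entry for the sentinel and one for `1.0`.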
I'm sorry, but given that I'm external to the project, I'm fixing only the things that are giving problems to my experiments.
    return load_definitions(root_dir=self._root_dir, year=self._survey_year, horizon=self._horizon,
                            download=download)


def fillna_safe(x, value=-1):
Can you describe what the intended behavior of this function should be?
Note that np.nan_to_num will convert NaNs to 0.0 unless you set the keyword nan=value (see the NumPy documentation for np.nan_to_num).
So, currently, I believe this function will first convert all nans to zero and then pass an array without any nans to pandas.
Am I getting this wrong?
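The behavior described in this review comment can be checked with a small sketch. The array `x` is made-up illustration data, and the `fillna` call stands in for whatever pandas step follows in the PR, which is an assumption since the body of `fillna_safe` is not shown here.

```python
import numpy as np
import pandas as pd

x = np.array([1.0, np.nan, 3.0])

# Default behavior: NaN is replaced by 0.0.
default_fill = np.nan_to_num(x)

# With the nan keyword, NaN is replaced by the requested value instead.
custom_fill = np.nan_to_num(x, nan=-1)

# Once NaNs have already become 0.0, a later pandas fillna has nothing to fill.
after_pandas = pd.Series(default_fill).fillna(-1)
```

In this sketch `after_pandas` still holds `0.0` where the NaN was, not `-1`, which matches the concern raised above.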
mrtzh left a comment:
I think there may be an issue with fillna_safe.
See also discussion in #39 about the
fix: refactor fillna and mobility_filter lambda functions with functions, fillna_safe also applies fillna from pandas to solve multiple nan encodings; creating directory for cached data when not available.
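As a rough sketch of the "creating directory for cached data when not available" part of this commit message: the helper name `ensure_cache_dir` and the example subdirectory names are hypothetical, and the PR's actual code may be structured differently.

```python
import os
import tempfile


def ensure_cache_dir(root_dir, *subdirs):
    """Hypothetical helper: create the cache directory if it is missing."""
    path = os.path.join(root_dir, *subdirs)
    # exist_ok=True makes the call safe when the directory already exists.
    os.makedirs(path, exist_ok=True)
    return path


# Usage against a throwaway root so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    cache_path = ensure_cache_dir(tmp, "2018", "1-Year")
    created = os.path.isdir(cache_path)
    # A second call is a no-op rather than an error.
    ensure_cache_dir(tmp, "2018", "1-Year")
```

Using `os.makedirs(..., exist_ok=True)` avoids a separate existence check before creating the directory.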