@robe2037

This PR aims to speed up align_move() when fill_na_values = TRUE by partially vectorizing the nearest-timestamp search. Instead of searching for the nearest timestamp one missing value at a time, it builds a vector of indexes giving the row position of the closest timestamp for each missing value, then uses that index vector to fill all missing values in an attribute/variable vector at once.
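For readers who want a concrete picture of the index-vector idea, here is a minimal base R sketch. It is not the code from this PR: `fill_nearest()`, `times`, and `values` are hypothetical names, and it assumes the nearest timestamp is chosen among rows that have a recorded value, with ties going to the earlier row.

```r
# Hypothetical helper (not the package's implementation): fill each NA in
# `values` with the value recorded at the nearest timestamp.
fill_nearest <- function(times, values) {
  missing <- is.na(values)

  # Nothing to fill, or nothing to fill from
  if (!any(missing) || all(missing)) {
    return(values)
  }

  known_times <- as.numeric(times[!missing])
  known_values <- values[!missing]

  # Absolute time differences: one row per missing value,
  # one column per recorded value
  dist <- abs(outer(as.numeric(times[missing]), known_times, "-"))

  # Index vector: for each missing value, the position of the closest
  # recorded timestamp (which.min() keeps the first one on ties)
  nearest <- apply(dist, 1, which.min)

  # Fill every missing value in a single assignment
  values[missing] <- known_values[nearest]
  values
}

times <- c(0, 10, 20, 30, 40)
values <- c(1.5, NA, NA, 4.0, NA)
filled <- fill_nearest(times, values)
print(filled)
```

Because the missing-value mask is computed from the vector passed in, a helper like this would be applied to each attribute separately, so attributes with different missing values each get their own index vector. The distance matrix grows with the number of missing values times the number of recorded values, so this form suits a sketch more than very long tracks.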

These changes also prevent unnecessary interpolation at timestamps that already contain recorded values. Previously, each such timestamp received a second entry in the aligned vector, filled with missing values that were then interpolated even though those entries were later filtered out.

This approach also fixes #146.

fixes nearest-neighbor interpolation bug and improves speed
Not all attributes will have same missing values

Development

Successfully merging this pull request may close these issues.

fill_na_values does not always use closest timestamp for interpolation