Description
If I understand correctly, fill_na_values is supposed to use nearest-neighbor interpolation to fill missing values with the present values that are closest in time to the missing record (regardless of whether that timestamp is before or after the record being interpolated). Currently, some values do not get filled with the value corresponding to the closest timestamp:
library(moveVis)
data("move_data")
# add fake interpolation variable
move_data[["x"]] <- 1:nrow(move_data)
aligned <- align_move(move_data, res = units::set_units(2, "min"))
# Aligned timestamp 10:05:59 gets interpolated value `x = 3`
aligned[3, ]
#> A <move2> with `track_id_column` "track" and `time_column` "timestamp"
#> Containing 1 track lasting 0 secs in a
#> Simple feature collection with 1 feature and 3 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 8.962678 ymin: 47.75754 xmax: 8.962678 ymax: 47.75754
#> Geodetic CRS: WGS 84
#> timestamp track x geometry
#> T246a.147 2018-05-15 10:05:59 T246a 3 POINT (8.962678 47.75754)
#> Track features:
#> track
#> T246a T246a
# This comes from the value associated with recorded timestamp 10:08:02,
# but closest recorded time is 10:04:01, which has a value of `x = 2`
move_data[2:3, ]
#> A <move2> with `track_id_column` "track" and `time_column` "timestamp"
#> Containing 1 track lasting 4.02 mins in a
#> Simple feature collection with 2 features and 3 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 8.964713 ymin: 47.75667 xmax: 8.964773 ymax: 47.75683
#> Geodetic CRS: WGS 84
#> geometry timestamp track x
#> T246a.2 POINT (8.964773 47.75667) 2018-05-15 10:04:01 T246a 2
#> T246a.3 POINT (8.964713 47.75683) 2018-05-15 10:08:02 T246a 3
#> Track features:
#> track
#> T246a T246a
Created on 2025-11-03 with reprex v2.1.1
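The time gaps in the output above can be checked by hand. This is only a base R sketch that copies the two recorded timestamps and the aligned timestamp from the reprex; the variable names are mine, and the UTC time zone is an assumption that does not affect the differences.

```r
# Recorded timestamps and their `x` values, copied from move_data[2:3, ]
recorded <- as.POSIXct(c("2018-05-15 10:04:01", "2018-05-15 10:08:02"), tz = "UTC")
x <- c(2, 3)

# Aligned timestamp that needs a value, copied from aligned[3, ]
target <- as.POSIXct("2018-05-15 10:05:59", tz = "UTC")

# Absolute gap in seconds between the target and each recorded timestamp
gaps <- abs(as.numeric(difftime(recorded, target, units = "secs")))
gaps
#> [1] 118 123

# A nearest-neighbor fill should take the value with the smallest gap
nearest_x <- x[which.min(gaps)]
nearest_x
#> [1] 2
```

The preceding record is 118 seconds away and the following one is 123 seconds away, so `x = 2` is the expected fill value.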
This stems from the code below, which identifies the nearest non-missing records by timestamp. Among the preceding records (left), the first non-missing entry ([1]) is always selected as the comparison timestamp for interpolation. That entry is generally at or near the start of the track, so it is usually much farther in time from the interpolation point than the following record. As a result, the interpolation essentially always selects the later timestamp, except near the beginning of the track.
Lines 197 to 199 in cf88cdc
if(!is.null(left)){
  non_na <- left[which(!is.na(this_attr[left]))[1]]
} else non_na <- NULL
This just needs to be modified to select the latest record in left rather than the first.
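One possible way to write that change, shown on toy data rather than inside fill_na_values itself: the values of this_attr and left below are made up for illustration, and reversing the index vector before taking [1] is just one option I am assuming would fit the surrounding code.

```r
# Hypothetical stand-ins for the variables used in the snippet above
this_attr <- c(1, 2, NA, 3)  # attribute values; position 3 is the record to fill
left <- 1:2                  # indices of the records preceding position 3

# Current behaviour: first non-missing record in `left`
current <- left[which(!is.na(this_attr[left]))[1]]

# Proposed behaviour: latest non-missing record in `left`.
# rev() puts the last match first, so [1] still yields NA when nothing matches,
# just as the current code does.
proposed <- left[rev(which(!is.na(this_attr[left])))[1]]

current
#> [1] 1
proposed
#> [1] 2
```

With this change the preceding candidate is the record immediately before the gap, so the comparison against the following record is between the two true neighbors.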