Simplex(): inconsistent indexing between lib/pred (1-based) and outputs (0-based)
Simplex() currently mixes two indexing conventions:
- lib/pred use 1-based indexing
- returned results (Projection.Time, embedding rows, targets) use 0-based indexing
This causes an off-by-one mismatch that is easy to misinterpret.
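To make the two conventions concrete, here is a minimal sketch of how a 1-based, inclusive "start stop" string maps onto 0-based row positions. The helper `spec_to_zero_based` is hypothetical and written only for illustration; it is not part of pyEDM and does not claim to reflect how pyEDM parses lib/pred internally.

```python
from typing import List


def spec_to_zero_based(spec: str) -> List[int]:
    """Convert a 1-based, inclusive "start stop" string to 0-based row positions.

    Hypothetical helper for illustration only; not a pyEDM function.
    """
    start, stop = (int(token) for token in spec.split())
    if start < 1 or stop < start:
        raise ValueError(f"invalid 1-based range: {spec!r}")
    # 1-based inclusive [start, stop] becomes 0-based half-open [start - 1, stop)
    return list(range(start - 1, stop))


lib_rows = spec_to_zero_based("1 10")    # 0-based positions 0..9
pred_rows = spec_to_zero_based("11 12")  # 0-based positions 10 and 11
print(pred_rows)  # [10, 11]
```

Under this reading, pred = "11 12" selects the rows at 0-based positions 10 and 11, which is where the one-position shift between the input specification and 0-based labels comes from.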
Minimal example
This example is intended to forecast the 11th and 12th elements (pred = "11 12").
However, the predictions in the output belong to the 12th and 13th elements (Time 11 and 12), while the 11th element (Time 10) has no prediction.
This off-by-one behavior makes it unclear which time step is actually being forecast.
import numpy as np
import pandas as pd
from pyEDM import Simplex
beta, rho, sigma = 8/3, 28.0, 10.0
dT, iterations = 0.01, 1000
x, y, z = np.empty(iterations), np.empty(iterations), np.empty(iterations)
x[0], y[0], z[0] = (0., 1., 1.05)
# Forward-Euler integration of the Lorenz system
for i in range(iterations - 1):
    dxdt = sigma * (y[i] - x[i])
    dydt = x[i] * (rho - z[i]) - y[i]
    dzdt = x[i] * y[i] - beta * z[i]
    x[i+1] = x[i] + dxdt * dT
    y[i+1] = y[i] + dydt * dT
    z[i+1] = z[i] + dzdt * dT
# Keep only the first 20 samples; the Time column is 0-based (0..19)
x_df = pd.DataFrame({
    "Time": np.arange(len(x[:20])),
    "X": x[:20]
})
E = 3 # embedding dimension
Tp = 1 # predict one step ahead: x_{t+1}
tau = -1 # delay (pyEDM param is tau)
lib = "1 10" # use rows 1 to 10 (1-based) to build the attractor (training/library)
pred = "11 12" # forecast rows 11 to 12 (1-based) using neighbors from the library
simplex_out = Simplex(
    dataFrame=x_df,
    lib=lib,
    pred=pred,
    E=E,
    Tp=Tp,
    tau=tau,
    embedded=False,
    columns="X",
    target="X",
    returnObject=True,
)
print("Observation of element 11:", x[10])
print("Observation of element 12:", x[11])
print("Observation of element 13:", x[12])
print(simplex_out.Projection)
Output:
Observation of element 11: 0.8522208237292
Observation of element 12: 0.958566910264601
Observation of element 13: 1.0754996307533533
   Time  Observations  Predictions  Pred_Variance
0    10      0.852221          NaN            NaN
1    11      0.958567     0.670722       0.007756
2    12      1.075500     0.655379       0.008449
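As a reading aid, the sketch below relabels the rows of the output above with 1-based element numbers and lists which elements actually received a prediction. It assumes that Time equals the 0-based row position, which holds here because the Time column was built with np.arange; the helper `to_one_based` is hypothetical and not part of pyEDM.

```python
import math

# Rows copied from the Projection output above: (Time, Observations, Predictions)
projection_rows = [
    (10, 0.852221, math.nan),
    (11, 0.958567, 0.670722),
    (12, 1.075500, 0.655379),
]


def to_one_based(rows):
    """Return (element_number, observation, prediction) with 1-based element numbers.

    Assumes Time is the 0-based row position, as in this example.
    """
    return [(time + 1, obs, pred) for time, obs, pred in rows]


labelled = to_one_based(projection_rows)
# Elements whose prediction is not NaN
predicted = [element for element, _, pred in labelled if not math.isnan(pred)]
print(predicted)  # [12, 13]
```

With this relabeling, the rows cover elements 11, 12 and 13, and only elements 12 and 13 carry predictions, matching the description above.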
Suggestion
Please consider either:
- using 0-based indexing everywhere (numpy style), or
- returning 1-based times consistently, or
- clearly documenting the mismatch.
Thanks for the great library!