Simplex(): inconsistent indexing between lib/pred (1-based) and outputs (0-based) #70

@cxweoth

Simplex() currently mixes two indexing conventions:

  • lib / pred use 1-based indexing
  • returned results (Projection.Time, embedding rows, targets) use 0-based indexing

This causes an off-by-one mismatch that is easy to misinterpret.
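To make the mismatch concrete, here is a minimal sketch of the conversion a user currently has to do by hand. The helper name one_based_range_to_indices is hypothetical and not part of pyEDM; it only assumes that lib / pred strings are 1-based and inclusive, as described above.

```python
def one_based_range_to_indices(spec):
    """Convert a 1-based inclusive range string such as "11 12"
    into the list of 0-based row indices it covers.

    Hypothetical helper for illustration only; not a pyEDM function.
    """
    start, stop = (int(tok) for tok in spec.split())
    if start < 1 or stop < start:
        raise ValueError(f"invalid 1-based range: {spec!r}")
    # 1-based inclusive [start, stop] maps to 0-based [start - 1, stop - 1]
    return list(range(start - 1, stop))


print(one_based_range_to_indices("11 12"))  # [10, 11]
```

So pred = "11 12" refers to the rows that numpy would address as x[10] and x[11].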


Minimal example

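This example is intended to forecast the 11th and 12th elements (pred = "11 12").
However, the returned rows have Time values 10, 11, and 12, which match the 11th, 12th, and 13th observations: predictions are given for the 12th and 13th elements, while the 11th element's prediction is NaN.
Because the inputs are 1-based and the returned Time values are 0-based, it is unclear which time step each row is actually forecasting.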

import numpy as np
import pandas as pd
from pyEDM import Simplex

beta, rho, sigma = 8/3, 28.0, 10.0
dT, iterations = 0.01, 1000

x, y, z = np.empty(iterations), np.empty(iterations), np.empty(iterations)
x[0], y[0], z[0] = (0., 1., 1.05)

for i in range(iterations - 1):
    dxdt = sigma * (y[i] - x[i])
    dydt = x[i] * (rho - z[i]) - y[i]
    dzdt = x[i] * y[i] - beta * z[i]
    x[i+1] = x[i] + dxdt * dT
    y[i+1] = y[i] + dydt * dT
    z[i+1] = z[i] + dzdt * dT


x_df = pd.DataFrame({
    "Time": np.arange(len(x[:20])),
    "X": x[:20]
})

E = 3        # embedding dimension
Tp = 1       # predict one step ahead: x_{t+1}
tau = -1      # delay (pyEDM param is tau)

lib  = "1 10"     # rows 1–10 (1-based) form the library (training set)
pred = "11 12"    # rows 11–12 (1-based) are the intended forecast targets

simplex_out = Simplex(
    dataFrame=x_df,
    lib=lib,
    pred=pred,
    E=E,
    Tp=Tp,
    tau=tau,
    embedded=False,
    columns="X",
    target="X",
    returnObject=True,
)

print("Observation of element 11:", x[10])
print("Observation of element 12:", x[11])
print("Observation of element 13:", x[12])
print(simplex_out.Projection)

Output:

Observation of element 11: 0.8522208237292
Observation of element 12: 0.958566910264601
Observation of element 13: 1.0754996307533533
   Time  Observations  Predictions  Pred_Variance
0    10      0.852221          NaN            NaN
1    11      0.958567     0.670722       0.007756
2    12      1.075500     0.655379       0.008449
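The mapping implied by this output can be sketched as follows. This is only a reading of the three rows printed above, not a description of pyEDM internals: it assumes the Time column is 0, 1, 2, ... as in the example, that pred is 1-based and inclusive, that Tp >= 0, and that the first Tp rows carry no prediction. The function projection_rows is hypothetical.

```python
def projection_rows(pred, Tp):
    """List the rows a Projection would contain under the assumptions
    stated above (hypothetical helper; inferred from the example output)."""
    start, stop = (int(tok) for tok in pred.split())
    rows = []
    # Output spans the pred rows plus Tp extra rows at the end
    for element in range(start, stop + Tp + 1):
        rows.append({
            "Time": element - 1,          # 0-based value shown in the output
            "element": element,           # 1-based position in the series
            "has_prediction": element >= start + Tp,
        })
    return rows


for row in projection_rows("11 12", Tp=1):
    print(row)
```

For pred = "11 12" and Tp = 1 this gives Time 10 (element 11, no prediction), Time 11 (element 12), and Time 12 (element 13), which is the pattern in the output above.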

Suggestion

Please consider either:

  • using 0-based indexing everywhere (numpy style), or
  • returning 1-based times consistently, or
  • clearly documenting the mismatch.

Thanks for the great library!

