Skip to content

Conversation

@Lex-Ashu
Copy link
Contributor

@Lex-Ashu Lex-Ashu commented Nov 3, 2025

Pull Request

Description

Fixes a bug in persistence error calculation when dropout is applied to historical PV site data. Previously, the persistence baseline would use dropped values (-1), causing huge errors in wandb metrics. The fix finds the last valid (non-dropped) value before the forecast period instead of naively using the last index position.

Fixes #485

How Has This Been Tested?

Created a test script that verifies the persistence calculation logic works correctly:

  1. Normal case: Last value before forecast is valid - ✓ PASS
  2. Dropout case: Last value is dropped (-1) - ✓ PASS - correctly finds previous valid value
  3. Multiple drops: Multiple dropped values in history - ✓ PASS - correctly finds last valid value

All tests passed. Existing end-to-end tests continue to pass, confirming no regression.

  • Yes

Checklist:

  • My code follows OCF's coding style guidelines

  • I have performed a self-review of my own code

  • I have made corresponding changes to the documentation

  • I have added tests that prove my fix is effective or that my feature works

  • I have checked my code and corrected any misspellings

@Lex-Ashu
Copy link
Contributor Author

Lex-Ashu commented Nov 3, 2025

Hey @AUdaltsova Please have a look, it took me time cause the PR run was failing and i was not able to get the exact issue despite changing a little things.
The CI infrastructure issues is not related to the changes I made but due to a missing pip installation in the reusable workflow.
Do guide me if I am missing something.
Thank you for assigning me.

@Sukh-P
Copy link
Member

Sukh-P commented Nov 4, 2025

Hi @Lex-Ashu just an FYI we are looking into this CI pip error currently and should have a solution soon, thanks

@Lex-Ashu
Copy link
Contributor Author

Lex-Ashu commented Nov 4, 2025

Hi @Lex-Ashu just an FYI we are looking into this CI pip error currently and should have a solution soon, thanks

Works, From the error what i understood is that it is not able to build the package tensorstore cause the Python environment used to build doesn't have a pip command so its not able to build these. I feel workflows/pull_ci.yml needs to have a pip in it so that it can build these things up just an opinion.
Rest let me know if the PR is ready to be merged or i should change something.
Thanks

@Sukh-P
Copy link
Member

Sukh-P commented Nov 4, 2025

@Lex-Ashu the latest changes in pvnet fix this issue so if you merge in the latest changes on main the CI should pass, thanks

@AUdaltsova
Copy link
Contributor

Hi @Lex-Ashu, thanks so much for taking on this & making the change! Sorry I thought I've left a comment about this but I've obviously forgotten: would it be possible to add a small test to this? Hopefully there is something in the test suit you can base it on, but let me know if you have trouble with it. Nothing too elaborate, I would just check that it selects what you expect it to when a) all data is there b) some data at the tail end is -1. I would also maybe do a very quick check before error calculation that a) there is some history data and b) that not all of it is < 0 and throw a warning if not. This is very unlikely but I think it would make it a lot easier to debug if something here breaks.

Thanks again! Let me know if you have any questions.

@AUdaltsova
Copy link
Contributor

Hi @Lex-Ashu, it's been a bit so just wanted to check in - are you planning to return to this issue?

@Lex-Ashu
Copy link
Contributor Author

Hi @Lex-Ashu, it's been a bit so just wanted to check in - are you planning to return to this issue?

Hey,
Sorry for the delay had a little accident so took a break and right now having my end semester exam, No worries I will commit the next change by tom after my exam.
You are just looking for a test file to check the new changes right.
If so that would be done by tom or if you want to add anything let me know I will fix it.
Thank you

@AUdaltsova
Copy link
Contributor

Hi @Lex-Ashu, no need to apologize! Thank you for helping with this :) By the way, I've now tried your fix in action and it works very well, thank you! I've looked at this again, and I'd actually be happy to merge this in if that's alright by you.

Best of luck with your exams!

@Lex-Ashu
Copy link
Contributor Author

@AUdaltsova Sure if it works I am glad I was able to help with this, incase we need a test file for this I will create a new PR for that.
Thanks

@AUdaltsova
Copy link
Contributor

Hi @Lex-Ashu, so there is unfortunately some unrelated issue with the CI run; would you be able to add

[tool.pytest]
tmp_path_retention_policy = "none"
tmp_path_retention_count = "1"

to pyproject to see if that helps?

@AUdaltsova AUdaltsova closed this Jan 6, 2026
@AUdaltsova AUdaltsova reopened this Jan 6, 2026
@AUdaltsova AUdaltsova merged commit 0b257e7 into openclimatefix:main Jan 6, 2026
16 of 21 checks passed
@AUdaltsova
Copy link
Contributor

@all-contributors please add @Lex-Ashu for code

@allcontributors
Copy link
Contributor

@AUdaltsova

@Lex-Ashu already contributed before to code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Persistence error on wandb doesn't account for dropout in site data

3 participants