Fix 1970 date after crash/watchdog/brownout reset on ESP32#1896

Open
weebl2000 wants to merge 1 commit into meshcore-dev:dev from weebl2000:robust-time-esp32

Conversation

@weebl2000
Contributor

@weebl2000 weebl2000 commented Mar 2, 2026

Currently, the time is only set to ~2024 when the device powers on cleanly. When the time is lost after a crash, watchdog or brownout reset, it drops to 1970. This PR saves the time as a backup in RTC slow memory, which should keep the time accurate across brownouts etc.
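A minimal sketch of the backup idea described above, not the code in this PR. All names (`TimeBackup`, `saveTime`, `timeAfterReset`, the magic value and the 2024 default) are hypothetical. On an ESP32 the record would typically be declared with an attribute such as `RTC_NOINIT_ATTR` so that its contents are not re-initialised on a reset; here it is a plain struct so the validation logic can be run on a host.

```cpp
#include <cstdint>

// Hypothetical record kept in RTC slow memory (plain struct on the host).
struct TimeBackup {
  uint32_t magic;  // marks the record as written by this firmware
  uint32_t epoch;  // last known UNIX time in seconds
  uint32_t check;  // magic XOR epoch, guards against random memory content
};

constexpr uint32_t kBackupMagic = 0x54494D45u;  // "TIME", arbitrary choice
constexpr uint32_t kDefaultEpoch = 1704067200u; // 2024-01-01 00:00:00 UTC, assumed default

// Called periodically while the clock is known to be good.
void saveTime(TimeBackup& b, uint32_t now) {
  b.magic = kBackupMagic;
  b.epoch = now;
  b.check = kBackupMagic ^ now;
}

// A record is trusted only if magic and checksum match and the stored
// time is not older than the firmware default.
bool backupValid(const TimeBackup& b) {
  return b.magic == kBackupMagic &&
         b.check == (b.magic ^ b.epoch) &&
         b.epoch >= kDefaultEpoch;
}

// Clock value to use right after a reset.
uint32_t timeAfterReset(const TimeBackup& b) {
  return backupValid(b) ? b.epoch : kDefaultEpoch;
}
```

The checksum matters because memory that is never initialised holds arbitrary content after a real power loss; without it, garbage could be mistaken for a saved time.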

@AI7NC this should fix your problems as raised in #1349 (comment)

You can build this PR here: https://mcimages.weebl.me?commitId=robust-time-esp32 or I recommend building dev_plus, which has more fixes for Heltec V4, SX1262.

@towerviewcams

@weebl2000 I have this issue on almost all my V4 boards. This won't include the Beta 13.1 PS build with the AGC fix that @IoTThinks has in the test phase and that is currently working well, right?

@weebl2000
Contributor Author

> @weebl2000 I have this issue on almost all my V4 boards. This won't include the Beta 13.1 PS build with the AGC fix that @IoTThinks has in the test phase and that is currently working well, right?

Correct. Which PRs are those? I can create a merged branch for you if you want to test.

@IoTThinks
Contributor

IoTThinks commented Mar 2, 2026

@towerviewcams Currently, MC falls back to May 2025 if it is reset by button or crash.
PowerSaving 13 is based on MC 1.13 main, to focus on stability and power saving.

This PR is not yet in main.
Feel free to test this PR first.

@AI7NC

AI7NC commented Mar 2, 2026

@weebl2000 This is the AGC branch we've been testing together for the V4: #1743

@towerviewcams

towerviewcams commented Mar 3, 2026

@weebl2000 @IoTThinks @AI7NC
Trying not to confuse anyone here. So the clock issue is preventing the V4 from adverting because of whatever happens when it crashes. My V4 boards are crashing every 6-15 hours. I have even lowered TX power on the off chance that it is a power issue, and I still have several doing this.

We know that Beta 13.1 powersaving is working FANTASTIC @IoTThinks, BUT this issue is here as well.

So, last night, I raised my Flood advert timer from 10 to 24hrs. I haven't had any resets and I'm over 12hrs. Local advert is now zero instead of 240 mins. We have a busy mesh here in Oregon transmitting a lot, buzzing away with traffic.

  • I did a test with two repeaters set to 6hr flood advert time and they both crashed within 15 mins of each other at the 6hr mark.

I would like to ask that the Flood Advert and Zero Hop code be checked. For my nodes, the crashes seem to happen at around the interval my timers were set to. When I get up in the morning, I'll know for sure once my nodes try to advert at the 24hr mark.

It's happening with Powersaving 13.1 Beta for me. It might have been carried forward without being noticed. I'm hoping someone can reproduce this.

@IoTThinks
Contributor

IoTThinks commented Mar 3, 2026

@towerviewcams Yes, it is a known issue in the current main MeshCore.

And I'd love to have regular adverts too.

You can test this PR locally on your test repeaters first.
Please report back if it works for you.

We will test this PR on our Heltec and NRF52 boards under many scenarios to ensure stability.

Once it is OK, we will include it in our PS Beta 13.1 this weekend.

@weebl2000 weebl2000 force-pushed the robust-time-esp32 branch from a44f7fc to ab1c35f Compare March 3, 2026 14:36
@towerviewcams

> @towerviewcams Yes, it is a known issue in the current main MeshCore.
>
> And I'd love to have regular adverts too.
>
> You can test this PR locally on your test repeaters first. Please report back if it works for you.
>
> We will test this PR on our Heltec and NRF52 boards under many scenarios to ensure stability.
>
> Once it is OK, we will include it in our PS Beta 13.1 this weekend.

@IoTThinks @weebl2000 I just checked and both of my V4s did reboot or crash (not sure which) at the 24hr advert time. Then my node rebooted or crashed again when I did an admin login 10 mins later. So weird.

OK, I will test this PR on two V4 boards running at full power and on the same power systems that I build for all my repeaters. Hopefully @AI7NC can test as well. I'll set these to advert every 6hrs and really put them to the test.

@AI7NC

AI7NC commented Mar 3, 2026

Both of my test repeaters running 13.1 have randomly rebooted since I set them up at 7pm yesterday. They're each on fully charged 15 Ah batteries, so it can't be a power supply brownout. Something is causing reboots. Also, the return loss of the antenna system, measured at the IPX connector that connects to the Heltec, is -17 dB (1.3:1) on one and -18.5 dB on the other, so I don't think it's RF returning from the filter/antenna system.

On the off chance it is RF coming in, I've set tx to 10dBm for both units and rebooted. I doubt 10 dBm going into a 1.3:1 is going to cause resets.
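The return loss figures above can be converted to VSWR and to reflected power with the standard relations |Γ| = 10^(-RL/20) and VSWR = (1 + |Γ|) / (1 - |Γ|). The sketch below is only an illustration of that arithmetic; the function names are made up here, and the return loss is passed as a positive number of dB.

```cpp
#include <cmath>

// Magnitude of the reflection coefficient for a return loss in dB
// (pass 17.0 for a reading of "-17 dB").
double reflectionCoefficient(double returnLossDb) {
  return std::pow(10.0, -returnLossDb / 20.0);
}

// Voltage standing wave ratio for the same return loss.
double vswr(double returnLossDb) {
  const double gamma = reflectionCoefficient(returnLossDb);
  return (1.0 + gamma) / (1.0 - gamma);
}

// Power coming back from the antenna system, in dBm, for a given
// forward power in dBm.
double reflectedDbm(double forwardDbm, double returnLossDb) {
  return forwardDbm - returnLossDb;
}
```

With these relations, 17 dB of return loss works out to a VSWR of about 1.33:1 and 18.5 dB to about 1.27:1. At 10 dBm forward power and 17 dB return loss, the reflected power is -7 dBm, roughly 0.2 mW.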

signal-2026-03-03-130210_002 signal-2026-03-03-130210_003

@AI7NC

AI7NC commented Mar 3, 2026

I've set both my nodes to advert at the 3 hour mark to confirm that the advert triggers the reset. Then I'll try a long advert interval to see if they run > 12 hours (the default I've used).

@towerviewcams

Earlier this morning, I set up two V4 repeaters. They are running and have already passed their first six-hour flood advert. So far so good for both of them; more testing time is needed, obviously.

image

@AI7NC

AI7NC commented Mar 3, 2026

@towerviewcams It does seem both sent out the advert without rebooting. I'm going to stick with the old 13.1 firmware until the 3 hr mark to confirm broken state at advert. Then I'll look into applying this PR to those nodes and retest with the same conditions.

image image

@towerviewcams

towerviewcams commented Mar 3, 2026

> @towerviewcams It does seem both sent out the advert without rebooting. I'm going to stick with the old 13.1 firmware until the 3 hr mark to confirm broken state at advert. Then I'll look into applying this PR to those nodes and retest with the same conditions.
>
> image image

OK, sounds good. I have five repeaters all running 13.1 beta and they all do it, even if I change the timer length. Other nodes on stock 13.1 firmware reset as well. So I believe it is something that has been carried forward from stock 1.13.

@weebl2000
Contributor Author

@AI7NC if you're able to, try soldering a 100-470 uF cap on the 3.3V and/or 5V rail. It'll help with the bursty power load of the Heltec V4.

@towerviewcams

> @AI7NC if you're able to, try soldering a 100-470 uF cap on the 3.3V and/or 5V rail. It'll help with the bursty power load of the Heltec V4.

Asking why this would be needed. I've never seen this mentioned, and all my repeaters are V4 and didn't have issues until 1.13 and variants. The max current spike that I see running at full power is about 850 mA, no issue. I've tested that even at 3.2 volts and it's solid.

@weebl2000
Contributor Author

weebl2000 commented Mar 4, 2026

> > @AI7NC if you're able to, try soldering a 100-470 uF cap on the 3.3V and/or 5V rail. It'll help with the bursty power load of the Heltec V4.
>
> Asking why this would be needed. I've never seen this mentioned, and all my repeaters are V4 and didn't have issues until 1.13 and variants. The max current spike that I see running at full power is about 850 mA, no issue. I've tested that even at 3.2 volts and it's solid.

It's not really needed usually, but if you're experiencing resets it's worth a try.

My heltec v4 repeater is running fine 24/7 with my dev_plus branch.

Screenshot_20260304-012436

@weebl2000
Contributor Author

weebl2000 commented Mar 4, 2026

@AI7NC actually, interested to know: is this a node that has been running for longer, also on previous versions? v1.11.0 or earlier?

If so I might know what is causing the crashes heh.

@towerviewcams same question: all these nodes that are crashing, were they ever running v1.11.0 or earlier?

@AI7NC

AI7NC commented Mar 4, 2026

My two test nodes have, I think, only run vanilla 1.13 prior to adding the '13.1' changes we're testing.

They're set at only 10 dBm TX output and we're coming up on the 3 hour advert. I strongly doubt it's a brownout, and the lower TX power should rule that out, especially with the timing always coinciding with the advert period and not with high-duty transmissions, which haven't seemed to cause reboots.

@weebl2000
Contributor Author

> My two test nodes have, I think, only run vanilla 1.13 prior to adding the '13.1' changes we're testing.
>
> They're set at only 10 dBm TX output and we're coming up on the 3 hour advert. I strongly doubt it's a brownout, and the lower TX power should rule that out, especially with the timing always coinciding with the advert period and not with high-duty transmissions, which haven't seemed to cause reboots.

What's the 13.1 changes you're testing? Sorry I'm a little confused lol.

@AI7NC

AI7NC commented Mar 4, 2026

> > My two test nodes have, I think, only run vanilla 1.13 prior to adding the '13.1' changes we're testing.
> >
> > They're set at only 10 dBm TX output and we're coming up on the 3 hour advert. I strongly doubt it's a brownout, and the lower TX power should rule that out, especially with the timing always coinciding with the advert period and not with high-duty transmissions, which haven't seemed to cause reboots.
>
> What's the 13.1 changes you're testing? Sorry I'm a little confused lol.

#1743 is what we were dubbing 13.1.

Update: my two nodes adverted at the 3hr mark but didn't crash. I guess I have to keep them running to see if they still ultimately restart.

image signal-2026-03-03-170616_003 signal-2026-03-03-170616_002

@weebl2000
Contributor Author

weebl2000 commented Mar 4, 2026

> #1743 is what we were dubbing 13.1.
>
> Update: my two nodes adverted at the 3hr mark but didn't crash. I guess I have to keep them running to see if they still ultimately restart.

Hmm. I've also been running the AGC reset for a week and it hasn't crashed yet. Is this the only change you're running?

Granted, I'm running dev_plus and it includes more changes than just the AGC reset.

@AI7NC

AI7NC commented Mar 4, 2026

This was the distro we were running for testing: https://github.com/IoTThinks/EasySkyMesh/releases/tag/PowerSaving13

@weebl2000
Contributor Author

> This was the distro we were running for testing: https://github.com/IoTThinks/EasySkyMesh/releases/tag/PowerSaving13

Please try this build https://mcimages.weebl.me/?commitId=PowerSaving-v13.1 - perhaps it fixes the resets.

@towerviewcams

> @AI7NC actually, interested to know: is this a node that has been running for longer, also on previous versions? v1.11.0 or earlier?
>
> If so I might know what is causing the crashes heh.
>
> @towerviewcams same question: all these nodes that are crashing, were they ever running v1.11.0 or earlier?

Yes, I had a couple that were running 1.11 and I did not have this problem.

@towerviewcams

towerviewcams commented Mar 4, 2026

> > This was the distro we were running for testing: https://github.com/IoTThinks/EasySkyMesh/releases/tag/PowerSaving13
>
> Please try this build https://mcimages.weebl.me/?commitId=PowerSaving-v13.1 - perhaps it fixes the resets.

@weebl2000 Kevin (@IoTThinks) said that if the clock reset fix worked, he would merge it into his 13.1 beta powersaving this weekend. Right now, 13.1 Beta has the clock crash issue. I'm not running or testing the 13 build that @AI7NC mentioned at all. My repeaters are on 13.1 beta and crashing.

I just noticed that it looks like you've already done that here on your image builder.

@IoTThinks Kevin, do you think we should try this, or continue running the clock patch PR, which has been very solid so far, until you bring it in this weekend? I'm headed to bed now to get up very early for work, but I will check, can put the image in before I leave, and will be able to monitor everything remotely.

My two test V4 repeaters are solid on time, with no crashes so far, BTW. So getting this together with the 13.1 Beta that already has the AGC fix would be the next step, I'm guessing?

@IoTThinks
Contributor

If your Heltec V4 keeps crashing, there is some issue.

You could reduce the TX power to 22 dBm to see if there are any more resets.

At a TX power of 29 dBm, the Heltec V4 draws up to 750 mA in short spikes, which may cause a reset on a weak battery.

This PR mitigates the clock issue after a reset; it does not fix the root cause of the reset.

Let me test this PR and merge it into PowerSaving 13.1 in a day or two.

@IoTThinks
Contributor

@weebl2000

> Fix 1970 date after crash/watchdog/brownout reset on ESP32

Have you generated crash/watchdog/brownout reset events on ESP32 during testing?
Since this PR is meant to fix the time after a crash/watchdog/brownout reset on ESP32, we need to generate these events and check that we get the expected outcome with no side effects.
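One way to write down the expected outcome per reset type is a small test matrix. The sketch below is a host-side illustration only, not code from this PR: the enum is a stand-in for the causes that ESP-IDF reports through `esp_reset_reason()` (`ESP_RST_POWERON`, `ESP_RST_SW`, `ESP_RST_PANIC`, `ESP_RST_INT_WDT`, `ESP_RST_TASK_WDT`, `ESP_RST_BROWNOUT`), and the policy that every reset except a clean power-on should restore the backup is an assumption drawn from the PR description.

```cpp
// Host-side stand-in for the reset causes reported on the device.
enum class ResetReason { PowerOn, Software, Panic, IntWatchdog, TaskWatchdog, Brownout };

// Where the clock value is expected to come from after the reset.
enum class TimeSource { FirmwareDefault, RtcBackup };

// Assumed policy: a clean power-on starts from the firmware default;
// any other reset restores the backup if it passes validation.
TimeSource expectedTimeSource(ResetReason reason, bool backupValid) {
  if (reason == ResetReason::PowerOn) {
    return TimeSource::FirmwareDefault;
  }
  return backupValid ? TimeSource::RtcBackup : TimeSource::FirmwareDefault;
}
```

On real hardware, a software reset can be forced with `esp_restart()` and a panic with `abort()`. A brownout has to be provoked by lowering the supply voltage, and whether the backup survives that case is exactly what such a test would need to confirm.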

@towerviewcams

towerviewcams commented Mar 4, 2026

@IoTThinks Mine are not crashing with this PR. Both of the test boards that I built yesterday, and posted photos of above, are solid. However, I am having the reset issue with 13.1 Beta. I tried turning the power down and it still does it. Current draw at full power is closer to 850 mA, measured with my test gear. All my batteries are 8,000 mAh. We have a very busy mesh here in Oregon, with lots of transmitting. Here is the interesting part: if I set the Flood advert to 24 hrs, then my reset takes longer, closer to 24hrs. I'm also experiencing the reset/crash right at admin remote login to the repeater. It's too weird.

To rule out any possible issues with my repeaters in the field, I built the two test repeaters at my home yesterday. They are rock solid, running only stock 1.13 updated with this PR, robust-time-esp32.

My thoughts:
Kevin, once you merge this into Beta 13.1, which also has the AGC fix, I will OTA it onto both my test repeaters here. Then we will know for sure what's happening. I travel for work, otherwise I would OTA the image that @weebl2000 suggested. I'll be back Thursday night. Kevin, I'm hoping you can merge by Friday so that a good test can happen this weekend, if you're OK with that, sir. I will also put in the image that @weebl2000 posted above for PowerSaving-v13.1; I have not tried this yet. Maybe both will be ready to test Friday?

4 participants