Problem Statement
Right now, any error that calls Error_Handler() disables RTOS interrupts and traps the MCU in an infinite loop. This is an unacceptable vulnerability for the BMS. The BMS must be highly fault-tolerant; freezing on every error, regardless of severity, is dangerous and makes debugging impossible.
Fault Considerations
There are a few things to consider:
- There will be no ability to step-debug the BMS while it is installed in the car.
- The only way to receive faults will be over CAN (assuming no CAN related faults) OR physical access to the SWD header.
- The I2C EEPROM can be used to store logs, but those logs must also be requested and read over CAN.
- Any fault that freezes the BMS or requires a power cycle will latch open the shutdown circuit.
- If the car is driving, it will turn off, and someone other than the driver must reset the fault. Only then can the car turn on again.
- These faults should be limited to critical safety errors and anything explicitly required by rules.
- Logging and sending fault status CAN messages MUST CONTINUE, especially for these kinds of faults. We don't want to shut down the only device giving us information about the battery pack! The BMS can alert us to faults before any high-voltage systems are turned on.
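Since CAN and the EEPROM are the only channels available, it may help to agree on one compact fault record that fits both. The sketch below is only an illustration under assumptions: the `fault_record_t` layout, the field names, and the little-endian byte order are all hypothetical, chosen so that one record fits a single 8-byte classic CAN data field.

```c
#include <assert.h>
#include <stdint.h>

/* Classic CAN carries at most 8 data bytes per frame. */
#define FAULT_FRAME_LEN 8u
static_assert(FAULT_FRAME_LEN <= 8u, "record must fit one classic CAN frame");

/* Hypothetical fault record; every field here is an assumption. */
typedef struct {
    uint8_t  fault_id;   /* which fault occurred */
    uint8_t  severity;   /* severity class of the fault */
    uint16_t detail;     /* fault-specific data, e.g. a cell index */
    uint32_t uptime_ms;  /* time since boot when the fault was raised */
} fault_record_t;

/* Serialize explicitly, byte by byte (little-endian), so the wire format
 * does not depend on struct padding or on the MCU's endianness. */
void fault_record_pack(const fault_record_t *rec, uint8_t out[FAULT_FRAME_LEN])
{
    out[0] = rec->fault_id;
    out[1] = rec->severity;
    out[2] = (uint8_t)(rec->detail & 0xFFu);
    out[3] = (uint8_t)(rec->detail >> 8);
    out[4] = (uint8_t)(rec->uptime_ms & 0xFFu);
    out[5] = (uint8_t)((rec->uptime_ms >> 8) & 0xFFu);
    out[6] = (uint8_t)((rec->uptime_ms >> 16) & 0xFFu);
    out[7] = (uint8_t)((rec->uptime_ms >> 24) & 0xFFu);
}

/* Inverse of fault_record_pack, e.g. for a track-side decoding tool. */
fault_record_t fault_record_unpack(const uint8_t in[FAULT_FRAME_LEN])
{
    fault_record_t rec;
    rec.fault_id  = in[0];
    rec.severity  = in[1];
    rec.detail    = (uint16_t)(in[2] | ((uint16_t)in[3] << 8));
    rec.uptime_ms = (uint32_t)in[4]
                  | ((uint32_t)in[5] << 8)
                  | ((uint32_t)in[6] << 16)
                  | ((uint32_t)in[7] << 24);
    return rec;
}
```

The same 8 bytes could be appended to an EEPROM log, so a log dump requested over CAN would reuse the decoder for live fault frames.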
Ideal Solution
- Implement detailed error/state logging.
- Logs must be granular enough to diagnose track-side errors, as they provide the only details you can possibly receive to diagnose errors.
- This also means the error handling functions must safely write to EEPROM.
- Establish different severities of faults, and clearly define which consequences place a fault in each severity class.
- Faults must trigger self-recovery attempts (e.g., re-initializing a peripheral) unless it is literally impossible or violates the competition rules.
- Map existing faults.
- Assign appropriate fault handling/severities to all possible failure points (replacing current while(1){} traps).
- Implement the hardware watchdog (IWDG).
- Ensure the independent watchdog is active to catch an overarching system error that is undetected by the software error handler.
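One possible shape for the severity mapping and recovery steps listed above is a per-fault policy table consulted by a reporting function that always returns instead of spinning. This is a minimal sketch under assumptions, not the final design: the fault IDs, the severity names, the retry counts, and the stub hooks (`fault_log`, `shutdown_circuit_open`, `eeprom_i2c_reinit`) are all hypothetical placeholders for the real EEPROM, CAN, and shutdown-circuit code. Treating exhausted recovery attempts as critical is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { FAULT_SEV_WARNING, FAULT_SEV_RECOVERABLE, FAULT_SEV_CRITICAL } fault_severity_t;

/* Hypothetical fault IDs; the real list comes from mapping existing faults. */
typedef enum { FAULT_CAN_RX_OVERRUN, FAULT_EEPROM_I2C, FAULT_CELL_OVERVOLTAGE, FAULT_COUNT } fault_id_t;
static_assert(FAULT_COUNT <= 32, "fault history bitmask is 32 bits wide");

typedef enum { FAULT_ACTION_LOGGED, FAULT_ACTION_RECOVERED, FAULT_ACTION_SHUTDOWN } fault_action_t;

typedef struct {
    fault_severity_t severity;
    bool (*recover)(void);   /* NULL when no self-recovery is possible */
    uint8_t max_retries;
} fault_policy_t;

/* Stub state and hooks so the sketch runs on a host; on the BMS these
 * would write to EEPROM, send CAN frames, and drive the shutdown output. */
static uint32_t g_fault_history;
static bool     g_shutdown_open;
static unsigned g_reinit_attempts;

static void fault_log(fault_id_t id) { g_fault_history |= (uint32_t)1u << (unsigned)id; }
static void shutdown_circuit_open(void) { g_shutdown_open = true; }
/* Stub: pretends the peripheral comes back on the second re-init attempt. */
static bool eeprom_i2c_reinit(void) { return ++g_reinit_attempts >= 2u; }

static const fault_policy_t fault_policies[FAULT_COUNT] = {
    [FAULT_CAN_RX_OVERRUN]   = { FAULT_SEV_WARNING,     NULL,              0u },
    [FAULT_EEPROM_I2C]       = { FAULT_SEV_RECOVERABLE, eeprom_i2c_reinit, 3u },
    [FAULT_CELL_OVERVOLTAGE] = { FAULT_SEV_CRITICAL,    NULL,              0u },
};

/* Replaces a while(1){} trap: log, try to recover, escalate if needed,
 * and always return so logging and CAN reporting keep running. */
fault_action_t fault_report(fault_id_t id)
{
    if ((unsigned)id >= (unsigned)FAULT_COUNT) {
        shutdown_circuit_open();            /* unknown fault: fail safe */
        return FAULT_ACTION_SHUTDOWN;
    }
    const fault_policy_t *policy = &fault_policies[id];
    fault_log(id);                          /* log before acting on the fault */

    if (policy->severity == FAULT_SEV_WARNING) {
        return FAULT_ACTION_LOGGED;
    }
    if (policy->severity == FAULT_SEV_RECOVERABLE && policy->recover != NULL) {
        for (uint8_t attempt = 0u; attempt < policy->max_retries; attempt++) {
            if (policy->recover()) {
                return FAULT_ACTION_RECOVERED;
            }
        }
    }
    /* Critical fault, or recovery attempts exhausted (assumed escalation). */
    shutdown_circuit_open();
    return FAULT_ACTION_SHUTDOWN;
}
```

Because `fault_report` returns in every case, the calling task can continue to run and refresh the independent watchdog, so a watchdog reset would only occur when the firmware is genuinely stuck.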