-
Notifications
You must be signed in to change notification settings - Fork 21
Add hooks to handle simulation failure #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
We include a `post_sim_fail_hook` method in desmo and enable components to inject behavior upon simulation failure. Components can use this hook to clean up simulation state or artifacts.
jpgrayson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of providing a means for a model to have top-level error handling is noble, but I'm not sure having a hook method on Component is ultimately going to be a workable paradigm.
| except BaseException as e: | ||
| env.tracemgr.trace_exception() | ||
| result['sim.exception'] = repr(e) | ||
| top.post_fail() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If either top_type.pre_init() or top_type() fail, then the top local will not exist when entering this exception handler. This would lead to a NameError here where we reference top.
To retain this strategy of calling top.post_fail(), yet another nested try/except would have to be added after top is initialized.
The new indentation level would be awkward, but it also means that the scope of error catching would exclude top_type.__init__() and top_type.pre_init(). Anyone writing a Component that implements a post_fail_hook() would need to comprehend this limitation. It also means that there would not be a solution to handling exceptions raised during top.pre_init() or top.__init__().
A further complication is that post_fail_hook() methods might be called either during or after post_simulate(). Thus post_fail_hook() implementations would have to take care to idempotently tear-down any resources that might have been affected by post_simulate().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out the possibility of top being unavailable here.
Yes, you are right in the sense that anyone writing a Component with a post_fail_hook() should be aware that a valid Component exists for Component.post_fail_hook() to be called. I don't know if that's a big deal. Another possibility is to use have a Component.post_fail_hook() and SimEnvironment.post_fail_hook() to handle the case where Component is invalid. I'm not sure if that's necessary.
I don't understand what you mean by tear-down any resources that might have been affected by post_simulate(). Can you elaborate on what you mean by resources?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding resource teardown, the stated goal for this changeset was to allow components to clean-up in the event of an unhandled exception. I'm using "resource" to refer to anything that would be involved in this clean-up; i.e. in the sense of RAII.
The key issue I'm pointing out is that if a component has resources to finalize/destroy/close/deallocate, then it will ostensibly be doing that either in post_simulate() or get_result() in the non-exception paths. Since proposed post_fail_hook() is called from a try/except that covers elaborate, run, post_simulate, and get_result, that means post_fail_hook could end up being called before, during, or after post_simulate; or before or during get_result. Thus resources may already be deallocated prior to post_fail_hook().
The requirement for deallocation of resources to be idempotent is not particularly unusual, but it does impose some cognitive burden on people writing componets with a post_fail_hook(). We should be aware that adding post_fail_hook has this cost. And it's docstring should probably highlight the wide variety of states the top_type instance might be in when an exception triggers post_fail_hook.
Regarding the possibility of having a SimEnvironment.post_fail_hook(), adding hooks to SimEnvironment would be at odds with the existing paradigm for extending SimEnvironment's behavior by subclassing with overriding methods.
As an aside, I chose to provide hooks for Component because there is a recursive nature to elaborate, post_simulate, and get_result that would make it some what tricky for users of Component to place their super method calls correctly. Using hooks abstracts away that complexity of users needing to know where/when/whether to call the super methods.
SimEnvironment, on the other hand, just has a single instance, so it is not such a big deal for a subclass to override its methods. For the problem at hand, overriding SimEnvironment.run() would be a reasonable thing for a subclass to do if it wanted to handle simulation-time errors in a special manner.
I wonder if it is sufficient for most models to attach resources that need cleanup to the SimEnvironment instance and wrap SimEnvironment.run() in a try/except block in a SubEnvironment.run() method?
| child.post_fail() | ||
| self.post_sim_fail_hook() | ||
|
|
||
| def post_sim_fail_hook(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For naming, I would advocate on_exception() and exception_hook().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean, replace post_sim_fail_hook() with on_exception() and post_fail() with on_exception()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
post_sim_fail_hook() --> exception_hook()
post_fail() --> on_exception()
But this is naming is a secondary issue. The big issue is whether this paradigm is actually viable/appropriate.
8f916a9 to
1ab422a
Compare
We include a
post_sim_fail_hookmethod in desmo and enablecomponents to inject behavior upon simulation failure. Components
can use this hook to clean up simulation state or artifacts.