@tingstad
Collaborator

@tingstad tingstad commented Jul 3, 2022

Possibly controversial proposal 😄

I was fooled enough times by forgetting to reload sub-modules programmatically that I wrote this.

The readme states:

Why aren't recursive dependencies reloaded?

A design note. The first version of this library tried to infer what modules it would have to reload. It maintained a list of all the modules it had previously reloaded, and watched those for files. Lots of automation. Lots of complexity. In the end, I didn't find it useful at all. Instead, I add in reloads in the modules I'm developing at the top like this. The advantage? It's all dynamic -- I don't have to restart hotload -- it can just keep on running. If you've chosen to watch a folder, each file change will trigger a reload to the top-level entry point.

I think this PR is a kind of middle ground. I do not try to detect all modules, only the ones written to stdin as watchfiles.

This commit adds a new step, ReloadModules, which reloads all watched modules except for the init module.

This step needs to know which modules have changed, which is currently only available in the main loop's variables (new_changed, last_changed), so I pass this information to the step if it's named "ReloadModules". Other alternatives could be some global state, or passing some general state to all the steps.
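A minimal sketch of the idea described above, not the PR's actual code. It assumes the step is handed the paths of the changed files and that those paths match each module's `__file__` exactly. The names `modules_by_file`, `reload_changed` and `exclude` are hypothetical.

```python
import importlib
import sys


def modules_by_file():
    """Map each loaded module's __file__ to the module object."""
    found = {}
    for module in sys.modules.copy().values():
        path = getattr(module, "__file__", None)
        if path is not None:
            found[path] = module
    return found


def reload_changed(changed_files, exclude=None):
    """Reload loaded modules whose file is in changed_files.

    `exclude` stands for the init module, which is skipped here because
    a later step reloads it. Returns the names of the reloaded modules.
    """
    loaded = modules_by_file()
    reloaded = []
    for path in changed_files:
        module = loaded.get(path)
        if module is None or module is exclude:
            continue
        importlib.reload(module)
        reloaded.append(module.__name__)
    return reloaded
```

Files that are watched but were never imported are simply ignored, since they have no entry in `sys.modules`.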

    def run(self):
        os.system("cls" if os.name == "nt" else "clear")

class ReloadModules(Runnable):
Collaborator Author

perhaps not really a Runnable when run requires an additional argument?

Owner

@teodorlu teodorlu Jul 3, 2022

I'm open to allowing run to accept (*args, **kwargs). In that case, we could always provide a list of modules.

step.run(changed_modules=changed_modules)

Though if the list of modules is passed to ReloadModules on __init__, perhaps it doesn't need any special run() treatment?

Owner

Edit - I see that you're only reloading modules that have changed. In that case, we'd need to pass the list of changed modules.
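One way the keyword-argument idea from this thread could look, as a rough sketch only. It assumes every step's `run` accepts keyword context and ignores what it does not need, so the main loop no longer has to check a step's name. The class bodies are stand-ins (nothing is cleared or reloaded here), and `run_steps` is a hypothetical helper.

```python
class Runnable:
    """Base class for steps; run() takes keyword context it may ignore."""

    def run(self, **context):
        raise NotImplementedError


class ClearTerminal(Runnable):
    def run(self, **context):
        # Stand-in body: the real step clears the screen and needs no context.
        print("(terminal cleared)")


class ReloadModules(Runnable):
    def __init__(self):
        self.seen = []

    def run(self, changed_modules=(), **context):
        # Stand-in body: record what would be reloaded instead of reloading.
        self.seen.append(list(changed_modules))


def run_steps(steps, changed_modules):
    """Pass the same context to every step, whatever its type."""
    for step in steps:
        step.run(changed_modules=changed_modules)
```

With this shape, `ReloadModules` stays a regular `Runnable`, at the cost of a looser `run` signature for all steps.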

@tingstad tingstad mentioned this pull request Jul 3, 2022
for module in sys.modules.copy().values():
    file = getattr(module, '__file__', None)
    if file is not None:
        modules[file] = module
Collaborator Author

(Could this module loop be a performance issue? An alternative could be to store the modules similar to watchfiles)

Owner

So far I haven't really explored possible performance hits. Polling is bad (in general).

Why?

I simply haven't yet needed hotload when working on codebases that are so big that performance becomes an issue.

@teodorlu
Owner

teodorlu commented Jul 3, 2022

Possibly controversial proposal 😄

In general - I'm open to exploring possibly controversial proposals under feature flags! If we can work trunk based, I think that's easiest.

ls | hotload --recursive entrypoint.py

Edit - I replied a bit fast. I see that you thought about this.

This commit adds a new step, ReloadModules, which reloads all watched modules except for the init module.

Does this mean that merging this PR does not change hotload semantics unless one manually configures a ReloadModules step?

@tingstad
Collaborator Author

tingstad commented Jul 3, 2022

Does this mean that merging this PR does not change hotload semantics unless one manually configures a ReloadModules step?

Almost, this line introduces it:

"steps": [ClearTerminal(), ReloadModules(reloaded_module.module), reloaded_module],

There may be one other small catch, if you:

  1. Break a submodule (e.g. syntax error)
  2. Change the top module

Then the terminal will show output from the top module, using the older submodule, "hiding" the error message. If you change the submodule again, the error appears. Not a very big problem, but it can be confusing the few times it happens.

@teodorlu
Owner

teodorlu commented Jul 3, 2022

I guess we might get in trouble if the order of module reloads matters. But I think that would be a bad pattern - doing stuff at "module load time" that depends on state in other modules.

We could theoretically do a topological sort of the dependency graph, but that's a lot harder than just "reload the stuff on stdin".

Also, the user chooses the order in which to reload stuff on passing stdin to hotload.
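For reference, the topological sort mentioned above is short once the dependency graph exists. The hard part, which this sketch skips, is discovering which module imports which. The `imports` mapping below is made-up example data, and `graphlib` needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency graph: module name -> modules it imports.
imports = {
    "app": {"models", "utils"},
    "models": {"utils"},
    "utils": set(),
}

# static_order() yields each module only after everything it depends on,
# so dependencies would be reloaded before the modules that import them.
reload_order = list(TopologicalSorter(imports).static_order())
```

A circular import would make `static_order()` raise `graphlib.CycleError`, which is one more case "reload the stuff on stdin" does not have to handle.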

 conf = {
     "watch": [watchfiles],
-    "steps": [ClearTerminal(), reloaded_module],
+    "steps": [ClearTerminal(), ReloadModules(reloaded_module.module), reloaded_module],
Owner

@teodorlu teodorlu Jul 3, 2022

ReloadModules(reloaded_module.module) reads kind of funny to me. Lots of "module", but not necessarily easy to understand what's going on.

Collaborator Author

Yes, reading the implementation is necessary to understand this line.

Naming the argument could help: ReloadModules(main=reloaded_module.module) or ReloadModules(exclude=reloaded_module.module).

But I am of course open to renaming ReloadModules and reloaded_module too.

I have not perfected names and such in this PR, I wanted to initially share the concept.

Owner

We can polish later :)

@teodorlu
Owner

teodorlu commented Jul 3, 2022

A meta comment.

I don't use all of hotload's functionality. So we could perhaps remove stuff to make it easier for ourselves. An example is pre_reload_hook and post_reload_hook.

Moving forward, I don't want to break your workflow. So writing down what our current use is would be good. Either in a markdown file, or as a test. Just a markdown file might be easier. Kind of like a public interface contract.

Edit: I found out that I actually do use the hooks - when calling hotload with --entrypoint. So the hooks aren't a good example.

But a "cleaner" public interface would still be good.

    return {path: _file_changed(path) for path in filepaths}

def _changed_modules(new_changed, last_changed):
    list = []
Owner

@teodorlu teodorlu Jul 3, 2022

list shadows the list class.

Perhaps call it changed_modules or something else?

Collaborator Author

👍
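A sketch of what such a helper might look like with the accumulator renamed as suggested. The body is an assumption, since only the first line of the function appears in this thread: it treats `new_changed` and `last_changed` as dicts mapping each watched path to the value recorded for it on a poll (for example a modification time). The name `changed_paths` is hypothetical.

```python
_MISSING = object()  # marks paths that were absent from the previous poll


def changed_paths(new_changed, last_changed):
    """Return the paths whose recorded value differs between two polls."""
    changed_modules = []
    for path, value in new_changed.items():
        if last_changed.get(path, _MISSING) != value:
            changed_modules.append(path)
    return changed_modules
```

The sentinel makes a newly watched file count as changed even when its recorded value is `None`.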

@tingstad
Collaborator Author

tingstad commented Jul 3, 2022

I guess we might get in trouble if the order of module reloads matters. But I think that would be a bad pattern - doing stuff at "module load time" that depends on state in other modules.

We could theoretically do a topological sort of the dependency graph, but that's a lot harder than just "reload the stuff on stdin".

Also, the user chooses the order in which to reload stuff on passing stdin to hotload.

I only assume that any modules passed to stdin should reload before the init module. The order passed to stdin does not matter. Only modification of files matters, triggering a reload of just the modified file. Do you think this will usually work? If a dependency graph is needed, this PR will not work well. (It does, however, work well in my current case with not too many modules.)

@tingstad
Collaborator Author

tingstad commented Jul 3, 2022

A meta comment.

I don't use all of hotload's functionality. So we could perhaps remove stuff to make it easier for ourselves. An example is pre_reload_hook and post_reload_hook.

Moving forward, I don't want to break your workflow. So writing down what our current use is would be good. Either in a markdown file, or as a test. Just a markdown file might be easier. Kind of like a public interface contract.

Edit: I found out that I actually do use the hooks - when calling hotload with --entrypoint. So the hooks aren't a good example.

But a "cleaner" public interface would still be good.

Feel free to clean :) My use case is #7 (87cc272)

@teodorlu
Owner

teodorlu commented Jul 3, 2022

I only assume that any modules passed to stdin should reload before the init module. The order passed to stdin does not matter. Only modification of files matters, triggering a reload of just the modified file. Do you think this will usually work? If a dependency graph is needed, this PR will not work well. (It does, however, work well in my current case with not too many modules.)

Good point.

Actually -- regardless of whether it would work or not, I think this is the behavior we want.

  1. Write hotloadable code - so that just a single module can be reloaded
  2. Then enjoy the near-instant reloads

@tingstad
Collaborator Author

tingstad commented Jul 3, 2022

In general - I'm open to exploring possibly controversial proposals under feature flags! If we can work trunk based, I think that's easiest.

That's a good idea actually. I could put ReloadModules behind a command option. We can iron out other stuff first. Specifically, I'm not sure what --entrypoint does, and if it relates to these other changes?

@teodorlu
Owner

teodorlu commented Jul 3, 2022

I'm going to do the following:

  1. Merge this PR
  2. Merge "put reload recursive" behind a feature flag (code only local for now).
  3. Give you commit rights to teodorlu/hotload.

To enable the feature flag, set the env var HOTLOAD_RECURSIVE=HOTLOAD_RECURSIVE or call with hotload mymodule --recursive.
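A minimal sketch of how such a flag check could be wired up, under stated assumptions: hotload's real argument parsing is not shown in this thread, so the parser here is hypothetical, and treating any non-empty value of `HOTLOAD_RECURSIVE` as "on" is a guess. Only the env var name and the `--recursive` option come from the comment above.

```python
import argparse
import os


def recursive_enabled(argv, environ=os.environ):
    """Return True if recursive reload is requested by option or env var."""
    parser = argparse.ArgumentParser(prog="hotload")
    parser.add_argument("module")
    parser.add_argument("--recursive", action="store_true")
    args = parser.parse_args(argv)
    # Assumption: any non-empty value of the env var turns the feature on.
    return args.recursive or bool(environ.get("HOTLOAD_RECURSIVE"))
```

Passing `environ` explicitly keeps the check easy to exercise without touching the real environment.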

Ref commit rights to teodorlu/hotload: I'm going on vacation next Monday. Feel free to merge bug fixes / doc fixes / tests straight to master. For new features, please put those under feature flags.

I'm back in a week, looking forward to chatting then :)

@teodorlu teodorlu merged commit 777ce94 into teodorlu:master Jul 3, 2022
@tingstad tingstad deleted the reloadall branch July 3, 2022 19:23
@teodorlu
Owner

teodorlu commented Jul 3, 2022

That's a good idea actually. I could put ReloadModules behind a command option. We can iron out other stuff first. Specifically, I'm not sure what --entrypoint does, and if it relates to these other changes?

entrypoint lets you specify a function that hotload should call after it's done reloading.

Example:

def plus(x, y):
    return x + y

def hotload_entrypoint():
    print(plus(1,2))
    print(plus(10,20))

# now you can
#
#   ls | hotload script.py --entrypoint hotload_entrypoint
#
# to run your script, without checking for env vars and stuff :)
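Conceptually, calling an entrypoint after a reload comes down to looking the function up by name on the freshly reloaded module. The helper below is a sketch of that lookup, not hotload's implementation; `call_entrypoint` is a hypothetical name.

```python
def call_entrypoint(module, name):
    """Look up `name` on a (re)loaded module and call it with no arguments."""
    entrypoint = getattr(module, name, None)
    if not callable(entrypoint):
        raise AttributeError(f"{module.__name__} has no callable {name!r}")
    return entrypoint()
```

Doing the lookup after every reload matters: a reference kept from before the reload would still point at the old function object.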
