Skip to content

Conversation

@savetheclocktower
Copy link

@savetheclocktower savetheclocktower commented Sep 1, 2025

This is a monumental change, but I’ll try to summarize it.

The first tagged version of pathwatcher is twelve years old. The Node ecosystem was vastly different back then. The native bindings didn’t even originally use nan; they consumed libuv directly. It was not yet possible to run more than one Node instance in the same process, so context-awareness wasn’t even a glint in anyone’s eye yet.

When we had to go through all the old native modules and ensure they were context-aware for the PulsarNext endeavor, pathwatcher was by far the biggest challenge. Lots of the native code still consumed libuv directly despite depending on nan.

I’m not a C++ expert, to say the least. I was able to understand tutorials aimed at migrating nan code to N-API, but I lack the ability to take lower-level code and re-imagine it in a context-aware environment. Still, I made the effort!

First attempt: modest rewrite

In lieu of understanding exactly what would achieve context-awareness, I wrote some tests that used threads in order to introduce multiple contexts within the same process and treated them as a proving ground for whatever I came up with.

I then tried to perform the minimum amount of upgrading that could possibly work — first by adapting the existing nan bindings to add context-awareness, then by attempting a minimal N-API rewrite. The goal was to keep things the same as much as possible, but with the added requirement that whatever I came up with must still demonstrate the ability to work in multiple contexts at once without segfaulting.

This was nearly a year ago and I don’t remember the exact details… but the n-api branch shows the results of that experiment. In short, the existing approach was too low-level for the amount of abstraction that context-awareness demanded.

I started looking around for other solutions.

Second attempt: bring in outside help

Rewriting the library entirely was no mean feat — it would have to perform similarly to the existing pathwatcher without any API changes. That’s a particular challenge because pathwatcher is the only file-watching library I can find that attempts to work synchronously.

PathWatcher.watch('foo.js', () => console.log('changed!'));
fs.writeFileSync('foo.js', 'new content');
// logs 'changed!'

All modern file-watchers expect that it will be some amount of time between your request to monitor a file/directory for changes and its fulfillment. (This made it more or less impossible to modernize pathwatcher by rewriting it to use some other file-watcher under the hood — and, yes, I definitely attempted that as well.)

Eventually I found efsw. It uses the right underlying APIs for each platform (old pathwatcher never used FSEvents on macOS, even after it became widely available) and it allowed for synchronousness. The process was slow-going because of my C++ inexperience, and it involved lots of long talks with Claude, the code robot which I find least distasteful of all the options out there. I took all the code Claude gave me with several grains of salt, checking its approaches against non-AI sources for sanity’s sake.

Complications on macOS

Eventually, I realized that efsw’s macOS approach was flawed. It used one FSEvents watcher per requested file/path, but there’s a hard system-wide limit of (I think) 1024 watchers, so that’s just not practical. (I didn't know about this limit until I had evidence of file-watching calls silently failing, at which point I dug into Console.app and learned why.)

I spent about a week reacting to this constraint in a fatally misguided way. pathwatcher doesn’t support recursive directory watching, but efsw allows you to add either recursive or non-recursive watchers. So I tried to optimize for watcher reuse on macOS by opting into recursive file-watching in efsw, then introducing a scheme similar to the NativeWatcherRegistry that watchPath uses. This got the specs to pass, but it unquestionably made the code worse — more complex and harder to reason about — just to solve an issue that was happening on only one of the three platforms we needed to support. (I understand why this watcher-reuse code was written for Atom back in the day, but I absolutely hate it and would love to find a way to make it unnecessary in Pulsar. So it was dumb of me to turn to this code I hated, even if I thought it would solve my problem.)

I refocused. This was a macOS-specific issue, so it needed a macOS-specific solution. Luckily, I discovered that each FSEvents watcher can watch arbitrarily many files, so no single consumer should need to declare more than two. Even more luckily, I was able to write a macOS-only adapter that is used instead of efsw’s approach in a way that is API-compatible. (We declare two watchers because, when you change the exact list of watched paths, you must restart the stream; so we alternate the streams between “primary” and “backup” and stagger their starting/stopping when paths change to ensure we don’t miss any events.)

(The author of efsw was unaware of the watcher limit on macOS, and is interested in integrating these changes… but my implementation cares only about non-recursive watching, so it’s not something I can contribute upstream in its current form.)

Stability

The resulting pathwatcher library has been running on the updated-latest-electron branch for many months now. I have not suffered from any pathwatcher-related crashes for months, and I have not observed any real-world regressions in pathwatcher behavior. I am not 100% certain we’ll be able to deprecate pathwatcher in the near future, as much as I’d like to; but I’m convinced this rewrite is a solid enough base on which to proceed.

Testing

I would be a liar if I said that this rewrite were seamless and required no changes to the existing test suite.

One thing that the rewrite appears to struggle with is frequent watching and unwatching of the same file or path. This is something that is highly unlikely to happen outside of the unit tests — but unit tests are how I prove that this rewrite is up to snuff, so you can see the dilemma.

I mitigated this by introducing 50ms of waiting after each test; this is long enough to get the tests to pass 100% of the time on my machine.

Still, I would love to make this a non-issue in the future. (I can't prove that it won't be a problem in the real world.) I have done some experimentation with an approach such as this:

  • Someone subscribes to ./foo. Add it to the native watcher.
  • Later, they unsubscribe from ./foo. Stop reporting changes to ./foo immediately, but do not remove it from the native watcher yet. Instead, schedule it to happen at some point in the future (like 1 second from now).
  • If the user re-subscribes to ./foo within that span of time: cancel the scheduled removal, stop ignoring changes to ./foo, and leave the native watcher unchanged.
  • But if the user hasn’t re-subscribed to ./foo after that time: remove it from the native watcher, then remove ./foo from the ignore list (because we're not watching it anymore, so there's no need to keep it on the list).

I have tried this approach and gotten it mostly working, but it's not bulletproof. For one thing, it complicates the code. To perform a task after a delay, I can (a) schedule it to happen in another thread (at which point I need to signal the main thread so it can finish the unwatching operation); or (b) schedule it with libuv. Option A broke my brain. Option B is conceptually much simpler, but leaves me less confident that we’re operating with context awareness. Indeed, using this approach, I can produce sporadic segfaults under a multi-thread stress test, and I’m not yet conversant enough in the debugging tools to figure out exactly what’s going wrong.

But if we ever recruited someone to the project with experience in these areas, this is one of the first things I’d be eager for them to take a look at.

Review

How is anyone supposed to review this? I have no idea. I would suggest that you look it over, make sure nothing appears straightforwardly insane, and then rely on whatever experience you have with using PulsarNext in the last ~10 months as proof that this migration is fundamentally sound.

You may also find this issue useful. Early on it's mainly about nsfw (a different file-watching library that watchPath uses) but then it becomes about preventing crashes that happen because of pathwatcher. It gives more context for my evolving approaches and might have some details that I'm not remembering at this moment.

The future

I don’t want to maintain this library indefinitely. Our other file-watching solutions (like whatever we wrap in watchPath) deal only with directories, but in places where I have need to watch a single file, I’ve started to prefer the builtin fs.watch rather than introduce new code that depends on pathwatcher.

Still, is there a future where we don’t use it anymore? Only if we’re willing to weaken its API contract. Packages don’t use it directly, but we expose the File and Directory classes to packages, and the change listeners that get registered on those instances use pathwatcher under the hood. That’s not something that is easy to convert to an async API. Still, I bet we could do it; it’s just a matter of figuring out the approach that changes behavior as little as possible.

Next steps

Pulsar has never consumed its own fork of pathwatcher. The last commit to the master branch of this repo is from the last Atom maintainer, and the stable release of Pulsar consumes the NPM module pathwatcher rather than a repository reference.

This means that, once we land this to master, we can bump the major version and publish it to NPM as @pulsar-edit/pathwatcher. Then PulsarNext can declare a dependency on that version instead of a random branch of my random fork.

This is an ongoing nightmare!

It took me a while to realize that the original binding was a relic of how early it had been written in the life of Node. It used a lot of `libuv` functions directly for things that have had easier-to-use abstractions for a while now — both in `nan` and in N-API.

It was also a good opportunity to decaffeinate the JavaScript files. Furthermore, we don't need the `HandleMap` stuff, nor the scarily-named `UnsafePersistent`, since we have proper `Map`s in JavaScript now. (I'm not sure we _ever_ needed `HandleMap`, but I digress.)

This isn't over! It works on macOS, but it needs platform-specific adjustments for Windows and Linux. Both of those will be painful. I'm sure I can also improve upon the macOS file-watching somehow.
This comes with the painful revelation that one spec has been spuriously passing for years, and the functionality it claims to support doesn't seem to work (at least on macOS). That spec is skipped until that bug can be fixed.

This required a more modern version of Jasmine that could grok `async`/`await` syntax. The magical `jasmine-tagged` stuff is currently being done manually instead.
(man, that was silly)
…and prevent an instance from being destroyed while processing events.
…for `detectResurrectionAfterDelay`.

(The Pulsar specs rely on being able to stub it to circumvent the async-ness.)
(mutices?)
I wasted a few hours trying to handle a case that `pathwatcher` never claimed to support.

The upside is that I did manage to simplify the custom FSEvents watcher implementation and eliminate some potentially costly looping!
…when we're not watching in the first place.
…because EFSW was returning handles larger than `Number.MAX_SAFE_INTEGER`.
@savetheclocktower
Copy link
Author

I totally get how intimidating this PR is. But since it's the basis of what's been shipped in PulsarNext for months, I think I might give this another ~10 days of openness at the absolute max, and then merge it in the absence of objections?

@DeeDeeG
Copy link
Member

DeeDeeG commented Sep 19, 2025

However we land this, I feel it'd be helpful for this PR conversation to permalink to a branch where all this commit history is browsable -- if there are some things that we don't fully understand, it can be helpful "archeologically/forensically" digging through the blame view and seeing "okay this was just tried as a guess, we don't have to be wedded to it if we need to change it down the line for whatever reason."

As much as this commit history is not straightforward, it does lay out the motivation a little more transparently than a squash, so a copy like this should be retained somewhere I'd argue.

Thanks for doing all this, still combing through and looking for insights, but I start with the impression that this was a lot of work!

UPDATE: Here's a link to such a branch: https://github.com/pulsar-edit/node-pathwatcher/commits/n-api-with-efsw-bundled/

(The content-addressable (immutable per se) tip of the branch as of writing, for posterity, is commit d2281e6.)

Copy link
Member

@DeeDeeG DeeDeeG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I haven't made a ton of progress trying to review this. Partly due to it being very big, and partly down to the fact that I'm not advanced in JS nor C/C++.

[EDIT to add more of my train of through and reasoning/justification: I have glanced at the diffstat for all the folders, the LICENSE file for the vendored EFSW appears to be sufficient that I'm confident we are allowed to vendor it in such a way, CI is passing (on all three major OSes we support), including several new specs. PR body strikes me as well-reasoned, and this is obviously a ton of work that has been in-use for months now. IIRC we've had positive feedback and users asking when this will be in the primary version of Pulsar, so this is tested and working for multiple people if I vaguely recall correctly.]

I am going to give this my nod of approval mostly on the strength of it having attestedly been used in Pulsar Next for some time now, and the lack of complaints as far as I know (on top of it solving important problems -- context-awareness, and IIRC, watcher performance for some users??).

I wish I had more to add, but yeah; this is mostly a rubber stamp, but looks good to me (to the extent I can form any informed opinion)!

Thank you much for doing all this!

@DeeDeeG
Copy link
Member

DeeDeeG commented Sep 21, 2025

I also want to commend the coordination efforts with an upstream project (efsw) to share notes and potentially generalize the benefit of this research to help the broader ecosystem -- ourselves included potentially, but that is certainly a part of how that is meant to work also. Good open-source practices, IMHO!

@savetheclocktower savetheclocktower merged commit e448e08 into pulsar-edit:master Sep 21, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants