Proposal for Backwards Compatibility & Releases #11

riedgar-ms wants to merge 17 commits into fairlearn:master from riedgar-ms:master
Conversation
Signed-off-by: Richard Edgar <riedgar@microsoft.com>
> ## Proposed Solution
>
> Starting with `v0.5.0` we should make a commitment that anything which works at `v0.n.m_0` will also work for `v0.n.m` so long as `m >= m_0`.
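The patch-level guarantee quoted above is easy to state in code. A minimal sketch of the rule (the helper name `is_patch_compatible` is hypothetical, not part of Fairlearn):

```python
def is_patch_compatible(installed: str, required: str) -> bool:
    """Return True if `installed` satisfies the v0.n.m guarantee:
    same major and minor version as `required`, and a patch level
    at least as high (m >= m_0)."""
    inst = tuple(int(p) for p in installed.split("."))
    req = tuple(int(p) for p in required.split("."))
    return inst[:2] == req[:2] and inst[2] >= req[2]

# A script written against v0.5.1 keeps working on later patches:
print(is_patch_compatible("0.5.3", "0.5.1"))  # True
# ...but no guarantee is made across minor versions:
print(is_patch_compatible("0.6.0", "0.5.1"))  # False
```

Note this deliberately says nothing about `n` vs `n+1`; as discussed below, breakage is anticipated (though minimised) across minor versions.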
This is very reasonable. I like this.
Perhaps something I should highlight in the text is that I'm not saying anything about n and n+1 for support. I am anticipating breakages there (although we would try to minimise them).
releases/ReleaseCompatibility.md
Outdated
> For full compliance with the above, we will have to devise a means of allowing each notebook to specify its `m_0`.
> Our notebooks do not provide comprehensive coverage of Fairlearn, but enforcing a constraint like this will be a dramatic improvement over the current situation (particularly since new users will likely try our notebooks first).
>
> In order to support less mature functionality, we should also add a `fairlearn.experimental` package (see [a similar namespace in SciKit-Learn](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.experimental)).
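One way to let each notebook "specify its `m_0`" would be a version marker in the notebook's own metadata, which the build reads before deciding whether to execute it. A sketch, assuming a made-up metadata key `fairlearn_min_version` (Jupyter notebooks are JSON with a top-level `metadata` object):

```python
import json

def notebook_min_version(path):
    """Return the Fairlearn patch release a notebook declares it was
    written against (its m_0), read from a hypothetical
    'fairlearn_min_version' key in the notebook metadata.
    Returns None if the notebook declares nothing."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    return nb.get("metadata", {}).get("fairlearn_min_version")
```

The compatibility build could then skip (or fail) any notebook whose declared `m_0` is newer than the Fairlearn version under test.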
I'm not sure we need this yet. I guess I wonder how much overhead this would be on the testing side (in terms of coding this up and then maintaining it) and what we might put into this subpackage.
I'm not opposed to fairlearn.experimental in general, although I don't think it would have solved any of our issues. We're reworking all the core parts and that's why we're causing issues right now.
Scenarios for fairlearn.experimental in my view are:
- new mitigation techniques that we're trying out (e.g. for research purposes) and that we don't necessarily want users to actively use unless they're aware of the caveats
- suppose someone wants to replicate our charts in matplotlib; that may be very incomplete at first and need some work to reach a state with enough functionality (more than would fit in a single PR)
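On the Python side, scikit-learn's convention is to warn loudly when experimental functionality is imported, so users opt in knowingly. A minimal sketch of what a module under `fairlearn.experimental` might look like (the module layout and the `NewMitigator` class are hypothetical):

```python
import warnings

# Hypothetical contents of fairlearn/experimental/_new_mitigator.py:
# importing this module warns that nothing in it is covered by the
# patch-level compatibility guarantee.
warnings.warn(
    "NewMitigator is experimental: its API may change or it may be "
    "removed in any release without the usual deprecation cycle.",
    FutureWarning,
)

class NewMitigator:
    """Placeholder for an in-progress mitigation technique."""

    def fit(self, X, y, *, sensitive_features):
        raise NotImplementedError("NewMitigator is still under development")
```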
This is specific to Python code, so this wouldn't be usable for the dashboard and adding new components, right?
Yes, fairlearn.experimental certainly wouldn't have helped with our current refactorings. The idea is that going forward, it gives contributors somewhere to 'play' without getting stuck supporting their earlier poor choices (which are inevitable, since users always come up with ingenious ways of breaking things :-) ).
For the UX, I don't know remotely enough about TS to comment. Perhaps something similar could happen there.
I'd very much appreciate experiences from other open source projects here.
TS does have namespaces, so I've added a note.
Signed-off-by: Richard Edgar <riedgar@microsoft.com>
adrinjalali
left a comment
Looks like a good start to me. I would really focus much less on notebooks (if any), and if there's anything in the notebooks which is not in the tests, I'd move them to tests.
It'd also be nice to limit the line length for an easier review between different revisions as usual ;)
releases/ReleaseCompatibility.md
Outdated
> Before it, `v0.4.5` reworked a smaller subset of this functionality, which caused smaller breakages.
> There are also further changes to Metrics planned, which may well cause further breaks (although these should be more minor).
> All this is obviously undesirable from a user standpoint - these look like 'patch' level releases, but are actually breaking their code.
> Concretely, we've had users install Fairlearn with `pip`, and then find themselves unable to run Notebooks from our GitHub project - not because the functionality was missing, but because it had been renamed.
This is an issue which is fixed with the versioned documentation and not letting users rely on notebooks from the main repo.
I've added a note to that effect - although I'm sure people will still manage to download the wrong version.
As for concentrating the testing on the notebooks, I was thinking of that from a 'first user impression' perspective. The notebooks are where we're most likely to hit trouble.
(Although as you note below, we are wanting to move from notebooks to examples and we could just run the tests instead)
I've changed the text to have the tests run, rather than just the notebooks.
> Starting with `v0.5.0` we should make a commitment that anything which works at `v0.n.m_0` will also work for `v0.n.m` so long as `m >= m_0`.
> However, we do *not* guarantee compatibility between `n` and `n+1` in this scheme (although we would seek to minimise breakage).
this is what we have in sklearn as well.
releases/ReleaseCompatibility.md
Outdated
> Starting with `v0.5.0` we should make a commitment that anything which works at `v0.n.m_0` will also work for `v0.n.m` so long as `m >= m_0`.
> However, we do *not* guarantee compatibility between `n` and `n+1` in this scheme (although we would seek to minimise breakage).
>
> We will enforce this using builds which run the notebooks currently in the repository against versions of Fairlearn published on PyPI.
Aren't we moving away from notebooks?
This also has to do with release workflow. Here's a proposal:
We could say 0.xx.y+1 is a patch or bug fix release to 0.xx.y.
The master branch is 0.xx+1.dev0, and therefore changes on master can be breaking backward compatibility.
Whenever we want to push a patch release out, we backport the PRs which we'd like to include in 0.xx.y+1 from the master, into the branch for the release.
This also implies that there is a branch for at least each major release, and minor releases are tags placed at different times on that branch. The different versions of the documentation are also generated from those branches.
In my experience checking for backward compatibility in reviews is not hard, and if we get used to caring about it, then it'll be fine. In terms of testing, if we really want to test against a different release, we could run the tests against a different release, instead of running the notebooks.
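Running the full test suite against a published release could be complemented by a cheap smoke test that pins the public surface users relied on at `v0.n.m_0`. A sketch (the entries in `PUBLIC_API` are illustrative, not a definitive list):

```python
import importlib

# Symbols users relied on at v0.n.m_0; the patch-level guarantee says
# every later v0.n.m must still expose them. (Entries are illustrative.)
PUBLIC_API = {
    "fairlearn.metrics": ["demographic_parity_difference"],
    "fairlearn.reductions": ["ExponentiatedGradient"],
}

def check_public_api(api=PUBLIC_API):
    """Return the fully-qualified names missing from the installed
    package; an empty list means the pinned surface is intact."""
    missing = []
    for module_name, symbols in api.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            missing.extend(f"{module_name}.{s}" for s in symbols)
            continue
        missing.extend(
            f"{module_name}.{s}" for s in symbols if not hasattr(module, s)
        )
    return missing
```

A CI job could `pip install` the release under test and fail if `check_public_api()` returns anything; this catches renames (the failure mode users actually hit) without rerunning every notebook.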
moving away from notebooks?
Yes.... I should at the very least rephrase to be about running things from the examples directory.
However, your point about just running the tests is also good. I may have been overly concerned about saving on CPU cycles.
@romanlutz and @MiroDudik , what do you think about the release and branching strategy @adrinjalali is proposing? It looks reasonable to me, but would be more overhead over what we're currently doing.
I've updated the proposal, making it a lot closer to the procedure described by @adrinjalali above.
Re notebooks - yes! It's just not done yet. Perhaps we should prioritize it.
We're pre-1.0 so we can currently only mess with 0... The only way to distinguish major releases from minor ones is therefore this last level (patch level). Otherwise we'll jump from 0.5.0 to 0.6.0 to 0.7.0 rapidly. It's really impossible for users to tell whether 0.7.0 is a major change or not at that point, since everything is flat. So for that reason I'm not a huge fan.
I agree with testing against releases, not just running notebooks. That was just a quick band aid.
"Not a huge fan" of what exactly - there are a few things going on here. I think you're concerned that our breaking changes are going to result in us going through minor versions quickly?
I think I was confusing myself a little bit. I think it's important to have a consistent approach and this is one such approach. No objections.
Signed-off-by: Richard Edgar <riedgar@microsoft.com>
These all seem like good ideas to me! My two cents from the outside would be to focus more on communicating about the project's status, scope and intentions. That might be as minimal as writing a paragraph or three bullets at the top of the README about what the team considers stable, and what it intends to focus on doing in the next 3-6 months. That might be more immediately valuable. Similarly, I'd vote for removing code, guides and examples that are known to be incomplete, not representative of values in fairlearn/fairlearn#436, etc. That may help elevate the role of improving the documentation, user guides and examples, which may be a part of goals that the team has for project growth and adoption. Helping people update version numbers seems good, but adding a changelog can point out the big things. A more immediate problem might be improving first-time encounters with the project so as to avoid things like fairlearn/fairlearn#448.

@kevinrobinson , can you explain what you mean by "adding a changelog can point out the big things"? We already have a Changes.md file; do you mean something else? Thank you for hopping in on fairlearn/fairlearn#448 (memo to self: look at the issue list more frequently). We have a build as part of the PR gate which highlights those breaks, since we've been caught by that before. I'm not sure if it was green for a full week after I put out v0.4.6, though :-(

@riedgar-ms sure, no problem! re: changelog yep, having a CHANGES.md seems great! In the scenario where a user is thinking "I am using this library, and want to understand what will break, and how I need to migrate when I upgrade," that may involve flipping the perspective a bit. In other words, if you're trying to support users in understanding the stability and changes in the library, it may help to revise the changelog away from a developer-centric list of work that has been done, and more towards a user-centric communication about what has changed, why it has changed, and what they can do about it.

Understood. We do have careful instructions for migrating from v0.2 (which had an even smaller userbase), but we haven't done that since. Your point that we should be doing so is a good one!
Signed-off-by: Richard Edgar <riedgar@microsoft.com>
@kevinrobinson , have updated with a note about making sure

Taking a step back, I think CHANGES.md has a place that makes sense. I don't mind additionally adding context on how to upgrade as @kevinrobinson describes. I'm not sure whether these should be mixed, because it may make the document less useful for both developers and users.

Do you mean a Migration.md (say) which would be pointed to from the Changes.md @romanlutz ? I agree that the migration instructions could get long for the Changes.md - although that would be an incentive to avoid breaking changes :-)

Rather a ReST file that's part of the webpage. It can have upgrade instructions for every version. We can link to it from anywhere we want.
Having a detailed changelog, plus a separate highlights post which could also cover how to migrate, is usually appreciated by users.
…iscussion Signed-off-by: Richard Edgar <riedgar@microsoft.com>
We have been very cavalier with backwards compatibility to date. This is a proposal to address that, and provide some more stability to our users, while recognising that we're still early-stage.