[pull] master from git:master#97
Merged
pull[bot] merged 18 commits intoturkdevops:masterfrom Aug 29, 2025
Merged
Conversation
- Start with an example that mirrors the example in the `git-merge` man page, to make it easier for folks to understand the difference between a rebase and a merge. - Mention that rebase can combine or reorder commits Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously there were two explanations, this combines them both into a single explanation. Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove duplicate explanation of `git rebase <upstream> <branch>` which is already explained above. Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a very clear explanation with examples of using --onto which is currently buried in the very long DESCRIPTION section. This moves it to its own section, so that we can reference the explanation from the `--onto` option by name. Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
- make it clearer that we're talking about a multistep process - give a more technically accurate description how rebase works with the merge backend. - condense the explanation of how git rebase skips commits with the same textual changes into a single bullet point and remove the explanatory diagram. Lots of things which are more complicated are already being explained without a diagram. - remove the explanation of how exactly `--fork-point` and `--root` work since that information is in the OPTIONS section - put all discussion of `ORIG_HEAD` inside the note Signed-off-by: Julia Evans <julia@jvns.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In process_ranges_merge_commit(), the line-level log first creates an
array of diff queues by iterating over all parents of a merge commit
and computing a tree diff for each. Then in a second loop it iterates
over those diff queues, and if it finds that none of the interesting
paths were modified in one of them, then it will return early. This
means that when none of the interesting paths were modified between a
merge and its first parent, then the tree diff between the merge and
its second (Nth...) parent was computed in vain.
Unify these two loops, so when it iterates over all parents of a merge
commit, then it first computes the tree diff between the merge and
that particular parent and then processes the resulting diff queue
right away. This way we can spare some tree diff computing, thereby
speeding up line-level log in repositories with mergy history:
# git.git, 25.8% of commits are merges:
Benchmark 1: ./git_v2.51.0 -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0
Time (mean ± σ): 1.001 s ± 0.009 s [User: 0.906 s, System: 0.095 s]
Range (min … max): 0.991 s … 1.023 s 10 runs
Benchmark 2: ./git -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0
Time (mean ± σ): 445.5 ms ± 3.4 ms [User: 358.8 ms, System: 84.3 ms]
Range (min … max): 440.1 ms … 450.3 ms 10 runs
Summary
'./git -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0' ran
2.25 ± 0.03 times faster than './git_v2.51.0 -C ~/src/git log -L:'lookup_commit(':commit.c v2.51.0'
# linux.git, 7.5% of commits are merges:
Benchmark 1: ./git_v2.51.0 -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16
Time (mean ± σ): 3.246 s ± 0.007 s [User: 2.835 s, System: 0.409 s]
Range (min … max): 3.232 s … 3.255 s 10 runs
Benchmark 2: ./git -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16
Time (mean ± σ): 2.467 s ± 0.014 s [User: 2.113 s, System: 0.353 s]
Range (min … max): 2.455 s … 2.505 s 10 runs
Summary
'./git -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16' ran
1.32 ± 0.01 times faster than './git_v2.51.0 -C ~/src/linux.git log -L:build_restore_work_registers:arch/mips/mm/tlbex.c v6.16'
And since now each iteration computes a tree diff and processes its
result, there is no reason to store the diff queues for each merge
parent anymore, so replace that diff queue array with a loop-local
diff queue variable. With this change the static free_diffqueues()
helper function in 'line-log.c' has no more callers left, remove it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can easily iterate through the parents of a merge commit without turning the list of parents into a dynamically allocated array of parents, so let's do so. This way we can avoid a memory allocation for each processed merge commit, though its effect on runtime seems to be unmeasurable. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
process_ranges_ordinary_commit() uses a local diff queue variable, which it leaves uninitialized before passing its address to queue_diffs(). This is not an issue, because at the end of that function the contents of an other diff queue is moved into it by simply overwriting whatever is in there, i.e. without reading any uninitialized memory. Still, seeing the uninitialized diff queue being passed around scared me more than once, so out of caution let's make sure that it's initialized. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
In process_ranges_arbitrary_commit() the condition deciding whether the given commit is not a merge, i.e. that it doesn't have more than one parent, is head-scratchingly backwards, flip it. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fetch code tries to avoid asking the remote side for an object we already have. It does this by traversing recent commits reachable from our refs looking for matches. Commit 5d4cc78 (fetch-pack: die if in commit graph but not obj db, 2024-11-05) introduced an extra check there: if we think we have an object because it's in the commit graph, we double-check that we actually have it in our object database with a call to odb_has_object(). But that call does not pass any flags, and so the function won't call reprepared_packed_git() if it does not find the object. That opens us up to the usual race against some other process repacking the odb: 1. We scan the list of packs in objects/pack but haven't yet opened them. 2. Somebody else packs the object into a new pack (which we don't know about), and deletes the old pack it was in. 3. Our odb_has_object() calls tries to open that old pack, but finds it is gone. We declare that we don't have the object. And this causes us to erroneously complain and abort the fetch, thinking our commit-graph and object database are out of sync. Instead, we should pass HAS_OBJECT_RECHECK_PACKED, which will add a new step: 4. We re-scan the pack directory again, find the new pack, and locate the object. Often the fetch code tries to avoid these kinds of re-scans if it's likely that we won't have the object. If the other side has told us about object X and we want to know if we have it, we'll skip the re-scan (to avoid spending a lot of effort when there are many such objects). We can accept the racy false negative in that case because the worst case is that we ask the other side to send us the object. But this is not one of those cases. These are objects which are accessible from _our_ refs, and which we already found in the commit graph file. We should have them, and if we don't, we'll die() immediately. So the performance impact is negligible, and getting the right answer is important. There's no test here because it's inherently racy. In fact, I had trouble even developing a minimal test. The problem seen in the wild can be produced like this: # Any git.git mirror which supports partial clones; I think this # should work with any repo that contains submodules, but note that # $obj below is specific to this repo url=https://github.com/git/git.git # This is a commit that is not at the tip of any branches (so after # we have it, we'll still have some commits to fetch). obj=cf6f63ea6bf35173e02e18bdc6a4ba41288acff9 git init git fetch --filter=tree:0 $url $obj:refs/heads/foo git checkout foo git commit-graph write --reachable git fetch $url What happens here is that the initial fetch grabs that older commit (and its ancestors) but no trees or blobs, and the subsequent checkout grabs the necessary trees and blobs just for that commit. The final fetch spawns a long sequence of child fetches due to fetch_submodules(), which wants to check whether there have been any gitlink modifications which should trigger a fetch of the related submodule (we'll leave aside the irony that we did not even check out any submodules yet). That series of fetches causes us to accumulate packs, which eventually triggers background maintenance to run. That repacks all-into-one, and the pack containing $obj goes away in favor of a new pack. And then the fetch eventually fails with: fatal: You are attempting to fetch cf6f63e, which is in the commit graph file but not in the object database. In the scenario above, the race becomes likely because of the long series of quick fetches. But I _think_ the bug is independent of partial clones entirely, and you could run into the same thing with a single fetch, some other process running "git repack" simultaneously, and a bit of bad luck. I haven't been able to reproduce, though. I'm not sure if that's because there's some mis-analysis above, or if the race window is just small enough that it's hard to trigger. At any rate, re-scanning here seems like an obviously correct thing to do with no downside, and it does fix the partial-clone case shown above. Reported-by: Дилян Палаузов <dilyan.palauzov@aegee.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using one of the start_delayed_*() functions, clients of the progress API can request that a progress meter is only shown after some time. To do that, the implementation intends to count down the number of seconds stored in struct progress by observing flag progress_update, which the timer interrupt handler sets when a second has elapsed. This works during the first second of the delay. But the code forgets to reset the flag to zero, so that subsequent calls of display_progress() think that another second has elapsed and decrease the count again until zero is reached. Due to the frequency of the calls, this happens without an observable delay in practice, so that the effective delay is always just one second. This bug has been with us since the inception of the feature. Despite having been touched on various occasions, such as 8aade10 (progress: simplify "delayed" progress API), 9c5951c (progress: drop delay-threshold code), and 44a4693 (progress: create GIT_PROGRESS_DELAY), the short delay went unnoticed. Copy the flag state into a local variable and reset the global flag right away so that we can detect the next clock tick correctly. Since we have not had any complaints that the delay of one second is too short nor that GIT_PROGRESS_DELAY is ignored, people seem to be comfortable with the status quo. Therefore, set the default to 1 to keep the current behavior. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The compatibility object format is only implemented for loose objects, not packed objects, so anyone attempting to push or fetch data into a repository with this option will likely not see it work as expected. In addition, the underlying storage of loose object mapping is likely to change because the current format is inefficient and does not handle important mapping information such as that of submodules. It would have been preferable to initially document that this was not yet ready for prime time, but we did not do so. We hinted at the fact that this functionality is incomplete in the description, but did not say so explicitly. Let's do so now: indicate that this feature is incomplete and subject to change and that the option is not designed to be used by end users. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation for "git rebase" has been updated. * je/doc-rebase: doc: git-rebase: update discussion of internals doc: git-rebase: move --onto explanation down doc: git rebase: clarify arguments syntax doc: git rebase: dedup merge conflict discussion doc: git-rebase: start with an example
The start_delayed_progress() function in the progress eye-candy API did not clear its internal state, making an initial delay value larger than 1 second ineffective, which has been corrected. * js/progress-delay-fix: progress: pay attention to (customized) delay time
"git log -L..." compared trees of multiple parents with the tree of the merge result in an unnecessarily inefficient way. * sg/line-log-merge-optim: line-log: simplify condition checking for merge commits line-log: initialize diff queue in process_ranges_ordinary_commit() line-log: get rid of the parents array in process_ranges_merge_commit() line-log: avoid unnecessary tree diffs when processing merge commits
Under a race against another process that is repacking the repository, especially a partially cloned one, "git fetch" may mistakenly think some objects we do have are missing, which has been corrected. * jk/fetch-check-graph-objects-fix: fetch-pack: re-scan when double-checking graph objects
The compatObjectFormat extension is used to hide an incomplete feature that is not yet usable for any purpose other than developing the feature further. Document it as such to discourage its use by mere mortals. * bc/doc-compat-object-format-not-working: docs: note that extensions.compatobjectformat is incomplete
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.3)
Can you help keep this open source service alive? 💖 Please sponsor : )