Skip to content

[pull] master from git:master#68

Merged
pull[bot] merged 52 commits intoturkdevops:masterfrom
git:master
Jun 17, 2025
Merged

[pull] master from git:master#68
pull[bot] merged 52 commits intoturkdevops:masterfrom
git:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull bot commented Jun 17, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

derrickstolee and others added 30 commits May 16, 2025 12:15
This will be helpful in a future change, which will reuse this logic.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to more easily compute delta bases among objects that appear at
the exact same path, add a --path-walk option to 'git pack-objects'.

This option will use the path-walk API instead of the object walk given
by the revision machinery. Since objects will be provided in batches
representing a common path, those objects can be tested for delta bases
immediately instead of waiting for a sort of the full object list by
name-hash. This has multiple benefits, including avoiding collisions by
name-hash.

The objects marked as UNINTERESTING are included in these batches, so we
are guaranteeing some locality to find good delta bases.

After the individual passes are done on a per-path basis, the default
name-hash is used to find other opportunistic delta bases that did not
match exactly by the full path name.

The current implementation performs delta calculations while walking
objects, which is not ideal for a few reasons. First, this will cause
the "Enumerating objects" phase to be much longer than usual. Second, it
does not take advantage of threading during the path-scoped delta
calculations. Even with this lack of threading, the path-walk option is
sometimes faster than the usual approach. Future changes will refactor
this code to allow for threading, but that complexity is deferred until
later to keep this patch as simple as possible.

This new walk is incompatible with some features and is ignored by
others:

 * Object filters are not currently integrated with the path-walk API,
   such as sparse-checkout or tree depth. A blobless packfile could be
   integrated easily, but that is deferred for later.

 * Server-focused features such as delta islands, shallow packs, and
   using a bitmap index are incompatible with the path-walk API.

 * The path walk API is only compatible with the --revs option, not
   taking object lists or pack lists over stdin. These alternative ways
   to specify the objects currently ignores the --path-walk option
   without even a warning.

Future changes will create performance tests that demonstrate the power
of this approach.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The t0450 test script verifies that builtin usage matches the synopsis
in the documentation. Adjust the builtin to match and then remove 'git
pack-objects' from the exception list.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change added a --path-walk option to 'git pack-objects'.
Create a performance test that demonstrates the time and space benefits
of the feature.

In order to get an appropriate comparison, we need to avoid reusing
deltas and recompute them from scratch.

Compare the creation of a thin pack representing a small push and the
creation of a relatively large non-thin pack.

Running on my copy of the Git repository results in this data (removing
the repack tests for --name-hash-version):

Test                                                     this tree
------------------------------------------------------------------------
5313.2: thin pack with --name-hash-version=1             0.02(0.01+0.01)
5313.3: thin pack size with --name-hash-version=1                   1.6K
5313.4: big pack with --name-hash-version=1              2.55(4.20+0.26)
5313.5: big pack size with --name-hash-version=1                   16.4M
5313.6: shallow fetch pack with --name-hash-version=1    1.24(2.03+0.08)
5313.7: shallow pack size with --name-hash-version=1               12.2M
5313.10: thin pack with --name-hash-version=2            0.03(0.01+0.01)
5313.11: thin pack size with --name-hash-version=2                  1.6K
5313.12: big pack with --name-hash-version=2             1.91(3.23+0.20)
5313.13: big pack size with --name-hash-version=2                  16.4M
5313.14: shallow fetch pack with --name-hash-version=2   1.06(1.57+0.10)
5313.15: shallow pack size with --name-hash-version=2              12.5M
5313.18: thin pack with --path-walk                      0.03(0.01+0.01)
5313.19: thin pack size with --path-walk                            1.6K
5313.20: big pack with --path-walk                       2.05(3.24+0.27)
5313.21: big pack size with --path-walk                            16.3M
5313.22: shallow fetch pack with --path-walk             1.08(1.66+0.07)
5313.23: shallow pack size with --path-walk                        12.4M

This can be reformatted as follows:

Pack Type            Hash v1   Hash v2     Path Walk
---------------------------------------------------
thin pack    (time)    0.02s      0.03s      0.03s
             (size)    1.6K       1.6K       1.6K
big pack     (time)    2.55s      1.91s      2.05s
             (size)   16.4M      16.4M      16.3M
shallow pack (time)    1.24s      1.06s      1.08s
             (size)   12.2M      12.5M      12.4M

Note that the timing is slower because there is no threading in the
--path-walk case (yet). Also, the shallow pack cases are really not
using the --path-walk logic right now because it is disabled until some
additions are made to the path walk API.

The cases where the --path-walk option really shines is when the default
name-hash is overwhelmed with unhelpful collisions. An open source
example can be found in the microsoft/fluentui repo [1] at a certain
commit [2].

[1] https://github.com/microsoft/fluentui
[2] e70848ebac1cd720875bccaa3026f4a9ed700e08

Running the tests on this repo results in the following comparison table:

Pack Type            Hash v1    Hash v2    Path Walk
---------------------------------------------------
thin pack    (time)    0.36s      0.12s      0.08s
             (size)    1.2M      22.0K      18.4K
big pack     (time)    2.00s      2.90s      2.21s
             (size)   20.4M      25.9M      19.5M
shallow pack (time)    1.41s      1.80s      1.65s
             (size)   34.4M      33.7M      33.6M

Notice in particular that in the small thin pack, the time performance
has improved from 0.36s for --name-hash-version=1 to 0.08s and this is
likely due to the improved size of the resulting pack: 18.4K instead of
1.2M.  The relatively new --name-hash-version=2 is competitive with
--path-walk (0.12s and 22.0K) but not quite as successful.

Finally, running this on a copy of the Linux kernel repository results
in these data points:

Pack Type            Hash v1    Hash v2    Path Walk
---------------------------------------------------
thin pack    (time)    0.03s      0.13s      0.03s
             (size)    4.6K       4.6K       4.6K
big pack     (time)   15.29s     12.32s     13.92s
             (size)  201.1M     159.1M     158.5M
shallow pack (time)   10.88s     22.93s     22.74s
             (size)  269.2M     273.8M     267.7M

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many tests that validate whether 'git pack-objects' works as
expected. Instead of duplicating these tests, add a new test environment
variable, GIT_TEST_PACK_PATH_WALK, that implies --path-walk by default
when specified.

This was useful in testing the implementation of the --path-walk
implementation, helping to find tests that are overly specific to the
default object walk. These include:

 - t0411-clone-from-partial.sh : One test fetches from a repo that does
   not have the boundary objects. This causes the path-based walk to
   fail. Disable the variable for this test.

 - t5306-pack-nobase.sh : Similar to t0411, one test fetches from a repo
   without a boundary object.

 - t5310-pack-bitmaps.sh : One test compares the case when packing with
   bitmaps to the case when packing without them. Since we disable the
   test variable when writing bitmaps, this causes a difference in the
   object list (the --path-walk option adds an extra object). Specify
   --no-path-walk in both processes for the comparison. Another test
   checks for a specific delta base, but when computing dynamically
   without using bitmaps, the base object it too small to be considered
   in the delta calculations so no base is used.

 - t5316-pack-delta-depth.sh : This script cares about certain delta
   choices and their chain lengths. The --path-walk option changes how
   these chains are selected, and thus changes the results of this test.

 - t5322-pack-objects-sparse.sh : This demonstrates the effectiveness of
   the --sparse option and how it combines with --path-walk.

 - t5332-multi-pack-reuse.sh : This test verifies that the preferred
   pack is used for delta reuse when possible. The --path-walk option is
   not currently aware of the preferred pack at all, so finds a
   different delta base.

 - t7406-submodule-update.sh : When using the variable, the --depth
   option collides with the --path-walk feature, resulting in a warning
   message. Disable the variable so this warning does not appear.

I want to call out one specific test change that is only temporary:

 - t5530-upload-pack-error.sh : One test cares specifically about an
   "unable to read" error message. Since the current implementation
   performs delta calculations within the path-walk API callback, a
   different "unable to get size" error message appears. When this
   is changed in a future refactoring, this test change can be reverted.

Similar to GIT_TEST_NAME_HASH_VERSION, we do not add this option to the
linux-TEST-vars CI build as that's already an overloaded build.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be notoriously difficult to detect if delta bases are being
computed properly during 'git push'. Construct an example where it will
make a kilobyte worth of difference when a delta base is not found. We
can then use the progress indicators to distinguish between bytes and
KiB depending on whether the delta base is found and used.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 'git pack-objects' supports a --path-walk option, allow passing it
through in 'git repack'. This presents interesting testing opportunities for
comparing the different repacking strategies against each other.

Add the --path-walk option to the performance tests in p5313.

For the microsoft/fluentui repo [1] checked out at a specific commit [2],
the --path-walk tests in p5313 look like this:

Test                                                     this tree
-------------------------------------------------------------------------
5313.18: thin pack with --path-walk                      0.08(0.06+0.02)
5313.19: thin pack size with --path-walk                           18.4K
5313.20: big pack with --path-walk                       2.10(7.80+0.26)
5313.21: big pack size with --path-walk                            19.8M
5313.22: shallow fetch pack with --path-walk             1.62(3.38+0.17)
5313.23: shallow pack size with --path-walk                        33.6M
5313.24: repack with --path-walk                         81.29(96.08+0.71)
5313.25: repack size with --path-walk                             142.5M

[1] https://github.com/microsoft/fluentui
[2] e70848ebac1cd720875bccaa3026f4a9ed700e08

Along with the earlier tests in p5313, I'll instead reformat the
comparison as follows:

Repack Method    Pack Size       Time
---------------------------------------
Hash v1             439.4M      87.24s
Hash v2             161.7M      21.51s
Path Walk           142.5M      81.29s

There are a few things to notice here:

 1. The benefits of --name-hash-version=2 over --name-hash-version=1 are
    significant, but --path-walk still compresses better than that
    option.

 2. The --path-walk command is still using --name-hash-version=1 for the
    second pass of delta computation, using the increased name hash
    collisions as a potential method for opportunistic compression on
    top of the path-focused compression.

 3. The --path-walk algorithm is currently sequential and does not use
    multiple threads for delta compression. Threading will be
    implemented in a future change so the computation time will improve
    to better compete in this metric.

There are small benefits in size for my copy of the Git repository:

Repack Method    Pack Size       Time
---------------------------------------
Hash v1             248.8M      30.44s
Hash v2             249.0M      30.15s
Path Walk           213.2M     142.50s

As well as in the nodejs/node repository [3]:

Repack Method    Pack Size       Time
---------------------------------------
Hash v1             739.9M      71.18s
Hash v2             764.6M      67.82s
Path Walk           698.1M     208.10s

[3] https://github.com/nodejs/node

This benefit also repeats in my copy of the Linux kernel repository:

Repack Method    Pack Size       Time
---------------------------------------
Hash v1               2.5G     554.41s
Hash v2               2.5G     549.62s
Path Walk             2.2G    1562.36s

It is important to see that even when the repository shape does not have
many name-hash collisions, there is a slight space boost to be found
using this method.

As this repacking strategy was released in Git for Windows 2.47.0, some
users have reported cases where the --path-walk compression is slightly
worse than the --name-hash-version=2 option. In those cases, it may be
beneficial to combine the two options. However, there has not been a
released version of Git that has both options and I don't have access to
these repos for testing.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users may want to enable the --path-walk option for 'git pack-objects' by
default, especially underneath commands like 'git push' or 'git repack'.

This should be limited to client repositories, since the --path-walk option
disables bitmap walks, so would be bad to include in Git servers when
serving fetches and clones. There is potential that it may be helpful to
consider when repacking the repository, to take advantage of improved deltas
across historical versions of the same files.

Much like how "pack.useSparse" was introduced and included in
"feature.experimental" before being enabled by default, use the repository
settings infrastructure to make the new "pack.usePathWalk" config enabled by
"feature.experimental" and "feature.manyFiles".

In order to test that this config works, add a new trace2 region around
the path walk code that can be checked by a 'git push' command.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Repositories registered with Scalar are expected to be client-only
repositories that are rather large. This means that they are more likely to
be good candidates for using the --path-walk option when running 'git
pack-objects', especially under the hood of 'git push'. Enable this config
in Scalar repositories.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the --path-walk option to 'git pack-objects' would compute
deltas inline with the path-walk logic. This would make the progress
indicator look like it is taking a long time to enumerate objects, and
then very quickly computed deltas.

Instead of computing deltas on each region of objects organized by tree,
store a list of regions corresponding to these groups. These can later
be pulled from the list for delta compression before doing the "global"
delta search.

This presents a new progress indicator that can be used in tests to
verify that this stage is happening.

The current implementation is not integrated with threads, but we are
setting it up to arrive in the next change.

Since we do not attempt to sort objects by size until after exploring
all trees, we can remove the previous change to t5530 due to a different
error message appearing first.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adapting the implementation of ll_find_deltas(), create a threaded
version of the --path-walk compression step in 'git pack-objects'.

This involves adding a 'regions' member to the thread_params struct,
allowing each thread to own a section of paths. We can simplify the way
jobs are split because there is no value in extending the batch based on
name-hash the way sections of the object entry array are attempted to be
grouped. We re-use the 'list_size' and 'remaining' items for the purpose
of borrowing work in progress from other "victim" threads when a thread
has finished its batch of work more quickly.

Using the Git repository as a test repo, the p5313 performance test
shows that the resulting size of the repo is the same, but the threaded
implementation gives gains of varying degrees depending on the number of
objects being packed. (This was tested on a 16-core machine.)

Test                        HEAD~1      HEAD
---------------------------------------------------
5313.20: big pack             2.38      1.99 -16.4%
5313.21: big pack size       16.1M     16.0M  -0.2%
5313.24: repack             107.32     45.41 -57.7%
5313.25: repack size        213.3M    213.2M  -0.0%

(Test output is formatted to better fit in message.)

This ~60% reduction in 'git repack --path-walk' time is typical across
all repos I used for testing. What is interesting is to compare when the
overall time improves enough to outperform the --name-hash-version=1
case. These time improvements correlate with repositories with data
shapes that significantly improve their data size as well. The
--path-walk feature frequently takes longer than --name-hash-version=2,
trading some extra computation for some additional compression. The
natural place where this additional computation comes from is the two
compression passes that --path-walk takes, though the first pass is
naturally faster due to the path boundaries avoiding a number of delta
compression attempts.

For example, the microsoft/fluentui repo has significant size reduction
from --name-hash-version=1 to --name-hash-version=2 followed by further
improvements with --path-walk. The threaded computation makes
--path-walk more competitive in time compared to --name-hash-version=2,
though still ~31% more expensive in that metric.

Repack Method       Pack Size       Time
------------------------------------------
Hash v1                439.4M      87.24s
Hash v2                161.7M      21.51s
Path Walk (Before)     142.5M      81.29s
Path Walk (After)      142.5M      28.16s

Similar results hold for the Git repository:

Repack Method       Pack Size       Time
------------------------------------------
Hash v1                248.8M      30.44s
Hash v2                249.0M      30.15s
Path Walk (Before)     213.2M     142.50s
Path Walk (After)      213.3M      45.41s

...as well as the nodejs/node repository:

Repack Method       Pack Size       Time
------------------------------------------
Hash v1                739.9M      71.18s
Hash v2                764.6M      67.82s
Path Walk (Before)     698.1M     208.10s
Path Walk (After)      698.0M      75.10s

Finally, the Linux kernel repository is a good test for this repacking
time change, even though the space savings is more subtle:

Repack Method       Pack Size       Time
------------------------------------------
Hash v1                  2.5G     554.41s
Hash v2                  2.5G     549.62s
Path Walk (before)       2.2G    1562.36s
Path Walk (before)       2.2G     559.00s

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for allowing both the --shallow and --path-walk options
in the 'git pack-objects' builtin, create a new 'edge_aggressive' option
in the path-walk API. This option will help walk the boundary more
thoroughly and help avoid sending extra objects during fetches and
pushes.

The only use of the 'edge_hint_aggressive' option in the revision API is
within mark_edges_uninteresting(), which is usually called before
between prepare_revision_walk() and before visiting commits with
get_revision(). In prepare_revision_walk(), the UNINTERESTING commits
are walked until a boundary is found.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There does not appear to be anything particularly incompatible about the
--shallow and --path-walk options of 'git pack-objects'. If shallow
commits are to be handled differently, then it is by the revision walk
that defines the commit set and which are interesting or uninteresting.

However, before the previous change, a trivial removal of the warning
would cause a failure in t5500-fetch-pack.sh when
GIT_TEST_PACK_PATH_WALK is enabled. The shallow fetch would provide more
objects than we desired, due to some incorrect behavior of the path-walk
API, especially around walking uninteresting objects.

The recently-added tests in t5538-push-shallow.sh help to confirm this
behavior is working with the --path-walk option if
GIT_TEST_PACK_PATH_WALK is enabled. These tests passed previously due to
the --path-walk feature being disabled in the presence of a shallow
clone.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The do_match_pathspec() function has the DO_MATCH_LEADING_PATHSPEC
option to allow pathspecs to match when matching "src" against a
pathspec like "src/path/...". This support is not exposed by
match_pathspec, and the internal flags to do_match_pathspec are not
exposed outside of dir.c

The upcoming support for pathspecs in git diff --no-index need the
LEADING matching behavior when iterating down through a directory with
readdir.

We could try to expose the match_pathspec_with_flags to the public API.
However, DO_MATCH_EXCLUDES really shouldn't be public, and its a bit
weird to only have a few of the flags become public.

Instead, add match_leading_pathspec() as a function which sets both
DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC when is_dir is true.

This will be used in a following change to support pathspec matching in
git diff --no-index.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A following change will add support for pathspecs to the git diff
--no-index command. This mode of git diff does not load any repository.

Add a new PATHSPEC_NO_REPOSITORY flag indicating that we're parsing
pathspecs without a repository.

Both PATHSPEC_ATTR and PATHSPEC_FROMTOP require a repository to
function. Thus, verify that both of these are set in magic_mask to
ensure they won't be accepted when PATHSPEC_NO_REPOSITORY is set.

Check PATHSPEC_NO_REPOSITORY when warning about paths outside the
directory tree. When the flag is set, do not look for a git repository
when generating the warning message.

Finally, add a BUG in match_pathspec_item if the istate is NULL but the
pathspec has PATHSPEC_ATTR set. Callers which support PATHSPEC_ATTR
should always pass a valid istate, and callers which don't pass a valid
istate should have set PATHSPEC_ATTR in the magic_mask field to disable
support for attribute-based pathspecs.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --no-index option of git-diff enables using the diff machinery from
git while operating outside of a repository. This mode of git diff is
able to compare directories and produce a diff of their contents.

When operating git diff in a repository, git has the notion of
"pathspecs" which can specify which files to compare. In particular,
when using git to diff two trees, you might invoke:

  $ git diff-tree -r <treeish1> <treeish2>.

where the treeish could point to a subdirectory of the repository.

When invoked this way, users can limit the selected paths of the tree by
using a pathspec. Either by providing some list of paths to accept, or
by removing paths via a negative refspec.

The git diff --no-index mode does not support pathspecs, and cannot
limit the diff output in this way. Other diff programs such as GNU
difftools have options for excluding paths based on a pattern match.
However, using git diff as a diff replacement has several advantages
over many popular diff tools, including coloring moved lines, rename
detections, and similar.

Teach git diff --no-index how to handle pathspecs to limit the
comparisons. This will only be supported if both provided paths are
directories.

For comparisons where one path isn't a directory, the --no-index mode
already has some DWIM shortcuts implemented in the fixup_paths()
function.

Modify the fixup_paths function to return 1 if both paths are
directories. If this is the case, interpret any extra arguments to git
diff as pathspecs via parse_pathspec.

Use parse_pathspec to load the remaining arguments (if any) to git diff
--no-index as pathspec items. Disable PATHSPEC_ATTR support since we do
not have a repository to do attribute lookup. Disable PATHSPEC_FROMTOP
since we do not have a repository root. All pathspecs are treated as
rooted at the provided comparison paths.

After loading the pathspec data, calculate skip offsets for skipping
past the root portion of the paths. This is required to ensure that
pathspecs start matching from the provided path, rather than matching
from the absolute path. We could instead pass the paths as prefix values
to parse_pathspec. This is slightly problematic because the paths come
from the command line and don't necessarily have the proper trailing
slash. Additionally, that would require parsing pathspecs multiple
times.

Pass the pathspec object and the skip offsets into queue_diff, which
in-turn must pass them along to read_directory_contents.

Modify read_directory_contents to check against the pathspecs when
scanning the directory. Use the skip offset to skip past the initial
root of the path, and only match against portions that are below the
intended directory structure being compared.

The search algorithm for finding paths is recursive with read_dir. To
make pathspec matching work properly, we must set both
DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC.

Without DO_MATCH_DIRECTORY, paths like "a/b/c/d" will not match against
pathspecs like "a/b/c". This is usually achieved by setting the is_dir
parameter of match_pathspec.

Without DO_MATCH_LEADING_PATHSPEC, paths like "a/b/c" would not match
against pathspecs like "a/b/c/d". This is crucial because we recursively
iterate down the directories. We could simply avoid checking pathspecs
at subdirectories, but this would force recursion down directories
which would simply be skipped.

If we always passed DO_MATCH_LEADING_PATHSPEC, then we will
incorrectly match in certain cases such as matching 'a/c' against
':(glob)**/d'. The match logic will see that a matches the leading part
of the **/ and accept this even tho c doesn't match.

To avoid this, use the match_leading_pathspec() variant recently
introduced. This sets both flags when is_dir is set, but leaves them
both cleared when is_dir is 0.

Add test cases and documentation covering the new functionality. Note
for the documentation I opted not to move the placement of '--' which is
sometimes used to disambiguate arguments. The diff --no-index mode
requires exactly 2 arguments determining what to compare. Any additional
arguments are interpreted as pathspecs and must come afterwards. Use of
'--' would not actually disambiguate anything, since there will never be
ambiguity over which arguments represent paths or pathspecs.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add userdiff patterns to support R programming language.

Also, add three userdiff tests for R programming language
files. These files define simple function and nested function,
with and without indentation.

Signed-off-by: Rodrigo Carvalho <rodrigorsdc@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit links `git-credential-yahoo` as a credential helper for
Yahoo accounts. Also, Google's `sendgmail` tool has been linked as an
alternative method for sending emails through Gmail.

Signed-off-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current documentation for git-send-email had an inconsistent use of
"", ``, and '' for quoting. This commit improves the formatting by
using the same style throughout the documentation. Missing full stops
have also been added at some places.

Finally, the cpan links of necessary perl modules have been added to
make their installation easier.

While at it, the unecessary use of $ with <num> and <int> placeholders
has also been removed.

Signed-off-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a recent attempt to add links of email helpers to git-scm.com [1], I
came to a conclusion that the links in the gitcredentials page are meant
for people needing credential helpers for cloning, fetching and pushing
repositories to remote hosts, and not sending emails. gitcredentials
docs don't even talk about send emails, thus confirming this view.

So, lets remove these links from the gitcredentials page. The links are
still available in the git-send-email documentation, which is the right
place for them.

[1]: git/git-scm.com#2005

Signed-off-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
…send-email

The current example for Gmail suggests using app passwords for
send-email if user has multi-factor authentication set up for their
account. However, it does not clarify that the user cannot use their
normal password in case they do not have multi-factor authentication
enabled. Most likely the example was written in the days when Google
allowed using normal passwords without multi-factor authentication.

Clarify that regular passwords do not work for Gmail and app-passwords
are the only way for basic authentication. Also encourage users to use
OAuth2.0 as a more secure alternative.

While at it, also prefer using the word "mechanism" over "method" for
`OAUTHBEARER` and `XOAUTH2` since that is what official docs use.

Signed-off-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a couple of cases where our tests end up announcing that a
certain prerequisite is or isn't fulfilled. While this is supposed to
help the developer it has the downside that it breaks the TAP format.

We could convert these cases to just have a "#" prefix, but it feels
rather unlikely that these are generally useful in the first place. We
already do announce why a specific test is being skipped, so we should
try to use this mechanism to the best extent possible.

Stop announcing these prereqs to fix the TAP format. Where possible,
convert the tests to rely on the prerequisites themselves to announce
why a test ran or didn't ran.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a couple users of `test_create_repo()` that use this function
outside of any test case. This function is nowadays only a thin wrapper
around `git init`, which by default prints a message to stdout that the
repository has been initialized. The resulting output may thus confuse
TAP parsers.

Refactor these users to instead create the repository in a "setup" test
case so that we don't explicitly have to silence them. There's one
exception in t1007: we use `push_repo()` and its `pop_repo()` equivalent
multiple times, so to reduce the noise introduced by this patch we
instead silence this invocation.

While at it, convert callsites to use git-init(1) directly as the
`test_create_repo()` function has been deprecated in f0d4d39
(test-lib: split up and deprecate test_create_repo(), 2021-05-10).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests in t9822 depend on filesystem support for ISO-8859-1 encoding. We
thus have a block of code that acts as a prerequisite -- if we fail to
write a file with an ISO-8859-1-encoded file name to disk then we skip
all tests.

When the prerequisite fails though we end up printing an error message
to stderr, which breaks the TAP format. Fix this by converting the code
to a proper prerequisite, which handles output redirection for us.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests in t9835 and t9836 verify that git-p4(1) works with both
Python 2 and 3, respectively. To determine whether we have those Python
versions in the first place we create a wrapper script that directly
executes the git-p4(1) script with `python2` or `python3` binaries. We
then condition the execution of tests on whether that wrapper script can
be executed successfully.

The logic that does all of this is not contained in a prerequisite block
though, so the output it generates causes us to break the TAP format.
Refactor the logic to use `test_lazy_prereq()` to fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have several flags like "--verbose", "--verbose-only" or "-x" that
cause us to generate shell traces. The generated tracing output is split
up in these cases so that the test's stdout is printed to file
descriptor 3 whereas its stderr is printed to file descriptor 4.
Depending on which options have been given, we then end up either:

  - Redirecting both file descriptors to a file.

  - Redirecting them to stdout and stderr, respectively.

  - Closing them in case we're running in none-verbose mode.

The second case causes problems though when passing output to a TAP
parser. We print the test's stdout to the console's stdout, and that
results in broken TAP output.

Fix the issue by instead redirecting the test's stdout to the shell's
stderr. This makes it impossible to discern stdout from stderr, but
going by my own experience I never came across a usecase where I would
have needed this distinction.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the Bash version is too old to support BASH_XTRACEFD we print a
warning to stderr. This warning is not prefixed with "#", which causes
TAP parsers to (wrongly) interpret the warning as part of the protocol.

Fix this issue by prefixing the warning with a "#" so that it is treated
as comment.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t7815, we have the following test:

    test_expect_failure !CYGWIN 'git grep .fi a' '
        git grep .fi a
    '

The test passes if '.' matches a NUL byte, which we expect to only
happen on Cygwin. The upcoming changes to support parsing TAP output in
Meson surface that this test, surprisingly, passes on macOS as well.

It is unclear how long the test has been passing on macOS already.
064eed3 (config.mak.uname: only set NO_REGEX on cygwin for v1.7,
2025-04-17) mentions that the test started to pass for Cygwin. This was
attributed to a new implementation of regcomp(3p) and friends, which was
inherited from FreeBSD. Given the BSD lineage of macOS it is feasible
that it also inherited similar code eventually that made the test pass
now.

It is somewhat dubious what the test actually brings to the table given
that it is quite platform specific. Ideally, we would fix this mess by
having a configure-time check whether regcomp(3p) works as expected,
including NUL bytes, and use our bundled version of the regex library in
case it doesn't. Like this, we could ensure that all platforms work the
same in this edge case and mark the new behaviour as expected.

This change is outside of the scope of this patch series, which only
introduces support for TAP. So instead of fixing the bigger issue,
ignore the test on Darwin like we already do for Cygwin.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When tests are executed via `test_expect_failure` we rather obviously
expect the test itself to fail. If it unexpectedly does not fail then we
count the test as a "fixed" test and announce that a known breakage has
vanished:

    ok 1 - setup
    ok 2 - create refs/heads/main # TODO known breakage vanished
    ok 3 - create refs/heads/main with oldvalue verification
    ...
    ok 299 - update-ref should also create reflog for HEAD
    # 1 known breakage(s) vanished; please update test(s)
    # passed all remaining 298 test(s)
    1..299

While we announce that tests should be updated, the overall test suite
still passes. This makes it quite hard to detect when a test that has
previously failed succeeds now as the developer needs to pay close
attention to the exact output. Even more importantly, tests that only
succeed on _some_ systems are even easier to miss now, as one would have
to explicitly take a look at respective CI jobs to notice that those do
pass now.

Furthermore, we are about to introduce support for parsing TAP output in
Meson. In contrast to prove(1), which treats unexpected passes as a
successful test run, Meson treats those as failure. Neither of these
tools is wrong in doing so. Quoting the TAP specification [1]:

    Should a todo test point begin succeeding, the harness may report it
    in some way that indicates that whatever was supposed to be done has
    been, and it should be promoted to a normal Test Point.

So it is essentially implementation-defined how exactly the unexpected
pass is reported, and whether it should cause the overall test suite to
fail or not. It is unarguably a bad thing for us though if these tools
interpret these differently, as it would mean that test results now
depend on whether the developer uses prove(1) or Meson.

Unify the behaviour by causing a test suite to fail when there are any
unexpected passes. As prove(1) does not consider an unexpected pass to
be an error this leads to somewhat funky output:

    t1400-update-ref.sh ................................ Dubious, test returned 1 (wstat 256, 0x100)
    All 299 subtests passed
            (1 TODO test unexpectedly succeeded)

    ...

    Test Summary Report
    -------------------
    t1400-update-ref.sh                              (Wstat: 256 (exited 1) Tests: 299 Failed: 0)
      TODO passed:   2
      Non-zero exit status: 1

But as we directly announce that the root cause is an unexpected TODO
that has succeeded it's not all that bad.

[1]: https://testanything.org/tap-version-14-specification.html

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Meson has the ability to create a kwargs dictionary that can then be
passed to any function call with the `kwargs:` positional argument. This
allows one to deduplicate common parameters that one wishes to pass to
several different function invocations.

Our tests already have one common parameter that we use everywhere,
"timeout", and we're about to add a second common parameter in the next
commit. Let's prepare for this by introducing `test_kwargs` so that we
can deduplicate these common arguments.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pks-t and others added 22 commits June 2, 2025 07:51
By default, Meson only knows to pay respect to the exit code of tests to
judge whether or not it ran successfully. This can be changed though by
specifying the "protocol" parameter. Next to the default "exitcode"
protocol, Meson also supports the "tap" output that our tests already
know to generate.

Unfortunately, the "tap" protocol was incompatible with `meson test
--interactive` and caused a hang. We have upstreamed a fix [1] though,
so with the recent release of Meson 1.8 that fix is finally out and we
can start using the "tap" protocol when running with a recent-enough
version of this build tool.

With this change in place, Meson now properly detects how many subtests
ran and whether test suites have been skipped:

    ```
    $ meson test t002*
    ninja: Entering directory `/home/pks/Development/git/build'
     1/10 t0024-crlf-archive                  OK              0.17s   2 subtests passed
     2/10 t0022-crlf-rename                   OK              0.18s   2 subtests passed
     3/10 t0029-core-unsetenvvars             SKIP            0.15s
     4/10 t0023-crlf-am                       OK              0.18s   2 subtests passed
     5/10 t0025-crlf-renormalize              OK              0.21s   3 subtests passed
     6/10 t0026-eol-config                    OK              0.25s   5 subtests passed
     7/10 t0020-crlf                          OK              0.81s   36 subtests passed
     8/10 t0028-working-tree-encoding         OK              0.85s   22 subtests passed
     9/10 t0021-conversion                    OK              3.45s   38 subtests passed
    10/10 t0027-auto-crlf                     OK             26.35s   2600 subtests passed

    Ok:                9
    Fail:              0
    Skipped:           1
    ```

Note that when running `meson test --interactive` the test results will
now be marked as "ignored". This is because in interactive mode the file
descriptors will remain connected to the user's terminal, and it is
expected that the user interacts with the tests (e.g., spawn a debugger
or use `test_pause`). As such, the TAP output cannot be parsed reliably
by Meson in that case, so the tests are marked as ignored accordingly.

[1]: mesonbuild/meson#13980

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the parameter `struct repository *repo` to the cmd_walken function.

Since commit 9b1cb50 (builtin: add a repository parameter for
builtin functions, 2024-09-13), all the cmd_* have the `repo` parameter
and new commands must follow this convention, so the documentation
should also be changed.

Change the `git_config` calls to `repo_config`, also passing the `repo`
parameter, as since 036876a (config: hide functions using
`the_repository` by default, 2024-08-13) the non-repo config functions
are no longer recommended as they use the global `repository` variable.

Helped-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instruct in the documentation to also add an entry in meson.build for
builtin/walken.c, as currently both Meson and Make are supported.

Helped-by: Karthik Nayak <karthik.188@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In sequencer.c, caller only pass TODO_SQUASH or TODO_FIXUP to
update_squash_messages(), any other command passed in should be
considered as BUG. Replace `return error('unknown command')`
with `BUG('not a FIXUP or SQUASH')`.

Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
BUG() is not end-user facing but programmer facing, and we do not
use _("...") in them. Replace all `BUG(_("..."))` with `BUG("...")`

Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the 'run_tests' test wrapper so that the first argument may refer to
any specifier that uniquely identifies an object (e.g. a ref name,
'<OID>:<path>', '<OID>^{<type>}', etc.), rather than only a full object ID.

Also add tests that use non-OID identifiers, ensuring appropriate parsing in
'cat-file'. The identifiers used in some of the added tests include a space,
which is incompatible with the '%(rest)' atom. To accommodate that without
removing the test case, use 'test_expect_failure' when 'object_name'
includes a space.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a formatting atom, used with the --batch-check/--batch-command options,
that prints the octal representation of the object mode if a given revision
includes that information, e.g. one that follows the format
<tree-ish>:<path>. If the mode information does not exist, an empty string
is printed instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When an object specification is passed to 'cat-file --batch[-check]'
referring to a submodule (e.g. 'HEAD:path/to/my/submodule'), the current
behavior of the command is to print the "missing" error message. However, it
is often valuable for callers to distinguish between paths that are actually
missing and "the submodule tree entry exists, but the object does not exist
in the repository".

To disambiguate without needing to invoke a separate Git process (e.g.
'ls-tree'), print the message "<oid> submodule" for such objects instead of
"<object> missing". In addition to the change from "missing" to "submodule",
the new message differs from the old in that it always prints the resolved
tree entry's OID, rather than the input object specification.

Note that this implementation maintains a distinction between submodules
where the commit OID is not present in the repo, and submodules where the
commit OID *is* present; the former will now print "<object> submodule", but
the latter will still print the full object content.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In builtin/fetch-pack.c:cmd_fetch_pack(), if finish_connect() failed,
it returns error code without cleanup which cause memory leak. Add
cleanup label before frees in the end of cmd_fetch_pack(), and add
`goto cleanup` if finish_connect() failed.

Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit-graph.c:graph_write(), if read_one_commit() failed,
progress allocated in start_delayed_progress() will leak. Add
stop_progress() before goto cleanup.

Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Doc update to the more recent world order.

* lo/my-first-ow-doc-update:
  MyFirstContribution: add walken.c to meson.build
  MyFirstContribution: use struct repository in examples
"git pack-objects" learns to find delta bases from blobs at the
same path, using the --path-walk API.

* ds/path-walk-2:
  pack-objects: allow --shallow and --path-walk
  path-walk: add new 'edge_aggressive' option
  pack-objects: thread the path-based compression
  pack-objects: refactor path-walk delta phase
  scalar: enable path-walk during push via config
  pack-objects: enable --path-walk via config
  repack: add --path-walk option
  t5538: add tests to confirm deltas in shallow pushes
  pack-objects: introduce GIT_TEST_PACK_PATH_WALK
  p5313: add performance tests for --path-walk
  pack-objects: update usage to match docs
  pack-objects: add --path-walk option
  pack-objects: extract should_attempt_deltas()
Userdiff patterns for the R language.

* rc/userdiff-r:
  userdiff: add support for R programming language
Documentation for "git send-email" has been updated with a bit more
credential helper and OAuth information.

* ag/send-email-docs:
  docs: make the purpose of using app password for Gmail more clear in send-email
  docs: remove credential helper links for emails from gitcredentials
  docs: improve formatting in git-send-email documentation
  docs: add credential helper for yahoo and link Google's sendgmail tool
"git cat-file --batch" learns to understand %(objectmode) atom to
allow the caller to tell missing objects (due to repository
corruption) and submodules (whose commit objects are OK to be
missing) apart.

* vd/cat-file-objectmode-update:
  cat-file.c: add batch handling for submodules
  cat-file: add %(objectmode) atom
  t1006: update 'run_tests' to test generic object specifiers
Code clean-up.

* ly/sequencer-update-squash-is-fixup-only:
  sequencer: replace error() with BUG() in update_squash_messages ()
Code clean-up.

* ly/do-not-localize-bug-messages:
  BUG(): remove leading underscore of the format string
A memory-leak in an error code path has been plugged.

* ly/commit-graph-graph-write-leakfix:
  commit-graph: fix start_delayed_progress() leak
A memory-leak in an error code path has been plugged.

* ly/fetch-pack-leakfix:
  builtin/fetch-pack: cleanup before return error
"git diff --no-index dirA dirB" can limit the comparison with
pathspec at the end of the command line, just like normal "git
diff".

* jk/diff-no-index-with-pathspec:
  diff --no-index: support limiting by pathspec
  pathspec: add flag to indicate operation without repository
  pathspec: add match_leading_pathspec variant
Meson-based build/test framework now understands TAP output
generated by our tests.

* ps/meson-tap-parse:
  meson: parse TAP output generated by our tests
  meson: introduce kwargs variable for tests
  test-lib: fail on unexpectedly passing tests
  t7815: fix unexpectedly passing test on macOS
  t/test-lib: fix TAP format for BASH_XTRACEFD warning
  t/test-lib: don't print shell traces to stdout
  t983*: use prereq to check for Python-specific git-p4(1) support
  t9822: use prereq to check for ISO-8859-1 support
  t: silence output from `test_create_repo()`
  t: stop announcing prereqs
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@pull pull bot added the ⤵️ pull label Jun 17, 2025
@pull pull bot merged commit f9aa0ee into turkdevops:master Jun 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants