diff --git a/Documentation/RelNotes/2.52.0.adoc b/Documentation/RelNotes/2.52.0.adoc
index a86e2c09e06969..ba213c0d6c7df3 100644
--- a/Documentation/RelNotes/2.52.0.adoc
+++ b/Documentation/RelNotes/2.52.0.adoc
@@ -70,6 +70,10 @@ UI, Workflows & Features
  * "Symlink symref" has been added to the list of things that will
    disappear at Git 3.0 boundary.
 
+ * "git maintenance" command learns the "geometric" strategy where it
+   avoids doing maintenance tasks that rebuilds everything from
+   scratch.
+
 
 Performance, Internal Implementation, Development Support etc.
 --------------------------------------------------------------
@@ -158,6 +162,9 @@ Performance, Internal Implementation, Development Support etc.
  * Two slightly different ways to get at "all the packfiles" in API has
    been cleaned up.
 
+ * The code to walk revision graph to compute merge base has been
+   optimized.
+
 
 Fixes since v2.51
 -----------------
@@ -382,6 +389,14 @@ including security updates, are included in this release.
    and "git bisect unknown", which has been corrected.
    (merge 2bb3a012f3 rz/bisect-help-unknown later to maint).
 
+ * The 'q'(uit) command in "git add -p" has been improved to quit
+   without doing any meaningless work before leaving, and giving EOF
+   (typically control-D) to the prompt is made to behave the same way.
+
+ * The wildmatch code had a corner case bug that mistakenly makes
+   "foo**/bar" match with "foobar", which has been corrected.
+   (merge 1940a02dc1 jk/match-pathname-fix later to maint).
+
  * Other code cleanup, docfix, build fix, etc.
    (merge 529a60a885 ua/t1517-short-help-tests later to maint).
    (merge 22d421fed9 ac/deglobal-fmt-merge-log-config later to maint).
@@ -393,3 +408,4 @@ including security updates, are included in this release.
    (merge a66fc22bf9 rs/get-oid-with-flags-cleanup later to maint).
    (merge 15b8abde07 js/mingw-includes-cleanup later to maint).
    (merge 2cebca0582 tb/cat-file-objectmode-update later to maint).
+   (merge 8f487db07a kh/doc-patch-id-1 later to maint).
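As a rough sketch of how the "geometric" strategy announced in the release notes above might be enabled, the fragment below sets it in a repository's configuration. The key names are taken from the `Documentation/config/maintenance.adoc` changes in this patch; the two `geometric-repack` values simply restate the documented defaults and are spelled out only for illustration, so treat this as an assumption-laden example rather than a recommended setup.

```ini
# Hypothetical per-repository settings (e.g. in .git/config).
[maintenance]
	# Select the new strategy for manual, auto and scheduled maintenance.
	strategy = geometric
[maintenance "geometric-repack"]
	# Documented default: loose-object threshold for `git maintenance run --auto`.
	auto = 100
	# Documented default: factor of the geometric sequence.
	splitFactor = 2
```

Leaving out the `geometric-repack` section should behave the same, since both values match the defaults described in the patch.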
diff --git a/Documentation/config/maintenance.adoc b/Documentation/config/maintenance.adoc
index 2f719342183322..d0c38f03fabd60 100644
--- a/Documentation/config/maintenance.adoc
+++ b/Documentation/config/maintenance.adoc
@@ -16,19 +16,36 @@ detach.
 
 maintenance.strategy::
 	This string config option provides a way to specify one of a few
-	recommended schedules for background maintenance. This only affects
-	which tasks are run during `git maintenance run --schedule=X`
-	commands, provided no `--task=<task>` arguments are provided.
-	Further, if a `maintenance.<task>.schedule` config value is set,
-	then that value is used instead of the one provided by
-	`maintenance.strategy`. The possible strategy strings are:
+	recommended strategies for repository maintenance. This affects
+	which tasks are run during `git maintenance run`, provided no
+	`--task=<task>` arguments are provided. This setting impacts manual
+	maintenance, auto-maintenance as well as scheduled maintenance. The
+	tasks that run may be different depending on the maintenance type.
 +
-* `none`: This default setting implies no tasks are run at any schedule.
+The maintenance strategy can be further tweaked by setting
+`maintenance.<task>.enabled` and `maintenance.<task>.schedule`. If set, these
+values are used instead of the defaults provided by `maintenance.strategy`.
++
+The possible strategies are:
++
+* `none`: This strategy implies no tasks are run at all. This is the default
+  strategy for scheduled maintenance.
+* `gc`: This strategy runs the `gc` task. This is the default strategy for
+  manual maintenance.
+* `geometric`: This strategy performs geometric repacking of packfiles and
+  keeps auxiliary data structures up-to-date. The strategy expires data in the
+  reflog and removes worktrees that cannot be located anymore. When the
+  geometric repacking strategy would decide to do an all-into-one repack, then
+  the strategy generates a cruft pack for all unreachable objects. Objects that
+  are already part of a cruft pack will be expired.
++
+This repacking strategy is a full replacement for the `gc` strategy and is
+recommended for large repositories.
 * `incremental`: This setting optimizes for performing small maintenance
   activities that do not delete any data. This does not schedule the
   `gc` task, but runs the `prefetch` and `commit-graph` tasks hourly,
   the `loose-objects` and `incremental-repack` tasks daily, and the `pack-refs`
-  task weekly.
+  task weekly. Manual repository maintenance uses the `gc` task.
 
 maintenance.<task>.enabled::
 	This boolean config option controls whether the maintenance task
@@ -75,6 +92,22 @@ maintenance.incremental-repack.auto::
 	number of pack-files not in the multi-pack-index is at least the value
 	of `maintenance.incremental-repack.auto`. The default value is 10.
 
+maintenance.geometric-repack.auto::
+	This integer config option controls how often the `geometric-repack`
+	task should be run as part of `git maintenance run --auto`. If zero,
+	then the `geometric-repack` task will not run with the `--auto`
+	option. A negative value will force the task to run every time.
+	Otherwise, a positive value implies the command should run either when
+	there are packfiles that need to be merged together to retain the
+	geometric progression, or when there are at least this many loose
+	objects that would be written into a new packfile. The default value is
+	100.
+
+maintenance.geometric-repack.splitFactor::
+	This integer config option controls the factor used for the geometric
+	sequence. See the `--geometric=` option in linkgit:git-repack[1] for
+	more details. Defaults to `2`.
+
 maintenance.reflog-expire.auto::
 	This integer config option controls how often the `reflog-expire`
 	task should be run as part of `git maintenance run --auto`. If zero, then
diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc
index 45da0f27acde47..92a1af36a2765c 100644
--- a/Documentation/git-patch-id.adoc
+++ b/Documentation/git-patch-id.adoc
@@ -7,8 +7,8 @@ git-patch-id - Compute unique ID for a patch
 
 SYNOPSIS
 --------
-[verse]
-'git patch-id' [--stable | --unstable | --verbatim]
+[synopsis]
+git patch-id [--stable | --unstable | --verbatim]
 
 DESCRIPTION
 -----------
@@ -21,7 +21,7 @@ the same time also reasonably unique, i.e., two patches that have the same
 
 The main usecase for this command is to look for likely duplicate commits.
 
-When dealing with 'git diff-tree' output, it takes advantage of
+When dealing with `git diff-tree` output, it takes advantage of
 the fact that the patch is prefixed with the object name of the
 commit, and outputs two 40-byte hexadecimal strings. The first
 string is the patch ID, and the second string is the commit ID.
@@ -30,35 +30,35 @@ This can be used to make a mapping from patch ID to commit ID.
 
 OPTIONS
 -------
---verbatim::
+`--verbatim`::
 	Calculate the patch-id of the input as it is given, do not strip
 	any whitespace.
 +
-This is the default if patchid.verbatim is true.
+This is the default if `patchid.verbatim` is `true`.
 
---stable::
+`--stable`::
 	Use a "stable" sum of hashes as the patch ID. With this option:
 +
 --
  - Reordering file diffs that make up a patch does not affect the ID.
    In particular, two patches produced by comparing the same two trees
-   with two different settings for "-O" result in the same
+   with two different settings for `-O` result in the same
    patch ID signature, thereby allowing the computed result to be used
    as a key to index some meta-information about the change between
    the two trees;
 
  - Result is different from the value produced by git 1.9 and older
-   or produced when an "unstable" hash (see --unstable below) is
+   or produced when an "unstable" hash (see `--unstable` below) is
    configured - even when used on a diff output taken without any use
-   of "-O", thereby making existing databases storing such
+   of `-O`, thereby making existing databases storing such
    "unstable" or historical patch-ids unusable.
 
  - All whitespace within the patch is ignored and does not affect the id.
 --
 +
-This is the default if patchid.stable is set to true.
+This is the default if `patchid.stable` is set to `true`.
 
---unstable::
+`--unstable`::
 	Use an "unstable" hash as the patch ID. With this option,
 	the result produced is compatible with the patch-id value produced
 	by git 1.9 and older and whitespace is ignored. Users with pre-existing
diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN
index b16db85e779ab2..c43f33d8893153 100755
--- a/GIT-VERSION-GEN
+++ b/GIT-VERSION-GEN
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-DEF_VER=v2.51.GIT
+DEF_VER=v2.52.0-rc0
 
 LF='
 '
diff --git a/add-patch.c b/add-patch.c
index ae9a20d8f23baf..173a53241ebf07 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -1569,8 +1569,10 @@ static int patch_update_file(struct add_p_state *s,
 			if (*s->s.reset_color_interactive)
 				fputs(s->s.reset_color_interactive, stdout);
 			fflush(stdout);
-			if (read_single_character(s) == EOF)
+			if (read_single_character(s) == EOF) {
+				quit = 1;
 				break;
+			}
 
 			if (!s->answer.len)
 				continue;
@@ -1601,7 +1603,7 @@ static int patch_update_file(struct add_p_state *s,
 				} else if (hunk->use == UNDECIDED_HUNK) {
 					hunk->use = USE_HUNK;
 				}
-			} else if (ch == 'd' || ch == 'q') {
+			} else if (ch == 'd') {
 				if (file_diff->hunk_nr) {
 					for (; hunk_index < file_diff->hunk_nr; hunk_index++) {
 						hunk = file_diff->hunk + hunk_index;
@@ -1613,10 +1615,9 @@ static int patch_update_file(struct add_p_state *s,
 				} else if (hunk->use == UNDECIDED_HUNK) {
 					hunk->use = SKIP_HUNK;
 				}
-				if (ch == 'q') {
-					quit = 1;
-					break;
-				}
+			} else if (ch == 'q') {
+				quit = 1;
+				break;
 			} else if (s->answer.buf[0] == 'K') {
 				if (permitted & ALLOW_GOTO_PREVIOUS_HUNK)
 					hunk_index = dec_mod(hunk_index,
diff --git a/builtin/gc.c b/builtin/gc.c
index 541d7471f19072..d212cbb9b84781 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -34,6 +34,7 @@
 #include "pack-objects.h"
 #include "path.h"
 #include "reflog.h"
+#include "repack.h"
 #include "rerere.h"
 #include "blob.h"
 #include "tree.h"
@@ -55,7 +56,6 @@ static const char * const builtin_gc_usage[] = {
 };
 
 static timestamp_t gc_log_expire_time;
-static struct strvec repack = STRVEC_INIT;
 static struct tempfile *pidfile;
 static struct lock_file log_lock;
 static struct string_list pack_garbage = STRING_LIST_INIT_DUP;
@@ -255,6 +255,7 @@ enum maintenance_task_label {
 	TASK_PREFETCH,
 	TASK_LOOSE_OBJECTS,
 	TASK_INCREMENTAL_REPACK,
+	TASK_GEOMETRIC_REPACK,
 	TASK_GC,
 	TASK_COMMIT_GRAPH,
 	TASK_PACK_REFS,
@@ -448,7 +449,7 @@ static int rerere_gc_condition(struct gc_config *cfg UNUSED)
 	return should_gc;
 }
 
-static int too_many_loose_objects(struct gc_config *cfg)
+static int too_many_loose_objects(int limit)
 {
 	/*
 	 * Quickly check if a "gc" is needed, by estimating how
@@ -470,7 +471,7 @@ static int too_many_loose_objects(struct gc_config *cfg)
 	if (!dir)
 		return 0;
 
-	auto_threshold = DIV_ROUND_UP(cfg->gc_auto_threshold, 256);
+	auto_threshold = DIV_ROUND_UP(limit, 256);
 	while ((ent = readdir(dir)) != NULL) {
 		if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose ||
 		    ent->d_name[hexsz_loose] != '\0')
@@ -616,48 +617,50 @@ static uint64_t estimate_repack_memory(struct gc_config *cfg,
 	return os_cache + heap;
 }
 
-static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
+static int keep_one_pack(struct string_list_item *item, void *data)
 {
-	strvec_pushf(&repack, "--keep-pack=%s", basename(item->string));
+	struct strvec *args = data;
+	strvec_pushf(args, "--keep-pack=%s", basename(item->string));
 	return 0;
 }
 
 static void add_repack_all_option(struct gc_config *cfg,
-				  struct string_list *keep_pack)
+				  struct string_list *keep_pack,
+				  struct strvec *args)
 {
 	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
 		&& !(cfg->cruft_packs && cfg->repack_expire_to))
-		strvec_push(&repack, "-a");
+		strvec_push(args, "-a");
 	else if (cfg->cruft_packs) {
-		strvec_push(&repack, "--cruft");
+		strvec_push(args, "--cruft");
 		if (cfg->prune_expire)
-			strvec_pushf(&repack, "--cruft-expiration=%s", cfg->prune_expire);
+			strvec_pushf(args, "--cruft-expiration=%s", cfg->prune_expire);
 		if (cfg->max_cruft_size)
-			strvec_pushf(&repack, "--max-cruft-size=%lu",
+			strvec_pushf(args, "--max-cruft-size=%lu",
 				     cfg->max_cruft_size);
 		if (cfg->repack_expire_to)
-			strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
+			strvec_pushf(args, "--expire-to=%s", cfg->repack_expire_to);
 	} else {
-		strvec_push(&repack, "-A");
+		strvec_push(args, "-A");
 		if (cfg->prune_expire)
-			strvec_pushf(&repack, "--unpack-unreachable=%s", cfg->prune_expire);
+			strvec_pushf(args, "--unpack-unreachable=%s", cfg->prune_expire);
 	}
 
 	if (keep_pack)
-		for_each_string_list(keep_pack, keep_one_pack, NULL);
+		for_each_string_list(keep_pack, keep_one_pack, args);
 
 	if (cfg->repack_filter && *cfg->repack_filter)
-		strvec_pushf(&repack, "--filter=%s", cfg->repack_filter);
+		strvec_pushf(args, "--filter=%s", cfg->repack_filter);
 	if (cfg->repack_filter_to && *cfg->repack_filter_to)
-		strvec_pushf(&repack, "--filter-to=%s", cfg->repack_filter_to);
+		strvec_pushf(args, "--filter-to=%s", cfg->repack_filter_to);
 }
 
-static void add_repack_incremental_option(void)
+static void add_repack_incremental_option(struct strvec *args)
 {
-	strvec_push(&repack, "--no-write-bitmap-index");
+	strvec_push(args, "--no-write-bitmap-index");
 }
 
-static int need_to_gc(struct gc_config *cfg)
+static int need_to_gc(struct gc_config *cfg, struct strvec *repack_args)
 {
 	/*
 	 * Setting gc.auto to 0 or negative can disable the
@@ -698,10 +701,10 @@ static int need_to_gc(struct gc_config *cfg)
 			string_list_clear(&keep_pack, 0);
 		}
 
-		add_repack_all_option(cfg, &keep_pack);
+		add_repack_all_option(cfg, &keep_pack, repack_args);
 		string_list_clear(&keep_pack, 0);
-	} else if (too_many_loose_objects(cfg))
-		add_repack_incremental_option();
+	} else if (too_many_loose_objects(cfg->gc_auto_threshold))
+		add_repack_incremental_option(repack_args);
 	else
 		return 0;
 
@@ -850,6 +853,7 @@ int cmd_gc(int argc,
 	int keep_largest_pack = -1;
 	int skip_foreground_tasks = 0;
 	timestamp_t dummy;
+	struct strvec repack_args = STRVEC_INIT;
 	struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT;
 	struct gc_config cfg = GC_CONFIG_INIT;
 	const char *prune_expire_sentinel = "sentinel";
@@ -889,7 +893,7 @@ int cmd_gc(int argc,
 	show_usage_with_options_if_asked(argc, argv,
 					 builtin_gc_usage, builtin_gc_options);
 
-	strvec_pushl(&repack, "repack", "-d", "-l", NULL);
+	strvec_pushl(&repack_args, "repack", "-d", "-l", NULL);
 
 	gc_config(&cfg);
 
@@ -912,14 +916,14 @@ int cmd_gc(int argc,
 		die(_("failed to parse prune expiry value %s"), cfg.prune_expire);
 
 	if (aggressive) {
-		strvec_push(&repack, "-f");
+		strvec_push(&repack_args, "-f");
 		if (cfg.aggressive_depth > 0)
-			strvec_pushf(&repack, "--depth=%d", cfg.aggressive_depth);
+			strvec_pushf(&repack_args, "--depth=%d", cfg.aggressive_depth);
 		if (cfg.aggressive_window > 0)
-			strvec_pushf(&repack, "--window=%d", cfg.aggressive_window);
+			strvec_pushf(&repack_args, "--window=%d", cfg.aggressive_window);
 	}
 	if (opts.quiet)
-		strvec_push(&repack, "-q");
+		strvec_push(&repack_args, "-q");
 
 	if (opts.auto_flag) {
 		if (cfg.detach_auto && opts.detach < 0)
@@ -928,7 +932,7 @@ int cmd_gc(int argc,
 		/*
 		 * Auto-gc should be least intrusive as possible.
 		 */
-		if (!need_to_gc(&cfg)) {
+		if (!need_to_gc(&cfg, &repack_args)) {
 			ret = 0;
 			goto out;
 		}
@@ -950,7 +954,7 @@ int cmd_gc(int argc,
 			find_base_packs(&keep_pack, cfg.big_pack_threshold);
 		}
 
-		add_repack_all_option(&cfg, &keep_pack);
+		add_repack_all_option(&cfg, &keep_pack, &repack_args);
 		string_list_clear(&keep_pack, 0);
 	}
 
@@ -1012,9 +1016,9 @@ int cmd_gc(int argc,
 		repack_cmd.git_cmd = 1;
 		repack_cmd.close_object_store = 1;
 
-		strvec_pushv(&repack_cmd.args, repack.v);
+		strvec_pushv(&repack_cmd.args, repack_args.v);
 		if (run_command(&repack_cmd))
-			die(FAILED_RUN, repack.v[0]);
+			die(FAILED_RUN, repack_args.v[0]);
 
 		if (cfg.prune_expire) {
 			struct child_process prune_cmd = CHILD_PROCESS_INIT;
@@ -1053,7 +1057,7 @@ int cmd_gc(int argc,
 					     !opts.quiet && !daemonized ? COMMIT_GRAPH_WRITE_PROGRESS : 0,
 					     NULL);
 
-	if (opts.auto_flag && too_many_loose_objects(&cfg))
+	if (opts.auto_flag && too_many_loose_objects(cfg.gc_auto_threshold))
 		warning(_("There are too many unreachable loose objects; "
 			"run 'git prune' to remove them."));
 
@@ -1065,6 +1069,7 @@ int cmd_gc(int argc,
 
 out:
 	maintenance_run_opts_release(&opts);
+	strvec_clear(&repack_args);
 	gc_config_release(&cfg);
 	return 0;
 }
@@ -1267,6 +1272,19 @@ static int maintenance_task_gc_background(struct maintenance_run_opts *opts,
 	return run_command(&child);
 }
 
+static int gc_condition(struct gc_config *cfg)
+{
+	/*
+	 * Note that it's fine to drop the repack arguments here, as we execute
+	 * git-gc(1) as a separate child process anyway. So it knows to compute
+	 * these arguments again.
+	 */
+	struct strvec repack_args = STRVEC_INIT;
+	int ret = need_to_gc(cfg, &repack_args);
+	strvec_clear(&repack_args);
+	return ret;
+}
+
 static int prune_packed(struct maintenance_run_opts *opts)
 {
 	struct child_process child = CHILD_PROCESS_INIT;
@@ -1548,6 +1566,108 @@ static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts
 	return 0;
 }
 
+static int maintenance_task_geometric_repack(struct maintenance_run_opts *opts,
+					     struct gc_config *cfg)
+{
+	struct pack_geometry geometry = {
+		.split_factor = 2,
+	};
+	struct pack_objects_args po_args = {
+		.local = 1,
+	};
+	struct existing_packs existing_packs = EXISTING_PACKS_INIT;
+	struct string_list kept_packs = STRING_LIST_INIT_DUP;
+	struct child_process child = CHILD_PROCESS_INIT;
+	int ret;
+
+	repo_config_get_int(the_repository, "maintenance.geometric-repack.splitFactor",
+			    &geometry.split_factor);
+
+	existing_packs.repo = the_repository;
+	existing_packs_collect(&existing_packs, &kept_packs);
+	pack_geometry_init(&geometry, &existing_packs, &po_args);
+	pack_geometry_split(&geometry);
+
+	child.git_cmd = 1;
+
+	strvec_pushl(&child.args, "repack", "-d", "-l", NULL);
+	if (geometry.split < geometry.pack_nr)
+		strvec_pushf(&child.args, "--geometric=%d",
+			     geometry.split_factor);
+	else
+		add_repack_all_option(cfg, NULL, &child.args);
+	if (opts->quiet)
+		strvec_push(&child.args, "--quiet");
+	if (the_repository->settings.core_multi_pack_index)
+		strvec_push(&child.args, "--write-midx");
+
+	if (run_command(&child)) {
+		ret = error(_("failed to perform geometric repack"));
+		goto out;
+	}
+
+	ret = 0;
+
+out:
+	existing_packs_release(&existing_packs);
+	pack_geometry_release(&geometry);
+	return ret;
+}
+
+static int geometric_repack_auto_condition(struct gc_config *cfg UNUSED)
+{
+	struct pack_geometry geometry = {
+		.split_factor = 2,
+	};
+	struct pack_objects_args po_args = {
+		.local = 1,
+	};
+	struct existing_packs existing_packs = EXISTING_PACKS_INIT;
+	struct string_list kept_packs = STRING_LIST_INIT_DUP;
+	int auto_value = 100;
+	int ret;
+
+	repo_config_get_int(the_repository, "maintenance.geometric-repack.auto",
+			    &auto_value);
+	if (!auto_value)
+		return 0;
+	if (auto_value < 0)
+		return 1;
+
+	repo_config_get_int(the_repository, "maintenance.geometric-repack.splitFactor",
+			    &geometry.split_factor);
+
+	existing_packs.repo = the_repository;
+	existing_packs_collect(&existing_packs, &kept_packs);
+	pack_geometry_init(&geometry, &existing_packs, &po_args);
+	pack_geometry_split(&geometry);
+
+	/*
+	 * When we'd merge at least two packs with one another we always
+	 * perform the repack.
+	 */
+	if (geometry.split) {
+		ret = 1;
+		goto out;
+	}
+
+	/*
+	 * Otherwise, we estimate the number of loose objects to determine
+	 * whether we want to create a new packfile or not.
+	 */
+	if (too_many_loose_objects(auto_value)) {
+		ret = 1;
+		goto out;
+	}
+
+	ret = 0;
+
+out:
+	existing_packs_release(&existing_packs);
+	pack_geometry_release(&geometry);
+	return ret;
+}
+
 typedef int (*maintenance_task_fn)(struct maintenance_run_opts *opts,
 				   struct gc_config *cfg);
 typedef int (*maintenance_auto_fn)(struct gc_config *cfg);
@@ -1590,11 +1710,16 @@ static const struct maintenance_task tasks[] = {
 		.background = maintenance_task_incremental_repack,
 		.auto_condition = incremental_repack_auto_condition,
 	},
+	[TASK_GEOMETRIC_REPACK] = {
+		.name = "geometric-repack",
+		.background = maintenance_task_geometric_repack,
+		.auto_condition = geometric_repack_auto_condition,
+	},
 	[TASK_GC] = {
 		.name = "gc",
 		.foreground = maintenance_task_gc_foreground,
 		.background = maintenance_task_gc_background,
-		.auto_condition = need_to_gc,
+		.auto_condition = gc_condition,
 	},
 	[TASK_COMMIT_GRAPH] = {
 		.name = "commit-graph",
@@ -1700,39 +1825,116 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts,
 	return result;
 }
 
+enum maintenance_type {
+	/* As invoked via `git maintenance run --schedule=`. */
+	MAINTENANCE_TYPE_SCHEDULED = (1 << 0),
+	/* As invoked via `git maintenance run` and with `--auto`. */
+	MAINTENANCE_TYPE_MANUAL = (1 << 1),
+};
+
 struct maintenance_strategy {
 	struct {
-		int enabled;
+		unsigned type;
 		enum schedule_priority schedule;
 	} tasks[TASK__COUNT];
 };
 
 static const struct maintenance_strategy none_strategy = { 0 };
-static const struct maintenance_strategy default_strategy = {
+
+static const struct maintenance_strategy gc_strategy = {
 	.tasks = {
-		[TASK_GC].enabled = 1,
+		[TASK_GC] = {
+			.type = MAINTENANCE_TYPE_MANUAL | MAINTENANCE_TYPE_SCHEDULED,
+			.schedule = SCHEDULE_DAILY,
+		},
 	},
 };
+
 static const struct maintenance_strategy incremental_strategy = {
 	.tasks = {
-		[TASK_COMMIT_GRAPH].enabled = 1,
-		[TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY,
-		[TASK_PREFETCH].enabled = 1,
-		[TASK_PREFETCH].schedule = SCHEDULE_HOURLY,
-		[TASK_INCREMENTAL_REPACK].enabled = 1,
-		[TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY,
-		[TASK_LOOSE_OBJECTS].enabled = 1,
-		[TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY,
-		[TASK_PACK_REFS].enabled = 1,
-		[TASK_PACK_REFS].schedule = SCHEDULE_WEEKLY,
+		[TASK_COMMIT_GRAPH] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED,
+			.schedule = SCHEDULE_HOURLY,
+		},
+		[TASK_PREFETCH] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED,
+			.schedule = SCHEDULE_HOURLY,
+		},
+		[TASK_INCREMENTAL_REPACK] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED,
+			.schedule = SCHEDULE_DAILY,
+		},
+		[TASK_LOOSE_OBJECTS] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED,
+			.schedule = SCHEDULE_DAILY,
+		},
+		[TASK_PACK_REFS] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED,
+			.schedule = SCHEDULE_WEEKLY,
+		},
+		/*
+		 * Historically, the "incremental" strategy was only available
+		 * in the context of scheduled maintenance when set up via
+		 * "maintenance.strategy". We have later expanded that config
+		 * to also cover manual maintenance.
+		 *
+		 * To retain backwards compatibility with the previous status
+		 * quo we thus run git-gc(1) in case manual maintenance was
+		 * requested. This is the same as the default strategy, which
+		 * would have been in use beforehand.
+		 */
+		[TASK_GC] = {
+			.type = MAINTENANCE_TYPE_MANUAL,
+		},
+	},
+};
+
+static const struct maintenance_strategy geometric_strategy = {
+	.tasks = {
+		[TASK_COMMIT_GRAPH] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL,
+			.schedule = SCHEDULE_HOURLY,
+		},
+		[TASK_GEOMETRIC_REPACK] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL,
+			.schedule = SCHEDULE_DAILY,
+		},
+		[TASK_PACK_REFS] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL,
+			.schedule = SCHEDULE_DAILY,
+		},
+		[TASK_RERERE_GC] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL,
+			.schedule = SCHEDULE_WEEKLY,
+		},
+		[TASK_REFLOG_EXPIRE] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL,
+			.schedule = SCHEDULE_WEEKLY,
+		},
+		[TASK_WORKTREE_PRUNE] = {
+			.type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL,
+			.schedule = SCHEDULE_WEEKLY,
+		},
 	},
 };
 
+static struct maintenance_strategy parse_maintenance_strategy(const char *name)
+{
+	if (!strcasecmp(name, "incremental"))
+		return incremental_strategy;
+	if (!strcasecmp(name, "gc"))
+		return gc_strategy;
+	if (!strcasecmp(name, "geometric"))
+		return geometric_strategy;
+	die(_("unknown maintenance strategy: '%s'"), name);
+}
+
 static void initialize_task_config(struct maintenance_run_opts *opts,
 				   const struct string_list *selected_tasks)
 {
 	struct strbuf config_name = STRBUF_INIT;
 	struct maintenance_strategy strategy;
+	enum maintenance_type type;
 	const char *config_str;
 
 	/*
@@ -1760,19 +1962,20 @@ static void initialize_task_config(struct maintenance_run_opts *opts,
 	 * - Unscheduled maintenance uses our default strategy.
 	 *
 	 * Both of these are affected by the gitconfig though, which may
-	 * override specific aspects of our strategy.
+	 * override specific aspects of our strategy. Furthermore, both
+	 * strategies can be overridden by setting "maintenance.strategy".
 	 */
 	if (opts->schedule) {
 		strategy = none_strategy;
-
-		if (!repo_config_get_string_tmp(the_repository, "maintenance.strategy", &config_str)) {
-			if (!strcasecmp(config_str, "incremental"))
-				strategy = incremental_strategy;
-		}
+		type = MAINTENANCE_TYPE_SCHEDULED;
 	} else {
-		strategy = default_strategy;
+		strategy = gc_strategy;
+		type = MAINTENANCE_TYPE_MANUAL;
 	}
 
+	if (!repo_config_get_string_tmp(the_repository, "maintenance.strategy", &config_str))
+		strategy = parse_maintenance_strategy(config_str);
+
 	for (size_t i = 0; i < TASK__COUNT; i++) {
 		int config_value;
 
@@ -1780,8 +1983,8 @@ static void initialize_task_config(struct maintenance_run_opts *opts,
 		strbuf_addf(&config_name, "maintenance.%s.enabled",
 			    tasks[i].name);
 		if (!repo_config_get_bool(the_repository, config_name.buf, &config_value))
-			strategy.tasks[i].enabled = config_value;
-		if (!strategy.tasks[i].enabled)
+			strategy.tasks[i].type = config_value ? type : 0;
+		if (!(strategy.tasks[i].type & type))
 			continue;
 
 		if (opts->schedule) {
diff --git a/commit-reach.c b/commit-reach.c
index a339e41aa4ed1e..cc18c86d3bb315 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -60,6 +60,7 @@ static int paint_down_to_common(struct repository *r,
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	int i;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
+	struct commit_list **tail = result;
 
 	if (!min_generation && !corrected_commit_dates_enabled(r))
 		queue.compare = compare_commits_by_commit_date;
@@ -95,7 +96,7 @@ static int paint_down_to_common(struct repository *r,
 		if (flags == (PARENT1 | PARENT2)) {
 			if (!(commit->object.flags & RESULT)) {
 				commit->object.flags |= RESULT;
-				commit_list_insert_by_date(commit, result);
+				tail = commit_list_append(commit, tail);
 			}
 			/* Mark parents of a found merge stale */
 			flags |= STALE;
@@ -128,6 +129,7 @@ static int paint_down_to_common(struct repository *r,
 	}
 
 	clear_prio_queue(&queue);
+	commit_list_sort_by_date(result);
 	return 0;
 }
 
@@ -136,7 +138,7 @@ static int merge_bases_many(struct repository *r,
 			    struct commit **twos,
 			    struct commit_list **result)
 {
-	struct commit_list *list = NULL;
+	struct commit_list *list = NULL, **tail = result;
 	int i;
 
 	for (i = 0; i < n; i++) {
@@ -171,8 +173,9 @@ static int merge_bases_many(struct repository *r,
 	while (list) {
 		struct commit *commit = pop_commit(&list);
 		if (!(commit->object.flags & STALE))
-			commit_list_insert_by_date(commit, result);
+			tail = commit_list_append(commit, tail);
 	}
+	commit_list_sort_by_date(result);
 	return 0;
 }
 
@@ -425,7 +428,7 @@ static int get_merge_bases_many_0(struct repository *r,
 				  int cleanup,
 				  struct commit_list **result)
 {
-	struct commit_list *list;
+	struct commit_list *list, **tail = result;
 	struct commit **rslt;
 	size_t cnt, i;
 	int ret;
@@ -461,7 +464,8 @@ static int get_merge_bases_many_0(struct repository *r,
 		return -1;
 	}
 	for (i = 0; i < cnt; i++)
-		commit_list_insert_by_date(rslt[i], result);
+		tail = commit_list_append(rslt[i], tail);
+	commit_list_sort_by_date(result);
 	free(rslt);
 	return 0;
 }
diff --git a/diff.c b/diff.c
index 22415aeceec6aa..a1961526c0dab1 100644
--- a/diff.c
+++ b/diff.c
@@ -1351,7 +1351,7 @@ static void emit_diff_symbol_from_struct(struct diff_options *o,
 	int len = eds->len;
 	unsigned flags = eds->flags;
 
-	if (o->dry_run)
+	if (!o->file)
 		return;
 
 	switch (s) {
@@ -3765,9 +3765,9 @@ static void builtin_diff(const char *name_a,
 		if (o->word_diff)
 			init_diff_words_data(&ecbdata, o, one, two);
 
-		if (o->dry_run) {
+		if (!o->file) {
 			/*
-			 * Unlike the !dry_run case, we need to ignore the
+			 * Unlike the normal output case, we need to ignore the
 			 * return value from xdi_diff_outf() here, because
 			 * xdi_diff_outf() takes non-zero return from its
 			 * callback function as a sign of error and returns
@@ -4423,7 +4423,6 @@ static void run_external_diff(const struct external_diff *pgm,
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	int quiet = !(o->output_format & DIFF_FORMAT_PATCH) || o->dry_run;
 	int rc;
 
 	/*
@@ -4432,7 +4431,7 @@ static void run_external_diff(const struct external_diff *pgm,
 	 * external diff program lacks the ability to tell us whether
 	 * it's empty then we consider it non-empty without even asking.
 	 */
-	if (!pgm->trust_exit_code && quiet) {
+	if (!pgm->trust_exit_code && !o->file) {
 		o->found_changes = 1;
 		return;
 	}
@@ -4457,7 +4456,10 @@ static void run_external_diff(const struct external_diff *pgm,
 	diff_free_filespec_data(one);
 	diff_free_filespec_data(two);
 	cmd.use_shell = 1;
-	cmd.no_stdout = quiet;
+	if (!o->file)
+		cmd.no_stdout = 1;
+	else if (o->file != stdout)
+		cmd.out = xdup(fileno(o->file));
 	rc = run_command(&cmd);
 	if (!pgm->trust_exit_code && rc == 0)
 		o->found_changes = 1;
@@ -4618,7 +4620,7 @@ static void run_diff_cmd(const struct external_diff *pgm,
 		    p->status == DIFF_STATUS_RENAMED)
 			o->found_changes = 1;
 	} else {
-		if (!o->dry_run)
+		if (o->file)
 			fprintf(o->file, "* Unmerged path %s\n", name);
 		o->found_changes = 1;
 	}
@@ -6196,15 +6198,15 @@ static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
 /* return 1 if any change is found; otherwise, return 0 */
 static int diff_flush_patch_quietly(struct diff_filepair *p, struct diff_options *o)
 {
-	int saved_dry_run = o->dry_run;
+	FILE *saved_file = o->file;
 	int saved_found_changes = o->found_changes;
 	int ret;
 
-	o->dry_run = 1;
+	o->file = NULL;
 	o->found_changes = 0;
 	diff_flush_patch(p, o);
 	ret = o->found_changes;
-	o->dry_run = saved_dry_run;
+	o->file = saved_file;
 	o->found_changes |= saved_found_changes;
 	return ret;
 }
@@ -6832,38 +6834,18 @@ void diff_flush(struct diff_options *options)
 			     DIFF_FORMAT_NAME |
 			     DIFF_FORMAT_NAME_STATUS |
 			     DIFF_FORMAT_CHECKDIFF)) {
-		/*
-		 * make sure diff_Flush_patch_quietly() to be silent.
-		 */
-		FILE *dev_null = NULL;
-		int saved_color_moved = options->color_moved;
-
-		if (options->flags.diff_from_contents) {
-			dev_null = xfopen("/dev/null", "w");
-			options->color_moved = 0;
-		}
 		for (i = 0; i < q->nr; i++) {
 			struct diff_filepair *p = q->queue[i];
 
 			if (!check_pair_status(p))
 				continue;
 
-			if (options->flags.diff_from_contents) {
-				FILE *saved_file = options->file;
-				int found_changes;
+			if (options->flags.diff_from_contents &&
+			    !diff_flush_patch_quietly(p, options))
+				continue;
 
-				options->file = dev_null;
-				found_changes = diff_flush_patch_quietly(p, options);
-				options->file = saved_file;
-				if (!found_changes)
-					continue;
-			}
 			flush_one_pair(p, options);
 		}
-		if (options->flags.diff_from_contents) {
-			fclose(dev_null);
-			options->color_moved = saved_color_moved;
-		}
 		separator++;
 	}
 
@@ -6914,15 +6896,6 @@ void diff_flush(struct diff_options *options)
 	if (output_format & DIFF_FORMAT_NO_OUTPUT &&
 	    options->flags.exit_with_status &&
 	    options->flags.diff_from_contents) {
-		/*
-		 * run diff_flush_patch for the exit status. setting
-		 * options->file to /dev/null should be safe, because we
-		 * aren't supposed to produce any output anyway.
-		 */
-		diff_free_file(options);
-		options->file = xfopen("/dev/null", "w");
-		options->close_file = 1;
-		options->color_moved = 0;
 		for (i = 0; i < q->nr; i++) {
 			struct diff_filepair *p = q->queue[i];
 			if (check_pair_status(p))
diff --git a/diff.h b/diff.h
index 2fa256c3ef0079..31eedd5c0c39d3 100644
--- a/diff.h
+++ b/diff.h
@@ -408,8 +408,6 @@ struct diff_options {
 	#define COLOR_MOVED_WS_ERROR (1<<0)
 	unsigned color_moved_ws_handling;
 
-	bool dry_run;
-
 	struct repository *repo;
 	struct strmap *additional_path_headers;
 
diff --git a/dir.c b/dir.c
index f683f8ba498373..b00821f294fea2 100644
--- a/dir.c
+++ b/dir.c
@@ -1388,18 +1388,25 @@ int match_pathname(const char *pathname, int pathlen,
 
 		if (fspathncmp(pattern, name, prefix))
 			return 0;
-		pattern += prefix;
-		patternlen -= prefix;
-		name += prefix;
-		namelen -= prefix;
 
 		/*
 		 * If the whole pattern did not have a wildcard,
 		 * then our prefix match is all we need; we
 		 * do not need to call fnmatch at all.
 		 */
-		if (!patternlen && !namelen)
+		if (patternlen == prefix && namelen == prefix)
 			return 1;
+
+		/*
+		 * Retain one character of the prefix to
+		 * pass to fnmatch, which lets it distinguish
+		 * the start of a directory component correctly.
+		 */
+		prefix--;
+		pattern += prefix;
+		patternlen -= prefix;
+		name += prefix;
+		namelen -= prefix;
 	}
 
 	return fnmatch_icase_mem(pattern, patternlen,
diff --git a/t/perf/p6010-merge-base.sh b/t/perf/p6010-merge-base.sh
new file mode 100755
index 00000000000000..54f52fa23ee1e7
--- /dev/null
+++ b/t/perf/p6010-merge-base.sh
@@ -0,0 +1,101 @@
+#!/bin/sh
+
+test_description='Test git merge-base'
+
+. ./perf-lib.sh
+
+test_perf_fresh_repo
+
+#
+# Creates lots of merges to make history traversal costly. In
+# particular it creates 2^($max_level-1)-1 2-way merges on top of
+# 2^($max_level-1) root commits. E.g., the commit history looks like
+# this for a $max_level of 3:
+#
+#      _1_
+#     /   \
+#    2     3
+#   / \   / \
+#  4   5 6   7
+#
+# The numbers are the fast-import marks, which also are the commit
+# messages. 1 is the HEAD commit and a merge, 2 and 3 are also merges,
+# 4-7 are the root commits.
+#
+build_history () {
+	local max_level="$1" &&
+	local level="${2:-1}" &&
+	local mark="${3:-1}" &&
+	if test $level -eq $max_level
+	then
+		echo "reset refs/heads/master" &&
+		echo "from $ZERO_OID" &&
+		echo "commit refs/heads/master" &&
+		echo "mark :$mark" &&
+		echo "committer C 1234567890 +0000" &&
+		echo "data < 1234567890 +0000" &&
+		echo "data < 1234567890 +0000" &&
+		echo "data <expect
+'
+
+test_perf 'git merge-base' '
+	git merge-base --all one two >actual
+'
+
+test_expect_success 'verify result' '
+	test_cmp expect actual
+'
+
+test_done
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 273d71411fe05d..db8bde280ecfc9 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -847,6 +847,17 @@ test_expect_success 'directories and ** matches' '
 	test_cmp expect actual
 '
 
+test_expect_success '** not confused by matching leading prefix' '
+	cat >.gitignore <<-\EOF &&
+	foo**/bar
+	EOF
+	git check-ignore foobar foo/bar >actual &&
+	cat >expect <<-\EOF &&
+	foo/bar
+	EOF
+	test_cmp expect actual
+'
+
 ############################################################################
 #
 # test whitespace handling
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 851ca6dd91a9ca..4285314f35f8f2 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -1431,4 +1431,15 @@ test_expect_success 'invalid option s is rejected' '
 	test_cmp expect actual
 '
 
+test_expect_success 'EOF quits' '
+	echo a >file &&
+	echo a >file2 &&
+	git add file file2 &&
+	echo X >file &&
+	echo X >file2 &&
+	git add -p out &&
+	test_grep file out &&
+	test_grep ! file2 out
+'
+
 test_done
diff --git a/t/t4020-diff-external.sh b/t/t4020-diff-external.sh
index c8a23d51483e37..7ec5854f74d651 100755
--- a/t/t4020-diff-external.sh
+++ b/t/t4020-diff-external.sh
@@ -44,6 +44,16 @@ test_expect_success 'GIT_EXTERNAL_DIFF environment and --no-ext-diff' '
 
 '
 
+test_expect_success 'GIT_EXTERNAL_DIFF and --output' '
+	cat >expect <<-EOF &&
+	file $(git rev-parse --verify HEAD:file) 100644 file $(test_oid zero) 100644
+	EOF
+	GIT_EXTERNAL_DIFF=echo git diff --output=out >stdout &&
+	cut -d" " -f1,3- actual &&
+	test_must_be_empty stdout &&
+	test_cmp expect actual
+'
+
 test_expect_success SYMLINKS 'typechange diff' '
 	rm -f file &&
 	ln -s elif file &&
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index ddd273d8dc24fb..614184a0978f79 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -465,6 +465,176 @@ test_expect_success 'maintenance.incremental-repack.auto (when config is unset)'
 	)
 '
 
+run_and_verify_geometric_pack () {
+	EXPECTED_PACKS="$1" &&
+
+	# Verify that we perform a geometric repack.
+	rm -f "trace2.txt" &&
+	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \
+		git maintenance run --task=geometric-repack 2>/dev/null &&
+	test_subcommand git repack -d -l --geometric=2 \
+		--quiet --write-midx packfiles &&
+	test_line_count = "$EXPECTED_PACKS" packfiles &&
+
+	# And verify that there are no loose objects anymore.
+	git count-objects -v >count &&
+	test_grep '^count: 0$' count
+}
+
+test_expect_success 'geometric repacking task' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		git config set maintenance.auto false &&
+		test_commit initial &&
+
+		# The initial repack causes an all-into-one repack.
+ GIT_TRACE2_EVENT="$(pwd)/initial-repack.txt" \ + git maintenance run --task=geometric-repack 2>/dev/null && + test_subcommand git repack -d -l --cruft --cruft-expiration=2.weeks.ago \ + --quiet --write-midx before && + run_and_verify_geometric_pack 1 && + ls -l .git/objects/pack/*.pack >after && + test_cmp before after && + + # This incremental change creates a new packfile that only + # soaks up loose objects. The packfiles are not getting merged + # at this point. + test_commit loose && + run_and_verify_geometric_pack 2 && + + # Both packfiles have 3 objects, so the next run would cause us + # to merge all packfiles together. This should be turned into + # an all-into-one-repack. + GIT_TRACE2_EVENT="$(pwd)/all-into-one-repack.txt" \ + git maintenance run --task=geometric-repack 2>/dev/null && + test_subcommand git repack -d -l --cruft --cruft-expiration=2.weeks.ago \ + --quiet --write-midx /dev/null && + test_subcommand git repack -d -l --cruft --cruft-expiration=2.weeks.ago \ + --quiet --write-midx packs && + test_line_count = 2 packs && + ls .git/objects/pack/*.mtimes >cruft && + test_line_count = 1 cruft + ) +' + +test_geometric_repack_needed () { + NEEDED="$1" + GEOMETRIC_CONFIG="$2" && + rm -f trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \ + git ${GEOMETRIC_CONFIG:+-c maintenance.geometric-repack.$GEOMETRIC_CONFIG} \ + maintenance run --auto --task=geometric-repack 2>/dev/null && + case "$NEEDED" in + true) + test_grep "\[\"git\",\"repack\"," trace2.txt;; + false) + ! test_grep "\[\"git\",\"repack\"," trace2.txt;; + *) + BUG "invalid parameter: $NEEDED";; + esac +} + +test_expect_success 'geometric repacking with --auto' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + + # An empty repository does not need repacking, except when + # explicitly told to do it. 
+ test_geometric_repack_needed false && + test_geometric_repack_needed false auto=0 && + test_geometric_repack_needed false auto=1 && + test_geometric_repack_needed true auto=-1 && + + test_oid_init && + + # Loose objects cause a repack when crossing the limit. Note + # that the number of objects gets extrapolated by having a look + # at the "objects/17/" shard. + test_commit "$(test_oid blob17_1)" && + test_geometric_repack_needed false && + test_commit "$(test_oid blob17_2)" && + test_geometric_repack_needed false auto=257 && + test_geometric_repack_needed true auto=256 && + + # Force another repack. + test_commit first && + test_commit second && + test_geometric_repack_needed true auto=-1 && + + # We now have two packfiles that would be merged together. As + # such, the repack should always happen unless the user has + # disabled the auto task. + test_geometric_repack_needed false auto=0 && + test_geometric_repack_needed true auto=9000 + ) +' + +test_expect_success 'geometric repacking honors configured split factor' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + git config set maintenance.auto false && + + # Create three different packs with 9, 2 and 1 object, respectively. + # This is done so that only a subset of packs would be merged + # together so that we can verify that `git repack` receives the + # correct geometric factor. 
+ for i in $(test_seq 9) + do + echo first-$i | git hash-object -w --stdin -t blob || return 1 + done && + git repack --geometric=2 -d && + + for i in $(test_seq 2) + do + echo second-$i | git hash-object -w --stdin -t blob || return 1 + done && + git repack --geometric=2 -d && + + echo third | git hash-object -w --stdin -t blob && + git repack --geometric=2 -d && + + test_geometric_repack_needed false splitFactor=2 && + test_geometric_repack_needed true splitFactor=3 && + test_subcommand git repack -d -l --geometric=3 --quiet --write-midx expect && + rm -f trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \ + git -c maintenance.strategy=$STRATEGY maintenance run --quiet "$@" && + sed -n 's/{"event":"child_start","sid":"[^/"]*",.*,"argv":\["\(.*\)\"]}/\1/p' actual + test_cmp expect actual +} + +test_expect_success 'maintenance.strategy is respected' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + + test_must_fail git -c maintenance.strategy=unknown maintenance run 2>err && + test_grep "unknown maintenance strategy: .unknown." 
err && + + test_strategy incremental <<-\EOF && + git pack-refs --all --prune + git reflog expire --all + git gc --quiet --no-detach --skip-foreground-tasks + EOF + + test_strategy incremental --schedule=weekly <<-\EOF && + git pack-refs --all --prune + git prune-packed --quiet + git multi-pack-index write --no-progress + git multi-pack-index expire --no-progress + git multi-pack-index repack --no-progress --batch-size=1 + git commit-graph write --split --reachable --no-progress + EOF + + test_strategy gc <<-\EOF && + git pack-refs --all --prune + git reflog expire --all + git gc --quiet --no-detach --skip-foreground-tasks + EOF + + test_strategy gc --schedule=weekly <<-\EOF && + git pack-refs --all --prune + git reflog expire --all + git gc --quiet --no-detach --skip-foreground-tasks + EOF + + test_strategy geometric <<-\EOF && + git pack-refs --all --prune + git reflog expire --all + git repack -d -l --geometric=2 --quiet --write-midx + git commit-graph write --split --reachable --no-progress + git worktree prune --expire 3.months.ago + git rerere gc + EOF + + test_strategy geometric --schedule=weekly <<-\EOF + git pack-refs --all --prune + git reflog expire --all + git repack -d -l --geometric=2 --quiet --write-midx + git commit-graph write --split --reachable --no-progress + git worktree prune --expire 3.months.ago + git rerere gc + EOF + ) +' + test_expect_success 'register and unregister' ' test_when_finished git config --global --unset-all maintenance.repo && @@ -1093,6 +1333,11 @@ test_expect_success 'fails when running outside of a repository' ' nongit test_must_fail git maintenance unregister ' +test_expect_success 'fails when configured to use an invalid strategy' ' + test_must_fail git -c maintenance.strategy=invalid maintenance run --schedule=hourly 2>err && + test_grep "unknown maintenance strategy: .invalid." 
err +' + test_expect_success 'register and unregister bare repo' ' test_when_finished "git config --global --unset-all maintenance.repo || :" && test_might_fail git config --global --unset-all maintenance.repo &&