Merged
13 changes: 13 additions & 0 deletions Documentation/RelNotes/2.51.0.adoc
@@ -47,6 +47,12 @@ UI, Workflows & Features
service names (like smtp) in addition to the numeric port numbers
(like 25).

* Lift the limitation on using the changed-path filter in "git log" so
that it can be used for a pathspec with multiple literal paths.

* Clean up the way signatures on commit objects are exported to and
imported from the fast-import stream.


Performance, Internal Implementation, Development Support etc.
--------------------------------------------------------------
@@ -195,6 +201,13 @@ including security updates, are included in this release.
expansion.
(merge 7d275cd5c0 jb/gpg-program-variable-is-a-pathname later to maint).

* Our <sane-ctype.h> header file relied on the system-supplied
<ctype.h> header not being included later, as that would override our
macro definitions, but "amazon linux" broke this assumption. Fix
this by preemptively including <ctype.h> near the beginning of
<sane-ctype.h> ourselves.
(merge 9d3b33125f ps/sane-ctype-workaround later to maint).

* Other code cleanup, docfix, build fix, etc.
(merge b257adb571 lo/my-first-ow-doc-update later to maint).
(merge 8b34b6a220 ly/sequencer-update-squash-is-fixup-only later to maint).
17 changes: 17 additions & 0 deletions Documentation/git-fast-export.adoc
@@ -50,6 +50,23 @@ resulting tag will have an invalid signature.
is the same as how earlier versions of this command without
this option behaved.
+
When exported, a signature starts with:
+
gpgsig <git-hash-algo> <signature-format>
+
where <git-hash-algo> is the Git object hash algorithm, either "sha1"
or "sha256", and <signature-format> is the signature type, one of
"openpgp", "x509", "ssh" or "unknown".
+
For example, an OpenPGP signature on a SHA-1 commit starts with
`gpgsig sha1 openpgp`, while an SSH signature on a SHA-256 commit
starts with `gpgsig sha256 ssh`.
+
While all the signatures of a commit are exported, an importer may
choose to accept only some of them. For example,
linkgit:git-fast-import[1] currently stores at most one signature per
Git hash algorithm in each commit.
+
NOTE: This is highly experimental and the format of the data stream may
change in the future without compatibility guarantees.
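
As an illustration of the header format described above, here is a minimal sketch of how a stream consumer might split and validate a `gpgsig` line. The names `struct gpgsig_header`, `is_one_of` and `parse_gpgsig_header` are hypothetical and are not part of Git; the accepted values are only those listed in the documentation text above, and the format itself is marked experimental.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the two fields of a "gpgsig" header line. */
struct gpgsig_header {
	char hash_algo[16];  /* "sha1" or "sha256" */
	char sig_format[16]; /* "openpgp", "x509", "ssh" or "unknown" */
};

/* Returns 1 if `s` equals one of the `n` strings in `list`, else 0. */
static int is_one_of(const char *s, const char *const *list, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(s, list[i]))
			return 1;
	return 0;
}

/*
 * Parse "gpgsig <git-hash-algo> <signature-format>", optionally ending
 * in a newline. Returns 0 on success and -1 on malformed input.
 */
int parse_gpgsig_header(const char *line, struct gpgsig_header *out)
{
	static const char *const algos[] = { "sha1", "sha256" };
	static const char *const formats[] = { "openpgp", "x509", "ssh", "unknown" };
	static const char prefix[] = "gpgsig ";
	const char *algo, *fmt, *sp;
	size_t algo_len, fmt_len;

	assert(line && out);
	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return -1;
	algo = line + sizeof(prefix) - 1;
	sp = strchr(algo, ' ');
	if (!sp)
		return -1;
	algo_len = (size_t)(sp - algo);
	fmt = sp + 1;
	fmt_len = strcspn(fmt, "\n");
	if (algo_len >= sizeof(out->hash_algo) ||
	    fmt_len >= sizeof(out->sig_format))
		return -1;

	/* Copy both fields out as NUL-terminated strings. */
	memcpy(out->hash_algo, algo, algo_len);
	out->hash_algo[algo_len] = '\0';
	memcpy(out->sig_format, fmt, fmt_len);
	out->sig_format[fmt_len] = '\0';

	if (!is_one_of(out->hash_algo, algos, 2) ||
	    !is_one_of(out->sig_format, formats, 4))
		return -1;
	return 0;
}
```

Calling `parse_gpgsig_header("gpgsig sha256 ssh\n", &h)` fills `h.hash_algo` with `sha256` and `h.sig_format` with `ssh`; a line with an unlisted algorithm or a missing second field is rejected.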

38 changes: 32 additions & 6 deletions Documentation/git-fast-import.adoc
@@ -445,7 +445,7 @@ one).
original-oid?
('author' (SP <name>)? SP LT <email> GT SP <when> LF)?
'committer' (SP <name>)? SP LT <email> GT SP <when> LF
-	('gpgsig' SP <alg> LF data)?
+	('gpgsig' SP <algo> SP <format> LF data)?
('encoding' SP <encoding> LF)?
data
('from' SP <commit-ish> LF)?
@@ -518,13 +518,39 @@ their syntax.
^^^^^^^^

The optional `gpgsig` command is used to include a PGP/GPG signature
-that signs the commit data.
+or other cryptographic signature that signs the commit data.

-Here <alg> specifies which hashing algorithm is used for this
-signature, either `sha1` or `sha256`.
....
'gpgsig' SP <git-hash-algo> SP <signature-format> LF data
....

The `gpgsig` command takes two arguments:

* `<git-hash-algo>` specifies which Git object format this signature
applies to, either `sha1` or `sha256`. This makes it possible to know
which representation of the commit was signed (the SHA-1 or the
SHA-256 version), which helps with both signature verification and
interoperability between repos with different hash functions.

* `<signature-format>` specifies the type of signature, such as
`openpgp`, `x509`, `ssh`, or `unknown`. This is a convenience for
tools that process the stream, so they don't have to parse the ASCII
armor to identify the signature type.

A commit may have at most one signature for the SHA-1 object format
(stored in the "gpgsig" header) and one for the SHA-256 object format
(stored in the "gpgsig-sha256" header).

See below for a detailed description of the `data` command which
contains the raw signature data.

Signatures are not yet checked in the current implementation.
(Setting the `extensions.compatObjectFormat` configuration option
now might help with verifying both SHA-1 and SHA-256 object format
signatures once verification is implemented.)

-NOTE: This is highly experimental and the format of the data stream may
-change in the future without compatibility guarantees.
+NOTE: This is highly experimental and the format of the `gpgsig`
+command may change in the future without compatibility guarantees.
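
To make the two-argument grammar above concrete, the following is a rough sketch of a frontend emitting a `gpgsig` command followed by a counted `data` block (`data <count> LF <raw>`). The function `format_gpgsig_command` is a hypothetical helper, not a Git API, and because the command is experimental the exact layout should be treated as an assumption that may change.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes
 *   'gpgsig' SP <git-hash-algo> SP <signature-format> LF
 *   'data' SP <count> LF <raw signature bytes>
 * into `buf` and NUL-terminates it for convenience.
 * Returns the number of bytes written (excluding the NUL),
 * or -1 if the buffer is too small.
 */
int format_gpgsig_command(char *buf, size_t bufsz,
			  const char *hash_algo, const char *sig_format,
			  const void *sig, size_t sig_len)
{
	int n;

	assert(buf && hash_algo && sig_format && sig);
	n = snprintf(buf, bufsz, "gpgsig %s %s\ndata %zu\n",
		     hash_algo, sig_format, sig_len);
	/* Need room for the header, the signature and a trailing NUL. */
	if (n < 0 || (size_t)n >= bufsz || sig_len >= bufsz - (size_t)n)
		return -1;
	memcpy(buf + n, sig, sig_len);
	buf[(size_t)n + sig_len] = '\0';
	return n + (int)sig_len;
}
```

For instance, passing `"sha256"`, `"ssh"` and a 4-byte payload produces a command that begins with `gpgsig sha256 ssh`, followed by `data 4` and the payload bytes.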

`encoding`
^^^^^^^^^^
2 changes: 1 addition & 1 deletion blame.c
@@ -1311,7 +1311,7 @@ static void add_bloom_key(struct blame_bloom_data *bd,
}

bd->keys[bd->nr] = xmalloc(sizeof(struct bloom_key));
-	fill_bloom_key(path, strlen(path), bd->keys[bd->nr], bd->settings);
+	bloom_key_fill(bd->keys[bd->nr], path, strlen(path), bd->settings);
bd->nr++;
}

84 changes: 77 additions & 7 deletions bloom.c
@@ -107,7 +107,7 @@ int load_bloom_filter_from_graph(struct commit_graph *g,
* Not considered to be cryptographically secure.
* Implemented as described in https://en.wikipedia.org/wiki/MurmurHash#Algorithm
*/
-uint32_t murmur3_seeded_v2(uint32_t seed, const char *data, size_t len)
+static uint32_t murmur3_seeded_v2(uint32_t seed, const char *data, size_t len)
{
const uint32_t c1 = 0xcc9e2d51;
const uint32_t c2 = 0x1b873593;
@@ -221,9 +221,7 @@ static uint32_t murmur3_seeded_v1(uint32_t seed, const char *data, size_t len)
return seed;
}

-void fill_bloom_key(const char *data,
-		    size_t len,
-		    struct bloom_key *key,
+void bloom_key_fill(struct bloom_key *key, const char *data, size_t len,
const struct bloom_filter_settings *settings)
{
int i;
@@ -243,7 +241,7 @@ void fill_bloom_key(const char *data,
key->hashes[i] = hash0 + i * hash1;
}

-void clear_bloom_key(struct bloom_key *key)
+void bloom_key_clear(struct bloom_key *key)
{
FREE_AND_NULL(key->hashes);
}
@@ -280,6 +278,55 @@ void deinit_bloom_filters(void)
deep_clear_bloom_filter_slab(&bloom_filters, free_one_bloom_filter);
}

struct bloom_keyvec *bloom_keyvec_new(const char *path, size_t len,
const struct bloom_filter_settings *settings)
{
struct bloom_keyvec *vec;
const char *p;
size_t sz;
size_t nr = 1;

p = path;
while (*p) {
/*
* At this point, the path is normalized to use Unix-style
* path separators. This is required due to how the
* changed-path Bloom filters store the paths.
*/
if (*p == '/')
nr++;
p++;
}

sz = sizeof(struct bloom_keyvec);
sz += nr * sizeof(struct bloom_key);
vec = (struct bloom_keyvec *)xcalloc(1, sz);
if (!vec)
return NULL;
vec->count = nr;

bloom_key_fill(&vec->key[0], path, len, settings);
nr = 1;
p = path + len - 1;
while (p > path) {
if (*p == '/') {
bloom_key_fill(&vec->key[nr++], path, p - path, settings);
}
p--;
}
assert(nr == vec->count);
return vec;
}

void bloom_keyvec_free(struct bloom_keyvec *vec)
{
if (!vec)
return;
for (size_t nr = 0; nr < vec->count; nr++)
bloom_key_clear(&vec->key[nr]);
free(vec);
}

static int pathmap_cmp(const void *hashmap_cmp_fn_data UNUSED,
const struct hashmap_entry *eptr,
const struct hashmap_entry *entry_or_key,
@@ -500,9 +547,9 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,

hashmap_for_each_entry(&pathmap, &iter, e, entry) {
struct bloom_key key;
-		fill_bloom_key(e->path, strlen(e->path), &key, settings);
+		bloom_key_fill(&key, e->path, strlen(e->path), settings);
 		add_key_to_filter(&key, filter, settings);
-		clear_bloom_key(&key);
+		bloom_key_clear(&key);
}

cleanup:
@@ -540,3 +587,26 @@ int bloom_filter_contains(const struct bloom_filter *filter,

return 1;
}

int bloom_filter_contains_vec(const struct bloom_filter *filter,
const struct bloom_keyvec *vec,
const struct bloom_filter_settings *settings)
{
int ret = 1;

for (size_t nr = 0; ret > 0 && nr < vec->count; nr++)
ret = bloom_filter_contains(filter, &vec->key[nr], settings);

return ret;
}

uint32_t test_bloom_murmur3_seeded(uint32_t seed, const char *data, size_t len,
int version)
{
assert(version == 1 || version == 2);

if (version == 2)
return murmur3_seeded_v2(seed, data, len);
else
return murmur3_seeded_v1(seed, data, len);
}
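
The prefix walk performed by `bloom_keyvec_new()` above can be easier to follow in isolation. The sketch below reproduces only the counting and prefix-length logic, without any Git types; `path_prefix_count` and `path_prefix_lengths` are hypothetical names introduced for illustration, and the code assumes a normalized path with no leading or trailing slash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of keys needed for a path: one per '/'-separated component. */
size_t path_prefix_count(const char *path)
{
	size_t nr = 1;

	assert(path);
	for (; *path; path++)
		if (*path == '/')
			nr++;
	return nr;
}

/*
 * Store the length of each prefix of `path` in `lens`, most specific
 * first (for "a/b/c": 5, 3, 1). At most `max` entries are written.
 * Returns the number of entries stored.
 */
size_t path_prefix_lengths(const char *path, size_t *lens, size_t max)
{
	size_t len, nr = 0;
	const char *p;

	assert(path && lens);
	len = strlen(path);
	if (!max)
		return 0;
	lens[nr++] = len;	/* the full path comes first */
	if (!len)
		return nr;
	/* Walk backwards; every '/' ends a shorter directory prefix. */
	for (p = path + len - 1; p > path && nr < max; p--)
		if (*p == '/')
			lens[nr++] = (size_t)(p - path);
	return nr;
}
```

Each stored length, paired with the original `path` pointer, corresponds to one `bloom_key_fill()` call in the function above.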
54 changes: 42 additions & 12 deletions bloom.h
@@ -74,24 +74,40 @@ struct bloom_key {
uint32_t *hashes;
};

/*
* A bloom_keyvec is a vector of bloom_keys, which
* can be used to store multiple keys for a single
* pathspec item.
*/
struct bloom_keyvec {
size_t count;
struct bloom_key key[FLEX_ARRAY];
};

int load_bloom_filter_from_graph(struct commit_graph *g,
struct bloom_filter *filter,
uint32_t graph_pos);

void bloom_key_fill(struct bloom_key *key, const char *data, size_t len,
const struct bloom_filter_settings *settings);
void bloom_key_clear(struct bloom_key *key);

 /*
- * Calculate the murmur3 32-bit hash value for the given data
- * using the given seed.
- * Produces a uniformly distributed hash value.
- * Not considered to be cryptographically secure.
- * Implemented as described in https://en.wikipedia.org/wiki/MurmurHash#Algorithm
+ * bloom_keyvec_new - Allocate and populate a bloom_keyvec with keys for the
+ * given path.
+ *
+ * This function splits the input path by '/' and generates a bloom key for each
+ * prefix, in reverse order of specificity. For example, given the input
+ * "a/b/c", it will generate bloom keys for:
+ * - "a/b/c"
+ * - "a/b"
+ * - "a"
+ *
+ * The resulting keys are stored in a newly allocated bloom_keyvec.
  */
-uint32_t murmur3_seeded_v2(uint32_t seed, const char *data, size_t len);
-
-void fill_bloom_key(const char *data,
-		    size_t len,
-		    struct bloom_key *key,
-		    const struct bloom_filter_settings *settings);
-void clear_bloom_key(struct bloom_key *key);
+struct bloom_keyvec *bloom_keyvec_new(const char *path, size_t len,
+				      const struct bloom_filter_settings *settings);
+void bloom_keyvec_free(struct bloom_keyvec *vec);

void add_key_to_filter(const struct bloom_key *key,
struct bloom_filter *filter,
@@ -137,4 +153,18 @@ int bloom_filter_contains(const struct bloom_filter *filter,
const struct bloom_key *key,
const struct bloom_filter_settings *settings);

/*
* bloom_filter_contains_vec - Check if all keys in a key vector are in the
* Bloom filter.
*
* Returns 1 if **all** keys in the vector are present in the filter,
* 0 if **any** key is not present.
*/
int bloom_filter_contains_vec(const struct bloom_filter *filter,
const struct bloom_keyvec *v,
const struct bloom_filter_settings *settings);

uint32_t test_bloom_murmur3_seeded(uint32_t seed, const char *data, size_t len,
int version);

#endif
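
The "all keys must match" contract of `bloom_filter_contains_vec()` can be tried out with a toy filter. This is a self-contained sketch under stated assumptions, not Git's implementation: FNV-1a stands in for murmur3, the filter size, hash count and seeds are arbitrary, and every `toy_*` name is hypothetical. It keeps the `hash0 + i * hash1` probing scheme and the early-exit loop from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BITS 512	/* filter size in bits (arbitrary for this sketch) */
#define TOY_HASHES 7	/* bit positions per key (arbitrary for this sketch) */

struct toy_filter {
	uint8_t bits[TOY_BITS / 8];
};

/* FNV-1a, used here only as a simple stand-in for murmur3. */
static uint32_t toy_hash(uint32_t seed, const char *data, size_t len)
{
	uint32_t h = 2166136261u ^ seed;

	for (size_t i = 0; i < len; i++) {
		h ^= (uint8_t)data[i];
		h *= 16777619u;
	}
	return h;
}

/* i-th bit position for a key: hash0 + i * hash1, reduced to the filter size. */
static uint32_t toy_bit(const char *key, int i)
{
	size_t len = strlen(key);
	uint32_t hash0 = toy_hash(1, key, len);
	uint32_t hash1 = toy_hash(2, key, len);

	return (hash0 + (uint32_t)i * hash1) % TOY_BITS;
}

void toy_add(struct toy_filter *f, const char *key)
{
	assert(f && key);
	for (int i = 0; i < TOY_HASHES; i++) {
		uint32_t bit = toy_bit(key, i);
		f->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
	}
}

/* Returns 1 if the key may be present, 0 if it is definitely absent. */
int toy_contains(const struct toy_filter *f, const char *key)
{
	assert(f && key);
	for (int i = 0; i < TOY_HASHES; i++) {
		uint32_t bit = toy_bit(key, i);
		if (!(f->bits[bit / 8] & (1u << (bit % 8))))
			return 0;
	}
	return 1;
}

/* Returns 1 only if every key may be present; stops at the first miss. */
int toy_contains_all(const struct toy_filter *f,
		     const char *const *keys, size_t count)
{
	int ret = 1;

	assert(f && keys);
	for (size_t nr = 0; ret > 0 && nr < count; nr++)
		ret = toy_contains(f, keys[nr]);
	return ret;
}
```

Because a Bloom filter can report false positives but never false negatives, a single definite miss on any prefix is enough to rule out the whole path, which is why the loop can stop early.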