#include "git-compat-util.h"
#include "repository.h"
#include "config.h"
#include "date.h"
#include "environment.h"
#include "gettext.h"
#include "hex.h"
#include "lockfile.h"
#include "refs.h"
#include "pkt-line.h"
#include "commit.h"
#include "tag.h"
#include "exec-cmd.h"
#include "pack.h"
#include "sideband.h"
#include "fetch-pack.h"
#include "remote.h"
#include "run-command.h"
#include "connect.h"
#include "trace2.h"
#include "transport.h"
#include "version.h"
#include "oid-array.h"
#include "oidset.h"
#include "packfile.h"
#include "object-store-ll.h"
#include "path.h"
fetch-pack: write shallow, then check connectivity
When fetching, connectivity is checked after the shallow file is
updated. There are 2 issues with this: (1) the connectivity check is
only performed up to ancestors of existing refs (which is not thorough
enough if we were deepening an existing ref in the first place), and (2)
there is no rollback of the shallow file if the connectivity check
fails.
To solve (1), update the connectivity check to check the ancestry chain
completely in the case of a deepening fetch by refraining from passing
"--not --all" when invoking rev-list in connected.c.
To solve (2), have fetch_pack() perform its own connectivity check
before updating the shallow file. To support existing use cases in which
"git fetch-pack" is used to download objects without much regard as to
the connectivity of the resulting objects with respect to the existing
repository, the connectivity check is only done if necessary (that is,
the fetch is not a clone, and the fetch involves shallow/deepen
functionality). "git fetch" still performs its own connectivity check,
preserving correctness but sometimes performing redundant work. This
redundancy is mitigated by the fact that fetch_pack() reports if it has
performed a connectivity check itself, and if the transport supports
connect or stateless-connect, it will bubble up that report so that "git
fetch" knows not to perform the connectivity check in such a case.
This was noticed when a user tried to deepen an existing repository by
fetching with --no-shallow from a server that did not send all necessary
objects - the connectivity check as run by "git fetch" succeeded, but a
subsequent "git fsck" failed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-03 00:08:43 +02:00
#include "connected.h"
#include "fetch-negotiator.h"
#include "fsck.h"
#include "shallow.h"
fetch: teach independent negotiation (no packfile)
Currently, the packfile negotiation step within a Git fetch cannot be
done independent of sending the packfile, even though there is at least
one application wherein this is useful. Therefore, make it possible for
this negotiation step to be done independently. A subsequent commit will
use this for one such application - push negotiation.
This feature is for protocol v2 only. (An implementation for protocol v0
would require a separate implementation in the fetch, transport, and
transport helper code.)
In the protocol, the main hindrance towards independent negotiation is
that the server can unilaterally decide to send the packfile. This is
solved by a "wait-for-done" argument: the server will then wait for the
client to say "done". In practice, the client will never say it; instead
it will cease requests once it is satisfied.
In the client, the main change lies in the transport and transport
helper code. fetch_refs_via_pack() performs everything needed - protocol
version and capability checks, and the negotiation itself.
There are 2 code paths that do not go through fetch_refs_via_pack() that
needed to be individually excluded: the bundle transport (excluded
through requiring smart_options, which the bundle transport doesn't
support) and transport helpers that do not support takeover. If or when
we support independent negotiation for protocol v0, we will need to
modify these 2 code paths to support it. But for now, report failure if
independent negotiation is requested in these cases.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-04 23:16:01 +02:00
#include "commit-reach.h"
#include "commit-graph.h"
fetch-pack: ignore SIGPIPE when writing to index-pack
When fetching, we send the incoming pack to index-pack (or
unpack-objects) via the sideband demuxer. If index-pack hits an error
(e.g., because an object fails fsck), then it will die immediately. This
may cause us to get SIGPIPE on the fetch, as we're still trying to write
pack contents from the sideband demuxer (which is typically a thread,
and thus takes down the whole fetch process).
You can see this in action with:
./t5702-protocol-v2.sh --stress --run=59
which ends with (wrapped for readability):
test_must_fail: died by signal 13: git -c protocol.version=2 \
-c transfer.fsckobjects=1 -c fetch.uriprotocols=http,https \
clone http://127.0.0.1:5708/smart/http_parent http_child
not ok 59 - packfile-uri with transfer.fsckobjects fails on bad object
This is mostly cosmetic. The actual error of interest (in this case, the
object that failed the fsck check) comes from index-pack straight to
stderr, so the user still sees it. They _might_ even see fetch-pack
complaining about index-pack failing, because the main thread is racing
with the sideband-demuxer. But they'll definitely see the signal death
in the exit code, which is what the test is complaining about.
We can make this more predictable by just ignoring SIGPIPE. The sideband
demuxer uses write_or_die(), so it will notice and stop (gracefully,
because we hook die_routine() to exit just the thread). And during this
section we're not writing anywhere else where we'd be concerned about
SIGPIPE preventing us from wasting effort writing to nowhere.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-19 21:58:55 +01:00
#include "sigchain.h"
#include "mergesort.h"
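The SIGPIPE note above can be illustrated with a small POSIX sketch that is independent of Git. It assumes a hypothetical helper, write_ignoring_sigpipe(), which ignores SIGPIPE for the duration of one write() so that a vanished reader shows up as an EPIPE error return instead of killing the process. Git's real code goes through its own sigchain helpers and write_or_die(); none of the names below come from fetch-pack.c.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Git): write to fd with SIGPIPE
 * ignored, then restore the previous disposition. Returns whatever
 * write() returned; with no reader left that is -1 and errno == EPIPE.
 */
static ssize_t write_ignoring_sigpipe(int fd, const void *buf, size_t len)
{
	void (*old)(int) = signal(SIGPIPE, SIG_IGN);
	ssize_t ret = write(fd, buf, len);
	int saved_errno = errno;

	if (old != SIG_ERR)
		signal(SIGPIPE, old);
	errno = saved_errno;
	return ret;
}

/* Returns 1 if writing to a pipe whose read end is closed reports EPIPE. */
static int demo_epipe(void)
{
	int fds[2];
	int got_epipe;

	if (pipe(fds) < 0)
		return 0;
	close(fds[0]);	/* simulate the reading process dying */
	got_epipe = write_ignoring_sigpipe(fds[1], "x", 1) < 0 &&
		    errno == EPIPE;
	close(fds[1]);
	return got_epipe;
}
```

Without the SIG_IGN step, the same write() would normally terminate the process with signal 13, which is the "died by signal 13" failure quoted in the commit message.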

static int transfer_unpack_limit = -1;
static int fetch_unpack_limit = -1;
static int unpack_limit = 100;
static int prefer_ofs_delta = 1;
static int no_done;
static int deepen_since_ok;
static int deepen_not_ok;
static int fetch_fsck_objects = -1;
static int transfer_fsck_objects = -1;
static int agent_supported;
static int server_supports_filtering;
static int advertise_sid;
static struct shallow_lock shallow_lock;
static const char *alternate_shallow_file;
static struct fsck_options fsck_options = FSCK_OPTIONS_MISSING_GITMODULES;
static struct strbuf fsck_msg_types = STRBUF_INIT;
static struct string_list uri_protocols = STRING_LIST_INIT_DUP;

/* Remember to update object flag allocation in object.h */
#define COMPLETE	(1U << 0)
#define ALTERNATE	(1U << 1)
#define COMMON		(1U << 6)
#define REACH_SCRATCH	(1U << 7)

/*
 * After sending this many "have"s if we do not get any new ACK, we
 * give up traversing our history.
 */
#define MAX_IN_VAIN 256

static int multi_ack, use_sideband;
/* Allow specifying sha1 if it is a ref tip. */
#define ALLOW_TIP_SHA1	01
/* Allow request of a sha1 if it is reachable from a ref (possibly hidden ref). */
#define ALLOW_REACHABLE_SHA1	02
static unsigned int allow_unadvertised_object_request;

__attribute__((format (printf, 2, 3)))
static inline void print_verbose(const struct fetch_pack_args *args,
				 const char *fmt, ...)
{
	va_list params;

	if (!args->verbose)
		return;

	va_start(params, fmt);
	vfprintf(stderr, fmt, params);
	va_end(params);
	fputc('\n', stderr);
}

fetch-pack: cache results of for_each_alternate_ref
We may run for_each_alternate_ref() twice, once in
find_common() and once in everything_local(). This operation
can be expensive, because it involves running a sub-process
which must freshly load all of the alternate's refs from
disk.
Let's cache and reuse the results between the two calls. We
can make some optimizations based on the particular use
pattern in fetch-pack to keep our memory usage down.
The first is that we only care about the sha1s, not the refs
themselves. So it's OK to store only the sha1s, and to
suppress duplicates. The natural fit would therefore be a
sha1_array.
However, sha1_array's de-duplication happens only after it
has read and sorted all entries. It still stores each
duplicate. For an alternate with a large number of refs
pointing to the same commits, this is a needless expense.
Instead, we'd prefer to eliminate duplicates before putting
them in the cache, which implies using a hash. We can
further note that fetch-pack will call parse_object() on
each alternate sha1. We can therefore keep our cache as a
set of pointers to "struct object". That gives us a place to
put our "already seen" bit with an optimized hash lookup.
And as a bonus, the object stores the sha1 for us, so
pointer-to-object is all we need.
There are two extra optimizations I didn't do here:
- we actually store an array of pointer-to-object.
Technically we could just walk the obj_hash table
looking for entries with the ALTERNATE flag set (because
our use case doesn't care about the order here).
But that hash table may be mostly composed of
non-ALTERNATE entries, so we'd waste time walking over
them. So it would be a slight win in memory use, but a
loss in CPU.
- the items we pull out of the cache are actual "struct
object"s, but then we feed "obj->sha1" to our
sub-functions, which promptly call parse_object().
This second parse is cheap, because it starts with
lookup_object() and will bail immediately when it sees
we've already parsed the object. We could save the extra
hash lookup, but it would involve refactoring the
functions we call. It may or may not be worth the
trouble.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 21:53:03 +01:00
struct alternate_object_cache {
	struct object **items;
	size_t nr, alloc;
};

static void cache_one_alternate(const struct object_id *oid,
				void *vcache)
{
	struct alternate_object_cache *cache = vcache;
	struct object *obj = parse_object(the_repository, oid);

	if (!obj || (obj->flags & ALTERNATE))
		return;

	obj->flags |= ALTERNATE;
	ALLOC_GROW(cache->items, cache->nr + 1, cache->alloc);
	cache->items[cache->nr++] = obj;
}

static void for_each_cached_alternate(struct fetch_negotiator *negotiator,
				      void (*cb)(struct fetch_negotiator *,
						 struct object *))
{
	static int initialized;
	static struct alternate_object_cache cache;
	size_t i;

	if (!initialized) {
		for_each_alternate_ref(cache_one_alternate, &cache);
		initialized = 1;
	}

	for (i = 0; i < cache.nr; i++)
		cb(negotiator, cache.items[i]);
}

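The caching scheme described in the commit message above (mark each object with an "already seen" bit, then append it to a growable array only once) can be sketched in isolation. The toy_object type, the TOY_ALTERNATE flag and the helper names below are illustrative stand-ins, not Git's struct object or ALLOC_GROW, and the growth policy is an assumption made for the example.

```c
#include <stdlib.h>

#define TOY_ALTERNATE (1U << 1)	/* stand-in for the ALTERNATE flag bit */

struct toy_object {
	unsigned flags;
	int id;
};

struct toy_cache {
	struct toy_object **items;
	size_t nr, alloc;
};

/*
 * Append obj unless its flag bit says it is already cached.
 * Returns 1 if added, 0 if skipped, -1 on allocation failure.
 */
static int toy_cache_add(struct toy_cache *cache, struct toy_object *obj)
{
	if (!obj || (obj->flags & TOY_ALTERNATE))
		return 0;
	if (cache->nr + 1 > cache->alloc) {
		size_t alloc = cache->alloc ? cache->alloc * 2 : 16;
		struct toy_object **items =
			realloc(cache->items, alloc * sizeof(*items));
		if (!items)
			return -1;
		cache->items = items;
		cache->alloc = alloc;
	}
	obj->flags |= TOY_ALTERNATE;	/* the "already seen" bit */
	cache->items[cache->nr++] = obj;
	return 1;
}
```

Because the duplicate check is a single bit test on the object itself, no second lookup structure is needed, which is the point the commit message makes about keeping pointers to objects rather than a sorted array of IDs.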
static struct commit *deref_without_lazy_fetch_extended(const struct object_id *oid,
							int mark_tags_complete,
							enum object_type *type,
							unsigned int oi_flags)
{
	struct object_info info = { .typep = type };
fetch-pack: optimize loading of refs via commit graph
In order to negotiate a packfile, we need to dereference refs to see
which commits we have in common with the remote. To do so, we first look
up the object's type -- if it's a tag, we peel until we hit a non-tag
object. If we hit a commit eventually, then we return that commit.
In case the object ID points to a commit directly, we can avoid the
initial lookup of the object type by opportunistically looking up the
commit via the commit-graph, if available, which gives us a slight speed
bump of about 2% in a huge repository with about 2.3M refs:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 31.634 s ± 0.258 s [User: 28.400 s, System: 5.090 s]
Range (min … max): 31.280 s … 31.896 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 31.129 s ± 0.543 s [User: 27.976 s, System: 5.056 s]
Range (min … max): 30.172 s … 31.479 s 5 runs
Summary
'HEAD: git-fetch' ran
1.02 ± 0.02 times faster than 'HEAD~: git-fetch'
In case this fails, we fall back to the old code which peels the
objects to a commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01 15:09:54 +02:00
	struct commit *commit;

	commit = lookup_commit_in_graph(the_repository, oid);
	if (commit)
		return commit;

	while (1) {
		if (oid_object_info_extended(the_repository, oid, &info,
					     oi_flags))
			return NULL;
		if (*type == OBJ_TAG) {
			struct tag *tag = (struct tag *)
				parse_object(the_repository, oid);

			if (!tag->tagged)
				return NULL;
			if (mark_tags_complete)
				tag->object.flags |= COMPLETE;
			oid = &tag->tagged->oid;
		} else {
			break;
		}
	}

	if (*type == OBJ_COMMIT) {
		struct commit *commit = lookup_commit(the_repository, oid);
		if (!commit || repo_parse_commit(the_repository, commit))
			return NULL;
		return commit;
	}

	return NULL;
}

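The peeling loop above follows tag objects until it reaches something that is not a tag, and succeeds only if that object is a commit. A self-contained sketch of that control flow follows, using a made-up in-memory toy_obj type in place of Git's object database. All names below are hypothetical.

```c
#include <stddef.h>

enum toy_type { TOY_COMMIT, TOY_TREE, TOY_BLOB, TOY_TAG };

struct toy_obj {
	enum toy_type type;
	struct toy_obj *tagged;	/* only meaningful for TOY_TAG */
};

/*
 * Follow a chain of tags. Returns the commit at the end of the chain,
 * or NULL if the chain is broken or ends in a non-commit.
 */
static struct toy_obj *toy_peel_to_commit(struct toy_obj *obj)
{
	while (obj && obj->type == TOY_TAG)
		obj = obj->tagged;	/* NULL here means a dangling tag */
	if (obj && obj->type == TOY_COMMIT)
		return obj;
	return NULL;
}
```

The sketch leaves out the commit-graph shortcut and the COMPLETE marking of intermediate tags; it only shows why a tag of a tag of a commit resolves to the commit while a tag of a tree resolves to nothing.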

static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
					       int mark_tags_complete)
{
	enum object_type type;
	unsigned flags = OBJECT_INFO_SKIP_FETCH_OBJECT | OBJECT_INFO_QUICK;
	return deref_without_lazy_fetch_extended(oid, mark_tags_complete,
						 &type, flags);
}

static int rev_list_insert_ref(struct fetch_negotiator *negotiator,
			       const struct object_id *oid)
{
	struct commit *c = deref_without_lazy_fetch(oid, 0);

	if (c)
		negotiator->add_tip(negotiator, c);
	return 0;
}

static int rev_list_insert_ref_oid(const char *refname UNUSED,
				   const struct object_id *oid,
				   int flag UNUSED,
				   void *cb_data)
{
	return rev_list_insert_ref(cb_data, oid);
}

enum ack_type {
	NAK = 0,
	ACK,
	ACK_continue,
	ACK_common,
	ACK_ready
};

static void consume_shallow_list(struct fetch_pack_args *args,
				 struct packet_reader *reader)
{
	if (args->stateless_rpc && args->deepen) {
		/* If we sent a depth we will get back "duplicate"
		 * shallow and unshallow commands every time there
		 * is a block of have lines exchanged.
		 */
		while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
			if (starts_with(reader->line, "shallow "))
				continue;
			if (starts_with(reader->line, "unshallow "))
				continue;
			die(_("git fetch-pack: expected shallow list"));
		}
		if (reader->status != PACKET_READ_FLUSH)
			die(_("git fetch-pack: expected a flush packet after shallow list"));
	}
}

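consume_shallow_list() distinguishes ordinary packets from a flush packet. In Git's pkt-line framing, each packet starts with four hexadecimal digits giving the total length (header included), and "0000" is the flush packet. A rough sketch of decoding that header follows. pkt_header_len() and hex_digit() are hypothetical helpers written for this illustration and are not the parser in pkt-line.c.

```c
#include <stddef.h>

/* Decode one hex digit, or return -1 if c is not a hex digit. */
static int hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: decode the 4-digit pkt-line length header.
 * Returns the encoded length (0 means flush packet), or -1 if the
 * header is not four hex digits. Stops at the first bad character,
 * so a short NUL-terminated string is never read past its end.
 */
static int pkt_header_len(const char *hdr)
{
	int len = 0;
	size_t i;

	for (i = 0; i < 4; i++) {
		int d = hex_digit(hdr[i]);
		if (d < 0)
			return -1;
		len = (len << 4) | d;
	}
	return len;
}
```

For example, "0008NAK\n" decodes to 8: four header bytes plus the four payload bytes "NAK\n". The sketch does not treat the other small special values separately; a real reader would.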
static enum ack_type get_ack(struct packet_reader *reader,
			     struct object_id *result_oid)
{
pkt-line: provide a LARGE_PACKET_MAX static buffer
Most of the callers of packet_read_line just read into a
static 1000-byte buffer (callers which handle arbitrary
binary data already use LARGE_PACKET_MAX). This works fine
in practice, because:
1. The only variable-sized data in these lines is a ref
name, and refs tend to be a lot shorter than 1000
characters.
2. When sending ref lines, git-core always limits itself
to 1000 byte packets.
However, the only limit given in the protocol specification
in Documentation/technical/protocol-common.txt is
LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in
pack-protocol.txt, and then only describing what we write,
not as a specific limit for readers.
This patch lets us bump the 1000-byte limit to
LARGE_PACKET_MAX. Even though git-core will never write a
packet where this makes a difference, there are two good
reasons to do this:
1. Other git implementations may have followed
protocol-common.txt and used a larger maximum size. We
don't bump into it in practice because it would involve
very long ref names.
2. We may want to increase the 1000-byte limit one day.
Since packets are transferred before any capabilities,
it's difficult to do this in a backwards-compatible
way. But if we bump the size of buffer the readers can
handle, eventually older versions of git will be
obsolete enough that we can justify bumping the
writers, as well. We don't have plans to do this
anytime soon, but there is no reason not to start the
clock ticking now.
Just bumping all of the reading bufs to LARGE_PACKET_MAX
would waste memory. Instead, since most readers just read
into a temporary buffer anyway, let's provide a single
static buffer that all callers can use. We can further wrap
this detail away by having the packet_read_line wrapper just
use the buffer transparently and return a pointer to the
static storage. That covers most of the cases, and the
remaining ones already read into their own LARGE_PACKET_MAX
buffers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 21:02:57 +01:00
	int len;
	const char *arg;

	if (packet_reader_read(reader) != PACKET_READ_NORMAL)
		die(_("git fetch-pack: expected ACK/NAK, got a flush packet"));
	len = reader->pktlen;

	if (!strcmp(reader->line, "NAK"))
		return NAK;
	if (skip_prefix(reader->line, "ACK ", &arg)) {
		const char *p;
		if (!parse_oid_hex(arg, result_oid, &p)) {
			len -= p - reader->line;
			if (len < 1)
fetch-pack: fix out-of-bounds buffer offset in get_ack
When we read acks from the remote, we expect either:
ACK <sha1>
or
ACK <sha1> <multi-ack-flag>
We parse the "ACK <sha1>" bit from the line, and then start
looking for the flag strings at "line+45"; if we don't have
them, we assume it's of the first type. But if we do have
the first type, then line+45 is not necessarily inside our
string at all!
It turns out that this works most of the time due to the way
we parse the packets. They should come in with a newline,
and packet_read puts an extra NUL into the buffer, so we end
up with:
ACK <sha1>\n\0
with the newline at offset 44 and the NUL at offset 45. We
then strip the newline, putting a NUL at offset 44. So
when we look at "line+45", we are looking past the end of
our string; but it's OK, because we hit the terminator from
the original string.
This breaks down, however, if the other side does not
terminate their packets with a newline. In that case, our
packet is one character shorter, and we start looking
through uninitialized memory for the flag. No known
implementation sends such a packet, so it has never come up
in practice.
This patch tightens the check by looking for a short,
flagless ACK before trying to parse the flag.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 21:00:28 +01:00
				return ACK;
			if (strstr(p, "continue"))
				return ACK_continue;
			if (strstr(p, "common"))
				return ACK_common;
			if (strstr(p, "ready"))
				return ACK_ready;
			return ACK;
		}
	}
	die(_("git fetch-pack: expected ACK/NAK, got '%s'"), reader->line);
}

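get_ack() accepts either "ACK <oid>" or "ACK <oid> <flag>" and, per the fix described above, checks for the short flagless form before looking for a flag. A simplified stand-alone sketch of that classification follows. It assumes 40-hex-digit object IDs, returns an error value where the real function calls die(), and uses hypothetical names (toy_ack, classify_ack); the real function relies on parse_oid_hex() and the packet length.

```c
#include <ctype.h>
#include <string.h>

enum toy_ack {
	TOY_BAD = -1,
	TOY_NAK,
	TOY_ACK,
	TOY_ACK_continue,
	TOY_ACK_common,
	TOY_ACK_ready
};

#define TOY_HEXSZ 40	/* assumption: SHA-1 sized object IDs */

static enum toy_ack classify_ack(const char *line)
{
	const char *p;
	size_t i;

	if (!strcmp(line, "NAK"))
		return TOY_NAK;
	if (strncmp(line, "ACK ", 4))
		return TOY_BAD;
	p = line + 4;
	/* A NUL fails isxdigit(), so short lines never overrun. */
	for (i = 0; i < TOY_HEXSZ; i++)
		if (!isxdigit((unsigned char)p[i]))
			return TOY_BAD;
	p += TOY_HEXSZ;
	if (!*p)	/* flagless ACK: nothing left to inspect */
		return TOY_ACK;
	if (strstr(p, "continue"))
		return TOY_ACK_continue;
	if (strstr(p, "common"))
		return TOY_ACK_common;
	if (strstr(p, "ready"))
		return TOY_ACK_ready;
	return TOY_ACK;
}
```

The early return for the flagless form is the part that corresponds to the out-of-bounds fix: the flag search only runs when there are bytes after the object ID.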
static void send_request(struct fetch_pack_args *args,
|
|
|
|
int fd, struct strbuf *buf)
|
|
|
|
{
|
|
|
|
if (args->stateless_rpc) {
|
|
|
|
send_sideband(fd, -1, buf->buf, buf->len, LARGE_PACKET_MAX);
|
|
|
|
packet_flush(fd);
|
2019-03-05 05:11:39 +01:00
|
|
|
} else {
|
|
|
|
if (write_in_full(fd, buf->buf, buf->len) < 0)
|
|
|
|
die_errno(_("unable to write to remote"));
|
|
|
|
}
|
2012-10-26 17:53:55 +02:00
|
|
|
}
|
|
|
|
|
2018-06-15 00:54:28 +02:00
|
|
|
static void insert_one_alternate_object(struct fetch_negotiator *negotiator,
|
2018-06-15 00:54:26 +02:00
|
|
|
struct object *obj)
|
2012-10-26 17:53:55 +02:00
|
|
|
{
|
2020-08-18 06:01:35 +02:00
|
|
|
rev_list_insert_ref(negotiator, &obj->oid);
|
2012-10-26 17:53:55 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
#define INITIAL_FLUSH 16
|
|
|
|
#define PIPESAFE_FLUSH 32
|
2016-07-19 00:21:38 +02:00
|
|
|
#define LARGE_FLUSH 16384
|
2012-10-26 17:53:55 +02:00
|
|
|
|
2018-03-15 18:31:28 +01:00
|
|
|
static int next_flush(int stateless_rpc, int count)
|
2012-10-26 17:53:55 +02:00
|
|
|
{
|
2018-03-15 18:31:28 +01:00
|
|
|
if (stateless_rpc) {
|
2016-07-19 00:21:38 +02:00
|
|
|
if (count < LARGE_FLUSH)
|
|
|
|
count <<= 1;
|
|
|
|
else
|
|
|
|
count = count * 11 / 10;
|
|
|
|
} else {
|
|
|
|
if (count < PIPESAFE_FLUSH)
|
|
|
|
count <<= 1;
|
|
|
|
else
|
|
|
|
count += PIPESAFE_FLUSH;
|
|
|
|
}
|
2012-10-26 17:53:55 +02:00
|
|
|
return count;
|
|
|
|
}
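To make the growth of the flush window concrete, the following standalone sketch repeats the constants and arithmetic of next_flush() above and records the first few flush points. `flush_schedule` is a hypothetical helper written for this illustration; it assumes, as in the negotiation loop, that the count passed in equals the previous flush point.

```c
#include <stddef.h>

#define INITIAL_FLUSH 16
#define PIPESAFE_FLUSH 32
#define LARGE_FLUSH 16384

/* Same arithmetic as next_flush() in the listing. */
static int next_flush(int stateless_rpc, int count)
{
	if (stateless_rpc) {
		if (count < LARGE_FLUSH)
			count <<= 1;              /* double while small */
		else
			count = count * 11 / 10;  /* then grow by 10% */
	} else {
		if (count < PIPESAFE_FLUSH)
			count <<= 1;              /* double once */
		else
			count += PIPESAFE_FLUSH;  /* then grow linearly */
	}
	return count;
}

/* Hypothetical helper: fill out[] with the first n flush points. */
static void flush_schedule(int stateless_rpc, int *out, size_t n)
{
	int flush_at = INITIAL_FLUSH;
	size_t i;

	for (i = 0; i < n; i++) {
		out[i] = flush_at;
		flush_at = next_flush(stateless_rpc, flush_at);
	}
}
```

Over a stateful connection the window grows linearly in steps of PIPESAFE_FLUSH once it reaches that size, while stateless RPC doubles up to LARGE_FLUSH and then grows by a tenth per round.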
|
|
|
|
|
2018-07-03 00:39:44 +02:00
|
|
|
static void mark_tips(struct fetch_negotiator *negotiator,
|
|
|
|
const struct oid_array *negotiation_tips)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
if (!negotiation_tips) {
|
2020-08-18 06:01:35 +02:00
|
|
|
for_each_rawref(rev_list_insert_ref_oid, negotiator);
|
2018-07-03 00:39:44 +02:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (i = 0; i < negotiation_tips->nr; i++)
|
2020-08-18 06:01:35 +02:00
|
|
|
rev_list_insert_ref(negotiator, &negotiation_tips->oid[i]);
|
2018-07-03 00:39:44 +02:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2022-07-26 18:27:11 +02:00
|
|
|
static void send_filter(struct fetch_pack_args *args,
|
|
|
|
struct strbuf *req_buf,
|
|
|
|
int server_supports_filter)
|
|
|
|
{
|
|
|
|
if (args->filter_options.choice) {
|
|
|
|
const char *spec =
|
|
|
|
expand_list_objects_filter_spec(&args->filter_options);
|
|
|
|
if (server_supports_filter) {
|
|
|
|
print_verbose(args, _("Server supports filter"));
|
|
|
|
packet_buf_write(req_buf, "filter %s", spec);
|
|
|
|
trace2_data_string("fetch", the_repository,
|
|
|
|
"filter/effective", spec);
|
|
|
|
} else {
|
|
|
|
warning("filtering not recognized by server, ignoring");
|
|
|
|
trace2_data_string("fetch", the_repository,
|
|
|
|
"filter/unsupported", spec);
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
trace2_data_string("fetch", the_repository,
|
|
|
|
"filter/none", "");
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2018-06-15 00:54:28 +02:00
|
|
|
static int find_common(struct fetch_negotiator *negotiator,
|
2018-06-15 00:54:26 +02:00
|
|
|
struct fetch_pack_args *args,
|
2017-05-01 04:28:54 +02:00
|
|
|
int fd[2], struct object_id *result_oid,
|
2012-10-26 17:53:55 +02:00
|
|
|
struct ref *refs)
|
|
|
|
{
|
|
|
|
int fetching;
|
|
|
|
int count = 0, flushes = 0, flush_at = INITIAL_FLUSH, retval;
|
fetch-pack: add tracing for negotiation rounds
Currently, negotiation for V0/V1/V2 fetch has trace2 regions covering
the entire negotiation process. However, we'd like additional data, such
as timing for each round of negotiation or the number of "haves" in each
round. Additionally, "independent negotiation" (AKA push negotiation)
has no tracing at all. Having this data would allow us to compare the
performance of the various negotiation implementations, and to debug
unexpectedly slow fetch & push sessions.
Add per-round trace2 regions for all negotiation implementations (V0+V1,
V2, and independent negotiation), as well as an overall region for
independent negotiation. Add trace2 data logging for the number of haves
and "in vain" objects for each round, and for the total number of rounds
once negotiation completes. Finally, add a few checks into various
tests to verify that the number of rounds is logged as expected.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 00:04:05 +02:00
|
|
|
int negotiation_round = 0, haves = 0;
|
2017-05-01 04:28:54 +02:00
|
|
|
const struct object_id *oid;
|
2012-10-26 17:53:55 +02:00
|
|
|
unsigned in_vain = 0;
|
|
|
|
int got_continue = 0;
|
|
|
|
int got_ready = 0;
|
|
|
|
struct strbuf req_buf = STRBUF_INIT;
|
|
|
|
size_t state_len = 0;
|
2018-12-29 22:19:14 +01:00
|
|
|
struct packet_reader reader;
|
2012-10-26 17:53:55 +02:00
|
|
|
|
|
|
|
if (args->stateless_rpc && multi_ack == 1)
|
2022-01-05 21:02:19 +01:00
|
|
|
die(_("the option '%s' requires '%s'"), "--stateless-rpc", "multi_ack_detailed");
|
2012-10-26 17:53:55 +02:00
|
|
|
|
2018-12-29 22:19:14 +01:00
|
|
|
packet_reader_init(&reader, fd[0], NULL, 0,
|
pack-protocol.txt: accept error packets in any context
In the Git pack protocol definition, an error packet may appear only in
a certain context. However, servers can face a runtime error (e.g. I/O
error) at an arbitrary time. This patch changes the protocol to allow
an error packet to be sent instead of any packet.
Without this protocol spec change, when a server cannot process a
request, there's no way to tell that to a client. Since the server
cannot produce a valid response, it would be forced to cut a connection
without telling why. With this protocol spec change, the server can be
more gentle in this situation. An old client may see these error packets
as an unexpected packet, but this is not worse than having an unexpected
EOF.
Following this protocol spec change, the error packet handling code is
moved to pkt-line.c. Implementation wise, this implementation uses
pkt-line to communicate with a subprocess. Since this is not a part of
Git protocol, it's possible that a packet that is not supposed to be an
error packet is mistakenly parsed as an error packet. This error packet
handling is enabled only for the Git pack protocol parsing code
considering this.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-29 22:19:15 +01:00
|
|
|
PACKET_READ_CHOMP_NEWLINE |
|
|
|
|
PACKET_READ_DIE_ON_ERR_PACKET);
|
2018-12-29 22:19:14 +01:00
|
|
|
|
2020-08-18 06:01:37 +02:00
|
|
|
mark_tips(negotiator, args->negotiation_tips);
|
|
|
|
for_each_cached_alternate(negotiator, insert_one_alternate_object);
|
2012-10-26 17:53:55 +02:00
|
|
|
|
|
|
|
fetching = 0;
|
|
|
|
for ( ; refs ; refs = refs->next) {
|
2017-05-01 04:28:54 +02:00
|
|
|
struct object_id *remote = &refs->old_oid;
|
2012-10-26 17:53:55 +02:00
|
|
|
const char *remote_hex;
|
|
|
|
struct object *o;
|
|
|
|
|
2022-03-28 16:02:06 +02:00
|
|
|
if (!args->refetch) {
|
|
|
|
/*
|
|
|
|
* If that object is complete (i.e. it is an ancestor of a
|
|
|
|
* local ref), we tell them we have it but do not have to
|
|
|
|
* tell them about its ancestors, which they already know
|
|
|
|
* about.
|
|
|
|
*
|
|
|
|
* We use lookup_object here because we are only
|
|
|
|
* interested in the case we *know* the object is
|
|
|
|
* reachable and we have already scanned it.
|
|
|
|
*/
|
|
|
|
if (((o = lookup_object(the_repository, remote)) != NULL) &&
|
|
|
|
(o->flags & COMPLETE)) {
|
|
|
|
continue;
|
|
|
|
}
|
2012-10-26 17:53:55 +02:00
|
|
|
}
|
|
|
|
|
2017-05-01 04:28:54 +02:00
|
|
|
remote_hex = oid_to_hex(remote);
|
2012-10-26 17:53:55 +02:00
|
|
|
if (!fetching) {
|
|
|
|
struct strbuf c = STRBUF_INIT;
|
|
|
|
if (multi_ack == 2) strbuf_addstr(&c, " multi_ack_detailed");
|
|
|
|
if (multi_ack == 1) strbuf_addstr(&c, " multi_ack");
|
|
|
|
if (no_done) strbuf_addstr(&c, " no-done");
|
|
|
|
if (use_sideband == 2) strbuf_addstr(&c, " side-band-64k");
|
|
|
|
if (use_sideband == 1) strbuf_addstr(&c, " side-band");
|
fetch, upload-pack: --deepen=N extends shallow boundary by N commits
In git-fetch, the --depth argument is always relative to the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deepen-relative to the
server.
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-12 12:54:09 +02:00
|
|
|
if (args->deepen_relative) strbuf_addstr(&c, " deepen-relative");
|
2012-10-26 17:53:55 +02:00
|
|
|
if (args->use_thin_pack) strbuf_addstr(&c, " thin-pack");
|
|
|
|
if (args->no_progress) strbuf_addstr(&c, " no-progress");
|
|
|
|
if (args->include_tag) strbuf_addstr(&c, " include-tag");
|
|
|
|
if (prefer_ofs_delta) strbuf_addstr(&c, " ofs-delta");
|
2016-06-12 12:53:59 +02:00
|
|
|
if (deepen_since_ok) strbuf_addstr(&c, " deepen-since");
|
2016-06-12 12:54:04 +02:00
|
|
|
if (deepen_not_ok) strbuf_addstr(&c, " deepen-not");
|
2012-10-26 17:53:55 +02:00
|
|
|
if (agent_supported) strbuf_addf(&c, " agent=%s",
|
|
|
|
git_user_agent_sanitized());
|
2020-11-12 00:29:31 +01:00
|
|
|
if (advertise_sid)
|
|
|
|
strbuf_addf(&c, " session-id=%s", trace2_session_id());
|
2017-12-08 16:58:40 +01:00
|
|
|
if (args->filter_options.choice)
|
|
|
|
strbuf_addstr(&c, " filter");
|
2012-10-26 17:53:55 +02:00
|
|
|
packet_buf_write(&req_buf, "want %s%s\n", remote_hex, c.buf);
|
|
|
|
strbuf_release(&c);
|
|
|
|
} else
|
|
|
|
packet_buf_write(&req_buf, "want %s\n", remote_hex);
|
|
|
|
fetching++;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!fetching) {
|
|
|
|
strbuf_release(&req_buf);
|
|
|
|
packet_flush(fd[1]);
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
2018-05-18 00:51:46 +02:00
|
|
|
if (is_repository_shallow(the_repository))
|
2013-12-05 14:02:34 +01:00
|
|
|
write_shallow_commits(&req_buf, 1, NULL);
|
2012-10-26 17:53:55 +02:00
|
|
|
if (args->depth > 0)
|
|
|
|
packet_buf_write(&req_buf, "deepen %d", args->depth);
|
2016-06-12 12:53:59 +02:00
|
|
|
if (args->deepen_since) {
|
2017-04-26 21:29:31 +02:00
|
|
|
timestamp_t max_age = approxidate(args->deepen_since);
|
2017-04-21 12:45:48 +02:00
|
|
|
packet_buf_write(&req_buf, "deepen-since %"PRItime, max_age);
|
2016-06-12 12:53:59 +02:00
|
|
|
}
|
2016-06-12 12:54:04 +02:00
|
|
|
if (args->deepen_not) {
|
|
|
|
int i;
|
|
|
|
for (i = 0; i < args->deepen_not->nr; i++) {
|
|
|
|
struct string_list_item *s = args->deepen_not->items + i;
|
|
|
|
packet_buf_write(&req_buf, "deepen-not %s", s->string);
|
|
|
|
}
|
|
|
|
}
|
2022-07-26 18:27:11 +02:00
|
|
|
send_filter(args, &req_buf, server_supports_filtering);
|
2012-10-26 17:53:55 +02:00
|
|
|
packet_buf_flush(&req_buf);
|
|
|
|
state_len = req_buf.len;
|
|
|
|
|
2016-06-12 12:53:56 +02:00
|
|
|
if (args->deepen) {
|
use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-18 21:47:50 +02:00
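The idiom in the commit message above can be sketched in isolation as follows. This is an illustrative stand-in, not git's own helper: `skip_prefix_sketch` is a hypothetical name, and the behavior shown (leave `*out` untouched on a mismatch) is an assumption chosen for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * If str begins with prefix, store a pointer to the remainder in *out
 * and return 1. Otherwise return 0 and leave *out untouched.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(str, prefix, n))
		return 0;
	*out = str + n;
	return 1;
}
```

Because the helper measures the prefix itself, the caller never has to count characters by hand, so the match and the offset cannot drift apart when the prefix string changes.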
|
|
|
const char *arg;
|
2017-05-01 04:28:54 +02:00
|
|
|
struct object_id oid;
|
2012-10-26 17:53:55 +02:00
|
|
|
|
|
|
|
send_request(args, fd[1], &req_buf);
|
2018-12-29 22:19:14 +01:00
|
|
|
while (packet_reader_read(&reader) == PACKET_READ_NORMAL) {
|
|
|
|
if (skip_prefix(reader.line, "shallow ", &arg)) {
|
2017-05-01 04:28:54 +02:00
|
|
|
if (get_oid_hex(arg, &oid))
|
2018-12-29 22:19:14 +01:00
|
|
|
die(_("invalid shallow line: %s"), reader.line);
|
2018-05-18 00:51:44 +02:00
|
|
|
register_shallow(the_repository, &oid);
|
2012-10-26 17:53:55 +02:00
|
|
|
continue;
|
|
|
|
}
|
2018-12-29 22:19:14 +01:00
|
|
|
if (skip_prefix(reader.line, "unshallow ", &arg)) {
|
2017-05-01 04:28:54 +02:00
|
|
|
if (get_oid_hex(arg, &oid))
|
2018-12-29 22:19:14 +01:00
|
|
|
die(_("invalid unshallow line: %s"), reader.line);
|
2019-06-20 09:41:14 +02:00
|
|
|
if (!lookup_object(the_repository, &oid))
|
2018-12-29 22:19:14 +01:00
|
|
|
die(_("object not found: %s"), reader.line);
|
2012-10-26 17:53:55 +02:00
|
|
|
/* make sure that it is parsed as shallow */
|
2018-06-29 03:21:51 +02:00
|
|
|
if (!parse_object(the_repository, &oid))
|
2018-12-29 22:19:14 +01:00
|
|
|
die(_("error in object: %s"), reader.line);
|
2017-05-07 00:10:06 +02:00
|
|
|
if (unregister_shallow(&oid))
|
2018-12-29 22:19:14 +01:00
|
|
|
die(_("no shallow found: %s"), reader.line);
|
2012-10-26 17:53:55 +02:00
|
|
|
continue;
|
|
|
|
}
|
2018-12-29 22:19:14 +01:00
|
|
|
die(_("expected shallow/unshallow, got %s"), reader.line);
|
2012-10-26 17:53:55 +02:00
|
|
|
}
|
|
|
|
} else if (!args->stateless_rpc)
|
|
|
|
send_request(args, fd[1], &req_buf);
|
|
|
|
|
|
|
|
if (!args->stateless_rpc) {
|
|
|
|
/* If we aren't using the stateless-rpc interface
|
|
|
|
* we don't need to retain the headers.
|
|
|
|
*/
|
|
|
|
strbuf_setlen(&req_buf, 0);
|
|
|
|
state_len = 0;
|
|
|
|
}
|
|
|
|
|
2019-10-03 01:49:28 +02:00
|
|
|
trace2_region_enter("fetch-pack", "negotiation_v0_v1", the_repository);
|
2012-10-26 17:53:55 +02:00
|
|
|
flushes = 0;
|
|
|
|
retval = -1;
|
2018-06-15 00:54:28 +02:00
|
|
|
while ((oid = negotiator->next(negotiator))) {
|
2017-05-01 04:28:54 +02:00
|
|
|
packet_buf_write(&req_buf, "have %s\n", oid_to_hex(oid));
|
|
|
|
print_verbose(args, "have %s", oid_to_hex(oid));
|
2012-10-26 17:53:55 +02:00
|
|
|
in_vain++;
|
2022-08-03 00:04:05 +02:00
|
|
|
haves++;
|
2012-10-26 17:53:55 +02:00
|
|
|
if (flush_at <= ++count) {
|
|
|
|
int ack;
|
|
|
|
|
2022-08-03 00:04:05 +02:00
|
|
|
negotiation_round++;
|
|
|
|
trace2_region_enter_printf("negotiation_v0_v1", "round",
|
|
|
|
the_repository, "%d",
|
|
|
|
negotiation_round);
|
|
|
|
trace2_data_intmax("negotiation_v0_v1", the_repository,
|
|
|
|
"haves_added", haves);
|
|
|
|
trace2_data_intmax("negotiation_v0_v1", the_repository,
|
|
|
|
"in_vain", in_vain);
|
|
|
|
haves = 0;
|
2012-10-26 17:53:55 +02:00
|
|
|
packet_buf_flush(&req_buf);
|
|
|
|
send_request(args, fd[1], &req_buf);
|
|
|
|
strbuf_setlen(&req_buf, state_len);
|
|
|
|
flushes++;
|
2018-03-15 18:31:28 +01:00
|
|
|
flush_at = next_flush(args->stateless_rpc, count);
|
2012-10-26 17:53:55 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We keep one window "ahead" of the other side, and
|
|
|
|
* will wait for an ACK only on the next one
|
|
|
|
*/
|
|
|
|
if (!args->stateless_rpc && count == INITIAL_FLUSH)
|
|
|
|
continue;
|
|
|
|
|
2018-12-29 22:19:14 +01:00
|
|
|
consume_shallow_list(args, &reader);
|
2012-10-26 17:53:55 +02:00
|
|
|
do {
|
2018-12-29 22:19:14 +01:00
|
|
|
ack = get_ack(&reader, result_oid);
|
2016-06-12 12:53:54 +02:00
|
|
|
if (ack)
|
2016-06-12 12:53:55 +02:00
|
|
|
print_verbose(args, _("got %s %d %s"), "ack",
|
2017-05-01 04:28:54 +02:00
|
|
|
ack, oid_to_hex(result_oid));
|
2012-10-26 17:53:55 +02:00
|
|
|
switch (ack) {
|
|
|
|
case ACK:
|
2022-08-03 00:04:05 +02:00
|
|
|
trace2_region_leave_printf("negotiation_v0_v1", "round",
|
|
|
|
the_repository, "%d",
|
|
|
|
negotiation_round);
|
2012-10-26 17:53:55 +02:00
|
|
|
flushes = 0;
|
|
|
|
multi_ack = 0;
|
|
|
|
retval = 0;
|
|
|
|
goto done;
|
|
|
|
case ACK_common:
|
|
|
|
case ACK_ready:
|
|
|
|
case ACK_continue: {
|
|
|
|
struct commit *commit =
|
2018-06-29 03:21:59 +02:00
|
|
|
lookup_commit(the_repository,
|
|
|
|
result_oid);
|
2018-06-15 00:54:27 +02:00
|
|
|
int was_common;
|
2018-08-03 00:30:42 +02:00
|
|
|
|
2012-10-26 17:53:55 +02:00
|
|
|
if (!commit)
|
2017-05-01 04:28:54 +02:00
|
|
|
die(_("invalid commit %s"), oid_to_hex(result_oid));
|
2018-06-15 00:54:28 +02:00
|
|
|
was_common = negotiator->ack(negotiator, commit);
|
2012-10-26 17:53:55 +02:00
|
|
|
if (args->stateless_rpc
|
|
|
|
&& ack == ACK_common
|
2018-06-15 00:54:27 +02:00
|
|
|
&& !was_common) {
|
2012-10-26 17:53:55 +02:00
|
|
|
/* We need to replay the have for this object
|
|
|
|
* on the next RPC request so the peer knows
|
|
|
|
* it is in common with us.
|
|
|
|
*/
|
2017-05-01 04:28:54 +02:00
|
|
|
const char *hex = oid_to_hex(result_oid);
|
2012-10-26 17:53:55 +02:00
|
|
|
packet_buf_write(&req_buf, "have %s\n", hex);
|
|
|
|
state_len = req_buf.len;
|
2022-08-03 00:04:05 +02:00
|
|
|
haves++;
|
2016-09-23 19:41:35 +02:00
|
|
|
/*
|
|
|
|
* Reset in_vain because an ack
|
|
|
|
* for this commit has not been
|
|
|
|
* seen.
|
|
|
|
*/
|
|
|
|
in_vain = 0;
|
|
|
|
} else if (!args->stateless_rpc
|
|
|
|
|| ack != ACK_common)
|
|
|
|
in_vain = 0;
|
2012-10-26 17:53:55 +02:00
|
|
|
retval = 0;
|
|
|
|
got_continue = 1;
|
2018-06-15 00:54:24 +02:00
|
|
|
if (ack == ACK_ready)
|
2012-10-26 17:53:55 +02:00
|
|
|
got_ready = 1;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|