Git Source Code Mirror - This is a publish-only repository and all pull requests are ignored. Please follow Documentation/SubmittingPatches procedure for any of your improvements. https://git-scm.com/
git/remote-curl.c

1621 lines
42 KiB

#include "cache.h"
#include "config.h"
#include "remote.h"
#include "connect.h"
#include "strbuf.h"
#include "walker.h"
#include "http.h"
#include "exec-cmd.h"
#include "run-command.h"
#include "pkt-line.h"
#include "string-list.h"
#include "sideband.h"
#include "strvec.h"
#include "credential.h"
#include "oid-array.h"
#include "send-pack.h"
#include "protocol.h"
#include "quote.h"
#include "transport.h"
static struct remote *remote;
/* always ends with a trailing slash */
static struct strbuf url = STRBUF_INIT;
struct options {
	int verbosity;
	unsigned long depth;
	char *deepen_since;
	struct string_list deepen_not;
	struct string_list push_options;
	char *filter;
	unsigned progress : 1,
		check_self_contained_and_connected : 1,
		cloning : 1,
		update_shallow : 1,
		followtags : 1,
		dry_run : 1,
		thin : 1,
		/* One of the SEND_PACK_PUSH_CERT_* constants. */
		push_cert : 2,
		deepen_relative : 1,
		/* see documentation of corresponding flag in fetch-pack.h */
		from_promisor : 1,
		refetch : 1,
		atomic : 1,
		object_format : 1,
		force_if_includes : 1;
	const struct git_hash_algo *hash_algo;
};
static struct options options;
static struct string_list cas_options = STRING_LIST_INIT_DUP;
static int set_option(const char *name, const char *value)
{
	if (!strcmp(name, "verbosity")) {
		char *end;
		int v = strtol(value, &end, 10);
		if (value == end || *end)
			return -1;
		options.verbosity = v;
		return 0;
	}
	else if (!strcmp(name, "progress")) {
		if (!strcmp(value, "true"))
			options.progress = 1;
		else if (!strcmp(value, "false"))
			options.progress = 0;
		else
			return -1;
		return 0;
	}
	else if (!strcmp(name, "depth")) {
		char *end;
		unsigned long v = strtoul(value, &end, 10);
		if (value == end || *end)
			return -1;
		options.depth = v;
		return 0;
	}
	else if (!strcmp(name, "deepen-since")) {
		options.deepen_since = xstrdup(value);
		return 0;
	}
	else if (!strcmp(name, "deepen-not")) {
		string_list_append(&options.deepen_not, value);
		return 0;
	}
	else if (!strcmp(name, "deepen-relative")) {
		if (!strcmp(value, "true"))
			options.deepen_relative = 1;
		else if (!strcmp(value, "false"))
			options.deepen_relative = 0;
		else
			return -1;
		return 0;
	}
	else if (!strcmp(name, "followtags")) {
		if (!strcmp(value, "true"))
			options.followtags = 1;
		else if (!strcmp(value, "false"))
			options.followtags = 0;
		else
			return -1;
		return 0;
	}
	else if (!strcmp(name, "dry-run")) {
		if (!strcmp(value, "true"))
			options.dry_run = 1;
		else if (!strcmp(value, "false"))
			options.dry_run = 0;
		else
			return -1;
		return 0;
	}
	else if (!strcmp(name, "check-connectivity")) {
		if (!strcmp(value, "true"))
			options.check_self_contained_and_connected = 1;
		else if (!strcmp(value, "false"))
			options.check_self_contained_and_connected = 0;
		else
			return -1;
		return 0;
	}
	else if (!strcmp(name, "cas")) {
		struct strbuf val = STRBUF_INIT;
		strbuf_addstr(&val, "--force-with-lease=");
		if (*value != '"')
			strbuf_addstr(&val, value);
		else if (unquote_c_style(&val, value, NULL)) {
			/* release the partially built value before bailing out */
			strbuf_release(&val);
			return -1;
		}
		string_list_append(&cas_options, val.buf);
		strbuf_release(&val);
		return 0;
	} else if (!strcmp(name, TRANS_OPT_FORCE_IF_INCLUDES)) {
		if (!strcmp(value, "true"))
			options.force_if_includes = 1;
		else if (!strcmp(value, "false"))
			options.force_if_includes = 0;
		else
			return -1;
		return 0;
	} else if (!strcmp(name, "cloning")) {
		if (!strcmp(value, "true"))
			options.cloning = 1;
		else if (!strcmp(value, "false"))
			options.cloning = 0;
		else
			return -1;
		return 0;
	} else if (!strcmp(name, "update-shallow")) {
		if (!strcmp(value, "true"))
			options.update_shallow = 1;
		else if (!strcmp(value, "false"))
			options.update_shallow = 0;
		else
			return -1;
		return 0;
	} else if (!strcmp(name, "pushcert")) {
		if (!strcmp(value, "true"))
			options.push_cert = SEND_PACK_PUSH_CERT_ALWAYS;
		else if (!strcmp(value, "false"))
			options.push_cert = SEND_PACK_PUSH_CERT_NEVER;
		else if (!strcmp(value, "if-asked"))
			options.push_cert = SEND_PACK_PUSH_CERT_IF_ASKED;
		else
			return -1;
		return 0;
	} else if (!strcmp(name, "atomic")) {
		if (!strcmp(value, "true"))
			options.atomic = 1;
		else if (!strcmp(value, "false"))
			options.atomic = 0;
		else
			return -1;
		return 0;
	} else if (!strcmp(name, "push-option")) {
		if (*value != '"')
			string_list_append(&options.push_options, value);
		else {
			struct strbuf unquoted = STRBUF_INIT;
			if (unquote_c_style(&unquoted, value, NULL) < 0)
				die(_("invalid quoting in push-option value: '%s'"), value);
			string_list_append_nodup(&options.push_options,
						 strbuf_detach(&unquoted, NULL));
		}
		return 0;
	} else if (!strcmp(name, "family")) {
		if (!strcmp(value, "ipv4"))
			git_curl_ipresolve = CURL_IPRESOLVE_V4;
		else if (!strcmp(value, "ipv6"))
			git_curl_ipresolve = CURL_IPRESOLVE_V6;
		else if (!strcmp(value, "all"))
			git_curl_ipresolve = CURL_IPRESOLVE_WHATEVER;
		else
			return -1;
		return 0;
	} else if (!strcmp(name, "from-promisor")) {
		options.from_promisor = 1;
		return 0;
	} else if (!strcmp(name, "refetch")) {
		options.refetch = 1;
		return 0;
	} else if (!strcmp(name, "filter")) {
		options.filter = xstrdup(value);
		return 0;
	} else if (!strcmp(name, "object-format")) {
		int algo;
		options.object_format = 1;
		if (strcmp(value, "true")) {
			algo = hash_algo_by_name(value);
			if (algo == GIT_HASH_UNKNOWN)
				die("unknown object format '%s'", value);
			options.hash_algo = &hash_algos[algo];
		}
		return 0;
	} else {
		return 1 /* unsupported */;
	}
}
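The option handlers in set_option() repeat two validation patterns: a strict "true"/"false" comparison, and an end-pointer check after strtol() that rejects empty input and trailing garbage. As a minimal standalone sketch (the helper names parse_bool_value and parse_int_value are hypothetical and are not part of remote-curl.c), the two patterns could be isolated like this:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of remote-curl.c: accept only the
 * literal strings "true" and "false", as the boolean options do.
 * Returns 0 on success and -1 on any other value.
 */
int parse_bool_value(const char *value, int *out)
{
	if (!strcmp(value, "true"))
		*out = 1;
	else if (!strcmp(value, "false"))
		*out = 0;
	else
		return -1;
	return 0;
}

/*
 * Hypothetical helper: parse a base-10 integer the way the
 * "verbosity" branch does. "value == end" catches input with no
 * digits at all; "*end" catches trailing characters such as "12x".
 */
int parse_int_value(const char *value, int *out)
{
	char *end;
	long v = strtol(value, &end, 10);

	if (value == end || *end)
		return -1;
	*out = (int)v;
	return 0;
}
```

Returning -1 for a malformed value while reserving a separate code for an unknown option name (set_option() returns 1 for that case) lets the caller tell "bad value" apart from "unsupported option".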
struct discovery {
	char *service;
	char *buf_alloc;
	char *buf;
	size_t len;
	struct ref *refs;
	struct oid_array shallow;
	enum protocol_version version;
	unsigned proto_git : 1;
};
static struct discovery *last_discovery;
static struct ref *parse_git_refs(struct discovery *heads, int for_push)
{
	struct ref *list = NULL;
	struct packet_reader reader;

	packet_reader_init(&reader, -1, heads->buf, heads->len,
			   PACKET_READ_CHOMP_NEWLINE |
			   PACKET_READ_GENTLE_ON_EOF |
			   PACKET_READ_DIE_ON_ERR_PACKET);

	heads->version = discover_version(&reader);
	switch (heads->version) {
	case protocol_v2:
		/*
		 * Do nothing. This isn't a list of refs but rather a
		 * capability advertisement. Client would have run
		 * 'stateless-connect' so we'll dump this capability listing
		 * and let them request the refs themselves.
		 */
		break;
	case protocol_v1:
	case protocol_v0:
		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
				 NULL, &heads->shallow);
		options.hash_algo = reader.hash_algo;
		break;
	case protocol_unknown_version:
		BUG("unknown protocol version");
	}

	return list;
}
static const struct git_hash_algo *detect_hash_algo(struct discovery *heads)
{
	const char *p = memchr(heads->buf, '\t', heads->len);
	int algo;

	if (!p)
		return the_hash_algo;

	algo = hash_algo_by_length((p - heads->buf) / 2);
	if (algo == GIT_HASH_UNKNOWN)
		return NULL;
	return &hash_algos[algo];
}
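detect_hash_algo() infers the hash algorithm from the width of the first hex object ID: the offset of the first tab is the hex length, and half of that is the raw digest size. A minimal standalone sketch of the same idea follows. The function guess_hash_name and the "default" placeholder are hypothetical; the real code returns the_hash_algo when no tab is found and uses hash_algo_by_length() for the lookup. The sizes assumed here are 20 raw bytes for SHA-1 and 32 for SHA-256.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for detect_hash_algo(): map the hex width of
 * the first object ID in a dumb "info/refs" body to an algorithm name.
 * Returns NULL when the width matches no known digest size.
 */
const char *guess_hash_name(const char *buf, size_t len)
{
	const char *tab = memchr(buf, '\t', len);
	size_t rawsz;

	if (!tab)
		return "default"; /* placeholder for the repository default */

	rawsz = (size_t)(tab - buf) / 2; /* two hex digits per raw byte */
	if (rawsz == 20)
		return "sha1";
	if (rawsz == 32)
		return "sha256";
	return NULL;
}
```

A NULL result corresponds to the case where parse_info_refs() dies with "could not determine hash algorithm".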
static struct ref *parse_info_refs(struct discovery *heads)
{
	char *data, *start, *mid;
	char *ref_name;
	int i = 0;

	struct ref *refs = NULL;
	struct ref *ref = NULL;
	struct ref *last_ref = NULL;

	options.hash_algo = detect_hash_algo(heads);
	if (!options.hash_algo)
		die("%sinfo/refs not valid: could not determine hash algorithm; "
		    "is this a git repository?",
		    transport_anonymize_url(url.buf));

	data = heads->buf;
	start = NULL;
	mid = data;
	while (i < heads->len) {
		if (!start) {
			start = &data[i];
		}
		if (data[i] == '\t')
			mid = &data[i];
		if (data[i] == '\n') {
			if (mid - start != options.hash_algo->hexsz)
				die(_("%sinfo/refs not valid: is this a git repository?"),
				    transport_anonymize_url(url.buf));
			data[i] = 0;
			ref_name = mid + 1;
			ref = alloc_ref(ref_name);
			get_oid_hex_algop(start, &ref->old_oid, options.hash_algo);
			if (!refs)
				refs = ref;
			if (last_ref)
				last_ref->next = ref;
			last_ref = ref;
			start = NULL;
		}
		i++;
	}

	ref = alloc_ref("HEAD");
	if (!http_fetch_ref(url.buf, ref) &&
	    !resolve_remote_symref(ref, refs)) {
		ref->next = refs;
		refs = ref;
	} else {
		free(ref);
	}

	return refs;
}
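The loop in parse_info_refs() walks the dumb-HTTP "info/refs" body one byte at a time, remembering where each line starts and where its tab sits, and rejects any line whose hex field does not have the expected width. The following is a minimal standalone sketch of that scan. The struct ref_line and the function split_info_refs are hypothetical names; unlike the original, the sketch also overwrites the tab with a NUL so that both fields become C strings, and it returns -1 instead of calling die().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: both pointers point into the caller's buffer. */
struct ref_line {
	const char *hex;
	const char *name;
};

/*
 * Split "<hex>\t<refname>\n" lines in place. Returns the number of
 * lines stored in out[], or -1 if a line is malformed or out[] is full.
 */
int split_info_refs(char *buf, size_t len, size_t hexsz,
		    struct ref_line *out, int max)
{
	char *start = NULL, *mid = NULL;
	int n = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		if (!start)
			start = &buf[i];
		if (buf[i] == '\t')
			mid = &buf[i];
		if (buf[i] == '\n') {
			/* the hex field must span exactly hexsz bytes */
			if (!mid || (size_t)(mid - start) != hexsz || n == max)
				return -1;
			*mid = '\0';
			buf[i] = '\0';
			out[n].hex = start;
			out[n].name = mid + 1;
			n++;
			start = NULL;
			mid = NULL;
		}
	}
	return n;
}
```

Splitting in place avoids copying the response body; the trade-off is that the buffer must stay alive for as long as the parsed entries are in use.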
static void free_discovery(struct discovery *d)
{
	if (d) {
		if (d == last_discovery)
			last_discovery = NULL;
		free(d->shallow.oid);
		free(d->buf_alloc);
		free_refs(d->refs);
		free(d->service);
		free(d);
	}
}
static int show_http_message(struct strbuf *type, struct strbuf *charset,
			     struct strbuf *msg)
{
	const char *p, *eol;

	/*
	 * We only show text/plain parts, as other types are likely
	 * to be ugly to look at on the user's terminal.
	 */
	if (strcmp(type->buf, "text/plain"))
		return -1;
	if (charset->len)
		strbuf_reencode(msg, charset->buf, get_log_output_encoding());
strbuf_trim(msg);
if (!msg->len)
return -1;
p = msg->buf;
do {
eol = strchrnul(p, '\n');
fprintf(stderr, "remote: %.*s\n", (int)(eol - p), p);
p = eol + 1;
} while(*eol);
return 0;
}
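The line-splitting loop above can be tried outside of Git with a small sketch. This is a simplified stand-in, not the function itself: `prefix_lines` is a hypothetical helper that writes into a caller-supplied buffer instead of stderr, it does no trimming or re-encoding, and it uses `strchr` rather than the GNU `strchrnul` so that it builds against a plain C library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy msg into out, putting prefix in front of
 * every line and a newline after it. Returns the number of lines
 * written, or -1 if out is too small to hold the result.
 */
static int prefix_lines(const char *msg, const char *prefix,
			char *out, size_t outsz)
{
	const char *p = msg;
	size_t used = 0;
	int lines = 0;

	assert(msg && prefix && out && outsz > 0);
	out[0] = '\0';

	for (;;) {
		/* Find the end of the current line (or of the string). */
		const char *eol = strchr(p, '\n');
		size_t len = eol ? (size_t)(eol - p) : strlen(p);
		int n = snprintf(out + used, outsz - used, "%s%.*s\n",
				 prefix, (int)len, p);

		/* snprintf reports truncation by returning >= the space left. */
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
		lines++;

		if (!eol)
			break;
		p = eol + 1;
	}
	return lines;
}
```

Calling `prefix_lines("a\nb", "remote: ", out, sizeof(out))` fills `out` with two prefixed lines. Writing to a buffer rather than directly to stderr is only a choice made here to keep the sketch easy to check.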
static int get_protocol_http_header(enum protocol_version version,
struct strbuf *header)
{
if (version > 0) {
strbuf_addf(header, GIT_PROTOCOL_HEADER ": version=%d",
version);
return 1;
}
return 0;
}
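To see what header string this produces without pulling in strbuf, here is a minimal sketch under stated assumptions. `format_protocol_header` and `PROTOCOL_HEADER_NAME` are hypothetical names introduced for illustration; the header name "Git-Protocol" is the value Git defines for GIT_PROTOCOL_HEADER, and the protocol version is passed as a plain int instead of enum protocol_version.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for GIT_PROTOCOL_HEADER (hypothetical macro name). */
#define PROTOCOL_HEADER_NAME "Git-Protocol"

/*
 * Hypothetical stand-in for get_protocol_http_header(): format the
 * header into buf and return 1. Return 0 and leave buf empty when the
 * version needs no header (version 0) or when buf is too small.
 */
static int format_protocol_header(int version, char *buf, size_t bufsz)
{
	int n;

	assert(buf && bufsz > 0);
	buf[0] = '\0';
	if (version <= 0)
		return 0;

	n = snprintf(buf, bufsz, PROTOCOL_HEADER_NAME ": version=%d", version);
	if (n < 0 || (size_t)n >= bufsz) {
		buf[0] = '\0';
		return 0;
	}
	return 1;
}
```

With version 2 the buffer holds `Git-Protocol: version=2`; with version 0 nothing is written, which mirrors how the caller only appends an extra header when the function returns nonzero.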
remote-curl: refactor smart-http discovery After making initial contact with an http server, we have to decide if the server supports smart-http, and if so, which version. Our rules are a bit inconsistent: 1. For v0, we require that the content-type indicates a smart-http response. We also require the response to look vaguely like a pkt-line starting with "#". If one of those does not match, we fall back to dumb-http. But according to our http protocol spec[1]: Dumb servers MUST NOT return a return type starting with `application/x-git-`. If we see the expected content-type, we should consider it smart-http. At that point we can parse the pkt-line for real, and complain if it is not syntactically valid. 2. For v2, we do not actually check the content-type. Our v2 protocol spec says[2]: When using the http:// or https:// transport a client makes a "smart" info/refs request as described in `http-protocol.txt`[...] and the http spec is clear that for a smart-http response[3]: The Content-Type MUST be `application/x-$servicename-advertisement`. So it is required according to the spec. These inconsistencies were easy to miss because of the way the original code was written as an inline conditional. Let's pull it out into its own function for readability, and improve a few things: - we now predicate the smart/dumb decision entirely on the presence of the correct content-type - we do a real pkt-line parse before deciding how to proceed (and die if it isn't valid) - use skip_prefix() for comparing service strings, instead of constructing expected output in a strbuf; this avoids dealing with memory cleanup Note that this _is_ tightening what the client will allow. It's all according to the spec, but it's possible that other implementations might violate these. However, violating these particular rules seems like an odd choice for a server to make. [1] Documentation/technical/http-protocol.txt, l. 166-167 [2] Documentation/technical/protocol-v2.txt, l. 
63-64 [3] Documentation/technical/http-protocol.txt, l. 247 Helped-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
4 years ago
static void check_smart_http(struct discovery *d, const char *service,
struct strbuf *type)
{
const char *p;
struct packet_reader reader;
/*
* If we don't see x-$service-advertisement, then it's not smart-http.
* But once we do, we commit to it and assume any other protocol
* violations are hard errors.
*/
if (!skip_prefix(type->buf, "application/x-", &p) ||
!skip_prefix(p, service, &p) ||
strcmp(p, "-advertisement"))
return;
packet_reader_init(&reader, -1, d->buf, d->len,
PACKET_READ_CHOMP_NEWLINE |
PACKET_READ_DIE_ON_ERR_PACKET);
if (packet_reader_read(&reader) != PACKET_READ_NORMAL)
die(_("invalid server response; expected service, got flush packet"));
if (skip_prefix(reader.line, "# service=", &p) && !strcmp(p, service)) {
/*
* The header can include additional metadata lines, up
* until a packet flush marker. Ignore these now, but
* in the future we might start to scan them.
*/
for (;;) {
packet_reader_read(&reader);
if (reader.pktlen <= 0) {
break;
}
}
/*
* v0 smart http; callers expect us to soak up the
* service and header packets
*/
d->buf = reader.src_buffer;
d->len = reader.src_len;
d->proto_git = 1;
} else if (!strcmp(reader.line, "version 2")) {
/*
* v2 smart http; do not consume version packet, which will
* be handled elsewhere.
*/
d->proto_git = 1;
} else {
die(_("invalid server response; got '%s'"), reader.line);
}
}
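The two checks that check_smart_http performs, the content-type match and the pkt-line parse, can be sketched in isolation. Everything below is an illustrative assumption rather than Git's code: `skip_prefix_sketch` imitates the behavior of skip_prefix(), `is_smart_content_type` mirrors the three-part comparison above, and `pkt_line_length` only decodes the four-hex-digit length prefix. It does not treat the small special lengths (such as the delimiter packet) differently, and it does not read the payload.

```c
#include <assert.h>
#include <string.h>

/* Simplified skip_prefix(): on a match, *out points just past prefix. */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Return 1 if type is exactly "application/x-<service>-advertisement". */
static int is_smart_content_type(const char *type, const char *service)
{
	const char *p;

	assert(type && service);
	return skip_prefix_sketch(type, "application/x-", &p) &&
	       skip_prefix_sketch(p, service, &p) &&
	       !strcmp(p, "-advertisement");
}

/*
 * Decode the 4-hex-digit pkt-line length prefix. Returns the length,
 * which counts the 4 prefix bytes themselves (0 for a flush packet
 * "0000"), or -1 if the prefix is missing or not hexadecimal.
 */
static int pkt_line_length(const char *buf, size_t buflen)
{
	int len = 0;
	size_t i;

	if (buflen < 4)
		return -1;
	for (i = 0; i < 4; i++) {
		char c = buf[i];
		int digit;

		if (c >= '0' && c <= '9')
			digit = c - '0';
		else if (c >= 'a' && c <= 'f')
			digit = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			digit = c - 'A' + 10;
		else
			return -1;
		len = len * 16 + digit;
	}
	return len;
}
```

For a v0 advertisement, the first packet `001e# service=git-upload-pack\n` decodes to a length of 30, and the `0000` that closes the header block decodes to 0, which is the flush marker the loop above stops on.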
static struct discovery *discover_refs(const char *service, int for_push)
{
struct strbuf type = STRBUF_INIT;
struct strbuf charset = STRBUF_INIT;
struct strbuf buffer = STRBUF_INIT;
struct strbuf refs_url = STRBUF_INIT;
remote-curl: rewrite base url from info/refs redirects For efficiency and security reasons, an earlier commit in this series taught http_get_* to re-write the base url based on redirections we saw while making a specific request. This commit wires that option into the info/refs request, meaning that a redirect from http://example.com/foo.git/info/refs to https://example.com/bar.git/info/refs will behave as if "https://example.com/bar.git" had been provided to git in the first place. The tests bear some explanation. We introduce two new hierarchies into the httpd test config: 1. Requests to /smart-redir-limited will work only for the initial info/refs request, but not any subsequent requests. As a result, we can confirm whether the client is re-rooting its requests after the initial contact, since otherwise it will fail (it will ask for "repo.git/git-upload-pack", which is not redirected). 2. Requests to smart-redir-auth will redirect, and require auth after the redirection. Since we are using the redirected base for further requests, we also update the credential struct, in order not to mislead the user (or credential helpers) about which credential is needed. We can therefore check the GIT_ASKPASS prompts to make sure we are prompting for the new location. Because we have neither multiple servers nor https support in our test setup, we can only redirect between paths, meaning we need to turn on credential.useHttpPath to see the difference. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
9 years ago
struct strbuf effective_url = STRBUF_INIT;
struct strbuf protocol_header = STRBUF_INIT;
struct string_list extra_headers = STRING_LIST_INIT_DUP;
struct discovery *last = last_discovery;
int http_ret, maybe_smart = 0;
struct http_get_options http_options;
enum protocol_version version = get_protocol_version_config();
if (last && !strcmp(service, last->service))
return last;
free_discovery(last);
strbuf_addf(&refs_url, "%sinfo/refs", url.buf);
if ((starts_with(url.buf, "http://") || starts_with(url.buf, "https://")) &&
git_env_bool("GIT_SMART_HTTP", 1)) {
maybe_smart = 1;
if (!strchr(url.buf, '?'))
strbuf_addch(&refs_url, '?');
else
strbuf_addch(&refs_url, '&');
strbuf_addf(&refs_url, "service=%s", service);
}
/*
* NEEDSWORK: If we are trying to use protocol v2 and we are planning
* to perform a push, then fallback to v0 since the client doesn't know
* how to push yet using v2.
*/
if (version == protocol_v2 && !strcmp("git-receive-pack", service))
version = protocol_v0;
/* Add the extra Git-Protocol header */
if (get_protocol_http_header(version, &protocol_header))
string_list_append(&extra_headers, protocol_header.buf);
memset(&http_options, 0, sizeof(http_options));
http_options.content_type = &type;
http_options.charset = &charset;
http_options.effective_url = &effective_url;
http_options.base_url = &url;
http_options.extra_headers = &extra_headers;
http: make redirects more obvious We instruct curl to always follow HTTP redirects. This is convenient, but it creates opportunities for malicious servers to create confusing situations. For instance, imagine Alice is a git user with access to a private repository on Bob's server. Mallory runs her own server and wants to access objects from Bob's repository. Mallory may try a few tricks that involve asking Alice to clone from her, build on top, and then push the result: 1. Mallory may simply redirect all fetch requests to Bob's server. Git will transparently follow those redirects and fetch Bob's history, which Alice may believe she got from Mallory. The subsequent push seems like it is just feeding Mallory back her own objects, but is actually leaking Bob's objects. There is nothing in git's output to indicate that Bob's repository was involved at all. The downside (for Mallory) of this attack is that Alice will have received Bob's entire repository, and is likely to notice that when building on top of it. 2. If Mallory happens to know the sha1 of some object X in Bob's repository, she can instead build her own history that references that object. She then runs a dumb http server, and Alice's client will fetch each object individually. When it asks for X, Mallory redirects her to Bob's server. The end result is that Alice obtains objects from Bob, but they may be buried deep in history. Alice is less likely to notice. Both of these attacks are fairly hard to pull off. There's a social component in getting Mallory to convince Alice to work with her. Alice may be prompted for credentials in accessing Bob's repository (but not always, if she is using a credential helper that caches). Attack (1) requires a certain amount of obliviousness on Alice's part while making a new commit. Attack (2) requires that Mallory knows a sha1 in Bob's repository, that Bob's server supports dumb http, and that the object in question is loose on Bob's server. 
But we can probably make things a bit more obvious without any loss of functionality. This patch does two things to that end. First, when we encounter a whole-repo redirect during the initial ref discovery, we now inform the user on stderr, making attack (1) much more obvious. Second, the decision to follow redirects is now configurable. The truly paranoid can set the new http.followRedirects to false to avoid any redirection entirely. But for a more practical default, we will disallow redirects only after the initial ref discovery. This is enough to thwart attacks similar to (2), while still allowing the common use of redirects at the repository level. Since c93c92f30 (http: update base URLs when we see redirects, 2013-09-28) we re-root all further requests from the redirect destination, which should generally mean that no further redirection is necessary. As an escape hatch, in case there really is a server that needs to redirect individual requests, the user can set http.followRedirects to "true" (and this can be done on a per-server basis via http.*.followRedirects config). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
6 years ago
http_options.initial_request = 1;
http_options.no_cache = 1;
http_ret = http_get_strbuf(refs_url.buf, &buffer, &http_options);
switch (http_ret) {
case HTTP_OK:
break;
case HTTP_MISSING_TARGET:
show_http_message(&type, &charset, &buffer);
die(_("repository '%s' not found"),
transport_anonymize_url(url.buf));
case HTTP_NOAUTH:
show_http_message(&type, &charset, &buffer);
die(_("Authentication failed for '%s'"),
transport_anonymize_url(url.buf));
case HTTP_NOMATCHPUBLICKEY:
show_http_message(&type, &charset, &buffer);
die(_("unable to access '%s' with http.pinnedPubkey configuration: %s"),
transport_anonymize_url(url.buf), curl_errorstr);
default:
show_http_message(&type, &charset, &buffer);
die(_("unable to access '%s': %s"),
transport_anonymize_url(url.buf), curl_errorstr);
}