2006-04-22 08:57:45 +02:00
|
|
|
/*
|
|
|
|
* Copyright (C) 2005 Junio C Hamano
|
|
|
|
*/
|
|
|
|
#include "cache.h"
|
2017-06-14 20:07:36 +02:00
|
|
|
#include "config.h"
|
2015-08-12 19:12:01 +02:00
|
|
|
#include "tempfile.h"
|
2006-04-22 08:57:45 +02:00
|
|
|
#include "quote.h"
|
|
|
|
#include "diff.h"
|
|
|
|
#include "diffcore.h"
|
binary patch.
This adds "binary patch" to the diff output and teaches apply
what to do with them.
On the diff generation side, traditionally, we said "Binary
files differ\n" without giving anything other than the preimage
and postimage object name on the index line. This was good
enough for applying a patch generated from your own repository
(very useful while rebasing), because the postimage would be
available in such a case. However, this was not useful when the
recipient of such a patch via e-mail were to apply it, even if
the preimage was available.
This patch allows the diff to generate "binary" patch when
operating under --full-index option. The binary patch follows
the usual extended git diff headers, and looks like this:
"GIT binary patch\n"
<length byte><data>"\n"
...
"\n"
Each line is prefixed with a "length-byte", whose value is upper
or lowercase alphabet that encodes number of bytes that the data
on the line decodes to (1..52 -- 'A' means 1, 'B' means 2, ...,
'Z' means 26, 'a' means 27, ...). <data> is 1 or more groups of
5-byte sequence, each of which encodes up to 4 bytes in base85
encoding. Because 52 / 4 * 5 = 65 and we have the length byte,
an output line is capped to 66 characters. The payload is the
same diff-delta as we use in the packfiles.
On the consumption side, git-apply now can decode and apply the
binary patch when --allow-binary-replacement is given, the diff
was generated with --full-index, and the receiving repository
has the preimage blob, which is the same condition as it always
required when accepting an "Binary files differ\n" patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-05 01:51:44 +02:00
|
|
|
#include "delta.h"
|
2006-04-22 08:57:45 +02:00
|
|
|
#include "xdiff-interface.h"
|
2006-09-08 10:03:18 +02:00
|
|
|
#include "color.h"
|
2007-04-13 08:05:29 +02:00
|
|
|
#include "attr.h"
|
2007-10-19 21:47:56 +02:00
|
|
|
#include "run-command.h"
|
2008-01-02 10:50:11 +01:00
|
|
|
#include "utf8.h"
|
2018-05-16 01:42:15 +02:00
|
|
|
#include "object-store.h"
|
2008-10-05 23:43:21 +02:00
|
|
|
#include "userdiff.h"
|
2015-08-18 02:21:59 +02:00
|
|
|
#include "submodule-config.h"
|
2009-10-19 14:38:32 +02:00
|
|
|
#include "submodule.h"
|
diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):
[OM] -void sensitive_stuff(void)
[OM] -{
[OM] - if (!is_authorized_user())
[OM] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OM] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NM] + sensitive_stuff(spanning,
[NM] + multiple,
[NM] + lines);
[NM] +}
However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:
[OM] -void sensitive_stuff(void)
[OM] -{
[OMA] - if (!is_authorized_user())
[OMA] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OMA] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NMA] + sensitive_stuff(spanning,
[NMA] + multiple,
[NMA] + lines);
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NMA] +}
If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.
This patch implements the first mode:
* basic alternating 'Zebra' mode
This conveys all information needed to the user. Defer customization to
later patches.
First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').
Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.
It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.
While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.
For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 22:53:07 +02:00
|
|
|
#include "hashmap.h"
|
2021-12-09 11:30:09 +01:00
|
|
|
#include "mem-pool.h"
|
2010-03-25 03:21:32 +01:00
|
|
|
#include "ll-merge.h"
|
2012-10-28 17:50:54 +01:00
|
|
|
#include "string-list.h"
|
2020-07-28 22:23:39 +02:00
|
|
|
#include "strvec.h"
|
2016-09-01 01:27:20 +02:00
|
|
|
#include "graph.h"
|
2017-08-19 00:20:36 +02:00
|
|
|
#include "packfile.h"
|
2019-01-27 01:35:31 +01:00
|
|
|
#include "parse-options.h"
|
2018-05-26 15:55:24 +02:00
|
|
|
#include "help.h"
|
2019-06-25 15:40:31 +02:00
|
|
|
#include "promisor-remote.h"
|
2021-09-08 13:23:56 +02:00
|
|
|
#include "dir.h"
|
2022-02-02 03:37:34 +01:00
|
|
|
#include "strmap.h"
|
2006-04-22 08:57:45 +02:00
|
|
|
|
2006-12-14 12:15:57 +01:00
|
|
|
#ifdef NO_FAST_WORKING_DIRECTORY
|
|
|
|
#define FAST_WORKING_DIRECTORY 0
|
|
|
|
#else
|
|
|
|
#define FAST_WORKING_DIRECTORY 1
|
|
|
|
#endif
|
|
|
|
|
2006-08-15 19:23:48 +02:00
|
|
|
static int diff_detect_rename_default;
|
2017-05-08 18:03:38 +02:00
|
|
|
static int diff_indent_heuristic = 1;
|
rename: bump limit defaults yet again
These were last bumped in commit 92c57e5c1d29 (bump rename limit
defaults (again), 2011-02-19), and were bumped both because processors
had gotten faster, and because people were getting ugly merges that
caused problems and reporting it to the mailing list (suggesting that
folks were willing to spend more time waiting).
Since that time:
* Linus has continued recommending kernel folks to set
diff.renameLimit=0 (maps to 32767, currently)
* Folks with repositories with lots of renames were happy to set
merge.renameLimit above 32767, once the code supported that, to
get correct cherry-picks
* Processors have gotten faster
* It has been discovered that the timing methodology used last time
probably used too large example files.
The last point is probably worth explaining a bit more:
* The "average" file size used appears to have been average blob size
in the linux kernel history at the time (probably v2.6.25 or
something close to it).
* Since bigger files are modified more frequently, such a computation
weights towards larger files.
* Larger files may be more likely to be modified over time, but are
not more likely to be renamed -- the mean and median blob size
within a tree are a bit higher than the mean and median of blob
sizes in the history leading up to that version for the linux
kernel.
* The mean blob size in v2.6.25 was half the average blob size in
history leading to that point
* The median blob size in v2.6.25 was about 40% of the mean blob size
in v2.6.25.
* Since the mean blob size is more than double the median blob size,
any file as big as the mean will not be compared to any files of
median size or less (because they'd be more than 50% dissimilar).
* Since it is the number of files compared that provides the O(n^2)
behavior, median-sized files should matter more than mean-sized
ones.
The combined effect of the above is that the file size used in past
calculations was likely about 5x too large. Combine that with a CPU
performance improvement of ~30%, and we can increase the limits by
a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated
time limits.
Keeping the same approximate time limit probably makes sense for
diff.renameLimit (there is no progress feedback in e.g. git log -p),
but the experience above suggests merge.renameLimit could be extended
significantly. In fact, it probably would make sense to have an
unlimited default setting for merge.renameLimit, but that would
likely need to be coupled with changes to how progress is displayed.
(See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/
for details in that area.) For now, let's just bump the approximate
time limit from 10s to 1m.
(Note: We do not want to use actual time limits, because getting results
that depend on how loaded your system is that day feels bad, and because
we don't discover that we won't get all the renames until after we've
put in a lot of work rather than just upfront telling the user there are
too many files involved.)
Using the original time limit of 2s for diff.renameLimit, and bumping
merge.renameLimit from 10s to 60s, I found the following timings using
the simple script at the end of this commit message (on an AWS c5.xlarge
which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"):
N Timing
1300 1.995s
7100 59.973s
So let's round down to nice even numbers and bump the limits from
400->1000, and from 1000->7000.
Here is the measure_rename_perf script (adapted from
https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/
in particular to avoid triggering the linear handling from
basename-guided rename detection):
#!/bin/bash
n=$1; shift
rm -rf repo
mkdir repo && cd repo
git init -q -b main
mkdata() {
mkdir $1
for i in `seq 1 $2`; do
(sed "s/^/$i /" <../sample
echo tag: $1
) >$1/$i
done
}
mkdata initial $n
git add .
git commit -q -m initial
mkdata new $n
git add .
cd new
for i in *; do git mv $i $i.renamed; done
cd ..
git rm -q -rf initial
git commit -q -m new
time git diff-tree -M -l0 --summary HEAD^ HEAD
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 02:45:24 +02:00
|
|
|
static int diff_rename_limit_default = 1000;
|
2008-08-15 13:39:26 +02:00
|
|
|
static int diff_suppress_blank_empty;
|
2012-09-15 22:59:59 +02:00
|
|
|
static int diff_use_color_default = -1;
|
diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):
[OM] -void sensitive_stuff(void)
[OM] -{
[OM] - if (!is_authorized_user())
[OM] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OM] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NM] + sensitive_stuff(spanning,
[NM] + multiple,
[NM] + lines);
[NM] +}
However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:
[OM] -void sensitive_stuff(void)
[OM] -{
[OMA] - if (!is_authorized_user())
[OMA] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OMA] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NMA] + sensitive_stuff(spanning,
[NMA] + multiple,
[NMA] + lines);
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NMA] +}
If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.
This patch implements the first mode:
* basic alternating 'Zebra' mode
This conveys all information needed to the user. Defer customization to
later patches.
First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').
Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.
It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.
While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.
For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 22:53:07 +02:00
|
|
|
static int diff_color_moved_default;
|
2018-07-18 21:31:56 +02:00
|
|
|
static int diff_color_moved_ws_default;
|
2012-09-27 21:12:52 +02:00
|
|
|
static int diff_context_default = 3;
|
2017-01-12 13:21:11 +01:00
|
|
|
static int diff_interhunk_context_default;
|
2009-01-21 04:46:57 +01:00
|
|
|
static const char *diff_word_regex_cfg;
|
2007-12-17 14:42:20 +01:00
|
|
|
static const char *external_diff_cmd_cfg;
|
2013-12-19 01:08:12 +01:00
|
|
|
static const char *diff_order_file_cfg;
|
2007-08-31 22:13:42 +02:00
|
|
|
int diff_auto_refresh_index = 1;
|
2008-08-19 05:08:09 +02:00
|
|
|
static int diff_mnemonic_prefix;
|
2010-05-03 04:03:41 +02:00
|
|
|
static int diff_no_prefix;
|
2020-05-22 12:46:18 +02:00
|
|
|
static int diff_relative;
|
2012-03-01 13:26:46 +01:00
|
|
|
static int diff_stat_graph_width;
|
2011-04-29 11:36:20 +02:00
|
|
|
static int diff_dirstat_permille_default = 30;
|
2010-08-05 10:49:55 +02:00
|
|
|
static struct diff_options default_diff_options;
|
2013-01-16 08:51:57 +01:00
|
|
|
static long diff_algorithm;
|
2016-10-05 00:26:27 +02:00
|
|
|
static unsigned ws_error_highlight_default = WSEH_NEW;
|
2006-04-22 08:57:45 +02:00
|
|
|
|
2006-09-08 10:03:18 +02:00
|
|
|
static char diff_colors[][COLOR_MAXLEN] = {
|
2009-02-13 22:53:40 +01:00
|
|
|
GIT_COLOR_RESET,
|
2015-05-27 22:48:46 +02:00
|
|
|
GIT_COLOR_NORMAL, /* CONTEXT */
|
2009-02-13 22:53:40 +01:00
|
|
|
GIT_COLOR_BOLD, /* METAINFO */
|
|
|
|
GIT_COLOR_CYAN, /* FRAGINFO */
|
|
|
|
GIT_COLOR_RED, /* OLD */
|
|
|
|
GIT_COLOR_GREEN, /* NEW */
|
|
|
|
GIT_COLOR_YELLOW, /* COMMIT */
|
|
|
|
GIT_COLOR_BG_RED, /* WHITESPACE */
|
2009-11-27 07:55:18 +01:00
|
|
|
GIT_COLOR_NORMAL, /* FUNCINFO */
|
2017-06-30 22:53:09 +02:00
|
|
|
GIT_COLOR_BOLD_MAGENTA, /* OLD_MOVED */
|
|
|
|
GIT_COLOR_BOLD_BLUE, /* OLD_MOVED ALTERNATIVE */
|
|
|
|
GIT_COLOR_FAINT, /* OLD_MOVED_DIM */
|
|
|
|
GIT_COLOR_FAINT_ITALIC, /* OLD_MOVED_ALTERNATIVE_DIM */
|
|
|
|
GIT_COLOR_BOLD_CYAN, /* NEW_MOVED */
|
|
|
|
GIT_COLOR_BOLD_YELLOW, /* NEW_MOVED ALTERNATIVE */
|
|
|
|
GIT_COLOR_FAINT, /* NEW_MOVED_DIM */
|
|
|
|
GIT_COLOR_FAINT_ITALIC, /* NEW_MOVED_ALTERNATIVE_DIM */
|
range-diff: use dim/bold cues to improve dual color mode
It *is* a confusing thing to look at a diff of diffs. All too easy is it
to mix up whether the -/+ markers refer to the "inner" or the "outer"
diff, i.e. whether a `+` indicates that a line was added by either the
old or the new diff (or both), or whether the new diff does something
different than the old diff.
To make things easier to process for normal developers, we introduced
the dual color mode which colors the lines according to the commit diff,
i.e. lines that are added by a commit (whether old, new, or both) are
colored in green. In non-dual color mode, the lines would be colored
according to the outer diff: if the old commit added a line, it would be
colored red (because that line addition is only present in the first
commit range that was specified on the command-line, i.e. the "old"
commit, but not in the second commit range, i.e. the "new" commit).
However, this dual color mode is still not making things clear enough,
as we are looking at two levels of diffs, and we still only pick a color
according to *one* of them (the outer diff marker is colored
differently, of course, but in particular with deep indentation, it is
easy to lose track of that outer diff marker's background color).
Therefore, let's add another dimension to the mix. Still use
green/red/normal according to the commit diffs, but now also dim the
lines that were only in the old commit, and use bold face for the lines
that are only in the new commit.
That way, it is much easier not to lose track of, say, when we are
looking at a line that was added in the previous iteration of a patch
series but the new iteration adds a slightly different version: the
obsolete change will be dimmed, the current version of the patch will be
bold.
At least this developer has a much easier time reading the range-diffs
that way.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 13:33:32 +02:00
|
|
|
GIT_COLOR_FAINT, /* CONTEXT_DIM */
|
|
|
|
GIT_COLOR_FAINT_RED, /* OLD_DIM */
|
|
|
|
GIT_COLOR_FAINT_GREEN, /* NEW_DIM */
|
|
|
|
GIT_COLOR_BOLD, /* CONTEXT_BOLD */
|
|
|
|
GIT_COLOR_BOLD_RED, /* OLD_BOLD */
|
|
|
|
GIT_COLOR_BOLD_GREEN, /* NEW_BOLD */
|
2006-06-13 18:45:44 +02:00
|
|
|
};
|
|
|
|
|
2018-05-26 15:55:21 +02:00
|
|
|
static const char *color_diff_slots[] = {
|
|
|
|
[DIFF_CONTEXT] = "context",
|
|
|
|
[DIFF_METAINFO] = "meta",
|
|
|
|
[DIFF_FRAGINFO] = "frag",
|
|
|
|
[DIFF_FILE_OLD] = "old",
|
|
|
|
[DIFF_FILE_NEW] = "new",
|
|
|
|
[DIFF_COMMIT] = "commit",
|
|
|
|
[DIFF_WHITESPACE] = "whitespace",
|
|
|
|
[DIFF_FUNCINFO] = "func",
|
|
|
|
[DIFF_FILE_OLD_MOVED] = "oldMoved",
|
|
|
|
[DIFF_FILE_OLD_MOVED_ALT] = "oldMovedAlternative",
|
|
|
|
[DIFF_FILE_OLD_MOVED_DIM] = "oldMovedDimmed",
|
|
|
|
[DIFF_FILE_OLD_MOVED_ALT_DIM] = "oldMovedAlternativeDimmed",
|
|
|
|
[DIFF_FILE_NEW_MOVED] = "newMoved",
|
|
|
|
[DIFF_FILE_NEW_MOVED_ALT] = "newMovedAlternative",
|
|
|
|
[DIFF_FILE_NEW_MOVED_DIM] = "newMovedDimmed",
|
|
|
|
[DIFF_FILE_NEW_MOVED_ALT_DIM] = "newMovedAlternativeDimmed",
|
range-diff: use dim/bold cues to improve dual color mode
It *is* a confusing thing to look at a diff of diffs. All too easy is it
to mix up whether the -/+ markers refer to the "inner" or the "outer"
diff, i.e. whether a `+` indicates that a line was added by either the
old or the new diff (or both), or whether the new diff does something
different than the old diff.
To make things easier to process for normal developers, we introduced
the dual color mode which colors the lines according to the commit diff,
i.e. lines that are added by a commit (whether old, new, or both) are
colored in green. In non-dual color mode, the lines would be colored
according to the outer diff: if the old commit added a line, it would be
colored red (because that line addition is only present in the first
commit range that was specified on the command-line, i.e. the "old"
commit, but not in the second commit range, i.e. the "new" commit).
However, this dual color mode is still not making things clear enough,
as we are looking at two levels of diffs, and we still only pick a color
according to *one* of them (the outer diff marker is colored
differently, of course, but in particular with deep indentation, it is
easy to lose track of that outer diff marker's background color).
Therefore, let's add another dimension to the mix. Still use
green/red/normal according to the commit diffs, but now also dim the
lines that were only in the old commit, and use bold face for the lines
that are only in the new commit.
That way, it is much easier not to lose track of, say, when we are
looking at a line that was added in the previous iteration of a patch
series but the new iteration adds a slightly different version: the
obsolete change will be dimmed, the current version of the patch will be
bold.
At least this developer has a much easier time reading the range-diffs
that way.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 13:33:32 +02:00
|
|
|
[DIFF_CONTEXT_DIM] = "contextDimmed",
|
|
|
|
[DIFF_FILE_OLD_DIM] = "oldDimmed",
|
|
|
|
[DIFF_FILE_NEW_DIM] = "newDimmed",
|
|
|
|
[DIFF_CONTEXT_BOLD] = "contextBold",
|
|
|
|
[DIFF_FILE_OLD_BOLD] = "oldBold",
|
|
|
|
[DIFF_FILE_NEW_BOLD] = "newBold",
|
2018-05-26 15:55:21 +02:00
|
|
|
};
|
|
|
|
|
2018-05-26 15:55:24 +02:00
|
|
|
define_list_config_array_extra(color_diff_slots, {"plain"});
|
|
|
|
|
2014-06-18 21:41:50 +02:00
|
|
|
static int parse_diff_color_slot(const char *var)
|
diff --color: use $GIT_DIR/config
This lets you use something like this in your $GIT_DIR/config
file.
[diff]
color = auto
[diff.color]
new = blue
old = yellow
frag = reverse
When diff.color is set to "auto", colored diff is enabled when
the standard output is the terminal. Other choices are "always",
and "never". Usual boolean true/false can also be used.
The colormap entries can specify colors for the following slots:
plain - lines that appear in both old and new file (context)
meta - diff --git header and extended git diff headers
frag - @@ -n,m +l,k @@ lines (hunk header)
old - lines deleted from old file
new - lines added to new file
The following color names can be used:
normal, bold, dim, l, blink, reverse, reset,
black, red, green, yellow, blue, magenta, cyan,
white
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 13:06:23 +02:00
|
|
|
{
|
2018-05-26 15:55:21 +02:00
|
|
|
if (!strcasecmp(var, "plain"))
|
2015-05-27 22:48:46 +02:00
|
|
|
return DIFF_CONTEXT;
|
2018-05-26 15:55:21 +02:00
|
|
|
return LOOKUP_CONFIG(color_diff_slots, var);
|
diff --color: use $GIT_DIR/config
This lets you use something like this in your $GIT_DIR/config
file.
[diff]
color = auto
[diff.color]
new = blue
old = yellow
frag = reverse
When diff.color is set to "auto", colored diff is enabled when
the standard output is the terminal. Other choices are "always",
and "never". Usual boolean true/false can also be used.
The colormap entries can specify colors for the following slots:
plain - lines that appear in both old and new file (context)
meta - diff --git header and extended git diff headers
frag - @@ -n,m +l,k @@ lines (hunk header)
old - lines deleted from old file
new - lines added to new file
The following color names can be used:
normal, bold, dim, l, blink, reverse, reset,
black, red, green, yellow, blue, magenta, cyan,
white
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 13:06:23 +02:00
|
|
|
}
|
|
|
|
|
2012-10-28 17:50:54 +01:00
|
|
|
static int parse_dirstat_params(struct diff_options *options, const char *params_string,
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
struct strbuf *errmsg)
|
2011-04-29 11:36:18 +02:00
|
|
|
{
|
2012-10-28 17:50:54 +01:00
|
|
|
char *params_copy = xstrdup(params_string);
|
|
|
|
struct string_list params = STRING_LIST_INIT_NODUP;
|
|
|
|
int ret = 0;
|
|
|
|
int i;
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
|
2012-10-28 17:50:54 +01:00
|
|
|
if (*params_copy)
|
|
|
|
string_list_split_in_place(¶ms, params_copy, ',', -1);
|
|
|
|
for (i = 0; i < params.nr; i++) {
|
|
|
|
const char *p = params.items[i].string;
|
|
|
|
if (!strcmp(p, "changes")) {
|
2017-10-31 19:19:11 +01:00
|
|
|
options->flags.dirstat_by_line = 0;
|
|
|
|
options->flags.dirstat_by_file = 0;
|
2012-10-28 17:50:54 +01:00
|
|
|
} else if (!strcmp(p, "lines")) {
|
2017-10-31 19:19:11 +01:00
|
|
|
options->flags.dirstat_by_line = 1;
|
|
|
|
options->flags.dirstat_by_file = 0;
|
2012-10-28 17:50:54 +01:00
|
|
|
} else if (!strcmp(p, "files")) {
|
2017-10-31 19:19:11 +01:00
|
|
|
options->flags.dirstat_by_line = 0;
|
|
|
|
options->flags.dirstat_by_file = 1;
|
2012-10-28 17:50:54 +01:00
|
|
|
} else if (!strcmp(p, "noncumulative")) {
|
2017-10-31 19:19:11 +01:00
|
|
|
options->flags.dirstat_cumulative = 0;
|
2012-10-28 17:50:54 +01:00
|
|
|
} else if (!strcmp(p, "cumulative")) {
|
2017-10-31 19:19:11 +01:00
|
|
|
options->flags.dirstat_cumulative = 1;
|
2011-04-29 11:36:18 +02:00
|
|
|
} else if (isdigit(*p)) {
|
|
|
|
char *end;
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
int permille = strtoul(p, &end, 10) * 10;
|
|
|
|
if (*end == '.' && isdigit(*++end)) {
|
2011-04-29 11:36:20 +02:00
|
|
|
/* only use first digit */
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
permille += *end - '0';
|
2011-04-29 11:36:20 +02:00
|
|
|
/* .. and ignore any further digits */
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
while (isdigit(*++end))
|
2011-04-29 11:36:20 +02:00
|
|
|
; /* nothing */
|
|
|
|
}
|
2012-10-28 17:50:54 +01:00
|
|
|
if (!*end)
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
options->dirstat_permille = permille;
|
|
|
|
else {
|
2012-10-28 17:50:54 +01:00
|
|
|
strbuf_addf(errmsg, _(" Failed to parse dirstat cut-off percentage '%s'\n"),
|
|
|
|
p);
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
ret++;
|
|
|
|
}
|
|
|
|
} else {
|
2012-10-28 17:50:54 +01:00
|
|
|
strbuf_addf(errmsg, _(" Unknown dirstat parameter '%s'\n"), p);
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
ret++;
|
2011-04-29 11:36:18 +02:00
|
|
|
}
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
|
2011-04-29 11:36:18 +02:00
|
|
|
}
|
2012-10-28 17:50:54 +01:00
|
|
|
string_list_clear(¶ms, 0);
|
|
|
|
free(params_copy);
|
Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:36:22 +02:00
|
|
|
return ret;
|
2011-04-29 11:36:18 +02:00
|
|
|
}
|
|
|
|
|
2012-11-13 16:42:45 +01:00
|
|
|
static int parse_submodule_params(struct diff_options *options, const char *value)
|
|
|
|
{
|
|
|
|
if (!strcmp(value, "log"))
|
2016-09-01 01:27:21 +02:00
|
|
|
options->submodule_format = DIFF_SUBMODULE_LOG;
|
2012-11-13 16:42:45 +01:00
|
|
|
else if (!strcmp(value, "short"))
|
2016-09-01 01:27:21 +02:00
|
|
|
options->submodule_format = DIFF_SUBMODULE_SHORT;
|
2016-09-01 01:27:25 +02:00
|
|
|
else if (!strcmp(value, "diff"))
|
|
|
|
options->submodule_format = DIFF_SUBMODULE_INLINE_DIFF;
|
2019-02-16 12:24:41 +01:00
|
|
|
/*
|
|
|
|
* Please update $__git_diff_submodule_formats in
|
|
|
|
* git-completion.bash when you add new formats.
|
|
|
|
*/
|
2012-11-13 16:42:45 +01:00
|
|
|
else
|
|
|
|
return -1;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-05-02 18:01:14 +02:00
|
|
|
int git_config_rename(const char *var, const char *value)
|
2009-04-09 20:46:15 +02:00
|
|
|
{
|
|
|
|
if (!value)
|
|
|
|
return DIFF_DETECT_RENAME;
|
|
|
|
if (!strcasecmp(value, "copies") || !strcasecmp(value, "copy"))
|
|
|
|
return DIFF_DETECT_COPY;
|
|
|
|
return git_config_bool(var,value) ? DIFF_DETECT_RENAME : 0;
|
|
|
|
}
|
|
|
|
|
2013-01-16 08:51:58 +01:00
|
|
|
long parse_algorithm_value(const char *value)
|
2013-01-16 08:51:57 +01:00
|
|
|
{
|
|
|
|
if (!value)
|
|
|
|
return -1;
|
|
|
|
else if (!strcasecmp(value, "myers") || !strcasecmp(value, "default"))
|
|
|
|
return 0;
|
|
|
|
else if (!strcasecmp(value, "minimal"))
|
|
|
|
return XDF_NEED_MINIMAL;
|
|
|
|
else if (!strcasecmp(value, "patience"))
|
|
|
|
return XDF_PATIENCE_DIFF;
|
|
|
|
else if (!strcasecmp(value, "histogram"))
|
|
|
|
return XDF_HISTOGRAM_DIFF;
|
2019-02-16 12:24:41 +01:00
|
|
|
/*
|
|
|
|
* Please update $__git_diff_algorithms in git-completion.bash
|
|
|
|
* when you add new algorithms.
|
|
|
|
*/
|
2013-01-16 08:51:57 +01:00
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
2016-10-05 00:09:18 +02:00
|
|
|
static int parse_one_token(const char **arg, const char *token)
|
|
|
|
{
|
|
|
|
const char *rest;
|
|
|
|
if (skip_prefix(*arg, token, &rest) && (!*rest || *rest == ',')) {
|
|
|
|
*arg = rest;
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int parse_ws_error_highlight(const char *arg)
|
|
|
|
{
|
|
|
|
const char *orig_arg = arg;
|
|
|
|
unsigned val = 0;
|
|
|
|
|
|
|
|
while (*arg) {
|
|
|
|
if (parse_one_token(&arg, "none"))
|
|
|
|
val = 0;
|
|
|
|
else if (parse_one_token(&arg, "default"))
|
|
|
|
val = WSEH_NEW;
|
|
|
|
else if (parse_one_token(&arg, "all"))
|
|
|
|
val = WSEH_NEW | WSEH_OLD | WSEH_CONTEXT;
|
|
|
|
else if (parse_one_token(&arg, "new"))
|
|
|
|
val |= WSEH_NEW;
|
|
|
|
else if (parse_one_token(&arg, "old"))
|
|
|
|
val |= WSEH_OLD;
|
|
|
|
else if (parse_one_token(&arg, "context"))
|
|
|
|
val |= WSEH_CONTEXT;
|
|
|
|
else {
|
|
|
|
return -1 - (int)(arg - orig_arg);
|
|
|
|
}
|
|
|
|
if (*arg)
|
|
|
|
arg++;
|
|
|
|
}
|
|
|
|
return val;
|
|
|
|
}
|
|
|
|
|
2006-07-08 10:05:16 +02:00
|
|
|
/*
|
|
|
|
* These are to give UI layer defaults.
|
|
|
|
* The core-level commands such as git-diff-files should
|
|
|
|
* never be affected by the setting of diff.renames
|
|
|
|
* the user happens to have in the configuration file.
|
|
|
|
*/
|
2016-02-25 09:59:21 +01:00
|
|
|
void init_diff_ui_defaults(void)
|
|
|
|
{
|
2017-12-27 11:18:35 +01:00
|
|
|
diff_detect_rename_default = DIFF_DETECT_RENAME;
|
2016-02-25 09:59:21 +01:00
|
|
|
}
|
|
|
|
|
2016-09-05 11:44:53 +02:00
|
|
|
int git_diff_heuristic_config(const char *var, const char *value, void *cb)
|
|
|
|
{
|
2016-12-23 21:32:22 +01:00
|
|
|
if (!strcmp(var, "diff.indentheuristic"))
|
2016-09-05 11:44:53 +02:00
|
|
|
diff_indent_heuristic = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):
[OM] -void sensitive_stuff(void)
[OM] -{
[OM] - if (!is_authorized_user())
[OM] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OM] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NM] + sensitive_stuff(spanning,
[NM] + multiple,
[NM] + lines);
[NM] +}
However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:
[OM] -void sensitive_stuff(void)
[OM] -{
[OMA] - if (!is_authorized_user())
[OMA] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OMA] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NMA] + sensitive_stuff(spanning,
[NMA] + multiple,
[NMA] + lines);
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NMA] +}
If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.
This patch implements the first mode:
* basic alternating 'Zebra' mode
This conveys all information needed to the user. Defer customization to
later patches.
First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').
Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.
It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.
While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.
For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 22:53:07 +02:00
|
|
|
static int parse_color_moved(const char *arg)
|
|
|
|
{
|
|
|
|
switch (git_parse_maybe_bool(arg)) {
|
|
|
|
case 0:
|
|
|
|
return COLOR_MOVED_NO;
|
|
|
|
case 1:
|
|
|
|
return COLOR_MOVED_DEFAULT;
|
|
|
|
default:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!strcmp(arg, "no"))
|
|
|
|
return COLOR_MOVED_NO;
|
2017-06-30 22:53:08 +02:00
|
|
|
else if (!strcmp(arg, "plain"))
|
|
|
|
return COLOR_MOVED_PLAIN;
|
2018-07-17 01:05:39 +02:00
|
|
|
else if (!strcmp(arg, "blocks"))
|
|
|
|
return COLOR_MOVED_BLOCKS;
|
diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):
[OM] -void sensitive_stuff(void)
[OM] -{
[OM] - if (!is_authorized_user())
[OM] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OM] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NM] + sensitive_stuff(spanning,
[NM] + multiple,
[NM] + lines);
[NM] +}
However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:
[OM] -void sensitive_stuff(void)
[OM] -{
[OMA] - if (!is_authorized_user())
[OMA] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OMA] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NMA] + sensitive_stuff(spanning,
[NMA] + multiple,
[NMA] + lines);
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NMA] +}
If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.
This patch implements the first mode:
* basic alternating 'Zebra' mode
This conveys all information needed to the user. Defer customization to
later patches.
First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').
Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.
It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.
While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.
For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 22:53:07 +02:00
|
|
|
else if (!strcmp(arg, "zebra"))
|
|
|
|
return COLOR_MOVED_ZEBRA;
|
|
|
|
else if (!strcmp(arg, "default"))
|
|
|
|
return COLOR_MOVED_DEFAULT;
|
2018-07-24 23:58:45 +02:00
|
|
|
else if (!strcmp(arg, "dimmed-zebra"))
|
|
|
|
return COLOR_MOVED_ZEBRA_DIM;
|
2017-06-30 22:53:09 +02:00
|
|
|
else if (!strcmp(arg, "dimmed_zebra"))
|
|
|
|
return COLOR_MOVED_ZEBRA_DIM;
|
diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):
[OM] -void sensitive_stuff(void)
[OM] -{
[OM] - if (!is_authorized_user())
[OM] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OM] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NM] + sensitive_stuff(spanning,
[NM] + multiple,
[NM] + lines);
[NM] +}
However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:
[OM] -void sensitive_stuff(void)
[OM] -{
[OMA] - if (!is_authorized_user())
[OMA] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OMA] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NMA] + sensitive_stuff(spanning,
[NMA] + multiple,
[NMA] + lines);
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NMA] +}
If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.
This patch implements the first mode:
* basic alternating 'Zebra' mode
This conveys all information needed to the user. Defer customization to
later patches.
First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').
Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.
It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.
While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.
For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 22:53:07 +02:00
|
|
|
else
|
2018-08-16 00:08:22 +02:00
|
|
|
return error(_("color moved setting must be one of 'no', 'default', 'blocks', 'zebra', 'dimmed-zebra', 'plain'"));
|
diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):
[OM] -void sensitive_stuff(void)
[OM] -{
[OM] - if (!is_authorized_user())
[OM] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OM] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NM] + sensitive_stuff(spanning,
[NM] + multiple,
[NM] + lines);
[NM] +}
However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:
[OM] -void sensitive_stuff(void)
[OM] -{
[OMA] - if (!is_authorized_user())
[OMA] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OMA] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NMA] + sensitive_stuff(spanning,
[NMA] + multiple,
[NMA] + lines);
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NMA] +}
If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.
This patch implements the first mode:
* basic alternating 'Zebra' mode
This conveys all information needed to the user. Defer customization to
later patches.
First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').
Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.
It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.
While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.
For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 22:53:07 +02:00
|
|
|
}
|
|
|
|
|
2018-11-13 22:33:57 +01:00
|
|
|
static unsigned parse_color_moved_ws(const char *arg)
|
2018-07-17 01:05:40 +02:00
|
|
|
{
|
|
|
|
int ret = 0;
|
|
|
|
struct string_list l = STRING_LIST_INIT_DUP;
|
|
|
|
struct string_list_item *i;
|
|
|
|
|
|
|
|
string_list_split(&l, arg, ',', -1);
|
|
|
|
|
|
|
|
for_each_string_list_item(i, &l) {
|
|
|
|
struct strbuf sb = STRBUF_INIT;
|
|
|
|
strbuf_addstr(&sb, i->string);
|
|
|
|
strbuf_trim(&sb);
|
|
|
|
|
2018-11-23 12:16:52 +01:00
|
|
|
if (!strcmp(sb.buf, "no"))
|
|
|
|
ret = 0;
|
|
|
|
else if (!strcmp(sb.buf, "ignore-space-change"))
|
2018-07-17 01:05:40 +02:00
|
|
|
ret |= XDF_IGNORE_WHITESPACE_CHANGE;
|
|
|
|
else if (!strcmp(sb.buf, "ignore-space-at-eol"))
|
|
|
|
ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
|
|
|
|
else if (!strcmp(sb.buf, "ignore-all-space"))
|
|
|
|
ret |= XDF_IGNORE_WHITESPACE;
|
diff.c: add white space mode to move detection that allows indent changes
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.
To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough. However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".
As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.
This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.
As this is a white space mode, it is made exclusive to other white space
modes in the move detection.
This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):
- A
- B
- C
+ A
+ B
+ C
At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.
Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:
- <TAB> A
...
+ A <TAB>
...
As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:
- A
- B
- A
- B
+ A
+ B
+ A
+ B
When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18 21:31:55 +02:00
|
|
|
else if (!strcmp(sb.buf, "allow-indentation-change"))
|
|
|
|
ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
|
2018-11-13 22:33:57 +01:00
|
|
|
else {
|
|
|
|
ret |= COLOR_MOVED_WS_ERROR;
|
|
|
|
error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), sb.buf);
|
|
|
|
}
|
2018-07-17 01:05:40 +02:00
|
|
|
|
|
|
|
strbuf_release(&sb);
|
|
|
|
}
|
|
|
|
|
diff.c: add white space mode to move detection that allows indent changes
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.
To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough. However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".
As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.
This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.
As this is a white space mode, it is made exclusive to other white space
modes in the move detection.
This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):
- A
- B
- C
+ A
+ B
+ C
At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.
Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:
- <TAB> A
...
+ A <TAB>
...
As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:
- A
- B
- A
- B
+ A
+ B
+ A
+ B
When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18 21:31:55 +02:00
|
|
|
if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
|
2018-11-13 22:33:57 +01:00
|
|
|
(ret & XDF_WHITESPACE_FLAGS)) {
|
2019-01-29 21:47:53 +01:00
|
|
|
error(_("color-moved-ws: allow-indentation-change cannot be combined with other whitespace modes"));
|
2018-11-13 22:33:57 +01:00
|
|
|
ret |= COLOR_MOVED_WS_ERROR;
|
|
|
|
}
|
diff.c: add white space mode to move detection that allows indent changes
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.
To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough. However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".
As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.
This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.
As this is a white space mode, it is made exclusive to other white space
modes in the move detection.
This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):
- A
- B
- C
+ A
+ B
+ C
At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.
Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:
- <TAB> A
...
+ A <TAB>
...
As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:
- A
- B
- A
- B
+ A
+ B
+ A
+ B
When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18 21:31:55 +02:00
|
|
|
|
2018-07-17 01:05:40 +02:00
|
|
|
string_list_clear(&l, 0);
|
|
|
|
|
|
|
|
return ret;
|
diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):
[OM] -void sensitive_stuff(void)
[OM] -{
[OM] - if (!is_authorized_user())
[OM] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OM] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NM] + sensitive_stuff(spanning,
[NM] + multiple,
[NM] + lines);
[NM] +}
However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:
[OM] -void sensitive_stuff(void)
[OM] -{
[OMA] - if (!is_authorized_user())
[OMA] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OMA] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NMA] + sensitive_stuff(spanning,
[NMA] + multiple,
[NMA] + lines);
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NMA] +}
If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.
This patch implements the first mode:
* basic alternating 'Zebra' mode
This conveys all information needed to the user. Defer customization to
later patches.
First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').
Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.
It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.
While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.
For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 22:53:07 +02:00
|
|
|
}
|
|
|
|
|
2008-05-14 19:46:53 +02:00
|
|
|
int git_diff_ui_config(const char *var, const char *value, void *cb)
|
diff --color: use $GIT_DIR/config
This lets you use something like this in your $GIT_DIR/config
file.
[diff]
color = auto
[diff.color]
new = blue
old = yellow
frag = reverse
When diff.color is set to "auto", colored diff is enabled when
the standard output is the terminal. Other choices are "always",
and "never". Usual boolean true/false can also be used.
The colormap entries can specify colors for the following slots:
plain - lines that appear in both old and new file (context)
meta - diff --git header and extended git diff headers
frag - @@ -n,m +l,k @@ lines (hunk header)
old - lines deleted from old file
new - lines added to new file
The following color names can be used:
normal, bold, dim, l, blink, reverse, reset,
black, red, green, yellow, blue, magenta, cyan,
white
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 13:06:23 +02:00
|
|
|
{
|
2006-12-13 10:13:28 +01:00
|
|
|
if (!strcmp(var, "diff.color") || !strcmp(var, "color.diff")) {
|
2011-08-18 07:03:48 +02:00
|
|
|
diff_use_color_default = git_config_colorbool(var, value);
|
diff --color: use $GIT_DIR/config
This lets you use something like this in your $GIT_DIR/config
file.
[diff]
color = auto
[diff.color]
new = blue
old = yellow
frag = reverse
When diff.color is set to "auto", colored diff is enabled when
the standard output is the terminal. Other choices are "always",
and "never". Usual boolean true/false can also be used.
The colormap entries can specify colors for the following slots:
plain - lines that appear in both old and new file (context)
meta - diff --git header and extended git diff headers
frag - @@ -n,m +l,k @@ lines (hunk header)
old - lines deleted from old file
new - lines added to new file
The following color names can be used:
normal, bold, dim, l, blink, reverse, reset,
black, red, green, yellow, blue, magenta, cyan,
white
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 13:06:23 +02:00
|
|
|
return 0;
|
|
|
|
}
|
diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):
[OM] -void sensitive_stuff(void)
[OM] -{
[OM] - if (!is_authorized_user())
[OM] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OM] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NM] + sensitive_stuff(spanning,
[NM] + multiple,
[NM] + lines);
[NM] +}
However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:
[OM] -void sensitive_stuff(void)
[OM] -{
[OMA] - if (!is_authorized_user())
[OMA] - die("unauthorized");
[OM] - sensitive_stuff(spanning,
[OM] - multiple,
[OM] - lines);
[OMA] -}
void another_function()
{
[del] - printf("foo");
[add] + printf("bar");
}
[NM] +void sensitive_stuff(void)
[NM] +{
[NMA] + sensitive_stuff(spanning,
[NMA] + multiple,
[NMA] + lines);
[NM] + if (!is_authorized_user())
[NM] + die("unauthorized");
[NMA] +}
If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.
This patch implements the first mode:
* basic alternating 'Zebra' mode
This conveys all information needed to the user. Defer customization to
later patches.
First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').
Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.
It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.
While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.
For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 22:53:07 +02:00
|
|
|
if (!strcmp(var, "diff.colormoved")) {
|
|
|
|
int cm = parse_color_moved(value);
|
|
|
|
if (cm < 0)
|
|
|
|
return -1;
|
|
|
|
diff_color_moved_default = cm;
|
|
|
|
return 0;
|
|
|
|
}
|
2018-07-18 21:31:56 +02:00
|
|
|
if (!strcmp(var, "diff.colormovedws")) {
|
2018-11-13 22:33:57 +01:00
|
|
|
unsigned cm = parse_color_moved_ws(value);
|
|
|
|
if (cm & COLOR_MOVED_WS_ERROR)
|
2018-07-18 21:31:56 +02:00
|
|
|
return -1;
|
|
|
|
diff_color_moved_ws_default = cm;
|
|
|
|
return 0;
|
|
|
|
}
|
2012-09-27 21:12:52 +02:00
|
|
|
if (!strcmp(var, "diff.context")) {
|
|
|
|
diff_context_default = git_config_int(var, value);
|
|
|
|
if (diff_context_default < 0)
|
|
|
|
return -1;
|
|
|
|
return 0;
|
|
|
|
}
|
2017-01-12 13:21:11 +01:00
|
|
|
if (!strcmp(var, "diff.interhunkcontext")) {
|
|
|
|
diff_interhunk_context_default = git_config_int(var, value);
|
|
|
|
if (diff_interhunk_context_default < 0)
|
|
|
|
return -1;
|
|
|
|
return 0;
|
|
|
|
}
|
2006-07-07 13:01:23 +02:00
|
|
|
if (!strcmp(var, "diff.renames")) {
|
2009-04-09 20:46:15 +02:00
|
|
|
diff_detect_rename_default = git_config_rename(var, value);
|
2006-07-07 13:01:23 +02:00
|
|
|
return 0;
|
|
|
|
}
|
2007-08-31 22:13:42 +02:00
|
|
|
if (!strcmp(var, "diff.autorefreshindex")) {
|
|
|
|
diff_auto_refresh_index = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
2008-08-19 05:08:09 +02:00
|
|
|
if (!strcmp(var, "diff.mnemonicprefix")) {
|
|
|
|
diff_mnemonic_prefix = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
2010-05-03 04:03:41 +02:00
|
|
|
if (!strcmp(var, "diff.noprefix")) {
|
|
|
|
diff_no_prefix = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
2020-05-22 12:46:18 +02:00
|
|
|
if (!strcmp(var, "diff.relative")) {
|
|
|
|
diff_relative = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
2012-03-01 13:26:46 +01:00
|
|
|
if (!strcmp(var, "diff.statgraphwidth")) {
|
|
|
|
diff_stat_graph_width = git_config_int(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
2008-07-05 07:24:43 +02:00
|
|
|
if (!strcmp(var, "diff.external"))
|
|
|
|
return git_config_string(&external_diff_cmd_cfg, var, value);
|
2009-01-21 04:46:57 +01:00
|
|
|
if (!strcmp(var, "diff.wordregex"))
|
|
|
|
return git_config_string(&diff_word_regex_cfg, var, value);
|
2013-12-19 01:08:12 +01:00
|
|
|
if (!strcmp(var, "diff.orderfile"))
|
|
|
|
return git_config_pathname(&diff_order_file_cfg, var, value);
|
2007-04-23 02:52:55 +02:00
|
|
|
|
2010-08-05 10:49:55 +02:00
|
|
|
if (!strcmp(var, "diff.ignoresubmodules"))
|
|
|
|
handle_ignore_submodules_arg(&default_diff_options, value);
|
|
|
|
|
2012-11-13 16:42:45 +01:00
|
|
|
if (!strcmp(var, "diff.submodule")) {
|
|
|
|
if (parse_submodule_params(&default_diff_options, value))
|
|
|
|
warning(_("Unknown value for 'diff.submodule' config variable: '%s'"),
|
|
|
|
value);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-01-16 08:51:57 +01:00
|
|
|
if (!strcmp(var, "diff.algorithm")) {
|
|
|
|
diff_algorithm = parse_algorithm_value(value);
|
|
|
|
if (diff_algorithm < 0)
|
|
|
|
return -1;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-10-13 19:24:31 +02:00
|
|
|
if (git_color_config(var, value, cb) < 0)
|
|
|
|
return -1;
|
|
|
|
|
2008-05-14 19:46:53 +02:00
|
|
|
return git_diff_basic_config(var, value, cb);
|
2008-01-04 09:59:34 +01:00
|
|
|
}
|
|
|
|
|
2008-05-14 19:46:53 +02:00
|
|
|
int git_diff_basic_config(const char *var, const char *value, void *cb)
|
2008-01-04 09:59:34 +01:00
|
|
|
{
|
use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-18 21:47:50 +02:00
|
|
|
const char *name;
|
|
|
|
|
2008-08-05 20:27:30 +02:00
|
|
|
if (!strcmp(var, "diff.renamelimit")) {
|
|
|
|
diff_rename_limit_default = git_config_int(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
drop odd return value semantics from userdiff_config
When the userdiff_config function was introduced in be58e70
(diff: unify external diff and funcname parsing code,
2008-10-05), it used a return value convention unlike any
other config callback. Like other callbacks, it used "-1" to
signal error. But it returned "1" to indicate that it found
something, and "0" otherwise; other callbacks simply
returned "0" to indicate that no error occurred.
This distinction was necessary at the time, because the
userdiff namespace overlapped slightly with the color
configuration namespace. So "diff.color.foo" could mean "the
'foo' slot of diff coloring" or "the 'foo' component of the
"color" userdiff driver". Because the color-parsing code
would die on an unknown color slot, we needed the userdiff
code to indicate that it had matched the variable, letting
us bypass the color-parsing code entirely.
Later, in 8b8e862 (ignore unknown color configuration,
2009-12-12), the color-parsing code learned to silently
ignore unknown slots. This means we no longer need to
protect userdiff-matched variables from reaching the
color-parsing code.
We can therefore change the userdiff_config calling
convention to a more normal one. This drops some code from
each caller, which is nice. But more importantly, it reduces
the cognitive load for readers who may wonder why
userdiff_config is unlike every other config callback.
There's no need to add a new test confirming that this
works; t4020 already contains a test that sets
diff.color.external.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-07 19:23:02 +01:00
|
|
|
if (userdiff_config(var, value) < 0)
|
|
|
|
return -1;
|
2008-10-26 05:45:55 +01:00
|
|
|
|
use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-18 21:47:50 +02:00
|
|
|
if (skip_prefix(var, "diff.color.", &name) ||
|
|
|
|
skip_prefix(var, "color.diff.", &name)) {
|
|
|
|
int slot = parse_diff_color_slot(name);
|
ignore unknown color configuration
When parsing the config file, if there is a value that is
syntactically correct but unused, we generally ignore it.
This lets non-core porcelains store arbitrary information in
the config file, and it means that configuration files can
be shared between new and old versions of git (the old
versions might simply ignore certain configuration).
The one exception to this is color configuration; if we
encounter a color.{diff,branch,status}.$slot variable, we
die if it is not one of the recognized slots (presumably as
a safety valve for user misconfiguration). This behavior
has existed since 801235c (diff --color: use
$GIT_DIR/config, 2006-06-24), but hasn't yet caused a
problem. No porcelain has wanted to store extra colors, and
we once a color area (like color.diff) has been introduced,
we've never changed the set of color slots.
However, that changed recently with the addition of
color.diff.func. Now a user with color.diff.func in their
config can no longer freely switch between v1.6.6 and older
versions; the old versions will complain about the existence
of the variable.
This patch loosens the check to match the rest of
git-config; unknown color slots are simply ignored. This
doesn't fix this particular problem, as the older version
(without this patch) is the problem, but it at least
prevents it from happening again in the future.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-12 13:25:24 +01:00
|
|
|
if (slot < 0)
|
|
|
|
return 0;
|
2008-02-11 19:53:56 +01:00
|
|
|
if (!value)
|
|
|
|
return config_error_nonbool(var);
|
2014-10-07 21:33:09 +02:00
|
|
|
return color_parse(value, diff_colors[slot]);
|
diff --color: use $GIT_DIR/config
This lets you use something like this in your $GIT_DIR/config
file.
[diff]
color = auto
[diff.color]
new = blue
old = yellow
frag = reverse
When diff.color is set to "auto", colored diff is enabled when
the standard output is the terminal. Other choices are "always",
and "never". Usual boolean true/false can also be used.
The colormap entries can specify colors for the following slots:
plain - lines that appear in both old and new file (context)
meta - diff --git header and extended git diff headers
frag - @@ -n,m +l,k @@ lines (hunk header)
old - lines deleted from old file
new - lines added to new file
The following color names can be used:
normal, bold, dim, l, blink, reverse, reset,
black, red, green, yellow, blue, magenta, cyan,
white
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 13:06:23 +02:00
|
|
|
}
|
2007-04-23 02:52:55 +02:00
|
|
|
|
2020-01-31 10:57:49 +01:00
|
|
|
if (!strcmp(var, "diff.wserrorhighlight")) {
|
|
|
|
int val = parse_ws_error_highlight(value);
|
|
|
|
if (val < 0)
|
|
|
|
return -1;
|
|
|
|
ws_error_highlight_default = val;
|
|