Commit Graph

267 Commits

Author SHA1 Message Date
Junio C Hamano 6901ffe80c Merge branch 'jc/diff-s-with-other-options'
The "-s" (silent, squelch) option of the "diff" family of commands
did not interact with other options that specify the output format
well.  This has been cleaned up so that it will clear all the
formatting options given before.

* jc/diff-s-with-other-options:
  diff: fix interaction between the "-s" option and other options
2023-06-13 12:29:45 -07:00
Junio C Hamano 9d484b92ed diff: fix interaction between the "-s" option and other options
Sergey Organov noticed and reported "--patch --no-patch --raw"
behaves differently from just "--raw".  It turns out that there are
a few interesting bugs in the implementation and documentation.

 * First, the documentation for "--no-patch" was unclear that it
   could be read to mean "--no-patch" countermands an earlier
   "--patch" but not other things.  The intention of "--no-patch"
   ever since it was introduced at d09cd15d (diff: allow --no-patch
   as synonym for -s, 2013-07-16) was to serve as a synonym for
   "-s", so "--raw --patch --no-patch" should have produced no
   output, but it can be (mis)read to allow showing only "--raw"

 * Then the interaction between "-s" and other format options were
   poorly implemented.  Modern versions of Git uses one bit each to
   represent formatting options like "--patch", "--stat" in a single
   output_format word, but for historical reasons, "-s" also is
   represented as another bit in the same word.  This allows two
   interesting bugs to happen, and we have both X-<.

   (1) After setting a format bit, then setting NO_OUTPUT with "-s",
       the code to process another "--<format>" option drops the
       NO_OUTPUT bit to allow output to be shown again.  However,
       the code to handle "-s" only set NO_OUTPUT without unsetting
       format bits set earlier, so the earlier format bit got
       revealed upon seeing the second "--<format>" option.  This is
       the problem Sergey observed.

   (2) After setting NO_OUTPUT with "-s", code to process
       "--<format>" option can forget to unset NO_OUTPUT, leaving
       the command still silent.

It is tempting to change the meaning of "--no-patch" to mean
"disable only the patch format output" and reimplement "-s" as "not
showing anything", but it would be an end-user visible change in
behavior.  Let's fix the interactions of these bits to first make
"-s" work as intended.

The fix is conceptually very simple.

 * Whenever we set DIFF_FORMAT_FOO because we saw the "--foo"
   option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is
   given), we make sure we drop DIFF_FORMAT_NO_OUTPUT.  We forgot to
   do so in some of the options and caused (2) above.

 * When processing "-s" option, we should not just set
   DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits.
   We didn't do so and retained format bits set by options
   previously seen, causing (1) above.

It is even more tempting to lose NO_OUTPUT bit and instead take
output_format word being 0 as its replacement, but that would break
the mechanism "git show" uses to default to "--patch" output, where
the distinction between telling the command to be silent with "-s"
and having no output format specified on the command line matters,
and an explicit output format given on the command line should not
be "combined" with the default "--patch" format.

So, while we cannot lose the NO_OUTPUT bit, as a follow-up work, we
may want to replace it with OPTION_GIVEN bit, and

 * make "--patch", "--raw", etc. set DIFF_FORMAT_$format bit and
   DIFF_FORMAT_OPTION_GIVEN bit on for each format.  "--no-raw",
   etc. will set off DIFF_FORMAT_$format bit but still record the
   fact that we saw an option from the command line by setting

 * make "-s" (and its synonym "--no-patch") clear all other bits
   and set only the DIFF_FORMAT_OPTION_GIVEN bit on.

which I suspect would make the code much cleaner without breaking
any end-user expectations.

Once that is in place, transitioning "--no-patch" to mean the
counterpart of "--patch", just like "--no-raw" only defeats an
earlier "--raw", would be quite simple at the code level.  The
social cost of migrating the end-user expectations might be too
great for it to be worth, but at least the "GIVEN" bit clean-up
alone may be worth it.

Signed-off-by: Junio C Hamano <>
2023-05-05 14:24:32 -07:00
Jeff King b39a569729 diff: add --default-prefix option
You can change the output of prefixes with diff.noprefix and
diff.mnemonicprefix, but there's no easy way to override them from the
command-line. We do have "--no-prefix", but there's no way to get back
to the default prefix. So let's add an option to do that.

Signed-off-by: Jeff King <>
Signed-off-by: Junio C Hamano <>
2023-03-09 08:32:21 -08:00
John Cai ebdc46c242 docs: link generating patch sections
Currently, in the git-log documentation, the reference to generating
patches does not match the section title. This can make the section
"Generating patch text with -p" hard to find, since typically readers of
the documentation will copy and paste to search the page.

Let's make this more convenient for readers by linking it directly to
the section.

Since git-log pulls in diff-generate-patch.txt, we can provide a direct
link to the section. Otherwise, change the verbiage to match exactly
what the section title is, to at least make searching for it an easier

Signed-off-by: John Cai <>
Signed-off-by: Junio C Hamano <>
2023-01-13 12:55:14 -08:00
Junio C Hamano 9a160990ef Merge branch 'js/diff-filter-negation-fix'
"git diff --diff-filter=aR" is now parsed correctly.

* js/diff-filter-negation-fix:
  diff-filter: be more careful when looking for negative bits
  diff.c: move the diff filter bits definitions up a bit
  docs(diff): lose incorrect claim about `diff-files --diff-filter=A`
2022-02-16 15:14:30 -08:00
Elijah Newren db757e8b8d show, log: provide a --remerge-diff capability
When this option is specified, we remerge all (two parent) merge commits
and diff the actual merge commit to the automatically created version,
in order to show how users removed conflict markers, resolved the
different conflict versions, and potentially added new changes outside
of conflict regions in order to resolve semantic merge problems (or,
possibly, just to hide other random changes).

This capability works by creating a temporary object directory and
marking it as the primary object store.  This makes it so that any blobs
or trees created during the automatic merge are easily removable
afterwards by just deleting all objects from the temporary object

There are a few ways that this implementation is suboptimal:
  * `log --remerge-diff` becomes slow, because the temporary object
    directory can fill with many loose objects while running
  * the log output can be muddied with misplaced "warning: cannot merge
    binary files" messages, since ll-merge.c unconditionally writes those
    messages to stderr while running instead of allowing callers to
    manage them.
  * important conflict and warning messages are simply dropped; thus for
    conflicts like modify/delete or rename/rename or file/directory which
    are not representable with content conflict markers, there may be no
    way for a user of --remerge-diff to know that there had been a
    conflict which was resolved (and which possibly motivated other
    changes in the merge commit).
  * when fixing the previous issue, note that some unimportant conflict
    and warning messages might start being included.  We should instead
    make sure these remain dropped.
Subsequent commits will address these issues.

Signed-off-by: Elijah Newren <>
Signed-off-by: Junio C Hamano <>
2022-02-02 10:02:27 -08:00
Johannes Schindelin d843e319f8 docs(diff): lose incorrect claim about `diff-files --diff-filter=A`
Originally, before we had `--intent-to-add`, there was no way that `git
diff-files` could see added files: if a file did not exist in the index,
`git diff-files` would not show it because it looks only at worktree
files when there is an index entry at the same path.

We used this example in the documentation of the diff options to explain
that not every `--diff-filter=<option>` has an effect in all scenarios.

Even when we added `--intent-to-add`, the comment was still correct,
because initially we showed such files as modified instead of added.

However, when that bug was fixed in feea6946a5 (diff-files: treat
"i-t-a" files as "not-in-index", 2020-06-20), the comment in the
documentation became incorrect.

Let's just remove it.

Signed-off-by: Johannes Schindelin <>
Signed-off-by: Junio C Hamano <>
2022-01-28 10:18:17 -08:00
Junio C Hamano 4c90d8908a Merge branch 'jn/log-m-does-not-imply-p'
Earlier "git log -m" was changed to always produce patch output,
which would break existing scripts, which has been reverted.

* jn/log-m-does-not-imply-p:
  Revert 'diff-merges: let "-m" imply "-p"'
2021-08-11 12:36:18 -07:00
Jonathan Nieder 6a38e33331 Revert 'diff-merges: let "-m" imply "-p"'
This reverts commit f5bfcc823b, which
made "git log -m" imply "--patch" by default.  The logic was that
"-m", which makes diff generation for merges perform a diff against
each parent, has no use unless I am viewing the diff, so we could save
the user some typing by turning on display of the resulting diff
automatically.  That wasn't expected to adversely affect scripts
because scripts would either be using a command like "git diff-tree"
that already emits diffs by default or would be combining -m with a
diff generation option such as --name-status.  By saving typing for
interactive use without adversely affecting scripts in the wild, it
would be a pure improvement.

The problem is that although diff generation options are only relevant
for the displayed diff, a script author can imagine them affecting
path limiting.  For example, I might run

	git log -w --format=%H -- README

hoping to list commits that edited README, excluding whitespace-only
changes.  In fact, a whitespace-only change is not TREESAME so the use
of -w here has no effect (since we don't apply these diff generation
flags to the diff_options struct rev_info::pruning used for this
purpose), but the documentation suggests that it should work

	Suppose you specified foo as the <paths>. We shall call
	commits that modify foo !TREESAME, and the rest TREESAME. (In
	a diff filtered for foo, they look different and equal,

and a script author who has not tested whitespace-only changes
wouldn't notice.

Similarly, a script author could include

	git log -m --first-parent --format=%H -- README

to filter the first-parent history for commits that modified README.
The -m is a no-op but it reflects the script author's intent.  For
example, until 1e20a407fe (stash list: stop passing "-m" to "git
log", 2021-05-21), "git stash list" did this.

As a result, we can't safely change "-m" to imply "-p" without fear of
breaking such scripts.  Restore the previous behavior.

Noticed because Rust's src/bootstrap/ made use of this
same construct:  That
script has been updated to omit the unnecessary "-m" option, but we
can expect other scripts in the wild to have similar expectations.

Signed-off-by: Jonathan Nieder <>
Signed-off-by: Junio C Hamano <>
2021-08-09 13:52:01 -07:00
Elijah Newren 9dd29dbef0 diffcore-rename: treat a rename_limit of 0 as unlimited
In commit 89973554b5 (diffcore-rename: make diff-tree -l0 mean
-l<large>, 2017-11-29), -l0 was given a special magical "large" value,
but one which was not large enough for some uses (as can be seen from
commit 9f7e4bfa3b (diff: remove silent clamp of renameLimit,
2017-11-13).  Make 0 (or a negative value) be treated as unlimited
instead and update the documentation to mention this.

Signed-off-by: Elijah Newren <>
Signed-off-by: Junio C Hamano <>
2021-07-15 16:54:24 -07:00
Elijah Newren 6623a528e0 doc: clarify documentation for rename/copy limits
A few places in the docs implied that rename/copy detection is always
quadratic or that all (unpaired) files were involved in the quadratic
portion of rename/copy detection.  The following two commits each
introduced an exception to this:

    9027f53cb5 (Do linear-time/space rename logic for exact renames,
    bd24aa2f97 (diffcore-rename: guide inexact rename detection based
                  on basenames, 2021-02-14)

(As a side note, for copy detection, the basename guided inexact rename
detection is turned off and the exact renames will only result in
sources (without the dests) being removed from the set of files used in
quadratic detection.  So, for copy detection, the documentation was
closer to correct.)

Avoid implying that all files involved in rename/copy detection are
subject to the full quadratic algorithm.  While at it, also note the
default values for all these settings.

Signed-off-by: Elijah Newren <>
Signed-off-by: Junio C Hamano <>
2021-07-15 16:54:24 -07:00
Junio C Hamano 8e444e66df Merge branch 'so/log-m-implies-p'
The "-m" option in "git log -m" that does not specify which format,
if any, of diff is desired did not have any visible effect; it now
implies some form of diff (by default "--patch") is produced.

* so/log-m-implies-p:
  diff-merges: let "-m" imply "-p"
  diff-merges: rename "combined_imply_patch" to "merges_imply_patch"
  stash list: stop passing "-m" to "git log"
  git-svn: stop passing "-m" to "git rev-list"
  diff-merges: move specific diff-index "-m" handling to diff-index
  t4013: test "git diff-index -m"
  t4013: test "git diff-tree -m"
  t4013: test "git log -m --stat"
  t4013: test "git log -m --raw"
  t4013: test that "-m" alone has no effect in "git log"
2021-06-14 13:33:27 +09:00
Sergey Organov f5bfcc823b diff-merges: let "-m" imply "-p"
Fix long standing inconsistency between -c/--cc that do imply -p on
one side, and -m that did not imply -p on the other side.

Change corresponding test accordingly, as "log -m" output should now
match one from "log -m -p", rather than from just "log".

Change documentation accordingly.


After this patch

  git log -m

produces diffs without need to provide -p as well, that improves both
consistency and usability. It gets even more useful if one sets
"log.diffMerges" configuration variable to "first-parent" to force -m
produce usual diff with respect to first parent only.

This patch, however, does not change behavior when specific diff
format is explicitly provided on the command-line, so that commands

  git log -m --raw
  git log -m --stat

are not affected, nor does it change commands where specific diff
format is active by default, such as:

  git diff-tree -m

It's also worth to be noticed that exact historical semantics of -m is
still provided by --diff-merges=separate.

Signed-off-by: Sergey Organov <>
Signed-off-by: Junio C Hamano <>
2021-05-21 09:24:14 +09:00
Junio C Hamano 93e0b28dbb Merge branch 'ab/pathname-encoding-doc'
Clarify that pathnames recorded in Git trees are most often (but
not necessarily) encoded in UTF-8.

* ab/pathname-encoding-doc:
  doc: clarify the filename encoding in git diff
2021-04-30 13:50:27 +09:00
Andrey Bienkowski 9364bf465d doc: clarify the filename encoding in git diff
AFAICT parsing the output of `git diff --name-only master...feature`
is the intended way of programmatically getting the list of files
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and

Signed-off-by: Junio C Hamano <>
2021-04-20 12:57:26 -07:00
Sergey Organov 364bc11fe5 doc/diff-options: document new --diff-merges features
Document changes in -m and --diff-merges=m semantics, as well as new
--diff-merges=on option.

Signed-off-by: Sergey Organov <>
Signed-off-by: Junio C Hamano <>
2021-04-16 23:38:35 -07:00
Junio C Hamano 1eb4136ac2 diff: --{rotate,skip}-to=<path>
In the implementation of "git difftool", there is a case where the
user wants to start viewing the diffs at a specific path and
continue on to the rest, optionally wrapping around to the
beginning.  Since it is somewhat cumbersome to implement such a
feature as a post-processing step of "git diff" output, let's
support it internally with two new options.

 - "git diff --rotate-to=C", when the resulting patch would show
   paths A B C D E without the option, would "rotate" the paths to
   shows patch to C D E A B instead.  It is an error when there is
   no patch for C is shown.

 - "git diff --skip-to=C" would instead "skip" the paths before C,
   and shows patch to C D E.  Again, it is an error when there is no
   patch for C is shown.

 - "git log [-p]" also accepts these two options, but it is not an
   error if there is no change to the specified path.  Instead, the
   set of output paths are rotated or skipped to the specified path
   or the first path that sorts after the specified path.

Signed-off-by: Junio C Hamano <>
2021-02-16 09:30:42 -08:00
Junio C Hamano aac006aa99 Merge branch 'so/log-diff-merge'
"git log" learned a new "--diff-merges=<how>" option.

* so/log-diff-merge: (32 commits)
  t4013: add tests for --diff-merges=first-parent
  doc/git-show: include --diff-merges description
  doc/rev-list-options: document --first-parent changes merges format
  doc/diff-generate-patch: mention new --diff-merges option
  doc/git-log: describe new --diff-merges options
  diff-merges: add '--diff-merges=1' as synonym for 'first-parent'
  diff-merges: add old mnemonic counterparts to --diff-merges
  diff-merges: let new options enable diff without -p
  diff-merges: do not imply -p for new options
  diff-merges: implement new values for --diff-merges
  diff-merges: make -m/-c/--cc explicitly mutually exclusive
  diff-merges: refactor opt settings into separate functions
  diff-merges: get rid of now empty diff_merges_init_revs()
  diff-merges: group diff-merge flags next to each other inside 'rev_info'
  diff-merges: split 'ignore_merges' field
  diff-merges: fix -m to properly override -c/--cc
  t4013: add tests for -m failing to override -c/--cc
  t4013: support test_expect_failure through ':failure' magic
  diff-merges: revise revs->diff flag handling
  diff-merges: handle imply -p on -c/--cc logic for log.c
2021-02-05 16:40:44 -08:00
Sergey Organov 1d24509b7b doc/git-show: include --diff-merges description
Move description of --diff-merges option from git-log.txt to
diff-options.txt so that it is included in the git-show help.

Signed-off-by: Sergey Organov <>
Signed-off-by: Junio C Hamano <>
2020-12-21 13:47:32 -08:00
Junio C Hamano 3f6dc9c366 Merge branch 'pb/blame-funcname-range-userdiff'
"git blame -L :funcname -- path" did not work well for a path for
which a userdiff driver is defined.

* pb/blame-funcname-range-userdiff:
  blame: simplify 'setup_blame_bloom_data' interface
  blame: simplify 'setup_scoreboard' interface
  blame: enable funcname blaming with userdiff driver
  line-log: mention both modes in 'blame' and 'log' short help
  doc: add more pointers to gitattributes(5) for userdiff
  blame-options.txt: also mention 'funcname' in '-L' description
  doc: line-range: improve formatting
  doc: log, gitk: move '-L' description to 'line-range-options.txt'
2020-11-18 13:32:53 -08:00
Junio C Hamano ee13bebbd5 Merge branch 'jc/abbrev-doc'
The documentation on the "--abbrev=<n>" option did not say the
output may be longer than "<n>" hexdigits, which has been

* jc/abbrev-doc:
  doc: clarify that --abbrev=<n> is about the minimum length
2020-11-11 13:18:38 -08:00
Junio C Hamano c5a802f0ce Merge branch 'so/format-patch-doc-on-default-diff-format'

* so/format-patch-doc-on-default-diff-format:
  doc/diff-options: fix out of place mentions of '--patch/-p'
2020-11-11 13:18:37 -08:00
Junio C Hamano cda34e0d0c doc: clarify that --abbrev=<n> is about the minimum length
Early text written in 2006 explains the "--abbrev=<n>" option to
"show only a partial prefix", without saying that the length of the
partial prefix is not necessarily the number given to the option to
ensure that the output names the object uniquely.

Update documentation for the diff family of commands, "blame",
"branch --verbose", "ls-files" and "ls-tree" to stress that the
short prefix must uniquely refer to an object, and <n> is merely
the mininum number of hexdigits used in the prefix.

Helped-by: Derrick Stolee <>
Signed-off-by: Junio C Hamano <>
2020-11-04 14:04:44 -08:00
Junio C Hamano 1ae0949a03 Merge branch 'mk/diff-ignore-regex'
"git diff" family of commands learned the "-I<regex>" option to
ignore hunks whose changed lines all match the given pattern.

* mk/diff-ignore-regex:
  diff: add -I<regex> that ignores matching changes
  merge-base, xdiff: zero out xpparam_t structures
2020-11-02 13:17:44 -08:00
Philippe Blain 0cce88f1e4 doc: add more pointers to gitattributes(5) for userdiff
Several Git commands can make use of the builtin userdiff patterns, but
it's not obvious in the documentation. Add pointers to the 'Defining a
custom hunk header' part of gitattributes(5) in the description of the
following options:

- the '--function-context' option of `git diff` and friends
- the '--function-context' option of `git grep`
- the '-L :<funcname>' option of `git log`, `gitk` and `git blame`

In 'git-grep.txt', take the opportunity to use backticks in the
description of '--show-function', and improve the wording of the
desription of '--function-context'.

Signed-off-by: Philippe Blain <>
Signed-off-by: Junio C Hamano <>
2020-11-01 15:54:14 -08:00
Sergey Organov 714d491af0 doc/diff-options: fix out of place mentions of '--patch/-p'
First, references to --patch and -p appeared in the description of
git-format-patch, where the options themselves are not included.

Next, the description of --unified option elsewhere had duplicate implied
statements: "Implies --patch. Implies -p."

Signed-off-by: Sergey Organov <>
Signed-off-by: Junio C Hamano <>
2020-10-31 13:14:26 -07:00
Michał Kępień 296d4a94e7 diff: add -I<regex> that ignores matching changes
Add a new diff option that enables ignoring changes whose all lines
(changed, removed, and added) match a given regular expression.  This is
similar to the -I/--ignore-matching-lines option in standalone diff
utilities and can be used e.g. to ignore changes which only affect code
comments or to look for unrelated changes in commits containing a large
number of automatically applied modifications (e.g. a tree-wide string
replacement).  The difference between -G/-S and the new -I option is
that the latter filters output on a per-change basis.

Use the 'ignore' field of xdchange_t for marking a change as ignored or
not.  Since the same field is used by --ignore-blank-lines, identical
hunk emitting rules apply for --ignore-blank-lines and -I.  These two
options can also be used together in the same git invocation (they are
complementary to each other).

Rename xdl_mark_ignorable() to xdl_mark_ignorable_lines(), to indicate
that it is logically a "sibling" of xdl_mark_ignorable_regex() rather
than its "parent".

Signed-off-by: Michał Kępień <>
Signed-off-by: Junio C Hamano <>
2020-10-20 12:53:26 -07:00
Junio C Hamano 096c948dab Merge branch 'dd/diff-customize-index-line-abbrev'
The output from the "diff" family of the commands had abbreviated
object names of blobs involved in the patch, but its length was not
affected by the --abbrev option.  Now it is.

* dd/diff-customize-index-line-abbrev:
  diff: index-line: respect --abbrev in object's name
  t4013: improve diff-post-processor logic
2020-08-31 15:49:46 -07:00
Đoàn Trần Công Danh 3046c7f69a diff: index-line: respect --abbrev in object's name
A handful of Git's commands respect `--abbrev' for customizing length
of abbreviation of object names.

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that behaviour is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's allow the blob object names shown on the "index" line to be
abbreviated to arbitrary length given via the "--abbrev" option.

To preserve backward compatibility with old script that specify both
`--full-index' and `--abbrev', always show full object id
if `--full-index' is specified.

Signed-off-by: Đoàn Trần Công Danh <>
Signed-off-by: Junio C Hamano <>
2020-08-21 12:43:05 -07:00
Jeff King 9a6d515fc3 doc/git-log: move "-t" into diff-options list
The "-t" option is infrequently used; it doesn't deserve a spot near the
top of the options list. Let's push it down into the diff-options
include, near the definition of --raw.

We'll protect it with a git-log ifdef, since it doesn't make any sense
for non-tree diff commands. Note that this means it also shows up in
git-show, but that's a good thing; it applies equally well there.

Signed-off-by: Jeff King <>
Signed-off-by: Junio C Hamano <>
2020-07-29 13:44:03 -07:00
Laurent Arnoud c28ded83fc diff: add config option relative
The `diff.relative` boolean option set to `true` shows only changes in
the current directory/value specified by the `path` argument of the
`relative` option and shows pathnames relative to the aforementioned

Teach `--no-relative` to override earlier `--relative`

Add for git-format-patch(1) options documentation `--relative` and

Signed-off-by: Laurent Arnoud <>
Acked-by: Đoàn Trần Công Danh <>
Signed-off-by: Junio C Hamano <>
2020-05-24 16:23:59 -07:00
Martin Ågren 9299f84921 diff-options.txt: avoid "regex" overload in example
When we exemplify the difference between `-G` and `-S` (using
`--pickaxe-regex`), we do so using an example diff and git-diff
invocation involving "regexec", "regexp", "regmatch", ...

The example is correct, but we can make it easier to untangle by
avoiding writing "regex.*" unless it's really needed to make our point.

Use some made-up, non-regexy words instead.

Reported-by: Adam Dinwoodie <>
Signed-off-by: Martin Ågren <>
Reviewed-by: Taylor Blau <>
Signed-off-by: Junio C Hamano <>
2020-02-09 09:25:24 -08:00
Junio C Hamano 8a9a837a63 Merge branch 'nd/diff-parseopt-3'
Third batch to teach the diff machinery to use the parse-options

* nd/diff-parseopt-3:
  diff-parseopt: convert --submodule
  diff-parseopt: convert --ignore-submodules
  diff-parseopt: convert --textconv
  diff-parseopt: convert --ext-diff
  diff-parseopt: convert --quiet
  diff-parseopt: convert --exit-code
  diff-parseopt: convert --color-words
  diff-parseopt: convert --word-diff-regex
  diff-parseopt: convert --word-diff
  diff-parseopt: convert --[no-]color
  diff-parseopt: convert --[no-]follow
  diff-parseopt: convert -R
  diff-parseopt: convert -a|--text
  diff-parseopt: convert --full-index
  diff-parseopt: convert --binary
  diff-parseopt: convert --anchored
  diff-parseopt: convert --diff-algorithm
  diff-parseopt: convert --histogram
  diff-parseopt: convert --patience
  diff-parseopt: convert --[no-]indent-heuristic
2019-04-16 19:28:03 +09:00
Junio C Hamano 4ab0f13857 Merge branch 'nd/diff-parseopt-2'
Second batch to teach the diff machinery to use the parse-options

* nd/diff-parseopt-2: (21 commits)
  diff-parseopt: convert --ignore-some-changes
  diff-parseopt: convert --[no-]minimal
  diff-parseopt: convert --relative
  diff-parseopt: convert --no-renames|--[no--rename-empty
  diff-parseopt: convert --find-copies-harder
  diff-parseopt: convert -C|--find-copies
  diff-parseopt: convert -D|--irreversible-delete
  diff-parseopt: convert -M|--find-renames
  diff-parseopt: convert -B|--break-rewrites
  diff-parseopt: convert --output-*
  diff-parseopt: convert --[no-]compact-summary
  diff-parseopt: convert --stat*
  diff-parseopt: convert -s|--no-patch
  diff-parseopt: convert --name-status
  diff-parseopt: convert --name-only
  diff-parseopt: convert --patch-with-stat
  diff-parseopt: convert --summary
  diff-parseopt: convert --check
  diff-parseopt: convert --dirstat and friends
  diff-parseopt: convert --numstat and --shortstat
2019-03-07 09:59:58 +09:00
Junio C Hamano 54b469b9e9 Merge branch 'nd/diff-parseopt'
The diff machinery, one of the oldest parts of the system, which
long predates the parse-options API, uses fairly long and complex
handcrafted option parser.  This is being rewritten to use the
parse-options API.

* nd/diff-parseopt:
  diff.c: convert --raw
  diff.c: convert -W|--[no-]function-context
  diff.c: convert -U|--unified
  diff.c: convert -u|-p|--patch
  diff.c: prepare to use parse_options() for parsing
  diff.h: avoid bit fields in struct diff_flags
  diff.h: keep forward struct declarations sorted
  parse-options: allow ll_callback with OPTION_CALLBACK
  parse-options: avoid magic return codes
  parse-options: stop abusing 'callback' for lowlevel callbacks
  parse-options: add OPT_BITOP()
  parse-options: disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN
  parse-options: add one-shot mode
  parse-options.h: remove extern on function prototypes
2019-03-07 09:59:52 +09:00
Nguyễn Thái Ngọc Duy 6d9af6f4da diff-parseopt: convert --binary
Signed-off-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Junio C Hamano <>
2019-03-07 08:02:21 +09:00
Nguyễn Thái Ngọc Duy cdc43eb0b8 diff-parseopt: convert --no-renames|--[no--rename-empty
For --rename-empty, see 90d43b0768 (teach diffcore-rename to
optionally ignore empty content - 2012-03-22) for more information.

Signed-off-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Junio C Hamano <>
2019-02-21 15:16:59 -08:00
Nguyễn Thái Ngọc Duy af2f368091 diff-parseopt: convert --output-*
This also validates that the user specifies a single character in
--output-indicator-*, not a string.

Signed-off-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Junio C Hamano <>
2019-02-21 15:16:59 -08:00
Nguyễn Thái Ngọc Duy 4ce7aab5a5 diff-parseopt: convert --dirstat and friends
Signed-off-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Junio C Hamano <>
2019-02-21 15:16:58 -08:00
Junio C Hamano 15b07cba0b Merge branch 'pw/diff-color-moved-ws-fix'
"git diff --color-moved-ws" updates.

* pw/diff-color-moved-ws-fix:
  diff --color-moved-ws: handle blank lines
  diff --color-moved-ws: modify allow-indentation-change
  diff --color-moved-ws: optimize allow-indentation-change
  diff --color-moved=zebra: be stricter with color alternation
  diff --color-moved-ws: fix false positives
  diff --color-moved-ws: demonstrate false positives
  diff: allow --no-color-moved-ws
  Use "whitespace" consistently
  diff: document --no-color-moved
2019-01-29 12:47:53 -08:00
Nguyễn Thái Ngọc Duy d473e2e0e8 diff.c: convert -U|--unified
Signed-off-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Junio C Hamano <>
2019-01-27 16:28:18 -08:00
Phillip Wood b73bcbac4a diff: allow --no-color-moved-ws
Allow --no-color-moved-ws and --color-moved-ws=no to cancel any previous
--color-moved-ws option.

Signed-off-by: Phillip Wood <>
Reviewed-by: Stefan Beller <>
Signed-off-by: Junio C Hamano <>
2019-01-10 10:37:59 -08:00
Phillip Wood 748aa1aa34 Use "whitespace" consistently
Most of the messages and documentation use 'whitespace' rather than
'white space' or 'white spaces' convert to latter two to the former for

Signed-off-by: Phillip Wood <>
Reviewed-by: Stefan Beller <>
Signed-off-by: Junio C Hamano <>
2019-01-10 10:37:42 -08:00
Phillip Wood fbafb7c682 diff: document --no-color-moved
Add documentation for --no-color-moved.

Signed-off-by: Phillip Wood <>
Reviewed-by: Stefan Beller <>
Signed-off-by: Junio C Hamano <>
2019-01-10 10:37:32 -08:00
Thomas Braun e0e7cb8080 log -G: ignore binary files
The -G<regex> option of log looks for the differences whose patch text
contains added/removed lines that match regex.

Currently -G looks also into patches of binary files (which
according to [1]) is binary as well.

This has a couple of issues:

- It makes the pickaxe search slow. In a proprietary repository of the
  author with only ~5500 commits and a total .git size of ~300MB
  searching takes ~13 seconds

    $time git log -Gwave > /dev/null

    real    0m13,241s
    user    0m12,596s
    sys     0m0,644s

  whereas when we ignore binary files with this patch it takes ~4s

    $time ~/devel/git/git log -Gwave > /dev/null

    real    0m3,713s
    user    0m3,608s
    sys     0m0,105s

  which is a speedup of more than fourfold.

- The internally used algorithm for generating patch text is based on
  xdiff and its states in [1]

  > The output format of the binary patch file is proprietary
  > (and binary) and it is basically a collection of copy and insert
  > commands [..]

  which means that the current format could change once the internal
  algorithm is changed as the format is not standardized. In addition
  the git binary patch format used for preparing patches for git apply
  is *different* from the xdiff format as can be seen by comparing

  git log -p -a

    commit 6e95bf4bafccf14650d02ab57f3affe669be10cf
    Author: A U Thor <>
    Date:   Thu Apr 7 15:14:13 2005 -0700

        modify binary file

    diff --git a/data.bin b/data.bin
    index f414c84..edfeb6f 100644
    --- a/data.bin
    +++ b/data.bin
    @@ -1,2 +1,4 @@

  with git log --binary

    commit 6e95bf4bafccf14650d02ab57f3affe669be10cf
    Author: A U Thor <>
    Date:   Thu Apr 7 15:14:13 2005 -0700

        modify binary file

    diff --git a/data.bin b/data.bin
    index f414c84bd3aa25fa07836bb1fb73db784635e24b..edfeb6f501[..]
    GIT binary patch
    literal 12

    literal 6

  which seems unexpected.

To resolve these issues this patch makes -G<regex> ignore binary files
by default. Textconv filters are supported and also -a/--text for
getting the old and broken behaviour back.

The -S<block of text> option of log looks for differences that changes
the number of occurrences of the specified block of text (i.e.
addition/deletion) in a file. As we want to keep the current behaviour,
add a test to ensure it stays that way.


Signed-off-by: Thomas Braun <>
Signed-off-by: Junio C Hamano <>
2018-12-26 14:59:37 -08:00
Junio C Hamano 706b0b5e8d Merge branch 'es/diff-color-moved-fix'
One of the "diff --color-moved" mode "dimmed_zebra" that was named
in an unusual way has been deprecated and replaced by

* es/diff-color-moved-fix:
  diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
2018-08-15 15:08:22 -07:00
Junio C Hamano a81575aa91 Merge branch 'sb/diff-color-move-more'
"git diff --color-moved" feature has further been tweaked.

* sb/diff-color-move-more:
  diff.c: offer config option to control ws handling in move detection
  diff.c: add white space mode to move detection that allows indent changes
  diff.c: factor advance_or_nullify out of mark_color_as_moved
  diff.c: decouple white space treatment from move detection algorithm
  diff.c: add a blocks mode for moved code detection
  diff.c: adjust hash function signature to match hashmap expectation
  diff.c: do not pass diff options as keydata to hashmap
  t4015: avoid git as a pipe input
  xdiff/xdiffi.c: remove unneeded function declarations
  xdiff/xdiff.h: remove unused flags
2018-08-02 15:30:40 -07:00
Eric Sunshine e3f2f5f9cd diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
The --color-moved "dimmed_zebra" mode (with an underscore) is an
anachronism. Most options and modes are hyphenated. It is more difficult
to type and somewhat more difficult to read than those which are
hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
deprecate "dimmed_zebra".

Signed-off-by: Eric Sunshine <>
Reviewed-by: Jonathan Nieder <>
Signed-off-by: Junio C Hamano <>
2018-07-25 14:23:52 -07:00
Stefan Beller 626c0b5d39 diff.c: offer config option to control ws handling in move detection
Signed-off-by: Stefan Beller <>
Signed-off-by: Junio C Hamano <>
2018-07-19 12:02:54 -07:00
Stefan Beller ca1f4ae4df diff.c: add white space mode to move detection that allows indent changes
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.

To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough.  However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".

As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.

This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.

As this is a white space mode, it is made exclusive to other white space
modes in the move detection.

This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):

 - A
 - B
 - C
 +    A
 +    B
 +    C

At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.

Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:

 - <TAB> A
 + A <TAB>

As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:

 - A
 - B
 -   A
 -   B
 +    A
 +    B
 +      A
 +      B

When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is

Signed-off-by: Stefan Beller <>
Signed-off-by: Junio C Hamano <>
2018-07-19 12:02:54 -07:00