Commit Limiting
~~~~~~~~~~~~~~~
Besides specifying a range of commits that should be listed using the
special notations explained in the description, additional commit
limiting may be applied.
Using more options generally further limits the output (e.g.
`--since=<date1>` limits to commits newer than `<date1>`, and using it
with `--grep=<pattern>` further limits to commits whose log message
has a line that matches `<pattern>`), unless otherwise noted.
Note that these are applied before commit
ordering and formatting options, such as `--reverse`.
-<number>::
-n <number>::
--max-count=<number>::
Limit the number of commits to output.
--skip=<number>::
Skip 'number' commits before starting to show the commit output.
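+
As a minimal sketch of how `--skip` and `-n` combine for paging, assuming a throwaway repository of five empty commits (the identity, the branch name `main` and the messages are made up for illustration):
+
```shell
# Illustrative identity and repository; nothing here comes from a real project.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
for i in 1 2 3 4 5; do
	git commit -q --allow-empty -m "commit $i"
done

# Skip the newest commit, then list at most two commits.
page=$(git rev-list --skip=1 -n 2 main)
echo "$page"
```
+
In this linear history the two lines printed should be `main~1` followed by `main~2`.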
--since=<date>::
--after=<date>::
Show commits more recent than a specific date.
--since-as-filter=<date>::
Show all commits more recent than a specific date. This visits
all commits in the range, rather than stopping at the first commit which
is older than a specific date.
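+
The difference from `--since` only shows up when timestamps are out of order. The sketch below assumes a throwaway repository in which the middle commit carries a wrong, much older date; `commit_at` is a hypothetical helper written for this demo, and the guard around `--since-as-filter` is there because older Git versions do not have the option:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main

# commit_at <date> <message>: hypothetical helper, creates an empty commit
# whose author and committer dates are both set to <date>.
commit_at() {
	GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
		git commit -q --allow-empty -m "$2"
}
commit_at "2020-06-01 12:00:00 +0000" "root, newer than the cut-off"
commit_at "2010-01-01 12:00:00 +0000" "skewed clock, looks ten years older"
commit_at "2020-07-01 12:00:00 +0000" "tip"

cutoff="2020-01-01 00:00:00 +0000"
# --since stops at the skewed commit and never reaches the root.
stopped=$(git rev-list --count --since="$cutoff" main)
# --since-as-filter skips the skewed commit but keeps walking.
filtered=$(git rev-list --count --since-as-filter="$cutoff" main 2>/dev/null) || true
echo "--since: $stopped, --since-as-filter: ${filtered:-unsupported}"
```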
--until=<date>::
--before=<date>::
Show commits older than a specific date.
ifdef::git-rev-list[]
--max-age=<timestamp>::
--min-age=<timestamp>::
Limit the commits output to specified time range.
endif::git-rev-list[]
--author=<pattern>::
--committer=<pattern>::
Limit the commits output to ones with author/committer
header lines that match the specified pattern (regular
expression). With more than one `--author=<pattern>`,
commits whose author matches any of the given patterns are
chosen (similarly for multiple `--committer=<pattern>`).
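+
A small sketch of the "any of the given patterns" behaviour, assuming a throwaway repository with three made-up authors and a single committer called `Demo`:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty --author="Alice <alice@example.com>" -m "by alice"
git commit -q --allow-empty --author="Bob <bob@example.com>" -m "by bob"
git commit -q --allow-empty --author="Carol <carol@example.com>" -m "by carol"

# Two --author patterns: a commit is kept if either one matches.
either=$(git rev-list --count --author=Alice --author=Bob main)
# Every commit in this demo was committed by "Demo".
by_committer=$(git rev-list --count --committer=Demo main)
echo "alice or bob: $either, committed by Demo: $by_committer"
```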
--grep-reflog=<pattern>::
Limit the commits output to ones with reflog entries that
match the specified pattern (regular expression). With
more than one `--grep-reflog`, commits whose reflog message
matches any of the given patterns are chosen. It is an
error to use this option unless `--walk-reflogs` is in use.
--grep=<pattern>::
Limit the commits output to ones with a log message that
matches the specified pattern (regular expression). With
more than one `--grep=<pattern>`, commits whose message
matches any of the given patterns are chosen (but see
`--all-match`).
ifndef::git-rev-list[]
+
When `--notes` is in effect, the message from the notes is
matched as if it were part of the log message.
endif::git-rev-list[]
--all-match::
Limit the commits output to ones that match all given `--grep`,
instead of ones that match at least one.
--invert-grep::
Limit the commits output to ones with a log message that does not
match the pattern specified with `--grep=<pattern>`.
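+
How `--grep`, `--all-match` and `--invert-grep` interact can be sketched on a throwaway repository; the three commit messages below are invented for the purpose:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "fix parser"
git commit -q --allow-empty -m "fix lexer"
git commit -q --allow-empty -m "add parser docs"

# Default: a commit matches if any --grep pattern matches.
any=$(git rev-list --count --grep=fix --grep=parser main)
# --all-match: every --grep pattern has to match the message.
all=$(git rev-list --count --all-match --grep=fix --grep=parser main)
# --invert-grep: keep the commits the pattern does not match.
inverted=$(git rev-list --count --invert-grep --grep=fix main)
echo "any: $any, all: $all, inverted: $inverted"
```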
-i::
--regexp-ignore-case::
Match the regular expression limiting patterns without regard to letter
case.
--basic-regexp::
Consider the limiting patterns to be basic regular expressions;
this is the default.
-E::
--extended-regexp::
Consider the limiting patterns to be extended regular expressions
instead of the default basic regular expressions.
-F::
--fixed-strings::
Consider the limiting patterns to be fixed strings (don't interpret
pattern as a regular expression).
-P::
--perl-regexp::
Consider the limiting patterns to be Perl-compatible regular
expressions.
+
Support for these types of regular expressions is an optional
compile-time dependency. If Git wasn't compiled with support for them,
providing this option will cause it to die.
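+
The pattern-type options above change how the same `--grep` argument is read. A minimal sketch, assuming a throwaway repository with made-up messages; `-P` is left out because it depends on how Git was built:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "release v1.0"
git commit -q --allow-empty -m "release v1x0"
git commit -q --allow-empty -m "Release notes"

# As a regular expression, "." matches any character.
regex=$(git rev-list --count --grep='v1.0' main)
# As a fixed string, "." only matches a literal dot.
fixed=$(git rev-list --count -F --grep='v1.0' main)
# Case-insensitive match.
icase=$(git rev-list --count -i --grep='RELEASE' main)
# Extended syntax allows unescaped grouping and alternation.
extended=$(git rev-list --count -E --grep='v1(\.|x)0' main)
echo "regex: $regex, fixed: $fixed, icase: $icase, extended: $extended"
```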
--remove-empty::
Stop when a given path disappears from the tree.
--merges::
Print only merge commits. This is exactly the same as `--min-parents=2`.
--no-merges::
Do not print commits with more than one parent. This is
exactly the same as `--max-parents=1`.
--min-parents=<number>::
--max-parents=<number>::
--no-min-parents::
--no-max-parents::
Show only commits which have at least (or at most) that many parent
commits. In particular, `--max-parents=1` is the same as `--no-merges`,
`--min-parents=2` is the same as `--merges`. `--max-parents=0`
gives all root commits and `--min-parents=3` all octopus merges.
+
`--no-min-parents` and `--no-max-parents` reset these limits (to no limit)
again. Equivalent forms are `--min-parents=0` (any commit has 0 or more
parents) and `--max-parents=-1` (negative numbers denote no upper limit).
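+
The equivalences above can be checked on a small history. This sketch assumes a throwaway repository with one root, one commit on each of `main` and a made-up `topic` branch, and one merge:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "root"
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q main
git commit -q --allow-empty -m "main work"
git merge -q --no-ff -m "merge topic" topic

merges=$(git rev-list --count --merges main)
min2=$(git rev-list --count --min-parents=2 main)
plain=$(git rev-list --count --no-merges main)
max1=$(git rev-list --count --max-parents=1 main)
roots=$(git rev-list --count --max-parents=0 main)
echo "merges: $merges/$min2, non-merges: $plain/$max1, roots: $roots"
```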
--first-parent::
When finding commits to include, follow only the first
parent commit upon seeing a merge commit. This option
can give a better overview when viewing the evolution of
a particular topic branch, because merges into a topic
branch tend to be only about adjusting to updated upstream
from time to time, and this option allows you to ignore
the individual commits brought in to your history by such
a merge.
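+
As an illustration, assuming a throwaway repository where a made-up `topic` branch is merged into `main`, the commit that came in through the merge drops out of the listing:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "root"
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q main
git commit -q --allow-empty -m "main work"
git merge -q --no-ff -m "merge topic" topic

# Full walk: merge, main work, topic work, root.
everything=$(git rev-list --count main)
# First-parent walk: merge, main work, root.
mainline=$(git rev-list --count --first-parent main)
echo "all: $everything, first-parent: $mainline"
```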
ifdef::git-log[]
+
This option also changes default diff format for merge commits
to `first-parent`, see `--diff-merges=first-parent` for details.
endif::git-log[]
--exclude-first-parent-only::
When finding commits to exclude (with a '{caret}'), follow only
the first parent commit upon seeing a merge commit.
This can be used to find the set of changes in a topic branch
from the point where it diverged from the remote branch, given
that arbitrary merges can be valid topic branch changes.
--not::
Reverses the meaning of the '{caret}' prefix (or lack thereof)
for all following revision specifiers, up to the next `--not`.
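+
A short sketch of the toggle, assuming a throwaway repository with diverged `main` and `topic` branches (both names are illustrative). All three spellings should select the same commits:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "root"
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q main
git commit -q --allow-empty -m "main work"

range=$(git rev-list topic..main)
# --not negates every revision that follows it ...
negated=$(git rev-list main --not topic)
# ... until the next --not switches back.
toggled=$(git rev-list --not topic --not main)
echo "$range"
```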
--all::
Pretend as if all the refs in `refs/`, along with `HEAD`, are
listed on the command line as '<commit>'.
--branches[=<pattern>]::
Pretend as if all the refs in `refs/heads` are listed
on the command line as '<commit>'. If '<pattern>' is given, limit
branches to ones matching given shell glob. If pattern lacks '?',
'{asterisk}', or '[', '/{asterisk}' at the end is implied.
--tags[=<pattern>]::
Pretend as if all the refs in `refs/tags` are listed
on the command line as '<commit>'. If '<pattern>' is given, limit
tags to ones matching given shell glob. If pattern lacks '?', '{asterisk}',
or '[', '/{asterisk}' at the end is implied.
--remotes[=<pattern>]::
Pretend as if all the refs in `refs/remotes` are listed
on the command line as '<commit>'. If '<pattern>' is given, limit
remote-tracking branches to ones matching given shell glob.
If pattern lacks '?', '{asterisk}', or '[', '/{asterisk}' at the end is implied.
--glob=<glob-pattern>::
Pretend as if all the refs matching shell glob '<glob-pattern>'
are listed on the command line as '<commit>'. Leading 'refs/'
is automatically prepended if missing. If pattern lacks '?', '{asterisk}',
or '[', '/{asterisk}' at the end is implied.
--exclude=<glob-pattern>::
Do not include refs matching '<glob-pattern>' that the next `--all`,
`--branches`, `--tags`, `--remotes`, or `--glob` would otherwise
consider. Repetitions of this option accumulate exclusion patterns
up to the next `--all`, `--branches`, `--tags`, `--remotes`, or
`--glob` option (other options or arguments do not clear
accumulated patterns).
+
The patterns given should not begin with `refs/heads`, `refs/tags`, or
`refs/remotes` when applied to `--branches`, `--tags`, or `--remotes`,
respectively, and they must begin with `refs/` when applied to `--glob`
or `--all`. If a trailing '/{asterisk}' is intended, it must be given
explicitly.
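+
The pattern rules for `--branches` and `--exclude` can be sketched as follows, assuming a throwaway repository with a `main` branch and a made-up `wip/x` branch:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "root"
git checkout -q -b wip/x
git commit -q --allow-empty -m "wip 1"
git commit -q --allow-empty -m "wip 2"
git checkout -q main
git commit -q --allow-empty -m "main work"

all_branches=$(git rev-list --count --branches)
# "wip" has no glob character, so it is treated as "wip/*".
only_wip=$(git rev-list --count --branches=wip)
# For --branches the exclusion omits refs/heads/ and spells out "/*".
no_wip=$(git rev-list --count --exclude='wip/*' --branches)
# For --all the exclusion has to start with refs/.
no_wip_all=$(git rev-list --count --exclude='refs/heads/wip/*' --all)
echo "all: $all_branches, wip: $only_wip, without wip: $no_wip / $no_wip_all"
```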
--reflog::
Pretend as if all objects mentioned by reflogs are listed on the
command line as `<commit>`.
--alternate-refs::
Pretend as if all objects mentioned as ref tips of alternate
repositories were listed on the command line. An alternate
repository is any repository whose object directory is specified
in `objects/info/alternates`. The set of included objects may
be modified by `core.alternateRefsCommand`, etc. See
linkgit:git-config[1].
--single-worktree::
By default, when there is more than one working tree (see
linkgit:git-worktree[1]), the following options examine all of
them: `--all`, `--reflog` and `--indexed-objects`.
This option forces them to examine the current working tree
only.
--ignore-missing::
Upon seeing an invalid object name in the input, pretend as if
the bad input was not given.
ifndef::git-rev-list[]
--bisect::
Pretend as if the bad bisection ref `refs/bisect/bad`
was listed and as if it was followed by `--not` and the good
bisection refs `refs/bisect/good-*` on the command
line.
endif::git-rev-list[]
--stdin::
In addition to the '<commit>' listed on the command
line, read them from the standard input. If a `--` separator is
seen, stop reading commits and start reading paths to limit the
result.
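+
A minimal sketch of feeding revisions on standard input, assuming a throwaway repository with diverged `main` and `topic` branches (illustrative names):
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "root"
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q main
git commit -q --allow-empty -m "main work"

# One revision per line; "^topic" excludes what topic can reach.
from_stdin=$(printf '%s\n' main '^topic' | git rev-list --stdin)
from_args=$(git rev-list main ^topic)
echo "$from_stdin"
```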
ifdef::git-rev-list[]
--quiet::
Don't print anything to standard output. This form
is primarily meant to allow the caller to
test the exit status to see if a range of objects is fully
connected (or not). It is faster than redirecting stdout
to `/dev/null` as the output does not have to be formatted.
--disk-usage::
--disk-usage=human::
Suppress normal output; instead, print the sum of the bytes used
for on-disk storage by the selected commits or objects. This is
equivalent to piping the output into `git cat-file
--batch-check='%(objectsize:disk)'`, except that it runs much
faster (especially with `--use-bitmap-index`). See the `CAVEATS`
section in linkgit:git-cat-file[1] for the limitations of what
"on-disk storage" means.
With the optional value `human`, the on-disk storage size is shown
as a human-readable string (e.g. 12.24 KiB, 3.50 MiB).
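+
The pipeline this option replaces can be sketched as below, assuming a throwaway repository with one small file. The summing step with `awk` is an illustrative choice, and the final command is guarded because older Git versions do not know `--disk-usage`:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
printf 'hello\n' >a.txt
git add a.txt
git commit -q -m "add a.txt"

# Long form: list objects, ask cat-file for each on-disk size, add them up.
total=$(git rev-list --objects main | cut -d' ' -f1 |
	git cat-file --batch-check='%(objectsize:disk)' |
	awk '{ sum += $1 } END { print sum }')
echo "pipeline total: $total bytes"

# Short form, when the installed Git supports it.
git rev-list --disk-usage --objects main 2>/dev/null ||
	echo "--disk-usage is not supported by this Git"
```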
endif::git-rev-list[]
--cherry-mark::
Like `--cherry-pick` (see below) but mark equivalent commits
with `=` rather than omitting them, and inequivalent ones with `+`.
--cherry-pick::
Omit any commit that introduces the same change as
another commit on the ``other side'' when the set of
commits are limited with symmetric difference.
+
For example, if you have two branches, `A` and `B`, a usual way
to list all commits on only one side of them is with
`--left-right` (see the example below in the description of
the `--left-right` option). However, it shows the commits that were
cherry-picked from the other branch (for example, ``3rd on b'' may be
cherry-picked from branch A). With this option, such pairs of commits are
excluded from the output.
--left-only::
--right-only::
List only commits on the respective side of a symmetric difference,
i.e. only those which would be marked `<` resp. `>` by
`--left-right`.
+
For example, `--cherry-pick --right-only A...B` omits those
commits from `B` which are in `A` or are patch-equivalent to a commit in
`A`. In other words, this lists the `+` commits from `git cherry A B`.
More precisely, `--cherry-pick --right-only --no-merges` gives the exact
list.
--cherry::
A synonym for `--right-only --cherry-mark --no-merges`; useful to
limit the output to the commits on our side and mark those that
have been applied to the other side of a forked history with
`git log --cherry upstream...mybranch`, similar to
`git cherry upstream mybranch`.
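+
The cherry options can be compared on a small history. This sketch assumes a throwaway repository with branches `A` and `B`, where `B` carries one commit of its own plus a cherry-picked copy of the commit on `A`; file names and messages are invented:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/A
echo base >base.txt && git add base.txt && git commit -q -m "base"
git branch B
echo a >a.txt && git add a.txt && git commit -q -m "change picked later"
git checkout -q B
echo b >b.txt && git add b.txt && git commit -q -m "only on B"
git cherry-pick A >/dev/null

# Everything on B's side of the symmetric difference.
right_all=$(git rev-list --count --right-only A...B)
# The same, minus commits that are patch-equivalent to one on A.
right_new=$(git rev-list --count --cherry-pick --right-only A...B)
# --cherry keeps both but marks them with "=" or "+".
marks=$(git rev-list --cherry A...B)
echo "right side: $right_all, not yet on A: $right_new"
echo "$marks"
```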
-g::
--walk-reflogs::
Instead of walking the commit ancestry chain, walk
reflog entries from the most recent one to older ones.
When this option is used you cannot specify commits to
exclude (that is, '{caret}commit', 'commit1..commit2',
and 'commit1\...commit2' notations cannot be used).
+
With `--pretty` format other than `oneline` and `reference` (for obvious reasons),
this causes the output to have two extra lines of information
taken from the reflog. The reflog designator in the output may be shown
as `ref@{Nth}` (where `Nth` is the reverse-chronological index in the
reflog) or as `ref@{timestamp}` (with the timestamp for that entry),
depending on a few rules:
+
--
1. If the starting point is specified as `ref@{Nth}`, show the index
format.
+
2. If the starting point was specified as `ref@{now}`, show the
timestamp format.
+
3. If neither was used, but `--date` was given on the command line, show
the timestamp in the format requested by `--date`.
+
4. Otherwise, show the index format.
--
+
Under `--pretty=oneline`, the commit message is
prefixed with this information on the same line.
This option cannot be combined with `--reverse`.
See also linkgit:git-reflog[1].
+
Under `--pretty=reference`, this information will not be shown at all.
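+
To see how a reflog walk differs from an ancestry walk, the sketch below assumes a throwaway repository in which the second commit is amended, so the branch reflog remembers a commit that is no longer reachable from the tip:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
git commit -q --allow-empty --amend -m "second, reworded"

# Ancestry walk: only what the tip can reach.
ancestry=$(git rev-list main | wc -l | tr -d ' ')
# Reflog walk: one line per recorded update of the branch.
entries=$(git rev-list -g main | wc -l | tr -d ' ')
echo "reachable: $ancestry, reflog entries: $entries"
git log -g --oneline main
```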
--merge::
After a failed merge, show refs that touch files having a
conflict and don't exist on all heads to merge.
--boundary::
Output excluded boundary commits. Boundary commits are
prefixed with `-`.
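+
A minimal sketch, assuming a throwaway repository with three linear commits, of what the extra boundary line looks like:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
for i in 1 2 3; do
	git commit -q --allow-empty -m "commit $i"
done

# The range holds two commits; main~2 is excluded but sits on its edge.
listing=$(git rev-list --boundary main~2..main)
echo "$listing"
```
+
The last line printed should be the object name of `main~2` with a leading `-`.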
ifdef::git-rev-list[]
--use-bitmap-index::
Try to speed up the traversal using the pack bitmap index (if
one is available). Note that when traversing with `--objects`,
trees and blobs will not have their associated path printed.
--progress=<header>::
Show progress reports on stderr as objects are considered. The
`<header>` text will be printed with each progress update.
endif::git-rev-list[]
History Simplification
~~~~~~~~~~~~~~~~~~~~~~
Sometimes you are only interested in parts of the history, for example the
commits modifying a particular <path>. 'History Simplification' has two
parts: selecting the commits, and choosing how to simplify, as there are
various strategies to simplify the history.
The following options select the commits to be shown:
<paths>::
Commits modifying the given <paths> are selected.
--simplify-by-decoration::
Commits that are referred to by some branch or tag are selected.
Note that extra commits can be shown to give a meaningful history.
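+
Both selectors can be tried on a small history. The sketch assumes a throwaway repository with two made-up files and one tag; only the path-limited count is stated here, since the decoration walk may add commits as noted above:
+
```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q
git symbolic-ref HEAD refs/heads/main
echo one >a.txt && git add a.txt && git commit -q -m "add a.txt"
echo two >b.txt && git add b.txt && git commit -q -m "add b.txt"
echo three >>a.txt && git add a.txt && git commit -q -m "extend a.txt"
git tag v1 main~1

# Path limiting: only the commits that modify a.txt.
touching_a=$(git rev-list --count main -- a.txt)
echo "commits touching a.txt: $touching_a"

# Decoration limiting: commits pointed at by a branch or tag.
git log --oneline --simplify-by-decoration main
```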
The following options affect the way the simplification is performed:
Default mode::
Simplifies the history to the simplest history explaining the
final state of the tree. Simplest because it prunes some side
branches if the end result is the same (i.e. merging branches
with the same content).
3 years ago
--show-pulls::
Include all commits from the default mode, but also any merge
commits that are not TREESAME to the first parent but are
TREESAME to a later parent. This mode is helpful for showing
the merge commits that "first introduced" a change to a branch.
--full-history::
Same as the default mode, but does not prune some history.
--dense::
Only the selected commits are shown, plus some to have a
meaningful history.
--sparse::
All commits in the simplified history are shown.
--simplify-merges::
Additional option to `--full-history` to remove some needless
merges from the resulting history, as there are no selected
commits contributing to these merges.
--ancestry-path[=<commit>]::
When given a range of commits to display (e.g. 'commit1..commit2'
or 'commit2 {caret}commit1'), only display commits in that range
that are ancestors of <commit>, descendants of <commit>, or
<commit> itself. If no commit is specified, use 'commit1' (the
excluded part of the range) as <commit>. Can be passed multiple
times; if so, a commit is included if it is any of the commits
given or if it is an ancestor or descendant of one of them.
A more detailed explanation follows.
Suppose you specified `foo` as the <paths>. We shall call commits
that modify `foo` !TREESAME, and the rest TREESAME. (In a diff
filtered for `foo`, they look different and equal, respectively.)
In the following, we will always refer to the same example history to
illustrate the differences between simplification settings. We assume
that you are filtering for a file `foo` in this commit graph:
-----------------------------------------------------------------------
  .-A---M---N---O---P---Q
 /     /   /   /   /   /
I     B   C   D   E   Y
 \   /   /   /   /   /
  `-------------'   X
-----------------------------------------------------------------------
The horizontal line of history A---Q is taken to be the first parent of
each merge. The commits are:
* `I` is the initial commit, in which `foo` exists with contents
``asdf'', and a file `quux` exists with contents ``quux''. Initial
commits are compared to an empty tree, so `I` is !TREESAME.
* In `A`, `foo` contains just ``foo''.
* `B` contains the same change as `A`. Its merge `M` is trivial and
hence TREESAME to all parents.
* `C` does not change `foo`, but its merge `N` changes it to ``foobar'',
so it is not TREESAME to any parent.
* `D` sets `foo` to ``baz''. Its merge `O` combines the strings from
`N` and `D` to ``foobarbaz''; i.e., it is not TREESAME to any parent.
* `E` changes `quux` to ``xyzzy'', and its merge `P` combines the
strings to ``quux xyzzy''. `P` is TREESAME to `O`, but not to `E`.
* `X` is an independent root commit that added a new file `side`, and `Y`
modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and
`Q` is TREESAME to `P`, but not to `Y`.
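The TREESAME relations listed above can be checked mechanically. The following is a minimal sketch, not Git's implementation: it assumes that only the content of `foo` matters, and the names `FOO`, `PARENTS` and `treesame_to` are hypothetical helpers introduced for illustration.

```python
# Content of `foo` in each commit of the example (None: `foo` is absent).
FOO = {"I": "asdf", "A": "foo", "B": "foo", "M": "foo", "C": "asdf",
       "N": "foobar", "D": "baz", "O": "foobarbaz", "E": "asdf",
       "P": "foobarbaz", "X": None, "Y": None, "Q": "foobarbaz"}
# Parents of each commit, first parent first.
PARENTS = {"I": [], "A": ["I"], "B": ["I"], "M": ["A", "B"], "C": ["I"],
           "N": ["M", "C"], "D": ["I"], "O": ["N", "D"], "E": ["I"],
           "P": ["O", "E"], "X": [], "Y": ["X"], "Q": ["P", "Y"]}


def treesame_to(commit):
    """Return the parents to which `commit` is TREESAME for path `foo`.

    A root commit is compared to an empty tree instead.
    """
    if not PARENTS[commit]:
        return ["<empty tree>"] if FOO[commit] is None else []
    return [p for p in PARENTS[commit] if FOO[p] == FOO[commit]]


for name in "IABMCNDOEPXYQ":
    print(name, "is TREESAME to:", treesame_to(name) or "nothing")
```

In this model `M` is TREESAME to both `A` and `B`, `P` only to `O`, and `Q` only to `P`, while `N` and `O` are not TREESAME to any parent.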
`rev-list` walks backwards through history, including or excluding
commits based on whether `--full-history` and/or parent rewriting
(via `--parents` or `--children`) are used. The following settings
are available.
Default mode::
Commits are included if they are not TREESAME to any parent
(though this can be changed, see `--sparse` below). If the
commit was a merge, and it was TREESAME to one parent, follow
only that parent. (Even if there are several TREESAME
parents, follow only one of them.) Otherwise, follow all
parents.
+
This results in:
+
-----------------------------------------------------------------------
  .-A---N---O
 /     /   /
I---------D
-----------------------------------------------------------------------
+
Note how the rule to only follow the TREESAME parent, if one is
available, removed `B` from consideration entirely. `C` was
considered via `N`, but is TREESAME. Root commits are compared to an
empty tree, so `I` is !TREESAME.
+
Parent/child relations are only visible with `--parents`, but that does
not affect the commits selected in default mode, so we have shown the
parent lines.
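+
The default-mode walk can be sketched on the example history as follows. This is an illustrative model only, assuming that TREESAME is decided by the content of `foo` alone; `FOO`, `PARENTS` and `default_mode` are hypothetical names, and the code does not mirror Git's internals.
+
```python
# Content of `foo` in each commit of the example (None: `foo` is absent).
FOO = {"I": "asdf", "A": "foo", "B": "foo", "M": "foo", "C": "asdf",
       "N": "foobar", "D": "baz", "O": "foobarbaz", "E": "asdf",
       "P": "foobarbaz", "X": None, "Y": None, "Q": "foobarbaz"}
# Parents of each commit, first parent first.
PARENTS = {"I": [], "A": ["I"], "B": ["I"], "M": ["A", "B"], "C": ["I"],
           "N": ["M", "C"], "D": ["I"], "O": ["N", "D"], "E": ["I"],
           "P": ["O", "E"], "X": [], "Y": ["X"], "Q": ["P", "Y"]}


def default_mode(tip):
    """Return the commits that default simplification would include."""
    included, seen, todo = [], set(), [tip]
    while todo:
        commit = todo.pop()
        if commit in seen:
            continue
        seen.add(commit)
        parents = PARENTS[commit]
        same = [p for p in parents if FOO[p] == FOO[commit]]
        if same:
            # TREESAME to some parent: skip the commit, follow only one
            # TREESAME parent, and never walk the other sides of a merge.
            todo.append(same[0])
        else:
            # A root is compared to an empty tree.
            if parents or FOO[commit] is not None:
                included.append(commit)
            todo.extend(parents)
    return sorted(included)


print(default_mode("Q"))  # ['A', 'D', 'I', 'N', 'O']
```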
--full-history without parent rewriting::
This mode differs from the default in one point: always follow
all parents of a merge, even if it is TREESAME to one of them.
Even if more than one side of the merge has commits that are
included, this does not imply that the merge itself is! In
the example, we get
+
-----------------------------------------------------------------------
I A B N D O P Q
-----------------------------------------------------------------------
+
`M` was excluded because it is TREESAME to both parents. `E`,
`C` and `B` were all walked, but only `B` was !TREESAME, so the others
do not appear.
+
Note that without parent rewriting, it is not really possible to talk
about the parent/child relationships between the commits, so we show
them disconnected.
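+
As a rough model of this mode, the sketch below walks every parent and includes a commit unless it is TREESAME to all of its parents. It assumes that only the content of `foo` matters; `FOO`, `PARENTS` and `full_history` are hypothetical names used for illustration, not Git's own code.
+
```python
# Content of `foo` in each commit of the example (None: `foo` is absent).
FOO = {"I": "asdf", "A": "foo", "B": "foo", "M": "foo", "C": "asdf",
       "N": "foobar", "D": "baz", "O": "foobarbaz", "E": "asdf",
       "P": "foobarbaz", "X": None, "Y": None, "Q": "foobarbaz"}
# Parents of each commit, first parent first.
PARENTS = {"I": [], "A": ["I"], "B": ["I"], "M": ["A", "B"], "C": ["I"],
           "N": ["M", "C"], "D": ["I"], "O": ["N", "D"], "E": ["I"],
           "P": ["O", "E"], "X": [], "Y": ["X"], "Q": ["P", "Y"]}


def full_history(tip):
    """Return the commits shown by --full-history without parent rewriting."""
    shown, seen, todo = [], set(), [tip]
    while todo:
        commit = todo.pop()
        if commit in seen:
            continue
        seen.add(commit)
        parents = PARENTS[commit]
        todo.extend(parents)  # always follow every parent
        if parents:
            treesame = all(FOO[p] == FOO[commit] for p in parents)
        else:
            treesame = FOO[commit] is None  # root vs. empty tree
        if not treesame:
            shown.append(commit)
    return sorted(shown)


print(full_history("Q"))  # ['A', 'B', 'D', 'I', 'N', 'O', 'P', 'Q']
```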
--full-history with parent rewriting::
Ordinary commits are only included if they are !TREESAME
(though this can be changed, see `--sparse` below).
+
Merges are always included. However, their parent list is rewritten:
Along each parent, prune away commits that are not included
themselves. This results in
+
-----------------------------------------------------------------------
  .-A---M---N---O---P---Q
 /     /   /   /   /
I     B   /   D   /
 \   /   /   /   /
  `-------------'
-----------------------------------------------------------------------
+
Compare to `--full-history` without rewriting above. Note that `E`
was pruned away because it is TREESAME, but the parent list of `P` was
rewritten to contain `E`'s parent `I`. The same happened for `C` and
`N`, and `X`, `Y` and `Q`.
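+
Parent rewriting can be modelled as replacing each parent with the nearest included commit below it. The sketch below is a simplification under stated assumptions: only `foo` is considered, a parent chain that ends in a pruned root is dropped, and the names `FOO`, `PARENTS`, `included`, `rewrite` and `full_history_rewritten` are hypothetical.
+
```python
# Content of `foo` in each commit of the example (None: `foo` is absent).
FOO = {"I": "asdf", "A": "foo", "B": "foo", "M": "foo", "C": "asdf",
       "N": "foobar", "D": "baz", "O": "foobarbaz", "E": "asdf",
       "P": "foobarbaz", "X": None, "Y": None, "Q": "foobarbaz"}
# Parents of each commit, first parent first.
PARENTS = {"I": [], "A": ["I"], "B": ["I"], "M": ["A", "B"], "C": ["I"],
           "N": ["M", "C"], "D": ["I"], "O": ["N", "D"], "E": ["I"],
           "P": ["O", "E"], "X": [], "Y": ["X"], "Q": ["P", "Y"]}


def included(commit):
    parents = PARENTS[commit]
    if len(parents) > 1:
        return True                         # merges are always included
    if not parents:
        return FOO[commit] is not None      # root vs. empty tree
    return FOO[commit] != FOO[parents[0]]   # ordinary commit: !TREESAME


def rewrite(commit):
    """Nearest included commit at or below `commit`, or None if none exists."""
    while not included(commit):
        if not PARENTS[commit]:
            return None
        commit = PARENTS[commit][0]  # an excluded commit has a single parent
    return commit


def full_history_rewritten(tip):
    """Map each included commit to its rewritten parent list."""
    start = rewrite(tip)
    graph, todo = {}, [start] if start is not None else []
    while todo:
        commit = todo.pop()
        if commit in graph:
            continue
        rewritten = [rewrite(p) for p in PARENTS[commit]]
        graph[commit] = [p for p in rewritten if p is not None]
        todo.extend(graph[commit])
    return graph


for commit, parents in sorted(full_history_rewritten("Q").items()):
    print(commit, "->", parents)
```
+
In this model `P` ends up with parents `O` and `I`, `N` with `M` and `I`, and `Q` keeps only `P`, matching the diagram above.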
In addition to the above settings, you can change whether TREESAME
affects inclusion:
--dense::
Commits that are walked are included if they are not TREESAME
to any parent.
--sparse::
All commits that are walked are included.
+
Note that without `--full-history`, this still simplifies merges: if
one of the parents is TREESAME, we follow only that one, so the other
sides of the merge are never walked.
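+
The difference between `--dense` and `--sparse` without `--full-history` can be illustrated with a small model of the walk. This is a sketch assuming that only `foo` matters; `FOO`, `PARENTS` and `walk` are hypothetical names, and the walk order is not meant to match Git's output order.
+
```python
# Content of `foo` in each commit of the example (None: `foo` is absent).
FOO = {"I": "asdf", "A": "foo", "B": "foo", "M": "foo", "C": "asdf",
       "N": "foobar", "D": "baz", "O": "foobarbaz", "E": "asdf",
       "P": "foobarbaz", "X": None, "Y": None, "Q": "foobarbaz"}
# Parents of each commit, first parent first.
PARENTS = {"I": [], "A": ["I"], "B": ["I"], "M": ["A", "B"], "C": ["I"],
           "N": ["M", "C"], "D": ["I"], "O": ["N", "D"], "E": ["I"],
           "P": ["O", "E"], "X": [], "Y": ["X"], "Q": ["P", "Y"]}


def walk(tip, sparse=False):
    """Return the commits shown; sparse=True shows every walked commit."""
    shown, seen, todo = [], set(), [tip]
    while todo:
        commit = todo.pop()
        if commit in seen:
            continue
        seen.add(commit)
        parents = PARENTS[commit]
        same = [p for p in parents if FOO[p] == FOO[commit]]
        if same:
            interesting = False
            todo.append(same[0])  # merges are still simplified
        else:
            interesting = bool(parents) or FOO[commit] is not None
            todo.extend(parents)
        if sparse or interesting:
            shown.append(commit)
    return sorted(shown)


print(walk("Q"))               # ['A', 'D', 'I', 'N', 'O']
print(walk("Q", sparse=True))  # ['A', 'C', 'D', 'I', 'M', 'N', 'O', 'P', 'Q']
```
+
Even with `sparse=True`, `B`, `E`, `X` and `Y` never appear, because the walk never reaches them.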
--simplify-merges::
First, build a history graph in the same way that
`--full-history` with parent rewriting does (see above).
+
Then simplify each commit `C` to its replacement `C'` in the final
history according to the following rules:
+
--
* Set `C'` to `C`.
+
* Replace each parent `P` of `C'` with its simplification `P'`. In
the process, drop parents that are ancestors of other parents or that are
root commits TREESAME to an empty tree, and remove duplicates, but take care
to never drop all parents that we are TREESAME to.
+
* If after this parent rewriting, `C'` is a root or merge commit (has
zero or >1 parents), a boundary commit, or !TREESAME, it remains.
Otherwise, it is replaced with its only parent.
--
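+
The rules above can be sketched on the example history as follows. This is an illustrative model, not Git's implementation, and it rests on several assumptions: only `foo` is considered; boundary commits are not modelled; redundant parents are dropped only while more than one parent remains; and pruning of TREESAME single-parent commits is folded into the same pass instead of building the rewritten graph first. The names `FOO`, `PARENTS`, `is_ancestor` and `simplify_merges` are hypothetical.
+
```python
# Content of `foo` in each commit of the example (None: `foo` is absent).
FOO = {"I": "asdf", "A": "foo", "B": "foo", "M": "foo", "C": "asdf",
       "N": "foobar", "D": "baz", "O": "foobarbaz", "E": "asdf",
       "P": "foobarbaz", "X": None, "Y": None, "Q": "foobarbaz"}
# Parents of each commit, first parent first.
PARENTS = {"I": [], "A": ["I"], "B": ["I"], "M": ["A", "B"], "C": ["I"],
           "N": ["M", "C"], "D": ["I"], "O": ["N", "D"], "E": ["I"],
           "P": ["O", "E"], "X": [], "Y": ["X"], "Q": ["P", "Y"]}


def is_ancestor(a, b, graph):
    """True if `a` is a proper ancestor of `b` in `graph`."""
    todo, seen = list(graph[b]), set()
    while todo:
        c = todo.pop()
        if c == a:
            return True
        if c not in seen:
            seen.add(c)
            todo.extend(graph[c])
    return False


def simplify_merges(tip):
    """Return {commit: parents} for the simplified history below `tip`."""
    simplified, graph = {}, {}  # commit -> replacement; kept commit -> parents

    def simplify(c):
        if c in simplified:
            return simplified[c]
        parents = []
        for p in PARENTS[c]:            # replace each parent P with P'
            s = simplify(p)
            if s not in parents:        # remove duplicates
                parents.append(s)
        if len(parents) > 1:
            drop = {p for p in parents
                    if (not graph[p] and FOO[p] is None)  # TREESAME root
                    or any(q != p and is_ancestor(p, q, graph)
                           for q in parents)}
            same = [p for p in parents if FOO[p] == FOO[c]]
            if same and all(p in drop for p in same):
                drop.discard(same[0])   # never drop all TREESAME parents
            parents = [p for p in parents if p not in drop]
        if len(parents) == 1 and FOO[c] == FOO[parents[0]]:
            simplified[c] = parents[0]  # replaced with its only parent
        else:
            graph[c] = parents          # root, merge, or !TREESAME: remains
            simplified[c] = c
        return simplified[c]

    result, todo = {}, [simplify(tip)]
    while todo:                         # keep only what the new tip reaches
        c = todo.pop()
        if c not in result:
            result[c] = graph[c]
            todo.extend(graph[c])
    return result


for commit, parents in sorted(simplify_merges("Q").items()):
    print(commit, "->", parents)
```
+
In this model `Q` and `P` are both replaced by `O`, and `N` keeps `M` as its only parent because `I` is an ancestor of `M`.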
+
The effect of this is best shown by way of comparing to
`--full-history` with parent rewriting. The example turns into:
+
-----------------------------------------------------------------------
  .-A---M---N---O
 /     /       /
I     B       D
 \   /       /
  `---------'
-----------------------------------------------------------------------
+
Note the major differences in `N`, `P`, and `Q` over `--full-history`: