45 Commits (master)

Author SHA1 Message Date
Junio C Hamano 67a3b2b39f Merge branch 'jc/attr-source-tree'
"git --attr-source=<tree> cmd $args" is a new way to have any
command read attributes not from the working tree but from the
given tree object.

* jc/attr-source-tree:
  attr: teach "--attr-source=<tree>" global option to "git"
2023-05-17 10:11:41 -07:00
John Cai 44451a2e5e attr: teach "--attr-source=<tree>" global option to "git"
Earlier, 47cfc9bd (attr: add flag `--source` to work with tree-ish,
2023-01-14) taught "git check-attr" the "--source=<tree>" option to
allow it to read attribute files from a tree-ish, but did so only
for the command.  Just like "check-attr" users wanted a way to use
attributes from a tree-ish and not from the working tree files,
users of other commands (like "git diff") would benefit from the
same.

Undo most of the UI change the commit made, while keeping the
internal logic to read attributes from a given tree-ish. Expose the
internal logic via a new "--attr-source=<tree>" command line option
given to "git", so that it can be used with any git command that
runs as part of the main git process.

Additionally, add an environment variable GIT_ATTR_SOURCE that is set
when --attr-source is passed in, so that subprocesses use the same value
for the attributes source tree.

Signed-off-by: John Cai <>
Signed-off-by: Junio C Hamano <>
2023-05-06 14:34:09 -07:00
Elijah Newren 641223137b ws.h: move declarations for ws.c functions from cache.h
Signed-off-by: Elijah Newren <>
Signed-off-by: Junio C Hamano <>
2023-04-24 12:47:32 -07:00
Elijah Newren 69a63fe663 treewide: be explicit about dependence on strbuf.h
Signed-off-by: Elijah Newren <>
Signed-off-by: Junio C Hamano <>
2023-04-24 12:47:31 -07:00
Junio C Hamano 577bff3a81 Merge branch 'kn/attr-from-tree'
"git check-attr" learned to take an optional tree-ish to read the
.gitattributes file from.

* kn/attr-from-tree:
  attr: add flag `--source` to work with tree-ish
  t0003: move setup for `--all` into new block
2023-01-23 13:39:51 -08:00
Karthik Nayak 47cfc9bd7d attr: add flag `--source` to work with tree-ish
The contents of the .gitattributes files may evolve over time, but "git
check-attr" always checks attributes against them in the working tree
and/or in the index. It may be beneficial to optionally allow the users
to check attributes taken from a commit other than HEAD against paths.

Add a new flag `--source` which will allow users to check the
attributes against a commit (actually any tree-ish would do). When the
user uses this flag, we go through the stack of .gitattributes files but
instead of checking the current working tree and/or in the index, we
check the blobs from the provided tree-ish object. This allows the
command to also be used in bare repositories.

Since we use a tree-ish object, the user can pass "--source
HEAD:subdirectory" and all the attributes will be looked up as if
subdirectory was the root directory of the repository.

We cannot simply use the `<rev>:<path>` syntax without the `--source`
flag, similar to how it is used in `git show` because any non-flag
parameter before `--` is treated as an attribute and any parameter after
`--` is treated as a pathname.

The change involves creating a new function `read_attr_from_blob`, which
given the path reads the blob for the path against the provided source and
parses the attributes line by line. This function is plugged into
`read_attr()` function wherein we go through the stack of attributes
files.

Signed-off-by: Karthik Nayak <>
Signed-off-by: Toon Claes <>
Signed-off-by: Junio C Hamano <>
2023-01-14 08:49:55 -08:00
Jeff King d43b99322b convert trivial uses of strncmp() to skip_prefix()
As with the previous patch, using skip_prefix() is more readable and
less error-prone than a raw strncmp(), because it avoids a
manually-computed length. These cases differ from the previous patch
that uses starts_with() because they care about the value after the
matched prefix.

We can convert these to use skip_prefix() by introducing an extra
variable to hold the out-pointer.

Note in the case in ws.c that to get rid of the magic number "9"
completely, we also switch out "len" for recomputing the pointer
difference. These are equivalent because "len" is always "ep - string".

Signed-off-by: Jeff King <>
Signed-off-by: Junio C Hamano <>
2023-01-08 10:34:37 +09:00
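
The readability argument in d43b99322b can be illustrated with a small Python sketch. This is not git's C code: `skip_prefix` below is a hypothetical Python stand-in that returns the remainder of the string (or `None`) rather than filling an out-pointer, and the `tabwidth=` token is only an example chosen because its length happens to be nine characters.

```python
from typing import Optional


def skip_prefix(s: str, prefix: str) -> Optional[str]:
    """Return the text after `prefix`, or None when `s` does not start with it."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return None


def tab_width_magic(token: str) -> Optional[int]:
    # Error-prone style: the length 9 must be kept in sync with the
    # literal "tabwidth=" by hand, in two places.
    if token[:9] == "tabwidth=":
        return int(token[9:])
    return None


def tab_width_skip_prefix(token: str) -> Optional[int]:
    # Same check, but the prefix length is never written out manually.
    rest = skip_prefix(token, "tabwidth=")
    return int(rest) if rest is not None else None
```

Both helpers accept the same inputs; the second simply has no number that can drift out of step with the string literal.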
Jeff King c5224f0f4c ws: drop unused parameter from ws_blank_line()
We take a ws_rule parameter, but have never looked at it since the
function was added in 877f23ccb8 (Teach "diff --check" about new blank
lines at end, 2008-06-26). A comment in the function does mention how we
_could_ use it, but nobody has felt the need to do so for over a decade.

We could keep it around as reminder of what could be done, but the
comment serves that purpose. And in the meantime, it triggers

So let's drop it, which in turn allows us to drop similar arguments
further up the callstack. I've left the comment intact. It does still
say "ws_rule", but that name is used consistently in the whitespace
code, so the meaning is clear.

Signed-off-by: Jeff King <>
Signed-off-by: Junio C Hamano <>
2022-12-13 22:16:23 +09:00
Junio C Hamano 11877b9ebe Merge branch 'nd/the-index'
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".

* nd/the-index: (23 commits)
  revision.c: reduce implicit dependency the_repository
  revision.c: remove implicit dependency on the_index
  ws.c: remove implicit dependency on the_index
  tree-diff.c: remove implicit dependency on the_index
  submodule.c: remove implicit dependency on the_index
  line-range.c: remove implicit dependency on the_index
  userdiff.c: remove implicit dependency on the_index
  rerere.c: remove implicit dependency on the_index
  sha1-file.c: remove implicit dependency on the_index
  patch-ids.c: remove implicit dependency on the_index
  merge.c: remove implicit dependency on the_index
  merge-blobs.c: remove implicit dependency on the_index
  ll-merge.c: remove implicit dependency on the_index
  diff-lib.c: remove implicit dependency on the_index
  read-cache.c: remove implicit dependency on the_index
  diff.c: remove implicit dependency on the_index
  grep.c: remove implicit dependency on the_index
  diff.c: remove the_index dependency in textconv() functions
  blame.c: rename "repo" argument to "r"
  combine-diff.c: remove implicit dependency on the_index
2018-10-19 13:34:02 +09:00
Nguyễn Thái Ngọc Duy 26d024ecf0 ws.c: remove implicit dependency on the_index
Signed-off-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Junio C Hamano <>
2018-09-21 09:51:18 -07:00
Torsten Bögershausen d64324cb60 Make git_check_attr() a void function
git_check_attr() always returns 0.
Remove all the error handling code of the callers, which is never executed.
Change git_check_attr() to be a void function.

Signed-off-by: Torsten Bögershausen <>
Signed-off-by: Junio C Hamano <>
2018-09-12 15:15:34 -07:00
Nguyễn Thái Ngọc Duy 7a400a2c02 attr: remove an implicit dependency on the_index
Make the attr API take an index_state instead of assuming the_index in
attr code. All call sites are converted blindly to keep the patch
simple and retain current behavior. Individual call sites may receive
further updates to use the right index instead of the_index.

There is one ugly temporary workaround added in attr.c that needs some
more explanation.

Commit c24f3abace (apply: file commited with CRLF should roundtrip
diff and apply - 2017-08-19) forces one convert_to_git() call to NOT
read the index at all. But what do you know, we read it anyway by
falling back to the_index. When "istate" from convert_to_git is now
propagated down to read_attr_from_array() we will hit segfault
somewhere inside read_blob_data_from_index.

The right way of dealing with this is to kill "use_index" variable and
only follow "istate" but at this stage we are not ready for that:
while most git_attr_set_direction() calls just pass the_index to be
assigned to use_index, unpack-trees passes a different one which is
used by entry.c code, which has no way to know what index to use if we
delete use_index. So this has to be done later.

Signed-off-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Junio C Hamano <>
2018-08-13 14:14:42 -07:00
Junio C Hamano 2aef63d31c attr: convert git_check_attrs() callers to use the new API
The remaining callers are all simple "I have N attributes I am
interested in.  I'll ask about them with various paths one by one".

After this step, no caller to git_check_attrs() remains.  After
removing it, we can extend "struct attr_check" struct with data
that can be used in optimizing the query for the specific N
attributes it contains.

Signed-off-by: Junio C Hamano <>
Signed-off-by: Stefan Beller <>
Signed-off-by: Brandon Williams <>
Signed-off-by: Junio C Hamano <>
2017-02-01 13:46:52 -08:00
Junio C Hamano 7bd18054d2 attr: rename function and struct related to checking attributes
The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.

In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct attr_check_item" and rename
the function to "git_check_attrs()".

Signed-off-by: Junio C Hamano <>
Signed-off-by: Stefan Beller <>
Signed-off-by: Brandon Williams <>
Signed-off-by: Junio C Hamano <>
2017-02-01 13:46:52 -08:00
Rohit Mani 2c5495f7b6 use strchrnul() in place of strchr() and strlen()
Avoid scanning strings twice, once with strchr() and then with
strlen(), by using strchrnul().

Helped-by: Junio C Hamano <>
Signed-off-by: Rohit Mani <>
Signed-off-by: Junio C Hamano <>
2014-03-10 08:35:30 -07:00
Michael Haggerty d932f4eb9f Rename git_checkattr() to git_check_attr()
Suggested by: Junio Hamano <>

Signed-off-by: Michael Haggerty <>
Signed-off-by: Junio C Hamano <>
2011-08-04 15:53:21 -07:00
Johannes Sixt f4b05a4947 Make the tab width used for whitespace checks configurable
A new whitespace "rule" is added that sets the tab width to use for
whitespace checks and fix-ups and replaces the hard-coded constant 8.

Since the setting is part of the rules, it can be set per file using

The new configuration is backwards compatible because older git versions
simply ignore unknown whitespace rules.

Signed-off-by: Johannes Sixt <>
Signed-off-by: Junio C Hamano <>
2010-12-01 14:47:51 -08:00
Junio C Hamano dee40e5178 Merge branch 'js/maint-apply-tab-in-indent-fix' into HEAD
* js/maint-apply-tab-in-indent-fix:
  apply --whitespace=fix: fix tab-in-indent
2010-12-01 14:42:00 -08:00
Johannes Sixt d35711adc4 apply --whitespace=fix: fix tab-in-indent
When the whitespace rule tab-in-indent is enabled, apply --whitespace=fix
replaces tabs by the appropriate amount of blanks. The code used
"dst->len % 8" as the criterion to stop adding blanks. But it forgot that
dst holds more than just the current line. Consequently, the modulus was
computed correctly only for the first added line, but not for the second
and subsequent lines. Fix it.

Signed-off-by: Johannes Sixt <>
Acked-by: Chris Webb <>
Signed-off-by: Junio C Hamano <>
2010-12-01 14:34:00 -08:00
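
The bug fixed in d35711adc4 can be reproduced in a few lines. The following is a minimal Python sketch, not the C code from ws.c: a list of characters stands in for the strbuf `dst`, and the `line_start` parameter is an assumption made for illustration.

```python
from typing import List

TAB_WIDTH = 8  # the constant that was hard-coded at the time of this commit


def expand_tab_buggy(dst: List[str]) -> None:
    """Pad to the next tab stop using the length of the whole buffer."""
    dst.append(" ")
    while len(dst) % TAB_WIDTH != 0:
        dst.append(" ")


def expand_tab_fixed(dst: List[str], line_start: int) -> None:
    """Pad to the next tab stop using the column within the current line."""
    dst.append(" ")
    while (len(dst) - line_start) % TAB_WIDTH != 0:
        dst.append(" ")


# dst already holds a first line of three bytes; the second line starts
# with a tab that should expand to a full tab stop.
buggy = list("ab\n")
expand_tab_buggy(buggy)
fixed = list("ab\n")
expand_tab_fixed(fixed, line_start=3)
```

Because `dst` accumulates every fixed line, the buggy version measures the column from the start of the buffer and pads too little on any line after the first.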
Kevin Ballard cfd1a9849c diff: handle lines containing only whitespace and tabs better
When a line contains nothing but whitespace with at least one tab
and the core.whitespace config option contains blank-at-eol, the
whitespace on the line is being printed twice, once unhighlighted
(unless otherwise matched by one of the other core.whitespace values),
and a second time highlighted for blank-at-eol.

Update the leading indentation check to stop checking when it reaches
the trailing whitespace.

Signed-off-by: Kevin Ballard <>
Signed-off-by: Junio C Hamano <>
2010-10-20 16:10:15 -07:00
Chris Webb 4e35c51e51 whitespace: add tab-in-indent support for --whitespace=fix
If tab-in-indent is set, --whitespace=fix will ensure that any stray tabs in
the initial indent are expanded to the correct number of space characters.

Signed-off-by: Chris Webb <>
Signed-off-by: Junio C Hamano <>
2010-04-04 14:21:54 -07:00
Chris Webb d511bd330d whitespace: replumb ws_fix_copy to take a strbuf *dst instead of char *dst
To implement --whitespace=fix for tab-in-indent, we have to allow for the
possibility that whitespace can increase in size when it is fixed, expanding
tabs to multiple spaces in the initial indent.

Signed-off-by: Chris Webb <>
Signed-off-by: Junio C Hamano <>
2010-04-04 14:21:54 -07:00
Chris Webb 3e3ec2abe0 whitespace: add tab-in-indent error class
Some projects and languages use a coding style where no tab character is used to
indent the lines.

This only adds support and documentation for "apply --whitespace=warn" and
"diff --check"; later patches add "apply --whitespace=fix" and tests.

Signed-off-by: Chris Webb <>
Signed-off-by: Junio C Hamano <>
2010-04-02 21:08:04 -07:00
Junio C Hamano 727c3718a5 whitespace: we cannot "catch all errors known to git" anymore
Traditionally, "*.txt whitespace" in .gitattributes file has been an
instruction to catch _all_ classes of whitespace errors known to git.

This has to change, however, in order to introduce "tab-in-indent" which
is inherently incompatible with "indent-with-non-tab".  As we do not want
to break configuration of existing users, add a mechanism to allow marking
selected rules to be excluded from "all rules known to git".

Signed-off-by: Chris Webb <>
Signed-off-by: Junio C Hamano <>
2010-04-02 21:07:44 -07:00
Junio C Hamano 7fb0eaa289 git_attr(): fix function signature
The function took (name, namelen) as its arguments, but all the public
callers wanted to pass a full string.

Demote the counted-string interface to an internal API status, and allow
public callers to just pass the string to the function.

Signed-off-by: Junio C Hamano <>
2010-01-16 20:39:59 -08:00
Junio C Hamano afd9db4173 Merge branch 'jc/maint-1.6.0-blank-at-eof' (early part) into jc/maint-blank-at-eof
* 'jc/maint-1.6.0-blank-at-eof' (early part):
  diff --whitespace: fix blank lines at end
  core.whitespace: split trailing-space into blank-at-{eol,eof}
  diff --color: color blank-at-eof
  diff --whitespace=warn/error: fix blank-at-eof check
  diff --whitespace=warn/error: obey blank-at-eof
  diff.c: the builtin_diff() deals with only two-file comparison
  apply --whitespace: warn blank but not necessarily empty lines at EOF
  apply --whitespace=warn/error: diagnose blank at EOF
  apply.c: split check_whitespace() into two
  apply --whitespace=fix: detect new blank lines at eof correctly
  apply --whitespace=fix: fix handling of blank lines at the eof
2009-09-15 03:28:08 -07:00
Junio C Hamano aeb84b05ae core.whitespace: split trailing-space into blank-at-{eol,eof}
People who configured trailing-space depended on it to catch both extra
white space at the end of line, and extra blank lines at the end of file.
An earlier attempt to introduce only blank-at-eof gave them an escape hatch
to keep the old behaviour, but it is a regression until they explicitly
specify the new error class.

This introduces a blank-at-eol that only catches extra white space at the
end of line, and makes the traditional trailing-space a convenient synonym
to catch both blank-at-eol and blank-at-eof.  This way, people who used
trailing-space continue to catch both classes of errors.

Signed-off-by: Junio C Hamano <>
2009-09-05 23:14:31 -07:00
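
The synonym relationship described in aeb84b05ae can be sketched with bit flags. This is an illustrative Python sketch only: the flag values, the `RULES` table and `parse_rules()` are hypothetical and are not taken from git's source.

```python
# Hypothetical flag values; only the relationship between them matters.
WS_BLANK_AT_EOL = 1 << 0
WS_BLANK_AT_EOF = 1 << 1
# The traditional name becomes a synonym that covers both new classes.
WS_TRAILING_SPACE = WS_BLANK_AT_EOL | WS_BLANK_AT_EOF

RULES = {
    "blank-at-eol": WS_BLANK_AT_EOL,
    "blank-at-eof": WS_BLANK_AT_EOF,
    "trailing-space": WS_TRAILING_SPACE,
}


def parse_rules(spec: str, rule: int = 0) -> int:
    """Apply a comma-separated list of rule names; a leading '-' clears bits."""
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        negated = token.startswith("-")
        bits = RULES[token[1:] if negated else token]
        rule = rule & ~bits if negated else rule | bits
    return rule
```

Under this sketch, a configuration that only ever said `trailing-space` keeps catching both classes, while `trailing-space,-blank-at-eof` opts out of the end-of-file check alone.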
Junio C Hamano 77b15bbd88 apply --whitespace=warn/error: diagnose blank at EOF
"git apply" strips new blank lines at EOF under --whitespace=fix option,
but neither --whitespace=warn nor --whitespace=error paid any attention to
these errors.

Introduce a new whitespace error class, blank-at-eof, to make the
whitespace error handling more consistent.

The patch adds a new "linenr" field to the struct fragment in order to
record which line the hunk started in the input file, but this is needed
solely for reporting purposes.  The detection of this class of whitespace
errors cannot be done while parsing a patch like we do for all the other
classes of whitespace errors.  It instead has to wait until we find where
to apply the hunk, but at that point, we do not have access to the
original line number in the input file anymore, hence the new field.

Depending on your point of view, this may be a bugfix that makes warn and
error in line with fix.  Or you could call it a new feature.  The line
between them is somewhat fuzzy in this case.

Strictly speaking, triggering more errors than before is a change in
behaviour that is not backward compatible, even though the reason for the
change is because the code was not checking for an error that it should
have.  People who do not want added blank lines at EOF to trigger an error
can disable the new error class.

Signed-off-by: Junio C Hamano <>
2009-09-04 11:50:26 -07:00
Junio C Hamano 422a82f213 Fix severe breakage in "git-apply --whitespace=fix"
735c674 (Trailing whitespace and no newline fix, 2009-07-22) completely
broke --whitespace=fix, causing it to lose all the empty lines in a patch.

Signed-off-by: Junio C Hamano <>
2009-07-25 01:29:20 -07:00
SZEDER Gábor 735c674416 Trailing whitespace and no newline fix
If a patch adds a new line to the end of a file and this line ends with
one trailing whitespace character and has no newline, then
'--whitespace=fix' currently does not remove that trailing whitespace.

This patch fixes this by removing the check for trailing whitespace at
the end of the line at a hardcoded offset which does not take the
eventual absence of newline into account.

Signed-off-by: SZEDER Gábor <>
Signed-off-by: Junio C Hamano <>
2009-07-22 18:54:55 -07:00
Junio C Hamano a437900fd7 attribute: whitespace set to true detects all errors known to git
That is what the documentation says, but the code pretends as if all the
known whitespace error tokens were given.

Among the whitespace error tokens, there is one kind that loosens the rule
when set: cr-at-eol.  Which means that whitespace error token that is set
to true ignores a newly introduced CR at the end, which is inconsistent
with the documentation.

Signed-off-by: Junio C Hamano <>
2009-06-21 10:43:10 -07:00
Brandon Casey f285a2d7ed Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer
Many call sites use strbuf_init(&foo, 0) to initialize local
strbuf variable "foo" which has not been accessed since its
declaration. These can be replaced with a static initialization
using the STRBUF_INIT macro which is just as readable, saves a
function call, and takes up fewer lines.

Signed-off-by: Brandon Casey <>
Signed-off-by: Shawn O. Pearce <>
2008-10-12 12:36:19 -07:00
Junio C Hamano 877f23ccb8 Teach "diff --check" about new blank lines at end
When a patch adds new blank lines at the end, "git apply --whitespace"
warns.  This teaches "diff --check" to do the same.

Signed-off-by: Junio C Hamano <>
2008-06-26 22:07:26 -07:00
Junio C Hamano 8f8841e9c8 check_and_emit_line(): rename and refactor
The function name was too bland and not explicit enough as to what it is
checking.  Split it into two, and call the one that checks if there is a
whitespace breakage "ws_check()", and call the other one that checks and
emits the line after color coding "ws_check_emit()".

Signed-off-by: Junio C Hamano <>
2008-06-26 18:13:50 -07:00
Junio C Hamano c6fabfafbc git-apply --whitespace=fix: fix off by one thinko
When a patch adds a whitespace followed by end-of-line, the
trailing whitespace error was detected correctly but was not
fixed, due to misconversion in 42ab241 (builtin-apply.c: do not
feed copy_wsfix() leading '+').

Signed-off-by: Junio C Hamano <>
2008-02-26 12:24:40 -08:00
Junio C Hamano fe3403c320 ws_fix_copy(): move the whitespace fixing function to ws.c
This is used by git-apply but we can use it elsewhere by slightly
generalizing it.

Signed-off-by: Junio C Hamano <>
2008-02-23 16:59:16 -08:00
Junio C Hamano b2979ff599 core.whitespace: cr-at-eol
This new error mode allows a line to have a carriage return at the
end of the line when checking and fixing trailing whitespace errors.

Some people like to keep CRLF line ending recorded in the repository,
and still want to take advantage of the automated trailing whitespace
stripping.  We still show ^M in the diff output piped to "less" to
remind them that they do have the CR at the end, but these carriage
return characters at the end are no longer flagged as errors.

Signed-off-by: Junio C Hamano <>
2008-02-05 00:38:41 -08:00
J. Bruce Fields ffe568859b whitespace: more accurate initial-indent highlighting
Instead of highlighting the entire initial indent, highlight only the
problematic spaces.

In the case of an indent like ' \t \t' there may be multiple problematic
ranges, so it's easiest to emit the highlighting as we go instead of
trying to remember disjoint ranges and do it all at the end.

Signed-off-by: J. Bruce Fields <>
Signed-off-by: Junio C Hamano <>
2007-12-16 13:07:58 -08:00
J. Bruce Fields 9afa2d4aa9 whitespace: fix initial-indent checking
After this patch, "written" counts the number of bytes up to and
including the most recently seen tab.  This allows us to detect (and
count) spaces by comparing to "i".

This allows catching initial indents like '\t        ' (a tab followed
by 8 spaces), while previously indent-with-non-tab caught only indents
that consisted entirely of spaces.

This also allows fixing an indent-with-non-tab regression, so we can
again detect indents like '\t \t'.

Also update tests to catch these cases.

Signed-off-by: J. Bruce Fields <>
Signed-off-by: Junio C Hamano <>
2007-12-16 13:07:49 -08:00
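
The counting scheme described in 9afa2d4aa9 can be sketched as follows. This is a simplified Python illustration, not the code in ws.c: `check_indent()` is a hypothetical helper, the error names are borrowed from git's rule names, and how each case is classified here is an assumption.

```python
from typing import Set


def check_indent(line: str, tab_width: int = 8) -> Set[str]:
    """Scan the initial indent of `line` and return the problems found."""
    errors: Set[str] = set()
    written = 0  # bytes up to and including the most recently seen tab
    i = 0
    while i < len(line) and line[i] in " \t":
        if line[i] == "\t":
            if i > written:
                # At least one space sits between the previous tab
                # (or the line start) and this tab.
                errors.add("space-before-tab")
            written = i + 1
        i += 1
    # i - written is the number of spaces after the last tab.
    if i - written >= tab_width:
        errors.add("indent-with-non-tab")
    return errors
```

Comparing `written` with `i` is what lets the sketch count spaces that follow a tab, so an indent of a tab plus eight spaces is flagged just like eight spaces on their own, and exactly eight spaces (positions 0..7) is enough to trigger the check.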
J. Bruce Fields 954ecd4353 whitespace: minor cleanup
The variable leading_space is initially used to represent the index of
the last space seen before a non-space.  Then later it represents the
index of the first non-indent character.

It will prove simpler to replace it by a variable representing a number
of bytes.  Eventually it will represent the number of bytes written so
far (in the stream != NULL case).

Signed-off-by: J. Bruce Fields <>
Signed-off-by: Junio C Hamano <>
2007-12-16 13:07:41 -08:00
J. Bruce Fields 1020999a98 whitespace: reorganize initial-indent check
Reorganize to emphasize the most complicated part of the code (the tab

Signed-off-by: J. Bruce Fields <>
Signed-off-by: Junio C Hamano <>
2007-12-16 13:07:20 -08:00
J. Bruce Fields 4d9697c787 whitespace: fix off-by-one error in non-space-in-indent checking
If there were no tabs, and the last space was at position 7, then
positions 0..7 had spaces, so there were 8 spaces.

Update test to check exactly this case.

Signed-off-by: J. Bruce Fields <>
Signed-off-by: Junio C Hamano <>
2007-12-16 13:07:14 -08:00
Wincent Colaiuta 420f4f04de Use shorter error messages for whitespace problems
The initial version of the whitespace_error_string() function took the
messages from builtin-apply.c rather than the shorter messages from

This commit addresses Junio's concern that these messages might be too
long (now that we can emit multiple warnings per line).

Signed-off-by: Wincent Colaiuta <>
Signed-off-by: Junio C Hamano <>
2007-12-14 20:51:58 -08:00
Wincent Colaiuta c1795bb08a Unify whitespace checking
This commit unifies three separate places where whitespace checking was

 - the whitespace checking previously done in builtin-apply.c is
extracted into a function in ws.c

 - the equivalent logic in "git diff" is removed

 - the emit_line_with_ws() function is also removed because that also
rechecks the whitespace, and its functionality is rolled into ws.c

The new function is called check_and_emit_line() and it does two things:
checks a line for whitespace errors and optionally emits it. The checking
is based on lines of content rather than patch lines (in other words, the
caller must strip the leading "+" or "-"); this was suggested by Junio on
the mailing list to allow for a future extension to "git show" to display
whitespace errors in blobs.

At the same time we teach it to report all classes of whitespace errors
found for a given line rather than reporting only the first found error.

Signed-off-by: Wincent Colaiuta <>
Signed-off-by: Junio C Hamano <>
2007-12-13 23:43:58 -08:00
Junio C Hamano cf1b7869f0 Use gitattributes to define per-path whitespace rule
The `core.whitespace` configuration variable allows you to define what
`diff` and `apply` should consider whitespace errors for all paths in
the project (See gitlink:git-config[1]).  This attribute gives you finer
control per path.

For example, if you have these in the .gitattributes:

    frotz   whitespace
    nitfol  -whitespace
    xyzzy   whitespace=-trailing

all types of whitespace problems known to git are noticed in path 'frotz'
(i.e. diff shows them in diff.whitespace color, and apply warns about
them), no whitespace problem is noticed in path 'nitfol', and the
default types of whitespace problems except "trailing whitespace" are
noticed for path 'xyzzy'.  A project with mixed Python and C might want
to have:

    *.c    whitespace
    *.py   whitespace=-indent-with-non-tab

in its toplevel .gitattributes file.

Signed-off-by: Junio C Hamano <>
2007-12-06 00:45:30 -08:00
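
The three cases in the `.gitattributes` example of cf1b7869f0 can be sketched as a small resolver. This is a hedged Python illustration rather than git's implementation: the rule sets, the choice of defaults, the prefix matching and the `resolve()` helper are all assumptions made for the example.

```python
from typing import FrozenSet, Optional, Union

# Hypothetical rule sets, assumed for illustration only.
ALL_RULES = frozenset({"trailing-space", "space-before-tab", "indent-with-non-tab"})
DEFAULT_RULES = frozenset({"trailing-space", "space-before-tab"})


def resolve(attr: Union[bool, str, None],
            default: FrozenSet[str] = DEFAULT_RULES) -> FrozenSet[str]:
    """Map the state of the `whitespace` attribute to a set of active rules."""
    if attr is True:    # "frotz whitespace": every known rule
        return ALL_RULES
    if attr is False:   # "nitfol -whitespace": nothing is checked
        return frozenset()
    if attr is None:    # attribute unspecified: fall back to the default
        return default
    rules = set(default)  # "xyzzy whitespace=...": adjust the defaults
    for token in attr.split(","):
        negated = token.startswith("-")
        name = token[1:] if negated else token
        if not name:
            continue
        # Assumed here: a rule name may be abbreviated to a prefix.
        matches = {r for r in ALL_RULES if r.startswith(name)}
        rules = rules - matches if negated else rules | matches
    return frozenset(rules)
```

With these assumptions, `resolve("-trailing")` leaves only the space-before-tab check active, which mirrors the 'xyzzy' line in the example above.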