Initial check in of YAPF 0.1.

style-config
Bill Wendling 8 years ago
commit 7d623455f4
  1. 28
      .gitignore
  2. 9
      AUTHORS
  3. 24
      CONTRIBUTING
  4. 13
      CONTRIBUTORS
  5. 202
      LICENSE
  6. 151
      README.rst
  7. 68
      setup.py
  8. 142
      yapf/__init__.py
  9. 18
      yapf/__main__.py
  10. 13
      yapf/yapflib/__init__.py
  11. 162
      yapf/yapflib/blank_line_calculator.py
  12. 293
      yapf/yapflib/comment_splicer.py
  13. 77
      yapf/yapflib/file_resources.py
  14. 388
      yapf/yapflib/format_decision_state.py
  15. 211
      yapf/yapflib/format_token.py
  16. 107
      yapf/yapflib/line_joiner.py
  17. 250
      yapf/yapflib/pytree_unwrapper.py
  18. 216
      yapf/yapflib/pytree_utils.py
  19. 135
      yapf/yapflib/pytree_visitor.py
  20. 451
      yapf/yapflib/reformatter.py
  21. 268
      yapf/yapflib/split_penalty.py
  22. 69
      yapf/yapflib/style.py
  23. 245
      yapf/yapflib/subtype_assigner.py
  24. 390
      yapf/yapflib/unwrapped_line.py
  25. 72
      yapf/yapflib/verifier.py
  26. 234
      yapf/yapflib/yapf_api.py
  27. 56
      yapftests/__init__.py
  28. 276
      yapftests/blank_line_calculator_test.py
  29. 312
      yapftests/comment_splicer_test.py
  30. 169
      yapftests/format_decision_state_test.py
  31. 42
      yapftests/format_token_test.py
  32. 95
      yapftests/line_joiner_test.py
  33. 378
      yapftests/pytree_unwrapper_test.py
  34. 194
      yapftests/pytree_utils_test.py
  35. 123
      yapftests/pytree_visitor_test.py
  36. 1249
      yapftests/reformatter_test.py
  37. 262
      yapftests/split_penalty_test.py
  38. 204
      yapftests/subtype_assigner_test.py
  39. 119
      yapftests/unwrapped_line_test.py
  40. 309
      yapftests/yapf_test.py

28
.gitignore vendored

@ -0,0 +1,28 @@
#==============================================================================#
# This file specifies intentionally untracked files that git should ignore.
# See: http://www.kernel.org/pub/software/scm/git/docs/gitignore.html
#
# This file is intentionally different from the output of `git svn show-ignore`,
# as most of those are useless.
#==============================================================================#
#==============================================================================#
# File extensions to be ignored anywhere in the tree.
#==============================================================================#
# Temp files created by most text editors.
*~
# Merge files created by git.
*.orig
# Byte compiled python modules.
*.pyc
# vim swap files
.*.sw?
.sw?
# OS X specific files.
.DS_Store
#==============================================================================#
# Directories to ignore (do not add trailing '/'s, they skip symlinks).
#==============================================================================#
# The build directory.
build

@ -0,0 +1,9 @@
# This is the official list of YAPF authors for copyright purposes.
# This file is distinct from the CONTRIBUTORS files.
# See the latter for an explanation.
# Names should be added to this file as:
# Name or Organization <email address>
# The email address is not required for organizations.
Google Inc.

@ -0,0 +1,24 @@
Want to contribute? Great! First, read this page (including the small print at the end).
### Before you contribute
Before we can use your code, you must sign the
[Google Individual Contributor License Agreement](https://developers.google.com/open-source/cla/individual?csw=1)
(CLA), which you can do online. The CLA is necessary mainly because you own the
copyright to your changes, even after your contribution becomes part of our
codebase, so we need your permission to use and distribute your code. We also
need to be sure of various other things—for instance that you'll tell us if you
know that your code infringes on other people's patents. You don't have to sign
the CLA until after you've submitted your code for review and a member has
approved it, but you must do it before we can put your code into our codebase.
Before you start working on a larger contribution, you should get in touch with
us first through the issue tracker with your idea so that we can help out and
possibly guide you. Coordinating up front makes it much easier to avoid
frustration later on.
### Code reviews
All submissions, including submissions by project members, require review. We
use GitHub pull requests for this purpose.
### The small print
Contributions made by corporations are covered by a different agreement than
the one above, the Software Grant and Corporate Contributor License Agreement.

@ -0,0 +1,13 @@
# People who have agreed to one of the CLAs and can contribute patches.
# The AUTHORS file lists the copyright holders; this file
# lists people. For example, Google employees are listed here
# but not in AUTHORS, because Google holds the copyright.
#
# https://developers.google.com/open-source/cla/individual
# https://developers.google.com/open-source/cla/corporate
#
# Names should be added to this file as:
# Name <email address>
Bill Wendling <morbo@google.com>
Eli Bendersky <eliben@google.com>

@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

@ -0,0 +1,151 @@
====
YAPF
====
Introduction
============
Most of the current formatters for Python -- e.g., autopep8 and pep8ify -- are
made to remove lint errors from code. This has some obvious limitations. For
instance, code that already conforms to the PEP 8 guidelines may not be
reformatted at all, yet conforming to PEP 8 doesn't mean that the code looks
good.
YAPF takes a different approach. It's based on 'clang-format', developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide.
The ultimate goal is that the code YAPF produces is as good as the code that a
programmer would write if they were following the style guide.
.. contents::
Installation
============
From source directory::

    $ sudo python ./setup.py install

Usage
=====

::

    usage: __main__.py [-h] [-d | -i] [-l START-END | -r] ...

    Formatter for Python code.

    positional arguments:
      files

    optional arguments:
      -h, --help            show this help message and exit
      -d, --diff            print the diff for the fixed source
      -i, --in-place        make changes to files in place
      -l START-END, --lines START-END
                            range of lines to reformat, one-based
      -r, --recursive       run recursively over directories

Why Not Improve Existing Tools?
===============================
We wanted to use clang-format's reformatting algorithm. It's very powerful and
designed to come up with the best formatting possible. Existing tools were
created with different goals in mind, and would require extensive modifications
to convert to using clang-format's algorithm.
Can I Use YAPF In My Program?
=============================
Please do! YAPF was designed to be used as a library as well as a command line
tool. This means that a tool or IDE plugin is free to use YAPF.
Gory Details
============
Algorithm Design
----------------
The main data structure in YAPF is the UnwrappedLine object. It holds a list of
FormatTokens that we would want to place on a single line if there were no
column limit. The exception is a comment in the middle of an expression
statement, which forces the line to be formatted on more than one line. The
formatter works on one UnwrappedLine object at a time.
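
As a rough illustration of this data structure, here is a minimal Python
sketch. The ``Token`` and ``Line`` classes and their methods are hypothetical
stand-ins, not YAPF's actual FormatToken or UnwrappedLine classes, and joining
tokens with a single space is a deliberate simplification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
  """Hypothetical stand-in for a FormatToken: only the token text."""
  value: str


@dataclass
class Line:
  """Hypothetical stand-in for an UnwrappedLine: a depth plus its tokens."""
  depth: int
  tokens: List[Token] = field(default_factory=list)

  def render_unsplit(self, indent_width=2):
    """Place every token on one line, as if there were no column limit."""
    indent = ' ' * (indent_width * self.depth)
    return indent + ' '.join(token.value for token in self.tokens)

  def fits(self, column_limit, indent_width=2):
    """Report whether the unsplit line stays within the column limit."""
    return len(self.render_unsplit(indent_width)) <= column_limit


line = Line(depth=1, tokens=[Token('x'), Token('='), Token('42')])
print(repr(line.render_unsplit()))   # '  x = 42'
print(line.fits(column_limit=80))    # True
```

Only a line for which ``fits`` is false would need any splitting decisions at
all; the rest of this section is about that case.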
An UnwrappedLine typically won't affect the formatting of lines before or after
it. There is a part of the algorithm that may join two or more UnwrappedLines
into one line. For instance, an if-then statement with a short body can be
placed on a single line:

    if a == 42: continue

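
A minimal sketch of this joining decision, assuming the only criteria are that
the first line ends in a colon and that the joined text fits within the column
limit. The ``join_short_body`` helper is hypothetical; YAPF's own line joiner
may apply further rules.

```python
def join_short_body(header, body, column_limit=80):
  """Return header and body joined on one line, or None if they don't fit."""
  if not header.rstrip().endswith(':'):
    return None  # Only a compound-statement header can take a body.
  joined = header.rstrip() + ' ' + body.strip()
  return joined if len(joined) <= column_limit else None


print(join_short_body('if a == 42:', '  continue'))  # if a == 42: continue
```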
YAPF's formatting algorithm creates a weighted tree that acts as the solution
space for the algorithm. Each node in the tree represents the result of a
formatting decision --- i.e., whether to split or not to split before a token.
Each formatting decision has a cost associated with it. Therefore, the cost is
realized on the edge between two nodes. (In reality, the weighted tree doesn't
have separate edge objects, so the cost resides on the nodes themselves.)
For example, take the following Python code snippet. For the sake of this
example, assume that line (1) violates the column limit restriction and needs to
be reformatted.

    1: def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):
    2:   pass

For line (1), the algorithm will build a tree where each node (a
FormatDecisionState object) is the state of the line at that token given the
decision to split before the token or not. Note: the FormatDecisionState objects
are copied by value so each node in the tree is unique and a change in one
doesn't affect other nodes.
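
To make the copy-by-value point concrete, here is a small sketch. The
``DecisionState`` class, its attributes, and the ``child`` method are
hypothetical and far simpler than the real FormatDecisionState; the sketch only
shows that each decision yields a fresh copy, so sibling nodes never share
state.

```python
import copy


class DecisionState(object):
  """Hypothetical, simplified stand-in for a FormatDecisionState."""

  def __init__(self):
    self.column = 0         # Column reached on the current output line.
    self.split_before = []  # One bool per token placed so far.

  def child(self, token, split, indent=4):
    """Return a new state with `token` placed; `self` is left untouched."""
    state = copy.deepcopy(self)
    state.column = (indent if split else self.column) + len(token)
    state.split_before.append(split)
    return state


root = DecisionState()
node = root.child('def', split=False)
left = node.child('foo', split=True)    # Decision to split before 'foo'.
right = node.child('foo', split=False)  # Decision not to split.
print(node.split_before, left.split_before, right.split_before)
```

Because ``child`` works on a deep copy, appending to ``left.split_before`` does
not change ``node`` or ``right``.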
Here is a hypothetical subtree of the first five tokens. The value in
parentheses is the hypothetical cost of splitting before the token. (The left
hand branch is a decision to split and the right hand branch is a decision not
to split.)

                             'def'
                               |
                               | (0)
                               |
                         'xxxxxxxxxxx'
                               |
                               | (0)
                               |
                              '('
                               |
             (3) +---------------------------+ (1)
                 |                           |
                 |                           |
          'aaaaaaaaaaaa'              'aaaaaaaaaaaa'
                 |                           |
                 |                           |
         +--------------+             +-------------+
         |              |             |             |
    (50) |          (0) |        (50) |         (0) |
        ','            ','           ','           ','

And so on. Heuristics are used to determine the costs of splitting or not
splitting. Because a node holds the state of the tree up to a token's insertion,
it can easily determine if a splitting decision will violate one of the style
requirements. For instance, the heuristic is able to apply an extra penalty to
the edge when not splitting between the previous token and the one being added.
There are some instances where we will never want to split the line, because
doing so will always be detrimental (i.e., it will require a backslash-newline,
which is very rarely desirable). For line (1), we will never want to split the
first three tokens: 'def', 'xxxxxxxxxxx', and '('. Nor will we want to split
between the ')' and the ':' at the end. These regions are said to be
"unbreakable." This is reflected in the tree by there not being a 'split'
decision (left hand branch) within the unbreakable region.
Now that we have the tree, we determine what the "best" formatting is by finding
the path through the tree with the lowest cost.
And that's it!
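
The lowest-cost search can be sketched with a priority queue, which is the
structure the module docstring of yapf/__init__.py mentions. The sketch below
is a uniform-cost (Dijkstra-style) search under simplifying assumptions, not
YAPF's implementation: the function names, the ``None`` marker for unbreakable
regions, the fixed continuation indent, and the per-character excess penalty
are all hypothetical.

```python
import heapq
import itertools


def _render(tokens, splits, continuation_indent):
  """Lay out the tokens, breaking the line before each index in `splits`."""
  split_set = set(splits)
  lines, current = [], ''
  for index, token in enumerate(tokens):
    if index in split_set:
      lines.append(current.rstrip())
      current = ' ' * continuation_indent
    current += token
  lines.append(current.rstrip())
  return '\n'.join(lines)


def best_formatting(tokens, split_penalties, column_limit,
                    continuation_indent=4, excess_penalty=100):
  """Find the cheapest layout of `tokens` by uniform-cost search.

  tokens: list of str; each token carries its own trailing space, if any
    (that space counts toward the width, a simplification).
  split_penalties: split_penalties[i] is the cost of a line break before
    tokens[i], or None where a break is forbidden (an unbreakable region).
  Returns a (total_penalty, formatted_text) tuple.
  """
  if not tokens:
    return 0, ''

  def overflow(start, end):
    # Number of columns in [start, end) that lie past the column limit.
    return max(0, end - max(start, column_limit))

  tie = itertools.count()  # Tie-breaker so the heap never compares states.
  first_end = len(tokens[0])
  heap = [(excess_penalty * overflow(0, first_end), next(tie), 1, first_end,
           ())]
  seen = set()
  while heap:
    penalty, _, index, column, splits = heapq.heappop(heap)
    if index == len(tokens):
      return penalty, _render(tokens, splits, continuation_indent)
    if (index, column) in seen:
      continue  # A path that is at least as cheap already reached this state.
    seen.add((index, column))
    width = len(tokens[index])
    # Decision 1: do not split; the token continues the current line.
    end = column + width
    heapq.heappush(heap, (penalty + excess_penalty * overflow(column, end),
                          next(tie), index + 1, end, splits))
    # Decision 2: split before the token, unless the region is unbreakable.
    if split_penalties[index] is not None:
      end = continuation_indent + width
      cost = (split_penalties[index] +
              excess_penalty * overflow(continuation_indent, end))
      heapq.heappush(heap, (penalty + cost, next(tie), index + 1, end,
                            splits + (index,)))
  raise AssertionError('unreachable: the no-split path always completes')


tokens = ['def ', 'f', '(', 'aaaa, ', 'bbbb, ', 'cccc', ')', ':']
penalties = [None, None, None, 3, 1, 1, None, None]
penalty, text = best_formatting(tokens, penalties, column_limit=16)
print(penalty)
print(text)
```

With a column limit of 16, the cheapest path is the single split before
``'bbbb, '``: every other choice either overflows the limit or pays a higher
split penalty. Because all costs are non-negative, the first complete state
popped from the queue is the one with the lowest total penalty.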

@ -0,0 +1,68 @@
#!/usr/bin/env python
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import unittest
from distutils.core import setup, Command
import yapf
import yapftests


class RunTests(Command):
  user_options = []

  def initialize_options(self):
    pass

  def finalize_options(self):
    pass

  def run(self):
    tests = unittest.TestSuite(yapftests.suite())
    runner = unittest.TextTestRunner()
    runner.run(tests)


# The long description comes from README.rst, the README file in this commit.
with open('README.rst', 'r') as fd:
  setup(
      name='yapf',
      version=yapf.__version__,
      description='A formatter for Python code.',
      long_description=fd.read(),
      license='Apache License, Version 2.0',
      author='Google Inc.',
      maintainer='Bill Wendling',
      maintainer_email='morbo@google.com',
      packages=['yapf', 'yapf.yapflib'],
      classifiers=[
          'Development Status :: 3 - Alpha',
          'Environment :: Console',
          'Intended Audience :: Developers',
          'License :: OSI Approved :: Apache Software License',
          'Operating System :: OS Independent',
          'Programming Language :: Python',
          'Programming Language :: Python :: 2',
          'Programming Language :: Python :: 2.7',
          'Programming Language :: Python :: 3',
          'Programming Language :: Python :: 3.2',
          'Programming Language :: Python :: 3.3',
          'Programming Language :: Python :: 3.4',
          'Topic :: Software Development :: Libraries :: Python Modules',
          'Topic :: Software Development :: Quality Assurance',
      ],
      cmdclass={
          'test': RunTests,
      },
  )

@ -0,0 +1,142 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Yet Another Python Formatter.
YAPF uses the algorithm in clang-format to figure out the "best" formatting for
Python code. It looks at the program as a series of "unwrappable lines" ---
i.e., lines which, if there were no column limit, we would place all tokens on
that line. It then uses a priority queue to figure out what the best formatting
is --- i.e., the formatting with the least penalty.
It differs from tools like autopep8 and pep8ify in that it doesn't just look for
violations of the style guide, but looks at the module as a whole, making
formatting decisions based on what's the best format for each line.
If no filenames are specified, YAPF reads the code from stdin.
"""
import argparse
import logging
import sys
from yapf.yapflib import file_resources
from yapf.yapflib import yapf_api
__version__ = '0.1'


def main(argv):
  """Main program.

  Arguments:
    argv: (list of string) The command-line arguments, including the program
      name in argv[0].

  Returns:
    0 if there were no errors, non-zero otherwise.
  """
  parser = argparse.ArgumentParser(description='Formatter for Python code.')
  diff_inplace_group = parser.add_mutually_exclusive_group()
  diff_inplace_group.add_argument(
      '-d', '--diff', action='store_true',
      help='print the diff for the fixed source')
  diff_inplace_group.add_argument(
      '-i', '--in-place', action='store_true',
      help='make changes to files in place')

  lines_recursive_group = parser.add_mutually_exclusive_group()
  lines_recursive_group.add_argument(
      '-l', '--lines', metavar='START-END', action='append', default=None,
      help='range of lines to reformat, one-based')
  lines_recursive_group.add_argument(
      '-r', '--recursive', action='store_true',
      help='run recursively over directories')

  parser.add_argument('files', nargs=argparse.REMAINDER)
  # Parse the arguments passed to main() rather than sys.argv, and take the
  # file list from the parsed result so option flags are not treated as files.
  args = parser.parse_args(argv[1:])

  if args.lines and len(args.files) > 1:
    parser.error('cannot use -l/--lines with more than one file')

  lines = _GetLines(args.lines) if args.lines is not None else None
  files = file_resources.GetCommandLineFiles(args.files, args.recursive)
  if not files:
    # No arguments specified. Read code from stdin.
    if args.in_place or args.diff:
      parser.error('cannot use --in-place or --diff flags when reading '
                   'from stdin')

    original_source = []
    while True:
      try:
        # Use 'raw_input' instead of 'sys.stdin.read', because otherwise the
        # user will need to hit 'Ctrl-D' more than once if they're inputting
        # the program by hand. 'raw_input' throws an EOFError exception if
        # 'Ctrl-D' is pressed, which makes it easy to bail out of this loop.
        original_source.append(raw_input())
      except EOFError:
        break
    sys.stdout.write(yapf_api.FormatCode(
        unicode('\n'.join(original_source) + '\n'),
        filename='<stdin>',
        lines=lines))
    return 0

  FormatFiles(files, lines, args.in_place)
  return 0


def FormatFiles(filenames, lines, in_place=False):
  """Format a list of files.

  Arguments:
    filenames: (list of unicode) A list of files to reformat.
    lines: (list of tuples of integers) A list of tuples of lines, [start, end],
      that we want to format. The lines are 1-based indexed. This argument
      overrides the 'args.lines'. It can be used by third-party code (e.g.,
      IDEs) when reformatting a snippet of code.
    in_place: (bool) Modify the files in place.
  """
  for filename in filenames:
    logging.info('Reformatting %s', filename)
    reformatted_code = yapf_api.FormatFile(filename, lines)
    if reformatted_code is not None:
      file_resources.WriteReformattedCode(filename, reformatted_code, in_place)


def _GetLines(line_strings):
  """Parses the start and end lines from a line string like 'start-end'.

  Arguments:
    line_strings: (array of string) A list of strings representing a line
      range like 'start-end'.

  Returns:
    A list of tuples of the start and end line numbers.

  Raises:
    ValueError: If the line string failed to parse or was an invalid line range.
  """
  lines = []
  for line_string in line_strings:
    # Materialize the map so that indexing also works where map() is lazy.
    line = list(map(int, line_string.split('-', 1)))
    if line[0] < 1:
      raise ValueError('invalid start of line range: %r' % line)
    if line[0] > line[1]:
      raise ValueError('end comes before start in line range: %r' % line)
    lines.append(tuple(line))
  return lines


if __name__ == '__main__':
  sys.exit(main(sys.argv))

@ -0,0 +1,18 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import yapf
sys.exit(yapf.main(sys.argv))

@ -0,0 +1,13 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

@ -0,0 +1,162 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Calculate the number of blank lines between top-level entities.
Calculates how many blank lines we need between classes, functions, and other
entities at the same level.
CalculateBlankLines(): the main function exported by this module.
Annotations:
newlines: The number of newlines required before the node.
"""
from lib2to3 import pytree
from yapf.yapflib import pytree_utils
from yapf.yapflib import pytree_visitor
_NO_BLANK_LINES = 1
_ONE_BLANK_LINE = 2
_TWO_BLANK_LINES = 3
_PYTHON_STATEMENTS = frozenset({
    'simple_stmt', 'small_stmt', 'expr_stmt', 'print_stmt', 'del_stmt',
    'pass_stmt', 'break_stmt', 'continue_stmt', 'return_stmt', 'raise_stmt',
    'yield_stmt', 'import_stmt', 'global_stmt', 'exec_stmt', 'assert_stmt',
    'if_stmt', 'while_stmt', 'for_stmt', 'try_stmt'
})


def CalculateBlankLines(tree):
  """Run the blank line calculator visitor over the tree.

  This modifies the tree in place.

  Arguments:
    tree: the top-level pytree node to annotate with newline counts.
  """
  blank_line_calculator = _BlankLineCalculator()
  blank_line_calculator.Visit(tree)


class _BlankLineCalculator(pytree_visitor.PyTreeVisitor):
  """_BlankLineCalculator - see file-level docstring for a description."""

  def __init__(self):
    self.class_level = 0
    self.function_level = 0
    self.last_comment_lineno = 0
    self.last_was_decorator = False
    self.last_was_class_or_function = False

  def Visit_simple_stmt(self, node):  # pylint: disable=invalid-name
    self.DefaultNodeVisit(node)
    if pytree_utils.NodeName(node.children[0]) == 'COMMENT':
      self.last_comment_lineno = node.children[0].lineno

  def Visit_decorator(self, node):  # pylint: disable=invalid-name
    if (self.last_comment_lineno and
        self.last_comment_lineno == node.children[0].lineno - 1):
      self._SetNumNewlines(node.children[0], _NO_BLANK_LINES)
    else:
      self._SetNumNewlines(node.children[0], self._GetNumNewlines())
    for child in node.children:
      self.Visit(child)
    self.last_was_decorator = True

  def Visit_classdef(self, node):  # pylint: disable=invalid-name
    index = self._SetBlankLinesBetweenCommentAndClassFunc(node)
    self.last_was_decorator = False
    self.class_level += 1
    for child in node.children[index:]:
      self.Visit(child)
    self.class_level -= 1
    self.last_was_class_or_function = True

  def Visit_funcdef(self, node):  # pylint: disable=invalid-name
    index = self._SetBlankLinesBetweenCommentAndClassFunc(node)
    self.last_was_decorator = False
    self.function_level += 1
    for child in node.children[index:]:
      self.Visit(child)
    self.function_level -= 1
    self.last_was_class_or_function = True

  def DefaultNodeVisit(self, node):
    """Override the default visitor for Node.

    This will set the blank lines required if the last entity was a class or
    function.

    Arguments:
      node: (pytree.Node) The node to visit.
    """
    def GetFirstChildLeaf(node):
      if isinstance(node, pytree.Leaf):
        return node
      return GetFirstChildLeaf(node.children[0])

    if self.last_was_class_or_function:
      if pytree_utils.NodeName(node) in _PYTHON_STATEMENTS:
        leaf = GetFirstChildLeaf(node)
        if pytree_utils.NodeName(leaf) != 'COMMENT':
          self._SetNumNewlines(leaf, self._GetNumNewlines())
    self.last_was_class_or_function = False
    super(_BlankLineCalculator, self).DefaultNodeVisit(node)
def _SetBlankLinesBetweenCommentAndClassFunc(self, node):
"""Set the number of blanks between a comment and class or func definition.
Class and function definitions have leading comments as children of the
classdef and funcdef nodes.
Arguments:
node: (pytree.Node) The classdef or funcdef node.
Returns:
The index of the first child past the comment nodes.
"""
index = 0
while pytree_utils.IsCommentStatement(node.children[index]):
# Standalone comments are wrapped in a simple_stmt node with the comment
# node as its only child.
self.Visit(node.children[index].children[0])
self._SetNumNewlines(node.children[index].children[0], _ONE_BLANK_LINE)
index += 1
if (index and node.children[index].lineno - 1 ==
node.children[index - 1].children[0].lineno):
self._SetNumNewlines(node.children[index], _NO_BLANK_LINES)
else:
if self.last_comment_lineno + 1 == node.children[index].lineno:
num_newlines = _NO_BLANK_LINES
else:
num_newlines = self._GetNumNewlines()
self._SetNumNewlines(node.children[index], num_newlines)
return index
def _GetNumNewlines(self):
if self.last_was_decorator:
return _NO_BLANK_LINES
elif self._IsTopLevel():
return _TWO_BLANK_LINES
return _ONE_BLANK_LINE
def _SetNumNewlines(self, node, num_newlines):
pytree_utils.SetNodeAnnotation(node, pytree_utils.Annotation.NEWLINES,
num_newlines)
def _IsTopLevel(self):
return not (self.class_level or self.function_level)
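
Illustrative sketch (not part of the commit): the choice made by _GetNumNewlines depends only on whether the previous entity was a decorator and on the current nesting depth, so it can be restated as a pure function. The name get_num_newlines is hypothetical, and the constant values are assumptions (counts of newline characters, so "no blank lines" is a single newline); the real constants are defined earlier in this file.

```python
# Assumed values: each constant counts newline characters, so "no blank
# lines" means one newline. The real definitions appear earlier in the file.
NO_BLANK_LINES = 1
ONE_BLANK_LINE = 2
TWO_BLANK_LINES = 3


def get_num_newlines(last_was_decorator, class_level, function_level):
    """Restate the decision order of _GetNumNewlines as a pure function."""
    if last_was_decorator:
        # A def or class directly under its decorator gets no blank line.
        return NO_BLANK_LINES
    if not (class_level or function_level):
        # Top-level definitions are separated by two blank lines.
        return TWO_BLANK_LINES
    # Nested definitions (methods, inner functions) get one blank line.
    return ONE_BLANK_LINE
```

The decorator check comes first, which is why a decorated top-level function is not pushed two lines away from its decorator.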

@ -0,0 +1,293 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Comment splicer for lib2to3 trees.
The lib2to3 syntax tree produced by the parser holds comments and whitespace in
prefix attributes of nodes, rather than nodes themselves. This module provides
functionality to splice comments out of prefixes and into nodes of their own,
making them easier to process.
SpliceComments(): the main function exported by this module.
"""
from lib2to3 import pygram
from lib2to3 import pytree
from lib2to3.pgen2 import token
from yapf.yapflib import pytree_utils
def SpliceComments(tree):
"""Given a pytree, splice comments into nodes of their own right.
Extract comments from the prefixes where they are housed after parsing.
The prefixes that previously housed the comments become empty.
Args:
tree: a pytree.Node - the tree to work on. The tree is modified by this
function.
"""
# The previous leaf node encountered in the traversal.
# This is a list because Python 2.x doesn't have 'nonlocal' :)
prev_leaf = [None]
_AnnotateIndents(tree)
def _VisitNodeRec(node):
# This loop may insert into node.children, so we'll iterate over a copy.
for child in node.children[:]:
if isinstance(child, pytree.Node):
# Nodes don't have prefixes.
_VisitNodeRec(child)
else:
if child.prefix.lstrip().startswith('#'):
# We have a comment prefix in this child, so splicing is needed.
comment_prefix = child.prefix
comment_lineno = child.lineno - comment_prefix.count('\n')
# Remember the leading indentation of this prefix and clear it.
# Mopping up the prefix is important because we may go over this same
# child in the next iteration...
child_prefix = child.prefix.lstrip('\n')
prefix_indent = child_prefix[:child_prefix.find('#')]
child.prefix = ''
if child.type == token.NEWLINE:
# If the prefix was on a NEWLINE leaf, it's part of the line so it
# will be inserted after the previously encountered leaf.
# We can't just insert it before the NEWLINE node, because as a
# result of the way pytrees are organized, this node can be under
# an inappropriate parent.
assert prev_leaf[0] is not None
pytree_utils.InsertNodesAfter(_CreateCommentsFromPrefix(
comment_prefix, comment_lineno,
standalone=False), prev_leaf[0])
elif child.type == token.DEDENT:
# Comment prefixes on DEDENT nodes also deserve special treatment,
# because their final placement depends on their prefix.
# We'll look for an ancestor of this child with a matching
# indentation, and insert the comment after it.
ancestor_at_indent = _FindAncestorAtIndent(child, prefix_indent)
if ancestor_at_indent.type == token.DEDENT:
# Special case where the comment is inserted in the same
# indentation level as the DEDENT it was originally attached to.
pytree_utils.InsertNodesBefore(_CreateCommentsFromPrefix(
comment_prefix, comment_lineno,
standalone=True), ancestor_at_indent)
else:
pytree_utils.InsertNodesAfter(_CreateCommentsFromPrefix(
comment_prefix, comment_lineno,
standalone=True), ancestor_at_indent)
else:
# Otherwise there are two cases.
#
# 1. The comment is on its own line
# 2. The comment is part of an expression.
#
# Unfortunately, it's fairly difficult to distinguish between the
# two in lib2to3 trees. The algorithm here is to determine whether
# child is the first leaf in the statement it belongs to. If it is,
# then the comment (which is a prefix) belongs on a separate line.
# If it is not, it means the comment is buried deep in the statement
# and is part of some expression.
stmt_parent = _FindStmtParent(child)
for leaf_in_parent in stmt_parent.leaves():
if leaf_in_parent.type == token.NEWLINE:
continue
elif id(leaf_in_parent) == id(child):
# This comment stands on its own line, and it has to be inserted
# into the appropriate parent. We'll have to find a suitable
# parent to insert into. See comments above
# _STANDALONE_LINE_NODES for more details.
node_with_line_parent = _FindNodeWithStandaloneLineParent(child)
pytree_utils.InsertNodesBefore(
_CreateCommentsFromPrefix(
comment_prefix, comment_lineno, standalone=True),
node_with_line_parent)
break
else:
if comment_lineno == prev_leaf[0].lineno:
comment_lines = comment_prefix.splitlines()
comment_leaf = pytree.Leaf(type=token.COMMENT,
value=comment_lines[0].strip(),
context=('', (comment_lineno, 0)))
pytree_utils.InsertNodesAfter([comment_leaf], prev_leaf[0])
comment_prefix = '\n'.join(comment_lines[1:])
comment_lineno += 1
comments = _CreateCommentsFromPrefix(comment_prefix,
comment_lineno,
standalone=False)
pytree_utils.InsertNodesBefore(comments, child)
break
prev_leaf[0] = child
_VisitNodeRec(tree)
def _CreateCommentsFromPrefix(comment_prefix, comment_lineno, standalone=False):
"""Create pytree nodes to represent the given comment prefix.
Args:
comment_prefix: (unicode) the text of the comment from the node's prefix.
comment_lineno: (int) the line number for the start of the comment.
standalone: (bool) determines if the comment is standalone or not.
Returns:
A list of simple_stmt nodes if this is a standalone comment, otherwise a list
of new COMMENT leaves. The prefix may consist of multiple comment blocks,
separated by blank lines. Each block gets its own leaf.
"""
# The comment is stored in the prefix attribute, with no lineno of its
# own. So we only know at which line it ends. To find out at which line it
# starts, look at how many newlines the comment itself contains.
comments = []
lines = comment_prefix.split('\n')
index = 0
while True:
if index >= len(lines):
break
comment_block = []
while index < len(lines) and lines[index].lstrip().startswith('#'):
comment_block.append(lines[index])
index += 1
if comment_block:
new_lineno = comment_lineno + index - 1
comment_leaf = pytree.Leaf(type=token.COMMENT,
value='\n'.join(comment_block).strip(),
context=('', (new_lineno, 0)))
comment_node = comment_leaf if not standalone else pytree.Node(
pygram.python_symbols.simple_stmt, [comment_leaf])
comments.append(comment_node)
while index < len(lines) and not lines[index].lstrip():
index += 1
return comments
# "Standalone line nodes" are tree nodes that have to start a new line in Python
# code (and cannot follow a ';' or ':'). Other nodes, like 'expr_stmt', serve as
# parents of other nodes but can come later in a line. This is a list of
# standalone line nodes in the grammar. It is meant to be exhaustive
# *eventually*, and we'll modify it with time as we discover more corner cases
# in the parse tree.
#
# When splicing a standalone comment (i.e. a comment that appears on its own
# line, not on the same line with other code), it's important to insert it into
# an appropriate parent of the node it's attached to. An appropriate parent
# is the first "standalone line node" in the parent chain of a node.
_STANDALONE_LINE_NODES = frozenset(['suite', 'if_stmt', 'while_stmt',
'for_stmt', 'try_stmt', 'with_stmt',
'funcdef', 'classdef', 'decorated',
'file_input'])
def _FindNodeWithStandaloneLineParent(node):
"""Find a node whose parent is a 'standalone line' node.
See the comment above _STANDALONE_LINE_NODES for more details.
Arguments:
node: node to start from
Returns:
Suitable node that's either the node itself or one of its ancestors.
"""
if pytree_utils.NodeName(node.parent) in _STANDALONE_LINE_NODES:
return node
else:
# This is guaranteed to terminate because 'file_input' is the root node of
# any pytree.
return _FindNodeWithStandaloneLineParent(node.parent)
# "Statement nodes" are standalone statements. They don't have to start a new
# line.
_STATEMENT_NODES = frozenset(['simple_stmt']) | _STANDALONE_LINE_NODES
def _FindStmtParent(node):
"""Find the nearest parent of node that is a statement node.
Arguments:
node: node to start from
Returns:
Nearest parent (or node itself, if suitable).
"""
if pytree_utils.NodeName(node) in _STATEMENT_NODES:
return node
else:
return _FindStmtParent(node.parent)
def _FindAncestorAtIndent(node, indent):
"""Find an ancestor of node with the given indentation.
Arguments:
node: node to start from. This must not be the tree root.
indent: indentation string for the ancestor we're looking for.
See _AnnotateIndents for more details.
Returns:
An ancestor node with suitable indentation. If no suitable ancestor is
found, the closest ancestor to the tree root is returned.
"""
if node.parent.parent is None:
# Our parent is the tree root, so there's nowhere else to go.
return node
else:
# If the parent has an indent annotation, and it's shorter than node's
# indent, this is a suitable ancestor.
# The reason for "shorter" rather than "equal" is that comments may be
# improperly indented (i.e. by three spaces, where surrounding statements
# have either zero or two or four), and we don't want to propagate them all
# the way to the root.
parent_indent = pytree_utils.GetNodeAnnotation(
node.parent, pytree_utils.Annotation.CHILD_INDENT)
if parent_indent is not None and indent.startswith(parent_indent):
return node
else:
# Keep looking up the tree.
return _FindAncestorAtIndent(node.parent, indent)
def _AnnotateIndents(tree):
"""Annotate the tree with child_indent annotations.
A child_indent annotation on a node specifies the indentation (as a string,
like " ") of its children. It is inferred from the INDENT child of a node.
Arguments:
tree: root of a pytree. The pytree is modified to add annotations to nodes.
Raises:
RuntimeError: if the tree is malformed.
"""
# Annotate the root of the tree with zero indent.
if tree.parent is None:
pytree_utils.SetNodeAnnotation(tree, pytree_utils.Annotation.CHILD_INDENT,
'')
for child in tree.children:
if child.type == token.INDENT:
child_indent = pytree_utils.GetNodeAnnotation(
tree, pytree_utils.Annotation.CHILD_INDENT)
if child_indent is not None and child_indent != child.value:
raise RuntimeError('inconsistent indentation for child', (tree, child))
pytree_utils.SetNodeAnnotation(tree, pytree_utils.Annotation.CHILD_INDENT,
child.value)
_AnnotateIndents(child)
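
Illustrative sketch (not part of the commit): the block-splitting loop in _CreateCommentsFromPrefix can be tried without lib2to3 by returning plain tuples instead of pytree leaves. The name split_comment_blocks is hypothetical. One deliberate difference, flagged in the code: the skip loop advances past any non-comment line, not only blank ones, so the sketch always terminates.

```python
def split_comment_blocks(comment_prefix, comment_lineno):
    """Group a prefix into comment blocks separated by blank lines.

    Returns a list of (lineno, text) tuples. As in the original arithmetic
    (comment_lineno + index - 1), lineno is the line of the block's last
    comment line, given that comment_lineno is the first line of the prefix.
    """
    blocks = []
    lines = comment_prefix.split('\n')
    index = 0
    while index < len(lines):
        block = []
        # Collect a run of consecutive comment lines.
        while index < len(lines) and lines[index].lstrip().startswith('#'):
            block.append(lines[index])
            index += 1
        if block:
            blocks.append((comment_lineno + index - 1, '\n'.join(block).strip()))
        # Assumption for this sketch: skip every non-comment line, not just
        # blank ones, so that unexpected prefix content cannot stall the loop.
        while index < len(lines) and not lines[index].lstrip().startswith('#'):
            index += 1
    return blocks
```

For example, split_comment_blocks('# a\n# b\n\n# c\n', 10) yields two blocks, one for the first two lines and one for the comment after the blank line.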

@ -0,0 +1,77 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Interface to file resources.
This module provides functions for interfacing with files: opening, writing, and
querying.
"""
import io
import os
import sys
def GetCommandLineFiles(command_line_file_list, recursive):
"""Return the list of files specified on the command line."""
return _FindFiles(command_line_file_list, recursive)
def WriteReformattedCode(filename, reformatted_code, in_place):
"""Emit the reformatted code.
Write the reformatted code into the file, if in_place is True. Otherwise,
write to stdout.
Arguments:
filename: (unicode) The name of the unformatted file.
reformatted_code: (unicode) The reformatted code.
in_place: (bool) If True, then write the reformatted code to the file.
"""
if not reformatted_code.strip():
return
if in_place:
with io.open(filename, mode='w', newline='') as fd:
fd.write(reformatted_code)
else:
# Re-encode the text so that if we pipe the output to a file, it will
# have the proper encoding. Otherwise, we'll get a UnicodeEncodeError
# exception.
reformatted_code = reformatted_code.encode('UTF-8')
sys.stdout.write(reformatted_code)
def _FindFiles(filenames, recursive):
"""Find all Python files."""
python_files = []
for filename in filenames:
if os.path.isdir(filename):
if recursive:
# TODO(morbo): Look into a version of os.walk that can handle recursion.
python_files.extend(os.path.join(dirpath, f)
for dirpath, _, filelist in os.walk(filename)
for f in filelist
if IsPythonFile(os.path.join(dirpath, f)))
else:
python_files.extend(os.path.join(filename, f)
for f in os.listdir(filename)
if IsPythonFile(os.path.join(filename, f)))
elif os.path.isfile(filename) and IsPythonFile(filename):
python_files.append(filename)
return python_files
def IsPythonFile(filename):
"""Return True if filename is a Python file."""
return os.path.splitext(filename)[1] == '.py'
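
Illustrative sketch (not part of the commit): the recursive and non-recursive branches of _FindFiles can be exercised against a throwaway directory tree. The names is_python_file, find_files and demo are hypothetical stand-ins that follow the same logic as the functions above using only the standard library.

```python
import os
import tempfile


def is_python_file(filename):
    """Return True if filename has a '.py' extension."""
    return os.path.splitext(filename)[1] == '.py'


def find_files(filenames, recursive):
    """Collect Python files from a mix of file and directory names."""
    python_files = []
    for filename in filenames:
        if os.path.isdir(filename):
            if recursive:
                # os.walk already descends into every subdirectory.
                for dirpath, _, filelist in os.walk(filename):
                    python_files.extend(os.path.join(dirpath, f)
                                        for f in filelist
                                        if is_python_file(f))
            else:
                # Only the directory's immediate entries are considered.
                python_files.extend(os.path.join(filename, f)
                                    for f in os.listdir(filename)
                                    if is_python_file(f))
        elif os.path.isfile(filename) and is_python_file(filename):
            python_files.append(filename)
    return python_files


def demo():
    """Build a small temporary tree and search it both ways."""
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, 'sub'))
    for relative in ('a.py', 'notes.txt', os.path.join('sub', 'b.py')):
        open(os.path.join(root, relative), 'w').close()
    return root, find_files([root], False), find_files([root], True)
```

In the demo tree, the non-recursive search sees only the top-level Python file, while the recursive search also picks up the one in the subdirectory.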

@ -0,0 +1,388 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Implements a format decision state object that manages whitespace decisions.
Each token is processed one at a time, at which point its whitespace formatting
decisions are made. A graph of potential whitespace formattings is created,
where each node in the graph is a format decision state object. The heuristic
tries formatting the token with and without a newline before it to determine
which one has the least penalty. Therefore, the format decision state object for
each decision needs to be its own unique copy.
Once the heuristic determines the best formatting, it makes a non-dry run pass
through the code to commit the whitespace formatting.
FormatDecisionState: main class exported by this module.
"""
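
Illustrative sketch (not part of the commit): the docstring above says each decision needs its own copy of the state. The toy below shows why, using the same copy pattern as Clone (shallow copy of the object, deep copy of the stack). ToyState, branch and the indentation arithmetic are hypothetical simplifications; the real state tracks far more and computes penalties.

```python
import copy


class ToyState(object):
    """Hypothetical, stripped-down stand-in for a format decision state."""

    def __init__(self, column, indent):
        self.column = column
        self.stack = [indent]  # One indent entry per open bracket level.

    def clone(self):
        # Shallow-copy the scalar fields, deep-copy the mutable stack.
        new = copy.copy(self)
        new.stack = copy.deepcopy(self.stack)
        return new


def branch(state, token_length, continuation_indent=4):
    """Return (same_line, new_line) successors without mutating 'state'."""
    same_line = state.clone()
    same_line.column += 1 + token_length  # One space, then the token.

    new_line = state.clone()
    new_line.stack[-1] += continuation_indent
    new_line.column = new_line.stack[-1] + token_length
    return same_line, new_line
```

Because both successors are clones, the search can explore the "newline" and "no newline" edges from the same node and compare them afterwards; sharing one stack between them would let one branch corrupt the other.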
import copy
from yapf.yapflib import format_token
from yapf.yapflib import split_penalty
from yapf.yapflib import style
class FormatDecisionState(object):
"""The current state when indenting an unwrapped line.
The FormatDecisionState object is meant to be copied instead of referenced.
Attributes:
first_indent: The indent of the first token.
column: The number of used columns in the current line.
next_token: The next token to be formatted.
paren_level: The level of nesting inside (), [], and {}.
start_of_line_level: The paren_level at the start of this line.
lowest_level_on_line: The lowest paren_level on the current line.
newline: Indicates if a newline is added along the edge to this format
decision state node.
previous: The previous format decision state in the decision tree.
stack: A stack (of _ParenState) keeping track of properties applying to
parenthesis levels.
ignore_stack_for_comparison: Ignore the stack of _ParenState for state
comparison.
"""
def __init__(self, line, first_indent):
"""Initializer.
Initializes to the state after placing the first token from 'line' at
'first_indent'.
Arguments:
line: (UnwrappedLine) The unwrapped line we're currently processing.
first_indent: (int) The indent of the first token.
"""
self.next_token = line.first
self.column = first_indent
self.paren_level = 0
self.start_of_line_level = 0
self.lowest_level_on_line = 0
self.ignore_stack_for_comparison = False
self.stack = [_ParenState(first_indent, first_indent)]
self.first_indent = first_indent
self.newline = False
self.previous = None
self._MoveStateToNextToken()
def Clone(self):
new = copy.copy(self)
new.stack = copy.deepcopy(self.stack)
return new
def __eq__(self, other):
# Note: 'first_indent' is implicit in the stack. Also, we ignore 'previous',
# because it shouldn't have a bearing on this comparison. (I.e., it will
# report equal if 'next_token' does.)
return (self.next_token == other.next_token and
self.column == other.column and
self.paren_level == other.paren_level and
self.start_of_line_level == other.start_of_line_level and
self.lowest_level_on_line == other.lowest_level_on_line and
(self.ignore_stack_for_comparison or
other.ignore_stack_for_comparison or self.stack == other.stack))
def __ne__(self, other):
return not self == other
def __hash__(self):
return hash((self.next_token, self.column, self.paren_level,
self.start_of_line_level, self.lowest_level_on_line))
def __repr__(self):
return ('column::%d, next_token::%s, paren_level::%d, stack::[\n\t%s' %
(self.column, repr(self.next_token), self.paren_level,
'\n\t'.join(repr(s) for s in self.stack) + ']'))
def CanSplit(self):
"""Returns True if the line can be split before the next token."""
current = self.next_token
if not current.can_break_before:
return False
return True
def MustSplit(self):
"""Returns True if the line must split before the next token."""
current = self.next_token
previous_token = current.previous_token
next_token = current.next_token
if current.must_break_before:
return True
if (self.stack[-1].split_before_closing_bracket and
# FIXME(morbo): Use the 'matching_bracket' instead of this.
# FIXME(morbo): Don't forget about tuples!
current.value in ']}'):
# Split if we need to split before the closing bracket and the next
# token is a closing bracket.
return True
if previous_token:
length = _GetLengthToMatchingParen(previous_token)
if (previous_token.value == '{' and # TODO(morbo): List initializers?
length + self.column > style.COLUMN_LIMIT):
return True
# TODO(morbo): This should be controlled with a knob.
if (current.subtype == format_token.Subtype.DICTIONARY_KEY and
not current.is_comment):
# Place each dictionary entry on its own line.
return True
# TODO(morbo): This should be controlled with a knob.
if current.subtype == format_token.Subtype.DICT_SET_GENERATOR:
return True
if (next_token and previous_token.value != '(' and
next_token.subtype ==
format_token.Subtype.DEFAULT_OR_NAMED_ASSIGN and
next_token.node_split_penalty < split_penalty.UNBREAKABLE):
return style.SPLIT_BEFORE_NAMED_ASSIGNS
return False
def AddTokenToState(self, newline, dry_run, must_split=False):
"""Add a token to the format decision state.
Allow the heuristic to try out adding the token with and without a newline.
Later on, the algorithm will determine which one has the lowest penalty.
Arguments:
newline: (bool) Add the token on a new line if True.
dry_run: (bool) Don't commit whitespace changes to the FormatToken if
True.
must_split: (bool) A newline was required before this token.
Returns:
The penalty of splitting after the current token.
"""
if not self.stack:
self.column = (self.next_token.spaces_required_before +
len(self.next_token.value))
self.next_token = self.next_token.next_token
return 0
penalty = 0
if newline:
penalty = self._AddTokenOnNewline(dry_run, must_split)
else:
self._AddTokenOnCurrentLine(dry_run)
return self._MoveStateToNextToken() + penalty
def _AddTokenOnCurrentLine(self, dry_run):
"""Puts the token on the current line.
Appends the next token to the state and updates information necessary for
indentation.
Arguments:
dry_run: (bool) Don't commit whitespace changes to the FormatToken if True.
"""
current = self.next_token
previous = current.previous_token
spaces = current.spaces_required_before
if not dry_run:
current.AddWhitespacePrefix(newlines_before=0, spaces=spaces)
if previous.OpensScope():
if not current.is_comment:
# Align closing scopes that are on a newline with the opening scope:
#
# foo = [a,
# b,
# ]
self.stack[-1].closing_scope_indent = previous.column
self.stack[-1].indent = self.column + spaces
else:
self.stack[-1].closing_scope_indent = (
self.stack[-1].indent - style.CONTINUATION_INDENT_WIDTH)
self.column += spaces
def _AddTokenOnNewline(self, dry_run, must_split):
"""Adds a line break and necessary indentation.
Appends the next token to the state and updates information necessary for
indentation.
Arguments:
dry_run: (bool) Don't commit whitespace changes to the FormatToken if
True.
must_split: (bool) A newline was required before this token.