Python Style Guide
Author: Guido van Rossum
This style guide has been converted to several PEPs (Python Enhancement
Proposals): PEP 8 for the main text, PEP 257 for docstring conventions. See the
PEP index.
A Foolish Consistency is the Hobgoblin of Little Minds
A style guide is
about consistency. Consistency with this style guide is important. Consistency
within a project is more important. Consistency within one module or function is
most important.
But most importantly: know when to be inconsistent -- sometimes the
style guide just doesn't apply. When in doubt, use your best judgement. Look at
other examples and decide what looks best. And don't hesitate to ask!
Table of Contents
- Lay-out -- how to use tabs, spaces, and newlines.
- Comments -- on proper use of comments (and documentation strings).
- Names -- various naming conventions.
Lay-out
Indentation
Use the default of Emacs Python-mode: 4 spaces for one
indentation level. For really old code that you don't want to mess up, you can
continue to use 8-space tabs. Emacs Python-mode auto-detects the prevailing
indentation level used in a file and sets its indentation parameters
accordingly.
Tabs or Spaces?
Never mix tabs and spaces. The most popular way of
indenting Python is with spaces only. The second-most popular way is with tabs
only. Code indented with a mixture of tabs and spaces should be converted to
using spaces exclusively. (In Emacs, select the whole buffer and hit ESC-x
untabify.) When invoking the python command line interpreter with the -t option,
it issues warnings about code that illegally mixes tabs and spaces. When using
-tt these warnings become errors. These options are highly recommended!
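A check in the same spirit can be sketched in a few lines of Python. The function name find_mixed_indentation is made up for this illustration, and the check is deliberately crude: it flags any line whose leading whitespace contains both characters, which is not necessarily the exact rule the interpreter applies.

```python
def find_mixed_indentation(source):
    """Return the 1-based numbers of lines indented with both tabs and spaces."""
    suspicious = []
    for number, line in enumerate(source.splitlines(), start=1):
        # The indentation is whatever leading run of spaces and tabs exists.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            suspicious.append(number)
    return suspicious


# Hypothetical sample: only the third line mixes a tab with spaces.
sample = "if x:\n    y = 1\n\t    z = 2\n\tw = 3\n"
print(find_mixed_indentation(sample))
```

A file for which this returns an empty list may still use tabs in some places and spaces in others; it only rules out mixing within a single line.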
Maximum Line Length
There are still many devices around that are limited
to 80 character lines. The default wrapping on such devices looks ugly.
Therefore, please limit all lines to a maximum of 79 characters (Emacs wraps
lines that are exactly 80 characters long).
The preferred way of wrapping long lines is by using Python's implied line
continuation inside parentheses, brackets and braces. If necessary, you can add
an extra pair of parentheses around an expression, but sometimes using a
backslash looks better. Make sure to indent the continued line appropriately.
Emacs Python-mode does this right. Some examples:
    class Rectangle(Blob):

        def __init__(self, width, height,
                     color='black', emphasis=None, highlight=0):
            if width == 0 and height == 0 and \
               color == 'red' and emphasis == 'strong' or \
               highlight > 100:
                raise ValueError("sorry, you lose")
            if width == 0 and height == 0 and (color == 'red' or
                                               emphasis is None):
                raise ValueError("I don't think so")
            Blob.__init__(self, width, height,
                          color, emphasis, highlight)
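As a smaller, self-contained illustration of the two wrapping styles, the sketch below writes the same condition once with backslashes and once inside an extra pair of parentheses. The function names and the test values are invented for the example.

```python
def is_bad_backslash(width, height, color, highlight):
    # Backslash continuation: continued lines line up under the condition.
    if width == 0 and height == 0 and \
       color == 'red' or \
       highlight > 100:
        return True
    return False


def is_bad_parens(width, height, color, highlight):
    # Implied continuation: the extra parentheses let the expression wrap.
    if (width == 0 and height == 0 and
            color == 'red' or
            highlight > 100):
        return True
    return False
```

Both functions evaluate exactly the same expression; which one looks better is the judgement call described above.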
Blank Lines
Separate top-level function and class definitions with two
blank lines. Method definitions inside a class are separated by a single blank
line. Extra blank lines may be used (sparingly) to separate groups of related
functions. Blank lines may be omitted between a bunch of related one-liners
(e.g. a set of dummy implementations).
When blank lines are used to separate method definitions, there is also a
blank line between the `class' line and the first method definition.
Use blank lines in functions, sparingly, to indicate logical sections.
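Put together, these rules give a module roughly the following shape. All names here are placeholders chosen for the illustration.

```python
def load(text):
    return text.strip()


class Greeter:

    def __init__(self, name):
        self.name = name

    def greet(self):
        text = 'Hello, ' + self.name

        # A blank line inside the method marks a new logical section.
        return text + '!'

    # Related one-line dummy implementations need no blank lines between them.
    def start(self): pass
    def stop(self): pass


def main():
    return Greeter(load(' world ')).greet()
```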
Whitespace in Expressions and Statements
Pet Peeves
I hate whitespace in the following places:
(Don't bother to argue with me on any of the above -- I've grown
accustomed to this style over 15 years.)
Other Recommendations
- Always surround these binary operators with a single space on either side:
assignment (=), comparisons (==, <, >, !=, <>, <=, >=, in,
not in, is, is not), Booleans (and, or, not).
- Use your better judgement for the insertion of spaces around arithmetic
operators. Always be consistent about whitespace on either side of a binary
operator. Some examples:
    i = i+1
    submitted = submitted + 1
    x = x*2 - 1
    hypot2 = x*x + y*y
    c = (a+b) * (a-b)
    c = (a + b) * (a - b)
- Don't use spaces around the '=' sign when used to indicate a keyword
argument or a default parameter value. For instance:
    def complex(real, imag=0.0):
        return magic(r=real, i=imag)
Comments
Comments that contradict the code are
worse than no comments. Always make a priority of keeping the comments
up-to-date when the code changes!
If a comment is a phrase or sentence, its first word should be capitalized,
unless it is an identifier that begins with a lower case letter (never alter the
case of identifiers!).
If a comment is short, the period at the end is best omitted. Block comments
generally consist of one or more paragraphs built out of complete sentences, and
each sentence should end in a period.
You can use two spaces after a sentence-ending period.
As always when writing English, Strunk and White apply.
Python coders from non-English speaking countries: please write your comments
in English, unless you are 120% sure that the code will never be read by people
who don't speak your language.
Block Comments
Block comments generally apply to some (or all) code that
follows them, and are indented to the same level as that code. Each line of a
block comment starts with a # and a single space (unless it is indented text
inside the comment). Paragraphs inside a block comment are separated by a line
containing a single #. Block comments are best surrounded by a blank line above
and below them (or two lines above and a single line below for a block comment
at the start of a new section of function definitions).
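A block comment that follows these rules might look like the sketch below. The function scale and its contents are invented purely to give the comment something to describe.

```python
def scale(values, factor):

    # Multiply every value by the given factor.
    #
    # The second paragraph of a block comment is separated from the
    # first by a line containing a single #.  Indented text inside
    # the comment, such as
    #
    #     scale([1, 2], 3) gives [3, 6]
    #
    # keeps its extra indentation.

    result = []
    for value in values:
        result.append(value * factor)
    return result
```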
Inline Comments
An inline comment is a comment on the same line as a
statement. Inline comments should be used sparingly. Inline comments should be
separated by at least two spaces from the statement. They should start with a #
and a single space.
Inline comments are unnecessary and in fact distracting if they state the
obvious. Don't do this:
    x = x+1  # Increment x
But sometimes, this is useful:
    x = x+1  # Compensate for border
Documentation Strings
All modules should normally have doc strings, and
all functions and classes exported by a module should also have doc strings.
Public methods (including the __init__ constructor) should also have doc
strings.
The doc string of a script (a stand-alone program) should be usable as its
"usage" message, printed when the script is invoked with incorrect or missing
arguments (or perhaps with a "-h" option, for "help"). Such a doc string should
document the script's function and command line syntax, environment variables,
and files. Usage messages can be fairly elaborate (several screenfuls) and
should be sufficient for a new user to use the command properly, as well as a
complete quick reference to all options and arguments for the sophisticated
user.
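One way a script can reuse its doc string as the usage message is sketched below. The script name addall.py, its single -h option, and the exit status values are assumptions made up for this example.

```python
"""Print the sum of the integer arguments.

usage: addall.py [-h] number ...

-h -- print this message and exit

"""

import sys


def main(argv):
    """Return an exit status; print the usage message on bad input."""
    args = argv[1:]
    if not args or '-h' in args:
        print(__doc__)
        # Asking for help is a success; missing arguments are an error.
        return 0 if args else 2
    try:
        numbers = [int(arg) for arg in args]
    except ValueError:
        print(__doc__)
        return 2
    print(sum(numbers))
    return 0
```

A real script would typically end with a call such as sys.exit(main(sys.argv)); it is left out here so the sketch can be imported without side effects.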
For consistency, always use """triple double quotes""" around doc strings.
There are two forms of doc strings: one-liners and multi-line doc strings.
One-line Doc Strings
One-liners are for really obvious cases. They
should really fit on one line. For example:
    def kos_root():
        """Return the pathname of the KOS root directory."""
        global _kos_root
        if _kos_root: return _kos_root
        ...
Notes:
- Triple quotes are used even though the string fits on one line. This makes
it easy to later expand it.
- The closing quotes are on the same line as the opening quotes. This looks
better for one-liners.
- There's no blank line either before or after the doc string.
- The doc string is a phrase ending in a period. It prescribes the
function's effect as a command ("Do this", "Return that"), not
as a description: e.g. don't write "Returns the pathname ..."
Multi-line Doc Strings
Multi-line doc strings consist of a summary line
just like a one-line doc string, followed by a blank line, followed by a
more elaborate description. The summary line may be used by automatic indexing
tools; it is important that it fits on one line and is separated from the rest
of the doc string by a blank line.
The entire doc string is indented the same as the quotes at its first line
(see example below). Doc string processing tools will strip an amount of
indentation from the second and further lines of the doc string equal to the
indentation of the first non-blank line after the first line of the doc string.
Relative indentation of later lines in the doc string is retained.
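The stripping rule can be sketched as a small function. This is only an illustration of the rule as stated here, not the code of any particular tool; the name strip_doc_indent and the sample text are made up.

```python
def strip_doc_indent(doc):
    """Strip doc string indentation using the rule described above."""
    lines = doc.expandtabs().split('\n')
    # The reference indentation comes from the first non-blank line
    # after the first line of the doc string.
    indent = 0
    for line in lines[1:]:
        if line.strip():
            indent = len(line) - len(line.lstrip())
            break
    result = [lines[0].strip()]
    for line in lines[1:]:
        if line[:indent].strip():
            # Less indented than the reference line: drop what there is.
            result.append(line.strip())
        else:
            # Remove exactly `indent` columns so relative indentation stays.
            result.append(line[indent:].rstrip())
    return '\n'.join(result)


# A raw doc string as it might appear inside an indented function body.
raw = ("Summary line.\n"
       "\n"
       "    First detail line.\n"
       "        A more deeply indented line.\n"
       "\n"
       "    ")
print(strip_doc_indent(raw))
```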
I recommend inserting a blank line between the last paragraph in a multi-line
doc string and its closing quotes, placing the closing quotes on a line by
themselves. This way, Emacs' fill-paragraph command can be used on it.
I also recommend inserting a blank line before and after all doc strings
(one-line or multi-line) that document a class -- generally speaking, the class'
methods are separated from each other by a single blank line, and the doc string
needs to be offset from the first method by a blank line; for symmetry, I prefer
having a blank line between the class header and the doc string. Doc strings
documenting functions generally don't have this requirement, unless the
function's body is written as a number of blank-line separated sections -- in
this case, treat the doc string as another section, and precede it with a blank
line.
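Applied to a small class, the blank-line recommendations look roughly like this. The class Counter and its methods are hypothetical.

```python
class Counter:

    """Count how often something happened.

    Public methods:
    bump -- increase the count by one
    value -- return the current count

    """

    def __init__(self):
        """Create a counter that starts at zero."""
        self.count = 0

    def bump(self):
        """Increase the count by one."""
        self.count = self.count + 1

    def value(self):
        """Return the current count."""
        return self.count
```

The blank line after the class header does not stop the string from being the class's doc string.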
The doc string for a module should generally list the classes, exceptions and
functions (and any other objects) that are exported by the module, with a
one-line summary of each. (These summaries generally give less detail than the
summary line in the object's doc string.)
The doc string for a function or method should summarize its behavior and
document its arguments, return value(s), side effects, exceptions raised, and
restrictions on when it can be called (all if applicable). Optional arguments
should be indicated. It should be documented whether keyword arguments are part
of the interface.
The doc string for a class should summarize its behavior and list the public
methods and instance variables. If the class is intended to be subclassed, and
has an additional interface for subclasses, this interface should be listed
separately (in the doc string). The class constructor should be documented in
the doc string for its __init__ method. Individual methods should be documented
by their own doc string.
If a class subclasses another class and its behavior is mostly inherited from
that class, its doc string should mention this and summarize the differences.
Use the verb "override" to indicate that a subclass method replaces a superclass
method and does not call the superclass method; use the verb "extend" to
indicate that a subclass method calls the superclass method (in addition to its
own behavior).
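The two verbs might be used as in the following sketch; both classes are invented for the illustration.

```python
class Logger:

    """Collect messages in a list."""

    def __init__(self):
        """Create a logger with no messages."""
        self.messages = []

    def log(self, text):
        """Append text to the list of messages."""
        self.messages.append(text)

    def describe(self):
        """Return a short description of this logger."""
        return 'plain logger'


class LoudLogger(Logger):

    """Collect messages like Logger, but in upper case.

    Behavior is inherited from Logger, except:
    log -- extends Logger.log; converts text to upper case first
    describe -- overrides Logger.describe

    """

    def log(self, text):
        """Extend Logger.log to store text in upper case."""
        Logger.log(self, text.upper())

    def describe(self):
        """Override Logger.describe with a different description."""
        return 'loud logger'
```

Here log calls the superclass method and so "extends" it, while describe replaces the superclass method outright and so "overrides" it.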
Do not use the Emacs convention of mentioning the arguments of
functions or methods in upper case in running text. Python is case sensitive and
the argument names can be used for keyword arguments, so the doc string should
document the correct argument names. It is best to list each argument on a
separate line, with two dashes separating the name from the description, like
this:
    def complex(real=0.0, imag=0.0):
        """Form a complex number.

        Keyword arguments:
        real -- the real part (default 0.0)
        imag -- the imaginary part (default 0.0)

        """
        if imag == 0.0 and real == 0.0: return complex_zero
        ...
Version Bookkeeping
If you have to have RCS or CVS crud in your source
file, do it as follows.
    __version__ = "$Revision: 6104 $"
    # $Source$
These lines should be included after the module's doc string, before any
other code, separated by a blank line above and below.
Names
The naming conventions of Python's
library are a bit of a mess, so we'll never get this completely consistent --
nevertheless, here are some guidelines.
Descriptive: Naming Styles
There are a lot of different naming styles.
It helps to be able to recognize what naming style is being used, independently
from what they are used for. The following naming styles are commonly
distinguished:
- x (single lowercase letter)
- X (single uppercase letter)
- lowercase
- lower_case_with_underscores
- UPPERCASE
- UPPER_CASE_WITH_UNDERSCORES
- CapitalizedWords (or CapWords)
- mixedCase (differs from CapitalizedWords by initial lowercase character!)
- Capitalized_Words_With_Underscores (ugly!)
There's also the style
of using a short unique prefix to group related names together. This is not used
much in Python, but I mention it for completeness. For example, the os.stat()
function returns a tuple whose items traditionally have names like st_mode,
st_size, st_mtime and so on. The X11 library uses a leading X for all its public
functions. (In Python, this style is generally deemed unnecessary because
attribute and method names are prefixed with an object, and function names are
prefixed with a module name.)
In addition, the following special forms using leading or trailing
underscores are recognized (these can generally be combined with any case
convention):
- _single_leading_underscore: weak "internal use" indicator (e.g. "from M
import *" does not import objects whose name starts with an underscore).
- single_trailing_underscore_: used by convention to avoid conflicts with
a Python keyword, e.g. Tkinter.Toplevel(master, class_="ClassName").
- __double_leading_underscore: class-private names in Python 1.4.
- __double_leading_and_trailing_underscore__: "magic" objects or attributes
that live in user-controlled namespaces, e.g. __init__, __import__ or
__file__. Sometimes these are defined by the user to trigger certain magic
behavior (e.g. operator overloading); sometimes these are inserted by the
infrastructure for its own use or for debugging purposes. Since the
infrastructure (loosely defined as the Python interpreter and the standard
library) may decide to grow its list of magic attributes in future versions,
user code should generally refrain from using this convention for its own use.
User code that aspires to become part of the infrastructure could combine this
with a short prefix inside the underscores, e.g. __bobo_magic_attr__.
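The first two forms in the list above can be tried out with the sketch below. The module name underscore_demo and the function make_tag are hypothetical; the module is built in memory only so that the example needs no extra files.

```python
import sys
import types

# Build a throwaway module and register it so it can be imported by name.
demo = types.ModuleType('underscore_demo')
demo.visible = 1
demo._internal = 2
sys.modules['underscore_demo'] = demo

# Run "from M import *" in a fresh namespace and see what arrives.
namespace = {}
exec('from underscore_demo import *', namespace)
print('visible' in namespace)      # the public name was imported
print('_internal' in namespace)    # the underscored name was not


def make_tag(name, class_=None):
    # `class` is a keyword, so the argument carries a trailing underscore.
    return (name, class_)
```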
Prescriptive: Naming Conventions
Module Names
Module names can be either MixedCase or lowercase. There is
no unambiguous convention to decide which to use. Modules that export a single
class (or a number of closely related classes, plus some additional support) are
often named in MixedCase, with the module name being the same as the class name
(e.g. the standard StringIO module). Modules that export a bunch of functions
are usually named in all lowercase.
Since module names are mapped to file names, and some file systems are case
insensitive and truncate long names, it is important that module names be chosen
to be fairly short and not in conflict with other module names that only differ
in the case -- this won't be a problem on Unix, but it will be when the code is
transported to Mac or Windows.
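A quick way to spot such clashes before they bite is sketched here. The helper find_case_conflicts is made up for this illustration and works on a plain list of module names.

```python
def find_case_conflicts(module_names):
    """Return sorted groups of names that differ only in case."""
    groups = {}
    for name in module_names:
        # Names that fold to the same lowercase key would share a file
        # name on a case-insensitive file system.
        groups.setdefault(name.lower(), []).append(name)
    conflicts = []
    for key in sorted(groups):
        if len(groups[key]) > 1:
            conflicts.append(sorted(groups[key]))
    return conflicts


print(find_case_conflicts(['StringIO', 'stringio', 'os', 'Queue']))
```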
There is an emerging convention that when an extension module written in C or
C++ has an accompanying Python module that provides a higher level (e.g. more
object oriented) interface, the Python module is named in CapWords, while the C/C++
module is named in all lowercase and has a leading underscore (e.g.
Tkinter/_tkinter).
"Packages" (groups of modules, supported by the "ni" module) generally have a
short all lowercase name.
Class Names
Almost without exception, class names use the CapWords
convention. Classes for internal use have a leading underscore in addition.
Exception Names
If a module defines a single exception raised for all
sorts of conditions, it is generally called "error" or "Error". As far as I can
tell, built-in (extension) modules use "error" (e.g. os.error), while Python
modules generally use "Error" (e.g. xdrlib.Error).
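For a module written in Python, the convention might be followed as in this sketch. The function parse_port and its messages are invented; only the exception name "Error" comes from the convention itself.

```python
class Error(Exception):

    """Raised for all failures in this (hypothetical) module."""


def parse_port(text):
    """Return text as a port number, or raise Error."""
    try:
        port = int(text)
    except ValueError:
        raise Error('not a number: ' + repr(text))
    if not 0 < port < 65536:
        raise Error('port out of range: ' + repr(port))
    return port
```

Callers then need to catch only one exception, whatever went wrong inside the module.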
Function Names
Plain functions exported by a module can either use the
CapWords style or lowercase (or lower_case_with_underscores). I have no strong
preference, but believe that the CapWords style is used for functions that
provide major functionality (e.g. nstools.WorldOpen()), while lowercase is used
more for "utility" functions (e.g. pathhack.kos_root()).
Global Variable Names
(Let's hope that these variables are meant for use
inside one module only.) The conventions are about the same as those for
exported functions. Modules that are designed for use via "from M import *"
should prefix their globals (and internal functions and classes) with an
underscore to prevent exporting them.
Method Names
Hmm, the story is largely the same as for functions. When
using ILU, here's a good convention: use CapWords for methods published via an
ILU interface. Use lowercase for methods accessed by other classes or functions
that are part of the implementation of an object type. Use one leading
underscore for "internal" methods and instance variables when there is no chance
of a conflict with subclass or superclass attributes or when a subclass might
actually need access to them. Use two leading underscores (class-private names,
enforced by Python 1.4) in those cases where it is important that only
the current class accesses an attribute. (But realize that Python contains
enough loopholes so that an insistent user could gain access nevertheless, e.g.
via the __dict__ attribute.)
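The difference between one and two leading underscores, and the loophole just mentioned, can be seen in a short sketch. The class Account is hypothetical.

```python
class Account:

    def __init__(self, balance):
        # One underscore: internal, but subclasses may still use it.
        self._note = 'opened'
        # Two underscores: class-private, stored as _Account__balance.
        self.__balance = balance

    def balance(self):
        return self.__balance


account = Account(100)
print(account.balance())
# An insistent user can still reach the class-private name.
print(account.__dict__['_Account__balance'])
```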