We’re going to talk about underscores, dunders, encapsulation and magic methods in Python
Python is an easy-to-learn language that provides a stepping-stone into the world of programming, but some of its features confuse beginners and advanced developers alike. By the end of this article, you’ll know when and how to use underscores, dunders, magic methods and encapsulation in Python.
Single (_) and double (__) leading or trailing underscores have different meanings in Python. Most of the time they are just a convention (a hint to the programmer), but there are cases where they are enforced by the Python interpreter. We’re going to talk about:
- Single trailing underscore: foo_
- Single leading underscore: _foo
- Double leading underscores: __foo
- Double leading and trailing underscores: __foo__
Double underscores are often referred to as dunders: they appear so frequently in Python code that it’s easier to say the shortened “dunder” than “double underscore”.
A single stand-alone underscore is used to indicate that a variable is temporary or insignificant. This meaning is per convention only and doesn’t trigger any special behavior in the Python parser. A single underscore is just a valid variable name that’s used for this purpose. Let’s see a couple of examples:
If you’re iterating, and you are not using the yielded value from the iterator, you can use a single underscore to indicate that it’s just a temporary value:
>>> for _ in range(3):
...     print('Zen of Python')
...
Zen of Python
Zen of Python
Zen of Python
If you’re unpacking values from a tuple, but you don’t need some of its values, you can use a single underscore to mark them as insignificant:
>>> foo, _ = ('bar', 42)
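Because the underscore is only a convention, it behaves like any other name. The short sketch below (the variable names are illustrative and not part of the original examples) shows that the underscore keeps the last value assigned to it, and that it can be combined with starred unpacking to discard several values at once:

```python
# "_" is an ordinary variable: after the loop it still holds the last value.
for _ in range(3):
    pass
last_value = _  # 2

# Ignore one field of a tuple.
name, _ = ('bar', 42)

# Ignore several fields at once with a starred underscore.
first, *_, last = (1, 2, 3, 4, 5)
ignored = _  # [2, 3, 4]

print(last_value, name, first, last, ignored)
```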
In a Python REPL the single underscore is a special variable that represents the result of the last evaluated expression:
>>> 5 + 5
10
>>> _
10
Bonus feature: when doing internationalization in Python code with Django, it’s a convention to import the gettext function as _ to save typing:
from django.utils.translation import gettext as _

def django_view(request):
    translated_text = _("The zen of Python")
    ...
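The same aliasing convention works outside Django with the standard library’s gettext module. The following is a minimal sketch, assuming no translation catalog is installed, in which case gettext returns the message unchanged; the greeting function is a made-up name for illustration:

```python
# Alias the standard library's gettext function as "_".
# Note: this shadows the throwaway-variable use of "_" in the same module.
from gettext import gettext as _


def greeting():
    # With no translation catalog installed, gettext returns the
    # original message unchanged.
    return _("The zen of Python")


print(greeting())
```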
Sooner or later you end up wanting to use a Python keyword such as class, or a built-in name such as list, as a variable name because it fits the context well. This is bad practice: reusing a built-in name silently hides the built-in, and using a keyword results in a SyntaxError:
>>> def foo(class):
  File "<stdin>", line 1
    def foo(class):
            ^
SyntaxError: invalid syntax
To avoid naming conflicts append a single underscore to the variable name:
>>> def foo(class_):
...     return 42
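If you are unsure whether a name collides with the language, the standard library can tell you. The sketch below uses keyword.iskeyword and the builtins module; the safe_name helper is hypothetical and only illustrates the trailing-underscore convention:

```python
import builtins
import keyword


def safe_name(name):
    """Append an underscore if the name is a keyword or shadows a built-in."""
    if keyword.iskeyword(name) or hasattr(builtins, name):
        return name + '_'
    return name


print(safe_name('class'))  # class_ (keyword)
print(safe_name('list'))   # list_  (built-in, not a keyword)
print(safe_name('spam'))   # spam   (no conflict)
```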
The leading underscore prefix is used as a hint to the programmer that a variable or method is intended for internal use. However, this convention isn’t enforced by the Python interpreter and it doesn’t affect the behavior of your programs, because Python doesn’t have a strong distinction between private and public variables like Java or C++:
>>> class Foo:
...     def __init__(self):
...         self.spam = 'spam'
...         self._ham = '_ham'
...
>>> foo = Foo()
>>> foo.spam
'spam'
>>> foo._ham
'_ham'
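One common way to pair the leading-underscore hint with a public interface is a read-only property. This is only a sketch of that pattern, not something the examples above require; the Account class and its methods are invented for illustration:

```python
class Account:
    def __init__(self, balance):
        self._balance = balance  # internal by convention only

    @property
    def balance(self):
        # Public, read-only view of the internal attribute.
        return self._balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError('amount must be positive')
        self._balance += amount


account = Account(100)
account.deposit(50)
print(account.balance)   # 150
print(account._balance)  # 150, still reachable: nothing is enforced
```

Assigning to account.balance raises an AttributeError because the property has no setter, while account._balance stays writable for anyone who ignores the hint.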
The leading underscore also has an impact on how names are imported from modules. Consider the following module (a module is just a file that contains Python definitions and statements):
# example.py
def foo():
    return 42

def _bar():
    return 42
If you use a wildcard import (from example import *) to import all names from the module, Python imports every name except those beginning with an underscore:
>>> from example import *
>>> foo()
42
>>> _bar()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name '_bar' is not defined
Note: Avoid wildcard imports. They make it unclear which names are present in the namespace and make the code less readable.
You can override this behavior by explicitly defining the value of __all__ in the module:
# example.py
__all__ = ['foo', '_bar']

def foo():
    return 42

def _bar():
    return 42
>>> from example import *
>>> foo()
42
>>> _bar()
42
Another way to import a name with a leading underscore is by not using the import * syntax, but a regular import instead:
>>> from example import foo, _bar
>>> foo()
42
>>> _bar()
42
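The wildcard-import rules described above can be checked in a single script without creating files. The sketch below builds two in-memory modules and registers them in sys.modules; the helper functions and the module names example_plain and example_all are assumptions made purely for this demonstration:

```python
import sys
import types


def register_module(name, source):
    """Create a module from source text and register it for importing."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module


def star_import(name):
    """Return the sorted names that 'from <name> import *' brings in."""
    namespace = {}
    exec(f'from {name} import *', namespace)
    namespace.pop('__builtins__', None)  # added by exec, not by the import
    return sorted(namespace)


SOURCE = 'def foo():\n    return 42\n\ndef _bar():\n    return 42\n'

# Same functions, once without and once with an explicit __all__.
register_module('example_plain', SOURCE)
register_module('example_all', "__all__ = ['foo', '_bar']\n" + SOURCE)

print(star_import('example_plain'))  # ['foo']
print(star_import('example_all'))    # ['_bar', 'foo']
```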
All of the naming patterns so far have been conventions agreed upon by the Python community. However, attributes of a class whose names start with double underscores are rewritten by the Python interpreter. This helps to avoid naming collisions in subclasses. Let’s see how name mangling works:
>>> class Foo:
...     def __init__(self):
...         self.__spam = 'spam'
...
>>> foo = Foo()
>>> foo.__spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute '__spam'
The __spam attribute is not accessible on the Foo instance under that name, because it has been renamed to _Foo__spam. This is the so-called name mangling:
>>> foo._Foo__spam 'spam'
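A quick way to see what the interpreter actually stored is to inspect the instance dictionary with vars. In this sketch the attribute names ham and eggs are illustrative additions; only the double-underscore attribute is renamed:

```python
class Foo:
    def __init__(self):
        self.spam = 'spam'    # public
        self._ham = 'ham'     # internal by convention, stored as-is
        self.__eggs = 'eggs'  # mangled to _Foo__eggs


foo = Foo()
print(vars(foo))
# {'spam': 'spam', '_ham': 'ham', '_Foo__eggs': 'eggs'}
print(hasattr(foo, '__eggs'))      # False
print(hasattr(foo, '_Foo__eggs'))  # True
```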
Name mangling is done under the hood, so if you access the attribute from inside the class, for example through a getter method, you won’t notice it:
>>> class Foo:
...     def __init__(self):
...         self.__spam = 'spam'
...
...     def get_spam(self):
...         return self.__spam
...
>>> foo = Foo()
>>> foo.get_spam()
'spam'
If you extend Foo and override the __spam attribute by assigning a different value in the subclass, the new attribute is also rewritten by the interpreter, because name mangling is applied in both classes. Unless you override the get_spam method, calling it returns Foo’s original attribute value. To get the new attribute’s value, create a new method. All of this is possible because both mangled attributes exist on the instance of the extended class:
>>> class ExtendsFoo(Foo):
...     def __init__(self):
...         super().__init__()
...         self.__spam = 'extended spam'
...
...     def get_extended_spam(self):
...         return self.__spam
...
>>> extended_foo = ExtendsFoo()
>>> extended_foo.get_spam()
'spam'
>>> extended_foo.get_extended_spam()
'extended spam'
>>> extended_foo._Foo__spam
'spam'
>>> extended_foo._ExtendsFoo__spam
'extended spam'
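To see why this matters, it helps to compare a double-underscore attribute with a single-underscore one in the same hierarchy. In the sketch below (the Base and Child classes and their attribute names are hypothetical), the subclass accidentally overwrites the single-underscore attribute, while the mangled one survives:

```python
class Base:
    def __init__(self):
        self._token = 'base'    # single underscore: one shared name
        self.__secret = 'base'  # stored as _Base__secret

    def describe(self):
        return self._token, self.__secret


class Child(Base):
    def __init__(self):
        super().__init__()
        self._token = 'child'    # overwrites Base's attribute
        self.__secret = 'child'  # stored as _Child__secret, no clash


child = Child()
print(child.describe())  # ('child', 'base')
```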
Python’s encapsulation lacks strict access control such as private and protected attributes. Name mangling stops you from accessing an attribute accidentally, but you can still reach pretty much everything on purpose as long as you know how the language works.
In the examples above we’ve used instance attributes, but the same rules apply to method names too. In short, name mangling affects every name that starts with two underscores (and doesn’t end with two) anywhere inside a class body. With this in mind, let’s take a look at the following example:
>>> _Foo__mangled = 42
>>> class Foo:
...     def bar(self):
...         return __mangled
...
>>> foo = Foo()
>>> foo.bar()
42
Cool, right? But please don’t do this. No one deserves to be abused.
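A more everyday case of the same rule is a method whose name starts with two underscores. The following sketch, with invented method names, shows that the method is stored under its mangled name and that calls from inside the class are rewritten in the same way:

```python
class Foo:
    def __helper(self):         # stored as _Foo__helper
        return 'helped'

    def run(self):
        return self.__helper()  # rewritten to self._Foo__helper()


foo = Foo()
print(foo.run())                    # helped
print(hasattr(foo, '__helper'))     # False
print(foo._Foo__helper())           # helped
print('_Foo__helper' in vars(Foo))  # True
```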
One very important fact about the name mangling is that it isn’t applied if a name starts and ends with double underscores:
>>> class Foo:
...     def __init__(self):
...         self.__spam__ = 'spam'
...
>>> foo = Foo()
>>> foo.__spam__
'spam'
Names that have both leading and trailing double underscores are reserved for special use in the language. Methods with such names are often referred to as magic methods, even though they have nothing to do with wizardry. Magic methods are called behind the scenes under certain circumstances. For example, when you create an instance of a class, the necessary call to __init__ is made.
However, as far as naming conventions go, it’s best to stay away from using names that start and end with double underscores to avoid collision with future methods in the Python language.
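To illustrate how existing magic methods are called behind the scenes, here is a small sketch. The Playlist class is a made-up example; the special method names themselves (__len__, __getitem__, __add__ and __repr__) are part of the Python data model:

```python
class Playlist:
    def __init__(self, *songs):
        self.songs = list(songs)

    def __len__(self):              # called by len(playlist)
        return len(self.songs)

    def __getitem__(self, index):   # called by playlist[index]
        return self.songs[index]

    def __add__(self, other):       # called by playlist + other
        return Playlist(*self.songs, *other.songs)

    def __repr__(self):             # called by repr() and print()
        return f'Playlist({", ".join(repr(s) for s in self.songs)})'


mix = Playlist('a', 'b') + Playlist('c')
print(len(mix))  # 3
print(mix[0])    # a
print(mix)       # Playlist('a', 'b', 'c')
```

None of these methods is called by name in the usage lines; the interpreter dispatches to them when it evaluates len, indexing, the + operator and print.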
* Single underscore _ for temporary or insignificant variables
* Single trailing underscore foo_ to avoid naming conflicts with Python keywords
* Single leading underscore _foo to indicate a name is meant for internal use
* Double leading underscore __foo to avoid naming conflicts and overriding in subclasses
* Double leading and trailing underscores __foo__ as they are used to indicate Python special methods
This is the first article in a series of posts about Python. I’m going to blog about more cool Python features & gotchas, built-in data structures, generators, coroutines, await & more. If you liked what you read, subscribe to our newsletter, share, tweet and check out the rest of the articles on our blog.