De-Mystifying Python Descriptors + Django
Sep 16, 2018
At my new job, I've been working a lot with Django. I was wary at first, but after about half a year with it, I can honestly say that it's been a dream. It's a significantly different experience from my last job, where we used Flask and SQLite. For me, one of the best reasons to use Django is its ORM, which makes it easy to build complex queries and filters with its extensible, easy-to-understand QuerySet API. Maybe I'll write about my experience with that switch in a future blog post.
In any case, after what was probably my first thousand hours working with Django, I finally took the time to learn about a Python feature that's at the heart of Django's Model/Field API. It is so well hidden and abstracted away that many Django developers never have to concern themselves with its nuances until they start doing metaprogramming that touches this area. This feature, my dear friends, as I'm sure you've figured out from the title of this blog post, is descriptors.
The name "descriptor" does not refer to a new kind of Python type or keyword; it's better described as a coding pattern or protocol that a regular Python class can follow. Descriptors are Python objects that define at least one of the following magic methods: __get__, __set__, and __delete__.
For example, check out this simple descriptor:
    # defining the descriptor
    class MyDescriptor:
        def __get__(self, instance, owner):
            # details omitted
            ...

        def __set__(self, instance, value):
            # details omitted
            ...

    # defining a class that uses the descriptor
    class Foo:
        some_attribute = MyDescriptor()

    # invoking the descriptor methods
    foo = Foo()
    foo.some_attribute
Simply accessing foo.some_attribute, as in the example above, is enough to make Python call the descriptor's __get__ method.
Theoretically, you could use the descriptor methods by calling them directly, i.e., Foo.__dict__['some_attribute'].__get__(foo, Foo), but you typically don't want to do that because you lose the nice syntactic shorthand that the descriptor gives you. If you were going to call the method directly, you would probably prefer to create a class or instance method on Foo, and call it that way.
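To make the shorthand concrete, here is a minimal sketch that fills in the omitted details with a toy implementation. The call log and the returned string are my own illustrative choices, not part of any real API. It compares plain attribute access with fetching the descriptor from the class's dict and calling __get__ by hand:

```python
class MyDescriptor:
    """Toy descriptor that records every call made to it."""

    def __init__(self):
        self.calls = []  # log of (method name, instance, extra argument)

    def __get__(self, instance, owner=None):
        self.calls.append(("__get__", instance, owner))
        return "value from __get__"

    def __set__(self, instance, value):
        self.calls.append(("__set__", instance, value))


class Foo:
    some_attribute = MyDescriptor()


foo = Foo()

# Plain attribute access: Python calls MyDescriptor.__get__ for us.
via_attribute = foo.some_attribute

# The explicit equivalent: pull the descriptor object out of the class dict
# (indexing the dict bypasses the descriptor protocol) and call __get__ by hand.
descriptor = Foo.__dict__["some_attribute"]
via_direct_call = descriptor.__get__(foo, Foo)

# Assignment is routed to __set__ in the same way.
foo.some_attribute = 42
```

Both reads return the same value, and the assignment never touches foo.__dict__ because this toy __set__ only logs the call.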
Surprisingly, there is a lot of magic happening when you invoke foo.some_attribute. Python objects have a magic method called __getattribute__, which is called any time you attempt to look up an attribute on a given object. This __getattribute__ method is not special to descriptors: it's quite literally how Python finds attributes on any object. Roughly speaking, it uses a lookup chain that starts by checking for some_attribute in the instance's dict (i.e., foo.__dict__['some_attribute']), then checks the dict of the instance's class (i.e., Foo.__dict__['some_attribute']), and finally looks through the dicts of each of the base classes of Foo. (One wrinkle: a descriptor found on the class that defines __set__ or __delete__ takes priority over the instance's dict.)
Once Python has found the some_attribute object on the class or one of its base classes, it checks whether that object implements a __get__ method. If it doesn't, Python just returns the some_attribute object as-is, whatever it may be. If it does, Python calls __get__, passing in the instance and its class as arguments. (Values found in the instance's dict are always returned as-is.)
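The lookup chain can be sketched in plain Python. This is a deliberately simplified approximation, not CPython's actual algorithm: simple_getattr is a hypothetical helper of mine, it assumes the object has a __dict__, and it ignores the rule that descriptors defining __set__ or __delete__ take priority over the instance's dict.

```python
def simple_getattr(obj, name):
    """Rough approximation of how an attribute lookup proceeds."""
    # 1. Instance dict: values found here are returned as-is.
    if name in vars(obj):
        return vars(obj)[name]

    # 2. The class, then each of its base classes, in method resolution order.
    for klass in type(obj).__mro__:
        if name in vars(klass):
            attr = vars(klass)[name]
            # 3. If the object we found implements __get__, call it with
            #    the instance and its class; otherwise return it unchanged.
            getter = getattr(type(attr), "__get__", None)
            if getter is not None:
                return getter(attr, obj, type(obj))
            return attr

    raise AttributeError(name)


class MyDescriptor:
    def __get__(self, instance, owner=None):
        return f"computed for {owner.__name__}"


class Base:
    inherited = "from Base"


class Foo(Base):
    some_attribute = MyDescriptor()


foo = Foo()
foo.on_instance = "from the instance dict"

from_descriptor = simple_getattr(foo, "some_attribute")
from_base = simple_getattr(foo, "inherited")
from_instance = simple_getattr(foo, "on_instance")
```

For these three attributes the helper agrees with the real foo.some_attribute, foo.inherited, and foo.on_instance.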
In other words, for our example above:
    foo.some_attribute
is the same as:
    Foo.__dict__['some_attribute'].__get__(foo, Foo)
What the __get__ method does or returns is up to you. A similar process happens when you assign a value to some_attribute, except that Python checks for and then calls the __set__ method instead. The same goes for __delete__, which gets called when you attempt to del the attribute. There is space here to perform a whole host of shenanigans inside of these methods. You could make a counter that increments every time an attribute is accessed. You could raise an exception inside of __set__ to render the attribute read-only. You could return different values from __get__ depending on the state of the instance.
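Here is a small sketch of two of those shenanigans: an access counter and a read-only attribute. The class names (AccessCounter, ReadOnly, Config) are made up for illustration. Note that the descriptor object lives on the class, so in this simple version the counter is shared by every instance.

```python
class AccessCounter:
    """Returns a stored value and counts how often it is read."""

    def __init__(self, value):
        self.value = value
        self.reads = 0  # shared across all instances of the owning class

    def __get__(self, instance, owner=None):
        self.reads += 1
        return self.value


class ReadOnly:
    """Allows reads but rejects assignment and deletion."""

    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner=None):
        return self.value

    def __set__(self, instance, value):
        raise AttributeError("this attribute is read-only")

    def __delete__(self, instance):
        raise AttributeError("this attribute cannot be deleted")


class Config:
    hits = AccessCounter("hello")
    version = ReadOnly("1.0")


config = Config()
config.hits
config.hits
read_count = Config.__dict__["hits"].reads  # two reads so far

try:
    config.version = "2.0"
except AttributeError:
    write_blocked = True
else:
    write_blocked = False
```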
Django's Use of Descriptors
Descriptors are ubiquitous in a Django app. In your models.py, suppose you have the following model:
    from django.db import models

    class Post(models.Model):
        title = models.CharField(max_length=200, unique=True)
        body = models.TextField()
When Django sets up this model, Django's ModelBase metaclass will iterate over all of the model's fields and call each field's contribute_to_class method. This method registers the field on the model's _meta attribute and then installs a descriptor called DeferredAttribute in its original location.
DeferredAttribute helps us with performance: if a field's value has not been loaded yet (for example, because the field was deferred with defer() or only()), Django will query the database the first time you access it and then cache the result on the instance. Every subsequent access of the attribute avoids reaching into the database, which could end up being costly if you're doing it repeatedly!
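The load-once-then-cache idea can be illustrated without Django at all. The sketch below is not Django's implementation: LazyAttribute, fetch_body, and query_log are hypothetical names, and the "database" is just a function that records each call.

```python
class LazyAttribute:
    """Loads a value on first access, then caches it on the instance."""

    def __init__(self, loader):
        self.loader = loader  # hypothetical stand-in for a database query

    def __set_name__(self, owner, name):
        self.name = name  # the attribute name this descriptor is bound to

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        if self.name not in instance.__dict__:
            instance.__dict__[self.name] = self.loader(instance)
        return instance.__dict__[self.name]


query_log = []  # records every simulated database hit


def fetch_body(post):
    query_log.append(post.pk)
    return f"body of post {post.pk}"


class Post:
    body = LazyAttribute(fetch_body)

    def __init__(self, pk):
        self.pk = pk


post = Post(1)
first = post.body   # triggers the loader
second = post.body  # served from post.__dict__, no second "query"
```

Because this descriptor defines only __get__, the cached value in the instance's dict wins on later lookups, so the second access never even reaches the descriptor.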
One common pattern in many Django applications is the use of the @property decorator, a Python built-in that transforms the function it decorates into a descriptor whose __get__ method calls that function:
    class Post(models.Model):
        title = models.CharField(max_length=200, unique=True)
        body = models.TextField()

        @property
        def excerpt(self):
            return self.body[:100] if len(self.body) > 100 else self.body

    first_post = Post.objects.first()
    first_post.excerpt  # calls the excerpt function and returns its result
    first_post.excerpt = 'blah blah blah'  # AttributeError
Attempting to assign 'blah blah blah' to the excerpt property will fail with an AttributeError because the @property decorator does not define a setter by default: the property's __set__ method raises the error unless you register one with @excerpt.setter. Other similar built-ins are the @classmethod and @staticmethod decorators, which are also descriptors, though they implement only __get__.
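One way to see this for yourself is to pull the decorated objects out of the class's dict and check which descriptor methods their types define. The sketch below uses a plain class rather than a Django model so it runs on its own; descriptor_methods is a hypothetical helper.

```python
class Post:
    def __init__(self, body):
        self.body = body

    @property
    def excerpt(self):
        return self.body[:100]

    @classmethod
    def blank(cls):
        return cls("")

    @staticmethod
    def normalize(text):
        return text.strip()


def descriptor_methods(obj):
    """Names of the descriptor methods defined on obj's type."""
    names = ("__get__", "__set__", "__delete__")
    return [name for name in names if hasattr(type(obj), name)]


# Index the class dict directly so the descriptor protocol is not triggered.
property_methods = descriptor_methods(Post.__dict__["excerpt"])
classmethod_methods = descriptor_methods(Post.__dict__["blank"])
staticmethod_methods = descriptor_methods(Post.__dict__["normalize"])
# property_methods     -> ['__get__', '__set__', '__delete__']
# classmethod_methods  -> ['__get__']
# staticmethod_methods -> ['__get__']

post = Post("  hello  ")
try:
    post.excerpt = "blah blah blah"
except AttributeError:
    assignment_failed = True
else:
    assignment_failed = False
```

So a property does carry a __set__ method; it simply raises AttributeError when no setter function has been registered.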
Descriptors are all over the place once you know how to recognize them, and it helps to know how they work and what they're good for. In general, they're an advanced Python topic that you typically won't need to reach for beyond the built-in @property, @classmethod, and @staticmethod decorators. However, when you do come across a good use case, like when multiple class attributes share the same getter/setter functionality, a custom descriptor can significantly DRY out your code and encapsulate logic in exactly the places that you want it.
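As a closing illustration of that last point, here is a minimal sketch of a validating descriptor shared by two attributes. PositiveNumber and Rectangle are hypothetical names, and storing the value under an underscore-prefixed attribute is just one of several reasonable designs.

```python
class PositiveNumber:
    """Reusable getter/setter logic for any attribute that must be > 0."""

    def __set_name__(self, owner, name):
        self.public_name = name
        self.storage_name = "_" + name  # where the value lives on the instance

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.public_name} must be positive, got {value!r}")
        setattr(instance, self.storage_name, value)


class Rectangle:
    # The validation logic is written once and reused for both attributes.
    width = PositiveNumber()
    height = PositiveNumber()

    def __init__(self, width, height):
        self.width = width    # goes through PositiveNumber.__set__
        self.height = height

    def area(self):
        return self.width * self.height


rect = Rectangle(2, 3)
try:
    rect.width = -1
except ValueError:
    rejected = True
else:
    rejected = False
```

Without the descriptor, each attribute would need its own @property getter and setter pair with the same validation copied into both.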