CHRISDOESCODING

De-Mystifying Python Descriptors + Django

Sep 16, 2018


At my new job, I've been working a lot with Django. I was wary at first, but after about half a year of working with it, I can honestly say that it's been a dream. It's a significantly different experience from my last job, where we used Flask and SQLite. For me, one of the best reasons to use Django is its ORM, whose extensible and easy-to-understand QuerySet API makes it easy to build complex queries and filters. Maybe I'll write about my experience with that switch in a future blog post.

In any case, after what was probably my first thousand hours working with Django, I finally took the time to learn about a Python feature that's at the heart of Django's Model/Field API. It is so well hidden and abstracted away that many Django developers never have to concern themselves with its nuances until they start doing some metaprogramming that touches this area. This feature, my dear friends, as I'm sure you've figured out from the title of this blog post, is descriptors.

Descriptors

The name "descriptor" does not refer to a new kind of Python type or keyword; it's better described as a coding pattern or protocol that you may find in a regular Python class. Descriptors are Python objects that define at least one of the following magic methods: __get__, __set__, or __delete__.

For example, check out this simple descriptor:

# defining the descriptor
class MyDescriptor:
    def __get__(self, instance, owner=None):
        ...  # details omitted

    def __set__(self, instance, value):
        ...  # details omitted

# defining a class that uses the descriptor
class Foo:
    some_attribute = MyDescriptor()

# invoking the descriptor methods
foo = Foo()
foo.some_attribute

Accessing foo.some_attribute, as in the example above, is enough to make Python call the descriptor's __get__ method.
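To make the omitted details concrete, here is a minimal runnable sketch. The names LoggingDescriptor, calls, and _some_attribute are made up for illustration; the only thing Python itself requires is the __get__/__set__ method names.

```python
class LoggingDescriptor:
    """Hypothetical descriptor that records every call it receives."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself (Foo.some_attribute).
            return self
        self.calls.append("get")
        return instance.__dict__.get("_some_attribute")

    def __set__(self, instance, value):
        self.calls.append("set")
        # Store the real value on the instance under a private key.
        instance.__dict__["_some_attribute"] = value


class Foo:
    some_attribute = LoggingDescriptor()


foo = Foo()
foo.some_attribute = 42      # triggers LoggingDescriptor.__set__
value = foo.some_attribute   # triggers LoggingDescriptor.__get__

print(value)                                 # 42
print(Foo.__dict__["some_attribute"].calls)  # ['set', 'get']
```

Note that the descriptor instance lives on the class, so any per-instance state has to be stored on the instance (here, in its __dict__) rather than on the descriptor.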

Theoretically, you could call the descriptor methods directly, i.e., Foo.__dict__['some_attribute'].__get__(foo, Foo), but you typically don't want to do that because you lose the nice syntactic shorthand that the descriptor gives you. If you were going to call a method directly, you would probably prefer to create a class or instance method on Foo, and call it that way.

Surprisingly, there is a lot of magic happening when you access foo.some_attribute. Python objects have a magic method called __getattribute__, which is called any time you look up an attribute on an object. Nothing about it is specific to descriptors: it's quite literally how Python finds attributes on any object. The default implementation first searches for some_attribute in the dict of the instance's class (i.e., type(foo).__dict__['some_attribute'] or Foo.__dict__['some_attribute']) and then in the dicts of Foo's base classes, following the method resolution order. If it finds an object there that defines __get__ together with __set__ or __delete__ (a "data descriptor"), it calls that object's __get__ right away. Otherwise, it checks the instance's dict (i.e., foo.__dict__['some_attribute']) and, if the name is there, returns that value as-is.

If the instance's dict has no such entry, Python falls back to the object it found on the class and checks whether it implements __get__. If it doesn't, Python just returns the object as-is, whatever it may be. If it does have a __get__ method, Python calls it, passing in the instance and its class as arguments. If the name was not found anywhere, Python raises an AttributeError (after trying __getattr__, if the class defines one).

In other words, for our example above:

foo.some_attribute

is the same as:

type(foo).__dict__['some_attribute'].__get__(foo, type(foo))
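The equivalence can be checked directly, along with one detail of the lookup order: a descriptor that defines __set__ or __delete__ (a "data descriptor") takes priority over the instance's dict, while one that defines only __get__ can be shadowed by it. The class and attribute names below are invented for this sketch.

```python
class NonDataDescriptor:
    """Defines only __get__, so an instance dict entry can shadow it."""

    def __get__(self, instance, owner=None):
        return "from non-data descriptor"


class DataDescriptor:
    """Defines __get__ and __set__, so it wins over the instance dict."""

    def __get__(self, instance, owner=None):
        return "from data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class Foo:
    plain = NonDataDescriptor()
    guarded = DataDescriptor()


foo = Foo()

# Attribute access and the explicit __get__ call give the same result.
explicit = type(foo).__dict__["plain"].__get__(foo, type(foo))
assert foo.plain == explicit

# Writing straight into the instance dict bypasses __set__ entirely.
foo.__dict__["plain"] = "from instance dict"
foo.__dict__["guarded"] = "from instance dict"

print(foo.plain)    # from instance dict
print(foo.guarded)  # from data descriptor
```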

Whatever that __get__ method does or returns is up to you. A similar process happens when you assign a value to some_attribute, except that it checks for the existence of and then calls the __set__ method instead. Same for __delete__, which gets called when you attempt to del the attribute. There is space here to perform a whole host of shenanigans inside of these methods. You could make a counter that increments every time an attribute is accessed. You could raise an Exception inside of a __set__ to render it read-only. You could return different values from the __get__ depending on the state of the instance.
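Two of those shenanigans, sketched with made-up names (CountedAccess, ReadOnly, Sensor). This assumes Python 3.6+, where __set_name__ tells the descriptor which attribute name it was assigned to.

```python
class CountedAccess:
    """Hypothetical descriptor that counts reads per instance."""

    def __set_name__(self, owner, name):
        self.value_key = "_" + name
        self.count_key = "_" + name + "_reads"

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        state = instance.__dict__
        state[self.count_key] = state.get(self.count_key, 0) + 1
        return state.get(self.value_key)

    def __set__(self, instance, value):
        instance.__dict__[self.value_key] = value


class ReadOnly:
    """Hypothetical descriptor whose __set__ always refuses."""

    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner=None):
        return self.value

    def __set__(self, instance, value):
        raise AttributeError("this attribute is read-only")


class Sensor:
    reading = CountedAccess()
    units = ReadOnly("celsius")


sensor = Sensor()
sensor.reading = 21.5
sensor.reading
sensor.reading
print(sensor._reading_reads)  # 2

try:
    sensor.units = "kelvin"
except AttributeError as exc:
    print(exc)  # this attribute is read-only
```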

Django's Use of Descriptors

Descriptors are ubiquitous in a Django app. In your models.py, suppose you have the following model:

from django.db import models

class Post(models.Model):
    title = models.CharField(max_length=200, unique=True)
    body = models.TextField()

When Django sets up this model, Django's ModelBase metaclass iterates over all of the model's fields and calls each field's contribute_to_class method. This method registers the field on the model's _meta attribute and then installs a descriptor called DeferredAttribute in its original location. DeferredAttribute helps with performance when a field's value has been deferred (for example, with QuerySet.defer() or only()): the first time you access the value, Django queries the database and caches the result on the instance. Every subsequent access reads the cached value instead of reaching into the database, which could end up being costly if you're doing it repeatedly!
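The load-once-then-cache idea can be imitated in plain Python. The sketch below is not Django's DeferredAttribute; LazyField, fake_query, and the queries list are stand-ins invented for illustration, with fake_query playing the role of the database round trip.

```python
class LazyField:
    """Toy stand-in for a deferred field; NOT Django's implementation."""

    def __init__(self, loader):
        self.loader = loader  # hypothetical callable standing in for a query

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # This is a non-data descriptor (no __set__), so it is only reached
        # while the value is still missing from instance.__dict__.
        value = self.loader(instance)
        instance.__dict__[self.name] = value  # cache on the instance
        return value


queries = []


def fake_query(instance):
    queries.append("fake query for body")
    return "Hello, descriptors!"


class Post:
    body = LazyField(fake_query)


post = Post()
first = post.body   # runs fake_query and caches the result
second = post.body  # served from post.__dict__, no second "query"
print(len(queries))  # 1
```

Because the cached value lands in the instance's dict under the same name, later lookups never reach the descriptor at all.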

One common pattern in many Django applications is the use of the @property decorator. property is a Python built-in that turns the function it decorates into a descriptor whose __get__ method calls that function.

from django.db import models

class Post(models.Model):
    title = models.CharField(max_length=200, unique=True)
    body = models.TextField()

    @property
    def excerpt(self):
        # slicing never raises on short strings, so no length check is needed
        return self.body[:100]

first_post = Post.objects.first()
first_post.excerpt  # calls the excerpt function and returns its result
first_post.excerpt = 'blah blah blah'   # AttributeError

Attempting to assign 'blah blah blah' to the excerpt property fails with an AttributeError because no setter was defined. A property object does implement __set__, but that method raises AttributeError unless you register a setter (for example, with @excerpt.setter). The @classmethod and @staticmethod decorators are built on descriptors too, though they implement only __get__.
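The same behaviour can be inspected without a database. The sketch below swaps the Django model for a plain class (Article, blank, and word_count are illustrative names) and looks at the objects the three decorators leave in the class dict.

```python
class Article:
    def __init__(self, body):
        self.body = body

    @property
    def excerpt(self):
        return self.body[:100]

    @classmethod
    def blank(cls):
        return cls("")

    @staticmethod
    def word_count(text):
        return len(text.split())


article = Article("x" * 250)

# The decorated function is replaced by a property object on the class.
prop = Article.__dict__["excerpt"]
print(type(prop).__name__)                        # property
print(prop.__get__(article, Article) == article.excerpt)  # True

# Without a setter, assignment is rejected with AttributeError.
try:
    article.excerpt = "blah blah blah"
except AttributeError:
    print("excerpt is read-only")

# classmethod and staticmethod objects define __get__ but no __set__.
for name in ("blank", "word_count"):
    obj = Article.__dict__[name]
    print(name, hasattr(obj, "__get__"), hasattr(obj, "__set__"))
```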

Summary

Descriptors are all over the place once you know how to recognize them, and it helps to know how they work and what they're good for. In general, they're an advanced Python topic that you typically won't need to reach for beyond the built-in @property, @classmethod, and @staticmethod decorators. However, when you do come across a good use case, like when multiple class attributes share the same getter/setter functionality, a custom descriptor can significantly DRY out your code and encapsulate logic in exactly the place that you want it.
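As a small illustration of that last point, here is a sketch in which two attributes share one piece of validation. PositiveNumber and Rectangle are hypothetical names, and the example assumes Python 3.6+ for __set_name__.

```python
class PositiveNumber:
    """Hypothetical reusable descriptor: rejects zero and negative values."""

    def __set_name__(self, owner, name):
        self.public_name = name
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(
                f"{self.public_name} must be positive, got {value!r}"
            )
        setattr(instance, self.storage_name, value)


class Rectangle:
    # One descriptor class, reused for every attribute that needs the check.
    width = PositiveNumber()
    height = PositiveNumber()

    def __init__(self, width, height):
        self.width = width    # goes through PositiveNumber.__set__
        self.height = height

    @property
    def area(self):
        return self.width * self.height


rect = Rectangle(3, 4)
print(rect.area)  # 12

try:
    rect.height = -1
except ValueError as exc:
    print(exc)  # height must be positive, got -1
```

Writing the same check with @property would need a getter and a setter per attribute; the descriptor keeps it in one place.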