off the stack

A python can bite you too

2012-09-15 | django, python

While doing maintenance/bug fixing work on a website built with Django, a came across an issue with the commenting system. The problem was, that the order of the comments was not right sometimes. Comments which were placed as a comment on another comment were occasionally even displayed before their subject. Another strange thing about it was that it mostly happened at times with higher traffic.

The application which contained the commenting functionality contained the following model definition:

class Comment(models.Model):
    comment = models.TextField()
    published = models.DateTimeField(
    user = models.ForeignKey(User, related_name="user_comment")
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey('content_type', 'object_id')
    client_ip = models.CharField(max_length=15)

Can you spot the bug?

It is right there on line 3 (where the published field is defined).

The problem here was the call to right there in the class definition.

Whats happening

When the of the application in question is loaded, the class definition of Comment is evaluated too. Each field of the Comment class receives an instance of a Field class.

The published field receives an instance of the DateTimeField. No problem so far. The instance is created using the result of as default. And right there things are wrong.

Because we actually call a function here, the result is used. This means the default value of the field is the "datetime" of the moment when the class definition is being evaluated. The python interpreter remembers the resulting class as long as it is running.

In case of the website, the python interpreter was invoked using Apache with mod_wsgi. This means that a python instance will be reused for several requests. And all of those requests will use the same default time for published...

The fix

The fix for the bug was the removal of two characters. I had to change the line from this:

    published = models.DateTimeField(

to this:

    published = models.DateTimeField(

Now the parentheses () after now are missing from the definition, a reference to the now function is being passed as the default.

When Django tries to retrieve the default value of a field (for instance when the model is being saved) it now calls the function instead of using the result of it when the models were loaded the first time.

Similar issues

Other places which are only evaluated once during an interpreter session will probably expose similar issues.

Take a basic function definition for example:

def example(items=[]):
    return items

What will a call to example() return? What will be returned a second or third time?

>>> def example(items=[]):
...     items.append("test")
...     return items
>>> example()
>>> example()
['test', 'test']
>>> example()
['test', 'test', 'test']

You can even get more confusing issues when you combine this behavior with more side-effects such as database queries.

Once, I saw something like this within a Django application:

class ExampleForm(forms.Form):
    ages = [(u'', u'Age group')]
    ages.extend([([, c.label]) for c in Age.objects.all()])
    age = forms.ChoiceField(choices=ages, required=False)

This will probably work. The will validate and the application will start just fine. But what will happen when an age group is added within the Django admin interface while the front-end application is running?

Try it. You will see that the list of age groups will not change as long as Django is not restarted/reloaded...


Just take care where you declare something and think about when your declaration will be evaluated, how it will change within the application's lifetime and if this is really what you want.

So just think about it.