While doing maintenance/bug fixing work on a website built with Django, a came across an issue with the commenting system. The problem was, that the order of the comments was not right sometimes. Comments which were placed as a comment on another comment were occasionally even displayed before their subject. Another strange thing about it was that it mostly happened at times with higher traffic.
The application which contained the commenting functionality contained the following model definition:
1 2 3 4 5 6 7 8
Can you spot the bug?
It is right there on line 3 (where the
published field is defined).
The problem here was the call to
datetime.datetime.now() right there in the class definition.
When the models.py of the application in question is loaded, the class definition of Comment is evaluated too. Each field of the Comment class receives an instance of a Field class.
published field receives an instance of the
DateTimeField. No problem so far. The instance is created using the result of
datetime.datetime.now() as default. And right there things are wrong.
Because we actually call a function here, the result is used. This means the default value of the field is the
datetime of the moment when the class definition is being evaluated. The python interpreter remembers the resulting class as long as it is running.
In case of the website, the python interpreter was invoked using Apache with mod_wsgi. This means that a python instance will be reused for several requests. And all of those requests will use the same default time for published…
The fix for the bug was the removal of two characters. I had to change the line from this:
published = models.DateTimeField(default=datetime.datetime.now())
published = models.DateTimeField(default=datetime.datetime.now)
Now the parentheses
now are missing from the definition, a reference to the
now function is being passed as the default.
When Django tries to retrieve the default value of a field (for instance when the model is being saved) it now calls the
datetime.datetime.now() function instead of using the result of it when the models were loaded the first time.
Other places which are only evaluated once during an interpreter session will probably expose similar issues.
Take a basic function definition for example:
def example(items=): items.append("test") return items
What will a call to
example() return? What will be returned a second or third time?
>>> def example(items=): ... items.append("test") ... return items ... >>> example() ['test'] >>> example() ['test', 'test'] >>> example() ['test', 'test', 'test'] >>>
You can even get more confusing issues when you combine this behavior with more side-effects such as database queries.
Once, I saw something like this within a Django application:
class ExampleForm(forms.Form): ages = [(u'', u'Age group')] ages.extend([([c.id, c.label]) for c in Age.objects.all()]) age = forms.ChoiceField(choices=ages, required=False)
This will probably work. The models.py will validate and the application will start just fine. But what will happen when an age group is added within the Django admin interface while the front-end application is running?
Try it. You will see that the list of age groups will not change as long as Django is not restarted/reloaded…
Just take care where you declare something and think about when your declaration will be evaluated, how it will change within the application’s lifetime and if this is really what you want.
So just think about it.
by Rico Moorman (rico.moorman @ gmail.com) - 2012-09-15