This post is based on a bug my team discovered in production a couple of years ago.
We had to pin our Django version because a security patch changed undocumented routing logic and broke a contract with a front-end client.
In this post, I'm going to argue that we could have caught this bug earlier if we built paths in tests using strings and string interpolation rather than calling reverse
.
Many Django Tests are Integration Tests
Many of the people I work with refer to the tests we write that use a test client and real database as unit tests. In fairness, these are not as integrated as they would be if we ran Django with a WSGI (or ASGI) server in a container with a managed database and used an external HTTP client.
Nonetheless, these tests include the complete request-response cycle and execute real SQL queries. They also benefit from an HTTP server and client and can validate expected contracts.
Example
The cause of this bug was a regex URL pattern that did not start with a caret, which meant that the client could put anything between different parts of the conf.
The client used to send requests /api/user/password-reset/MzU2ODAzMA/601-cf2aabf988d02cb64114
and these worked fine with the URL conf password-reset/(?P<uidb64>[0-9A-Za-z]+)/(?P<token>.+)$
.
How did routing work before the security patch?
A client sends a request to
/api/password-reset/MzU2ODAzMA/601-cf2aabf988d02cb64114
.Django’s routing first matches the regex
^api/
which restricts it to a subset of views.It then truncates the original path, so that it’s
password-reset/MzU2ODAzMA/601-cf2aabf988d02cb64114
.It matches the regex
password-reset/(?P<uidb64>[0-9A-Za-z]+)/(?P<token>.+)$
and routes the request to theUserPasswordReset
view.
Because the URL conf for UserPasswordReset
does not start with a caret (^
), requests to /api/bulbous-clown-shoes/password-reset/MzU2ODAzMA/601-cf2aabf988d02cb64114
would also be routed to that view.
After the patch exact matches were forced and the front-end seemingly inexplicably broken. The URL conf hasn't been changed in years so it wasn't clear why it was 404-ing.
What would have caught it
At the time of the bug, the tests looked like this:
def test_user_password_reset_token_validation_invalid_token(self):
path = reverse(
"api:password-reset",
kwargs={
"uidb64": urlsafe_base64_encode(force_bytes(cls.a_user.pk)),
"token": "1234"
},
)
response = self.client.get(path)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert response.data.get("detail") == "Token invalid"
Since then they've been updated to look like this:
def test_user_password_reset_token_validation_invalid_token(self):
uidb64 = urlsafe_base64_encode(force_bytes(cls.a_user.pk))
path = f"/api/password-reset/{uidb64}/1234"
response = self.client.get(self.endpoint)
assert response.status_code == status.HTTP_400_BAD_REQUEST
assert response.data.get("detail") == "Token invalid"
reverse
calls in tests to stings and f-strings. I may write a post about it one day.Not only are there fewer lines of code, it's clearer at a glance what endpoint the test is testing.
As long as the paths in the test are in line with your API documentation, they should provide some coverage of the contract. Without explicitly typing out the paths this would need to be covered by end-to-end tests or contact tests.
Hyrum's Law
In the aftermath of this bug, I did a survey of all of the actual usage of our API compared with URL confs (using data collected using OpenTelemetry). I was looking for clients accidentally depending on the undocumented looseness of our routing. See https://www.hyrumslaw.com/
I only found one obvious example of Hyrum's Law. The regex /api/cards/(P<info_card_pk>\d+)
didn't terminate with a $
, meaning that any following characters were allowed.
Given the content type of these cards, some client developers had taken to appending .html
on the end. To avoid breaking changes I converted this to re_path
and left the overly permissive regex as it was.
Using django.url.path
The story behind this post shows its age somewhat as django.conf.urls.url
was deprecated for some time and has been removed from Django 4.0+.
All the errors I've discussed were a result of overly broad regular expressions. django.urls.path avoids these errors by providing a simpler path syntax similar to routes in Flask or FastAPI. The path
URL conf functions enforce exact matches by default, whereas the older url
calls use custom regular expressions which are error-prone and can allow clients to use unexpected URLs.
Conclusion
Even if you are using django.urls.path
for most of your URL I'd still argue that it's better practice to type out URLs in your tests. Django tests that use a fake client and real database are integration tests and we should take advantage of that.