Don't Call django.shortcuts.reverse in Tests

Don't Call django.shortcuts.reverse in Tests

This post is based on a bug my team discovered in production a couple of years ago.

We had to pin our Django version because a security patch changed undocumented routing logic and broke a contract with a front-end client.

In this post, I'm going to argue that we could have caught this bug earlier if we built paths in tests using strings and string interpolation rather than calling reverse.

Many Django Tests are Integration Tests

Many of the people I work with refer to the tests we write that use a test client and real database as unit tests. In fairness, these are not as integrated as they would be if we ran Django with a WSGI (or ASGI) server in a container with a managed database and used an external HTTP client.

Nonetheless, these tests include the complete request-response cycle and execute real SQL queries. They also benefit from an HTTP server and client and can validate expected contracts.

Example

The cause of this bug was a regex URL pattern that did not start with a caret, which meant that the client could put anything between different parts of the conf.

The client used to send requests /api/user/password-reset/MzU2ODAzMA/601-cf2aabf988d02cb64114 and these worked fine with the URL conf password-reset/(?P<uidb64>[0-9A-Za-z]+)/(?P<token>.+)$.

The particular error here has already been fixed by a historical security patch for CVE-2021-44420. It's included in the article to illustrate the point rather than on its own merits.

How did routing work before the security patch?

  • A client sends a request to /api/password-reset/MzU2ODAzMA/601-cf2aabf988d02cb64114.

  • Django’s routing first matches the regex ^api/ which restricts it to a subset of views.

  • It then truncates the original path, so that it’s password-reset/MzU2ODAzMA/601-cf2aabf988d02cb64114.

  • It matches the regex password-reset/(?P<uidb64>[0-9A-Za-z]+)/(?P<token>.+)$ and routes the request to the UserPasswordReset view.

Because the URL conf for UserPasswordReset does not start with a caret (^), requests to /api/bulbous-clown-shoes/password-reset/MzU2ODAzMA/601-cf2aabf988d02cb64114 would also be routed to that view.

After the patch exact matches were forced and the front-end seemingly inexplicably broken. The URL conf hasn't been changed in years so it wasn't clear why it was 404-ing.

What would have caught it

At the time of the bug, the tests looked like this:

    def test_user_password_reset_token_validation_invalid_token(self):
        path = reverse(
            "api:password-reset",
            kwargs={
                "uidb64": urlsafe_base64_encode(force_bytes(cls.a_user.pk)), 
                "token": "1234"
             },
        )

        response = self.client.get(path)

        assert response.status_code == status.HTTP_400_BAD_REQUEST
        assert response.data.get("detail") == "Token invalid"

Since then they've been updated to look like this:

    def test_user_password_reset_token_validation_invalid_token(self):
        uidb64 = urlsafe_base64_encode(force_bytes(cls.a_user.pk))
        path = f"/api/password-reset/{uidb64}/1234"

        response = self.client.get(self.endpoint)

        assert response.status_code == status.HTTP_400_BAD_REQUEST
        assert response.data.get("detail") == "Token invalid"
🤓
I wrote a script that iterated through all test files and automatically converted most reverse calls in tests to stings and f-strings. I may write a post about it one day.

Not only are there fewer lines of code, it's clearer at a glance what endpoint the test is testing.

As long as the paths in the test are in line with your API documentation, they should provide some coverage of the contract. Without explicitly typing out the paths this would need to be covered by end-to-end tests or contact tests.

Hyrum's Law

In the aftermath of this bug, I did a survey of all of the actual usage of our API compared with URL confs (using data collected using OpenTelemetry). I was looking for clients accidentally depending on the undocumented looseness of our routing. See https://www.hyrumslaw.com/

I only found one obvious example of Hyrum's Law. The regex /api/cards/(P<info_card_pk>\d+) didn't terminate with a $, meaning that any following characters were allowed.

Given the content type of these cards, some client developers had taken to appending .html on the end. To avoid breaking changes I converted this to re_path and left the overly permissive regex as it was.

Using django.url.path

The story behind this post shows its age somewhat as django.conf.urls.url was deprecated for some time and has been removed from Django 4.0+.

All the errors I've discussed were a result of overly broad regular expressions. django.urls.path avoids these errors by providing a simpler path syntax similar to routes in Flask or FastAPI. The path URL conf functions enforce exact matches by default, whereas the older url calls use custom regular expressions which are error-prone and can allow clients to use unexpected URLs.

Conclusion

Even if you are using django.urls.path for most of your URL I'd still argue that it's better practice to type out URLs in your tests. Django tests that use a fake client and real database are integration tests and we should take advantage of that.