6) Testing Lesson

Write a Python Unit Test

8 min to complete · By Martin Breuss

Navigate into the folder where you've saved your web scraper script. What's the name of the file you used? In this lesson, I'll assume that it's named rescrape.py.

How to Write a Unittest

Create a new Python file in the same folder and name it appropriately, following the testing guidelines described in the previous lessons. For example, in this case, it could be test_rescrape.py.

1. Set Basic Structure

It's time to set up the basic unittest test structure in that file:

import unittest
import rescrape


class TestRescrape(unittest.TestCase):
    pass

if __name__ == "__main__":
    unittest.main()

2. Check for Errors

Great, the basic structure is done. Make sure that the import statements work and that there's no SyntaxError anywhere by executing your tests:

python3 -m unittest test_rescrape.py

If you run the test suite with the basic structure set up but no test cases in your class yet, you should see this response:


----------------------------------------------------------------------
Ran 0 tests in 0.000s

OK

No dots, but also no complaints. Everything is OK and following the plan so far :)

3. Write the Pseudocode

Now you can transfer your notes on what you might want to test in your web scraper into the testing file as pseudocode:

import unittest
import rescrape


class TestRescrape(unittest.TestCase):

    # requests can establish a connection and receive a valid response

    # the response contains HTML code

    # the HTML can be successfully converted to a Beautiful Soup object

    # can identify all links from the index page

    # can identify the author of a recipe

    # can get the main recipe text

    pass

if __name__ == "__main__":
    unittest.main()

These are a couple of ideas for the functionality of your scraper that you might want to test. Keep in mind that your pseudocode doesn't have to be the same, and it doesn't have to be complete and in order. These are notes to help you think and structure the code you'll write.

Colorful illustration of a light bulb

Info: You'll start by focusing on the first idea, which is why the following code blocks won't show the rest of the pseudocode anymore.

With your code comments and ideas in place, you can start to tackle the first one.

4. Write a Test Case

You'll want to make sure that your web scraping code can connect to the server and receive a valid response. Now, you'll try to encode this task into a test:

import unittest
import rescrape


class TestRescrape(unittest.TestCase):

    # requests can establish a connection and receive a valid response
    def test_get_valid_html_response(self):
        BASE_URL = "https://codingnomads.github.io/recipes/"
        index_page = rescrape.get_page_content(BASE_URL)
        self.assertEqual(index_page.status_code, 200)

You wrote your first test case, and you named it descriptively as test_get_valid_html_response(). Inside this method, you wrote three lines of code:

  1. You defined a BASE_URL variable that holds the URL of the recipe collection's index page.
  2. You called a non-existing function from your rescrape.py script, get_page_content(), and passed it the BASE_URL as an argument. You also saved the return value of this imaginary function to the variable index_page.
  3. Finally, you called self.assertEqual() to check whether the .status_code attribute of your index_page variable is equal to the HTTP success code 200.

5. Run Your Test

There are quite some assumptions you've made with this test case, and now it's time to check up on these assumptions by running your test suite:

python3 -m unittest test_rescrape.py

And because Python tries to be a good friend to you, it'll right away let you know that something's not quite right:

E
======================================================================
ERROR: test_get_valid_html_response (__main__.TestRescrape)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_rescrape.py", line 9, in test_get_valid_html_response
    index_page = rescrape.get_page_content(BASE_URL)
AttributeError: module 'rescrape' has no attribute 'get_page_content'

----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (errors=1)

Debug Errors

unittest presents you with an error (E). This is even before a failed test, which would be represented with a capital F. Python attempted to import the function you mentioned from rescrape.py, but it wasn't there:

AttributeError: module 'rescrape' has no attribute 'get_page_content'

It's not there, because you haven't written it yet. From a test-driven development perspective, you've received exactly the message you expected.

You start with failing tests, then you build or refactor your production code until the test passes. By writing your test method test_get_valid_html_response(), you created a blueprint for the function you'll build in your scraper code in the next lesson.

Summary: Write a Python Unit Test

  • There are several steps to writing a unittest
  • The initial tests are supposed to fail
  • From a failed test, you build or refactor your production code until the test passes

Steps to Write a Unit Test

  1. Set basic structure
  2. Check for errors
  3. Write the pseudocode
  4. Write a test case
  5. Run your test