Content from Introduction


Last updated on 2024-12-19

Overview

Questions

  • What are the goals of this course?

Objectives

  • To understand the learning outcomes of this course
  • To understand the structure of the practicals

Welcome to Testing and Continuous Integration with Python


This course aims to equip researchers with the skills to write effective tests and ensure the quality and reliability of their research software. No prior testing experience is required! We’ll guide you through the fundamentals of software testing using Python’s Pytest framework, a powerful and beginner-friendly tool. You’ll also learn how to integrate automated testing into your development workflow using continuous integration (CI). CI streamlines your process by automatically running tests with every code change, catching bugs early and saving you time. By the end of the course, you’ll be able to write clear tests, leverage CI for efficient development, and ultimately strengthen the foundation of your scientific findings.

This course has a single continuous project that you will work on throughout the lessons, and each lesson builds on the last through practicals that will help you apply the concepts you learn. However, if you get stuck or fall behind during the course, don’t worry! All the stages of the project for each lesson are available in the files directory of this course’s materials, and you can copy them across if needed. For example, if you are on lesson 3 and haven’t completed the practicals for lesson 2, you can copy the corresponding folder from the files directory.

By the end of this course, you should:

  • Understand how testing can be used to improve code & research reliability
  • Be comfortable with writing basic tests & running them
  • Be able to construct a simple Python project that incorporates tests
  • Be familiar with testing best practices such as unit testing & the AAA pattern
  • Be aware of more advanced testing features such as fixtures & parametrization
  • Understand what Continuous Integration is and why it is useful
  • Be able to add testing to a GitHub repository with simple Continuous Integration

Code of Conduct


This course is covered by the Carpentries Code of Conduct.

As mentioned in the Carpentries Code of Conduct, we encourage you to:

  • Use welcoming and inclusive language
  • Be respectful of different viewpoints and experiences
  • Gracefully accept constructive criticism
  • Focus on what is best for the community
  • Show courtesy and respect towards other community members

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by following our reporting guidelines.

Challenges


This course uses blocks like the one below to indicate an exercise for you to attempt. The solution is hidden by default and can be clicked on to reveal it.

Challenge 1: Talk to your neighbour

  • Introduce yourself to your neighbour
  • Have either of you experienced a time when testing would have been useful?
  • Have either of you written scripts to check that your code is working as expected?
  • Perhaps during a project your code kept breaking and taking up a lot of your time?
  • Perhaps you have written a script to check that your data is being processed correctly?

Key Points

  • This course will teach you how to write effective tests and ensure the quality and reliability of your research software
  • No prior testing experience is required
  • You can catch up on practicals by copying the corresponding folder from the files directory of this course’s materials

Content from Why Test My Code?



Overview

Questions

  • Why should I test my code?

Objectives

  • Understand how testing can help to ensure that code is working as expected

What is software testing?


Software testing is the process of checking that code is working as expected. You may have data processing functions or automations that you use in your work - how do you know that they are doing what you expect them to do?

Software testing is most commonly done by writing code (tests) that checks that your code works as expected.

This might seem like a lot of effort, so let’s go over some of the reasons you might want to add tests to your project.

Catching bugs


Whether you are writing the occasional script or developing a large piece of software, mistakes are inevitable. Sometimes a mistake creeps into the code without you even noticing, and it gets published.

Consider the following function:

PYTHON

def add(a, b):
    return a - b

When writing this function, I made a mistake. I accidentally wrote a - b instead of a + b. This is a simple mistake, but it could have serious consequences in a project.

When writing the code, I could have tested this function by manually trying it with different input and checking the output, but:

  • This takes time.
  • I might forget to test it again when we make changes to the code later on.
  • Nobody else in my team knows if I tested it, or how I tested it, and therefore whether they can trust it.

This is where automated testing comes in.

Automated testing


Automated testing is where we write code that checks that our code works as expected. Every time we make a change, we can run our tests to automatically make sure that our code still works as expected.

If we were writing a test from scratch for the add function, think for a moment about how we would do it. We would need to write a function that runs the add function on a set of inputs, checking each case to ensure it does what we expect. Let’s write a test for the add function and call it test_add:

PYTHON

def test_add():
   # Check that it adds two positive integers
   if add(1, 2) != 3:
      print("Test failed!")
   # Check that it adds zero
   if add(5, 0) != 5:
      print("Test failed!")
   # Check that it adds two negative integers
   if add(-1, -2) != -3:
      print("Test failed!")

Here we check that the function works for a set of test cases. We ensure that it works for positive numbers, negative numbers, and zero.

Challenge 1: What could go wrong?

When writing functions, sometimes we don’t anticipate all the ways that they could go wrong.

Take a moment to think about what is wrong, or might go wrong with these functions:

PYTHON

def greet_user(name):
   return "Hello" + name + "!"

PYTHON

def gradient(x1, y1, x2, y2):
    return (y2 - y1) / (x2 - x1)

The first function will incorrectly greet the user, as it is missing a space after “Hello”. It would print HelloAlice! instead of Hello Alice!.

If we wrote a test for this function, we would have noticed that it was not working as expected:

PYTHON

def test_greet_user():
   if greet_user("Alice") != "Hello Alice!":
      print("Test failed!")

The second function will crash if x2 - x1 is zero.

If we wrote a test for this function, it may have helped us to catch this unexpected behaviour:

PYTHON

def test_gradient():
   if gradient(1, 1, 2, 2) != 1:
      print("Test failed!")
   if gradient(1, 1, 2, 3) != 2:
      print("Test failed!")
   if gradient(1, 1, 1, 2) != "Undefined":
      print("Test failed!")

And we could have amended the function:

PYTHON

def gradient(x1, y1, x2, y2):
   if x2 - x1 == 0:
      return "Undefined"
   return (y2 - y1) / (x2 - x1)

Finding the root cause of a bug


When a test fails, it can help us to find the root cause of a bug. For example, consider the following function:

PYTHON


def multiply(a, b):
    return a * a

def divide(a, b):
    return a / b

def triangle_area(base, height):
    return divide(multiply(base, height), 2)

There is a bug in this code too, but since we have several functions calling each other, it is not immediately obvious where the bug is. Also, the bug is not likely to cause a crash, so we won’t get a helpful error message telling us what went wrong. If a user happened to notice that there was an error, then we would have to check triangle_area to see if the formula we used is right, then multiply, and divide to see if they were working as expected too!

However, if we had written tests for these functions, then we would have seen that both the triangle_area and multiply functions were not working as expected, allowing us to quickly see that the bug was in the multiply function without having to check the other functions.
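To make this concrete, here is a sketch of simple checks for each function, in the same print-based style as the earlier tests (the inputs and expected values are illustrative, not from the lesson). Running these checks would report failures for multiply and triangle_area but not divide, pointing straight at multiply as the culprit:

```python
def multiply(a, b):
    return a * a  # the bug from above: should be a * b

def divide(a, b):
    return a / b

def triangle_area(base, height):
    return divide(multiply(base, height), 2)

def test_multiply():
    # multiply(2, 3) returns 4 because of the bug, so this check fails
    if multiply(2, 3) != 6:
        print("Test failed!")

def test_divide():
    # divide(6, 2) returns 3.0, so this check passes
    if divide(6, 2) != 3:
        print("Test failed!")

def test_triangle_area():
    # triangle_area(4, 3) returns 8.0 (not 6) because of the multiply bug
    if triangle_area(4, 3) != 6:
        print("Test failed!")
```

Both test_multiply and test_triangle_area fail, while test_divide passes, so the overlap tells us the bug lives in multiply.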

Increased confidence in code


When you have tests for your code, you can be more confident that it works as expected. This is especially important when you are working in a team or producing software for users, as it allows everyone to trust the code. If you have a test that checks that a function works as expected, then you can be confident that the function will work as expected, even if you didn’t write it yourself.

Forcing a more structured approach to coding


When you write tests for your code, you are forced to think more carefully about how your code behaves and how you will verify that it works as expected. This can help you to write more structured code, as you will need to think about how to test it as well as how it could fail.

Challenge 2: What could go wrong?

Consider a function that controls a driverless car.

  • What checks might we add to make sure it is not dangerous to use?

PYTHON


def drive_car(speed, direction):
    ...  # complex car driving code
    return speed, direction, brake_status
  • We might want to check that the speed is within a safe range.

  • We might want to check that the direction is valid (i.e. not towards an obstacle such as a tree), and if it is not, that the car applies the brakes.

Key Points

  • Automated testing helps to catch hard-to-spot errors in code & find the root cause of complex issues.
  • Tests reduce the time spent manually verifying (and re-verifying!) that code works.
  • Tests help to ensure that code works as expected when changes are made.
  • Tests are especially useful when working in a team, as they help to ensure that everyone can trust the code.

Content from Simple Tests



Overview

Questions

  • How to write a simple test?
  • How to run the test?

Objectives

  • Write a basic test.
  • Run the test.
  • Understand its output in the terminal.

Your first test


The most basic thing you will want to do in a test is check that an output for a function is correct, by checking that it is equal to a certain value.

Let’s take the add function example from the previous chapter and the test we conceptualised for it and write it in code.

  • Make a folder called my_project (or whatever you want to call it for these lessons) and inside it, create a file called ‘calculator.py’, and another file called ‘test_calculator.py’.

So your directory structure should look like this:

BASH

project_directory/

├── calculator.py
└── test_calculator.py

calculator.py will contain our Python functions that we want to test, and test_calculator.py will contain our tests for those functions.

  • In calculator.py, write the add function:

PYTHON

def add(a, b):
  return a + b
  • And in test_calculator.py, write the test for the add function that we conceptualised in the previous lesson:

PYTHON

# Import the add function so the test can use it
from calculator import add

def test_add():
   # Check that it adds two positive integers
   if add(1, 2) != 3:
      print("Test failed!")
      raise AssertionError("Test failed!")

   # Check that it adds zero
   if add(5, 0) != 5:
      print("Test failed!")
      raise AssertionError("Test failed!")

   # Check that it adds two negative integers
   if add(-1, -2) != -3:
      print("Test failed!")
      raise AssertionError("Test failed!")

(Note that raising an AssertionError crashes the test, which is how Pytest knows that the test has failed.)

This system of placing functions in one file and the tests for those functions in another is a common pattern in software development. It allows you to keep your code organised and separate your tests from your actual code.

With Pytest, the expectation is to name your test functions with the prefix test_.

Now, let’s run the test. We can do this by running the following command in the terminal:

(make sure you’re in the my_project directory before running this command)

BASH

 pytest ./

This command tells pytest to run all the tests in the current directory.

When you run the test, you should see that the test runs successfully, indicated by some green text in the terminal. We will go through the output and what it means in the next lesson, but for now, know that green means that the test passed, and red means that the test failed.

Try changing the add function to return the wrong value, and run the test again to see that the test now fails and the text turns red - neat!

The assert keyword


Writing these if blocks for each test case is cumbersome. Fortunately, Python has a keyword to do this for us - the assert keyword.

The assert keyword checks if a statement is true and if it is, the test continues, but if it isn’t, then the test will crash, printing an error in the terminal. This enables us to write succinct tests without lots of if-statements.

The assert keyword is used like this:

PYTHON

assert add(1, 2) == 3

which is equivalent to:

PYTHON

if add(1, 2) != 3:
  # Crash the test
  raise AssertionError

Challenge 1: Use the assert keyword to update the test for the add function

Use the assert keyword to update the test for the add function to make it more concise and readable.

Then re-run the test using pytest ./ to check that it still passes.

PYTHON

from calculator import add

def test_add():
  assert add(1, 2) == 3 # Check that it adds two positive integers
  assert add(5, 0) == 5 # Check that it adds zero
  assert add(-1, -2) == -3 # Check that it adds two negative numbers

Now that we are using the assert keyword, if any of these assert statements fails, the failure is flagged to pytest, and pytest will let you know.

Make the add function return the wrong value, and run the test again to see that the test fails and the text turns red as we expect.

So if this was a real testing situation, we would know to investigate the add function to see why it’s not behaving as expected.

Challenge 2: Write a test for a multiply function

Try using what we have covered to write a test for a multiply function that multiplies two numbers together.

  • Place this multiply function in calculator.py:

PYTHON

def multiply(a, b):
  return a * b
  • Then write a test for this function in test_calculator.py. Remember to import the multiply function from calculator.py at the top of the file like this:

PYTHON

from calculator import multiply

There are many different test cases that you could include, but it’s important to check that different types of cases are covered. A test for this function could look like this:

PYTHON

def test_multiply():
  # Check that positive numbers work
  assert multiply(5, 5) == 25
  # Check that multiplying by 1 works
  assert multiply(1, 5) == 5
  # Check that multiplying by 0 works
  assert multiply(0, 3) == 0
  # Check that negative numbers work
  assert multiply(-5, 2) == -10

Run the test using pytest ./ to check that it passes. If it doesn’t, don’t worry, that’s the point of testing - to find bugs in code.

Key Points

  • The assert keyword is used to check if a statement is true and is a shorthand for writing if statements in tests.
  • Pytest is invoked by running the command pytest ./ in the terminal.
  • pytest will run all the tests in the current directory and its subdirectories, found by looking for files whose names start or end with ‘test’.
  • The output of a test is displayed in the terminal, with green text indicating a successful test and red text indicating a failed test.
  • It’s best practice to write tests in a separate file from the code they are testing. Eg: scripts.py and test_scripts.py.

Content from Interacting with Tests



Overview

Questions

  • How do I use pytest to run my tests?
  • What does the output of pytest look like and how do I interpret it?

Objectives

  • Understand how to run tests using pytest.
  • Understand how to interpret the output of pytest.

Running pytest


As we saw in the previous lesson, you can invoke pytest using the pytest terminal command. This searches within the current directory (and any sub-directories) for files that start or end with ‘test’, for example test_scripts.py or scripts_test.py. It then searches these files for tests, which are functions (or classes) with names starting with ‘test’, such as the test_add function we made in the previous lesson.

So far, we should have a file called calculator.py with an add and multiply function, and a file called test_calculator.py with test_add and test_multiply functions. If you are missing either of these, they are listed in the previous lesson.

To show off pytest’s ability to search multiple files for tests, let’s create a directory (folder) inside the current project directory called advanced where we will add some advanced calculator functionality.

  • Create a directory called advanced inside your project directory.
  • Inside this directory, create a file called advanced_calculator.py and a file called test_advanced_calculator.py.

Your project directory should now look like this:

project_directory/
│
├── calculator.py
├── test_calculator.py
│
└── advanced/
    ├── advanced_calculator.py
    └── test_advanced_calculator.py
  • In the advanced_calculator.py file, add the following code:

PYTHON

def power(value, exponent):
    """Raise a value to an exponent"""
    result = value
    for _ in range(exponent-1):
        result *= value
    return result
  • In the test_advanced_calculator.py file, add the following test:

PYTHON

from advanced_calculator import power

def test_power():
    """Test for the power function"""
    assert power(2, 3) == 8
    assert power(3, 3) == 27
  • Now run pytest in the terminal. You should see that all tests pass due to the green output.

Let’s have a closer look at the output of pytest.

Test output


When running pytest, there are usually two possible outcomes:

Case 1: All tests pass

Let’s break down the successful output in more detail.

=== test session starts ===
  • The first line tells us that pytest has started running tests.
platform darwin -- Python 3.11.0, pytest-8.1.1, pluggy-1.4.0
  • The next line just tells us the versions of several packages.
rootdir: /Users/sylvi/Documents/GitKraken/python-testing-for-research/episodes/files/03-interacting-with-tests
  • The next line tells us where the tests are being searched for. In this case, it is your project directory. So any file that starts or ends with test anywhere in this directory will be opened and searched for test functions.
plugins: regtest-2.1.1
  • This tells us what plugins are being used. In my case, I have a plugin called regtest that is being used, but you may not. This is fine and you can ignore it.
collected 3 items
  • This simply tells us that 3 tests have been found and are ready to be run.
advanced/test_advanced_calculator.py .
test_calculator.py ..    [100%]
  • These two lines tell us that the tests in advanced/test_advanced_calculator.py and test_calculator.py have passed. Each . means that a test has passed; there are two of them beside test_calculator.py because there are two tests in that file. If a test fails, it will show an F instead of a ..
=== 3 passed in 0.01s ===
  • This tells us that the 3 tests have passed in 0.01 seconds.

Case 2: Some or all tests fail

Now let’s look at the output when the tests fail. Edit a test in test_calculator.py to make it fail (for example switching the + in add to a -), then run pytest again.

The start is much the same as before:

=== test session starts ===
platform darwin -- Python 3.11.0, pytest-8.1.1, pluggy-1.4.0
rootdir: /Users/sylvi/Documents/GitKraken/python-testing-for-research/episodes/files/03-interacting-with-tests
plugins: regtest-2.1.1
collected 3 items

But now we see that the tests have failed:

advanced/test_advanced_calculator.py .    [ 33%]
test_calculator.py F.

The F tells us that a test has failed. The output then tells us which test has failed:

=== FAILURES ===

___ test_add ___
    def test_add():
        """Test for the add function"""
>       assert add(1, 2) == 3
E       assert -1 == 3
E       +  where -1 = add(1, 2)

test_calculator.py:21: AssertionError

This is where we get detailed information about what exactly broke in the test.

  • The > chevron points to the line that failed in the test. In this case, the assertion assert add(1, 2) == 3 failed.
  • The following line tells us what the assertion tried to do. In this case, it tried to assert that the number -1 was equal to 3, which of course it isn’t.
  • The next line goes into more detail about why it tried to equate -1 to 3. It tells us that -1 is the result of calling add(1, 2).
  • The final line tells us where the test failed. In this case, it was on line 21 of test_calculator.py.

Using this detailed output, we can quickly find the exact line that failed and know the inputs that caused the failure. From there, we can examine exactly what went wrong and fix it.

Finally, pytest prints out a short summary of all the failed tests:

=== short test summary info ===
FAILED test_calculator.py::test_add - assert -1 == 3
=== 1 failed, 2 passed in 0.01s ===

This tells us that one of our tests failed, and gives a short summary of what went wrong in this test and finally tells us that it took 0.01s to run the tests.

Errors in collection


If pytest encounters an error while collecting the tests, it will print out an error message and won’t run the tests. This happens when there is a syntax error in one of the test files, or if pytest can’t find the test files.

For example, if you remove the : from the end of the def test_multiply(): function definition and run pytest, you will see the following output:

=== test session starts ===
platform darwin -- Python 3.11.0, pytest-8.1.1, pluggy-1.4.0
Matplotlib: 3.9.0
Freetype: 2.6.1
rootdir: /Users/sylvi/Documents/GitKraken/python-testing-for-research/episodes/files/03-interacting-with-tests.Rmd
plugins: mpl-0.17.0, regtest-2.1.1
collected 1 item / 1 error

=== ERRORS ===
___ ERROR collecting test_calculator.py ___
...
E     File "/Users/sylvi/Documents/GitKraken/python-testing-for-research/episodes/files/03-interacting-with-tests.Rmd/test_calculator.py", line 14
E       def test_multiply()
E                          ^
E   SyntaxError: expected ':'
=== short test summary info ===
ERROR test_calculator.py
!!! Interrupted: 1 error during collection !!!
=== 1 error in 0.01s ===

This rather scary output is just telling us that there is a syntax error that needs fixing before the tests can be run.

Pytest options


Pytest has a number of options that can be used to customize how tests are run. It is very useful to know about these options as they can help you to run tests the way you want and get more information if necessary about a test run.

The verbose flag

The verbose flag -v can be used to get more detailed output from pytest. For example, running pytest -v lists every test individually by name, along with the file it is in and whether it passed or failed.

The quiet flag

The quiet flag -q can be used to get less detailed output from pytest. This is useful when you only want a brief summary of the test run. For example, running pytest -q suppresses most of the header information and condenses the results into a short summary.

Running specific tests

In order to run specific tests, you can use the -k flag followed by an expression matching the name of the test you want to run (pytest matches test names by substring). For example, to run only the test_add test, you can run pytest -k test_add. This will run the test_add test and skip the test_multiply test.

Alternatively you can call a specific test using this notation: pytest test_calculator.py::test_add. This tells pytest to only run the test_add test in the test_calculator.py file.

Stopping after the first failure

If you want pytest to stop running tests after the first failure, you can use the -x flag. This is useful when you have lots of tests that take a while to run.

Challenge - Experiment with pytest options

Try running pytest with the above options, editing the code to make the tests fail where necessary to see what happens.

  • Run pytest -v to see more detailed output.

  • Run pytest -q to see less detailed output.

  • Run pytest -k test_add to run only the test_add test.

  • Alternatively run pytest test_calculator.py::test_add to run only the test_add test.

  • Run pytest -x to stop running tests after the first failure. (Make sure you have a failing test to see this in action).

Key Points

  • You can run multiple tests at once by running pytest in the terminal.
  • Pytest searches for tests in files that start or end with ‘test’ in the current directory and subdirectories.
  • The output of pytest tells you which tests have passed and which have failed and precisely why they failed.
  • Flags such as -v, -q, -k, and -x can be used to get more detailed output, less detailed output, run specific tests, and stop running tests after the first failure, respectively.

Content from Unit tests & Testing Practices



Overview

Questions

  • What to do about complex functions & tests?
  • What are some best practices for testing?
  • How far should I go with testing?
  • How do I add tests to an existing project?

Objectives

  • Be able to write effective unit tests for more complex functions
  • Understand the AAA pattern for structuring tests
  • Understand the benefits of test driven development
  • Know how to handle randomness in tests

But what about complicated functions?


Some of the functions that you write will be more complex, resulting in tests that are very complex and hard to debug if they fail. Take this function as an example:

PYTHON

def process_data(data: list, maximum_value: float):

    # Remove negative values
    data_negative_removed = []
    for i in range(len(data)):
        if data[i] >= 0:
            data_negative_removed.append(data[i])

    # Remove values above the maximum value
    data_maximum_removed = []
    for i in range(len(data_negative_removed)):
        if data_negative_removed[i] <= maximum_value:
            data_maximum_removed.append(data_negative_removed[i])
    
    # Calculate the mean
    mean = sum(data_maximum_removed) / len(data_maximum_removed)

    # Calculate the standard deviation
    variance = sum([(x - mean) ** 2 for x in data_maximum_removed]) / len(data_maximum_removed)
    std_dev = variance ** 0.5

    return mean, std_dev

A test for this function might look like this:

PYTHON

def test_process_data():
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    maximum_value = 5
    mean, std_dev = process_data(data, maximum_value)
    assert mean == 3
    assert std_dev == 1.4142135623730951

This test is very complex and hard to debug if it fails. Imagine if the calculation of the mean broke - the test would fail, but it would not tell us which part of the function was broken, requiring us to check each part manually to find the bug. Not very efficient!

Unit Testing


The process of unit testing is a fundamental part of software development. It is where you test individual units or components of a software instead of multiple things at once. For example, if you were adding tests to a car, you would want to test the wheels, the engine, the brakes, etc. separately to make sure they all work as expected before testing that the car could drive to the shops. The goal with unit testing is to validate that each unit of the software performs as designed. A unit is the smallest testable part of your code. A unit test usually has one or a few inputs and usually a single output.

The above function could usefully be broken down into smaller functions, each of which could be tested separately. This would make the tests easier to write and maintain.

PYTHON

def remove_negative_values(data: list):
    data_negatives_removed = []
    for i in range(len(data)):
        if data[i] >= 0:
            data_negatives_removed.append(data[i])
    return data_negatives_removed

def remove_values_above_maximum(data: list, maximum_value: float):
    data_maximum_removed = []
    for i in range(len(data)):
        if data[i] <= maximum_value:
            data_maximum_removed.append(data[i])
    return data_maximum_removed

def calculate_mean(data: list):
    return sum(data) / len(data)

def calculate_std_dev(data: list):
    mean = calculate_mean(data)
    variance = sum([(x - mean) ** 2 for x in data]) / len(data)
    return variance ** 0.5

def process_data(data: list, maximum_value: float):
    # Remove negative values
    data = remove_negative_values(data)
    # Remove values above the maximum value
    data = remove_values_above_maximum(data, maximum_value)
    # Calculate the mean
    mean = calculate_mean(data)
    # Calculate the standard deviation
    std_dev = calculate_std_dev(data)
    return mean, std_dev

Now we can write tests for each of these functions separately:

PYTHON

def test_remove_negative_values():
    data = [1, -2, 3, -4, 5, -6, 7, -8, 9, -10]
    assert remove_negative_values(data) == [1, 3, 5, 7, 9]

def test_remove_values_above_maximum():
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    maximum_value = 5
    assert remove_values_above_maximum(data, maximum_value) == [1, 2, 3, 4, 5]

def test_calculate_mean():
    data = [1, 2, 3, 4, 5]
    assert calculate_mean(data) == 3

def test_calculate_std_dev():
    data = [1, 2, 3, 4, 5]
    assert calculate_std_dev(data) == 1.4142135623730951

def test_process_data():
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    maximum_value = 5
    mean, std_dev = process_data(data, maximum_value)
    assert mean == 3
    assert std_dev == 1.4142135623730951

These tests are much easier to read and understand, and if one of them fails, it is much easier to see which part of the function is broken. This is the principle of unit testing: breaking down complex functions into smaller, testable units.

AAA pattern


When writing tests, it is a good idea to follow the AAA pattern:

  • Arrange: Set up the data and the conditions for the test
  • Act: Perform the action that you are testing
  • Assert: Check that the result of the action is what you expect

It is a standard pattern in unit testing and is used in many testing frameworks. This makes your tests easier to read and understand for both yourself and others reading your code.

PYTHON

def test_calculate_mean():
    # Arrange
    data = [1, 2, 3, 4, 5]
    
    # Act
    mean = calculate_mean(data)
    
    # Assert
    assert mean == 3

Test Driven Development (TDD)


Test Driven Development (TDD) is a software development process that focuses on writing tests before writing the code. This can have several benefits:

  • It forces you to think about the requirements of the code before you write it, which is especially useful in research.
  • It can help you to write cleaner, more modular code by breaking down complex functions into smaller, testable units.
  • It can help you to catch bugs early in the development process.

Without the test driven development process, you might write the code first and then try to write tests for it afterwards. This can lead to tests that are hard to write and maintain, and can result in bugs that are hard to find and fix.

The TDD process usually follows these steps:

  1. Write a failing test
  2. Write the minimum amount of code to make the test pass
  3. Refactor the code to make it clean and maintainable

Here is an example of the TDD process:

  1. Write a failing test

PYTHON


def test_calculate_mean():
    # Arrange
    data = [1, 2, 3, 4, 5]
    
    # Act
    mean = calculate_mean(data)
    
    # Assert
    assert mean == 3
  2. Write the minimum amount of code to make the test pass

PYTHON

def calculate_mean(data: list):
    total = 0
    for i in range(len(data)):
        total += data[i]
    mean = total / len(data)
    return mean
  3. Refactor the code to make it clean and maintainable

PYTHON

def calculate_mean(data: list):
    if len(data) == 0:
        return 0
    return sum(data) / len(data)

This process can help you to write clean, maintainable code that is easy to test and debug.

Of course, in research, sometimes you might not know exactly what the requirements of the code are before you write it. In this case, you can still use the TDD process, but you might need to iterate on the tests and the code as you learn more about the problem you are trying to solve.

Randomness in tests


Some functions use randomness, which you might assume means we cannot write tests for them. However, by using random seeds, we can make this randomness deterministic and write tests for these functions.

PYTHON

import random

def random_number():
    return random.randint(1, 10)

def test_random_number():
    random.seed(0)
    first_run = [random_number() for _ in range(3)]
    # Re-seeding with the same value resets the generator, so the
    # second run reproduces the exact same sequence
    random.seed(0)
    second_run = [random_number() for _ in range(3)]
    assert first_run == second_run

Random seeds work by setting the initial state of the random number generator. This means that if you set the seed to the same value, you will get the same sequence of random numbers each time you run the function.
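The same principle applies to numpy's random generators (an aside; the lesson itself only uses the random module):

```python
import numpy as np

# Generators built from the same seed yield identical draws
rng1 = np.random.default_rng(123)
rng2 = np.random.default_rng(123)

draws1 = rng1.integers(1, 10, size=5)
draws2 = rng2.integers(1, 10, size=5)
assert draws1.tolist() == draws2.tolist()
```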

Challenge: Write your own unit tests

Take this complex function, break it down and write unit tests for it.

  • Create a new directory called statistics in your project directory
  • Create a new file called stats.py in the statistics directory
  • Write the following function in stats.py:

PYTHON

import random

def randomly_sample_and_filter_participants(
    participants: list, 
    sample_size: int, 
    min_age: int, 
    max_age: int, 
    min_height: int, 
    max_height: int
):
    """Participants is a list of tuples, containing the age and height of each participant
    participants = [
                      {age: 25, height: 180}, 
                      {age: 30, height: 170}, 
                      {age: 35, height: 160}, 
    ]
    """
    
    # Get the indexes to sample
    indexes = random.sample(range(len(participants)), sample_size)

    # Get the sampled participants
    sampled_participants = []
    for i in indexes:
        sampled_participants.append(participants[i])
    
    # Remove participants that are outside the age range
    sampled_participants_age_filtered = []
    for participant in sampled_participants:
        if participant['age'] >= min_age and participant['age'] <= max_age:
            sampled_participants_age_filtered.append(participant)
    
    # Remove participants that are outside the height range
    sampled_participants_height_filtered = []
    for participant in sampled_participants_age_filtered:
        if participant['height'] >= min_height and participant['height'] <= max_height:
            sampled_participants_height_filtered.append(participant)

    return sampled_participants_height_filtered
  • Create a new file called test_stats.py in the statistics directory
  • Write unit tests for the randomly_sample_and_filter_participants function in test_stats.py

The function can be broken down into smaller functions, each of which can be tested separately:

PYTHON

import random

def sample_participants(
    participants: list, 
    sample_size: int
):
    indexes = random.sample(range(len(participants)), sample_size)
    sampled_participants = []
    for i in indexes:
        sampled_participants.append(participants[i])
    return sampled_participants

def filter_participants_by_age(
    participants: list, 
    min_age: int, 
    max_age: int
):
    filtered_participants = []
    for participant in participants:
        if participant['age'] >= min_age and participant['age'] <= max_age:
            filtered_participants.append(participant)
    return filtered_participants

def filter_participants_by_height(
    participants: list, 
    min_height: int, 
    max_height: int
):
    filtered_participants = []
    for participant in participants:
        if participant['height'] >= min_height and participant['height'] <= max_height:
            filtered_participants.append(participant)
    return filtered_participants

def randomly_sample_and_filter_participants(
    participants: list, 
    sample_size: int, 
    min_age: int, 
    max_age: int, 
    min_height: int, 
    max_height: int
):
    sampled_participants = sample_participants(participants, sample_size)
    age_filtered_participants = filter_participants_by_age(sampled_participants, min_age, max_age)
    height_filtered_participants = filter_participants_by_height(age_filtered_participants, min_height, max_height)
    return height_filtered_participants

Now we can write tests for each of these functions separately, remembering to set the random seed to make the randomness deterministic:

PYTHON

import random

def test_sample_participants():
    # set random seed
    random.seed(0)

    participants = [
        {'age': 25, 'height': 180},
        {'age': 30, 'height': 170},
        {'age': 35, 'height': 160},
    ]
    sample_size = 2
    sampled_participants = sample_participants(participants, sample_size)
    expected = [{'age': 30, 'height': 170}, {'age': 35, 'height': 160}]
    assert sampled_participants == expected

def test_filter_participants_by_age():
    participants = [
        {'age': 25, 'height': 180},
        {'age': 30, 'height': 170},
        {'age': 35, 'height': 160},
    ]
    min_age = 30
    max_age = 35
    filtered_participants = filter_participants_by_age(participants, min_age, max_age)
    expected = [{'age': 30, 'height': 170}, {'age': 35, 'height': 160}]
    assert filtered_participants == expected

def test_filter_participants_by_height():
    participants = [
        {'age': 25, 'height': 180},
        {'age': 30, 'height': 170},
        {'age': 35, 'height': 160},
    ]
    min_height = 160
    max_height = 170
    filtered_participants = filter_participants_by_height(participants, min_height, max_height)
    expected = [{'age': 30, 'height': 170}, {'age': 35, 'height': 160}]
    assert filtered_participants == expected

def test_randomly_sample_and_filter_participants():
    # set random seed
    random.seed(0)

    participants = [
        {"age": 25, "height": 180},
        {"age": 30, "height": 170},
        {"age": 35, "height": 160},
        {"age": 38, "height": 165},
        {"age": 40, "height": 190},
        {"age": 45, "height": 200},
    ]
    sample_size = 5
    min_age = 28
    max_age = 42
    min_height = 159
    max_height = 172
    filtered_participants = randomly_sample_and_filter_participants(
        participants, sample_size, min_age, max_age, min_height, max_height
    )
    expected = [{"age": 38, "height": 165}, {"age": 30, "height": 170}, {"age": 35, "height": 160}]
    assert filtered_participants == expected

These tests are much easier to read and understand, and if one of them fails, it is much easier to see which part of the function is broken.

Adding tests to an existing project


You may have an existing project that does not have any tests yet. Adding tests to an existing project can be daunting, and it can be hard to know where to start.

In general, it’s a good idea to start by adding regression tests to your most important functions. Regression tests are tests that simply check that the output of a function doesn’t change when you make changes to the code. They don’t check the individual components of the functions like unit testing does.

For example, if you had a long processing pipeline that returns a single number, say 23, for a certain set of inputs, you could write a regression test that checks the output is still 23 after you make changes to the code.
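A minimal sketch of such a regression test, using a hypothetical run_pipeline stand-in for the real pipeline:

```python
def run_pipeline(inputs):
    # Stand-in for a long processing pipeline (hypothetical)
    return sum(inputs) // len(inputs)

def test_pipeline_regression():
    # Pin the known-good output for a fixed set of inputs; if a later
    # change to the pipeline alters this number, the test will fail
    assert run_pipeline([10, 20, 30, 40]) == 25
```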

After adding regression tests, you can start adding unit tests to the individual functions in your code, starting with the more commonly used / likely to break functions such as ones that handle data processing or input/output.

Should we aim for 100% test coverage?


Although tests add reliability to your code, it is not always practical to spend a large share of development time writing them. When time is limited, it is often better to write tests for the most critical parts of the code rather than rigorously testing every function.

You should discuss with your team how much of the code you think should be tested, and what the most critical parts of the code are in order to prioritize your time.

Key Points

  • Complex functions can be broken down into smaller, testable units.
  • Testing each unit separately is called unit testing.
  • The AAA pattern is a good way to structure your tests.
  • Test driven development can help you to write clean, maintainable code.
  • Randomness in tests can be made deterministic using random seeds.
  • Adding tests to an existing project can be done incrementally, starting with regression tests.

Content from Testing for Exceptions


Last updated on 2024-12-19 | Edit this page

Overview

Questions

  • How do you check that a function raises an exception?

Objectives

  • Learn how to test exceptions using pytest.raises.

What to do about code that raises exceptions?


Sometimes you will want to make sure that a function raises an exception when it should. For example, you might want to check that a function raises a ValueError when it receives an invalid input.

Take this example of the square_root function. We don’t have time to implement complex numbers yet, so we raise a ValueError when the input is negative, causing any program that tries to compute the square root of a negative number to fail with a clear error.

PYTHON


def square_root(x):
   if x < 0:
      raise ValueError("Cannot compute square root of negative number yet!")
   return x ** 0.5

We can test that the function raises an exception using pytest.raises as follows:

PYTHON

import pytest

from advanced.advanced_calculator import square_root

def test_square_root():
    with pytest.raises(ValueError):
        square_root(-1)

Here, pytest.raises is a context manager that checks that the code inside the with block raises a ValueError exception. If it doesn’t, the test fails.

If you want to be more thorough, you can also test what the error message says:

PYTHON


def test_square_root():
    with pytest.raises(ValueError) as e:
        square_root(-1)
    assert str(e.value) == "Cannot compute square root of negative number yet!"
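pytest also lets you check the message in one step with the match argument, which matches the message against a regular expression:

```python
import pytest

def square_root(x):
    if x < 0:
        raise ValueError("Cannot compute square root of negative number yet!")
    return x ** 0.5

def test_square_root_message():
    # match= is a regex searched within the exception message
    with pytest.raises(ValueError, match="Cannot compute square root"):
        square_root(-1)
```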

Challenge: Ensure that the divide function raises a ZeroDivisionError when the denominator is zero.

  • Add a divide function to calculator.py:

PYTHON


def divide(numerator, denominator):
    if denominator == 0:
        raise ZeroDivisionError("Cannot divide by zero!")
    return numerator / denominator
  • Write a test in test_calculator.py that checks that the divide function raises a ZeroDivisionError when the denominator is zero.

PYTHON

import pytest

from calculator import divide

def test_divide():
    with pytest.raises(ZeroDivisionError):
        divide(1, 0)

Key Points

  • Use pytest.raises to check that a function raises an exception.

Content from Testing Data Structures


Last updated on 2024-12-19 | Edit this page

Overview

Questions

  • How do you compare data structures such as lists and dictionaries?
  • How do you compare objects in libraries like pandas and numpy?

Objectives

  • Learn how to compare lists and dictionaries in Python.
  • Learn how to compare objects in libraries like pandas and numpy.

Data structures


When writing tests for your code, you often need to compare data structures such as lists, dictionaries, and objects from libraries like numpy and pandas. Here we will go over some of the more common data structures that you may use in research and how to test them.

Lists

Python lists can be tested using the usual == operator as we do for numbers.

PYTHON


def test_lists_equal():
    """Test that lists are equal"""
    # Create two lists
    list1 = [1, 2, 3]
    list2 = [1, 2, 3]
    # Check that the lists are equal
    assert list1 == list2

    # Two lists, different order
    list3 = [1, 2, 3]
    list4 = [3, 2, 1]
    assert list3 != list4

    # Create two different lists
    list5 = [1, 2, 3]
    list6 = [1, 2, 4]
    # Check that the lists are not equal
    assert list5 != list6

Note that the order of elements in the list matters. If you want to check that two lists contain the same elements but in different order, you can use the sorted function.

PYTHON

def test_sorted_lists_equal():
    """Test that lists are equal"""
    # Create two lists
    list1 = [1, 2, 3]
    list2 = [1, 2, 3]
    # Check that the lists are equal
    assert sorted(list1) == sorted(list2)

    # Two lists, different order
    list3 = [1, 2, 3]
    list4 = [3, 2, 1]
    assert sorted(list3) == sorted(list4)

    # Create two different lists
    list5 = [1, 2, 3]
    list6 = [1, 2, 4]
    # Check that the lists are not equal
    assert sorted(list5) != sorted(list6)

Dictionaries

Python dictionaries can also be tested using the == operator, however, the order of the keys does not matter. This means that if you have two dictionaries with the same keys and values, but in different order, they will still be considered equal.

The reason for this is that dictionary equality compares key-value pairs without regard to order. (Since Python 3.7, dictionaries do preserve insertion order, but == still ignores it; if you need order-sensitive equality, you can use the collections.OrderedDict class.)

PYTHON

def test_dictionaries_equal():
    """Test that dictionaries are equal"""
    # Create two dictionaries
    dict1 = {"a": 1, "b": 2, "c": 3}
    dict2 = {"a": 1, "b": 2, "c": 3}
    # Check that the dictionaries are equal
    assert dict1 == dict2

    # Create two dictionaries, different order
    dict3 = {"a": 1, "b": 2, "c": 3}
    dict4 = {"c": 3, "b": 2, "a": 1}
    assert dict3 == dict4

    # Create two different dictionaries
    dict5 = {"a": 1, "b": 2, "c": 3}
    dict6 = {"a": 1, "b": 2, "c": 4}
    # Check that the dictionaries are not equal
    assert dict5 != dict6
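As an aside, if key order does matter for your comparison, collections.OrderedDict is order-sensitive when compared against another OrderedDict:

```python
from collections import OrderedDict

d1 = OrderedDict([("a", 1), ("b", 2)])
d2 = OrderedDict([("b", 2), ("a", 1)])

assert d1 != d2                 # OrderedDict vs OrderedDict: order matters
assert d1 == {"b": 2, "a": 1}   # OrderedDict vs plain dict: order is ignored
```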

numpy

Numpy is a common library used in research. Instead of the usual assert a == b, numpy has its own testing functions that are better suited to comparing numpy arrays. These three functions are the ones you are most likely to use:

  • numpy.testing.assert_array_equal compares two numpy arrays for exact equality.
  • numpy.testing.assert_allclose compares two numpy arrays with a tolerance for floating point numbers.
  • numpy.testing.assert_equal compares objects such as lists or dictionaries that contain numpy arrays.

Here are some examples of how to use these functions:

PYTHON


import numpy as np

def test_numpy_arrays():
    """Test that numpy arrays are equal"""
    # Create two numpy arrays
    array1 = np.array([1, 2, 3])
    array2 = np.array([1, 2, 3])
    # Check that the arrays are equal
    np.testing.assert_array_equal(array1, array2)

# Note that np.testing.assert_array_equal even works with nested numpy arrays!

def test_nested_numpy_arrays():
    """Test that nested numpy arrays are equal"""
    # Create two nested numpy arrays
    array1 = np.array([[1, 2], [3, 4]])
    array2 = np.array([[1, 2], [3, 4]])
    # Check that the nested arrays are equal
    np.testing.assert_array_equal(array1, array2)

def test_numpy_arrays_with_tolerance():
    """Test that numpy arrays are equal with tolerance"""
    # Create two numpy arrays
    array1 = np.array([1.0, 2.0, 3.0])
    array2 = np.array([1.00009, 2.0005, 3.0001])
    # Check that the arrays are equal with tolerance
    np.testing.assert_allclose(array1, array2, atol=1e-3)
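To see why a plain assert does not work here: == on numpy arrays is element-wise, so the result is an array rather than a single boolean:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

comparison = a == b       # element-wise: array([ True,  True,  True])
assert comparison.all()   # collapse to a single boolean explicitly

# A bare `assert a == b` would instead raise:
# ValueError: The truth value of an array with more than one element is ambiguous.
```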

Data structures with numpy arrays

When you have data structures that contain numpy arrays, such as lists or dictionaries, you cannot use == to compare them. Instead, you can use numpy.testing.assert_equal to compare the data structures.

PYTHON

import numpy as np

def test_dictionaries_with_numpy_arrays():
    """Test that dictionaries with numpy arrays are equal"""
    # Create two dictionaries with numpy arrays
    dict1 = {"a": np.array([1, 2, 3]), "b": np.array([4, 5, 6])}
    dict2 = {"a": np.array([1, 2, 3]), "b": np.array([4, 5, 6])}
    # Check that the dictionaries are equal
    np.testing.assert_equal(dict1, dict2)

pandas

Pandas is another common library used in research for storing and manipulating datasets. Pandas has its own testing functions that are better suited to comparing pandas objects. These two functions are the ones you are most likely to use:

  • pandas.testing.assert_frame_equal compares two pandas DataFrames.
  • pandas.testing.assert_series_equal compares two pandas Series.

Here are some examples of how to use these functions:

PYTHON


import pandas as pd

def test_pandas_dataframes():
    """Test that pandas DataFrames are equal"""
    # Create two pandas DataFrames
    df1 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
    df2 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
    # Check that the DataFrames are equal
    pd.testing.assert_frame_equal(df1, df2)

def test_pandas_series():
    """Test that pandas Series are equal"""
    # Create two pandas Series
    s1 = pd.Series([1, 2, 3])
    s2 = pd.Series([1, 2, 3])
    # Check that the Series are equal
    pd.testing.assert_series_equal(s1, s2)
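Note that these pandas helpers check dtypes by default; the check_dtype parameter (available on both helpers) relaxes that when your code legitimately changes, say, integers to floats:

```python
import pandas as pd

s_int = pd.Series([1, 2, 3])          # dtype int64
s_float = pd.Series([1.0, 2.0, 3.0])  # dtype float64

# Same values, different dtypes: this passes only because
# dtype checking is disabled
pd.testing.assert_series_equal(s_int, s_float, check_dtype=False)
```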

Checking if lists are equal

In statistics/stats.py add this function to remove anomalies from a list:

PYTHON

def remove_anomalies(data: list, maximum_value: float, minimum_value: float) -> list:
    """Remove anomalies from a list of numbers"""

    result = []

    for i in data:
        if i <= maximum_value and i >= minimum_value:
            result.append(i)
    
    return result

Then write a test for this function by comparing lists.

PYTHON

from stats import remove_anomalies

def test_remove_anomalies():
    """Test remove_anomalies function"""
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    maximum_value = 5
    minimum_value = 2
    expected_result = [2, 3, 4, 5]
    assert remove_anomalies(data, maximum_value, minimum_value) == expected_result

Checking if dictionaries are equal

In statistics/stats.py add this function to calculate the frequency of each element in a list:

PYTHON

def calculate_frequency(data: list) -> dict:
    """Calculate the frequency of each element in a list"""

    frequencies = {}

    # Iterate over each value in the list
    for value in data:
        # If the value is already in the dictionary, increment the count
        if value in frequencies:
            frequencies[value] += 1
        # Otherwise, add the value to the dictionary with a count of 1
        else:
            frequencies[value] = 1

    return frequencies

Then write a test for this function by comparing dictionaries.

PYTHON

from stats import calculate_frequency

def test_calculate_frequency():
    """Test calculate_frequency function"""
    data = [1, 2, 3, 1, 2, 1, 1, 3, 3, 3]
    expected_result = {1: 4, 2: 2, 3: 4}
    assert calculate_frequency(data) == expected_result
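As an aside, the standard library's collections.Counter computes the same frequencies and compares equal to a plain dict, so it is handy for cross-checking your implementation:

```python
from collections import Counter

data = [1, 2, 3, 1, 2, 1, 1, 3, 3, 3]
# Counter subclasses dict, so it compares equal to a plain dict of counts
assert Counter(data) == {1: 4, 2: 2, 3: 4}
```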

Checking if numpy arrays are equal

In statistics/stats.py add this function to calculate the cumulative sum of a numpy array:

PYTHON

import numpy as np

def calculate_cumulative_sum(array: np.ndarray) -> np.ndarray:
    """Calculate the cumulative sum of a numpy array"""
    
    # don't use the built-in numpy function
    result = np.zeros(array.shape)
    result[0] = array[0]
    for i in range(1, len(array)):
        result[i] = result[i-1] + array[i]

    return result

Then write a test for this function by comparing numpy arrays.

PYTHON

import numpy as np
from stats import calculate_cumulative_sum

def test_calculate_cumulative_sum():
    """Test calculate_cumulative_sum function"""
    array = np.array([1, 2, 3, 4, 5])
    expected_result = np.array([1, 3, 6, 10, 15])
    np.testing.assert_array_equal(calculate_cumulative_sum(array), expected_result)

Checking if data structures with numpy arrays are equal

In statistics/stats.py add this function to calculate the total score of each player in a dictionary:

PYTHON


import numpy as np

def calculate_player_total_scores(participants: dict):
    """Calculate the total score of each player in a dictionary.
    
    Example input:
    {
        "Alice": {
            "scores": np.array([1, 2, 3])
        },
        "Bob": {
            "scores": np.array([4, 5, 6])
        },
        "Charlie": {
            "scores": np.array([7, 8, 9])
        },
    }

    Example output:
    {
        "Alice": {
            "scores": np.array([1, 2, 3]),
            "total_score": 6
        },
        "Bob": {
            "scores": np.array([4, 5, 6]),
            "total_score": 15
        },
        "Charlie": {
            "scores": np.array([7, 8, 9]),
            "total_score": 24
        },
    }
    """"
    
    for player in participants:
        participants[player]["total_score"] = np.sum(participants[player]["scores"])

    return participants

Then write a test for this function by comparing dictionaries with numpy arrays.

PYTHON

import numpy as np
from stats import calculate_player_total_scores

def test_calculate_player_total_scores():
    """Test calculate_player_total_scores function"""
    participants = {
        "Alice": {
            "scores": np.array([1, 2, 3])
        },
        "Bob": {
            "scores": np.array([4, 5, 6])
        },
        "Charlie": {
            "scores": np.array([7, 8, 9])
        },
    }
    expected_result = {
        "Alice": {
            "scores": np.array([1, 2, 3]),
            "total_score": 6
        },
        "Bob": {
            "scores": np.array([4, 5, 6]),
            "total_score": 15
        },
        "Charlie": {
            "scores": np.array([7, 8, 9]),
            "total_score": 24
        },
    }
    np.testing.assert_equal(calculate_player_total_scores(participants), expected_result)

Checking if pandas DataFrames are equal

In statistics/stats.py add this function to calculate the average score of each player in a pandas DataFrame:

PYTHON

import pandas as pd

def calculate_player_average_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Calculate the average score of each player in a pandas DataFrame.
    
    Example input:
    |   | player  | score_1 | score_2 |
    |---|---------|---------|---------|
    | 0 | Alice   | 1       | 2       |
    | 1 | Bob     | 3       | 4       |

    Example output:
    |   | player  | score_1 | score_2 | average_score |
    |---|---------|---------|---------|---------------|
    | 0 | Alice   | 1       | 2       | 1.5           |
    | 1 | Bob     | 3       | 4       | 3.5           |
    """

    df["average_score"] = df[["score_1", "score_2"]].mean(axis=1)

    return df

Then write a test for this function by comparing pandas DataFrames.

Hint: You can create a dataframe like this:

PYTHON

df = pd.DataFrame({
    "player": ["Alice", "Bob"],
    "score_1": [1, 3],
    "score_2": [2, 4]
})

PYTHON

import pandas as pd
from stats import calculate_player_average_scores

def test_calculate_player_average_scores():
    """Test calculate_player_average_scores function"""
    df = pd.DataFrame({
        "player": ["Alice", "Bob"],
        "score_1": [1, 3],
        "score_2": [2, 4]
    })
    expected_result = pd.DataFrame({
        "player": ["Alice", "Bob"],
        "score_1": [1, 3],
        "score_2": [2, 4],
        "average_score": [1.5, 3.5]
    })
    pd.testing.assert_frame_equal(calculate_player_average_scores(df), expected_result)

Key Points

  • You can test equality of lists and dictionaries using the == operator.
  • Numpy arrays should not be compared with the == operator, which works element-wise. Instead, use numpy.testing.assert_array_equal and numpy.testing.assert_allclose.
  • Data structures that contain numpy arrays should be compared using numpy.testing.assert_equal.
  • Pandas DataFrames and Series should be compared using pandas.testing.assert_frame_equal and pandas.testing.assert_series_equal.

Content from Fixtures


Last updated on 2024-12-19 | Edit this page

Overview

Questions

  • How to reuse data and objects in tests?

Objectives

  • Learn how to use fixtures to store data and objects for use in tests.

Repetitiveness in tests


When writing more complex tests, you may find that you need to reuse data or objects across multiple tests.

Here is an example of a set of tests that reuse the same data a lot. We have a class, Point, that represents a point in 2D space, and a few tests that check the behaviour of the class. Notice how we have to repeat the exact same setup code in each test.

PYTHON


class Point:
   def __init__(self, x, y):
      self.x = x
      self.y = y

   def distance_from_origin(self):
      return (self.x ** 2 + self.y ** 2) ** 0.5

   def move(self, dx, dy):
      self.x += dx
      self.y += dy
   
   def reflect_over_x(self):
      self.y = -self.y

   def reflect_over_y(self):
      self.x = -self.x

PYTHON


def test_distance_from_origin():
   # Positive coordinates
   point_positive_coords = Point(3, 4)
   # Negative coordinates
   point_negative_coords = Point(-3, -4)
   # Mix of positive and negative coordinates
   point_mixed_coords = Point(-3, 4)

   assert point_positive_coords.distance_from_origin() == 5.0
   assert point_negative_coords.distance_from_origin() == 5.0
   assert point_mixed_coords.distance_from_origin() == 5.0

def test_move():
   # Repeated setup again...

   # Positive coordinates
   point_positive_coords = Point(3, 4)
   # Negative coordinates
   point_negative_coords = Point(-3, -4)
   # Mix of positive and negative coordinates
   point_mixed_coords = Point(-3, 4)


   # Test logic
   point_positive_coords.move(2, -1)
   point_negative_coords.move(2, -1)
   point_mixed_coords.move(2, -1)

   assert point_positive_coords.x == 5
   assert point_positive_coords.y == 3
   assert point_negative_coords.x == -1
   assert point_negative_coords.y == -5
   assert point_mixed_coords.x == -1
   assert point_mixed_coords.y == 3

def test_reflect_over_x():
   # Yet another setup repetition

   # Positive coordinates
   point_positive_coordinates = Point(3, 4)
   # Negative coordinates
   point_negative_coordinates = Point(-3, -4)
   # Mix of positive and negative coordinates
   point_mixed_coordinates = Point(-3, 4)

   # Test logic
   point_positive_coordinates.reflect_over_x()
   point_negative_coordinates.reflect_over_x()
   point_mixed_coordinates.reflect_over_x()

   assert point_positive_coordinates.x == 3
   assert point_positive_coordinates.y == -4
   assert point_negative_coordinates.x == -3
   assert point_negative_coordinates.y == 4
   assert point_mixed_coordinates.x == -3
   assert point_mixed_coordinates.y == -4


def test_reflect_over_y():
   # One more time...

   # Positive coordinates
   point_positive_coordinates = Point(3, 4)
   # Negative coordinates
   point_negative_coordinates = Point(-3, -4)
   # Mix of positive and negative coordinates
   point_mixed_coordinates = Point(-3, 4)

   # Test logic
   point_positive_coordinates.reflect_over_y()
   point_negative_coordinates.reflect_over_y()
   point_mixed_coordinates.reflect_over_y()

   assert point_positive_coordinates.x == -3
   assert point_positive_coordinates.y == 4
   assert point_negative_coordinates.x == 3
   assert point_negative_coordinates.y == -4
   assert point_mixed_coordinates.x == 3
   assert point_mixed_coordinates.y == 4

Fixtures


Pytest provides a way to store data and objects for use in tests - fixtures.

Fixtures are simply functions that return a value, and can be used in tests by passing them as arguments. Pytest magically knows that any test that requires a fixture as an argument should run the fixture function first, and pass the result to the test.

Fixtures are defined using the @pytest.fixture decorator. (Don’t worry if you are not aware of decorators, they are just ways of flagging functions to do something special - in this case, to let pytest know that this function is a fixture.)

Here is a very simple fixture to demonstrate this:

PYTHON

import pytest

@pytest.fixture
def my_fixture():
   return "Hello, world!"

def test_my_fixture(my_fixture):
   assert my_fixture == "Hello, world!"

Here, Pytest will notice that my_fixture is a fixture due to the @pytest.fixture decorator, and will run my_fixture, then pass the result into test_my_fixture.

Now let’s see how we can improve the tests for the Point class using fixtures:

PYTHON

import pytest

@pytest.fixture
def point_positive_3_4():
   return Point(3, 4)

@pytest.fixture
def point_negative_3_4():
   return Point(-3, -4)

@pytest.fixture
def point_mixed_3_4():
   return Point(-3, 4)

def test_distance_from_origin(point_positive_3_4, point_negative_3_4, point_mixed_3_4):
   assert point_positive_3_4.distance_from_origin() == 5.0
   assert point_negative_3_4.distance_from_origin() == 5.0
   assert point_mixed_3_4.distance_from_origin() == 5.0

def test_move(point_positive_3_4, point_negative_3_4, point_mixed_3_4):
   point_positive_3_4.move(2, -1)
   point_negative_3_4.move(2, -1)
   point_mixed_3_4.move(2, -1)

   assert point_positive_3_4.x == 5
   assert point_positive_3_4.y == 3
   assert point_negative_3_4.x == -1
   assert point_negative_3_4.y == -5
   assert point_mixed_3_4.x == -1
   assert point_mixed_3_4.y == 3

def test_reflect_over_x(point_positive_3_4, point_negative_3_4, point_mixed_3_4):
   point_positive_3_4.reflect_over_x()
   point_negative_3_4.reflect_over_x()
   point_mixed_3_4.reflect_over_x()

   assert point_positive_3_4.x == 3
   assert point_positive_3_4.y == -4
   assert point_negative_3_4.x == -3
   assert point_negative_3_4.y == 4
   assert point_mixed_3_4.x == -3
   assert point_mixed_3_4.y == -4

def test_reflect_over_y(point_positive_3_4, point_negative_3_4, point_mixed_3_4):
   point_positive_3_4.reflect_over_y()
   point_negative_3_4.reflect_over_y()
   point_mixed_3_4.reflect_over_y()

   assert point_positive_3_4.x == -3
   assert point_positive_3_4.y == 4
   assert point_negative_3_4.x == 3
   assert point_negative_3_4.y == -4
   assert point_mixed_3_4.x == 3
   assert point_mixed_3_4.y == 4

With the setup code defined in the fixtures, the tests are more concise and it won’t take as much effort to add more tests in the future.

Challenge : Write your own fixture

In the unit testing lesson, we wrote several tests for sampling & filtering data. We turned a complex function into a properly unit tested set of functions, which greatly improved the readability and maintainability of the code; however, we had to repeat the same setup code in each test.

Code:

PYTHON

def sample_participants(participants: list, sample_size: int):
    indexes = random.sample(range(len(participants)), sample_size)
    sampled_participants = []
    for i in indexes:
        sampled_participants.append(participants[i])
    return sampled_participants


def filter_participants_by_age(participants: list, min_age: int, max_age: int):
    filtered_participants = []
    for participant in participants:
        if participant["age"] >= min_age and participant["age"] <= max_age:
            filtered_participants.append(participant)
    return filtered_participants


def filter_participants_by_height(participants: list, min_height: int, max_height: int):
    filtered_participants = []
    for participant in participants:
        if participant["height"] >= min_height and participant["height"] <= max_height:
            filtered_participants.append(participant)
    return filtered_participants


def randomly_sample_and_filter_participants(
    participants: list, sample_size: int, min_age: int, max_age: int, min_height: int, max_height: int
):
    sampled_participants = sample_participants(participants, sample_size)
    age_filtered_participants = filter_participants_by_age(sampled_participants, min_age, max_age)
    height_filtered_participants = filter_participants_by_height(age_filtered_participants, min_height, max_height)
    return height_filtered_participants

Tests:

PYTHON

import random
from stats import sample_participants, filter_participants_by_age, filter_participants_by_height, randomly_sample_and_filter_participants

def test_sample_participants():
    # set random seed
    random.seed(0)

    participants = [
        {"age": 25, "height": 180},
        {"age": 30, "height": 170},
        {"age": 35, "height": 160},
        {"age": 38, "height": 165},
        {"age": 40, "height": 190},
        {"age": 45, "height": 200},
    ]
    sample_size = 2
    sampled_participants = sample_participants(participants, sample_size)
    expected = [{"age": 38, "height": 165}, {"age": 45, "height": 200}]
    assert sampled_participants == expected


def test_filter_participants_by_age():
    participants = [
        {"age": 25, "height": 180},
        {"age": 30, "height": 170},
        {"age": 35, "height": 160},
        {"age": 38, "height": 165},
        {"age": 40, "height": 190},
        {"age": 45, "height": 200},
    ]
    min_age = 30
    max_age = 35
    filtered_participants = filter_participants_by_age(participants, min_age, max_age)
    expected = [{"age": 30, "height": 170}, {"age": 35, "height": 160}]
    assert filtered_participants == expected


def test_filter_participants_by_height():
    participants = [
        {"age": 25, "height": 180},
        {"age": 30, "height": 170},
        {"age": 35, "height": 160},
        {"age": 38, "height": 165},
        {"age": 40, "height": 190},
        {"age": 45, "height": 200},
    ]
    min_height = 160
    max_height = 170
    filtered_participants = filter_participants_by_height(participants, min_height, max_height)
    expected = [{"age": 30, "height": 170}, {"age": 35, "height": 160}, {"age": 38, "height": 165}]
    assert filtered_participants == expected


def test_randomly_sample_and_filter_participants():
    # set random seed
    random.seed(0)

    participants = [
        {"age": 25, "height": 180},
        {"age": 30, "height": 170},
        {"age": 35, "height": 160},
        {"age": 38, "height": 165},
        {"age": 40, "height": 190},
        {"age": 45, "height": 200},
    ]
    sample_size = 5
    min_age = 28
    max_age = 42
    min_height = 159
    max_height = 172
    filtered_participants = randomly_sample_and_filter_participants(
        participants, sample_size, min_age, max_age, min_height, max_height
    )
    expected = [{"age": 38, "height": 165}, {"age": 30, "height": 170}, {"age": 35, "height": 160}]
    assert filtered_participants == expected

  • Try making these tests more concise by creating a fixture for the input data.

PYTHON

import pytest

@pytest.fixture
def participants():
    return [
        {"age": 25, "height": 180},
        {"age": 30, "height": 170},
        {"age": 35, "height": 160},
        {"age": 38, "height": 165},
        {"age": 40, "height": 190},
        {"age": 45, "height": 200},
    ]

def test_sample_participants(participants):
   # set random seed
   random.seed(0)

   sample_size = 2
   sampled_participants = sample_participants(participants, sample_size)
   expected = [{"age": 38, "height": 165}, {"age": 45, "height": 200}]
   assert sampled_participants == expected

def test_filter_participants_by_age(participants):
   min_age = 30
   max_age = 35
   filtered_participants = filter_participants_by_age(participants, min_age, max_age)
   expected = [{"age": 30, "height": 170}, {"age": 35, "height": 160}]
   assert filtered_participants == expected

def test_filter_participants_by_height(participants):
   min_height = 160
   max_height = 170
   filtered_participants = filter_participants_by_height(participants, min_height, max_height)
   expected = [{"age": 30, "height": 170}, {"age": 35, "height": 160}, {"age": 38, "height": 165}]
   assert filtered_participants == expected

def test_randomly_sample_and_filter_participants(participants):
   # set random seed
   random.seed(0)

   sample_size = 5
   min_age = 28
   max_age = 42
   min_height = 159
   max_height = 172
   filtered_participants = randomly_sample_and_filter_participants(
       participants, sample_size, min_age, max_age, min_height, max_height
   )
   expected = [{"age": 38, "height": 165}, {"age": 30, "height": 170}, {"age": 35, "height": 160}]
   assert filtered_participants == expected

Fixtures also allow you to set up and tear down resources that are needed for tests, such as database connections, files, or servers, but those are more advanced topics that we won’t cover here.
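For the curious, a setup/teardown fixture uses yield: everything before the yield runs before the test, and everything after it runs once the test has finished. A minimal sketch (the temporary data file here is purely illustrative):

```python
import os
import tempfile

import pytest


@pytest.fixture
def data_file():
    # Setup: create a temporary file with some sample data
    handle, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(handle, "w") as f:
        f.write("3,4\n")
    # The yielded value is what the test receives as its argument
    yield path
    # Teardown: runs after the test finishes, even if it failed
    os.remove(path)


def test_reads_data(data_file):
    with open(data_file) as f:
        assert f.read() == "3,4\n"
```

Because the teardown runs regardless of the test outcome, the temporary file never leaks, which is exactly why yield fixtures are preferred over cleaning up inside the test itself.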

Fixture organisation

Fixtures can be placed in the same file as the tests, or in a separate file. If you have a lot of fixtures, it may be a good idea to place them in a separate file to keep your test files clean. It is common to place fixtures in a file called conftest.py in the same directory as the tests.

For example you might have this structure:

project_directory/
│
├── tests/
│   ├── conftest.py
│   ├── test_my_module.py
│   ├── test_my_other_module.py
│
├── my_module.py
├── my_other_module.py

In this case, the fixtures defined in conftest.py can be used in any of the test files in the tests directory. You don’t need to import them: pytest discovers conftest.py automatically.
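For example, moving the participants fixture from the earlier challenge into conftest.py might look like this (a sketch; only the first two participants are shown):

```python
# tests/conftest.py
import pytest

@pytest.fixture
def participants():
    return [
        {"age": 25, "height": 180},
        {"age": 30, "height": 170},
    ]
```

Any test function inside the tests directory can now take participants as an argument without importing it.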

Key Points

  • Fixtures are a useful way to store data, objects and automations so they can be re-used in many different tests.
  • Fixtures are defined using the @pytest.fixture decorator.
  • Tests can use fixtures by passing them as arguments.
  • Fixtures can be placed in a separate file or in the same file as the tests.

Content from Parametrization


Last updated on 2024-12-19 | Edit this page

Overview

Questions

  • Is there a better way to test a function with lots of different inputs than writing a separate test for each one?

Objectives

  • Understand how to use parametrization in pytest to run the same test with different parameters in a concise and more readable way.

Parametrization


When writing tests for functions that need to test lots of different combinations of inputs, this can take a lot of space and be quite verbose. Parametrization is a way to run the same test with different parameters in a concise and more readable way.

To use parametrization in pytest, you need to use the @pytest.mark.parametrize decorator (don’t worry if you don’t know what this means). This decorator takes two arguments: the names of the parameters and the list of values you want to test.

Consider the following example:

We have a Triangle class that has a function to calculate the triangle’s area from its side lengths.

PYTHON


class Point:
   def __init__(self, x, y):
       self.x = x
       self.y = y

class Triangle:
   def __init__(self, p1: Point, p2: Point, p3: Point):
      self.p1 = p1
      self.p2 = p2
      self.p3 = p3

   def calculate_area(self):
      a = ((self.p1.x * (self.p2.y - self.p3.y)) +
           (self.p2.x * (self.p3.y - self.p1.y)) +
           (self.p3.x * (self.p1.y - self.p2.y))) / 2
      return abs(a)

If we want to test this function with different combinations of sides, we could write a test like this:

PYTHON

def test_calculate_area():
   """Test the calculate_area function of the Triangle class"""

   # Equilateral triangle
   p11 = Point(0, 0)
   p12 = Point(2, 0)
   p13 = Point(1, 1.7320)
   t1 = Triangle(p11, p12, p13)
   assert t1.calculate_area() == 1.7320

   # Right-angled triangle
   p21 = Point(0, 0)
   p22 = Point(3, 0)
   p23 = Point(0, 4)
   t2 = Triangle(p21, p22, p23)
   assert t2.calculate_area() == 6

   # Isosceles triangle
   p31 = Point(0, 0)
   p32 = Point(4, 0)
   p33 = Point(2, 8)
   t3 = Triangle(p31, p32, p33)
   assert t3.calculate_area() == 16

   # Scalene triangle
   p41 = Point(0, 0)
   p42 = Point(3, 0)
   p43 = Point(1, 4)
   t4 = Triangle(p41, p42, p43)
   assert t4.calculate_area() == 6

   # Negative values
   p51 = Point(0, 0)
   p52 = Point(-3, 0)
   p53 = Point(0, -4)
   t5 = Triangle(p51, p52, p53)
   assert t5.calculate_area() == 6

This test is quite long and repetitive. We can use parametrization to make it more concise:

PYTHON

import pytest

@pytest.mark.parametrize(
   ("p1x, p1y, p2x, p2y, p3x, p3y, expected"),
   [
      pytest.param(0, 0, 2, 0, 1, 1.7320, 1.7320, id="Equilateral triangle"),
      pytest.param(0, 0, 3, 0, 0, 4, 6, id="Right-angled triangle"),
      pytest.param(0, 0, 4, 0, 2, 8, 16, id="Isosceles triangle"),
      pytest.param(0, 0, 3, 0, 1, 4, 6, id="Scalene triangle"),
      pytest.param(0, 0, -3, 0, 0, -4, 6, id="Negative values")
   ]
)
def test_calculate_area(p1x, p1y, p2x, p2y, p3x, p3y, expected):
   p1 = Point(p1x, p1y)
   p2 = Point(p2x, p2y)
   p3 = Point(p3x, p3y)
   t = Triangle(p1, p2, p3)
   assert t.calculate_area() == expected

Let’s have a look at how this works.

Similar to how fixtures are defined, the @pytest.mark.parametrize line is a decorator, letting pytest know that this is a parametrized test.

  • The first argument is a string of comma-separated parameter names to use in your test. For example ("p1x, p1y, p2x, p2y, p3x, p3y, expected") means that we will use the parameters p1x, p1y, p2x, p2y, p3x, p3y and expected in our test.

  • The second argument is a list of pytest.param objects. Each pytest.param object is a tuple of the values you want to test, with an optional id argument to give a name to the test. For example, pytest.param(0, 0, 2, 0, 1, 1.7320, 6, id="Equilateral triangle") means that we will test the function with the parameters 0, 0, 2, 0, 1, 1.7320, 6 and give it the name “Equilateral triangle”.

(note that if the test fails you will see the id in the output, so it’s useful to give them meaningful names to help you understand what went wrong.)

  • The test function will be run once for each set of parameters in the list.

  • Inside the test function, you can use the parameters as you would any other variable.

This is a much more concise way to write tests for functions that need to be tested with lots of different inputs, especially when there is a lot of repetition in the setup for each of the different test cases.
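One caveat when parametrizing numerical tests: expected values produced by floating-point arithmetic (like the equilateral triangle’s area of roughly 1.7320) are safer to compare with pytest.approx than with a plain ==. A minimal sketch:

```python
import pytest

@pytest.mark.parametrize(
    ("value, expected"),
    [
        pytest.param(0.1 + 0.2, 0.3, id="classic rounding example"),
        pytest.param(2 * 1.7320 / 2, 1.7320, id="triangle-area style arithmetic"),
    ],
)
def test_float_comparison(value, expected):
    # A strict == would fail for 0.1 + 0.2 (which is 0.30000000000000004);
    # approx compares with a small relative tolerance instead
    assert value == pytest.approx(expected)
```

This keeps the parametrized table readable while avoiding spurious failures caused by rounding.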

Challenge - Practice with Parametrization

Add the following function to advanced/advanced_calculator.py and write a parametrized test for it in tests/test_advanced_calculator.py that covers a range of different inputs.

PYTHON


def is_prime(n: int) -> bool:
    """Return True if n is a prime number, False otherwise"""
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

PYTHON

import pytest

from advanced_calculator import is_prime

@pytest.mark.parametrize(
   ("n, expected"),
   [
      pytest.param(0, False, id="0 is not prime"),
      pytest.param(1, False, id="1 is not prime"),
      pytest.param(2, True, id="2 is prime"),
      pytest.param(3, True, id="3 is prime"),
      pytest.param(4, False, id="4 is not prime"),
      pytest.param(5, True, id="5 is prime"),
      pytest.param(6, False, id="6 is not prime"),
      pytest.param(7, True, id="7 is prime"),
      pytest.param(8, False, id="8 is not prime"),
      pytest.param(9, False, id="9 is not prime"),
      pytest.param(10, False, id="10 is not prime"),
      pytest.param(11, True, id="11 is prime"),
      pytest.param(12, False, id="12 is not prime"),
      pytest.param(13, True, id="13 is prime"),
      pytest.param(14, False, id="14 is not prime"),
      pytest.param(15, False, id="15 is not prime"),
      pytest.param(16, False, id="16 is not prime"),
      pytest.param(17, True, id="17 is prime"),
      pytest.param(18, False, id="18 is not prime"),
      pytest.param(19, True, id="19 is prime"),
      pytest.param(20, False, id="20 is not prime"),
      pytest.param(21, False, id="21 is not prime"),
      pytest.param(22, False, id="22 is not prime"),
      pytest.param(23, True, id="23 is prime"),
      pytest.param(24, False, id="24 is not prime"),
   ]
)
def test_is_prime(n, expected):
   assert is_prime(n) == expected

Key Points

  • Parametrization is a way to run the same test with different parameters in a concise and more readable way, especially when there is a lot of repetition in the setup for each of the different test cases.
  • Use the @pytest.mark.parametrize decorator to define a parametrized test.

Content from Regression Testing and Plots


Last updated on 2024-12-19 | Edit this page

Overview

Questions

  • How to test for changes in program outputs?
  • How to test for changes in plots?

Objectives

  • Learn how to test for changes in images & plots

Regression testing


When you have a large processing pipeline or you are just starting out adding tests to an existing project, you might not have the time to carefully define exactly what each function should do, or your code may be so complex that it’s hard to write unit tests for it all.

In these cases, you can use regression testing. This is where you just test that the output of a function matches the output of a previous version of the function.
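To see the idea without any plugin, a regression check can be sketched by hand: record the output once, then compare every later run against the recording. The helper and file names below are purely illustrative (pytest-regtest automates this bookkeeping for you):

```python
import json
import tempfile
from pathlib import Path

def check_regression(name, output, snapshot_dir):
    """Record `output` on the first run; on later runs, compare against the recording."""
    snapshot_dir = Path(snapshot_dir)
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    snapshot = snapshot_dir / f"{name}.json"
    current = json.dumps(output, sort_keys=True)
    if not snapshot.exists():
        snapshot.write_text(current)  # first run: save the reference output
        return True
    return snapshot.read_text() == current  # later runs: compare

snapshots = tempfile.mkdtemp()
assert check_regression("doubling", [2, 4, 6], snapshots)      # first run records
assert check_regression("doubling", [2, 4, 6], snapshots)      # same output: passes
assert not check_regression("doubling", [3, 6, 9], snapshots)  # changed output: regression detected
```

Note that this never states what the *correct* output is, only that it must not change unexpectedly; that is the essential trade-off of regression testing.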

The library pytest-regtest provides a simple way to do this. When writing a test, we pass the argument regtest to the test function and use regtest.write() to log the output of the function. This tells pytest-regtest to compare the output of the test to the output of the previous test run.

To install pytest-regtest:

BASH

pip install pytest-regtest

Callout

This regtest argument is actually a fixture that is provided by the pytest-regtest package. It captures the output of the test function and compares it to the output of the previous test run. If the output is different, the test will fail.

Let’s make a regression test:

  • Create a new function in statistics/stats.py called very_complex_processing():

PYTHON


def very_complex_processing(data: list):

    # Do some very complex processing
    processed_data = [x * 2 for x in data]

    return processed_data

  • Then in test_stats.py, we can add a regression test for this function using the regtest argument.

PYTHON

import pytest

from stats import very_complex_processing

def test_very_complex_processing(regtest):

    data = [1, 2, 3]
    processed_data = very_complex_processing(data)

    regtest.write(str(processed_data))

  • Now because we haven’t run the test yet, there is no reference output to compare against, so we need to generate it using the --regtest-generate flag:

BASH

pytest --regtest-generate

This tells pytest to run the test but instead of comparing the result, it will save the result for use in future tests.

  • Try running pytest and since we haven’t changed how the function works, the test should pass.

  • Then change the function to break the test and re-run pytest. The test will fail and show you the difference between the expected and actual output.

BASH


=== FAILURES ===
___ test_very_complex_processing ___

regression test output differences for statistics/test_stats.py::test_very_complex_processing:
(recorded output from statistics/_regtest_outputs/test_stats.test_very_complex_processing.out)

>   --- current
>   +++ expected
>   @@ -1 +1 @@
>   -[3, 6, 9]
>   +[2, 4, 6]

Here we can see that it has picked up on the difference between the expected and actual output, and displayed it for us to see.

Regression tests, while not as powerful as unit tests, are a great way to quickly add tests to a project and ensure that changes to the code don’t break existing functionality. It is also a good idea to add regression tests to your main processing pipelines just in case your unit tests don’t cover all the edge cases; this will ensure that the output of your program remains consistent between versions.

Testing plots


When you are working with plots, you may want to test that the output is as expected. This can be done by comparing the output to a reference image or plot. The pytest-mpl package provides a simple way to do this, automating the comparison of the output of a test function to a reference image.

To install pytest-mpl:

BASH

pip install pytest-mpl

  • Create a new folder called plotting and add a file plotting.py with the following function:

PYTHON

import matplotlib.pyplot as plt

def plot_data(data: list):
    fig, ax = plt.subplots()
    ax.plot(data)
    return fig

This function takes a list of points to plot, plots them and returns the figure produced.

In order to test that this function produces the correct plots, we will need to store the correct plots to compare against.

  • Create a new folder called test_plots inside the plotting folder. This is where we will store the reference images.

pytest-mpl adds the @pytest.mark.mpl_image_compare decorator that is used to compare the output of a test function to a reference image. It takes a baseline_dir argument that specifies the directory where the reference images are stored.

  • Create a new file called test_plotting.py in the plotting folder with the following content:

PYTHON

import pytest
from plotting import plot_data

@pytest.mark.mpl_image_compare(baseline_dir="test_plots/")
def test_plot_data():
    data = [1, 3, 2]
    fig = plot_data(data)
    return fig

Here we have told pytest that we want it to compare the output of the test_plot_data function to the images in the test_plots directory.

  • Run the following command to generate the reference image: (make sure you are in the base directory in your project and not in the plotting folder)

BASH

pytest --mpl-generate-path=plotting/test_plots

This tells pytest to run the test but instead of comparing the result, it will save the result into the test_plots directory for use in future tests.

Now that we have the reference image, we can run the test to ensure that the output of plot_data matches the reference image. pytest doesn’t check the images by default, so we need to pass it the --mpl flag to tell it to check the images.

BASH

pytest --mpl

Since we just generated the reference image, the test should pass.

Now let’s edit the plot_data function to plot a different set of points by adding a 4 to the data:

PYTHON

import matplotlib.pyplot as plt

def plot_data(data: list):
    fig, ax = plt.subplots()
    # Add 4 to the data
    data.append(4)
    ax.plot(data)
    return fig

  • Now re-run the test with pytest --mpl. You should see that it fails.

BASH

=== FAILURES ===
___ test_plot_data ___
Error: Image files did not match.
  RMS Value: 15.740441786649093
  Expected:  
    /var/folders/sr/wjtfqr9s6x3bw1s647t649x80000gn/T/tmp6d0p4yvm/test_plotting.test_plot_data/baseline.png
  Actual:    
    /var/folders/sr/wjtfqr9s6x3bw1s647t649x80000gn/T/tmp6d0p4yvm/test_plotting.test_plot_data/result.png
  Difference:
    /var/folders/sr/wjtfqr9s6x3bw1s647t649x80000gn/T/tmp6d0p4yvm/test_plotting.test_plot_data/result-failed-diff.png
  Tolerance: 
    2

Notice that the test shows you three image files. (All of these files are stored in a temporary directory that pytest creates when running the test. Depending on your system, you may be able to click on the paths to view the images. Try holding down CTRL or Command and clicking on the path.)

  • The first, “Expected” is the reference image that the test is comparing against.
  • The second, “Actual” is the image that was produced by the test.
  • And the third is a difference image that shows the differences between the two images. This is very useful as it enables us to clearly see what went wrong with the plotting, allowing us to fix the issue more easily. In this example, we can clearly see that the axes ticks are different, and the line plot is a completely different shape.

This doesn’t just work with line plots, but with any type of plot that matplotlib can produce.
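For example, a scatter plot could be tested in exactly the same way. This is a sketch: plot_scatter and its test are hypothetical additions to plotting.py and test_plotting.py, and the baseline image would be generated with --mpl-generate-path as before.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; CI machines usually have no display
import matplotlib.pyplot as plt
import pytest

def plot_scatter(xs: list, ys: list):
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    return fig

@pytest.mark.mpl_image_compare(baseline_dir="test_plots/")
def test_plot_scatter():
    return plot_scatter([1, 2, 3], [3, 1, 2])
```

The only requirement pytest-mpl places on the test function is that it returns the matplotlib figure to compare.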

Testing your plots can be very useful especially if your project allows users to define their own plots.

Key Points

  • Regression testing ensures that the output of a function remains consistent between changes and is a great first step in adding tests to an existing project.
  • pytest-regtest provides a simple way to do regression testing.
  • pytest-mpl provides a simple way to test plots by comparing the output of a test function to a reference image.

Content from Continuous Integration with GitHub Actions


Last updated on 2024-12-19 | Edit this page

Overview

Questions

  • How can I automate the testing of my code?
  • What are GitHub Actions?

Objectives

  • Understand the concept of continuous integration
  • Learn how to use GitHub Actions to automate the testing of your code

Continuous Integration


Continuous Integration (CI) is the practice of automatically building and testing code changes as they are integrated into a project. In the context of software testing, this means running the tests on every code change to ensure that the code still works as expected. GitHub provides a feature called GitHub Actions that allows you to integrate this into your projects.

In this lesson we will go over the very basics of how to set up a GitHub Action to run tests on your code.

Setting up your project repository


  • Create a new repository on GitHub for this lesson called “python-testing-course” (or any name you like)
  • Clone the repository onto your local machine using git clone <repository-url> (or GitKraken if you use that).
  • Move over all your code from the previous lessons into this repository.
  • Commit the changes using git add . and git commit -m "Add all the project code"
  • Create a new file called requirements.txt in the root of your repository and add the following contents:
pytest
numpy
pandas
pytest-mpl
pytest-regtest
matplotlib

This is just a list of all the packages that your project uses and will be needed later. Recall that each of these is used at various points in this course.

Now we have a repository with all our code in it online on GitHub.

Creating a GitHub Action


GitHub Actions are defined in YAML files (these are just simple text files that contain a list of instructions). They are stored in the .github/workflows directory in your repository.

  • Create a new directory in your repository called .github
  • Inside the .github directory, create a new directory called workflows
  • Inside the workflows directory, create a new file called tests.yaml

This tests.yaml file is where you will tell GitHub how to run the tests for your code.

Let’s add some instructions to the tests.yaml file:

YAML

# This is just the name of the action, you can call it whatever you like.
name: Tests (pytest)

# This is the event that triggers the action. In this case, we are telling GitHub to run the tests whenever a pull request is made to the main branch.
on:
    pull_request:
        branches:
            - main

# This is a list of jobs that the action will run. In this case, we have only one job called build.
jobs:
    build:
        # This is the environment that the job will run on. In this case, we are using the latest version of Ubuntu, however you can use other operating systems like Windows or macOS if you like!
        runs-on: ubuntu-latest

        # This is a list of steps that the job will run. Each step is a command that will be executed on the environment.
        steps:
            # This command tells GitHub to use a pre-built action. In this case, we are using the actions/checkout action to check out the repository. This just means that GitHub will use this repository's code to run the tests.
            - uses: actions/checkout@v3 # Check out the repository on github
            # This is the name of the step. This is just a label that will be displayed in the GitHub UI.
            - name: Set up Python 3.10
              # This command tells GitHub to use a pre-built action. In this case, we are using the actions/setup-python action to set up Python 3.10.
              uses: actions/setup-python@v3
              with:
                  python-version: "3.10"
            
            # This step installs the dependencies for the project such as pytest, numpy, pandas, etc using the requirements.txt file we created earlier.
            - name: Install dependencies
              run: |
                python -m pip install --upgrade pip
                pip install -r requirements.txt
                    
            # This step runs the tests using the pytest command. Remember to pass the --mpl flag so that the plot comparison tests also run (the pytest-regtest comparisons run by default).
            - name: Run tests
              run: |
                pytest --mpl

This is a simple GitHub Action that runs the tests for your code whenever a pull request is made to the main branch.

Upload the workflow to GitHub


Now that you have created the tests.yaml file, you need to upload it to GitHub.

  • Commit the changes using git add . and git commit -m "Add GitHub Action to run tests"
  • Push the changes to GitHub using git push origin main

Enable running the tests on a Pull Request


The typical use-case for a CI system is to run the tests whenever a pull request is made to merge a new feature into the main branch.

  • Go to your GitHub repository
  • Click on the “Settings” tab
  • Scroll down to “Branches”
  • Under “Branch protection rules” / “Branch name pattern” type “main”
  • Select the checkbox for “Require status checks to pass before merging”
  • Select the checkbox for “Require branches to be up to date before merging”

This means that when a Pull Request tries to merge code into main, all of its tests must pass before the merge is allowed.

Let’s test it out.

  • Create a new branch in your repository called subtract using git checkout -b subtract
  • Add a new function in your calculator.py file that subtracts two numbers, but make it wrong on purpose:

PYTHON

def subtract(a, b):
    return a + b

  • Then add a test for this function in your test_calculator.py file:

PYTHON

def test_subtract():
    assert subtract(5, 3) == 2

  • Commit the changes using git add . and git commit -m "Add subtract function"

  • Push the changes to GitHub using git push origin subtract

  • Now go to your GitHub repository and create a new Pull Request to merge the subtract branch into main

You should see that the GitHub Action runs the tests and fails because the test for the subtract function is failing.

  • Let’s now fix the subtract function and commit the changes: git add . and git commit -m "Fix subtract function"
  • Push the changes to GitHub using git push origin subtract again
  • Go back to the Pull Request on GitHub and you should see that the tests are now passing and you can merge the code into the main branch.

So now, when you or your team want to make a feature or just update the code, the workflow is as follows:

  • Create a new branch for the feature
  • Write the code for the feature
  • Write tests for the feature
  • Push the code to GitHub
  • Create a Pull Request
  • Wait for the tests to pass or fail
  • If the tests pass, merge the code into the main branch or fix the code if the tests fail

This will greatly improve the quality of your code and make it easier to collaborate with others.

Key Points

  • Continuous Integration (CI) is the practice of automatically testing code changes as they are integrated into a project.
  • GitHub Actions is a feature of GitHub that allows you to automate the testing of your code.
  • GitHub Actions are defined in YAML files stored in the .github/workflows directory of your repository.
  • You can use GitHub Actions to only allow code to be merged into the main branch if the tests pass.