Author Nejat Hakan
eMail nejat.hakan@outlook.de
PayPal Me https://paypal.me/nejathakan


Writing Testable Code with Pytest

Introduction Why Testing Matters

Welcome to this comprehensive guide on writing testable code using Pytest, specifically tailored for a Linux environment. In the realm of software development, writing code that simply works is only half the battle. Ensuring that the code works correctly, reliably, and continues to do so as it evolves is equally, if not more, important. This is where software testing, and particularly automated testing with frameworks like Pytest, becomes indispensable. This guide aims to take you from the fundamental concepts of testing and Pytest to advanced techniques, empowering you to write robust, maintainable, and high-quality Python applications. We will assume you are working within a Linux environment, and any command-line examples will reflect that.

What is Software Testing?

Software testing is the process of evaluating a software item to detect differences between its expected and actual behavior, and to assess the quality of the software. In essence, testing involves executing a piece of software with the intent of finding errors (bugs), verifying that it meets specified requirements, and confirming that it behaves as expected under various conditions. It's a critical part of the software development lifecycle (SDLC) and quality assurance (QA).

Testing isn't just about finding bugs after the code is written; it's fundamentally about building confidence in the software. It helps ensure that:

  1. Functionality: The software does what it's supposed to do according to its specifications.
  2. Reliability: The software performs consistently without failures.
  3. Performance: The software operates efficiently under expected loads.
  4. Security: The software is protected against vulnerabilities.
  5. Maintainability: The software can be easily modified or extended without introducing new errors.

Why is Testing Crucial?

Imagine building a complex machine, like a car, without ever testing individual components (engine, brakes, steering) before assembling them, and then never test-driving the final product. It sounds absurdly risky, yet sometimes software development proceeds with insufficient testing, leading to predictable problems. Here's why testing is vital:

  1. Bug Detection: The most obvious benefit. Finding and fixing bugs early in the development cycle is significantly cheaper and easier than fixing them after the software has been deployed to users. A bug found in production can damage reputation, cause financial loss, or even have critical real-world consequences depending on the application.
  2. Code Quality Improvement: The act of writing tests often forces developers to think more deeply about their code's design, edge cases, and potential failure points. This leads to better-structured, more modular, and more robust code. Code designed with testability in mind is generally higher quality code.
  3. Facilitating Refactoring and Maintenance: Software is rarely static; it evolves. Requirements change, features are added, performance is optimized, and underlying libraries are updated. Automated tests act as a safety net, providing immediate feedback if changes break existing functionality. This gives developers the confidence to refactor and improve the codebase without fear of introducing regressions (re-introducing old bugs or creating new ones).
  4. Documentation: Well-written tests serve as executable documentation. They demonstrate how different parts of the code are intended to be used and what behavior is expected under specific conditions. Unlike traditional documentation, tests don't easily become outdated because they are run frequently and will fail if the code they test changes its behavior.
  5. Collaboration: In team environments, tests ensure that code written by one developer doesn't inadvertently break functionality implemented by another. They establish a shared understanding of how components should behave and interact.
  6. Confidence in Deployment: Automated test suites provide a high degree of confidence that the application is ready for deployment. If all tests pass, there's a strong indication that the core functionality is working as expected.

Types of Tests (Unit, Integration, End-to-End)

Tests can be categorized based on the scope of what they are testing. Understanding these categories helps in building a balanced and effective testing strategy.

  1. Unit Tests:

    • Scope: Test the smallest possible testable parts of an application in isolation. Typically, this means testing individual functions, methods, or classes.
    • Goal: Verify that each unit of the software performs correctly on its own.
    • Characteristics: Fast to execute, numerous, rely heavily on mocking/stubbing external dependencies (like databases, network calls, file system) to maintain isolation.
    • Example: Testing a function that calculates the sum of two numbers, ensuring it returns the correct result for various inputs (positive, negative, zero).
  2. Integration Tests:

    • Scope: Test the interaction and communication between two or more integrated units or components.
    • Goal: Verify that different parts of the system work together as expected.
    • Characteristics: Slower than unit tests, fewer in number, may involve interacting with actual external services (like a test database or a local file system) but often still mock external systems (e.g., third-party APIs).
    • Example: Testing the interaction between a data access layer and the business logic layer, ensuring that data fetched from a (test) database is processed correctly by the business logic. Or testing if saving data through one function can be correctly retrieved by another function.
  3. End-to-End (E2E) Tests (or Functional/System Tests):

    • Scope: Test the entire application flow from start to finish, simulating real user scenarios.
    • Goal: Verify that the complete system meets the business requirements and user expectations.
    • Characteristics: Slowest to execute, fewest in number, interact with the application as a whole, often through its user interface (UI) or API endpoints, using real dependencies where possible.
    • Example: Testing a web application by simulating a user logging in, adding an item to the shopping cart, proceeding to checkout, and verifying the final confirmation message. For a command-line tool, it might involve running the tool with specific arguments and checking its output and exit code.

A common analogy is the "Testing Pyramid," which suggests having a large base of fast unit tests, a smaller layer of integration tests, and a very small top layer of E2E tests. This strategy optimizes for speed and pinpoints failures more easily (a failing unit test points to a specific function, while a failing E2E test could indicate a problem anywhere in the stack).
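
To make the command-line E2E example above concrete, here is a minimal sketch of such a test written with Python's standard subprocess module. It assumes a hypothetical taskmgr executable is already installed on the PATH; the subcommands and output checked are illustrative only, not part of this guide's project (yet).

# Sketch of an end-to-end test for a hypothetical `taskmgr` CLI.
# Assumption: a `taskmgr` command with `add` and `list` subcommands exists on PATH.
import subprocess

def test_cli_add_and_list_end_to_end():
    # Run the real executable exactly as a user would, capturing output and exit code
    add_result = subprocess.run(
        ["taskmgr", "add", "Buy groceries"],
        capture_output=True, text=True,
    )
    assert add_result.returncode == 0

    list_result = subprocess.run(
        ["taskmgr", "list"],
        capture_output=True, text=True,
    )
    assert list_result.returncode == 0
    assert "Buy groceries" in list_result.stdout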

What is Testable Code?

Testable code is software designed in a way that makes it easy to write automated tests for it, particularly unit tests. Writing testable code is not an afterthought; it's a design principle. Key characteristics include:

  1. Modularity: Code is broken down into small, well-defined functions or classes, each with a single responsibility (Single Responsibility Principle - SRP). This makes it easier to test each part in isolation.
  2. Dependency Injection (DI): Instead of creating dependencies (like database connections, file handlers, or objects of other classes) inside a function or method, they are passed in as arguments (or injected). This allows tests to provide "fake" or "mock" versions of these dependencies, enabling isolated testing.
    • Untestable Example:
      import json
      
      def process_data_from_file():
          # Hard-coded dependency on 'data.json' and file system
          with open('data.json', 'r') as f:
              data = json.load(f)
          # ... process data ...
          return processed_data
      
    • Testable Example (using DI):
      import json
      from io import StringIO # Used for mocking file-like objects in tests
      
      def process_data(file_obj):
          # Dependency 'file_obj' is injected
          data = json.load(file_obj)
          # ... process data ...
          return processed_data
      
      # Real usage:
      # with open('data.json', 'r') as real_file:
      #     result = process_data(real_file)
      
      # Test usage:
      # fake_file = StringIO('{"key": "value"}')
      # result = process_data(fake_file)
      
  3. Clear Inputs and Outputs: Functions should ideally rely only on their input arguments to produce results (or modify state passed in) and communicate results via return values or exceptions. Avoid relying on hidden global state or producing side effects that are hard to track or control in tests. Pure functions (functions with no side effects that always return the same output for the same input) are inherently very testable.
  4. Avoiding Side Effects where Possible: Functions that perform actions like writing to files, sending network requests, or modifying global variables are harder to test. If possible, separate the core logic (which can be tested easily) from the parts that cause side effects.
  5. Interfaces and Abstractions: Depending on abstractions (like abstract base classes or protocols) rather than concrete implementations makes it easier to substitute mock objects during testing.
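
To make the last two points concrete, here is a small illustrative sketch; the names Notifier, EmailNotifier, send_welcome, and FakeNotifier are invented for this example. The core logic depends on an abstraction and receives it via dependency injection, so a test can substitute a fake with no real side effects:

# Illustrative sketch only: depend on an abstraction instead of a concrete,
# side-effecting implementation. All names here are hypothetical examples.
from typing import Protocol

class Notifier(Protocol):
    def send(self, recipient: str, message: str) -> None: ...

class EmailNotifier:
    def send(self, recipient: str, message: str) -> None:
        # A real implementation would talk to an SMTP server (a side effect)
        raise NotImplementedError

def send_welcome(notifier: Notifier, username: str) -> str:
    # Core logic: build the message, delegate the side effect to the notifier
    message = f"Welcome, {username}!"
    notifier.send(username, message)
    return message

# In a test, a fake notifier records calls instead of sending anything:
class FakeNotifier:
    def __init__(self):
        self.sent = []
    def send(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))

def test_send_welcome_uses_notifier():
    fake = FakeNotifier()
    result = send_welcome(fake, "alice")
    assert result == "Welcome, alice!"
    assert fake.sent == [("alice", "Welcome, alice!")]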

Introduction to Pytest

Pytest is a mature, feature-rich, and widely adopted Python testing framework. It makes writing, organizing, and running tests simple and scalable. Key advantages include:

  1. Simple Syntax: Uses plain assert statements for checking conditions, making tests highly readable. No need to learn lots of assertSomething methods like in unittest.
  2. Less Boilerplate: Test functions are just regular functions prefixed with test_. Test classes are simple classes containing test methods, without requiring inheritance from a base class (though possible).
  3. Powerful Fixtures: Provides a sophisticated and elegant way to manage test setup, teardown, and dependencies using dependency injection principles (@pytest.fixture). Fixtures are reusable and composable.
  4. Test Discovery: Automatically discovers test files (test_*.py or *_test.py) and test functions/methods (test_*) within them.
  5. Rich Plugin Ecosystem: Extensible architecture with hundreds of plugins available for various needs (e.g., pytest-cov for coverage, pytest-django/pytest-flask for web frameworks, pytest-mock for mocking, pytest-xdist for parallel testing).
  6. Detailed Reporting: Provides informative output about test failures, including introspection of assertion failures.
  7. Parameterization: Easily run the same test function with multiple different inputs (@pytest.mark.parametrize).
  8. Markers: Tag tests with metadata (@pytest.mark) to selectively run subsets of tests or apply specific behaviors.

Pytest allows you to write tests ranging from simple unit tests to complex functional tests, making it a versatile tool for ensuring your Python code quality on Linux and other platforms. In the following sections, we will dive deep into setting up your environment and using Pytest effectively.

Setting Up Your Environment

Before we can start writing and running tests with Pytest, we need to ensure our Linux development environment is correctly configured. This involves installing Python, managing dependencies with virtual environments, and installing Pytest itself. We'll also discuss a standard project structure that facilitates testing.

Installing Python and pip on Linux

Most modern Linux distributions come with Python pre-installed. However, it might be an older version (like Python 2, though increasingly less common) or you might want a specific newer version.

  1. Check Current Version: Open your terminal and type:

    python3 --version
    pip3 --version
    
    If these commands return a recent Python 3 version (e.g., 3.8 or higher is recommended for modern development) and the corresponding pip version, you might be ready. pip is the package installer for Python.

  2. Installing Python 3: If you need to install or upgrade Python 3, the method depends on your distribution:

    • Debian/Ubuntu:
      sudo apt update
      sudo apt install python3 python3-pip python3-venv -y
      
      The python3-venv package provides the standard library module for creating virtual environments.
    • Fedora:
      sudo dnf update
      sudo dnf install python3 python3-pip -y
      
      The venv module is usually included with the main Python package on Fedora.
    • CentOS/RHEL (using dnf):
      sudo dnf update
      sudo dnf install python3 python3-pip -y
      
    • Arch Linux:
      sudo pacman -Syu python python-pip --noconfirm
      
  3. Verify Installation: After installation, re-run python3 --version and pip3 --version to confirm.

Creating a Virtual Environment

Using virtual environments is a crucial best practice in Python development. A virtual environment provides an isolated space for each project, allowing you to install specific versions of packages for that project without interfering with other projects or the system-wide Python installation.

  1. Navigate to Your Project Directory: Open your terminal and go to where you want to create your project (or an existing project). Let's create a directory for our workshop project:

    mkdir taskmgr-project
    cd taskmgr-project
    

  2. Create the Virtual Environment: Use the venv module to create a virtual environment. It's conventional to name the environment directory .venv (the leading dot hides it from normal directory listings in Linux).

    python3 -m venv .venv
    
    This command creates the .venv directory containing a copy of the Python interpreter and a place to install project-specific packages.

  3. Activate the Virtual Environment: Before installing packages or running your project code, you need to activate the environment:

    source .venv/bin/activate
    
    Your terminal prompt should now change, usually prepending (.venv) to indicate that the virtual environment is active. Now, commands like python and pip will refer to the versions inside .venv.

  4. Deactivating: When you're finished working on the project, you can deactivate the environment by simply typing:

    deactivate
    

Why is this important for testing? It ensures your tests run against the specific package versions your project depends on, making test results consistent and reproducible across different machines or deployment environments.

Installing Pytest

With your virtual environment activated, installing Pytest is straightforward using pip:

  1. Ensure Activation: Make sure your prompt shows (.venv). If not, activate it using source .venv/bin/activate.
  2. Install Pytest:
    pip install pytest
    
    Pip will download and install Pytest and its dependencies within your active virtual environment (.venv/lib/pythonX.Y/site-packages/).
  3. Verify Installation: Check that Pytest is installed and accessible:
    pytest --version
    
    This should display the installed Pytest version.

Project Structure for Testability

A well-organized project structure makes code easier to navigate, understand, and test. While there's flexibility, a common and recommended structure for a Python project with tests looks like this:

taskmgr-project/
├── .venv/                   # Virtual environment directory (created by venv)
├── src/                     # Source code directory (optional but recommended)
│   └── taskmgr/             # Your actual Python package
│       ├── __init__.py      # Makes 'taskmgr' a package
│       ├── core.py          # Core logic (e.g., task management functions)
│       ├── cli.py           # Command-line interface logic
│       └── utils.py         # Utility functions (if needed)
├── tests/                   # Directory for all tests
│   ├── __init__.py          # Makes 'tests' a package (optional, depends on imports)
│   ├── test_core.py         # Tests for core.py
│   ├── test_cli.py          # Tests for cli.py
│   └── conftest.py          # Common test configurations and fixtures (optional)
├── pyproject.toml           # Modern package metadata and build config (preferred)
├── setup.py                 # Legacy build script (superseded by pyproject.toml; rarely needed)
├── requirements.txt         # Project dependencies (optional, often managed via pyproject.toml)
└── README.md                # Project description

Explanation:

  • src/ Layout: Placing your main application code inside a src directory (the "source layout") helps prevent common import issues and clearly separates your code from other project files (like tests, documentation, build scripts). Your installable package (taskmgr) resides inside src.
  • tests/ Directory: All test code lives here, mirroring the structure of your source code where appropriate (e.g., tests/test_core.py tests src/taskmgr/core.py).
  • __init__.py: These empty files signal to Python that the directories (taskmgr, tests) should be treated as packages, allowing for imports between modules.
  • conftest.py: A special Pytest file. Fixtures and hooks defined here become available to all tests within the tests directory and its subdirectories automatically.
  • pyproject.toml / setup.py: Define package metadata, dependencies, and build instructions. Modern projects prefer pyproject.toml.
  • requirements.txt: Often used to list dependencies, especially for applications rather than libraries. For reproducible testing, you might have a requirements-dev.txt including Pytest and other development tools.

Using this structure ensures Pytest can easily discover your tests (tests/) and that your tests can reliably import your application code (from src/taskmgr/).
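
As a brief preview of how conftest.py is typically used (fixtures are covered in depth in a later section), a fixture defined there becomes available to every test under tests/ without any import. The fixture name sample_task below is a hypothetical example, not part of taskmgr yet:

# tests/conftest.py -- minimal preview; fixtures are explained in detail later.
import pytest

@pytest.fixture
def sample_task():
    # Any test under tests/ can declare a `sample_task` parameter and
    # receive this dictionary automatically, with no import required.
    return {"id": 1, "description": "Buy groceries", "done": False}

A test in tests/test_core.py could then simply accept sample_task as an argument and Pytest would inject this value.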

Workshop Setting Up the Project

Let's apply the concepts above to create the basic structure for our command-line task manager project (taskmgr).

Goal:
Create the project directory, set up the virtual environment, install Pytest, and establish the recommended directory structure.

Steps:

  1. Create Project Directory: Open your Linux terminal and execute:

    # Choose a parent directory for your projects if you wish
    # cd ~/projects/
    mkdir taskmgr-project
    cd taskmgr-project
    echo "Created project directory: $(pwd)"
    

  2. Create Virtual Environment:

    python3 -m venv .venv
    echo "Created virtual environment in .venv/"
    

  3. Activate Virtual Environment:

    source .venv/bin/activate
    echo "Virtual environment activated. Your prompt should now start with (.venv)."
    # Verify by checking python path
    which python
    # This should point to taskmgr-project/.venv/bin/python
    

  4. Install Pytest:

    pip install pytest
    echo "Installed Pytest."
    # Verify installation
    pytest --version
    

  5. Create Project Structure: Create the source and test directories and initial files.

    mkdir -p src/taskmgr tests
    touch src/taskmgr/__init__.py
    touch src/taskmgr/core.py
    touch src/taskmgr/cli.py
    touch tests/__init__.py
    touch tests/test_core.py
    touch tests/test_cli.py
    touch README.md
    echo "Created basic project structure."
    

  6. (Optional) Create pyproject.toml: Create a minimal pyproject.toml file. This is good practice for defining project metadata and dependencies.

    touch pyproject.toml

    Open pyproject.toml in your editor of choice (e.g., nano pyproject.toml) and add the following minimal content (you can expand this later):

    # pyproject.toml
    [build-system]
    requires = ["setuptools>=61.0"]
    build-backend = "setuptools.build_meta"

    [project]
    name = "taskmgr"
    version = "0.1.0"
    description = "A simple command-line task manager."
    requires-python = ">=3.8"

    # Pytest as a development dependency (add tools like pytest-cov here later)
    [project.optional-dependencies]
    dev = [
        "pytest",
    ]
    
    Note: With pyproject.toml defining dependencies, you could install them using pip install ".[dev]" (the quotes keep shells such as zsh from interpreting the brackets); this installs the project itself plus its 'dev' optional dependencies. For now, we already installed pytest manually.

  7. Verify Structure: Use the tree command (install it if needed: sudo apt install tree or sudo dnf install tree) or ls -R to view the structure:

    tree -a -I '.venv' # -a shows hidden files, -I ignores .venv
    
    The output should resemble the structure described earlier.

You now have a clean, organized project setup on your Linux system, ready for writing your first Pytest tests in the upcoming sections. Remember to keep your virtual environment activated (source .venv/bin/activate) whenever you work on this project.

1. Basic Testing with Pytest

Now that the environment is set up, let's dive into the fundamentals of writing and running tests using Pytest. We'll focus on the core concepts: test functions, assertions, and executing tests.

Writing Your First Test Function

Pytest makes writing tests straightforward. A test in Pytest is typically just a Python function whose name starts with test_.

Let's add a very simple function to our taskmgr project's core logic file (src/taskmgr/core.py) and then write a test for it.

src/taskmgr/core.py:

# src/taskmgr/core.py

def add(a: int, b: int) -> int:
    """Adds two integers."""
    # A deliberate (and obvious) bug for demonstration
    # return a + b # Correct implementation
    return a * b # Incorrect implementation (for now)

# We'll add more task-related functions later

Now, let's write a test for this add function in our test file (tests/test_core.py).

tests/test_core.py:

# tests/test_core.py

# We need to import the code we want to test
# With the src/ layout, Python must be able to find the 'taskmgr' package.
# The most reliable option is to install the project in editable mode
# (`pip install -e .` from the project root); standard imports then work directly.
# Alternatively, add `pythonpath = ["src"]` under [tool.pytest.ini_options]
# (available in pytest 7.0+), or export PYTHONPATH=src before running pytest.
from taskmgr import core # Imports src/taskmgr/core.py

def test_add_positive_numbers():
    """
    Tests the add function with two positive integers.
    Verifies that the sum is calculated correctly.
    """
    result = core.add(5, 3)
    # The core of a Pytest test: the assert statement
    assert result == 8, "Assertion Failed: Expected 5 + 3 to be 8"

def test_add_negative_numbers():
    """
    Tests the add function with negative integers.
    """
    result = core.add(-2, -4)
    assert result == -6, "Assertion Failed: Expected -2 + (-4) to be -6"

def test_add_mixed_numbers():
    """
    Tests the add function with a positive and a negative integer.
    """
    result = core.add(10, -3)
    assert result == 7, "Assertion Failed: Expected 10 + (-3) to be 7"

Key points:

  • Import: We import the core module from our taskmgr package. For this import to resolve with the src/ layout, the package must be importable: install it in editable mode (pip install -e .), add pythonpath = ["src"] to your Pytest configuration, or export PYTHONPATH=src before running the tests.
  • test_ prefix: Each function intended as a test must start with test_.
  • assert statement: This is the core mechanism for checking conditions in Pytest. If the condition following assert evaluates to False, Pytest reports a test failure.
  • Optional Message: You can add an optional message after the comma in the assert statement, which Pytest will display if the assertion fails. This helps in understanding the failure.
  • Docstrings: While optional, adding docstrings to your test functions explaining what they are testing is excellent practice for maintainability.

Naming Conventions for Tests

Pytest uses naming conventions to automatically discover tests:

  1. Files: Looks for files named test_*.py or *_test.py in the current directory and subdirectories.
  2. Functions: Within those files, collects functions prefixed with test_.
  3. Classes: Collects classes prefixed with Test that do not define an __init__ method (classes with an __init__ are skipped with a collection warning).
  4. Methods: Within those classes, collects methods prefixed with test_.

Adhering to these conventions (test_*.py files and test_* functions/methods) is crucial for Pytest to find and run your tests.

Test function names should be descriptive, indicating what specific scenario or behavior they are verifying. Names like test_add_positive_numbers are much better than generic names like test1 or test_addition.

Running Tests with Pytest

Running your tests is simple. Make sure your virtual environment is activated and you are in the root directory of your project (taskmgr-project).

  1. Open Terminal: Navigate to taskmgr-project.
  2. Run Pytest: Execute the command:
    pytest
    

Pytest will:

  • Scan the tests directory (and subdirectories) for test files and functions matching the conventions.
  • Execute each discovered test function.
  • Report the results.

Expected Output (with the bug in core.add):

You'll see output similar to this (details might vary slightly based on Pytest version):

(.venv) $ pytest
============================= test session starts ==============================
platform linux -- Python 3.X.Y, pytest-X.Y.Z, pluggy-X.Y.Z
rootdir: /path/to/taskmgr-project
collected 3 items

tests/test_core.py FFF                                                   [100%]

=================================== FAILURES ===================================
_________________________ test_add_positive_numbers __________________________

    def test_add_positive_numbers():
        """
        Tests the add function with two positive integers.
        Verifies that the sum is calculated correctly.
        """
        result = core.add(5, 3)
        # The core of a Pytest test: the assert statement
>       assert result == 8, "Assertion Failed: Expected 5 + 3 to be 8"
E       AssertionError: Assertion Failed: Expected 5 + 3 to be 8
E       assert 15 == 8
E        +  where 15 = <function add at 0x...> (5, 3)
E        +    where <function add at 0x...> = taskmgr.core.add

tests/test_core.py:15: AssertionError
_________________________ test_add_negative_numbers __________________________

    def test_add_negative_numbers():
        """
        Tests the add function with negative integers.
        """
        result = core.add(-2, -4)
>       assert result == -6, "Assertion Failed: Expected -2 + (-4) to be -6"
E       AssertionError: Assertion Failed: Expected -2 + (-4) to be -6
E       assert 8 == -6
E        +  where 8 = <function add at 0x...> (-2, -4)
E        +    where <function add at 0x...> = taskmgr.core.add

tests/test_core.py:22: AssertionError
__________________________ test_add_mixed_numbers ___________________________

    def test_add_mixed_numbers():
        """
        Tests the add function with a positive and a negative integer.
        """
        result = core.add(10, -3)
>       assert result == 7, "Assertion Failed: Expected 10 + (-3) to be 7"
E       AssertionError: Assertion Failed: Expected 10 + (-3) to be 7
E       assert -30 == 7
E        +  where -30 = <function add at 0x...> (10, -3)
E        +    where <function add at 0x...> = taskmgr.core.add

tests/test_core.py:29: AssertionError
=========================== short test summary info ============================
FAILED tests/test_core.py::test_add_positive_numbers - AssertionError: Asse...
FAILED tests/test_core.py::test_add_negative_numbers - AssertionError: Asse...
FAILED tests/test_core.py::test_add_mixed_numbers - AssertionError: Asser...
============================== 3 failed in 0.XXs ===============================

This output clearly shows:

  • Three tests were collected (collected 3 items).
  • All three failed (FFF).
  • Detailed tracebacks for each failure, pinpointing the exact line (> assert result == 8...) and the reason (AssertionError: ... assert 15 == 8). Pytest even shows the values involved in the comparison (15 == 8) and where the value 15 came from (core.add(5, 3)).

Now, let's fix the bug in src/taskmgr/core.py:

# src/taskmgr/core.py

def add(a: int, b: int) -> int:
    """Adds two integers."""
    # Fix the bug
    return a + b # Correct implementation

Run pytest again:

pytest

Expected Output (after fixing the bug):

(.venv) $ pytest
============================= test session starts ==============================
platform linux -- Python 3.X.Y, pytest-X.Y.Z, pluggy-X.Y.Z
rootdir: /path/to/taskmgr-project
collected 3 items

tests/test_core.py ...                                                   [100%]

============================== 3 passed in 0.XXs ===============================

Success! The dots (...) indicate passing tests, and the summary confirms 3 passed. This simple example demonstrates the fundamental test-driven cycle: write a test, see it fail, write/fix the code, see the test pass.

Understanding Assertions (assert statement)

The assert keyword is a built-in part of Python. Its syntax is:

assert <condition>, [optional_message]
  • If <condition> evaluates to True (or a truthy value), execution continues normally.
  • If <condition> evaluates to False (or a falsy value), Python raises an AssertionError exception. If optional_message is provided, it's passed as an argument to the AssertionError.
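
For example, here is how a plain assert behaves outside any test framework; the variable x is just an illustration:

# Plain Python behaviour of assert (no Pytest involved)
x = 5
assert x > 0                                # True: execution continues silently
try:
    assert x > 10, f"x was {x}, expected > 10"
except AssertionError as exc:
    print(exc)                              # prints: x was 5, expected > 10

Note that running Python with the -O option strips assert statements entirely, which is one reason plain assert belongs in tests rather than in production input validation.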

While assert is built into Python, Pytest enhances its usage significantly.

Basic Assertions in Pytest

Pytest leverages the standard assert statement but provides much more informative output upon failure through assertion introspection. When an assert fails, Pytest examines the expression, determines the values involved, and includes them in the failure report. This is why we saw assert 15 == 8 in the failure output earlier, even though we only wrote assert result == 8.

You can use assert with almost any comparison or boolean check:

# Examples of common assertions in Pytest tests

def test_equality():
    assert 2 + 2 == 4

def test_inequality():
    assert 5 != 6

def test_greater_than():
    assert 10 > 5

def test_less_than_or_equal():
    assert 3 <= 3
    assert 3 <= 4

def test_membership():
    my_list = [1, 2, 3]
    assert 2 in my_list
    assert 4 not in my_list

def test_identity(): # Check if two variables point to the exact same object
    a = [1]
    b = a
    c = [1]
    assert a is b
    assert a is not c # Though a == c is True

def test_boolean_values():
    assert True
    assert not False

def test_string_properties():
    message = "Hello Pytest"
    assert message.startswith("Hello")
    assert "Pytest" in message
    assert len(message) == 12

# And for checking types (though sometimes checking behavior is preferred)
def test_types():
    assert isinstance("hello", str)
    assert not isinstance(5, str)

Pytest's intelligent handling of the basic assert statement means you rarely need more complex assertion methods, leading to cleaner and more readable tests.
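
One case where a bare == comparison is not enough is floating-point arithmetic. For that, Pytest ships a helper, pytest.approx, which compares numbers within a tolerance; a minimal example:

import pytest

def test_float_sum():
    # 0.1 + 0.2 is 0.30000000000000004 in binary floating point, so a plain
    # `== 0.3` comparison would fail; approx compares within a small tolerance.
    assert 0.1 + 0.2 == pytest.approx(0.3)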

Workshop Writing and Running Basic Tests

Let's expand the functionality of our taskmgr and add corresponding tests. We'll implement functions to add a task and list tasks. For now, we'll store tasks in a simple list in memory.

Goal:
Implement basic task adding and listing functionality in core.py and write Pytest tests to verify their behavior.

Steps:

  1. Define Task Structure (Conceptual): A task could be represented as a dictionary, perhaps like: {'id': 1, 'description': 'Buy groceries', 'done': False}

  2. Implement Core Functions: Modify src/taskmgr/core.py. We'll use a simple list to store tasks for now.

    src/taskmgr/core.py:

    # src/taskmgr/core.py
    from __future__ import annotations # Allows list[dict] type hints on Python 3.8
    import itertools # To generate unique IDs
    
    # In-memory storage for tasks (a list of dictionaries)
    # In a real app, this would be loaded from/saved to a file or database.
    # For basic unit tests, we can manage this directly or use fixtures (later).
    _tasks = []
    _next_id = itertools.count(1) # Generator for unique IDs starting from 1
    
    def add(a: int, b: int) -> int:
        """Adds two integers."""
        return a + b
    
    def add_task(description: str) -> dict:
        """
        Adds a new task with the given description to the in-memory list.
    
        Args:
            description: The text of the task.
    
        Returns:
            The newly created task dictionary.
    
        Raises:
            ValueError: If the description is empty or whitespace only.
        """
        if not description or description.isspace():
            raise ValueError("Task description cannot be empty.")
    
        new_task = {
            'id': next(_next_id),
            'description': description.strip(), # Remove leading/trailing whitespace
            'done': False
        }
        _tasks.append(new_task)
        return new_task
    
    def list_tasks() -> list[dict]:
        """
        Returns a copy of the current list of tasks.
        Returning a copy prevents external modification of the internal list.
        """
        return _tasks[:] # Return a slice (copy)
    
    def clear_tasks():
        """
        Clears all tasks from the list and resets the ID counter.
        Useful for testing setup/teardown.
        """
        global _tasks, _next_id
        _tasks = []
        _next_id = itertools.count(1) # Reset ID generator
    
    # Example of how this might be used (not part of testing logic):
    # if __name__ == "__main__":
    #     clear_tasks()
    #     task1 = add_task("Learn Pytest")
    #     task2 = add_task("Write report")
    #     print(list_tasks())
    #     # Output might be:
    #     # [{'id': 1, 'description': 'Learn Pytest', 'done': False},
    #     #  {'id': 2, 'description': 'Write report', 'done': False}]
    
    Self-Correction: Initially, I didn't have clear_tasks. However, because _tasks is a global list, tests could interfere with each other. Adding clear_tasks allows us to reset the state before each test (or group of tests), which is crucial for test independence. We'll see better ways to handle this with fixtures later, but this is a start.

  3. Write Tests: Now, let's add tests for add_task and list_tasks in tests/test_core.py.

    tests/test_core.py:

    # tests/test_core.py
    from taskmgr import core
    import pytest # Import pytest to use features like pytest.raises (later)
    
    # --- Tests for the old 'add' function (keep them) ---
    def test_add_positive_numbers():
        assert core.add(5, 3) == 8
    
    def test_add_negative_numbers():
        assert core.add(-2, -4) == -6
    
    def test_add_mixed_numbers():
        assert core.add(10, -3) == 7
    
    # --- Tests for the new task functions ---
    
    def test_add_task_returns_dict():
        """Verify add_task returns a dictionary representing the task."""
        # Ensure a clean state before the test
        core.clear_tasks()
        description = "Test Task 1"
        task = core.add_task(description)
        assert isinstance(task, dict)
        assert "id" in task
        assert "description" in task
        assert "done" in task
    
    def test_add_task_adds_to_list():
        """Verify add_task actually adds the task to the internal list."""
        core.clear_tasks()
        description = "Test Task 2"
        initial_count = len(core.list_tasks())
        task = core.add_task(description)
        # Check list_tasks reflects the addition
        tasks_list = core.list_tasks()
        assert len(tasks_list) == initial_count + 1
        # Check if the returned task is the one in the list (by value)
        # Note: This assumes list_tasks() returns tasks in order added for now
        assert tasks_list[-1]['description'] == description
        assert tasks_list[-1]['id'] == task['id']
        assert not tasks_list[-1]['done'] # Should be False by default
    
    def test_add_task_increments_id():
        """Verify that task IDs are unique and incrementing."""
        core.clear_tasks()
        task1 = core.add_task("First task")
        task2 = core.add_task("Second task")
        assert isinstance(task1['id'], int)
        assert isinstance(task2['id'], int)
        assert task1['id'] == 1 # Assuming clear_tasks resets ID counter to start at 1
        assert task2['id'] == 2 # Assuming the counter increments correctly
        assert task1['id'] != task2['id']
    
    def test_list_tasks_returns_copy():
        """Verify list_tasks returns a copy, not the original list."""
        core.clear_tasks()
        core.add_task("Sample Task")
        list1 = core.list_tasks()
        list2 = core.list_tasks()
        assert list1 == list2 # Content should be the same
        assert list1 is not list2 # Should be different objects (list copies)
        # Modify the returned list - should not affect the internal state
        list1.append({"id": 99, "description": "rogue task", "done": True})
        list3 = core.list_tasks()
        assert len(list3) == 1 # Should still only contain "Sample Task"
        assert list3[0]['description'] == "Sample Task"
    
    
    def test_add_task_empty_description_raises_error():
        """Verify that adding a task with an empty description raises ValueError."""
        core.clear_tasks()
        # We need to test that *calling* add_task raises an error.
        # We'll use pytest.raises for this (covered properly later).
        # For now, let's just demonstrate the basic idea.
        with pytest.raises(ValueError):
             core.add_task("") # This call should raise ValueError
    
    def test_add_task_whitespace_description_raises_error():
        """Verify that adding a task with only whitespace raises ValueError."""
        core.clear_tasks()
        with pytest.raises(ValueError):
             core.add_task("   \t\n ") # Should also raise ValueError
    
    # Note: The use of `core.clear_tasks()` highlights a dependency between tests
    # due to the shared global `_tasks` list. Fixtures (covered later) provide
    # a much cleaner way to manage this test isolation.
    
    Self-Correction: Added tests for edge cases like empty/whitespace descriptions and confirmed that list_tasks returns a copy. Introduced pytest.raises for error checking, which will be formally explained later but is necessary here. Added comments about the clear_tasks() limitation.

  4. Run Tests: Go to your project root (taskmgr-project) in the terminal (make sure .venv is active) and run Pytest:

    pytest
    

  5. Analyze Output: All tests should pass. You should see output similar to:

    (.venv) $ pytest
    ============================= test session starts ==============================
    platform linux -- Python 3.X.Y, pytest-X.Y.Z, pluggy-X.Y.Z
    rootdir: /path/to/taskmgr-project
    collected 9 items
    
    tests/test_core.py .........                                             [100%]
    
    ============================== 9 passed in 0.XXs ===============================
    

You have now successfully implemented basic functionality and verified it with Pytest using simple test functions and assertions. You also saw how tests immediately catch regressions (like the initial bug in add) and how they can verify expected behavior and error conditions. The use of clear_tasks() hints at the need for better test isolation, which we will address with fixtures in later sections.

2. Structuring Tests

As your project grows, so will your test suite. Keeping tests organized is essential for maintainability, readability, and efficient execution. Pytest offers several ways to structure your tests, including organizing them into files, classes, and using markers for grouping and categorization.

Test Discovery in Pytest

Before discussing structure, let's recap how Pytest finds tests (its "discovery" process). By default, starting from the rootdir (usually where you run the pytest command or the location of a pytest.ini, pyproject.toml, or setup.cfg file) or the current directory:

  1. Recursion: Pytest recursively searches directories unless explicitly told otherwise.
  2. File Matching: It looks for Python files named test_*.py or *_test.py.
  3. Function Matching: Inside these files, it collects standalone functions named test_*.
  4. Class Matching: It looks for classes named Test*.
  5. Method Matching: Inside Test* classes (that don't define an __init__ method), it collects methods named test_*.

Understanding discovery is key to ensuring your tests are found and run. Placing tests in a dedicated tests/ directory at the project root is the standard convention and works seamlessly with discovery.
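
If you are ever unsure what will be picked up, you can preview discovery without executing anything by using Pytest's --collect-only flag (add -q for a compact listing):

# Show which tests Pytest would run, without actually running them
pytest --collect-only -q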

Grouping Tests in Files

The simplest way to group related tests is by putting them in the same file. We've already done this with tests/test_core.py containing tests for the functions within src/taskmgr/core.py.

Best Practice: Create separate test files for separate modules or distinct areas of functionality in your application.

  • tests/test_core.py -> Tests functions in src/taskmgr/core.py
  • tests/test_cli.py -> Tests functions/classes in src/taskmgr/cli.py
  • tests/test_utils.py -> Tests functions/classes in src/taskmgr/utils.py

This makes it easy to locate tests related to a specific part of your codebase. When a test fails in tests/test_core.py, you know the problem likely lies within src/taskmgr/core.py or its interactions.

You can run tests only from specific files:

# Run only tests in tests/test_core.py
pytest tests/test_core.py

# Run tests whose names (including their module/file name) match 'test_cli'
pytest -k test_cli # -k does keyword matching on test and module names, not file paths

# Run tests in a specific directory
pytest tests/

Grouping Tests in Classes

For functionalities that involve multiple related tests, grouping them inside a class can improve organization, especially when they share common setup or state (though fixtures are often preferred for setup/state).

Test classes must follow the Test* naming convention (e.g., TestAddTask, TestTaskListing). Methods within the class intended as tests must follow the test_* convention.

Example (tests/test_core.py):

Let's refactor some of our task-related tests into a class.

# tests/test_core.py
from taskmgr import core
import pytest

# --- Keep the 'add' function tests as standalone functions ---
def test_add_positive_numbers():
    assert core.add(5, 3) == 8
# ... other add tests ...

# --- Group task-related tests in a class ---
class TestTaskManager:
    """Groups tests related to task management operations."""

    # Helper method to reset state before each test method in this class
    # Pytest fixtures offer a much better way (see later sections)
    def setup_method(self):
        """Pytest special method: runs before each test method in the class."""
        print("\nSetting up for test method...") # For demonstration
        core.clear_tasks()

    def teardown_method(self):
        """Pytest special method: runs after each test method in the class."""
        print("Tearing down after test method...") # For demonstration
        core.clear_tasks() # Ensure clean state even if test fails mid-way

    # Test methods within the class
    def test_add_task_returns_dict(self):
        """Verify add_task returns a dictionary representing the task."""
        # No need for core.clear_tasks() here, setup_method handles it
        description = "Test Task 1"
        task = core.add_task(description)
        assert isinstance(task, dict)
        assert "id" in task
        assert "description" in task
        assert "done" in task

    def test_add_task_adds_to_list(self):
        """Verify add_task actually adds the task to the internal list."""
        description = "Test Task 2"
        initial_count = len(core.list_tasks())
        task = core.add_task(description)
        tasks_list = core.list_tasks()
        assert len(tasks_list) == initial_count + 1
        assert tasks_list[-1]['description'] == description
        assert not tasks_list[-1]['done']

    def test_add_task_increments_id(self):
        """Verify that task IDs are unique and incrementing."""
        task1 = core.add_task("First task")
        task2 = core.add_task("Second task")
        assert task1['id'] == 1
        assert task2['id'] == 2

    def test_list_tasks_returns_copy(self):
        """Verify list_tasks returns a copy, not the original list."""
        core.add_task("Sample Task")
        list1 = core.list_tasks()
        list2 = core.list_tasks()
        assert list1 == list2
        assert list1 is not list2
        list1.append({"id": 99, "description": "rogue task", "done": True})
        list3 = core.list_tasks()
        assert len(list3) == 1

    def test_add_task_empty_description_raises_error(self):
        """Verify ValueError for empty description."""
        with pytest.raises(ValueError, match="cannot be empty"): # Added match
             core.add_task("")

    def test_add_task_whitespace_description_raises_error(self):
        """Verify ValueError for whitespace-only description."""
        with pytest.raises(ValueError, match="cannot be empty"): # Added match
             core.add_task("   \t\n ")

# Note: The original standalone task tests are now methods of TestTaskManager

Key changes:

  • A class TestTaskManager is created.
  • The task-related test functions are now methods of this class (and take self as the first argument, though it's often unused in basic tests unless accessing class attributes or helper methods).
  • setup_method / teardown_method: These are special methods Pytest recognizes within test classes.
    • setup_method(self) runs before each test method in the class.
    • teardown_method(self) runs after each test method in the class, regardless of whether the test passed or failed. This provides a basic mechanism for setup/cleanup per test, replacing the repetitive core.clear_tasks() calls within each test. (Again, fixtures are generally superior).
  • pytest.raises now includes an optional match parameter. This allows you to assert that the error message contains a specific substring, making the test more robust.

Running Class-Based Tests:

Running pytest still works the same way. Pytest discovers the TestTaskManager class and runs all test_* methods within it.

You can also run tests specific to a class:

# Run all tests within the TestTaskManager class in test_core.py
pytest tests/test_core.py::TestTaskManager

# Run a specific method within the class
pytest tests/test_core.py::TestTaskManager::test_add_task_increments_id

Grouping tests in classes can be useful, but modern Pytest practices often favor using fixtures for setup/teardown and grouping primarily by file and markers, as classes can sometimes add unnecessary boilerplate.
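
As a brief preview of that fixture-based style (fixtures are covered in depth later), the setup_method/teardown_method pair above could be replaced by a single autouse fixture. The following is a sketch for illustration only; the class and test names are placeholders, not code to add to the project yet:

# Sketch: an autouse fixture giving each test in the class fresh task state,
# equivalent in effect to the setup_method/teardown_method pair shown above.
import pytest
from taskmgr import core

class TestTaskManagerWithFixture:
    @pytest.fixture(autouse=True)
    def clean_task_state(self):
        core.clear_tasks()   # runs before each test method
        yield                # the test itself runs here
        core.clear_tasks()   # runs after each test method, pass or fail

    def test_starts_with_no_tasks(self):
        assert core.list_tasks() == []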

Using Markers (@pytest.mark)

Markers are a powerful Pytest feature for categorizing tests, applying specific behavior, or selecting subsets of tests to run. They are implemented using Python decorators.

Common Use Cases:

  1. Categorization: Tag tests based on their type (e.g., unit, integration, slow, fast, database, network).
  2. Skipping Tests: Conditionally skip tests based on certain criteria (@pytest.mark.skip, @pytest.mark.skipif).
  3. Expected Failures: Mark tests that are known to fail but shouldn't break the build (@pytest.mark.xfail).
  4. Parameterization: Run a test multiple times with different inputs (@pytest.mark.parametrize, covered later).

Defining and Using Markers:

You can use built-in markers or define your own custom markers.

Example (tests/test_core.py):

Let's add some markers to our tests. Suppose we want to mark some tests as fast and others related to error handling as errors.

# tests/test_core.py
from taskmgr import core
import pytest
import sys # Import sys to use in skipif

# Register custom markers in pytest.ini or pyproject.toml (Recommended)
# See explanation below

# --- Tests for the 'add' function (Mark as 'fast') ---
@pytest.mark.fast
def test_add_positive_numbers():
    assert core.add(5, 3) == 8

@pytest.mark.fast
def test_add_negative_numbers():
    assert core.add(-2, -4) == -6

@pytest.mark.fast
def test_add_mixed_numbers():
    assert core.add(10, -3) == 7

# --- Group task-related tests in a class ---
# Can mark the whole class or individual methods
@pytest.mark.tasks # Mark the whole class related to tasks
class TestTaskManager:
    """Groups tests related to task management operations."""

    def setup_method(self):
        core.clear_tasks()

    def teardown_method(self):
        core.clear_tasks()

    @pytest.mark.fast # Mark specific methods too
    def test_add_task_returns_dict(self):
        description = "Test Task 1"
        task = core.add_task(description)
        assert isinstance(task, dict)
        # ... more assertions ...

    @pytest.mark.fast
    def test_add_task_adds_to_list(self):
        description = "Test Task 2"
        core.add_task(description)
        tasks_list = core.list_tasks()
        assert len(tasks_list) == 1
        # ... more assertions ...

    def test_add_task_increments_id(self):
        task1 = core.add_task("First task")
        task2 = core.add_task("Second task")
        assert task1['id'] == 1
        assert task2['id'] == 2

    # Example of skipping a test based on a condition
    @pytest.mark.skipif(sys.platform == "win32", reason="Test relies on Linux-specific behavior (example)")
    def test_list_tasks_returns_copy(self):
        core.add_task("Sample Task")
        list1 = core.list_tasks()
        list2 = core.list_tasks()
        assert list1 == list2
        assert list1 is not list2

    @pytest.mark.errors # Mark error handling tests
    def test_add_task_empty_description_raises_error(self):
        with pytest.raises(ValueError, match="cannot be empty"):
             core.add_task("")

    @pytest.mark.errors # Mark error handling tests
    @pytest.mark.xfail(reason="Whitespace stripping not yet implemented robustly (example)") # Example xfail
    def test_add_task_whitespace_description_raises_error(self):
        # Let's pretend our stripping isn't perfect yet
        # This test is expected to fail for now
        with pytest.raises(ValueError, match="cannot be empty"):
             core.add_task("   \t\n ")

Registering Custom Markers:

Using custom markers (like @pytest.mark.fast, @pytest.mark.errors, @pytest.mark.tasks) without registering them will cause Pytest to issue warnings. It's best practice to register them to avoid typos and provide descriptions. You can do this in pytest.ini or pyproject.toml.

pyproject.toml (Recommended):

Add a [tool.pytest.ini_options] section:

# pyproject.toml

[build-system]
# ... (keep existing content) ...

[project]
# ... (keep existing content) ...

[project.optional-dependencies]
# ... (keep existing content) ...

# Add this section for Pytest configuration
[tool.pytest.ini_options]
minversion = "6.0" # Optional: specify minimum pytest version
addopts = "-ra -q" # Optional: add default command line options (e.g., report extra info for skips/xfails)
testpaths = [      # Optional: explicitly define where to look for tests
    "tests",
]
markers = [
    "fast: marks tests as fast running unit tests (deselect with '-m \"not fast\"')",
    "errors: marks tests focusing on error handling",
    "tasks: marks tests related to the task management core",
    # Add other markers like 'slow', 'integration', 'cli' etc. as needed
]

pytest.ini (Alternative):

Create a file named pytest.ini in your project root (taskmgr-project):

# pytest.ini
[pytest]
minversion = 6.0
addopts = -ra -q
testpaths =
    tests
markers =
    fast: marks tests as fast running unit tests (deselect with '-m "not fast"')
    errors: marks tests focusing on error handling
    tasks: marks tests related to the task management core

Running Tests with Markers:

The -m command-line option allows you to select tests based on markers.

# Run only tests marked as 'fast'
pytest -m fast

# Run only tests marked as 'errors'
pytest -m errors

# Run only tests marked as 'tasks' (includes all methods in TestTaskManager)
pytest -m tasks

# Run tests that are NOT marked as 'fast'
pytest -m "not fast"

# Run tests marked as 'tasks' AND 'fast'
pytest -m "tasks and fast"

# Run tests marked as 'tasks' OR 'errors'
pytest -m "tasks or errors"

Output Explanation:

  • When you run tests with markers, the output will show how many tests were selected and how many were deselected.
  • @pytest.mark.skipif: If the condition (e.g., sys.platform == "win32") is true, the test is skipped. Pytest reports skipped tests with an s.
  • @pytest.mark.xfail: If the test fails as expected, Pytest reports it as xfailed (expected failure) with an x. If it unexpectedly passes, it's reported as XPASS. This is useful for tracking known bugs or tests for features not yet fully implemented.
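
If an unexpected pass should be treated as a real failure, for instance to force removal of a stale marker once the underlying bug is fixed, xfail also accepts strict=True, as sketched below (the test body and reason are placeholders):

import pytest

@pytest.mark.xfail(strict=True, reason="tracking a known bug (placeholder)")
def test_known_broken_behavior():
    # Fails today, so it is reported as xfailed. Once the bug is fixed, the
    # unexpected pass (XPASS) will fail the run because strict=True, which
    # prompts you to remove the now-stale marker.
    assert 1 == 2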

Markers provide a flexible and powerful way to manage and execute your growing test suite.

Workshop Organizing Tests for the Task Manager

Let's apply these structuring techniques to our taskmgr project.

Goal:
Organize the existing tests using files, classes (where appropriate), and markers. Register the markers.

Steps:

  1. Review Current Structure: We currently have:

    • tests/test_core.py: Contains tests for add and task functions.
    • A TestTaskManager class within test_core.py grouping task tests.
    • Some basic markers added (fast, errors, tasks, skipif, xfail).
  2. Refine File Structure (No changes needed yet): The current structure with tests/test_core.py for src/taskmgr/core.py is appropriate. We still have tests/test_cli.py ready for when we implement CLI tests.

  3. Refine Class Structure (Keep TestTaskManager): The TestTaskManager class provides reasonable grouping for the core task operations. We'll keep it.

  4. Apply and Register Markers:

    • Ensure the markers (fast, errors, tasks, skipif, xfail) are applied as shown in the example above in tests/test_core.py.
    • Register these markers in pyproject.toml (or pytest.ini). Let's use pyproject.toml.

    Update pyproject.toml: Make sure the [tool.pytest.ini_options] section exists and includes the markers list as shown in the previous section.

    # pyproject.toml snippet
    [tool.pytest.ini_options]
    minversion = "6.0"
    addopts = "-ra -q" # -r a: short summary of all non-passing results (skips, xfails, failures); -q: quieter output
    testpaths = [
        "tests",
    ]
    markers = [
        "fast: marks tests as fast running unit tests",
        "errors: marks tests focusing on error handling",
        "tasks: marks tests related to the task management core",
        # We don't need to register built-in markers like skipif or xfail
    ]
    
  5. Run Tests with Markers: Experiment with running different subsets of tests from your project root directory (taskmgr-project). Make sure your virtual environment is active.

    # Run all tests (verify registration removed warnings)
    pytest
    
    # Run only fast tests
    pytest -m fast
    
    # Run only error handling tests
    pytest -m errors
    
    # Run only task-related tests (should select the whole class)
    pytest -m tasks
    
    # Run tests that are NOT fast
    pytest -m "not fast"
    
    # Run tests marked as tasks AND fast
    pytest -m "tasks and fast"
    
    # Run only the test expected to fail (xfail)
    # Note: xfail tests run by default unless explicitly excluded
    # To run ONLY xfail tests is tricky, often easier to run by name/marker
    pytest -k test_add_task_whitespace_description_raises_error
    
    # Run all tests EXCEPT those marked 'errors'
    pytest -m "not errors"
    
  6. Observe Output: Pay attention to the test summary lines. You should see confirmations like X deselected or X passed, Y skipped, Z xfailed. Ensure the xfail test is reported correctly (as x or X). If you are not on Windows, the skipif test should run and pass; if you were on Windows, it would be skipped (s). The -ra option in addopts should give you a summary of skips and xfails at the end.

By organizing tests into logical files and using classes and markers, you create a more manageable and scalable test suite. This structure makes it easier to understand test coverage, run relevant tests quickly, and maintain the tests as the application evolves.

3. Handling Failures and Errors

A core purpose of testing is to detect problems. Pytest provides excellent feedback when tests fail or encounter errors. Understanding how to read this output and how to specifically test for expected error conditions is crucial.

Reading Pytest Output

When a test fails, Pytest provides a detailed traceback to help you diagnose the issue quickly. Let's re-examine the failure output from our initial buggy add function:

_________________________ test_add_positive_numbers __________________________

    def test_add_positive_numbers():
        """
        Tests the add function with two positive integers.
        Verifies that the sum is calculated correctly.
        """
        result = core.add(5, 3)
        # The core of a Pytest test: the assert statement
>       assert result == 8, "Assertion Failed: Expected 5 + 3 to be 8"
E       AssertionError: Assertion Failed: Expected 5 + 3 to be 8
E       assert 15 == 8
E        +  where 15 = <function add at 0x...> (5, 3)
E        +    where <function add at 0x...> = taskmgr.core.add

tests/test_core.py:15: AssertionError

Breakdown:

  1. Header: _________________________ test_add_positive_numbers __________________________ clearly identifies the failing test function.
  2. Source Code Snippet: Pytest shows the relevant lines from your test function.
  3. Failure Point: The line starting with > indicates the exact line where the failure occurred (assert result == 8...).
  4. Exception Type: E AssertionError: shows the type of exception that caused the failure. For assert statements, this is always AssertionError. If your code raised an unexpected exception (e.g., TypeError, NameError), that exception type would be shown here.
  5. Assertion Message: E Assertion Failed: Expected 5 + 3 to be 8 shows the custom message provided in the assert statement.
  6. Assertion Introspection: E assert 15 == 8 This is Pytest's magic. It re-evaluates the failing assertion and shows the actual values involved (result was 15, compared against the expected 8).
  7. Value Provenance: E + where 15 = <function add at 0x...> (5, 3) Pytest traces back where the value 15 came from – the result of calling taskmgr.core.add(5, 3). This is incredibly helpful for complex expressions.
  8. Location: tests/test_core.py:15: AssertionError gives the file and line number of the failing assertion.

Understanding this detailed output allows you to quickly pinpoint whether the bug is in the test logic itself or in the code being tested, and precisely what the discrepancy is.

Testing for Expected Exceptions (pytest.raises)

Sometimes, the correct behavior of your code is to raise an exception under specific circumstances (e.g., invalid input, resource not found). You need to test that the code does raise the expected exception when it should. Trying to catch exceptions manually with try...except blocks in tests is clumsy and error-prone. Pytest provides the pytest.raises context manager for this purpose.

Syntax:

import pytest

def test_function_raises_exception():
    with pytest.raises(ExpectedExceptionType):
        # Code that is expected to raise ExpectedExceptionType
        call_function_that_should_raise(invalid_input)

    # Optional: Assert properties of the raised exception
    with pytest.raises(ExpectedExceptionType) as excinfo:
        call_function_that_should_raise(invalid_input)
    # Now you can inspect the caught exception via excinfo
    assert "some specific message" in str(excinfo.value)
    assert excinfo.type is ExpectedExceptionType

How it Works:

  1. You wrap the code expected to raise an exception within the with pytest.raises(ExpectedExceptionType): block.
  2. If the code inside the block does raise an instance of ExpectedExceptionType (or a subclass of it), the test passes, and execution continues after the with block.
  3. If the code inside the block raises a different exception, or no exception at all, pytest.raises catches this discrepancy, and the test fails.
  4. Inspecting the Exception: By using as excinfo, you capture information about the caught exception in the excinfo object (an ExceptionInfo instance). You can then make further assertions about the exception, such as checking its type (excinfo.type) or its message/arguments (excinfo.value, str(excinfo.value)).
  5. Matching Error Messages: You can also pass a match argument to pytest.raises to assert that the string representation of the exception matches a regular expression pattern. This is often more concise than using as excinfo just to check the message.

Example (tests/test_core.py):

We already used this briefly. Let's refine the tests for add_task with invalid descriptions:

# tests/test_core.py (within TestTaskManager class)
# ... other imports ...

    @pytest.mark.errors
    def test_add_task_empty_description_raises_error(self):
        """Verify ValueError for empty description with specific message."""
        # Use match for concise message checking
        expected_message = "Task description cannot be empty."
        with pytest.raises(ValueError, match=expected_message):
             core.add_task("")
        # The test fails if ValueError is not raised, or if the message doesn't match.

    @pytest.mark.errors
    # Remove xfail for now, assume our code works
    # @pytest.mark.xfail(reason="Whitespace stripping not yet implemented robustly (example)")
    def test_add_task_whitespace_description_raises_error(self):
        """Verify ValueError for whitespace-only description using excinfo."""
        expected_message = "Task description cannot be empty."
        with pytest.raises(ValueError) as excinfo:
             core.add_task("   \t\n ")
        # Assert specific details using the excinfo object
        assert str(excinfo.value) == expected_message
        assert excinfo.type is ValueError

Using pytest.raises is the standard, robust way to test for expected exceptions in Pytest.
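
One detail worth keeping in mind: the match argument is treated as a regular expression and checked with re.search, so messages containing regex metacharacters (parentheses, question marks, brackets) should be escaped. A minimal sketch, reusing the add_task error message from the example above:

# Sketch: escaping a literal message before passing it to match
import re
import pytest
from taskmgr import core

def test_add_task_empty_description_message_is_literal():
    # re.escape turns the literal text into a safe pattern; without it,
    # characters such as '(' or '?' in a message would be read as regex syntax.
    expected_message = "Task description cannot be empty."
    with pytest.raises(ValueError, match=re.escape(expected_message)):
        core.add_task("")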

Testing for Expected Warnings (pytest.warns)

Similar to exceptions, sometimes functions should issue warnings (using Python's warnings module) to signal potential issues or deprecated usage without stopping execution. Pytest provides pytest.warns to test for these scenarios.

Syntax:

import pytest
import warnings

def function_that_might_warn(arg):
    if arg < 0:
        warnings.warn("Negative argument provided", UserWarning)
        # DeprecationWarning is another common type
    return arg * 2

def test_function_warns():
    # Check if the code issues *any* warning of the specified type
    with pytest.warns(UserWarning):
        result = function_that_might_warn(-5)
    assert result == -10 # Check return value as well

    # Check if the warning message matches a pattern
    with pytest.warns(UserWarning, match="Negative argument"):
        function_that_might_warn(-10)

    # Capture the warning details (less common than for exceptions)
    with pytest.warns(UserWarning) as record:
        function_that_might_warn(-1)
    assert len(record) == 1 # Check how many warnings were caught
    assert "Negative argument provided" in str(record[0].message)

How it Works:

  • It works analogously to pytest.raises. The test passes if the code inside the with block issues at least one warning of the specified type (UserWarning, DeprecationWarning, etc.).
  • It fails if no warning of the specified type is issued; any exception raised inside the block simply propagates and fails the test as usual.
  • You can use match to check the warning message.
  • You can use as record to capture a list of recorded warning objects (warnings.WarningMessage instances) for more detailed inspection.

When to Use: Test that your code correctly uses the warnings module for non-critical issues, API deprecations, or potentially problematic usage patterns.
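
If you prefer fixture-style access over a context manager, Pytest also ships a built-in recwarn fixture that records every warning emitted during the test. The sketch below uses a hypothetical legacy_helper function purely for illustration:

import warnings

def legacy_helper():
    # Hypothetical function used only for this sketch.
    warnings.warn("legacy_helper is deprecated, use new_helper instead", DeprecationWarning)
    return 42

def test_legacy_helper_warns(recwarn):
    # recwarn records all warnings raised while the test runs.
    result = legacy_helper()
    assert result == 42
    assert len(recwarn) == 1
    warning = recwarn.pop(DeprecationWarning)  # fetch (and remove) the first matching warning
    assert "deprecated" in str(warning.message)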

Using -x (Exit on First Failure) and -k (Keyword Matching)

Pytest offers command-line options to control test execution flow, especially useful when dealing with failures:

  1. -x or --exitfirst:

    • Purpose: Stop the entire test session immediately after the first test failure or error.
    • Use Case: Useful when you have a large test suite and a fundamental failure occurs early on (e.g., environment setup failed, core component broken). Running further tests might be pointless and time-consuming. Running pytest -x gives you the first point of failure quickly.
    pytest -x
    
  2. -k <expression>:

    • Purpose: Run only tests whose keywords match the given expression. A test's keywords include its file name, class name, and function/method name (the components of its node ID, which pytest joins with ::), as well as the names of any markers applied to it.
    • Use Case: Quickly run a specific test or a group of related tests without editing files or relying solely on markers. Useful for debugging a particular function or feature.
    • Expression Syntax:
      • MyClass: Selects tests in classes named MyClass.
      • my_function: Selects tests named my_function.
      • MyClass and my_function: Selects tests matching both.
      • MyClass or other_function: Selects tests matching either.
      • not specific_test: Selects tests not matching.
      • Case-insensitive matching.
      • Can match parts of names (e.g., -k add would match test_add_positive_numbers, test_add_task_adds_to_list, etc.). Be specific to avoid unintended matches, or pass the full node ID (e.g., tests/test_core.py::TestTaskManager::test_name) as a positional argument instead of using -k.
    # Run only tests with 'add_task' in their name
    pytest -k add_task
    
    # Run only test_add_task_increments_id within TestTaskManager in test_core.py
    pytest -k "TestTaskManager and test_add_task_increments_id"
    # More specific way:
    pytest tests/test_core.py::TestTaskManager::test_add_task_increments_id
    
    # Run all tests EXCEPT those related to 'add_task'
    pytest -k "not add_task"
    
    # Run tests related to 'add' or 'list'
    pytest -k "add or list"
    

These options provide fine-grained control over which tests are run and how failures are handled, significantly improving the debugging and testing workflow.

Workshop Testing Error Conditions

Let's enhance our taskmgr by adding more functions and ensuring we test their error handling thoroughly using pytest.raises. We'll add functions to retrieve a task by ID and mark a task as done.

Goal:
Implement get_task_by_id and mark_task_done in core.py. Write tests for their successful operation and, importantly, for cases where they should raise errors (e.g., task not found).

Steps:

  1. Implement New Core Functions: Add the following functions to src/taskmgr/core.py:

    # src/taskmgr/core.py
    # ... (keep existing code: _tasks, _next_id, add, add_task, list_tasks, clear_tasks) ...
    
    class TaskNotFoundError(Exception):
        """Custom exception for when a task ID is not found."""
        def __init__(self, task_id):
            self.task_id = task_id
            super().__init__(f"Task with ID {task_id} not found.")
    
    def get_task_by_id(task_id: int) -> dict:
        """
        Retrieves a single task by its ID.
    
        Args:
            task_id: The ID of the task to retrieve.
    
        Returns:
            The task dictionary if found.
    
        Raises:
            TaskNotFoundError: If no task with the given ID exists in the list.
        """
        for task in _tasks:
            if task['id'] == task_id:
                # Return a copy to prevent modification of internal state? Debatable for get.
                # Let's return a copy for consistency with list_tasks.
                return task.copy()
        # If loop finishes without finding the task
        raise TaskNotFoundError(task_id)
    
    def mark_task_done(task_id: int) -> dict:
        """
        Marks a task as done based on its ID.
    
        Args:
            task_id: The ID of the task to mark as done.
    
        Returns:
            The updated task dictionary.
    
        Raises:
            TaskNotFoundError: If no task with the given ID exists in the list.
        """
        for task in _tasks:
            if task['id'] == task_id:
                # Idempotent: marking an already-done task again is a no-op.
                task['done'] = True
                return task.copy() # Return a copy of the updated task
        raise TaskNotFoundError(task_id)
    
    def mark_task_undone(task_id: int) -> dict:
        """Marks a task as not done (undone). Added for completeness."""
        for task in _tasks:
            if task['id'] == task_id:
                task['done'] = False
                return task.copy()
        raise TaskNotFoundError(task_id)
    
    Self-Correction: Added a custom exception TaskNotFoundError for better error specificity than using a generic ValueError or LookupError. Made get_task_by_id and mark_task_done return copies of the task dictionary. Added mark_task_undone for symmetry. Considered idempotency for mark_task_done.

  2. Write Tests for New Functions (Success Cases): Add these tests to the TestTaskManager class in tests/test_core.py.

    # tests/test_core.py (within TestTaskManager class)
    # ... imports ...
    
    # Add TaskNotFoundError to imports if needed at top level
    from taskmgr.core import TaskNotFoundError
    
    class TestTaskManager:
        # ... setup_method, teardown_method ...
        # ... existing test methods ...
    
        def test_get_task_by_id_success(self):
            """Verify retrieving an existing task by its ID."""
            task1 = core.add_task("Task for get")
            retrieved_task = core.get_task_by_id(task1['id'])
            assert retrieved_task == task1
            # Verify it's a copy (optional but good practice)
            assert retrieved_task is not task1 # Check identity, should be different objects
    
        def test_mark_task_done_success(self):
            """Verify marking an existing task as done."""
            task = core.add_task("Task to complete")
            assert not task['done'] # Verify initial state
            updated_task = core.mark_task_done(task['id'])
            assert updated_task['id'] == task['id']
            assert updated_task['done'] is True
            # Verify the internal state was updated
            retrieved_task = core.get_task_by_id(task['id'])
            assert retrieved_task['done'] is True
    
        def test_mark_task_done_idempotent(self):
            """Verify marking a task done multiple times has no adverse effect."""
            task = core.add_task("Task to complete repeatedly")
            core.mark_task_done(task['id']) # Mark done once
            updated_task = core.mark_task_done(task['id']) # Mark done again
            assert updated_task['done'] is True
            retrieved_task = core.get_task_by_id(task['id'])
            assert retrieved_task['done'] is True # Still done
    
        def test_mark_task_undone_success(self):
            """Verify marking a completed task as not done."""
            task = core.add_task("Task to un-complete")
            core.mark_task_done(task['id']) # Mark as done first
            assert core.get_task_by_id(task['id'])['done'] is True
            # Now mark as undone
            updated_task = core.mark_task_undone(task['id'])
            assert updated_task['id'] == task['id']
            assert updated_task['done'] is False
            # Verify internal state
            retrieved_task = core.get_task_by_id(task['id'])
            assert retrieved_task['done'] is False
    
  3. Write Tests for Error Cases using pytest.raises: Add these tests to the TestTaskManager class as well, marked appropriately.

    # tests/test_core.py (within TestTaskManager class)
    # ... imports ...
    
    class TestTaskManager:
        # ... setup_method, teardown_method ...
        # ... existing test methods ...
        # ... success tests for get/mark ...
    
        @pytest.mark.errors
        def test_get_task_by_id_not_found(self):
            """Verify TaskNotFoundError when ID doesn't exist."""
            core.add_task("Some other task") # Ensure list isn't empty
            non_existent_id = 999
            with pytest.raises(TaskNotFoundError) as excinfo:
                core.get_task_by_id(non_existent_id)
            # Check the exception message contains the ID
            assert str(non_existent_id) in str(excinfo.value)
            assert f"Task with ID {non_existent_id} not found" == str(excinfo.value)
            assert excinfo.value.task_id == non_existent_id # Check custom attribute
    
        @pytest.mark.errors
        def test_mark_task_done_not_found(self):
            """Verify TaskNotFoundError when marking a non-existent task done."""
            core.add_task("Some other task")
            non_existent_id = 999
            with pytest.raises(TaskNotFoundError, match=f"Task with ID {non_existent_id} not found"):
                core.mark_task_done(non_existent_id)
    
        @pytest.mark.errors
        def test_mark_task_undone_not_found(self):
            """Verify TaskNotFoundError when marking a non-existent task undone."""
            core.add_task("Some other task")
            non_existent_id = 999
            with pytest.raises(TaskNotFoundError) as excinfo:
                 core.mark_task_undone(non_existent_id)
            assert excinfo.value.task_id == non_existent_id
    
  4. Run Tests: Execute pytest in your terminal from the project root.

    pytest
    # Or run only the error tests
    pytest -m errors
    # Or run only tests for get_task_by_id
    pytest -k get_task_by_id
    # Or stop on the first failure if any occur
    pytest -x
    
  5. Analyze Output: All tests should pass. Verify that the new tests are collected and executed. If you temporarily introduce a bug (e.g., remove the raise TaskNotFoundError line in get_task_by_id), run pytest -m errors or pytest -k get_task_by_id_not_found and observe how pytest.raises causes the test to fail because the expected exception was not raised. Then fix the bug and see the test pass again.

This workshop demonstrates how to proactively test for expected error conditions using pytest.raises, ensuring your code handles failures gracefully and predictably. Combining error tests with success-case tests provides comprehensive validation of your functions' behavior.

4. Intermediate Testing Techniques

We've covered the basics of writing, structuring, and running tests, along with handling expected errors. Now, let's explore more advanced Pytest features that significantly improve test efficiency, maintainability, and expressiveness, starting with fixtures.

Introduction to Fixtures (@pytest.fixture)

Fixtures are one of Pytest's most powerful and distinguishing features. They provide a fixed baseline state or context for tests, managing setup and teardown logic cleanly and efficiently. Think of them as providing the necessary ingredients or environment for your tests to run.

What Fixtures Replace:

  • Repetitive Setup Code: Instead of calling the same setup functions (like our core.clear_tasks()) at the start of multiple tests or using setup_method/teardown_method in classes.
  • Passing Data Around Manually: Instead of creating common test data within each test function.

How Fixtures Work (Dependency Injection):

  1. Definition: You define a fixture using the @pytest.fixture decorator on a function. This function sets up the desired state or object and typically yields or returns it.
  2. Teardown (Optional): If the fixture function uses yield, the code after the yield statement serves as the teardown logic, which runs after the test using the fixture has completed.
  3. Usage: Test functions (or other fixtures) can request a fixture simply by including its name as an argument in their function signature. Pytest's dependency injection mechanism automatically finds the fixture function, executes it, and passes its result (the value returned or yielded) to the test function. A minimal sketch follows just below this list.
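
To make the injection mechanism concrete, here is a minimal, self-contained sketch (the database_connection resource is purely hypothetical):

import pytest

@pytest.fixture
def database_connection():
    conn = {"connected": True}      # setup: create the resource
    yield conn                      # the test runs while we are "paused" here
    conn["connected"] = False       # teardown: runs after the test finishes

def test_query_uses_connection(database_connection):
    # Pytest matches the argument name to the fixture, runs it,
    # and injects the yielded value here.
    assert database_connection["connected"] is True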

Why Fixtures? Setup, Teardown, and Reuse

The primary benefits of using fixtures are:

  1. Separation of Concerns: Test setup/teardown logic is separated from the actual test verification logic, making tests cleaner and more focused on what they are asserting.
  2. Reusability: A single fixture can be used by multiple test functions, classes, or even across different test modules (using conftest.py). This eliminates code duplication.
  3. Explicit Dependencies: Test function signatures clearly declare what resources or states they depend on (the fixtures they list as arguments). This improves readability and understanding.
  4. Flexible Scopes: Fixtures can be configured to run once per function, class, module, or the entire test session, allowing for optimization by performing expensive setup operations only when necessary.
  5. Composability: Fixtures can request other fixtures, creating complex setup scenarios from smaller, reusable building blocks.
  6. Clean Teardown: The yield mechanism provides a robust way to ensure cleanup code runs even if the test fails.

Fixture Scopes (function, class, module, session)

The scope argument to @pytest.fixture controls how often a fixture is set up and torn down.

  1. scope="function" (Default):

    • The fixture runs once for each test function that requests it.
    • Setup runs before the test function, teardown runs after.
    • Ensures maximum isolation between tests, as each test gets a fresh instance of the fixture.
  2. scope="class":

    • The fixture runs once per test class.
    • Setup runs before the first test method in the class that requests it, teardown runs after the last test method in the class.
    • Useful when multiple methods in a class need the same expensive-to-create resource, and the tests don't modify the resource in a way that interferes with others.
  3. scope="module":

    • The fixture runs once per module (i.e., per test file like test_core.py).
    • Setup runs before the first test in the module that requests it, teardown runs after the last test in the module.
    • Good for setup that's relevant to all tests in a file but expensive to repeat for every function or class.
  4. scope="session":

    • The fixture runs only once for the entire test session (i.e., when you run pytest).
    • Setup runs before any tests requesting it are executed, teardown runs after all tests have completed.
    • Suitable for very expensive setup operations that are globally applicable and whose state isn't mutated by tests (e.g., setting up a database connection pool, starting an external service).

Choosing the Right Scope: Start with the narrowest scope (function) that makes sense for isolation. Increase the scope (class, module, session) only if:

  • The setup is genuinely expensive.
  • You can guarantee that tests sharing the fixture won't interfere with each other by modifying the fixture's state inappropriately. (A session-scoped sketch follows this list.)
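
As an illustration of a wider scope, here is a hedged sketch of a session-scoped fixture providing read-only configuration; the settings themselves are invented for this example:

import pytest

@pytest.fixture(scope="session")
def config_settings():
    # Imagine this loads a large config file or opens a connection pool.
    # It runs once per pytest invocation; all tests share the same object,
    # so they must treat it as read-only.
    settings = {"api_url": "http://example.com/api", "timeout": 5}
    yield settings
    # Session-level teardown (e.g. closing the pool) would go here.

def test_timeout_is_positive(config_settings):
    assert config_settings["timeout"] > 0

def test_api_url_is_http(config_settings):
    assert config_settings["api_url"].startswith("http")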

Using Fixtures in Test Functions

Let's refactor our TestTaskManager class to use fixtures instead of setup_method / teardown_method.

Example (tests/test_core.py):

# tests/test_core.py
from taskmgr import core
from taskmgr.core import TaskNotFoundError
import pytest
import sys

# --- Fixtures ---

@pytest.fixture(autouse=True) # Apply automatically to all tests in the module
def clean_task_state():
    """Fixture to clear task list and reset ID before each test."""
    # print("\nSetting up via fixture...") # For demonstration
    core.clear_tasks()
    yield # Test runs here
    # print("Tearing down via fixture...") # For demonstration
    # No explicit teardown needed if clear_tasks is sufficient setup
    # If teardown was needed, it would go here

@pytest.fixture
def sample_tasks():
    """Fixture to provide a list of pre-added tasks."""
    task1 = core.add_task("Buy milk")
    task2 = core.add_task("Read book")
    task3 = core.add_task("Learn fixtures")
    return [task1, task2, task3] # Return the list of added tasks

# --- Keep the 'add' function tests as standalone functions (don't need fixtures) ---
@pytest.mark.fast
def test_add_positive_numbers():
    # This test doesn't request any specific fixture,
    # but `clean_task_state` runs anyway due to autouse=True
    assert core.add(5, 3) == 8
# ... other add tests ...


# --- Refactored Test Class using Fixtures ---
# We might not even need the class structure anymore if fixtures handle setup well
# Let's keep it for demonstration but remove setup/teardown methods

@pytest.mark.tasks
class TestTaskManager:
    """Groups tests related to task management operations using fixtures."""

    # No setup_method or teardown_method needed!
    # The `clean_task_state` fixture handles this automatically.

    @pytest.mark.fast
    def test_add_task_returns_dict(self):
        # clean_task_state runs automatically before this test
        description = "Test Task 1"
        task = core.add_task(description)
        assert isinstance(task, dict)
        assert task['description'] == description

    @pytest.mark.fast
    def test_add_task_adds_to_list(self):
        description = "Test Task 2"
        initial_count = len(core.list_tasks()) # Should be 0 due to fixture
        assert initial_count == 0
        core.add_task(description)
        tasks_list = core.list_tasks()
        assert len(tasks_list) == 1
        assert tasks_list[0]['description'] == description

    def test_add_task_increments_id(self):
        task1 = core.add_task("First task")
        task2 = core.add_task("Second task")
        assert task1['id'] == 1
        assert task2['id'] == 2

    # Using the sample_tasks fixture
    def test_list_tasks(self, sample_tasks): # Request sample_tasks fixture
        """Test listing tasks when some already exist."""
        # sample_tasks fixture runs, adding 3 tasks. clean_task_state also runs first.
        retrieved_tasks = core.list_tasks()
        assert len(retrieved_tasks) == len(sample_tasks)
        # Check if descriptions match (order might matter depending on impl)
        assert retrieved_tasks[0]['description'] == sample_tasks[0]['description']
        assert retrieved_tasks[1]['description'] == sample_tasks[1]['description']

    @pytest.mark.skipif(sys.platform == "win32", reason="Example skip")
    def test_list_tasks_returns_copy(self, sample_tasks): # Use fixture data
        list1 = core.list_tasks()
        list2 = core.list_tasks()
        assert list1 == list2
        assert list1 is not list2 # Should be different list objects

    # --- Error handling tests ---
    @pytest.mark.errors
    def test_add_task_empty_description_raises_error(self):
        with pytest.raises(ValueError, match="cannot be empty"):
             core.add_task("")

    @pytest.mark.errors
    def test_add_task_whitespace_description_raises_error(self):
        with pytest.raises(ValueError, match="cannot be empty"):
             core.add_task("   \t\n ")

    # Use sample_tasks fixture for get/mark tests
    @pytest.mark.errors
    def test_get_task_by_id_not_found(self, sample_tasks): # Use fixture
        non_existent_id = 999
        with pytest.raises(TaskNotFoundError) as excinfo:
            core.get_task_by_id(non_existent_id)
        assert excinfo.value.task_id == non_existent_id

    def test_get_task_by_id_success(self, sample_tasks): # Use fixture
        # Get ID of the second task added by the fixture
        target_task = sample_tasks[1]
        retrieved_task = core.get_task_by_id(target_task['id'])
        assert retrieved_task == target_task

    def test_mark_task_done_success(self, sample_tasks): # Use fixture
        target_task = sample_tasks[0] # Mark the first task
        updated_task = core.mark_task_done(target_task['id'])
        assert updated_task['done'] is True
        retrieved_task = core.get_task_by_id(target_task['id'])
        assert retrieved_task['done'] is True

    @pytest.mark.errors
    def test_mark_task_done_not_found(self, sample_tasks): # Use fixture
        non_existent_id = 999
        with pytest.raises(TaskNotFoundError):
            core.mark_task_done(non_existent_id)

    # ... Add tests for mark_task_undone using sample_tasks fixture ...
    def test_mark_task_undone_success(self, sample_tasks):
        target_task = sample_tasks[2] # Use the third task
        # Mark it done first
        core.mark_task_done(target_task['id'])
        assert core.get_task_by_id(target_task['id'])['done'] is True
        # Now mark undone
        updated_task = core.mark_task_undone(target_task['id'])
        assert updated_task['done'] is False
        retrieved_task = core.get_task_by_id(target_task['id'])
        assert retrieved_task['done'] is False

    @pytest.mark.errors
    def test_mark_task_undone_not_found(self, sample_tasks):
        non_existent_id = 1001
        with pytest.raises(TaskNotFoundError):
            core.mark_task_undone(non_existent_id)

Key Changes:

  • clean_task_state Fixture: Defined with @pytest.fixture(autouse=True). The autouse=True part means this fixture will automatically run before every test function defined in test_core.py without needing to be listed as an argument. It replaces the need for setup_method and teardown_method or manual clear_tasks() calls. It uses yield to separate setup (before yield) from potential teardown (after yield, though none is needed here).
  • sample_tasks Fixture: Defined with @pytest.fixture. This fixture adds three tasks and returns them as a list. Tests that need this pre-populated state simply list sample_tasks as an argument (e.g., def test_list_tasks(self, sample_tasks):). Pytest injects the returned list into the test function. Note that clean_task_state still runs first because it's an autouse fixture, ensuring sample_tasks starts with a clean slate each time it's invoked for a test.
  • Removed Manual Setup/Teardown: setup_method, teardown_method, and explicit core.clear_tasks() calls within tests are removed, leading to cleaner test functions.
  • Class Optional: With fixtures handling setup, the need for the TestTaskManager class is diminished. We could easily convert these back to standalone test_* functions if we preferred, and the fixtures would still work. The class now primarily serves as a namespace/organizational tool.

Sharing Fixtures (conftest.py)

If you have fixtures that need to be used across multiple test files within a directory (or subdirectories), defining them repeatedly in each file is inefficient. Pytest addresses this with a special file named conftest.py.

  • Location: Place a conftest.py file in a test directory (e.g., the main tests/ directory).
  • Scope: Fixtures (and hooks) defined in a conftest.py file are automatically discovered and become available to all tests in the directory where conftest.py resides, and in all subdirectories.
  • No Imports Needed: Test files within the scope of a conftest.py can request fixtures defined in it without needing to explicitly import them. Pytest handles the discovery via the fixture name.

Example:

Let's move our clean_task_state and sample_tasks fixtures to a central conftest.py.

  1. Create tests/conftest.py:

    # tests/conftest.py
    import pytest
    from taskmgr import core # Import necessary application code
    
    @pytest.fixture(autouse=True, scope='function') # Explicitly set scope if needed
    def clean_task_state():
        """Fixture to clear task list and reset ID before each test function."""
        # print("\nSetting up via conftest fixture...")
        core.clear_tasks()
        yield
        # print("Tearing down via conftest fixture...")
    
    @pytest.fixture(scope='function') # Default scope is function
    def sample_tasks():
        """Fixture to provide a list of pre-added tasks. Runs after clean_task_state."""
        # Assumes clean_task_state ran due to autouse=True
        task1 = core.add_task("Buy milk")
        task2 = core.add_task("Read book")
        task3 = core.add_task("Learn fixtures")
        return [task1, task2, task3]
    
    # You can define other shared fixtures here, e.g., for temporary files,
    # database connections, etc.
    
  2. Remove Fixture Definitions from tests/test_core.py: Delete the @pytest.fixture definitions for clean_task_state and sample_tasks from the top of tests/test_core.py. The test functions/methods that use them remain unchanged.

  3. Run Tests: Run pytest from the project root. The tests in test_core.py should still find and use the fixtures defined in conftest.py automatically.

Using conftest.py is the standard way to share fixtures and other test configurations, promoting reuse and keeping individual test files focused on the tests themselves.
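
A small design note: instead of relying purely on autouse ordering, a fixture can request another fixture explicitly, which both documents the dependency and guarantees that it runs first. A possible variation of the conftest.py above (keeping autouse=True so tests that never request sample_tasks still get the cleanup):

# tests/conftest.py (variation: explicit fixture composition)
import pytest
from taskmgr import core

@pytest.fixture(autouse=True)
def clean_task_state():
    core.clear_tasks()
    yield

@pytest.fixture
def sample_tasks(clean_task_state):
    # Listing clean_task_state as a parameter makes the dependency explicit;
    # Pytest guarantees it has finished its setup before this fixture runs.
    return [
        core.add_task("Buy milk"),
        core.add_task("Read book"),
        core.add_task("Learn fixtures"),
    ]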

Workshop Using Fixtures for Setup and Teardown

Let's solidify our understanding by ensuring our taskmgr tests fully utilize fixtures defined in conftest.py.

Goal:
Move all fixture logic to tests/conftest.py and ensure tests/test_core.py relies solely on these shared fixtures for setup and data provisioning.

Steps:

  1. Create/Verify tests/conftest.py: Ensure you have the tests/conftest.py file created in the previous step, containing the clean_task_state (autouse) and sample_tasks fixtures.

    tests/conftest.py:

    # tests/conftest.py
    import pytest
    from taskmgr import core
    
    @pytest.fixture(autouse=True, scope='function')
    def clean_task_state():
        """Ensures task list is empty and ID counter is reset before each test."""
        core.clear_tasks()
        yield # The test runs here
    
    @pytest.fixture(scope='function')
    def sample_tasks():
        """Provides a list of three sample tasks, added after clearing state."""
        # Because clean_task_state is autouse, it runs before this fixture implicitly
        task1 = core.add_task("Buy milk")
        task2 = core.add_task("Read book")
        task3 = core.add_task("Learn fixtures")
        return [task1, task2, task3]
    

  2. Clean up tests/test_core.py: Make sure the definitions of clean_task_state and sample_tasks are removed from tests/test_core.py. The test functions should still request sample_tasks as an argument where needed, but they don't need to know where it's defined. Ensure the TestTaskManager class no longer has setup_method or teardown_method.

    tests/test_core.py should look like (structure only):

    # tests/test_core.py
    from taskmgr import core
    from taskmgr.core import TaskNotFoundError
    import pytest
    import sys
    
    # --- NO FIXTURE DEFINITIONS HERE ---
    
    # --- Standalone tests ---
    @pytest.mark.fast
    def test_add_positive_numbers(): ...
    # ... other add tests ...
    
    # --- Test Class (using fixtures implicitly or explicitly) ---
    @pytest.mark.tasks
    class TestTaskManager:
    
        # NO setup_method / teardown_method
    
        @pytest.mark.fast
        def test_add_task_returns_dict(self): ... # Uses clean_task_state implicitly
    
        # ... other tests, some using sample_tasks explicitly like:
        def test_list_tasks(self, sample_tasks): ...
    
        def test_get_task_by_id_success(self, sample_tasks): ...
    
        # ... etc ...
    

  3. Run Tests: From the project root (taskmgr-project), run:

    pytest -v # Use -v for verbose output, shows test names individually
    

  4. Verify Output: All tests should pass. Confirm that there are no errors related to finding fixtures. The verbose output will show each test name prefixed with tests/test_core.py::TestTaskManager:: or tests/test_core.py::, indicating they are being run correctly using the fixtures from conftest.py.

By successfully migrating fixtures to conftest.py, you've made your test setup reusable and centralized. Any new test file created under tests/ (e.g., tests/test_cli.py) will automatically benefit from the clean_task_state fixture and can request the sample_tasks fixture without any extra configuration. This demonstrates the power and elegance of Pytest fixtures for managing test context.

5. Parameterizing Tests

Often, you need to run the same test logic with different inputs and expected outputs. Writing a separate test function for each variation leads to code duplication and makes the test suite harder to maintain. Pytest's parameterization feature allows you to define multiple sets of arguments and expected results for a single test function.

The Need for Parameterization

Consider testing our simple core.add function. We wrote three separate tests: test_add_positive_numbers, test_add_negative_numbers, and test_add_mixed_numbers. The core logic (result = core.add(a, b); assert result == expected) is identical in all three, only the input values (a, b) and the expected result change.

This duplication is manageable for three cases, but what if we wanted to test edge cases like adding zero, adding large numbers, etc.? The number of test functions would proliferate, all containing nearly identical code.

Parameterization solves this by allowing us to define the test logic once and run it multiple times with different data sets.

Using @pytest.mark.parametrize

The primary mechanism for parameterization in Pytest is the @pytest.mark.parametrize marker.

Syntax:

import pytest

@pytest.mark.parametrize(argnames, argvalues)
def test_function(name1, name2, ...):  # one parameter per name listed in argnames
    # Test logic using the argument values injected by parametrize
    assert ...

  • @pytest.mark.parametrize(argnames, argvalues): The decorator applied to the test function.
  • argnames (string): A comma-separated string of argument names that will be passed to the test function. These names must match parameters in the test function's signature.
  • argvalues (list of tuples/lists or just values): A list where each element represents one execution run of the test function.
    • If argnames has multiple names (e.g., "input1, input2, expected"), then each element in argvalues should be a tuple (or list) containing the values for those arguments in that specific run (e.g., [(value1a, value2a, expected_a), (value1b, value2b, expected_b)]).
    • If argnames has only one name (e.g., "input_value"), then argvalues can be a list of single values (e.g., [value1, value2, value3]).

Example: Refactoring add tests:

Let's replace our three test_add_* functions with a single parameterized test.

# tests/test_core.py
from taskmgr import core
# ... other imports ...
import pytest

# --- Parameterized test for 'add' function ---
@pytest.mark.parametrize("a, b, expected_sum", [
    # Test Case 1: Positive numbers
    (5, 3, 8),
    # Test Case 2: Negative numbers
    (-2, -4, -6),
    # Test Case 3: Mixed numbers
    (10, -3, 7),
    # Test Case 4: Adding zero
    (0, 5, 5),
    (5, 0, 5),
    (0, 0, 0),
    # Test Case 5: Large numbers (example)
    (1_000_000, 2_000_000, 3_000_000),
])
@pytest.mark.fast # Keep the marker if applicable
def test_add_parametrized(a, b, expected_sum):
    """
    Tests the core.add function with various input combinations
    provided by parametrize.
    """
    print(f"\nTesting add({a}, {b}) == {expected_sum}") # Add print for visibility
    result = core.add(a, b)
    assert result == expected_sum, f"Failed: add({a}, {b}) produced {result}, expected {expected_sum}"

# --- Remove the old, separate test_add_* functions ---
# def test_add_positive_numbers(): ... (DELETE)
# def test_add_negative_numbers(): ... (DELETE)
# def test_add_mixed_numbers(): ... (DELETE)

# --- Keep the TestTaskManager class and its tests ---
# ... (TestTaskManager class definition remains) ...

Running Parameterized Tests:

When you run pytest, Pytest discovers test_add_parametrized, sees the parametrize marker with 7 sets of values (argvalues), and runs the test_add_parametrized function 7 separate times, once for each tuple in the list.

Output:

The output will show that multiple tests were run from this single function definition. If you run with -v (verbose), you'll see distinct entries for each parameter set:

(.venv) $ pytest -v -s -k test_add_parametrized  # -s disables output capture so the print() lines appear
============================= test session starts ==============================
...
collected Y items / X deselected / 7 selected

tests/test_core.py::test_add_parametrized[5-3-8]
Testing add(5, 3) == 8
PASSED
tests/test_core.py::test_add_parametrized[-2--4--6]
Testing add(-2, -4) == -6
PASSED
tests/test_core.py::test_add_parametrized[10--3-7]
Testing add(10, -3) == 7
PASSED
tests/test_core.py::test_add_parametrized[0-5-5]
Testing add(0, 5) == 5
PASSED
tests/test_core.py::test_add_parametrized[5-0-5]
Testing add(5, 0) == 5
PASSED
tests/test_core.py::test_add_parametrized[0-0-0]
Testing add(0, 0) == 0
PASSED
tests/test_core.py::test_add_parametrized[1000000-2000000-3000000]
Testing add(1000000, 2000000) == 3000000
PASSED

======================= 7 passed, X deselected in X.XXs ========================

Notice the identifiers in square brackets ([5-3-8], [-2--4--6], etc.). Pytest automatically generates these IDs based on the input parameters to help distinguish between the different runs of the parameterized test. You can customize these IDs using the ids argument in parametrize.

Parameterizing with Multiple Arguments

As seen in the test_add_parametrized example, argnames takes a comma-separated string defining multiple arguments ("a, b, expected_sum"), and argvalues provides tuples matching those arguments ((5, 3, 8)). This is the standard way to test functions with multiple inputs or to compare input against an expected output.
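
parametrize decorators can also be stacked on a single test, in which case Pytest runs the test once for every combination of the parameter sets (the cartesian product). A brief sketch using our core.add function:

import pytest
from taskmgr import core

@pytest.mark.parametrize("a", [0, 1, 2])
@pytest.mark.parametrize("b", [10, 20])
def test_add_is_commutative(a, b):
    # 3 values for 'a' x 2 values for 'b' = 6 generated test runs,
    # each with an auto-generated ID combining both values.
    assert core.add(a, b) == core.add(b, a)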

Generating Parameters Programmatically

Sometimes, the parameter sets are numerous or can be generated algorithmically. Instead of listing them all manually in the argvalues list, you can provide a variable or call a function that returns the list of parameters.

Example: Generating test cases for valid/invalid task descriptions.

# tests/test_core.py
# ... imports ...

# --- Test Data Generation ---
VALID_DESCRIPTIONS = [
    "Simple task",
    "Task with numbers 123",
    " Task with leading/trailing spaces ", # Should be stripped by add_task
    "A very long task description that should still be valid " * 5,
]

# (input, expected_exception) pairs used by the error-handling test below
INVALID_DESCRIPTIONS = [
    ("", ValueError),
    ("   ", ValueError),
    ("\t\n", ValueError),
    (None, TypeError),  # None should raise TypeError if not handled specifically
]

# --- Parameterized Tests in TestTaskManager ---
@pytest.mark.tasks
class TestTaskManager:
    # ... other tests ...

    # Parameterize successful task additions
    @pytest.mark.parametrize("description", VALID_DESCRIPTIONS)
    def test_add_task_valid_descriptions(self, description):
        """Tests add_task with various valid descriptions."""
        task = core.add_task(description)
        assert task['description'] == description.strip() # Check if stripped
        assert task['id'] is not None
        # Check if it's actually in the list
        assert task in core.list_tasks()

    # Parameterize expected errors for invalid descriptions
    # Provide custom IDs for clarity
    @pytest.mark.parametrize(
        "invalid_desc, error_type",
        INVALID_DESCRIPTIONS,
        ids=[ # Custom IDs for test runs
            "empty_string",
            "whitespace_only",
            "tabs_newlines",
            "none_value"
        ]
    )
    @pytest.mark.errors
    def test_add_task_invalid_descriptions(self, invalid_desc, error_type):
        """Tests that add_task raises appropriate errors for invalid descriptions."""
        # We expect ValueError for empty/whitespace based on current implementation
        # And perhaps TypeError for None if type hints are enforced or checked
        expected_message = "Task description cannot be empty." # For ValueErrors
        with pytest.raises(error_type) as excinfo:
            core.add_task(invalid_desc)

        # Optionally check message only if it's a ValueError
        if error_type is ValueError:
            assert expected_message in str(excinfo.value)

    # Note: The previous specific tests for empty/whitespace errors can now be removed
    # def test_add_task_empty_description_raises_error(self): ... (DELETE)
    # def test_add_task_whitespace_description_raises_error(self): ... (DELETE)
Self-Correction: The original add_task didn't handle None explicitly, relying on the if not description check, which might behave unexpectedly with None depending on context or future changes. The parameterized test now includes a case for None and expects a TypeError (or whatever error naturally occurs, which we can then adjust the test or the code for). Custom ids were also added to parametrize for better readability in test reports. Checking the exact error message can be brittle if the wording changes slightly, so a substring check with the in operator is used for the ValueError cases. Finally, the redundant, non-parameterized error tests were removed.

Key Points:

  • Test data (VALID_DESCRIPTIONS, INVALID_DESCRIPTIONS) is defined separately, making it easier to manage and extend.
  • The test_add_task_valid_descriptions function runs once for each entry in VALID_DESCRIPTIONS.
  • The test_add_task_invalid_descriptions function runs for each tuple in the argvalues list, testing different invalid inputs and the expected exception type (ValueError or TypeError).
  • The ids parameter provides meaningful names for each test case run, improving reporting ([empty_string], [whitespace_only], etc.).

Parameterization significantly reduces test code duplication, makes tests more data-driven, and encourages more thorough testing of various input conditions.
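
Individual parameter sets can also carry their own marks or IDs by wrapping them in pytest.param. A hedged sketch (the float case is deliberately expected to fail because of binary rounding, so it is marked xfail):

import pytest
from taskmgr import core

@pytest.mark.parametrize("a, b, expected_sum", [
    (1, 2, 3),
    # pytest.param attaches a custom ID and/or marks to a single case.
    pytest.param(10**6, 10**6, 2 * 10**6, id="large-ints"),
    pytest.param(0.1, 0.2, 0.3,
                 marks=pytest.mark.xfail(reason="binary float rounding"),
                 id="float-rounding"),
])
def test_add_with_param_objects(a, b, expected_sum):
    assert core.add(a, b) == expected_sum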

Workshop Parameterizing Tests for Different Task Inputs

Let's apply parameterization more broadly within our taskmgr tests. We'll parameterize tests for retrieving tasks and marking them done/undone, using our sample_tasks fixture data.

Goal:
Refactor tests for get_task_by_id, mark_task_done, and mark_task_undone to use parameterization, testing various valid and invalid IDs based on the sample_tasks fixture.

Steps:

  1. Identify Parameterization Candidates: The tests for get_task_by_id, mark_task_done, and mark_task_undone (both success and error cases) involve checking different IDs – some that exist (from sample_tasks) and some that don't.

  2. Refactor Success Cases: Instead of just testing one ID from sample_tasks, let's iterate through all IDs provided by the fixture. We can achieve this by parameterizing over the index of the task within the sample_tasks list.

    # tests/test_core.py (within TestTaskManager class)
    # ... imports ...
    
    class TestTaskManager:
        # ... other tests ...
    
        # Parameterize over the index of the task in sample_tasks
        @pytest.mark.parametrize("task_index", [0, 1, 2])
        def test_get_task_by_id_success_parametrized(self, sample_tasks, task_index):
            """Verify retrieving existing tasks by ID using parametrization."""
            target_task = sample_tasks[task_index]
            retrieved_task = core.get_task_by_id(target_task['id'])
            assert retrieved_task == target_task
            assert retrieved_task is not target_task # Still check it's a copy
    
        @pytest.mark.parametrize("task_index", [0, 1, 2])
        def test_mark_task_done_success_parametrized(self, sample_tasks, task_index):
            """Verify marking existing tasks as done using parametrization."""
            target_task = sample_tasks[task_index]
            assert not target_task['done'] # Assume fixture provides undone tasks
            updated_task = core.mark_task_done(target_task['id'])
            assert updated_task['done'] is True
            retrieved_task = core.get_task_by_id(target_task['id'])
            assert retrieved_task['done'] is True
    
        @pytest.mark.parametrize("task_index", [0, 1, 2])
        def test_mark_task_undone_success_parametrized(self, sample_tasks, task_index):
            """Verify marking tasks as not done using parametrization."""
            target_task = sample_tasks[task_index]
            # Mark done first to ensure we can mark undone
            core.mark_task_done(target_task['id'])
            assert core.get_task_by_id(target_task['id'])['done'] is True
            # Now mark undone
            updated_task = core.mark_task_undone(target_task['id'])
            assert updated_task['done'] is False
            retrieved_task = core.get_task_by_id(target_task['id'])
            assert retrieved_task['done'] is False
    
        # --- Remove old, non-parameterized success tests ---
        # def test_get_task_by_id_success(self, sample_tasks): ... (DELETE)
        # def test_mark_task_done_success(self, sample_tasks): ... (DELETE)
        # def test_mark_task_undone_success(self, sample_tasks): ... (DELETE)
        # Keep idempotent test as it's slightly different logic
        def test_mark_task_done_idempotent(self, sample_tasks):
            task_id = sample_tasks[0]['id']
            core.mark_task_done(task_id) # Mark done once
            updated_task = core.mark_task_done(task_id) # Mark done again
            assert updated_task['done'] is True
    
  3. Refactor Error Cases: Parameterize the tests for non-existent IDs.

    # tests/test_core.py (within TestTaskManager class)
    # ... imports ...
    
    class TestTaskManager:
        # ... other tests ...
        # ... parameterized success tests ...
    
        # Define invalid IDs - can include IDs likely higher than sample_tasks produces
        INVALID_IDS = [-1, 0, 999, 1000]
    
        @pytest.mark.parametrize("invalid_id", INVALID_IDS)
        @pytest.mark.errors
        def test_get_task_by_id_not_found_parametrized(self, sample_tasks, invalid_id):
            """Verify TaskNotFoundError for various non-existent IDs."""
            # sample_tasks fixture runs to ensure the list isn't empty, but we ignore its content
            with pytest.raises(TaskNotFoundError) as excinfo:
                core.get_task_by_id(invalid_id)
            assert excinfo.value.task_id == invalid_id
    
        @pytest.mark.parametrize("invalid_id", INVALID_IDS)
        @pytest.mark.errors
        def test_mark_task_done_not_found_parametrized(self, sample_tasks, invalid_id):
            """Verify TaskNotFoundError for marking done on non-existent IDs."""
            with pytest.raises(TaskNotFoundError, match=f"Task with ID {invalid_id} not found"):
                core.mark_task_done(invalid_id)
    
        @pytest.mark.parametrize("invalid_id", INVALID_IDS)
        @pytest.mark.errors
        def test_mark_task_undone_not_found_parametrized(self, sample_tasks, invalid_id):
            """Verify TaskNotFoundError for marking undone on non-existent IDs."""
            with pytest.raises(TaskNotFoundError) as excinfo:
                 core.mark_task_undone(invalid_id)
            assert excinfo.value.task_id == invalid_id
    
        # --- Remove old, non-parameterized error tests ---
        # def test_get_task_by_id_not_found(self, sample_tasks): ... (DELETE)
        # def test_mark_task_done_not_found(self, sample_tasks): ... (DELETE)
        # def test_mark_task_undone_not_found(self, sample_tasks): ... (DELETE)
    
    Self-Correction: Used a separate list INVALID_IDS for clarity. Ensured the sample_tasks fixture is still requested in the error tests, primarily to ensure the task list state is managed correctly (cleared before, potentially populated if needed), even though we don't directly use the tasks returned by the fixture in these error tests.

  4. Run Tests: Execute pytest -v from the project root.

  5. Analyze Output: Observe the verbose output. You should see multiple runs for each parameterized test function (e.g., test_get_task_by_id_success_parametrized[0], test_get_task_by_id_success_parametrized[1], etc., and test_get_task_by_id_not_found_parametrized[-1], test_get_task_by_id_not_found_parametrized[0], etc.). Verify that the total number of collected and passed tests has increased significantly due to parameterization. Ensure the old, non-parameterized versions of these tests are no longer being run.

This workshop demonstrates how parameterization, often combined with fixtures, can make your tests more comprehensive and maintainable by testing variations of inputs and expected outcomes without duplicating test logic.
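
Closely related to @pytest.mark.parametrize is fixture parametrization: a fixture declared with params=... causes every test that requests it to run once per parameter, with the current value available via the built-in request object. A short sketch built on our task manager (it relies on the clean_task_state cleanup from conftest.py):

import pytest
from taskmgr import core

@pytest.fixture(params=["Buy milk", "Read book", "Learn fixtures"])
def single_task(request):
    # The fixture is instantiated once per entry in params;
    # request.param holds the current value.
    return core.add_task(request.param)

def test_new_task_starts_undone(single_task):
    assert single_task["done"] is False

def test_new_task_is_listed(single_task):
    assert single_task in core.list_tasks()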

6. Mocking and Patching

Unit testing aims to test components in isolation. However, real-world code often depends on external systems, resources, or other complex objects (databases, APIs, file systems, hardware clocks, other classes). Directly using these real dependencies in unit tests can make them slow, unreliable (network issues), difficult to set up, and can introduce unwanted side effects (like writing to a real database). Mocking and patching techniques allow you to replace these dependencies with controllable "fake" objects during tests.

Introduction to Mocking

Mocking is the process of creating replacement objects (mocks, fakes, stubs, spies) that mimic the behavior of real dependencies in controlled ways. These mock objects can be configured to:

  • Return specific values when their methods are called.
  • Raise specific exceptions.
  • Record how they were interacted with (e.g., which methods were called, with what arguments).

This allows you to test your unit of code (the "System Under Test" or SUT) without relying on the actual external dependencies.

Why Mock? Isolating Units Under Test

The key reasons for using mocks in unit testing are:

  1. Isolation: Verify the logic of your SUT independently of its collaborators. If a test fails, you know the issue is likely within the SUT itself, not in the dependency it uses.
  2. Speed: Interacting with real dependencies (networks, databases, filesystems) is often slow. Mocks execute entirely in memory and are extremely fast, keeping your unit test suite quick.
  3. Determinism: Real dependencies can have unpredictable behavior (network latency, external API changes, file system permissions). Mocks provide consistent, predictable behavior every time the test runs.
  4. Testing Difficult Scenarios: Mocks allow you to easily simulate conditions that are hard to reproduce with real dependencies, such as:
    • Network errors or timeouts.
    • Specific error responses from an API.
    • Running out of disk space.
    • Specific times returned by datetime.now().
  5. Avoiding Side Effects: Prevent tests from having unwanted consequences, like modifying a production database, sending real emails, or deleting important files.

Using unittest.mock (Built-in)

Python's standard library includes the powerful unittest.mock module (often imported as from unittest import mock or just import unittest.mock). Its core component is the Mock class and the patch function/decorator.

  • mock.Mock / mock.MagicMock: Creates mock objects. MagicMock is a subclass that pre-configures implementations for most "magic" methods (like __str__, __len__, __iter__), making it often more convenient.
    • You can configure return values: my_mock.method.return_value = 10
    • You can configure side effects (like raising exceptions): my_mock.method.side_effect = ValueError("Boom!")
    • You can make assertions about calls: my_mock.method.assert_called_once_with(arg1, arg2)
  • mock.patch: A function/decorator used to temporarily replace objects within a specific scope (e.g., during a test function). It finds the object by its string path (e.g., 'mymodule.dependency.ClassName') and replaces it with a MagicMock by default (or with another object you supply). After the scope (the test function or with block) exits, patch automatically restores the original object.

Basic patch Usage (as decorator):

from unittest import mock
import some_module_to_test # Contains code using 'os.path.exists'

# Decorator replaces 'os.path.exists' with a MagicMock for the duration of the test
@mock.patch('os.path.exists')
def test_file_processing_if_exists(mock_exists): # The mock object is passed as an argument
    # Configure the mock's behavior
    mock_exists.return_value = True # Simulate file existing

    # Call the function under test
    result = some_module_to_test.process_file_if_exists("myfile.txt")

    # Assertions on the function's result
    assert result is True
    # Assertions on the mock interaction
    mock_exists.assert_called_once_with("myfile.txt")

# Test the opposite case
@mock.patch('os.path.exists')
def test_file_processing_if_not_exists(mock_exists):
    mock_exists.return_value = False # Simulate file NOT existing
    result = some_module_to_test.process_file_if_exists("myfile.txt")
    assert result is False
    mock_exists.assert_called_once_with("myfile.txt")

Basic patch Usage (as context manager):

from unittest import mock
import some_module_to_test

def test_file_processing_with_context_manager():
    # Patch within a 'with' block
    with mock.patch('os.path.exists') as mock_exists:
        mock_exists.return_value = True
        result = some_module_to_test.process_file_if_exists("myfile.txt")
        assert result is True
        mock_exists.assert_called_once_with("myfile.txt")

    # Outside the 'with' block, os.path.exists is restored

    # Another 'with' block for the other case
    with mock.patch('os.path.exists') as mock_exists_2:
        mock_exists_2.return_value = False
        result = some_module_to_test.process_file_if_exists("anotherfile.log")
        assert result is False
        mock_exists_2.assert_called_once_with("anotherfile.log")

Where to Patch: The crucial rule for patch is to patch the name where it is looked up, not necessarily where it was defined. If my_module.py does from os.path import exists and calls exists(...), you must patch 'my_module.exists'; patching 'os.path.exists' would leave my_module's local reference untouched. If my_module.py instead does import os and calls os.path.exists(...), then patching 'os.path.exists' works, because the lookup goes through the os.path module at call time.
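
A minimal sketch of the rule, assuming a hypothetical mymodule that imports the name directly:

# mymodule.py (hypothetical):
#     from os.path import exists
#     def check(path):
#         return exists(path)

from unittest import mock
import mymodule  # hypothetical module shown above

@mock.patch('mymodule.exists')  # patch the name in mymodule's namespace, where it is looked up
def test_check_uses_patched_exists(mock_exists):
    mock_exists.return_value = True
    assert mymodule.check("/tmp/whatever") is True
    mock_exists.assert_called_once_with("/tmp/whatever")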

The pytest-mock Plugin (Mocker Fixture)

While unittest.mock is powerful, using patch directly can sometimes feel verbose. The pytest-mock plugin provides a mocker fixture that wraps unittest.mock functionality in a more Pytest-idiomatic way.

  1. Installation:

    pip install pytest-mock
    

  2. Usage (mocker fixture): Request the mocker fixture in your test function. It provides methods like mocker.patch, mocker.Mock, mocker.spy, etc., which are thin wrappers around the corresponding unittest.mock features but handle the start/stop patching automatically within the test function's scope.

Example using mocker:

# Assuming pytest-mock is installed
# No need to import unittest.mock directly in the test file
import some_module_to_test # Contains code using 'os.path.exists'
import os # May still need os if using mocker.spy or other direct interactions

def test_file_processing_with_mocker(mocker): # Request the mocker fixture
    # Use mocker.patch (behaves like unittest.mock.patch)
    mock_exists = mocker.patch('os.path.exists') # Patch os.path.exists *where it's used*
    # Example: if some_module_to_test does 'from os.path import exists'
    # you might need to patch 'some_module_to_test.exists' instead.
    # Let's assume some_module_to_test uses 'os.path.exists' directly.

    # Configure and test the first case
    mock_exists.return_value = True
    result1 = some_module_to_test.process_file_if_exists("myfile.txt")
    assert result1 is True
    mock_exists.assert_called_once_with("myfile.txt")

    # Reset the mock for the next case within the same test (optional)
    # mocker usually handles teardown between tests, but within a test, reset if needed
    mock_exists.reset_mock()

    # Configure and test the second case
    mock_exists.return_value = False
    result2 = some_module_to_test.process_file_if_exists("anotherfile.log")
    assert result2 is False
    mock_exists.assert_called_once_with("anotherfile.log")


def test_api_call_simulation(mocker):
    # Assume we have a function that uses 'requests.get'
    # import my_api_client
    mock_get = mocker.patch('requests.get') # Patch where requests.get is looked up

    # Configure a mock response object
    mock_response = mocker.Mock()
    mock_response.status_code = 200
    mock_response.json.return_value = {"data": "success"} # Mock the .json() method call
    mock_get.return_value = mock_response # Make requests.get return our mock response

    # Call the code that uses requests.get
    # data = my_api_client.fetch_data("http://example.com/api")

    # Assertions
    # assert data == {"data": "success"}
    # mock_get.assert_called_once_with("http://example.com/api")
    # mock_response.json.assert_called_once()

The mocker fixture simplifies patching by automatically managing the start and stop calls, fitting nicely into the fixture-based workflow of Pytest.
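
The mocker fixture also provides mocker.spy, which wraps a real callable so the original still executes while its calls are recorded. A minimal sketch, continuing the test file above (which already imports os); the asserted separator assumes a POSIX system, matching this guide's Linux focus:

def test_spy_records_real_calls(mocker):
    spy_join = mocker.spy(os.path, "join")  # wraps the real os.path.join

    result = os.path.join("data", "tasks.json")

    assert result == "data/tasks.json"  # real behaviour is preserved
    spy_join.assert_called_once_with("data", "tasks.json")
    assert spy_join.spy_return == "data/tasks.json"  # recent pytest-mock versions expose the return value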

Patching Objects and Methods

You can patch various things:

  • Entire Modules: e.g. mocker.patch('mymodule.os') to replace the os module as seen by mymodule (the target must be a dotted path; use with caution).
  • Classes: mocker.patch('mymodule.MyClass'). When called, this returns a mock instance.
  • Functions/Methods: mocker.patch('mymodule.my_function'), mocker.patch('mymodule.MyClass.method').
  • Object Attributes: If you have an instance of an object, you can patch its attributes or methods directly using mocker.patch.object(instance, 'attribute_name').

Example: Replacing a collaborator object with a spec'd mock:

class ServiceClient:
    def connect(self):
        print("Connecting to real service...")
        # Complex connection logic
        return True

    def get_data(self):
        if not self.connect():
            raise ConnectionError("Failed to connect")
        print("Fetching real data...")
        return "real data"

def process_service_data(client: ServiceClient):
    try:
        data = client.get_data()
        return f"Processed: {data}"
    except ConnectionError:
        return "Processing failed due to connection error"

# Test by injecting a spec'd mock in place of a real ServiceClient
def test_process_data_success(mocker):
    mock_client = mocker.Mock(spec=ServiceClient) # Create mock with ServiceClient spec
    # Configure mock behavior directly
    mock_client.get_data.return_value = "mock data"

    result = process_service_data(mock_client)

    assert result == "Processed: mock data"
    mock_client.get_data.assert_called_once()
    mock_client.connect.assert_not_called() # get_data mock bypasses connect

def test_process_data_connection_error(mocker):
    mock_client = mocker.MagicMock(spec=ServiceClient)
    # Simulate get_data raising an error
    mock_client.get_data.side_effect = ConnectionError("Failed to connect")

    result = process_service_data(mock_client)

    assert result == "Processing failed due to connection error"
    mock_client.get_data.assert_called_once()
Self-Correction: Added spec=ServiceClient to mocker.Mock. Using spec (or spec_set) helps ensure the mock object only responds to methods that actually exist on the real ServiceClient class, catching potential typos or incorrect assumptions in the test setup. Used MagicMock in the error case; either Mock or MagicMock works here, and MagicMock is simply a convenient default when you are not sure whether magic methods will be touched.
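
To show mocker.patch.object itself, here is a minimal sketch that keeps a real ServiceClient instance but replaces only its connect method:

def test_get_data_with_patched_connect(mocker):
    client = ServiceClient()  # real instance
    # Replace just the connect method on this instance; get_data's real logic still runs
    mocker.patch.object(client, 'connect', return_value=False)

    result = process_service_data(client)

    # connect() returned False, so get_data raised ConnectionError internally
    assert result == "Processing failed due to connection error"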

Asserting Mock Calls

A key benefit of mocks is verifying how your SUT interacts with its dependencies. unittest.mock (and thus mocker) provides assertion methods:

  • mock_object.method.assert_called(): Asserts the method was called at least once.
  • mock_object.method.assert_called_once(): Asserts the method was called exactly once.
  • mock_object.method.assert_not_called(): Asserts the method was never called.
  • mock_object.method.assert_called_with(*args, **kwargs): Asserts the method was called (at least once) with the specified arguments.
  • mock_object.method.assert_called_once_with(*args, **kwargs): Asserts the method was called exactly once with the specified arguments.
  • mock_object.method.assert_any_call(*args, **kwargs): Asserts the method was called at least once with the specified arguments, ignoring other calls.
  • mock_object.method.call_count: An integer property showing how many times the method was called.
  • mock_object.method.call_args: A call object representing the arguments of the last call (call_args.args for positional, call_args.kwargs for keyword).
  • mock_object.method.call_args_list: A list of call objects representing all calls made to the method.

These assertions allow you to precisely verify that your code interacts with its dependencies as expected.
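
For example, call_count, call_args, and call_args_list let you inspect recorded calls directly; a short sketch (my_mock and its calls are purely illustrative):

from unittest import mock

def test_inspecting_recorded_calls():
    my_mock = mock.Mock()
    my_mock.send("alice", retries=3)
    my_mock.send("bob", retries=1)

    assert my_mock.send.call_count == 2

    # Arguments of the most recent call
    assert my_mock.send.call_args.args == ("bob",)
    assert my_mock.send.call_args.kwargs == {"retries": 1}

    # All calls, in order
    assert my_mock.send.call_args_list == [
        mock.call("alice", retries=3),
        mock.call("bob", retries=1),
    ]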

Workshop Mocking External Dependencies (e.g., File I/O)

Our taskmgr currently stores tasks in a global list (_tasks). Let's modify it to save and load tasks from a JSON file, introducing a dependency on the file system and the json module. Then, we'll use mocker to test the core logic without actually interacting with the file system.

Goal:

  1. Modify core.py to load tasks from and save tasks to a specified JSON file.
  2. Update conftest.py and tests to handle this change, likely introducing fixtures for file paths.
  3. Write tests for the save/load functionality, using mocker to patch open, json.dump, and json.load.

Steps:

  1. Modify core.py for File I/O:

    # src/taskmgr/core.py
    import itertools
    import json
    import os # Needed for file operations
    
    # Remove global _tasks and _next_id from module level
    # We will manage state via functions that operate on a file path
    
    class TaskNotFoundError(Exception):
        # ... (keep as is) ...
    
    # --- Core Task Functions (Now operate on data loaded/saved from file) ---
    
    def _load_tasks(filepath: str) -> list[dict]:
        """Loads tasks from a JSON file. Returns empty list if file not found."""
        try:
            with open(filepath, 'r') as f:
                # Handle empty file case
                content = f.read()
                if not content:
                    return []
                # Parse the JSON content
                data = json.loads(content)
                # Basic validation (optional but good)
                if not isinstance(data, list):
                    raise TypeError(f"Data in {filepath} is not a list")
                for task in data:
                    if not isinstance(task, dict) or 'id' not in task or 'description' not in task or 'done' not in task:
                         raise TypeError(f"Invalid task structure in {filepath}")
                return data
        except FileNotFoundError:
            return [] # Return empty list if file doesn't exist yet
        except json.JSONDecodeError:
             print(f"Warning: Could not decode JSON from {filepath}. Starting fresh.")
             return [] # Corrupt file, treat as empty
    
    def _save_tasks(filepath: str, tasks: list[dict]):
        """Saves the list of tasks to a JSON file."""
        # Ensure the parent directory exists (skip when the path has no directory part)
        dirpath = os.path.dirname(filepath)
        if dirpath:
            os.makedirs(dirpath, exist_ok=True)
        with open(filepath, 'w') as f:
            json.dump(tasks, f, indent=4) # Use indent for readability
    
    def _get_next_id(tasks: list[dict]) -> int:
        """Calculates the next available ID."""
        if not tasks:
            return 1
        return max(task['id'] for task in tasks) + 1
    
    # --- Public API Functions ---
    
    def add_task(filepath: str, description: str) -> dict:
        """Adds a task to the list stored in the specified file."""
        if not description or description.isspace():
            raise ValueError("Task description cannot be empty.")
    
        tasks = _load_tasks(filepath)
        new_id = _get_next_id(tasks)
        new_task = {
            'id': new_id,
            'description': description.strip(),
            'done': False
        }
        tasks.append(new_task)
        _save_tasks(filepath, tasks)
        return new_task
    
    def list_tasks(filepath: str) -> list[dict]:
        """Lists all tasks from the specified file."""
        # Load fresh every time to reflect current state
        return _load_tasks(filepath)
    
    def clear_tasks(filepath: str):
        """Clears all tasks by saving an empty list to the file."""
        _save_tasks(filepath, [])
    
    def get_task_by_id(filepath: str, task_id: int) -> dict:
        """Retrieves a single task by ID from the file."""
        tasks = _load_tasks(filepath)
        for task in tasks:
            if task['id'] == task_id:
                return task.copy()
        raise TaskNotFoundError(task_id)
    
    def mark_task_done(filepath: str, task_id: int) -> dict:
        """Marks a task as done in the file."""
        tasks = _load_tasks(filepath)
        task_found = False
        updated_task = None
        for task in tasks:
            if task['id'] == task_id:
                task['done'] = True
                task_found = True
                updated_task = task.copy()
                break # Found the task, no need to continue loop
        if not task_found:
            raise TaskNotFoundError(task_id)
        _save_tasks(filepath, tasks)
        return updated_task
    
    def mark_task_undone(filepath: str, task_id: int) -> dict:
        """Marks a task as not done in the file."""
        tasks = _load_tasks(filepath)
        task_found = False
        updated_task = None
        for task in tasks:
            if task['id'] == task_id:
                task['done'] = False
                task_found = True
                updated_task = task.copy()
                break
        if not task_found:
            raise TaskNotFoundError(task_id)
        _save_tasks(filepath, tasks)
        return updated_task
    
    # Remove the simple 'add' function if no longer needed
    # def add(a: int, b: int) -> int: ...
    
    Self-Correction: Significant changes here. Removed global state. All public functions now take a filepath argument. Added private helper functions _load_tasks, _save_tasks, _get_next_id. Added basic error handling for file not found, empty file, and JSON decoding errors in _load_tasks. Added validation for task structure within _load_tasks. Ensured _save_tasks uses indentation. Updated mark_task_done/undone to modify the list then save.

  2. Update conftest.py for File Handling: We need a fixture to provide a temporary file path for tests to use, ensuring tests don't interfere with each other or leave garbage files. Pytest has a built-in tmp_path fixture for this.

    # tests/conftest.py
    import pytest
    from taskmgr import core
    import os # Needed for os.path.join
    
    # Remove the old clean_task_state fixture - state is now managed by files
    # @pytest.fixture(autouse=True, scope='function')
    # def clean_task_state(): ... (DELETE)
    
    @pytest.fixture
    def task_file(tmp_path):
        """
        Fixture providing a path to a temporary task file for a test.
        Ensures the file starts empty for each test function using it.
        Relies on the built-in `tmp_path` fixture which provides a temporary directory unique to each test function.
        """
        # Create a unique file path within the temporary directory
        filepath = tmp_path / "tasks.json" # tmp_path is a pathlib.Path object
        # Ensure the file is empty/non-existent at the start of the test
        # Although tmp_path is unique per function, explicit cleanup is safer
        if filepath.exists():
            filepath.unlink() # Delete if exists from a previous failed run? Unlikely but safe.
        # Provide the path to the test
        yield str(filepath) # Yield the path as a string
        # Teardown (optional): Clean up the file after the test if needed.
        # tmp_path fixture handles directory cleanup automatically.
        # if filepath.exists():
        #     print(f"\nCleaning up {filepath}")
        #     filepath.unlink()
    
    # Remove sample_tasks fixture as it relied on global state.
    # We will create tasks within tests using the new file-based functions.
    # @pytest.fixture(scope='function')
    # def sample_tasks(): ... (DELETE)
    
    Self-Correction: Removed fixtures tied to the old global state (clean_task_state, sample_tasks). Introduced task_file fixture using the built-in tmp_path fixture. tmp_path provides a unique temporary directory per test function, ideal for file I/O isolation. The fixture yields the path to the test file.

  3. Update Tests in tests/test_core.py: All test functions now need to accept the task_file fixture and pass the path to the core functions. We'll also need to rewrite tests that relied on sample_tasks.

    # tests/test_core.py
    from taskmgr import core
    from taskmgr.core import TaskNotFoundError
    import pytest
    import json # Needed for checking file content directly in some tests
    import os # Needed sometimes
    
    # Remove the old add tests if 'add' function was removed from core
    # @pytest.mark.parametrize(...)
    # def test_add_parametrized(a, b, expected_sum): ... (DELETE if core.add removed)
    
    @pytest.mark.tasks
    class TestTaskManager:
    
        # --- Tests for add_task ---
        def test_add_task_creates_file(self, task_file):
            """Verify add_task creates the file if it doesn't exist."""
            assert not os.path.exists(task_file)
            core.add_task(task_file, "First task")
            assert os.path.exists(task_file)
    
        def test_add_task_returns_dict(self, task_file):
            task = core.add_task(task_file, "Test Task 1")
            assert isinstance(task, dict)
            assert task['description'] == "Test Task 1"
            assert task['id'] == 1 # First task ID
            assert not task['done']
    
        def test_add_task_saves_to_file(self, task_file):
            task1 = core.add_task(task_file, "Task One")
            task2 = core.add_task(task_file, "Task Two")
    
            # Read file content directly to verify
            with open(task_file, 'r') as f:
                data = json.load(f)
    
            assert isinstance(data, list)
            assert len(data) == 2
            assert data[0]['id'] == task1['id']
            assert data[0]['description'] == task1['description']
            assert data[1]['id'] == task2['id']
            assert data[1]['description'] == task2['description']
            assert data[1]['id'] == 2 # Check ID increment
    
        @pytest.mark.parametrize("description", ["Valid Task", " Another "])
        def test_add_task_valid_descriptions(self, task_file, description):
             task = core.add_task(task_file, description)
             assert task['description'] == description.strip()
             # Verify save
             tasks_in_file = core.list_tasks(task_file)
             assert tasks_in_file[-1]['description'] == description.strip()
    
    
        @pytest.mark.parametrize("invalid_desc, error_type", [("", ValueError), (" ", ValueError)], ids=["empty", "whitespace"])
        @pytest.mark.errors
        def test_add_task_invalid_descriptions(self, task_file, invalid_desc, error_type):
             with pytest.raises(error_type, match="cannot be empty"):
                 core.add_task(task_file, invalid_desc)
             # Ensure file wasn't created or modified incorrectly
             assert not os.path.exists(task_file) or os.path.getsize(task_file) == 0
    
    
        # --- Tests for list_tasks ---
        def test_list_tasks_empty(self, task_file):
            """Test listing tasks when the file is empty or non-existent."""
            assert core.list_tasks(task_file) == []
    
        def test_list_tasks_multiple(self, task_file):
            """Test listing multiple tasks."""
            task1 = core.add_task(task_file, "Task A")
            task2 = core.add_task(task_file, "Task B")
            listed_tasks = core.list_tasks(task_file)
            assert len(listed_tasks) == 2
            # Order might not be guaranteed unless explicitly handled in add/load
            # Check based on IDs for robustness
            assert task1 in listed_tasks
            assert task2 in listed_tasks
    
        # --- Tests for clear_tasks ---
        def test_clear_tasks(self, task_file):
            core.add_task(task_file, "Task to clear")
            assert len(core.list_tasks(task_file)) == 1
            core.clear_tasks(task_file)
            assert len(core.list_tasks(task_file)) == 0
            # Check file content is empty list
            with open(task_file, 'r') as f:
                 data = json.load(f)
            assert data == []
    
        # --- Tests for get_task_by_id ---
        def test_get_task_by_id_success(self, task_file):
            task1 = core.add_task(task_file, "Target Task")
            core.add_task(task_file, "Other Task")
            retrieved = core.get_task_by_id(task_file, task1['id'])
            assert retrieved == task1
    
        @pytest.mark.parametrize("invalid_id", [-1, 0, 999])
        @pytest.mark.errors
        def test_get_task_by_id_not_found(self, task_file, invalid_id):
            core.add_task(task_file, "Some task") # Ensure file exists and has content
            with pytest.raises(TaskNotFoundError):
                core.get_task_by_id(task_file, invalid_id)
    
        # --- Tests for mark_task_done / mark_task_undone ---
        def test_mark_task_done_success(self, task_file):
            task = core.add_task(task_file, "To Do")
            updated = core.mark_task_done(task_file, task['id'])
            assert updated['done'] is True
            retrieved = core.get_task_by_id(task_file, task['id'])
            assert retrieved['done'] is True
    
        def test_mark_task_undone_success(self, task_file):
            task = core.add_task(task_file, "To Undo")
            core.mark_task_done(task_file, task['id']) # Mark done first
            updated = core.mark_task_undone(task_file, task['id'])
            assert updated['done'] is False
            retrieved = core.get_task_by_id(task_file, task['id'])
            assert retrieved['done'] is False
    
        @pytest.mark.parametrize("invalid_id", [-1, 0, 500])
        @pytest.mark.errors
        def test_mark_task_done_not_found(self, task_file, invalid_id):
             core.add_task(task_file, "A task")
             with pytest.raises(TaskNotFoundError):
                 core.mark_task_done(task_file, invalid_id)
    
        @pytest.mark.parametrize("invalid_id", [-1, 0, 500])
        @pytest.mark.errors
        def test_mark_task_undone_not_found(self, task_file, invalid_id):
             core.add_task(task_file, "Another task")
             core.mark_task_done(task_file, 1) # Ensure one task is done
             with pytest.raises(TaskNotFoundError):
                 core.mark_task_undone(task_file, invalid_id)
    
        # --- Tests for File Edge Cases (using Mocker) ---
    
        @pytest.mark.errors
        def test_load_tasks_file_not_found(self, task_file):
            """Verify _load_tasks handles a missing file gracefully."""
            # The file provided by task_file does not exist yet; _load_tasks
            # should swallow the FileNotFoundError and return an empty list.
            assert core.list_tasks(task_file) == []
    
        @pytest.mark.errors
        def test_load_tasks_json_decode_error(self, task_file, mocker):
            """Verify _load_tasks handles JSONDecodeError."""
            # Create a corrupted file
            with open(task_file, 'w') as f:
                f.write("{invalid json")
    
            # Mock print to suppress warning output during test run (optional)
            mock_print = mocker.patch('builtins.print')
    
            # Call list_tasks which uses _load_tasks
            tasks = core.list_tasks(task_file)
            assert tasks == [] # Should return empty list
            mock_print.assert_called_once() # Check if the warning was printed
            assert "Could not decode JSON" in mock_print.call_args[0][0]
    
        @pytest.mark.errors
        def test_load_tasks_not_a_list(self, task_file):
            """Verify _load_tasks raises TypeError if JSON is not a list."""
            with open(task_file, 'w') as f:
                json.dump({"not": "a list"}, f)
    
            with pytest.raises(TypeError, match="Data in .* is not a list"):
                 core.list_tasks(task_file) # list_tasks calls _load_tasks
    
    
        def test_save_tasks_os_error(self, task_file, mocker):
            """Verify behavior if _save_tasks encounters an OS error."""
            # Stub out the load step so the mocked open only affects the write
            mocker.patch('taskmgr.core._load_tasks', return_value=[])
            # Mock open to raise an error during write
            mock_open = mocker.patch('builtins.open', side_effect=OSError("Disk full"))

            # Expect add_task to fail because it calls _save_tasks
            with pytest.raises(OSError, match="Disk full"):
                core.add_task(task_file, "Task that won't save")

            # Ensure open was attempted exactly once, with the correct path and mode
            mock_open.assert_called_once_with(task_file, 'w')
    
        # --- Mocking Example: Test add_task logic without file IO ---
    
        def test_add_task_logic_mocked(self, task_file, mocker):
            """Test add_task logic without actual saving/loading."""
            # Mock load and save functions
            mock_load = mocker.patch('taskmgr.core._load_tasks')
            mock_save = mocker.patch('taskmgr.core._save_tasks')
            mock_next_id = mocker.patch('taskmgr.core._get_next_id')
    
            # Configure mocks
            initial_tasks = [{'id': 1, 'description': 'Existing Task', 'done': False}]
            mock_load.return_value = initial_tasks # Simulate loading existing tasks
            mock_next_id.return_value = 2 # Simulate next ID calculation
    
            # Call the function under test
            new_task = core.add_task(task_file, "New Task ") # Note trailing space
    
            # Assert on the returned value (logic inside add_task)
            assert new_task == {'id': 2, 'description': 'New Task', 'done': False}
    
            # Assert that load and next_id were called correctly
            mock_load.assert_called_once_with(task_file)
            mock_next_id.assert_called_once_with(initial_tasks) # Check it used loaded tasks
    
            # Assert that save was called with the *updated* task list
            expected_save_list = [
                {'id': 1, 'description': 'Existing Task', 'done': False},
                {'id': 2, 'description': 'New Task', 'done': False}
            ]
            mock_save.assert_called_once_with(task_file, expected_save_list)
    
    Self-Correction: Major rewrite of tests. Removed tests for the old add function. Replaced sample_tasks usage with direct calls to core.add_task using the task_file fixture. Added tests verifying file creation and content directly where needed (integration-style tests). Added specific tests for _load_tasks edge cases like JSONDecodeError and invalid data types, using mocker to suppress print output or simulate errors. Added a key example (test_add_task_logic_mocked) demonstrating how to use mocker to patch the internal _load_tasks and _save_tasks functions to test the logic of add_task (calculating ID, creating task dict, appending to list) in complete isolation from the file system. Adjusted parameterization for invalid descriptions.

  4. Run Tests: Execute pytest -v from the project root.

  5. Analyze Output: Expect a mix of tests. Some tests (like test_add_task_saves_to_file, test_list_tasks_multiple) are closer to integration tests as they perform real file I/O within the temporary directory provided by tmp_path. Other tests (like test_add_task_logic_mocked, test_save_tasks_os_error) use mocker extensively to isolate the logic being tested from the actual file system or to simulate file system errors. All tests should pass, demonstrating both the integrated behavior and the isolated logic verification.

This workshop illustrates the crucial role of mocking. While some tests benefit from interacting with real (temporary) files, mocking allows you to test specific logic units in isolation and to simulate error conditions that are difficult or impossible to create reliably with the real file system.

7. Advanced Pytest Features

Beyond fixtures, parameterization, and mocking, Pytest offers more advanced capabilities for complex testing scenarios, customization, and extending its functionality.

Custom Fixtures and Factories

Fixtures don't just have to return simple data or objects. They can return functions or classes, acting as "factories" to generate customized test data or objects on demand within tests.

Why Use Factory Fixtures?

  • Customizable Data: Tests often need slightly different variations of setup data. A factory fixture can take arguments to produce tailored data.
  • State Management: If creating the resource involves complex steps or state that shouldn't leak between tests using the factory, encapsulating the creation logic in a factory function helps.
  • Readability: Can make tests more readable by abstracting away the details of object creation.

Example: Task Factory Fixture

Instead of a static sample_tasks fixture, let's create a fixture that returns a function capable of adding tasks for the test.

# tests/conftest.py
import pytest
from taskmgr import core
import os

@pytest.fixture
def task_file(tmp_path):
    # ... (keep existing fixture) ...
    filepath = tmp_path / "tasks.json"
    if filepath.exists():
        filepath.unlink()
    yield str(filepath)

@pytest.fixture
def task_factory(task_file):
    """Provides a factory function to add tasks within a test."""
    def _add_task_to_file(description: str, done: bool = False) -> dict:
        """Helper function to add a task and optionally mark it done."""
        # Uses the task_file fixture available in this scope
        task = core.add_task(task_file, description)
        if done:
            task = core.mark_task_done(task_file, task['id'])
        return task

    # The fixture returns the inner function itself
    return _add_task_to_file

# Remove the old clean_task_state and sample_tasks if they still exist

Using the Factory in a Test:

# tests/test_core.py
# ... imports ...

@pytest.mark.tasks
class TestTaskManager:
    # ... other tests ...

    def test_list_tasks_with_factory(self, task_factory, task_file):
        """Test listing tasks created using the factory fixture."""
        task1 = task_factory("Factory Task 1") # Call the factory
        task2 = task_factory("Factory Task 2", done=True) # Call with options
        task3 = task_factory("Factory Task 3")

        tasks = core.list_tasks(task_file)

        assert len(tasks) == 3
        # Check details (adjust based on actual IDs assigned)
        assert {'id': 1, 'description': 'Factory Task 1', 'done': False} in tasks
        assert {'id': 2, 'description': 'Factory Task 2', 'done': True} in tasks
        assert {'id': 3, 'description': 'Factory Task 3', 'done': False} in tasks

    def test_get_done_task_with_factory(self, task_factory, task_file):
        """Use factory to create specific state for a test."""
        task_factory("Undone Task")
        done_task = task_factory("Done Task", done=True)

        retrieved = core.get_task_by_id(task_file, done_task['id'])
        assert retrieved['description'] == "Done Task"
        assert retrieved['done'] is True

This factory pattern provides more flexibility within tests compared to fixtures that return static data.

Using Hooks for Customization

Pytest has a comprehensive system of "hook" functions that allow you to plug into various stages of the testing process. Plugins use hooks extensively, but you can also define hook implementations directly in your conftest.py to customize Pytest's behavior for your project.

Common Hook Examples:

  • pytest_addoption(parser): Add custom command-line options to pytest.
  • pytest_configure(config): Perform initial configuration after command-line options are parsed. Access custom options via config.getoption(...).
  • pytest_collection_modifyitems(session, config, items): Modify the list of collected test items after collection but before execution. You can reorder tests, deselect tests based on custom logic, or add markers dynamically.
  • pytest_runtest_setup(item): Called before setting up fixtures and running a specific test item.
  • pytest_runtest_teardown(item, nextitem): Called after a test item has finished execution.
  • pytest_terminal_summary(terminalreporter, exitstatus, config): Add custom information to the summary report printed at the end of the test session.

Example: Adding a Custom Command-Line Option

Let's add an option --slow to explicitly run tests marked as slow (we'll need to add such a marker first).

  1. Define Marker and Add Option in conftest.py:

    # tests/conftest.py
    import pytest
    # ... other imports and fixtures ...
    
    # Hook to add command-line option
    def pytest_addoption(parser):
        parser.addoption(
            "--slow", action="store_true", default=False, help="Run slow tests"
        )
    
    # Hook to modify collected items based on the option
    def pytest_collection_modifyitems(config, items):
        if config.getoption("--slow"):
            # --slow passed, do not skip slow tests
            print("\nRunning SLOW tests...")
            return
        # --slow not passed, skip tests marked slow
        skip_slow = pytest.mark.skip(reason="need --slow option to run")
        print("\nSkipping slow tests (use --slow to run)...")
        for item in items:
            if "slow" in item.keywords: # 'item.keywords' contains marker names
                item.add_marker(skip_slow)
    
    # Remember to register the 'slow' marker in pyproject.toml or pytest.ini, e.g.:
    # [tool.pytest.ini_options]
    # markers = ["slow: marks tests as slow running"]
    
  2. Mark a Test as slow:

    # tests/test_core.py
    import time # For simulating slowness
    
    # ... other tests ...
    
    @pytest.mark.slow
    def test_something_slow(task_file):
        print("\nRunning a 'slow' test simulation...")
        time.sleep(1) # Simulate work
        core.add_task(task_file, "Slow task result")
        assert "Slow task result" in core.list_tasks(task_file)[0]['description']
    
  3. Register Marker: Register the slow marker so Pytest does not warn about an unknown marker. In pyproject.toml, add "slow: marks tests as slow running" to the markers list under [tool.pytest.ini_options]; in pytest.ini, add the same line under the markers option.

  4. Run Tests:

    • pytest: The test_something_slow test should be skipped with the reason "need --slow option to run".
    • pytest --slow: The test_something_slow test should be collected and executed.

Hooks provide deep control over Pytest's execution flow and reporting.
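
As one more illustration, here is a minimal sketch of a pytest_terminal_summary hook (placed in conftest.py) that appends a custom line to the end-of-session report:

# tests/conftest.py (sketch)
def pytest_terminal_summary(terminalreporter, exitstatus, config):
    # The terminal reporter groups test reports by outcome ("passed", "failed", ...)
    passed = len(terminalreporter.stats.get("passed", []))
    terminalreporter.write_sep("-", "taskmgr test summary")
    terminalreporter.write_line(f"Passed tests: {passed} (exit status: {exitstatus})")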

Writing Pytest Plugins (Overview)

If your custom hooks and fixtures become complex or reusable across multiple projects, you can package them into an installable Pytest plugin.

Basic Plugin Structure:

  • A Python package (often installable via pip).
  • Contains a module (often named pytest_*.py or placed appropriately) where hook functions and fixtures are defined.
  • Uses specific naming conventions or entry points (setup.py or pyproject.toml) to make Pytest recognize it as a plugin upon installation.

Developing full plugins is beyond the scope of this introduction, but it involves defining your hooks and fixtures as shown above and packaging them using standard Python packaging tools, making sure Pytest can discover them via entry points. The official Pytest documentation has detailed guides on writing plugins.
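
For orientation only, a sketch of what a tiny plugin module might contain (the package, option, and fixture names are hypothetical; Pytest discovers the module through the pytest11 entry-point group declared in your packaging metadata):

# myplugin/plugin.py (hypothetical plugin module)
import pytest

def pytest_addoption(parser):
    # Custom option contributed by the plugin to every project that installs it
    parser.addoption("--taskmgr-verbose", action="store_true", default=False,
                     help="Enable extra task manager logging in tests")

@pytest.fixture
def taskmgr_verbose(request):
    # Fixture exposed by the plugin
    return request.config.getoption("--taskmgr-verbose")

# Registered in pyproject.toml under the 'pytest11' entry-point group, e.g.:
#   [project.entry-points.pytest11]
#   myplugin = "myplugin.plugin"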

Property-Based Testing with Hypothesis (Integration Overview)

While example-based testing (like using @parametrize) checks specific known inputs, property-based testing verifies general properties that should hold true for a wide range of automatically generated inputs. The Hypothesis library integrates seamlessly with Pytest and is the de facto standard for property-based testing in Python.

Concept: Instead of defining specific inputs, you define strategies for generating data (e.g., "generate any integer," "generate lists of strings matching a regex"). Hypothesis then generates numerous diverse inputs based on these strategies and feeds them to your test function. Your test asserts a general property that should be true for all valid inputs generated by the strategy. Hypothesis is particularly clever at finding edge cases (zero, empty strings, boundary values, Unicode quirks, etc.).

Example: Testing a simple add function with Hypothesis (this assumes the original core.add(a, b) helper is still present; if you removed it earlier, apply the same pattern to any pure function).

  1. Installation:

    pip install hypothesis
    

  2. Test Code:

    # tests/test_core_properties.py (New file for property tests)
    import pytest
    from hypothesis import HealthCheck, given, settings, strategies as st # Hypothesis imports
    from taskmgr import core
    
    # Basic property: Adding zero shouldn't change the number
    @given(st.integers()) # Generate arbitrary integers for 'x'
    def test_add_zero_identity(x):
        """Property: Adding 0 to any integer x should result in x."""
        assert core.add(x, 0) == x
        assert core.add(0, x) == x
    
    # Commutative property: a + b == b + a
    @given(st.integers(), st.integers()) # Generate two arbitrary integers
    def test_add_commutative(a, b):
        """Property: Addition should be commutative (a+b == b+a)."""
        assert core.add(a, b) == core.add(b, a)
    
    # Associative property: (a + b) + c == a + (b + c)
    # Note: In languages with fixed-width integers, very large values could overflow,
    # but Python integers have arbitrary precision, so that is not a concern here.
    @given(st.integers(), st.integers(), st.integers())
    def test_add_associative(a, b, c):
        """Property: Addition should be associative."""
        assert core.add(core.add(a, b), c) == core.add(a, core.add(b, c))
    
    # Property about task descriptions (requires core functions to be robust)
    # Generate valid-looking strings (non-empty, non-whitespace)
    valid_desc_strategy = st.text(min_size=1).map(lambda s: s.strip()).filter(lambda s: len(s) > 0)
    
    # Recent Hypothesis versions flag function-scoped fixtures used with @given, because the
    # fixture (and thus the task file) is shared across all generated examples. That is
    # acceptable here (task IDs simply keep increasing), so suppress the health check.
    @settings(suppress_health_check=[HealthCheck.function_scoped_fixture])
    @given(description=valid_desc_strategy)
    def test_add_task_description_is_stored(task_file, description):
        """Property: Valid generated descriptions should be stored correctly (ignoring stripping)."""
        # Ensure task_file fixture is available
        task = core.add_task(task_file, description)
        # Retrieve and check
        retrieved_task = core.get_task_by_id(task_file, task['id'])
        # core.add_task strips whitespace, so compare stripped versions
        assert retrieved_task['description'] == description.strip()
        assert retrieved_task['id'] == task['id']
        assert retrieved_task['done'] is False # Default state
    
    # Note: Property-based tests often require code to be quite robust
    # to handle the wide variety of inputs Hypothesis generates.
    

Hypothesis will run these tests many times, trying diverse inputs. If it finds an input where the assertion fails (a counter-example), it will simplify that input to the smallest possible failing case and report it, making debugging much easier. Property-based testing complements example-based testing by exploring the input space much more broadly.

Workshop Creating a Custom Fixture Factory

Let's create and use a more sophisticated fixture factory for our task manager. This factory will allow creating multiple tasks at once, potentially with specific states.

Goal: Implement a task_adder fixture factory in conftest.py that allows adding multiple tasks with specified descriptions and done statuses, and use it in a test.

Steps:

  1. Define Factory in conftest.py:

    # tests/conftest.py
    import pytest
    from taskmgr import core
    import os
    from typing import List, Tuple, Optional
    
    @pytest.fixture
    def task_file(tmp_path):
        """Provides path to a temporary task file."""
        filepath = tmp_path / "tasks.json"
        if filepath.exists(): filepath.unlink()
        yield str(filepath)
    
    @pytest.fixture
    def task_adder(task_file):
        """
        Provides a factory function to add multiple tasks.
        Accepts a list of tuples: (description, optional_done_status).
        """
        created_tasks = [] # Keep track of tasks created via this factory instance
    
        def _add_tasks(task_specs: List[Tuple[str, Optional[bool]]]) -> List[dict]:
            """Adds tasks based on specifications."""
            batch_created = []
            for spec in task_specs:
                description = spec[0]
                done_status = spec[1] if len(spec) > 1 else False # Default to False
    
                task = core.add_task(task_file, description)
                if done_status:
                    try:
                        task = core.mark_task_done(task_file, task['id'])
                    except core.TaskNotFoundError:
                        # Should not happen if add_task succeeded, but handle defensively
                        pytest.fail(f"Failed to mark task {task['id']} as done immediately after adding.")
    
                batch_created.append(task)
                created_tasks.append(task) # Add to overall list for this factory use
    
            return batch_created # Return tasks created in this specific call
    
        # Yield the factory function
        yield _add_tasks
    
        # Optional: Could add cleanup or verification based on 'created_tasks' here if needed
        # print(f"\nTask adder fixture teardown. Created tasks: {len(created_tasks)}")
    
    # Remove the simpler task_factory if it exists
    # @pytest.fixture
    # def task_factory(task_file): ... (DELETE)
    
    Self-Correction: Renamed to task_adder for clarity. Changed the input to accept a list of tuples (description, optional_done_status) for adding multiple tasks in one call. Added tracking of created tasks within the fixture instance (though not strictly necessary for this example, it shows potential for more complex state management). Added error handling within the factory.

  2. Use Factory in tests/test_core.py: Create a new test that uses this factory to set up a specific scenario.

    # tests/test_core.py
    # ... imports ...
    
    @pytest.mark.tasks
    class TestTaskManager:
        # ... other tests ...
    
        def test_list_mixed_status_tasks(self, task_adder, task_file):
            """Test listing tasks with mixed done statuses created by task_adder."""
            # Use the task_adder factory to create tasks
            tasks_added = task_adder([
                ("Task One", False),
                ("Task Two", True),
                ("Task Three", False),
                ("Task Four", True),
            ])
    
            # Verify the factory returned the correct tasks
            assert len(tasks_added) == 4
            assert tasks_added[1]['description'] == "Task Two"
            assert tasks_added[1]['done'] is True
    
            # List tasks from the file and verify
            all_tasks = core.list_tasks(task_file)
            assert len(all_tasks) == 4
    
            # Check counts or specific tasks
            done_count = sum(1 for t in all_tasks if t['done'])
            undone_count = sum(1 for t in all_tasks if not t['done'])
            assert done_count == 2
            assert undone_count == 2
    
            # Verify a specific task added by the factory exists correctly
            task_four_id = tasks_added[3]['id']
            retrieved_task_four = core.get_task_by_id(task_file, task_four_id)
            assert retrieved_task_four['description'] == "Task Four"
            assert retrieved_task_four['done'] is True
    
  3. Run Tests: Execute pytest -v.

  4. Analyze Output: The test_list_mixed_status_tasks should pass, demonstrating that the task_adder fixture successfully created the specified tasks in the temporary file, allowing the test to verify the list_tasks function under that specific state.

This workshop shows how fixture factories can encapsulate more complex setup logic, making tests cleaner and more focused on verifying behavior under specific, easily generated conditions. These advanced features provide powerful tools for tackling diverse and complex testing challenges.

8. Testing Different Application Types

Pytest is versatile and can be adapted to test various kinds of Python applications, not just simple libraries or core logic modules. Here, we'll look at strategies for testing Command-Line Interfaces (CLIs), and briefly touch upon testing web APIs and data science code.

Testing Command-Line Interfaces (CLIs)

Our taskmgr is intended to be a CLI tool. Testing CLIs involves simulating command-line calls and verifying:

  1. Exit Codes: Did the command exit successfully (usually code 0) or with an expected error code?
  2. Output (stdout/stderr): Did the command print the expected output to standard output or standard error?
  3. Side Effects: Did the command have the expected effect (e.g., modifying a file, making a network call - often mocked)?

Let's implement a basic CLI for taskmgr using argparse.

src/taskmgr/cli.py:

# src/taskmgr/cli.py
import argparse
import sys
from taskmgr import core
import os

# Define default task file location (e.g., in user's home directory)
# Use a hidden file convention
DEFAULT_TASK_FILE = os.path.expanduser("~/.tasks.json")

def main(args=None):
    """Main CLI entry point."""
    parser = argparse.ArgumentParser(description="A simple command-line task manager.")
    parser.add_argument(
        '--file', '-f', type=str, default=DEFAULT_TASK_FILE,
        help=f"Path to the task file (default: {DEFAULT_TASK_FILE})"
    )

    subparsers = parser.add_subparsers(dest='command', help='Available commands', required=True)

    # --- Add command ---
    parser_add = subparsers.add_parser('add', help='Add a new task')
    parser_add.add_argument('description', type=str, nargs='+', help='The description of the task')

    # --- List command ---
    parser_list = subparsers.add_parser('list', help='List all tasks')
    parser_list.add_argument('--all', '-a', action='store_true', help='Include completed tasks')

    # --- Done command ---
    parser_done = subparsers.add_parser('done', help='Mark a task as done')
    parser_done.add_argument('task_id', type=int, help='The ID of the task to mark as done')

    # --- Undone command ---
    parser_undone = subparsers.add_parser('undone', help='Mark a task as not done')
    parser_undone.add_argument('task_id', type=int, help='The ID of the task to mark as not done')

    # --- Clear command ---
    parser_clear = subparsers.add_parser('clear', help='Remove all tasks')
    parser_clear.add_argument('--yes', '-y', action='store_true', help='Confirm clearing all tasks')


    # If args is None, use sys.argv[1:]
    parsed_args = parser.parse_args(args if args is not None else sys.argv[1:])

    task_file = parsed_args.file

    try:
        if parsed_args.command == 'add':
            # Join multiple words in description back together
            description = ' '.join(parsed_args.description)
            task = core.add_task(task_file, description)
            print(f"Added task {task['id']}: '{task['description']}'")
            return 0 # Success exit code

        elif parsed_args.command == 'list':
            tasks = core.list_tasks(task_file)
            if not tasks:
                print("No tasks found.")
                return 0

            print(f"Tasks in {task_file}:")
            for task in tasks:
                if parsed_args.all or not task['done']:
                    status = "[X]" if task['done'] else "[ ]"
                    print(f"  {task['id']}: {status} {task['description']}")
            return 0

        elif parsed_args.command == 'done':
            task = core.mark_task_done(task_file, parsed_args.task_id)
            print(f"Marked task {task['id']} as done.")
            return 0

        elif parsed_args.command == 'undone':
             task = core.mark_task_undone(task_file, parsed_args.task_id)
             print(f"Marked task {task['id']} as not done.")
             return 0

        elif parsed_args.command == 'clear':
            if not parsed_args.yes:
                print("Confirmation required. Use --yes or -y to clear all tasks.", file=sys.stderr)
                return 1 # Error exit code - confirmation needed
            core.clear_tasks(task_file)
            print(f"All tasks cleared from {task_file}.")
            return 0

    except core.TaskNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        return 1 # Error exit code
    except ValueError as e: # Catch description errors from add_task
        print(f"Error: {e}", file=sys.stderr)
        return 1
    except Exception as e:
        # Generic catch-all for unexpected errors
        print(f"An unexpected error occurred: {e}", file=sys.stderr)
        return 2 # Different error code for unexpected

    return 0 # Default success

if __name__ == '__main__':
    # Allows running the script directly
    # sys.exit ensures the exit code from main() is used
    sys.exit(main())
Self-Correction: Added DEFAULT_TASK_FILE. Made command required in subparsers. Handled nargs='+' for descriptions. Implemented logic for list --all, done, undone, and clear with confirmation. Added basic error handling using try...except around the command logic, printing errors to stderr and returning non-zero exit codes. Added if __name__ == '__main__': block. Added return statements with exit codes within the try block.

Testing Strategies:

  1. Using subprocess: Call the CLI script as a separate process and inspect its exit code, stdout, and stderr. This is closest to how a real user interacts with the tool, but it is slower and couples tests to how the script is installed or invoked (a minimal sketch follows this list).

  2. Calling the main() Function Directly: Import the main function and call it with a list of strings representing the arguments (sys.argv). This is faster and tests the Python code more directly. However, you need to handle:

    • Argument Parsing: Pass arguments correctly.
    • Capturing Output: Redirect sys.stdout and sys.stderr (using io.StringIO or Pytest's capsys/capfd fixtures).
    • Exit Codes: main() should ideally return the intended exit code, or you need to mock/catch sys.exit.
    • File Paths: Use the task_file fixture to ensure tests use temporary files.
  3. Using CLI Testing Libraries: Libraries like click.testing.CliRunner (for Click apps) or typer.testing.CliRunner (for Typer apps) provide convenient utilities for invoking commands and capturing output within tests. For argparse, we'll primarily use strategy 2 with Pytest fixtures.
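
A minimal sketch of strategy 1, invoking the CLI as a subprocess (this assumes the taskmgr package is installed, e.g. in editable mode, so that python -m taskmgr.cli works in a fresh interpreter; the temporary path comes from the same task_file fixture):

import subprocess
import sys

def test_cli_add_via_subprocess(task_file):
    result = subprocess.run(
        [sys.executable, "-m", "taskmgr.cli", "--file", task_file, "add", "Subprocess task"],
        capture_output=True,
        text=True,
    )
    assert result.returncode == 0
    assert "Added task 1" in result.stdout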

Example using main() and capsys:

# tests/test_cli.py
import pytest
from taskmgr import cli
from taskmgr import core # May need core for checking file state directly
import os
from io import StringIO

# Use the task_file fixture from conftest.py
# It's automatically available

def test_cli_add_task(task_file, capsys):
    """Test the 'add' command via the main function."""
    args = ['--file', task_file, 'add', 'Test CLI Add']
    exit_code = cli.main(args)

    assert exit_code == 0 # Check success exit code

    # Capture stdout
    captured = capsys.readouterr()
    assert "Added task 1: 'Test CLI Add'" in captured.out

    # Verify side effect (task added to file)
    tasks = core.list_tasks(task_file)
    assert len(tasks) == 1
    assert tasks[0]['description'] == 'Test CLI Add'

def test_cli_list_tasks(task_file, capsys):
    """Test the 'list' command."""
    # Add some tasks first using core API or previous CLI calls
    core.add_task(task_file, "CLI Task 1")
    core.add_task(task_file, "CLI Task 2 Done")
    core.mark_task_done(task_file, 2)

    # Test listing only undone tasks (default)
    args_list_undone = ['--file', task_file, 'list']
    exit_code_undone = cli.main(args_list_undone)
    captured_undone = capsys.readouterr()
    assert exit_code_undone == 0
    assert f"Tasks in {task_file}:" in captured_undone.out
    assert "1: [ ] CLI Task 1" in captured_undone.out
    assert "2: [X] CLI Task 2 Done" not in captured_undone.out # Should not be listed

    # Test listing all tasks
    args_list_all = ['--file', task_file, 'list', '--all']
    exit_code_all = cli.main(args_list_all)
    captured_all = capsys.readouterr()
    assert exit_code_all == 0
    assert f"Tasks in {task_file}:" in captured_all.out
    assert "1: [ ] CLI Task 1" in captured_all.out
    assert "2: [X] CLI Task 2 Done" in captured_all.out # Should be listed now

def test_cli_list_empty(task_file, capsys):
    """Test 'list' when no tasks exist."""
    args = ['--file', task_file, 'list']
    exit_code = cli.main(args)
    captured = capsys.readouterr()
    assert exit_code == 0
    assert "No tasks found." in captured.out

def test_cli_done_task(task_file, capsys):
    """Test the 'done' command."""
    task = core.add_task(task_file, "Task to complete via CLI")
    assert not task['done']

    args = ['--file', task_file, 'done', str(task['id'])]
    exit_code = cli.main(args)
    captured = capsys.readouterr()

    assert exit_code == 0
    assert f"Marked task {task['id']} as done." in captured.out

    # Verify side effect
    updated_task = core.get_task_by_id(task_file, task['id'])
    assert updated_task['done'] is True

def test_cli_done_not_found(task_file, capsys):
    """Test 'done' command with a non-existent ID."""
    args = ['--file', task_file, 'done', '999']
    exit_code = cli.main(args)
    captured = capsys.readouterr()

    assert exit_code == 1 # Check error exit code
    assert "Error: Task with ID 999 not found." in captured.err # Check stderr

def test_cli_clear_no_confirm(task_file, capsys):
     """Test 'clear' without confirmation."""
     core.add_task(task_file, "Task to potentially clear")
     args = ['--file', task_file, 'clear']
     exit_code = cli.main(args)
     captured = capsys.readouterr()

     assert exit_code == 1 # Error code
     assert "Confirmation required" in captured.err
     assert len(core.list_tasks(task_file)) == 1 # Ensure task not cleared

def test_cli_clear_with_confirm(task_file, capsys):
     """Test 'clear' with confirmation."""
     core.add_task(task_file, "Task to clear")
     args = ['--file', task_file, 'clear', '--yes']
     exit_code = cli.main(args)
     captured = capsys.readouterr()

     assert exit_code == 0 # Success code
     assert f"All tasks cleared from {task_file}" in captured.out
     assert len(core.list_tasks(task_file)) == 0 # Ensure task cleared

# Add tests for 'undone', invalid arguments, etc.

Key Elements:

  • capsys Fixture: A built-in Pytest fixture that captures output written to sys.stdout and sys.stderr during the test. capsys.readouterr() returns an object with .out and .err attributes containing the captured text.
  • Calling cli.main(args): We pass a list of strings representing the command-line arguments.
  • Asserting Exit Code: Check the return value of main().
  • Asserting Output: Check captured.out and captured.err.
  • Asserting Side Effects: Use core functions (or direct file reads) to verify changes to the task file.
  • task_file Fixture: Crucial for ensuring each test runs with a clean, isolated task file.

Testing Web APIs (e.g., Flask/Django) - Overview

Testing web APIs built with frameworks like Flask or Django involves sending simulated HTTP requests to your application and verifying the responses (status code, headers, JSON/HTML body).

Common Strategies:

  1. Framework Test Clients: Most web frameworks provide a test client that simulates requests within the same process without needing a live server.
    • Flask: app.test_client() provides methods like get(), post(), put(), delete().
    • Django: django.test.Client offers similar methods.
    • FastAPI: TestClient (based on requests).
    • These clients allow you to send requests to your route handlers (@app.route(...), etc.) and inspect the Response object.
  2. Pytest Plugins: Plugins like pytest-flask, pytest-django, and helpers for FastAPI simplify fixture setup (like providing the test client or app context) and integration.
  3. Mocking External Services: Use mocker or similar tools to mock database interactions (SQLAlchemy, Django ORM), external API calls (requests), or other services your API depends on, allowing focused testing of your API logic.

Example Snippet (Conceptual Flask):

# Assuming Flask app defined in my_flask_app.py
# And pytest-flask installed

# tests/test_flask_api.py
import pytest
# from my_flask_app import app # Import your Flask app instance

# pytest-flask provides the 'client' fixture automatically
def test_api_hello(client): # client is the test client fixture
    response = client.get('/api/hello')
    assert response.status_code == 200
    assert response.content_type == 'application/json'
    json_data = response.get_json()
    assert json_data == {"message": "Hello, World!"}

def test_api_create_item(client, mocker):
    # Mock database interaction
    mock_db_save = mocker.patch('my_flask_app.database.save_item')
    mock_db_save.return_value = {"id": 123, "name": "Test Item"}

    response = client.post('/api/items', json={"name": "Test Item"})

    assert response.status_code == 201 # Created
    json_data = response.get_json()
    assert json_data['id'] == 123
    assert json_data['name'] == "Test Item"
    mock_db_save.assert_called_once_with({"name": "Test Item"})

Testing Data Science Code (e.g., Pandas DataFrames) - Overview

Testing code that uses libraries like Pandas, NumPy, or Scikit-learn often involves checking the structure, content, or properties of data structures like DataFrames or arrays.

Common Strategies:

  1. Fixture for Sample Data: Use fixtures to create or load sample DataFrames or arrays needed for tests.
  2. pandas.testing: Pandas provides utility functions specifically for testing:
    • pd.testing.assert_frame_equal(df1, df2): Checks if two DataFrames are identical (or close, with tolerance options). Handles NaNs, dtypes, and indices correctly.
    • pd.testing.assert_series_equal(s1, s2)
    • pd.testing.assert_index_equal(idx1, idx2)
  3. Hypothesis Strategies: Hypothesis can generate Pandas DataFrames or NumPy arrays based on defined column types and constraints, enabling property-based testing of data transformations. A short sketch follows the Pandas example below.
  4. Mocking: Mock file reading (pd.read_csv), database connections, or API calls used to acquire data if you only want to test the transformation logic.

Example Snippet (Conceptual Pandas):

# tests/test_data_processing.py
import pytest
import pandas as pd
from pandas.testing import assert_frame_equal
# Assume data_processor.py contains:
# def calculate_means(df: pd.DataFrame) -> pd.DataFrame:
#     return df.groupby('category').mean()
from my_project import data_processor

@pytest.fixture
def sample_dataframe():
    """Provides a sample Pandas DataFrame for tests."""
    data = {
        'category': ['A', 'B', 'A', 'B', 'A', 'C'],
        'value1': [10, 20, 12, 22, 14, 30],
        'value2': [5, 15, 7, 17, 9, 25],
    }
    return pd.DataFrame(data)

def test_calculate_means(sample_dataframe):
    expected_data = {
        'value1': {'A': 12.0, 'B': 21.0, 'C': 30.0},
        'value2': {'A': 7.0, 'B': 16.0, 'C': 25.0},
    }
    expected_df = pd.DataFrame(expected_data)
    expected_df.index.name = 'category' # Match index name

    result_df = data_processor.calculate_means(sample_dataframe)

    # Use pandas testing utility for robust comparison
    assert_frame_equal(result_df, expected_df, check_dtype=True)

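To illustrate the Hypothesis strategy mentioned above, here is a hedged sketch of a property-based test for the same hypothetical calculate_means function, using hypothesis.extra.pandas. Rather than asserting exact values, it checks a structural property that must hold for any generated input:

# tests/test_data_processing_properties.py
import hypothesis.strategies as st
from hypothesis import given
from hypothesis.extra.pandas import column, data_frames, range_indexes

from my_project import data_processor  # hypothetical module, as above

@given(
    df=data_frames(
        columns=[
            column("category", elements=st.sampled_from(["A", "B", "C"])),
            column("value1", elements=st.floats(min_value=-1e6, max_value=1e6,
                                                allow_nan=False, allow_infinity=False)),
        ],
        index=range_indexes(min_size=1, max_size=50),
    )
)
def test_calculate_means_one_row_per_category(df):
    result = data_processor.calculate_means(df)
    # Property: the grouped result has exactly one row per distinct category.
    assert set(result.index) == set(df["category"].unique())
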
The key is to use appropriate tools (pandas.testing, Hypothesis) and testing patterns (fixtures, mocking) tailored to the specific type of application and its dependencies.

Workshop Testing the Task Manager CLI

Let's apply the CLI testing techniques to our taskmgr project.

Goal: Write Pytest tests for the various commands of the taskmgr CLI (add, list, done, clear) using the cli.main function and the capsys fixture.

Steps:

  1. Ensure tests/test_cli.py Exists: We created this earlier.
  2. Add Necessary Imports: Make sure pytest, cli, core, os, and the task_file fixture are available.
  3. Implement Tests (as shown previously): Copy or adapt the example tests provided in the "Testing Command-Line Interfaces (CLIs)" section above into your tests/test_cli.py. Ensure they cover:

    • Adding a task (test_cli_add_task)
    • Listing tasks (undone and --all) (test_cli_list_tasks)
    • Listing when empty (test_cli_list_empty)
    • Marking a task done (test_cli_done_task)
    • Attempting to mark a non-existent task done (test_cli_done_not_found)
    • Clearing without confirmation (test_cli_clear_no_confirm)
    • Clearing with confirmation (test_cli_clear_with_confirm)
    • (Addition): Add a test for adding a task with a multi-word description.
    • (Addition): Add a test for the undone command.

    Additional Tests for tests/test_cli.py:

    # tests/test_cli.py
    # ... other imports and tests ...
    
    def test_cli_add_multi_word_description(task_file, capsys):
        """Test adding a task with spaces in the description."""
        args = ['--file', task_file, 'add', 'This is', 'a multi word', 'task']
        exit_code = cli.main(args)
        assert exit_code == 0
        captured = capsys.readouterr()
        assert "Added task 1: 'This is a multi word task'" in captured.out
        # Verify core state
        tasks = core.list_tasks(task_file)
        assert tasks[0]['description'] == 'This is a multi word task'
    
    def test_cli_undone_task(task_file, capsys):
        """Test the 'undone' command."""
        task = core.add_task(task_file, "Task to un-complete via CLI")
        core.mark_task_done(task_file, task['id']) # Mark done first
        assert core.get_task_by_id(task_file, task['id'])['done'] is True
    
        args = ['--file', task_file, 'undone', str(task['id'])]
        exit_code = cli.main(args)
        captured = capsys.readouterr()
    
        assert exit_code == 0
        assert f"Marked task {task['id']} as not done." in captured.out
    
        # Verify side effect
        updated_task = core.get_task_by_id(task_file, task['id'])
        assert updated_task['done'] is False
    
    def test_cli_undone_not_found(task_file, capsys):
        """Test 'undone' command with a non-existent ID."""
        args = ['--file', task_file, 'undone', '998']
        exit_code = cli.main(args)
        captured = capsys.readouterr()
    
        assert exit_code == 1 # Error exit code
        assert "Error: Task with ID 998 not found." in captured.err # Check stderr
    
    def test_cli_add_empty_description_error(task_file, capsys):
        """Test that CLI add handles empty description error from core."""
        # Note: argparse with nargs='+' requires at least one argument, so a
        # truly empty description is better exercised by a direct core test.
        # A whitespace-only description should trigger the ValueError in core.
        args_ws = ['--file', task_file, 'add', '   ']
        exit_code = cli.main(args_ws)
        captured = capsys.readouterr()
    
        assert exit_code == 1 # Error code
        assert "Error: Task description cannot be empty." in captured.err
    

  4. Run Tests: Execute pytest -v tests/test_cli.py to run only the CLI tests, or pytest to run all tests.

  5. Analyze Output: Ensure all CLI tests pass. Check that the exit codes, stdout, and stderr are correctly asserted using capsys. Verify that the tests correctly interact with the temporary task_file to check side effects.

This workshop provides a practical application of testing a CLI application by directly invoking its entry point function and using Pytest fixtures like capsys and task_file (which uses tmp_path) to control inputs, capture outputs, and manage state isolation.

9. Test Coverage

Writing tests is essential, but how do you know if your tests are adequately covering your codebase? Are there parts of your application logic that are never executed during your test runs? Test coverage measurement helps answer these questions.

What is Test Coverage?

Test coverage is a metric that describes the degree to which the source code of a program is executed when a particular test suite is run. It's typically expressed as a percentage. The most common type of coverage measured is statement coverage:

  • Statement Coverage: Measures whether each executable statement in your code was executed by at least one test.

Other types include:

  • Branch Coverage: Measures whether each possible branch (e.g., the True and False paths of an if statement) has been taken; the short example below illustrates how it differs from statement coverage.
  • Function/Method Coverage: Measures whether each function or method has been called.
  • (Path Coverage, Condition Coverage, etc.): More granular types, often harder to achieve 100% on and potentially less practical.

Statement coverage is the most common starting point and provides valuable insights.
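
The difference between statement and branch coverage is easiest to see with a tiny, hypothetical example: a single test can execute every statement yet still leave one side of an if untested.

# A hypothetical function and test illustrating statement vs. branch coverage.
def apply_discount(price: float, is_member: bool) -> float:
    discount = 0
    if is_member:
        discount = 10
    return price - discount

def test_apply_discount_for_members():
    # Every statement above executes, so statement coverage reports 100%.
    # The implicit "else" path (is_member=False) is never taken, so branch
    # coverage reports a missed branch until a second test exercises it.
    assert apply_discount(100, True) == 90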

Why Measure Coverage?

  1. Identify Untested Code: The primary benefit. Coverage reports highlight lines, functions, or branches of your code that are not executed by your tests. This immediately shows you where you need to add more tests to ensure all logic paths are verified.
  2. Build Confidence: High coverage (especially combined with well-written tests) increases confidence that your code behaves correctly under various conditions. It doesn't guarantee correctness (tests might not check the right things), but it reduces the chance of completely untested logic being shipped.
  3. Guide Test Writing: Reviewing coverage reports can guide you on what tests to write next, focusing your efforts on the untested areas.
  4. Detect Dead Code: Code that consistently shows 0% coverage across comprehensive test runs might be "dead code" – code that is never actually used and could potentially be removed.
  5. Track Progress: Monitoring coverage over time can help ensure that testing keeps pace with development and that new code is adequately tested.

Important Note: High coverage (e.g., 90-100%) is often a good goal, but it's not an end in itself. 100% coverage doesn't mean your code is bug-free. You could have 100% coverage with tests that have poor assertions or don't check meaningful properties. Focus on writing effective tests for critical logic, and use coverage as a tool to find gaps in your testing, not as the sole measure of test quality.

Using pytest-cov

The standard tool for measuring test coverage with Pytest is the pytest-cov plugin.

  1. Installation:

    pip install pytest-cov
    

  2. Basic Usage: Run Pytest with the --cov option, specifying the package(s) or module(s) you want to measure coverage for. Typically, this is your source code directory (e.g., src/taskmgr).

    # Measure coverage for the 'taskmgr' package inside 'src'
    pytest --cov=src/taskmgr
    
    # Alternatively, specify the top-level package name if installed
    # pytest --cov=taskmgr
    
    # Measure coverage for the current directory (less precise if tests are mixed in)
    # pytest --cov=.
    

Command-Line Options:

  • --cov=<path>: Specifies the path(s) to measure coverage for. Can be repeated. Use the path relative to your project root, often pointing to your source package directory (like src/taskmgr).
  • --cov-report=<type>: Specifies the output format for the coverage report. Common types:
    • term: Basic summary printed to the terminal (default, often combined with term-missing).
    • term-missing: Terminal summary including line numbers of missing statements (very useful!).
    • html: Generates a detailed HTML report in a htmlcov/ directory. Allows browsing source code with highlighted covered/missing lines.
    • xml: Generates an XML report (e.g., for CI systems like Jenkins/Cobertura).
    • json: Generates a JSON report.
    • annotate: Creates copies of source files annotated with coverage information.
    You can request multiple report types in one run, e.g., --cov-report=term-missing --cov-report=html.
  • --cov-fail-under=<min>: Causes Pytest to exit with a non-zero status code (failing the build in CI) if the total coverage percentage is below <min>. Example: --cov-fail-under=85.
  • --cov-config=<file>: Specify a custom configuration file (usually .coveragerc or pyproject.toml) for more advanced coverage options (e.g., excluding specific files/lines).

Recommended Command:

A good command for local development and CI often includes terminal output with missing lines and an HTML report:

pytest --cov=src/taskmgr --cov-report=term-missing --cov-report=html

Interpreting Coverage Reports (HTML, Terminal)

  • Terminal (term-missing):

    ----------- coverage: platform linux, python 3.x.y -----------
    Name                  Stmts   Miss  Cover   Missing
    ---------------------------------------------------
    src/taskmgr/__init__.py   0      0   100%
    src/taskmgr/cli.py       65     10    85%   18, 25-28, 95-98, 115
    src/taskmgr/core.py      70      5    93%   15-16, 88, 105, 118
    ---------------------------------------------------
    TOTAL                   135     15    89%
    

    • Stmts: Total number of executable statements.
    • Miss: Number of statements not executed.
    • Cover: Percentage coverage ((Stmts - Miss) / Stmts).
    • Missing: Line numbers (or ranges) of statements not covered. This tells you exactly where to look.
  • HTML (htmlcov/index.html): Open the index.html file in your web browser. It provides a navigable summary similar to the terminal report. Clicking on a filename takes you to an annotated version of the source code:

    • Green lines: Covered statements.
    • Red lines: Missed statements (not executed).
    • Yellow lines (optional): Missed branches.
    • This visual representation is extremely helpful for understanding exactly what parts of complex functions or conditional logic were not tested.

Aiming for Meaningful Coverage

As mentioned, don't chase 100% coverage blindly. Instead:

  1. Focus on Critical Logic: Ensure your core business logic, complex algorithms, and error handling paths have high coverage with meaningful tests.
  2. Understand Gaps: Use the report to identify untested areas. Ask why they are untested. Is it trivial code (getters/setters)? Is it hard-to-reach error handling? Is it genuinely untested logic?
  3. Test Branches: Pay attention to branch coverage (often implicitly shown by missed lines within if/else blocks in the HTML report). Ensure both sides of important conditions are tested.
  4. Configuration (.coveragerc / pyproject.toml): Configure coverage measurement to exclude code that doesn't require testing (e.g., trivial __init__.py, code behind if __name__ == '__main__':, abstract methods meant to be overridden, potentially boilerplate code).

Example Configuration (pyproject.toml):

# pyproject.toml

# ... (other sections) ...

[tool.coverage.run]
source = ["src/taskmgr"] # Define source for coverage automatically
branch = true # Enable branch coverage measurement
parallel = true # Enable for parallel test runners like pytest-xdist

[tool.coverage.report]
# Fail if total coverage is below 85%
fail_under = 85
show_missing = true
skip_covered = false # Show all files in report, even 100% covered
# Patterns for lines/code to exclude from coverage reporting
# (exclude_lines belongs under [tool.coverage.report], not its own section)
exclude_lines = [
    "pragma: no cover", # Standard pragma recognized by coverage.py
    "raise NotImplementedError",
    "if __name__ == .__main__.:",
    # Exclude simple __repr__ if not critical
    "def __repr__",
    "if TYPE_CHECKING:", # Exclude blocks only for type checkers
]

[tool.coverage.paths]
source = [
    "src/",
    "*/site-packages/",
]

[tool.coverage.html]
directory = "coverage_html_report" # Customize output directory

[tool.coverage.xml]
output = "coverage.xml"

This configuration helps tailor coverage measurement and reporting to your project's specific needs.
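
As a complement to the exclude_lines patterns above, the standard pragma can also be applied directly in source code. A hypothetical snippet (debug_dump is an illustrative helper, not part of taskmgr):

# Hypothetical debug-only helper.
def debug_dump(tasks):  # pragma: no cover
    # The pragma tells coverage.py to ignore this function entirely,
    # so it never appears as missed lines in the report.
    for task in tasks:
        print(task)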

Workshop Measuring Coverage for the Task Manager

Let's measure the test coverage for our taskmgr project and analyze the results.

Goal: Run Pytest with coverage enabled, generate terminal and HTML reports, identify untested lines, and potentially add a test to improve coverage.

Steps:

  1. Install pytest-cov: If you haven't already:

    # Ensure your virtual environment is active
    pip install pytest-cov
    

  2. Run Coverage: Execute Pytest with coverage options from your project root (taskmgr-project):

    pytest --cov=src/taskmgr --cov-report=term-missing --cov-report=html
    
    (Note: If you didn't use the src layout, adjust --cov=src/taskmgr to --cov=taskmgr or wherever your main package code resides relative to the root).

  3. Analyze Terminal Report: Observe the term-missing output in your terminal.

    • What is the total coverage percentage?
    • Which files (cli.py, core.py) have less than 100% coverage?
    • Note down some specific line numbers listed under the "Missing" column for cli.py and core.py.
  4. Analyze HTML Report:

    • Open the generated coverage_html_report/index.html (or htmlcov/index.html if you didn't configure the directory) in your web browser.
    • Navigate to cli.py and core.py.
    • Examine the lines highlighted in red (missed statements). Do these correspond to the line numbers from the terminal report?
    • Identify a specific piece of logic (e.g., an if condition, an except block, a specific part of a function) that is currently marked as missed.
  5. (Example) Identify and Fix a Gap:

    • Let's assume the coverage report shows that the generic except Exception as e: block in cli.py's main function is not covered (for example, the line print(f"An unexpected error occurred: {e}", file=sys.stderr) is listed as missing). This means none of our tests caused an unexpected error within the try block of cli.main.
    • Add a Test: We can simulate such an error using mocking. Let's force core.list_tasks to raise an unexpected RuntimeError when called by the list command.

    Add to tests/test_cli.py:

    # tests/test_cli.py
    # ... imports ...
    
    @pytest.mark.errors # Mark as error test
    def test_cli_unexpected_error(task_file, capsys, mocker):
        """Test the generic exception handler in cli.main."""
        # Mock a core function called by the 'list' command to raise an unexpected error
        mock_list = mocker.patch('taskmgr.core.list_tasks', side_effect=RuntimeError("Something went very wrong!"))
    
        args = ['--file', task_file, 'list']
        exit_code = cli.main(args)
        captured = capsys.readouterr()
    
        # Assert the specific exit code for unexpected errors
        assert exit_code == 2
        # Assert the generic error message is printed to stderr
        assert "An unexpected error occurred: Something went very wrong!" in captured.err
        # Ensure the mock was called
        mock_list.assert_called_once_with(task_file)
    

  6. Re-run Coverage: Execute the coverage command again:

    pytest --cov=src/taskmgr --cov-report=term-missing --cov-report=html
    

  7. Verify Improvement:

    • Check the terminal report. Has the total coverage percentage increased? Is the line number corresponding to the generic except block in cli.py no longer listed as missing?
    • Check the HTML report for cli.py. The lines within the generic except Exception: block should now be green (covered).

This workshop demonstrates the practical workflow of using pytest-cov: run coverage, analyze reports to find gaps, write tests to fill those gaps (often involving mocking for error conditions), and re-run coverage to verify the improvement. Remember to focus on covering meaningful logic and branches rather than just chasing a percentage.

Conclusion Best Practices and Next Steps

We've journeyed through the fundamentals of testing, setting up Pytest, writing basic and structured tests, handling errors, using fixtures, parameterizing inputs, mocking dependencies, exploring advanced features, testing different application types like CLIs, and measuring test coverage. By applying these concepts, you can significantly enhance the quality, reliability, and maintainability of your Python projects developed on Linux or any other platform.

Recap of Key Concepts

  • Why Test: Find bugs early, improve design, enable refactoring, serve as documentation, build confidence.
  • Testable Code: Design with modularity, dependency injection, and clear inputs/outputs in mind.
  • Pytest Basics: Test functions (test_*), simple assert statements, automatic discovery.
  • Structuring Tests: Use files, classes (Test*), and markers (@pytest.mark.*) for organization and selection.
  • Handling Failures: Read Pytest's detailed tracebacks. Use pytest.raises and pytest.warns to test expected exceptions and warnings. Use -x and -k for execution control.
  • Fixtures (@pytest.fixture): The cornerstone of Pytest setup/teardown. Provide reusable context, manage dependencies, and enhance readability. Use scope for efficiency and conftest.py for sharing. Factory fixtures add flexibility.
  • Parameterization (@pytest.mark.parametrize): Run the same test logic with multiple input sets, reducing duplication and improving thoroughness.
  • Mocking (unittest.mock, pytest-mock/mocker): Isolate code under test by replacing dependencies (files, APIs, databases, other objects) with controllable fakes. Crucial for unit testing. Use mocker.patch and assert mock interactions (assert_called_with, etc.).
  • Advanced Features: Hooks (pytest_addoption, etc.) for customization, plugins for packaging extensions, property-based testing (Hypothesis) for broader input exploration.
  • Application Testing: Adapt techniques for CLIs (capturing output, checking exit codes), Web APIs (test clients, mocking backends), and Data Science (Pandas testing utilities).
  • Test Coverage (pytest-cov): Measure which parts of your code are executed by tests. Use reports (term-missing, html) to identify untested code and guide test writing. Aim for meaningful coverage, not just a high percentage.

Best Practices for Writing Testable Code

  • Keep Units Small: Functions and methods should ideally do one thing well (Single Responsibility Principle).
  • Inject Dependencies: Don't let functions or classes create their own collaborators (databases, services, complex objects). Pass them in as arguments or via constructors (see the sketch after this list).
  • Separate Pure Logic from Side Effects: Isolate calculations and core logic from code that interacts with the outside world (I/O, network). Test the pure logic separately and mock the side effects when testing the interacting code.
  • Use Interfaces/Abstractions (where appropriate): Depending on abstract base classes or protocols can make swapping real objects for mocks easier.
  • Write Tests Alongside Code (or Before - TDD): Don't leave testing as an afterthought. Writing tests concurrently helps drive better design. Test-Driven Development (TDD) – writing tests before the code – is a powerful discipline for ensuring testability from the start.
  • Clear Naming: Name test functions descriptively, indicating the scenario being tested.
  • One Logical Assertion Per Test (Usually): While a test function might have multiple assert statements to check different facets of the result, each test function should generally verify a single logical concept or behavior path. Avoid testing completely unrelated things in the same function.
  • Keep Tests Independent: Tests should not depend on the order they run in or the side effects of other tests. Use fixtures or setup/teardown mechanisms to ensure a clean state for each test.
  • Test the Interface, Not the Implementation: Focus tests on the public API or observable behavior of your code. Avoid testing private methods directly; test them indirectly through the public methods that use them. This makes tests less brittle to refactoring internal implementation details.
  • Use Coverage Reports Intelligently: Don't just chase 100%. Use coverage to find unintentionally untested critical code paths or branches.

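To make the "Inject Dependencies" and "Separate Pure Logic from Side Effects" points concrete, here is a small, hypothetical sketch (build_report and send_report are illustrative names, not part of taskmgr):

# A minimal illustration of dependency injection for testability.
from unittest.mock import Mock

def build_report() -> str:
    # Pure logic: no I/O, trivially testable with a plain assert.
    return "daily report"

def send_report(mailer, recipient: str = "admin@example.com") -> None:
    # The mailer collaborator is injected rather than created here,
    # so tests can pass in a fake instead of a real SMTP/email client.
    mailer.send(recipient, build_report())

def test_send_report_uses_injected_mailer():
    fake_mailer = Mock()
    send_report(fake_mailer, recipient="qa@example.com")
    fake_mailer.send.assert_called_once_with("qa@example.com", "daily report")
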
Integrating Testing into CI/CD Pipelines (Brief Overview)

Automated testing provides the most value when integrated into your development workflow, particularly within Continuous Integration and Continuous Deployment/Delivery (CI/CD) pipelines (e.g., using GitLab CI, GitHub Actions, Jenkins).

  1. Automated Execution: Configure your CI/CD pipeline to automatically run your Pytest suite (pytest or pytest --cov=...) whenever code is pushed or a merge/pull request is created.
  2. Fail the Build: Ensure the CI job fails if any tests fail (Pytest exits with a non-zero code) or if coverage drops below a defined threshold (--cov-fail-under). This prevents merging broken or untested code.
  3. Reporting: Configure CI jobs to publish test results (e.g., JUnit XML format) and coverage reports (HTML, XML/Cobertura) for visibility.
  4. Environment Consistency: Use containers (like Docker) or consistent virtual environments in your CI pipeline to ensure tests run in an environment that closely matches production.

Integrating tests into CI/CD provides rapid feedback, catches regressions early, and maintains code quality automatically as part of the development process.

Further Learning Resources

Good next stops include the official Pytest documentation (docs.pytest.org), the pytest-cov and coverage.py documentation, the unittest.mock reference in the Python standard library, and the Hypothesis documentation for property-based testing. Testing is a deep and rewarding skill. By mastering Pytest and embracing the principles of testable code design, you equip yourself to build more reliable, maintainable, and professional Python applications. Happy testing!