
Tests

The Shopfoo.Product.Tests project demonstrates how to test domain workflows. Tests exercise workflows through the IProductApi entry point — the same boundary used by the real application — making them integration-style tests that validate the full wiring from API to data layer.

Test Stack

Library       Role
TUnit         Test runner
Unquote       Quotation-based assertions
NSubstitute   Mocking external HTTP clients
FsCheck       Property-based testing with custom generators

Test Organization

The project mirrors the domain structure:

📂 Shopfoo.Product.Tests/
├── 📄 Examples.fs              ← Shared test data (books, authors, products)
├── 📂 Data/
│   └── 📄 OpenLibraryShould.fs
└── 📂 Workflows/
    ├── 📄 Types.fs              ← Test-specific types and helpers
    ├── 📄 ApiTestFixture.fs     ← Test harness with DI and mocks
    ├── 📄 AddProductShould.fs
    ├── 📄 DetermineStockShould.fs
    ├── 📄 GetProductShould.fs
    ├── 📄 GetPurchasePricesShould.fs
    ├── 📄 MarkAsSoldOutShould.fs
    ├── 📄 ReceiveSupplyShould.fs
    └── 📄 SavePricesShould.fs

Each workflow has its own test file — the file tree is a map of tested features, reflecting the convention used in the production code.

Test Naming

Test classes are named {Feature}Should, and each test method completes the sentence started by the class name. This produces human-readable test names when the runner displays them as {Class}.{Method}:

DetermineStockShould ... deduct sales from initial stock
SavePricesShould ... reject invalid RetailPrice

The main benefit is that "should" is factored out — no need to repeat it in every test name. The drawback is less flexibility in phrasing test names, since they must all grammatically follow "should".

This convention works particularly well with use case tests, where the feature name starts with a verb: AddProductShould, DetermineStockShould, SavePricesShould… In this case, the class name naturally maps to the When in BDD terms — the action being exercised. It is less literal for non-use-case classes like OpenLibraryShould, but the resulting phrases still read correctly.


Prefer "given" over "when" to describe preconditions — e.g. reject supply given quantity is zero rather than reject supply when quantity is zero. This aligns with the Given/When/Then convention from BDD (Gherkin): the use case name already captures the When, so the test method describes the expected outcome (Then) followed by the precondition (Given). It reads slightly less naturally but is more rigorous.

Here is the full test list, illustrating how the naming convention scales across the project. Parameterized tests appear as child nodes under the parent test name, with argument values in parentheses.

Full test list

Test Fixture

The ApiTestFixture is the central test harness. It builds a dependency injection container that combines:

  • Production dependencies: the real IProductApi wiring via AddProductApi()

  • Test dependencies: program mocks via AddProgramMocks(), in-memory repositories, and NSubstitute mocks for external HTTP clients (IFakeStoreClient, IOpenLibraryClient)

Tests create a fixture with the initial data they need, then call fixture.Api to exercise workflows. This approach tests the complete chain — from the API surface through workflow execution to the data layer — while keeping external dependencies mocked.
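As a rough sketch of this composition (AddProductApi() and AddProgramMocks() are named above; the registration details and interface resolutions shown here are assumptions, not code taken from the project):

```fsharp
open Microsoft.Extensions.DependencyInjection
open NSubstitute

// Hypothetical sketch of the fixture's container composition.
let buildTestProvider () =
    let services = ServiceCollection()
    services.AddProductApi() |> ignore       // production wiring: the real IProductApi
    services.AddProgramMocks() |> ignore     // test wiring: program mocks, in-memory repositories
    // System boundary: external HTTP clients are NSubstitute mocks
    services.AddSingleton(Substitute.For<IFakeStoreClient>()) |> ignore
    services.AddSingleton(Substitute.For<IOpenLibraryClient>()) |> ignore
    services.BuildServiceProvider()

// Tests then resolve the same entry point the real application uses
let api = buildTestProvider().GetRequiredService<IProductApi>()
```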

TDD Style: Outside-In Diamond

This testing strategy follows the Outside-In Diamond approach (also called Outside-In Classicist), which combines two ideas:

  • Outside-In: tests enter through the outermost public boundary — here IProductApi — rather than testing internal components (workflows, pipelines) in isolation. This validates the full integration: instruction wiring, undo strategies, and data-layer access.

  • Classicist / Diamond: internal collaborators use real implementations (the actual Api class, real pipelines, in-memory repositories) rather than mocks. Only the system boundary is mocked — external HTTP clients (IFakeStoreClient, IOpenLibraryClient) that call third-party APIs.

This results in a diamond-shaped test distribution: few unit tests at the bottom, a thick layer of integration tests in the middle (through the API), and few end-to-end tests at the top. The shape contrasts with the classic test pyramid (many unit tests, fewer integration tests) and the London-school approach (mock every collaborator).

The Product test project illustrates this shape well:

  • Unit tests: only one — OpenLibraryShould — which tests a pure utility (key sanitizing) in isolation.

  • Integration tests: all the workflow tests in the Workflows/ folder, exercising the full chain through IProductApi.

  • End-to-end tests: none in the Shopfoo repository. E2E tests typically involve the full deployed stack (browser, server, database) and are generally better managed by a dedicated QA team — though nothing prevents the development team from writing them as well.

The main benefits are:

  • Confidence: tests cover the real wiring, catching integration issues that isolated unit tests would miss.

  • Refactoring safety: internal code can be restructured freely — only the IProductApi contract matters.

  • Few mocks: less test setup, less coupling to implementation details.

Assertions with Unquote

Unquote's key feature is step-by-step reduction: on failure, it evaluates the quoted expression incrementally, revealing intermediate values rather than a generic "expected X but got Y" message. It provides two assertion styles:

  • actual =! expected — reads as "should equal". Simpler and more concise, suitable for straightforward equality checks.

  • test <@ boolean-expression @> — more verbose but more flexible: it supports combining multiple assertions in a single quoted expression (with &&).

    The caveat is that the reduction stops at the first false sub-expression — subsequent assertions are not evaluated.
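A minimal illustration of the two styles (the values are invented for the example):

```fsharp
open Swensen.Unquote

let actual = List.sum [ 1; 2; 3 ]

// Style 1: =! reads as "should equal"
actual =! 6

// Style 2: a quoted boolean expression; several assertions can be
// combined with &&, but reduction stops at the first false one
test <@ actual = 6 && actual > 0 @>
```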

Full-state and multiple assertions

For best practices on constructing full expected values and combining multiple assertions in a single check, see Better assertions in Tips & Tricks.

Example-Based Tests

Most tests follow a straightforward pattern: set up a fixture, call the API, and assert on the result using Unquote:
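A hypothetical example of the pattern (the fixture constructor arguments, the Examples value, and the GetProduct workflow signature are assumptions for illustration):

```fsharp
[<Test>]
let ``return product given it exists`` () = task {
    // Arrange: seed the fixture with the data the test needs
    let fixture = ApiTestFixture(products = [ Examples.book ])

    // Act: call the workflow through the real IProductApi wiring
    let! result = fixture.Api.GetProduct Examples.book.Id

    // Assert: full-state comparison with Unquote
    result =! Ok Examples.book
}
```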

Parameterized Tests

TUnit offers several ways to provide data to the tests. The simplest is the [<Arguments>] attribute — equivalent to xUnit's [<InlineData>] — but it accepts only constant values. For domain types, see Parameterized tests: mirror enums with active patterns in Tips & Tricks.
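For constant inputs, the attribute-based form looks like this (determineStock is a hypothetical stand-in for the workflow under test):

```fsharp
// [<Arguments>] supplies constant values per test case,
// much like xUnit's [<InlineData>]
[<Test>]
[<Arguments(10, 4, 6)>]
[<Arguments(10, 10, 0)>]
let ``deduct sales from initial stock`` (initial: int) (sold: int) (expected: int) =
    determineStock initial sold =! expected
```

The runner then displays each case as a child node, e.g. deduct sales from initial stock (10, 4, 6).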

Property-Based Tests

FsCheck is used for testing domain invariants that must hold for all valid inputs. For example, the purchase price average calculation is tested with mathematical properties:
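One such property might state that the average of any non-empty set of prices lies between its minimum and maximum. A sketch using FsCheck's runner directly (the real tests plug into TUnit instead):

```fsharp
open FsCheck

// Property: for any non-empty array of prices, the average is
// bounded by the minimum and the maximum
let averageIsBounded (NonEmptyArray (prices: decimal[])) =
    let avg = Array.average prices
    Array.min prices <= avg && avg <= Array.max prices

// FsCheck generates many random inputs and fails on a counterexample
Check.QuickThrowOnFailure averageIsBounded
```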

Active patterns as lightweight generators

FsCheck generators can be verbose for constrained domain types. A lighter alternative is to use active patterns — see Active patterns as lightweight FsCheck generators in Tips & Tricks.

Compare this with generating a valid Book, which requires a custom generator composing multiple fields β€” a valid ISBN (with checksum), a list of authors (each with a valid OLID), tags, etc.:
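A simplified sketch of what such a generator might look like (the field names, sub-generators, and Book record shape are assumptions; the real generator lives in the test project):

```fsharp
open FsCheck

// Illustrative only: composes sub-generators for each constrained field.
// isbnGen would produce an ISBN with a valid checksum digit,
// authorGen an author with a valid OLID.
let bookGen : Gen<Book> =
    gen {
        let! isbn = isbnGen
        let! authors = Gen.nonEmptyListOf authorGen
        let! tags = Gen.listOf tagGen
        return { Isbn = isbn; Authors = authors; Tags = tags }
    }

// Registered so FsCheck can inject Book values into properties
type BookArbitrary =
    static member Book() = Arb.fromGen bookGen
```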

Validation Testing

Generating invalid domain values is even harder than generating valid ones. The AddProductShould test class tackles this with a FieldIssueType discriminated union that models each possible validation error, and a FieldIssue record that pairs the expected error with a function to inject the issue into a valid product:

FsCheck generates a NonEmptySet<FieldIssueType> — an arbitrary combination of issues — and the test folds them onto a valid product to produce an invalid one, then verifies that all expected errors are reported:
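In outline, the test might look like this (the FieldIssue members, Extensions.toFieldIssue, and the validation entry point are assumptions inferred from the description above):

```fsharp
// Sketch: fold an arbitrary set of issues onto a valid product,
// then check that every corresponding error is reported
let ``report all errors given invalid fields`` (issues: NonEmptySet<FieldIssueType>) =
    let fieldIssues = issues.Get |> Set.toList |> List.map Extensions.toFieldIssue
    let invalidProduct =
        fieldIssues |> List.fold (fun product issue -> issue.Inject product) validProduct
    let expectedErrors = fieldIssues |> List.map (fun issue -> issue.ExpectedError)
    match Product.validate invalidProduct with
    | Error errors -> Set.ofList errors =! Set.ofList expectedErrors
    | Ok _ -> failwith "expected validation to fail"
```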

This approach requires substantial scaffolding — the NullOrWhitespace and TooLong wrapper types, the Extensions module converting each issue type to a FieldIssue with the right field name, max length, and product updater — but it ensures that the validation { } applicative CE correctly collects all errors rather than stopping at the first one.

Saga Tests

Saga and cancellation scenarios are tested in the Shopfoo.Program.Tests project using a dedicated Order domain. These tests are documented in the Workflows page.

Going Further: Mutation Testing

Property-based testing strengthens what the tests verify — do they check meaningful properties? Mutation testing is a complementary technique: it introduces small changes (mutations) into the production code and checks whether the test suite detects them. While code coverage is a quantitative metric (how much code is exercised), mutation testing is a qualitative one (how well the tests actually catch regressions).

Stryker is the most popular mutation testing framework for .NET. In practice, mutation testing is more commonly used in C# codebases, whereas property-based testing is more prevalent in F#. The two techniques are not mutually exclusive — combining them provides stronger confidence in test quality when the domain warrants it.

Key Takeaways

  • Test through the API boundary: workflows are tested via IProductApi, not by calling programs directly. This validates the full wiring including instruction preparation and undo strategies.

  • In-memory repositories: data-layer implementations are replaced with simple in-memory stores, keeping tests fast and deterministic.

  • External clients are mocked: only HTTP-based external dependencies (IFakeStoreClient, IOpenLibraryClient) use NSubstitute — everything else is a real implementation with in-memory storage.

  • Property-based testing for domain rules: FsCheck validates mathematical properties and validation completeness, complementing example-based tests.
