
How to test code with infrastructure dependencies

October 25, 2024 • By Tomáš Pajurek

In this article, we explore the challenges of testing code that depends on external infrastructure and propose several principles for finding the optimal testing strategy.

To better understand the matter at hand, imagine a background service or job (a system under test) that consumes messages containing JSON documents from a queue and subsequently stores these documents in a cloud object storage such as Azure Blob Storage or AWS S3. Is there a way to test such a service end-to-end in a way that scales both with the number of tests and team members? Can such a test be run easily on a local workstation by anybody on the team?

Architecture of the example service

To test such a service, one of the following strategies is typically used:

Testing with real infrastructure

Should the real infrastructure be used in tests?

Let’s start the discussion by asking whether real infrastructure dependencies such as SQL databases, Azure’s Storage Accounts, Key Vaults, Service Bus, and Event Hubs, or AWS’s S3, DynamoDB, Kinesis, and SES should be used during testing.

Based on our observations, and with a slight grain of salt, there are two opposing camps of engineers with strong opinions on this topic.

On the more traditional side, there are engineers who intentionally design both production and test code so that external dependencies are used only in the late stages of the testing pipeline (such as acceptance or system tests). These engineers are often willing to go to great lengths to abstract away access to all the external dependencies in the code and use mocks or other test doubles heavily. They attempt to cover as much of the codebase as possible with unit tests or other kinds of tests that do not require external dependencies.

On the opposite side, some engineers find almost zero value in any tests that are not running against a fully integrated system and disapprove of introducing abstractions in the code only to decouple logic from infrastructure access.

Obviously, both approaches have issues, and they both have merit. The optimal testing strategy will be a mixture of the two. However, selecting the optimal testing strategy in a given context is not a simple unidimensional decision; it requires evaluating the specific context of the given system and the team.

What are the challenges?

Now, let’s look into specific challenges we face when choosing one approach over the other.

When striving to use real infrastructure as little as possible (shifting right), most problems are caused by the introduced abstractions and the need for test doubles:

On the other hand, when using the real infrastructure as much as possible (shifting left), problems arise with provisioning the infrastructure and complexity of the test code:

How to choose the optimal strategy?

Now, imagine, just for a moment, that all the problems with the real infrastructure in the test code disappear:

In such a hypothetical situation, would there be any reason not to use real infrastructure in all the tests? In our opinion, the answer is no, with one exception. When a test case needs to verify how the system under test responds to infrastructure faults or delays, using real infrastructure is, in most cases, not feasible. Otherwise:

Of course, this is not the reality we live in. However, with modern programming languages, infrastructure tools, cloud services, and testing frameworks, we are much closer to this ultimate state now (2024) than we were just a few years ago. This is especially true when compared to 10-20 years ago, when many best practices around software testing were devised. Despite these advancements, many people continue to dogmatically follow those older practices today.

We believe the advances in the tooling mentioned above will continue at the same or a higher pace, and thus, it is worth betting on this when designing a testing strategy. However, we are still far from the ideal situation, so using real infrastructure in tests should be carefully considered. This leads to several principles, explained below.

Evaluate the context of the tested system

As with software architecture, the correct testing strategy depends on many factors that continuously change. Factors to consider include:

Use as few abstractions and test doubles as possible but not fewer

This rephrasing of Albert Einstein’s quote, “Everything should be made as simple as possible, but not simpler”, is the main principle when deciding on a testing strategy. We should minimize the number of abstractions and test doubles (for the reasons explained above) but not beyond the threshold where various qualities of the tests would suffer. The test qualities that might be compromised include:

Design code for testability but avoid introducing abstractions whose only purpose is to hide infrastructure access

Designing for testability is a sound principle, battle-tested not only in software but also in hardware and other kinds of engineering. In software engineering, it is known not just for making testing more efficient but also for making the tested code itself better by:

Unfortunately, attempting to design code for testability also often leads to adding more layers of abstractions in the quest to make the code using external infrastructure testable without using the actual infrastructure. Such abstractions serve no other purpose than hiding access to the infrastructure. They have no meaning related to the purpose of the actual application.

We are convinced that such abstractions are pure incidental complexity and should be eliminated if possible. However, this does not mean giving up on the separation of concerns or breaking the layering/modularization rules imposed by our architecture of choice.

Example for Domain-Driven Design (DDD)

When using DDD, we are not advocating accessing infrastructure directly in the domain layer instead of using repositories. Repositories, event publishers, and the like have a special, well-understood function in DDD that goes beyond simply hiding infrastructure access. In the case of repositories, they are responsible for fetching domain model aggregates, which includes constructing and deconstructing domain objects and might involve orchestrating access to more than one external dependency, as the sketch below illustrates.

In general, DDD’s domain layer is designed to be highly testable without reliance on real infrastructure. However, when it comes to the application or infrastructure layer, we advocate implementing at least a few tests that utilize real infrastructure or a reliable emulator. It is also crucial to avoid introducing abstractions, beyond those required by DDD, that serve only to hide access to infrastructure.

Challenge existing testing practices

In software testing, many dogmas or “best practices” might have been valid only in the past or only in some specific contexts, but they are, even today, considered the only correct way to test code. We think it is very important to challenge these dogmas and find new ways that are adapted to a given system, team and state-of-the-art tools. Examples of such dogmas include:

Building on the previous example of a dogma, here is a sample code snippet in C# using ASP.NET Core, the Spotflow In-Memory Azure Test SDK, and the FluentAssertions library. This code demonstrates how straightforward it can be to test a web API that depends on Azure Blob Storage.

First, let’s introduce the production code that bootstraps and starts the web application. The application accepts HTTP PUT requests on the /upload/{name} route:

static class ExampleApplication
{
	public static async Task<WebApplication> StartAsync(BlobContainerClient client)
	{
		var builder = WebApplication.CreateBuilder();
		builder.Services.AddSingleton(client);
		builder.Services.AddSingleton<UploadService>();
		var app = builder.Build();
		app.MapPut("/upload/{name}", HandleRequest);
		await app.StartAsync();
		return app;
	}

	static async Task<IResult> HandleRequest(string name, HttpContext context, UploadService service)
	{
		await service.UploadAsync(name, context.Request.Body);
		return Results.Ok();
	}
}

The application makes use of the UploadService class for uploading the incoming data to Azure Blob Storage. This service internally uses BlobContainerClient from the official Azure SDK to perform the actual infrastructure operations:

class UploadService(BlobContainerClient client)
{
	public Task UploadAsync(string name, Stream stream) => client.UploadBlobAsync(name, stream);
}

Finally, let’s see what the test code looks like:

[TestMethod]
public async Task Example_Service_Should_Upload_Blob()
{
	// Arrange: create an in-memory storage account, start the app, and prepare an HTTP client
	var storage = new InMemoryStorageProvider();
	var account = storage.AddAccount();
	var conn = account.CreateConnectionString();

	var containerClient = new InMemoryBlobContainerClient(conn, "data", storage);
	containerClient.CreateIfNotExists();

	using var app = await ExampleApplication.StartAsync(containerClient);
	using var httpClient = new HttpClient { BaseAddress = new Uri(app.Urls.First()) };

	// Act: send the HTTP request
	var data = BinaryData.FromString("test-content").ToStream();
	var content = new StreamContent(data);
	var response = await httpClient.PutAsync("/upload/test-name", content);

	// Assert: check the response code and the blobs existing in the storage
	response.Should().HaveStatusCode(HttpStatusCode.OK);
	containerClient.GetBlobs().Should().HaveCount(1);
	containerClient.GetBlobClient("test-name").Exists().Value.Should().BeTrue();
}

We believe that this is a great example of how easily, with modern tools, an application that requires two distinct pieces of infrastructure (Azure Storage and a web server) can be tested.

Prefer fakes over mocks when the need for a test double arises

This might be a more controversial point, but we firmly believe fakes are superior to mocks on many levels when test doubles are needed.

For further discussion, consider the following example: a class MessageProcessor that requires the AppendBlobClient class for appending data to an Azure Storage append blob. MessageProcessor internally calls the AppendBlock(Stream content) method on the AppendBlobClient. This client also has a DownloadContent method that allows reading all appended blocks at once in the form of a Stream.

// Simplified client; the methods are virtual so that test doubles can override
// them, mirroring the real Azure SDK clients.
class AppendBlobClient
{
	public virtual void AppendBlock(Stream content) { /* ... */ }
	public virtual Stream DownloadContent() => throw new NotImplementedException();
}

We want to test scenarios where one or more messages are sent to MessageProcessor, which should, in turn, append them to a blob so that all appended messages can be read at once.

With mocking, engineers typically craft mocks of dependencies of the code that is being tested and then assert that specific methods with specific parameters were called by the tested code, sometimes also in the specific order (so-called behavior verification). Such mocks are crafted for each specific test case. In the context of the example above, the engineer would create a mock object for the AppendBlobClient, for example with a library like Moq or NSubstitute in .NET, and then assert that the method AppendBlock was called with specific content.

With fakes, engineers implement a working version of the dependency that mimics the real behavior as closely as necessary but is as simplified as possible (e.g. storing data in memory only). This simplified version is then used in the tested code. To verify the correctness of the system under test, various properties of the fake are asserted after the test case is executed (so-called state verification). In the example above, the engineer could create a class that inherits AppendBlobClient, e.g., InMemoryAppendBlobClient : AppendBlobClient, provide a simple in-memory implementation of the blob and then assert that the blob has expected content by calling the DownloadContent method.

This might not seem like a big difference at first, but the implications are significant:

So far, we have considered fakes to be implemented in the same language and used in the same process as the tested code. However, fakes can also be implemented in a language-agnostic way, most frequently as a Docker container that can be run locally. The tested code simply connects to it via the local network as it would connect to the real dependency, e.g., over the Internet.

Testing with in-memory test doubles/fakes
Testing with fake implementation running as a Docker container on local network

Moreover, with fakes, chances are that fake implementations already exist for many dependencies. Some examples:

We recommend an excellent article by Martin Fowler on the topic for an in-depth discussion of the differences between fakes, mocks, and other kinds of test doubles, or more generally, between state verification and behavior verification.

Conclusion

When it comes to testing code that requires external infrastructure, there are two main approaches: abstracting access to the external infrastructure via patterns such as Gateway and Repository, or using real infrastructure during testing. Both approaches come with specific challenges, such as incidental code complexity when using abstractions or test setup complexity when using real infrastructure.

For most non-trivial systems, both of these approaches need to be combined to achieve the desired levels of code testability as well as test complexity, repeatability and speed. Although there is no simple recipe that anyone can use to determine the testing strategy that would fit any specific system, team, and available tooling, we provide a framework for everybody to make their own decisions, alongside a few widely applicable principles. These principles can be summarized in the following points:

The author wishes to thank Michal Zatloukal, Jakub Švehla, and David Nepožitek for their valuable insights that contributed to this article.
