Clean Architecture Applied – Intro

To make my life easier, I automated the generation of the invoices that I send as an independent contractor. This gave me the perfect excuse to build something that follows the rules set out by Uncle Bob's Clean Architecture. At least, as I interpret them. Use this blog post as a source of inspiration on how to do it… or as a cautionary tale, whichever you see fit.

What is this Clean Architecture?

Let me start by defining what Clean Architecture is. It is a set of guidelines that helps me structure a system so it becomes a collection of loosely coupled components with the most stable parts, hopefully the domain, at its core. The further I move from the core, the more volatile the components become. Here, volatile means “changing a lot”, not “behaving in unexpected ways”.

Just as the SOLID principles provide guidelines to write maintainable code, Clean Architecture is a set of guidelines to write easily maintainable and extendable software. As these are guidelines, I can follow or abandon them as I see fit. One of the guiding rules for me is that pragmatism takes priority over blindly following rules. The latter leads to religiously following a doctrine, which in itself is a dangerous practice. Always think for yourself and check that the pros outweigh the cons.

[Image: the Clean Architecture circles diagram. Source: http://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html]

The image above resembles both the Ports and Adapters architecture and the Onion Architecture. That’s because Clean Architecture uses a lot of the same principles. It borrows heavily from Ports and Adapters for interoperability and for how different components work together. It also uses the layered approach that can be found in the Onion Architecture. I think it takes the best of those two approaches and sprinkles a bit of best practices and pragmatism on top.

For the sake of the experiment, let's apply these guidelines. I will not go further into the specifics of Clean Architecture. Those can be found in the Clean Architecture book and the many articles that have already been written.

The problem details

Before I dive into the code, I will describe the problem I’m trying to solve. I want to automate making invoices, so that every month I have to do minimal work to get an invoice to send to my customers.

The data comes from my time keeping app TimeCamp. They have a very nice service which includes an API so I can easily access my time data. From that data I can calculate how many days I’ve worked for which customer.

With that information, I can then generate an invoice in PDF format with all the information about my company, my client and the invoice lines. This way I don’t need to spend half an hour to an hour each month filling out an Excel form to get an invoice. This might sound like a trivial amount of time saved, but the experience from building this is equally valuable.

I encourage everybody reading this to experiment in their own time with a personal project and use techniques or technology you’re not familiar with. They’re awesome learning experiences and can be great poster projects when you want to advance your career.

Stay tuned because next week I will start detailing the architecture and the code of the application.


Different testing approaches

A while ago I attended a lecture by a programmer who explained the differences between the London and Detroit (also called Chicago) schools of test-driven development. I found the distinction he drew kind of pointless.

The London school focuses on mock-driven testing: dummy/stub/spy/mock out all the dependencies and injected components. It focuses on the interaction with other components. The Detroit school concentrates on the state of the system, which allows me to verify that the system correctly updates the state of the application.

public class MyClass
{
  private readonly IRepository _repository;
  private readonly ICalculator _calculator;

  public MyClass(IRepository repository, ICalculator calculator)
  {
    // save repository and calculator in fields
    _repository = repository;
    _calculator = calculator;
  }

  public void DoSomething(int id, int x, int y)
  {
    // calculate the sum and persist it
    var sum = _calculator.Sum(x, y);
    _repository.Update(id, sum);
  }
}

public class MyClassLondonTests
{
  [Test]
  public void DoSomethingTest()
  {
    var mockRepo = new Mock<IRepository>();
    var mockCalc = new Mock<ICalculator>();
    mockCalc.Setup(x => x.Sum(2, 3)).Returns(5);
    new MyClass(mockRepo.Object, mockCalc.Object).DoSomething(1, 2, 3);
    mockRepo.Verify(x => x.Update(1, 5), Times.Once());
    mockCalc.Verify(x => x.Sum(2, 3), Times.Once());
  }
}

public class MyClassDetroitTests
{
  [Test]
  public void DoSomethingTest()
  {
    // a fake in-memory repository implementing the IRepository interface
    var fakeRepo = new InMemoryRepository();
    // the real calculator implementation
    var calc = new Calculator();
    new MyClass(fakeRepo, calc).DoSomething(1, 2, 3);
    Assert.AreEqual(5, fakeRepo.Calculation.First(x => x.Id == 1).Value);
  }
}
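For completeness, here is a minimal sketch of what the InMemoryRepository fake used above could look like. Only the Calculation collection and the Update signature are implied by the test code; the CalculationRecord type and its shape are my own assumption.

```csharp
using System.Collections.Generic;

// A hand-rolled fake: it implements IRepository by recording updates
// in memory, so a state-based test can assert on them directly.
public class InMemoryRepository : IRepository
{
  public List<CalculationRecord> Calculation { get; } = new List<CalculationRecord>();

  public void Update(int id, int value)
  {
    Calculation.Add(new CalculationRecord { Id = id, Value = value });
  }
}

// Hypothetical record type holding what was "saved".
public class CalculationRecord
{
  public int Id { get; set; }
  public int Value { get; set; }
}
```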

This example shows what I understand the London and Detroit schools to represent. And I think it’s counterproductive to say one is better than the other. I think we should throw the schools out the window.

When I build a system, I don’t care that my system calls the Update function or the Sum function, in what order, or how many times. I want to know that my use case does what it needs to do. In this case, and I think in most cases, checking the state of my InMemoryRepository fake does that very easily.

The London school would just slow me down with unnecessary setup and with verifying that I called all the correct functions with the correct inputs. It has an additional downside that should not be underestimated: it hampers refactoring. When I move the summation logic into MyClass itself, I would need to update a lot of the London-school test setup, while with the Detroit setup I would just need to remove the creation of the Calculator and its injection into the MyClass constructor.
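To illustrate, here is a hypothetical sketch of that refactoring, with the summation moved into MyClass:

```csharp
// Hypothetical refactor: the summation logic now lives inside MyClass,
// so ICalculator is no longer injected at all.
public class MyClass
{
  private readonly IRepository _repository;

  public MyClass(IRepository repository)
  {
    _repository = repository;
  }

  public void DoSomething(int id, int x, int y)
  {
    var sum = x + y; // was _calculator.Sum(x, y)
    _repository.Update(id, sum);
  }
}
```

The Detroit test survives almost unchanged: drop the Calculator creation and the extra constructor argument, and the assertion on the fake repository still passes. The London test, on the other hand, loses its mockCalc setup and the Sum verification entirely.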

The Detroit school has the additional advantage that, because I use the real Calculator class, I also test that this class works in this use-case scenario. While there are programmers who like a unit test to test just the code in MyClass and use an integration test to verify that the Calculator class works with the MyClass code, I find that too fine-grained an approach to testing.

While I sometimes write tests to verify specific parts of the code, I remove those later when the unit test, which tests the whole scenario, covers all the code in this scenario. In those small tests, I can make private functions public while I test them. When they work how I like, I make them private again and remove the temporary test.

Just to be clear, a scenario is a specific path through the code. So I can have a scenario “save sum to database” and another “database unavailable” to verify that the code keeps working even if something goes wrong.
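A sketch of how a fake could drive that “database unavailable” scenario. The exception type, class name and test name are my own choices; with MyClass as written above, the repository failure simply propagates, which is what the test pins down.

```csharp
// A fake that simulates the "database unavailable" scenario by always failing.
public class UnavailableRepository : IRepository
{
  public void Update(int id, int value)
  {
    throw new InvalidOperationException("database unavailable");
  }
}

public class MyClassScenarioTests
{
  [Test]
  public void DoSomething_DatabaseUnavailable()
  {
    var myClass = new MyClass(new UnavailableRepository(), new Calculator());

    // Assert whatever "keeps working" means for the system: here the
    // failure surfaces as an exception the caller can handle.
    Assert.Throws<InvalidOperationException>(() => myClass.DoSomething(1, 2, 3));
  }
}
```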

The previous parts of this post described that I generally dislike using mocks and advocate using all the real parts. But that isn’t always convenient in tests. For example, getting a database into the correct state, or simulating that a web-service call failure is handled correctly, can be time-consuming. For those parts, creating a mock/stub/fake/etc. can be very time-saving and easy to set up.

My general rule is: use directly all code that can run in-memory (think my own code, colleagues’ code, libraries that are used, etc.), while code that requires access to external resources (think databases, the internet, web services, operating-system information or hard-drive access) should be replaced with a mock or a fake. I verify the result against whatever is most convenient.

Recently, I created mock-ups of the HTML I wanted to generate, then generated the HTML using a fake data source, while keeping all the code I wrote and the library that generates the HTML from a template. Finally, I compared the output with my mock-up. Afterwards, I refactored a lot of the internals of the code to be more efficient, and I only needed to edit the test once or twice, because I updated either the constructor or the signature of the method.

In this recent example I used both techniques. That’s the point I’m trying to make. I don’t care which school of testing I use, I care about working code that is easy to understand. It’s not London or Detroit style of test-driven development; both techniques are valuable. When calculating or generating things, Detroit-style state checking works very well. London-style verification is great for checking that external resources are called with the correct parameters.

So forget the terms that divide us and let’s find joy in creating working software with all the tools at our disposal.

P.S. I’m opening up the comment section as an experiment. Let’s see if somebody has anything to say.

Static should be used sparingly

New year, new blog posts! Let’s start with a problem from work. Let me present you with the problem code I had to analyse.

Once or twice a month, the name “Ken” got written to the database while another name was written to the log. This is a reproduction of the ASP.NET web call; the actual code was a lot more complex.

private static string Name;
public static void Execute() 
{
  Name = Cache.GetNameFromSession();
  var description = $"I'm using the name {Name}";
  SaveToDatabase(description);
  Logger.Info($"Saving Name {Name} to the database");
}

If you figured out what went wrong, congratulations. It took me an embarrassingly long time to spot the static in the field declaration. After I noticed it, I recognised the timing problem that can occur.

Due to the static nature of the field, it is shared between multiple requests. The first request sets the Name to “Ken”, which creates the description “I’m using the name Ken” and saves that to the database. By the time it’s time to log the name, a second request has updated it to “Sophie”, which is what will get logged.

The obvious solution would be to make the field non-static so it’s bound to an instance; each request would then have its own copy. However, the method itself is static and I couldn’t change that, so the field can’t become an instance field. The next best solution is to call the cache where the value is needed and store the result in a local variable. A local variable is not shared between requests, so the problem doesn’t occur anymore.

public static void Execute() 
{
  var name = Cache.GetNameFromSession();
  var description = $"I'm using the name {name}";
  SaveToDatabase(description);
  Logger.Info($"Saving Name {name} to the database");
}

Let this be a lesson for all of us: use static sparingly. It’s great for a factory method such as Person.Create(name, age), but it’s the cause of a lot of subtle bugs when mutable state is shared in concurrent environments such as a web server.
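For contrast, here is a sketch of the benign kind of static: a factory method that touches no shared state. The validation rule is just an example.

```csharp
using System;

public class Person
{
  public string Name { get; }
  public int Age { get; }

  private Person(string name, int age)
  {
    Name = name;
    Age = age;
  }

  // Static is fine here: no shared mutable state, just a convenient
  // entry point that can validate its inputs before construction.
  public static Person Create(string name, int age)
  {
    if (string.IsNullOrWhiteSpace(name))
      throw new ArgumentException("A person needs a name.", nameof(name));
    return new Person(name, age);
  }
}
```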