Clean Architecture Applied – Review

During development, I already noticed some benefits and drawbacks.

Let’s start with the drawback: when I change a data structure, such as the Customer, there are suddenly several places where this needs to be changed, such as the tests and the repository. This isn’t unexpected, but I noticed a lot more changes than in other code. This may be because the experiment made me more aware of it, or because the architecture accentuates it. In hindsight it is not so much a disadvantage as an observation of how this architecture brings these things to light.

A general observation about Clean Architecture is that it is overkill for small projects. It offers a lot of benefits, but I need to put in a lot of effort to get them. Without all the interfaces and separate projects, I could be done much faster. Yet that would make the result a lot less extensible. The extensibility can be added afterwards, when I need it. For example, I could put an IWorkRepository interface in place only when I need to add additional data sources. It’s always a push and pull between YAGNI and too little flexibility, which makes changing the design later difficult. The added interfaces and extension points did make me think twice about how I would go about some parts of the code, which benefited the end result.

The biggest advantage of this architecture is that it’s easy to extend. That’s not a big surprise as this is one of the pillars of Clean Architecture. Splitting the single DinkToPdfDocumentGenerator into a FluidHtmlDocumentGenerator, which took over the internal HTML functions, and a thin PDF wrapper around it was effortless. There have been other places where this became apparent, but this was one of the most obvious.

An extension of the previous advantage is that the application is nicely compartmentalised. This allows me to focus on one problem at a time. It also forces me to keep my components small, as they only focus on one part of the system. Small components are easy to reason about: the logic fits not only on a screen, but in my mind as well. I understand the problem space a lot better when it fits in my head, and that helps me find a good solution for the problem at hand.

To be sure that components are truly separated, an inner circle component should never use the classes of the outer components. For example, in the TimeCampWorkRepository I have an object that allows me to deserialise the JSON structure I get back from the TimeCamp API. That object is not getting passed to the upper structure. I have a more generic object I map to so my inner components do not have to rely on the specific TimeCamp structure.

The WorkDay object is more than just a simplified view of the TimeCamp API. It’s a contract that describes what the component needs to do its work properly. If I use different repositories, for testing for example, all I have to do is make sure I return a good WorkDay object and my code will run.
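A minimal sketch of that mapping step could look like the code below. The TimeCampEntry class and its property names are placeholders I made up for illustration (the real TimeCamp response is not shown in this series), and WorkDay is a condensed copy of the class from the work repository post.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical DTO that mirrors the JSON of the external API.
// The property names are placeholders, not the real TimeCamp fields.
internal class TimeCampEntry
{
    public string Date { get; set; }
    public int DurationInSeconds { get; set; }
    public bool IsBillable { get; set; }
    public string TaskName { get; set; }
}

// Condensed version of the WorkDay contract the inner layers depend on.
public class WorkDay
{
    public WorkDay(string customerTag, DateTime day, int secondsWorked, bool billable)
    {
        CustomerTag = customerTag;
        Day = day;
        TimeWorked = TimeSpan.FromSeconds(secondsWorked);
        Billable = billable;
    }

    public string CustomerTag { get; }
    public DateTime Day { get; }
    public TimeSpan TimeWorked { get; }
    public bool Billable { get; }
}

internal static class WorkDayMapper
{
    // The DTO stops here: only WorkDay objects cross the boundary inwards.
    // customerTagFor is a hypothetical lookup from task name to customer tag.
    public static WorkDay[] ToWorkDays(IEnumerable<TimeCampEntry> entries,
                                       Func<string, string> customerTagFor)
    {
        return entries
            .Select(entry => new WorkDay(
                customerTagFor(entry.TaskName),
                DateTime.Parse(entry.Date, CultureInfo.InvariantCulture),
                entry.DurationInSeconds,
                entry.IsBillable))
            .ToArray();
    }
}
```

Because the mapper is the only place that knows both shapes, a change in the external JSON should only touch the DTO and this mapping, not the code that consumes WorkDay.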

Lastly, testing is not exclusive to Clean Architecture, but I can’t ignore its importance: tests really saved me a lot of time. While experimenting with alternative ways to solve a problem, the tests showed me very quickly whether I was going the right way. I use a mix of black box testing, where I compare the result of a generation with a predetermined document, and behavioural testing, to check that I call some external services correctly.

The end

Clean Architecture is nothing new in my opinion, but it is a good set of principles to build an application upon. Don’t expect any groundbreaking insights from the book, but a confirmation of what we, as a profession, are trying to work towards in terms of a good foundation. It offers a set of principles that outline how to structure an application so that it’s easy to build, maintain and extend.

I, for one, will be using this template as a starting point whenever I need to build a new application or when I need to work towards a better structure in legacy applications.

Clean Architecture Applied – Bringing it all together

Now that all individual components are ready, it’s time to bring it all together in a working application.

To get all these parts together, I add the last project: the console that will run the whole thing. It wouldn’t be complete without the popular syntax of invoice auto. The Command Line Utils package allows me to convert the invoice auto args from the Main method into commands that can be executed. This package has a bit of a learning curve, but offers a lot of options to customise the command line arguments. I’m not going to elaborate on that; I liked the package and I’ll let my readers figure out how to use it best.

Lastly, there’s the setup of the objects in the Main function. It’s basic injection of the services with a little composition for the pdfGenerator that takes the htmlGenerator as input.

// in my console application
var dateProvider = new SystemDateProvider();
var customerRepository = new StaticCustomerRepository();
var timeCampWorkRepository = new TimeCampWorkRepository(configuration, customerRepository);
var htmlGenerator = new FluidHtmlDocumentGenerator();
var pdfGenerator = new DinkToPdfDocumentGenerator(htmlGenerator);
var generator = new InvoiceGenerator(dateProvider,
                                     customerRepository,
                                     timeCampWorkRepository,
                                     pdfGenerator);

Structuring the code

The ICustomerRepository interface should be placed in the use case layer (orange) and the implementation should be placed in the infrastructure layer (blue). Clean Architecture suggests grouping things that change with the same frequency and for the same reason. The infrastructure layer normally changes a lot more than the use case layer. I put the WorkRepository and the StaticCustomerRepository in separate projects. They change for different reasons at different times, so I should be able to build and deploy them at different times.

The document generators (FluidHtmlDocumentGenerator and DinkToPdfDocumentGenerator) are in a different project together. They both change when I need to update the document generation. Maybe not at the exact same time, but definitely for the same reason.

Above all else, the code should be easily testable. That is why I used all the interfaces. It’s easy to switch out an actual component for a fake one. The fake implementation allows me to control certain data. For example: I have an IDateProvider interface so I can control when Now actually is. If I just used DateTime.Now, I could not simulate generating an invoice for a specific date. Now I have two implementations: the SystemDateProvider in my actual application and the FakeDateProvider for my tests.

// shared interface
public interface IDateProvider
{
  DateTime Now { get; }
}

// in my console application
internal class SystemDateProvider : IDateProvider
{
  public DateTime Now => DateTime.Now;
}

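// in my tests
internal class FakeDateProvider : IDateProvider
{
  public FakeDateProvider(string now)
  {
    // Fail fast: a silently ignored typo would leave Now at DateTime.MinValue
    if (!DateTime.TryParse(now, out var parsedNow))
    {
      throw new ArgumentException($"'{now}' is not a valid date.", nameof(now));
    }
    Now = parsedNow;
  }
  public DateTime Now { get; }
}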

The console project belongs in the infrastructure layer, together with the implementations of the interfaces. I put it in another layer in the image to indicate that the console brings all the other layers together.

The last blog in this series will talk about the pros and cons of this architecture.

Clean Architecture Applied – The customer repository

There are a lot of places where I need customer information. In the TimeCamp repository I need to link a work item to a customer. In the document generator, I need customer information such as name, address and VAT number.

Assigning each work item to a specific customer would take a lot of time, but I know that each item starts with one of two specific prefixes (WRK-001 or BUG-001 to indicate a story or a bug). My customer object will have a list of prefixes so I can identify which customer has to pay what amount.
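The prefix lookup described above can be sketched roughly as follows. PrefixedCustomer and CustomerMatcher are hypothetical names for this illustration, and the customer is trimmed down to the parts needed for matching; my actual Customer class carries more information.

```csharp
using System;
using System.Linq;

// Hypothetical, trimmed-down customer: only the parts needed for matching.
public class PrefixedCustomer
{
    public PrefixedCustomer(string tag, params string[] identifiers)
    {
        Tag = tag;
        Identifiers = identifiers;
    }

    public string Tag { get; }
    public string[] Identifiers { get; }
}

public static class CustomerMatcher
{
    // Returns the first customer that owns a prefix of the work item name,
    // or null when no customer matches.
    public static PrefixedCustomer FindFor(string workItemName, PrefixedCustomer[] customers)
    {
        return customers.FirstOrDefault(customer =>
            customer.Identifiers.Any(prefix =>
                workItemName.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)));
    }
}
```

Returning null for unknown prefixes is a choice made for this sketch; throwing an exception would be just as defensible if every work item is expected to belong to a customer.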

I don’t want to add this information to the configuration of the app. If I ever want to move this logic to, let’s say, an Azure function, I don’t want to duplicate the customer setup. Setting up a proper database is too much work for now (not to mention a little costly to store just a few records).

The easiest solution is to add a hard-coded class with customer information. To make sharing easier, I put the ICustomerRepository in a shared library and the implementation in another library. I created a fluent builder class so it’s easy to set up a new customer. I think it is a lot more descriptive than having a new Customer {Name = "My Customer NV"}. Opinions may vary on this topic.

public class StaticCustomerRepository : ICustomerRepository
{
    public Customer[] GetAll()
    {
        return new[]
        {
            MyCustomer,
        };
    }

    private static Customer MyCustomer => CustomerBuilder.NewCustomer
        .WithTagAndName("my-customer", "My Customer NV")
        .WithVatNumber("VAT1234567890")
        .ChargingDaily(100)
        .WithIdentifiers("WRK-", "BUG-")
        .AtAddress("Address", "Info")
        .Build();
}
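The builder itself is not shown in this post. A minimal sketch that would support the calls used above could look like this; the Customer properties, the decimal type for the daily rate and the validation in Build are assumptions for illustration.

```csharp
using System;

// Hypothetical shape of the customer; the real class may hold more data.
public class Customer
{
    public Customer(string tag, string name, string vatNumber, decimal dailyRate,
                    string[] identifiers, string[] addressLines)
    {
        Tag = tag;
        Name = name;
        VatNumber = vatNumber;
        DailyRate = dailyRate;
        Identifiers = identifiers;
        AddressLines = addressLines;
    }

    public string Tag { get; }
    public string Name { get; }
    public string VatNumber { get; }
    public decimal DailyRate { get; }
    public string[] Identifiers { get; }
    public string[] AddressLines { get; }
}

public class CustomerBuilder
{
    private string _tag;
    private string _name;
    private string _vatNumber;
    private decimal _dailyRate;
    private string[] _identifiers = new string[0];
    private string[] _addressLines = new string[0];

    // Every access starts a fresh builder, so customers never share state.
    public static CustomerBuilder NewCustomer => new CustomerBuilder();

    // Each step stores its value and returns the builder to allow chaining.
    public CustomerBuilder WithTagAndName(string tag, string name)
    {
        _tag = tag;
        _name = name;
        return this;
    }

    public CustomerBuilder WithVatNumber(string vatNumber)
    {
        _vatNumber = vatNumber;
        return this;
    }

    public CustomerBuilder ChargingDaily(decimal dailyRate)
    {
        _dailyRate = dailyRate;
        return this;
    }

    public CustomerBuilder WithIdentifiers(params string[] identifiers)
    {
        _identifiers = identifiers;
        return this;
    }

    public CustomerBuilder AtAddress(params string[] addressLines)
    {
        _addressLines = addressLines;
        return this;
    }

    public Customer Build()
    {
        // Assumed rule for this sketch: a customer without a tag cannot be linked to work items.
        if (string.IsNullOrWhiteSpace(_tag))
        {
            throw new InvalidOperationException("A customer needs a tag.");
        }
        return new Customer(_tag, _name, _vatNumber, _dailyRate, _identifiers, _addressLines);
    }
}
```

The builder is the only mutable piece; the Customer it produces has get-only properties, which fits the immutable objects I pass across boundaries.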

The shared library resides in my use case layer, where the other interfaces (IWorkRepository and IDocumentGenerator) live. I put this in a shared project so I could use it in another, separate project that generates timesheets for my customers and uses the same customer information. I won’t go into detail about that, as it would lead us too far off track. I hope it won’t be too hard to imagine how that would work, using the principles of Clean Architecture.

Now that all components are finished, I have to bring it all together in a working application.

Clean Architecture Applied – The document generator

Now that I have time information about my workdays, the next step is to generate the invoice. There I face another problem. I want to generate a PDF, but I need an easy format to test against. Let me first take inventory of what I have:

  • I have an interface IDocumentGenerator that takes the workdays information
  • I found the DinkToPdf library that converts HTML to PDF

I learned that when I generate the PDF with DinkToPdf, the PDF includes metadata such as the date and time it was created. This foils my plan to compare the created byte arrays. It’s a good thing that I have smart friends who reminded me of the decorator pattern and pointed out that HTML is a lot easier to check.

I use an HTML templating engine to generate the HTML that represents an invoice. I found Fluid to be the most versatile library to generate the HTML I need.

To create the PDF, all I have to do is create a class that takes the HTML document generator, gets its output and lets DinkToPdf do its work.

public interface IDocumentGenerator
{
    (string extension, byte[] document) Generate(Invoice invoice, Customer customer, BillableItem[] billableItems);
}

public class FluidHtmlDocumentGenerator : IDocumentGenerator
{
    public (string extension, byte[] document) Generate(Invoice invoice, Customer customer, BillableItem[] items)
    {
        using (var htmlStream = Assembly.GetExecutingAssembly().GetManifestResourceStream(TemplateLocation))
        using (var htmlReader = new StreamReader(htmlStream))
        {
            var htmlTemplate = htmlReader.ReadToEnd();
            FluidTemplate.TryParse(htmlTemplate, out var template);
            var invoiceDocument = GenerateHtml(customer, items, invoice, template);
            return ("html", invoiceDocument);
        }
    }
}

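public class DinkToPdfDocumentGenerator : IDocumentGenerator
{
    private readonly IDocumentGenerator _htmlGenerator;

    public DinkToPdfDocumentGenerator(IDocumentGenerator htmlGenerator)
    {
        _htmlGenerator = htmlGenerator;
    }

    public (string extension, byte[] document) Generate(Invoice invoice, Customer customer, BillableItem[] billableItems)
    {
        // Decorator: let the wrapped generator produce the HTML first
        var (_, html) = _htmlGenerator.Generate(invoice, customer, billableItems);
        // use DinkToPdf to generate PDF; ConvertToPdf stands in for that call and is not shown here
        var pdf = ConvertToPdf(html);
        return ("pdf", pdf);
    }
}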

When testing the code, I just use the FluidHtmlDocumentGenerator, convert the byte array into text and compare that to the output I expect. I have one test that generates a PDF so I can check that the DinkToPdfDocumentGenerator generates the correct output. All I do there is compare the length of the array returned. I can change the test slightly so it writes the output to a file. This allows me to see that I generate a PDF that looks like the rendered HTML. This is a manual process since I haven’t found a way to automate it, but that is just a small inconvenience as I verify that my layout is correct with the HTML tests.
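The comparison step in those HTML tests can be sketched as a small helper. HtmlDocumentAssert is a hypothetical name, and the sketch assumes the generator writes UTF-8 bytes; normalising line endings and surrounding whitespace is a choice made here so that the check does not depend on the platform.

```csharp
using System;
using System.Text;

public static class HtmlDocumentAssert
{
    // Assumption for this sketch: the generated document is UTF-8 encoded.
    public static string ToText(byte[] document) => Encoding.UTF8.GetString(document);

    // Compares the generated document with the expected HTML,
    // ignoring line ending style and leading or trailing whitespace.
    public static bool Matches(byte[] document, string expectedHtml) =>
        Normalize(ToText(document)) == Normalize(expectedHtml);

    private static string Normalize(string html) => html.Replace("\r\n", "\n").Trim();
}
```

In a test, the byte array returned by the HTML generator would be passed to Matches together with the content of a stored expected document.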

Clean Architecture talks about being independent of libraries. Hiding these libraries behind interfaces allows me to implement other classes with different libraries. That is how I can easily change HTML or PDF generation libraries without my InvoiceGenerator needing any change at all. It’s a very flexible and extendable structure.

For example, in most of my tests, my setup of my document generator looks like this: new FluidHtmlDocumentGenerator(). In my actual production code, this changes to: new DinkToPdfDocumentGenerator(new FluidHtmlDocumentGenerator()).

The next infrastructure part is the customer repository, which I will write about next week.

Clean Architecture Applied – The work repository

Now that I have the general steps to get to an invoice, I can start working on the individual components that reside in the infrastructure layer; starting with the work repository.

The domain logic would need time data, which reminds me of the Repository pattern. So I create an interface IWorkRepository that gets work days between two dates: Task<WorkDay[]> GetWorkDays(DateTime start, DateTime end).

[DebuggerDisplay("{DebuggerDisplay}")]
public class WorkDay
{
    public WorkDay(string customerTag, DateTime day, int secondsWorked, bool billable)
    {
         Day = day;
         TimeWorked = TimeSpan.FromSeconds(secondsWorked);
         Billable = billable;
         CustomerTag = customerTag;
    }
    public DateTime Day { get; }
    public TimeSpan TimeWorked { get; }
    public bool Billable { get; }
    public string CustomerTag { get; }
    private string DebuggerDisplay => $"{CustomerTag} - {Day:d}: {TimeWorked}h ";
}

The private DebuggerDisplay property is a small workaround: it lets me use string interpolation (especially handy for dates) to get readable debug information.

This would allow me to use a fake work repository when testing to start verifying other behaviour. However, I learned about Flurl, a nice library that allows me to build HTTP requests easily, but also allows me to use it for testing. The next thing I did was build the TimeCampWorkRepository, referencing the actual TimeCamp API URL. Flurl allows me to use the TimeCampWorkRepository in the tests so I can verify that the JSON I retrieve from the API is correctly deserialised and mapped to the WorkDay structure. I got that JSON from actually calling the API and saving the response. This allows me to verify my own code to the point right where the actual API call will happen and how the returned information is handled.
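For comparison, the fake work repository mentioned at the start of the previous paragraph could be as small as the sketch below. FakeWorkRepository is a hypothetical class that is not part of my solution, the inclusive date filter is an assumption, and WorkDay is repeated in condensed form so the example stands on its own.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Condensed copy of the WorkDay class shown earlier.
public class WorkDay
{
    public WorkDay(string customerTag, DateTime day, int secondsWorked, bool billable)
    {
        CustomerTag = customerTag;
        Day = day;
        TimeWorked = TimeSpan.FromSeconds(secondsWorked);
        Billable = billable;
    }

    public string CustomerTag { get; }
    public DateTime Day { get; }
    public TimeSpan TimeWorked { get; }
    public bool Billable { get; }
}

public interface IWorkRepository
{
    Task<WorkDay[]> GetWorkDays(DateTime start, DateTime end);
}

// Hypothetical in-memory fake: returns the canned work days inside the requested range.
public class FakeWorkRepository : IWorkRepository
{
    private readonly WorkDay[] _workDays;

    public FakeWorkRepository(params WorkDay[] workDays)
    {
        _workDays = workDays;
    }

    public Task<WorkDay[]> GetWorkDays(DateTime start, DateTime end)
    {
        // Assumption: both boundaries are inclusive and only the date part matters.
        var inRange = _workDays
            .Where(workDay => workDay.Day.Date >= start.Date && workDay.Day.Date <= end.Date)
            .ToArray();
        return Task.FromResult(inRange);
    }
}
```

A fake like this verifies everything behind the interface, but nothing about the JSON handling, which is why testing through Flurl against a saved response covers more of my own code.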

This also ties into the Clean Architecture principle that the database, or more general data store, should be pushed to the infrastructure layer. I put this code in its own project to clearly separate this concern. Putting an interface between the actual call to the TimeCamp API and the control flow of the invoice generator allows me to add other data sources if I should ever change the app I use to track my time.

Next up is the document generator.

Clean Architecture Applied – Applying the principles

Before I dive into the technical details, let me give you a little warning:

This solution is over-engineered. I could write this in a fairly straightforward way and be done in a few hours. The experiment here is to implement this app using the guidelines from Clean Architecture. So expect too many projects, a bit of plumbing code and a lot of interfaces.
Also, I cannot make the full code available as this code contains secrets (such as my customers). It would also allow anybody to generate invoices in my company's name. I will not be making this code easily available. Ever.
I thought about making an example repository to share with the world. I do not think I would give that repository the same attention I give this code, so it would not represent the quality that I put into my work.
If you or your company need help with improving code quality, I offer consultancy services and training. Head over to More Than Code for more information.

The problem domain is not a difficult problem, so let’s check out how I structured the code to get it all working.

The first step is to get the data from the TimeCamp API, which I need to transform into days worked so I can sum them and then put them into a format that I can save as a PDF-file.

I started by creating a solution with a project InvoiceGenerator.Core. I can also name it .Business, .Domain or .Rules, pick something that clearly communicates what is in the project. At the same time, I create an InvoiceGenerator.Tests project, because all important parts need to be tested well.

Because this is not a large project, I only create one test project. I believe in testing use cases, such as “generate HTML invoice from TimeCamp data”, that test a lot of my own code (I’ll get back to that in a minute). This allows me to change the inner workings of a module without losing any functionality. My tests will guard the functionality. That is why my tests will always be very specific. This test knows about the TimeCamp API and that the invoice will be in the HTML format.

What I mean by “my own code” is all code that I write. It encompasses everything from domain logic, internal plumbing and all code right up to the point it goes out of my hands, such as to a database, the network or the hard drive.

Domain and Use Case

Invoice generator

The first test then sounds easy: generate an HTML invoice from TimeCamp data. It all starts in the core business logic. For that I create a class InvoiceGenerator in the Core project. This will orchestrate the interactions between all the little components. It determines the month the invoice should be generated for (the previous one). Then it gets the data about the days I worked. Finally, it generates invoices for all customers I worked for.

public async Task<Invoice[]> Generate()
{
  var startOfPreviousMonth = _dateProvider.Now.StartOfPreviousMonth();
  var endOfPreviousMonth = startOfPreviousMonth.AddMonths(1).AddDays(-1);
  var workDays = await _workHoursRepository.GetWorkDays(startOfPreviousMonth, endOfPreviousMonth);
  var billableWorkDays = workDays.Where(day => day.Billable);
  var invoices = new List<Invoice>();
  var previousInvoiceFileName = _invoiceReader.GetPreviousInvoice();
  var description = _dateProvider.Now.ToString("MMMM").ToLower();
  foreach (var daysForCustomer in billableWorkDays.GroupBy(x => x.CustomerTag))
  {
    var customerTag = daysForCustomer.Key;
    var customer = GetCustomer(customerTag);
    var billableItems = GetBillableItems(customer, daysForCustomer, startOfPreviousMonth);
    var invoice = GeneratedInvoice(previousInvoiceFileName, customer, description, billableItems);
    previousInvoiceFileName = invoice.FileName;
    invoices.Add(invoice);
  }
  return invoices.ToArray();
}

The _dateProvider is there to make injecting a fake date a lot easier. It’s quite difficult, read impossible, to change what DateTime.Now returns. With the provider in place, testing different months becomes a lot easier.
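The StartOfPreviousMonth extension method used in Generate is not shown in this post. A minimal sketch of how it could be implemented, which is an assumption and not necessarily my actual code:

```csharp
using System;

public static class DateTimeExtensions
{
    // Hypothetical implementation: go to the first day of the current month,
    // then step one month back. AddMonths takes care of the year boundary.
    public static DateTime StartOfPreviousMonth(this DateTime date)
    {
        var startOfThisMonth = new DateTime(date.Year, date.Month, 1);
        return startOfThisMonth.AddMonths(-1);
    }
}
```

With a fake date of 15 March 2019, this gives 1 February 2019 as the start, and the AddMonths(1).AddDays(-1) calculation in Generate then gives 28 February 2019 as the end. Note that the end value is midnight at the start of that last day, so the repository has to treat the end date as inclusive.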

The previous invoice is used to calculate the new invoice number. This ensures that I can keep the numbers nice and sequential.
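How the next number follows from the previous file name is not shown here. One possible sketch, assuming the file name starts with the invoice number (four digits for the year followed by a four-digit sequence, as in the Number format of the Invoice class further down) and assuming the sequence restarts at 1 in a new year. InvoiceNumbering is a hypothetical helper.

```csharp
using System;
using System.Globalization;

public static class InvoiceNumbering
{
    // Assumes a file name such as "20190007-invoice-february.pdf".
    // Restarting at 1 when the year changes is an assumption of this sketch.
    public static int NextId(string previousInvoiceFileName, DateTime invoiceDate)
    {
        if (string.IsNullOrEmpty(previousInvoiceFileName) || previousInvoiceFileName.Length < 8)
        {
            return 1; // no usable previous invoice: start a new sequence
        }

        var previousYear = int.Parse(previousInvoiceFileName.Substring(0, 4), CultureInfo.InvariantCulture);
        var previousId = int.Parse(previousInvoiceFileName.Substring(4, 4), CultureInfo.InvariantCulture);
        return previousYear == invoiceDate.Year ? previousId + 1 : 1;
    }

    // Formats the number as year plus a zero-padded, four-digit sequence.
    public static string Format(DateTime invoiceDate, int id) =>
        string.Format(CultureInfo.InvariantCulture, "{0:yyyy}{1:0000}", invoiceDate, id);
}
```

Passing the previous file name into the loop, as Generate does, means each invoice in the same run gets the next number in the sequence.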

Adding immutable objects

Dealing with state is complex. Most bugs are introduced when unexpected data influences the flow of code. That is why I made most objects that pass through boundaries immutable. Objects from the InvoiceGenerator back to the console (which will become my UI, I will go into more detail in a future post on this aspect) or from the DocumentGenerator back to the InvoiceGenerator are being created by the originating code (InvoiceGenerator, DocumentGenerator).

public class Invoice
{
  private readonly string _description;
  private string _extension;
  private readonly int _id;
  private Invoice(int id, DateTime invoiceDate, string description, byte[] document)
  {
    _id = id;
    _description = description;
    InvoiceDate = invoiceDate;
    Document = document;
  }

  public byte[] Document { get; private set; }
  public DateTime InvoiceDate { get; }
  public DateTime ExpiryDate => InvoiceDate.AddMonths(1);
  public string Number => $"{InvoiceDate:yyyy}{_id:0000}";
  public string FileName => $"{Number}-invoice-{_description}.{_extension}";
  public static Invoice FromPrevious(DateTime invoiceDate, Customer customer, string description, byte[] document)
  {
    var id = GetId(invoiceDate);
    return new Invoice(id, invoiceDate, description, document);
  }
}

This way the document or invoice can’t be updated after it’s created. Not by my code or by any other code. I make it easy on myself by creating a static factory method that can take care of creating an object. I could just use the constructor, but I think the method is a bit cleaner to read.

The principles

The core domain consists of the general flow of the application. Getting workdays, grouping them per customer, transforming them to billable records and filling that data into invoice templates.

It then hands control to several plugins (such as an IWorkRepository or a DocumentGenerator). This is part of the layered approach. It also allows me to experiment with different ways of getting the workdays and different template engines to generate an invoice. It also means that I can easily test different components.

I had already thought about how I would tackle each issue, but let’s go through it step by step. Starting with the work repository.

Clean Architecture Applied – Intro

To make my life easier, I automated the generation of the invoices that I send as an independent contractor. This gave me the perfect excuse to build something that follows the rules set out by Uncle Bob’s Clean Architecture. At least how I interpret them. Use this blog post as a source of inspiration on how to do it… Or as a cautionary tale, however you see fit.

What is this Clean Architecture?

Let me start with defining what Clean Architecture is. It is a set of guidelines that will help me structure a system so it becomes a collection of loosely coupled components with the most stable parts, hopefully the domain, at its core. The further I move from the core, the more volatile the components become. Where volatile means “changing a lot” and not “behaving in unexpected ways”.

Just as the SOLID principles are there to provide guidelines to write maintainable code, so is Clean Architecture a set of guidelines to write easily maintainable and extendable software. As these are guidelines, I can follow or abandon them as I see fit. One of the founding rules for me, is that pragmatism takes priority over blindly following rules. The latter leads to religiously following a doctrine, which in itself is a dangerous practice. Always think for yourself and check that the pros outweigh the cons.

Source: http://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html

The image above resembles the Ports and Adapters and the Onion Architecture. That’s because Clean Architecture uses a lot of the same principles. It borrows heavily from Ports and Adapters for interoperability and how different components work together. It also uses the layered approach that can be found in the Onion Architecture. I think it kind of uses the best of those two systems and sprinkles a bit of best practices and pragmatism on top.

For the sake of the experiment, let’s apply these guidelines. I will not go further into the specifics of Clean Architecture. Those can be found in the Clean Architecture book and the many articles that have already been written.

The problem details

Before I dive into the code, I will describe the problem I’m trying to solve. I want to automate making invoices, so that every month I have to do minimal work to get an invoice to send to my customers.

The data comes from my time keeping app TimeCamp. They have a very nice service which includes an API so I can easily access my time data. From that data I can calculate how many days I’ve worked for which customer.

With that information, I can then generate an invoice in PDF format with all information about my company, my client and the invoice lines. This way I don’t need to spend half an hour to an hour each month filling out an Excel form to get an invoice. This might sound like a trivial time saving, but the experience from building this is equally valuable.

I encourage everybody reading this to experiment in their own time with a personal project and use techniques or technology you’re not familiar with. They’re awesome learning experiences and can be great poster projects when you want to advance your career.

Stay tuned because next week I will start detailing the architecture and the code of the application.