How tiny.cloud uses free monads for testing

What does the programmer handbook say to do when you have to deal with file systems that present too many unavoidable problems? Well, it says Abstractions... that all-purpose hammer in the toolkit. We can abstract away the file system. Yes. That will solve everything.

At Tiny, we had a file system problem to solve. The problem was a program that interacted with a file system, which then ran into many potential problems (read and or write permissions, slow access, hard to test).

To go about abstracting and solving a file system problem, we chose 'Free Monads'. We needed a script that interacted with a file system, and we needed a way to test the solution. But why Free Monads? Primarily because we’d used them elsewhere, and wanted to build on our experience.

Were Free Monads the best abstraction tool for the job at Tiny? Unsure, but we’re happy with the results. Our program was written in Haskell, so we were bound to find good supporting libraries for these concepts.

The steps that we needed to handle involved:

Copy a file to somewhere
Unzip that file from its new location
Read a file from the unzipped contents (e.g. version.txt)
Use that file’s contents to determine which directory to create
Copy the "app.js" file from the unzipped contents into the new directory
Upload that directory to S3

Monads presented the best Tool for solving the problem.

Miniature screw driver

Before we reached the elegance of Free Monads, we first tried some other approaches. So let’s step through the hard-fought progression from hard-coded File Systems to the flexible Free Monads.

What are Free Monads?

The word comes from the Greek word 'monas', which means a 'unit', or 'alone'. It evolved into a term used in mathematics and computer science, specifically category theory. And it’s complicated. Think of a monad as a way to separate complexity and difficulty from a program, and make the application easier to work with.

A Free Monad allows you to sequence an operation 'functor' as though it were a monad. This means you can control the operation order. You can pass to subsequent operations any return values from previous operations. Haskell has a convenient 'do notation', which provides the syntactic sugar needed to make the whole sequence and operation passing happen.

Now back to the story.

Working with a separate interpreter

The first solution we tried was a separate interpreter that was context-free and provided an implementation for main operations. We defined the main operations as data types.

By having the separate interpreter, it meant that testing our file system abstraction was easier.

We could make the interpreter much simpler compared to the real interpreter, and check the right operations were generated without worrying whether the operations were implemented correctly. We wrote and ran tests to make sure that the production interpreter could understand particular operations.

This was the original problem:

Copy a file to somewhere
Unzip that file from its new location
Read a file from the unzipped contents (e.g. version.txt)
Use that file’s contents to determine which directory to create
Copy the "app.js" file from the unzipped contents into the new directory
Upload that directory to S3

And from these steps we built a datatype and named it 'FileOperation':

-- Assumes types LocalFile and S3Location exist

data FileOperation = CopyFile LocalFile LocalFile |
  Unzip LocalFile | 
  ReadFile LocalFile |
  MkDirectory LocalFile | 
  UploadDirectory LocalFile S3Location

And then we wrote a script as an operation series:

</> :: LocalFile -> String -> LocalFile

program :: LocalFile -> [FileOperation]
program cwd =
  let sourceFile = cwd </> "my-application.zip"
      destinationFile = cwd </> "dist" </> "my-application.zip"
  in [ CopyFile sourceFile destinationFile
     , Unzip destinationFile
     , ReadFile (cwd </> "dist" </> "version.txt")
     , MkDirectory (???) 
     , CopyFile (cwd </> "dist" </> "app.js") (??? </> "app.js")
     , UploadDirectory (???) s3://bucket-for-uploads
     ]

With the solution, we then add an interpreter to provide implementations for these actions:

interpretAll :: [FileOperation] -> IO ()
interpretAll operations = 
  traverse_ (\op -> interpret) operations

interpret :: FileOperation -> IO ()
interpret (CopyFile source dest) = ... implementation
interpret (Unzip zipFile) = ...

So we could run interpretAll against the result of the program (LocalFile "src"). And then we could provide a different interpreter for our tests.

That interpreter might just stringify the FileOperation, and we could test if the strings are what we expect. But you might notice, there were some '???' values in there. That’s because of the way our interpreter was set up. We didn’t actually retrieve any values from any operations and pass them to the next operation.

In our program, we needed to read a file, and use the file contents to determine the next action. However, interpreting the list does not have this ability. We found one workaround but it was an awkward solution that didn’t scale, and the solution stayed the way it was, with a separate interpreter for a long time. That was, until we introduced Free Monads.

Using Free Monads meant we could push the actual implementation of the file system interaction to the edge of our program, and mostly ignore it when writing our solution’s logic. We could also test whether individual operations were being executed correctly by the real interpreter. But most importantly, we could use the results of the individual operations as input for the next operation (this way there are no clumsy solutions, and it scales).

How we introduced Free Monads

Setting up a Free Monad based solution to handle passing operations could look something like this:

program :: LocalFile -> FreeFileOperation ()
program =
  let sourceFile = cwd </> "my-application.zip"
      destinationFile = cwd </> "dist" </> "my-application.zip"
  in do
        O.copyFile sourceFile destinationFile
        O.unzip destinationFile
        versionAsText <- O.readFile (cwd </> "dist" </> "version.txt")
        O.mkDirectory (cwd / "dist" </> versionAsText)
        O.uploadDirectory (cwd / "dist" </> versionAsText) (S3Location "s3://bucket")

What we noticed when we set this up was:

We didn’t need any clumsy workarounds to gain access to the previous results of the interpretation
After adjusting to Haskell’s "Do Notation", the content becomes more readable
It’s more clear how the operations interact
The program has no IO
There’s no longer an array of operations returned – just one operation returned, which is called 'FreeFileOperation'

Next, we added an interpreter:

interpretAll :: FreeFileOperation a -> IO a
interpretAll = ....

Similar to our earlier solution with a separate interpreter, to run a unit test we could have added a different interpreter, which renders basic strings and makes the "ReadFile" variable always return "0.0.0-0".

The benefit of these operations is that they are now composable. That is, we can write scripts that combine and test each of the operations, as well as write logic that processes the returned values, and then test that logic.

Suddenly, the file system is a replaceable part of the program, which expands testing opportunities. It can even work for different types of File Systems.

For example, we might make a function that:

Takes a directory containing a version.txt file
Makes a new directory with that version
Copying a specified file into it
Uploading it to a specified location

uploadVersion :: LocalFile -> String -> S3Location -> FreeFileOperation ()
uploadVersion directory appFilename s3Loc = do
  versionAsText <- O.readFile (directory </> "version.txt")
  O.mkDirectory (directory </> versionAsText)
  O.uploadDirectory (directory </> versionAsText) s3Loc

If you were to implement this, you wouldn’t need to know the implementation details of the interpreter because only the outer level of the program handles the return type of "interpretAll" – your solution can be free from knowing the interpreter type.

We found that the application stack only needs to be known to the interpreter with this solution. The operation logic is not aware of the application stack, so the application stack could be something simple, like 'IO()'. It could also be something more complex, for instance: MonadIO m, MonadReader r m, HasAppConfig r, MonadError e m, or AsAppError e.

We could have changed our application stack, and the vast majority of the solution code wouldn’t change.

We could add on 'State Monads' or 'Reader Monads' and completely change the error handling and hierarchies. This makes it easy to have a simple interpreter when writing tests.

However, this would all count for nothing if the Free Monad implementation – the 'algebra' – becomes too onerous.

Setting up Free Monad Actions

Start by importing the free monad machinery:

-- Import the Free Monad machinery
import           Control.Monad.Trans.Free       ( Free
                                                , liftF
                                                )

And:

-- Define your operations
data FileOperationF a =
  CopyFile LocalFile LocalFile a |
  Unzip LocalFile a | 
  ReadFile LocalFile (Text -> a) |
  MkDirectory LocalFile a | 
  UploadDirectory LocalFile S3Location a

These definitions are similar to the earlier Haskell solutions, except for the 'a' type parameter. What this change represents is how the next operations interact with this current operation. For operations that return values, instead of using ‘a’ by itself, the type is ‘ReturnType -> a’, because it will provide that ‘ReturnType’ to the Free Monad machinery before going to the next operation.

-- Write Functor instances which apply f to the next value.
instance Functor FileOperationF where
  fmap f op = case op of
    CopyFile s d a -> CopyFile s d (f a)
    Unzip s a -> Unzip s (f a)
    ReadFile s tToA -> ReadFile s (f . tToA)
    MkDirectory s a -> MkDirectory s (f a)
    UploadDirectory s s3 a -> UploadDirectory s s3 (f a)

The Free Monad actions can turn any "Functor" into a "Monad", and so the operation type in the program must be a functor. We applied the mapping function to the 'a'. If we had an operation that returned a value, we would instead apply the mapping function to the function that returns 'a'. To construct the Free type:

-- Construct the Free type
type FreeFileOperation a = Free FileOperation a

To define operation constructors:

copyFile :: LocalFile -> LocalFile -> FreeFileOperation ()
copyFile source dest = liftF (CopyFile source dest ())

-- And an example of ReadFile, which is different because it returns a value.
readFile :: LocalFile -> FreeFileOperation Text
readFile source = liftF (ReadFile source id)

What’s happening here is that to finish setting up the Free Monad, the 'liftF' function lifts the functor into a “Free”. We passed 'id' in as its 'Text -> a' argument because 'ReadFile’s' second argument was a function that was a 'Text -> a', and because we wanted the 'readFile' operation to retrieve a 'Text'. This demonstrates the basic setup of Free Monad based operations to solve the file system problem. The only missing piece of the solution was the interpreter.

Setting up a Free Monad interpreter

For the interpreter, the following is an 'IO' interpreter for simplicity:

-- Assuming a whole lot of shell functions are available from Tu, which is a shortened namespace for `turtle`, a common shell library in haskell.

-- CHECK: iterM vs foldFree and FileOperationF a -> IO a vs FileOperationF (IO a) -> IO a

interpret :: FileOperationF (IO ()) -> IO ()
interpret operation = case operation of
  CopyFile source dest rest -> do
    -- copy the file
    Tu.cp source dest
    -- execute the next action
    rest

  ...

  ReadFile source textToRest -> do
    text <- Tu.readText source
    textToRest text


-- And the overall runner using `iterM` to run the program
-- If your stack MonadError in it, you would use `runExceptT` on `iterM's` output and process
-- the possible error in some way.
runProgram :: FreeFileOperation () -> IO ()
runProgram program = iterM interpret program

When writing tests, we can switch out to a different interpreter if we need to. In our unit tests, we used an interpreter that can have completely different error handling to our real application ( it operates in `ExceptT FakeInterpreterError (WriterT OperationLog IO) a)`. Using the Free Monad approach, you can push the actual file system interaction parts to the edge, and ignore it for most of your program’s logic. And you can check your individual operations are executed correctly by your real interpreter.

Coproducts and combined algebras

So if you’re enjoying Free Monads and want to use them in another project what do you do about adding in a “WriteFileContents” action?

One option available to you is to extract all this into a shared library, and add “WriteFileContents” into your operation algebra. It works, but with a cost. The first project now needs to define an interpretation for an action it doesn’t even use. “WriteFileContents” becomes a valid operation – part of an implementation – even though the project has no use for “WriteFileContents”. The type system cannot tell that the operation is not valid for the project.

Your options are: make the operation do nothing, throw an error, or something else. Another problem is you’re losing type safety (this is important to avoid). However, you can use Coproducts and combined algebras to solve the problem.

Ideally, you need an algebra with all original operations, with some others, combined into one algebra that works with the same machinery. As an example of Coproducts, a Scala project our team worked with used Coproducts to represent a set of possible database operations.

Some projects should only have read operations, some might have write operations as well, and some have very specific operations that you don't want to be generally available. So using Coproducts, we combined the algebras together so that each project has the actions and only the actions that it should have available. Trying to define an unsupported sequence of operations results in a compile time error. This helps make sure your programs are valid through the type system.

What we concluded

Using Free Monads to improve our file system problem had a number of benefits. It improved our testing, and the flexibility of our code. Writing the solution in Haskell is actually rather straightforward, and improved our testing capacity. If your goal is developing with languages other than Haskell, you can find out more on Free Monads with Scala and TypeScript.

How tiny.cloud uses free monads for testing

What are Free Monads?

Working with a separate interpreter

How we introduced Free Monads

Setting up Free Monad Actions

Setting up a Free Monad interpreter

Coproducts and combined algebras

What we concluded

Related Articles

Text editor developers 2023: Survey says RTEs are core component of everything

Looking for a rich text editor? Start with our Rich Text Editor 2023 Survey

Working with rich text editors in 2023: Industry overview and challenges

Products

Developers

Company

Pricing

Products

Pricing

Developers

Company

Products

Pricing

Company

Developers