F# Pattern Matching – Part 2 – Active Patterns

A little while ago I dipped my toe into pattern matching and recursion with both F# and Elixir. I’d like to take a deeper dive into pattern matching in F# with a two-part series.

F# Pattern Matching – Part 1 – Introduction
F# Pattern Matching – Part 2 – Active Patterns (this post)

In part 1 of this series, we were introduced to pattern matching basics in F#. We started with very simple examples (standard let bindings) and worked our way to more complicated examples with record types and discriminated unions. We also looked at the useful match expression construct.

In part 2, we will continuing our exploration of pattern matching via active patterns. In addition, I will sprinkle in a few useful tidbits that we didn’t have a chance to examine in part 1.

Active Patterns

Just as standard pattern matching in F# has several types of patterns available there are quite a few variations of active patterns at our disposal. This myriad of patterns can empower us to create concise code that more closely models our business domain.

We will explore each of these active patterns in this post.

Definition

Let’s begin with the definition of active patterns, according to MSDN:

Active patterns enable you to define named partitions that subdivide input data, so that you can use these names in a pattern matching expression just as you would for a discriminated union. You can use active patterns to decompose data in a customized manner for each partition.

So what does this mean?

My simplified explanation is that you can use active patterns to match an abstraction, rather than the specific underlying data. You can hide away some complicated logic that represents the pattern you are trying to match and give it an appropriate name. This sounds a lot like a function doesn’t it? However, the compiler will not allow functions as a pattern in a match statement. Active patterns give you a way to do that.

Let’s dive into some examples to illustrate this point.

Single-Case

I’ve seen this active pattern referred to as single-case, single-complete and single-total. I prefer single-case myself, but if you see any of those terms when reading about active patterns, this is the one being referenced.

A single-case active pattern can be thought of as a pattern match compliant function that represents a single-case discriminator (think of it as a single-case discriminated union that contains a function body). Let’s do an example.

First I will show you a function being used in a match expression.

Here you can see that we are trying to do a match expression on a pattern that is a function (an abstraction that counts the number of words in a string). Now we could have just as easily put the code from the function body directly in the match expression like so:

match document.Split([|' '|]).Length with

However, the function models our domain better (i.e. it is more self documenting as to the intention of the match statement). Also, we may have more places in our code that we need to use the countWords function. Lastly, this is a contrived example where were are simply using a one line function body.

Unfortunately, line 6 (just the first illegal line of many that the compiler hits) will give you the following compiler error:

The pattern discriminator ‘countWords’ is not defined

Bummer. So how can we get around this limitation? Active patterns to the rescue!

This looks very similar to the previous example. The only difference is that we put (| and |) (called banana clips) around the countWords function identifier (and we also changed the name to CountWords, to follow standard naming conventions). What this does is transform the identifier into an active recognizer.

Let’s examine the match expression a little further. The expression starts with match document with. This is passing the value in document to the list of patterns. The first pattern makes a call to the CountWords active recognizer, passing the value from document as a parameter. The expression is then evaluated and returned to be matched against the rest of the pattern (in this case, the integer 5). This repeats for each pattern until a match is found. You’ll notice we are doing three types of patterns:

Matching against the literal value 5.
Capturing the result of CountWords into wc and comparing it to a when clause.
Capturing the result of CountWords into wc and treating it like a wildcard.

Number 3 is required to ensure we have a complete match expression, otherwise our result could fall through the bottom (the compiler will warn us too). Remember from part 1 in this series, we could have used the underscore character if we didn’t need to use the value on the right hand side of the -> (e.g. | Countwords _ -> printfn "too many words to count").

We can also combine multiple single-case active patterns to create something like a validation chain.

Here we’ve created two single-case active patterns checking the character count and whether or not we have at least one numeric character. Next, we’ve combined those together in a match expression for IsValidPassword, which is itself a single-case active pattern. These active patterns combine to form a (poor) password validation chain that returns a tuple of the validation result and a failure reason (note: check out this excellent post on a much better approach for success/failure strategies).

A quick digression here. Note the tidbit I snuck in with the IsValidPassword active pattern. The match expression (match ... with) has been replaced by the function keyword. Also, the parameter being passed into IsValidPassword is not there. This is a short-hand for any active pattern (or function for that matter) that simply does a pattern match on its parameter (or specifically, its last parameter). It also comes in handy for inline pattern matching in lambda expressions. I will not show that here, but take a look at this stackoverflow question for some good explanations.

So with these single-case active patterns in place, we could create a function such as this:

Our setPassword function calls the IsValidPassword active recognizer, which in turn calls each active recognizer contained inside the IsValidPassword until a match is found. The result returned is a tuple that is then matched for (true, _) or (false, failureReason). This will then result in either the password being set or an exception being thrown (see the two examples at the bottom of the code sample).

While, we’re still on this example, we’re going to look at a few different ways to use IsValidPassword.

Here we’re actually using the active recognizer directly in the parameter of setPassword. This is nice as it simplifies the match expression in the function. The end result is the same (although I changed the bad password to trigger the other failure message this time).

Here is one final thing I will show you before we move on to the next type of active pattern.

From this example you can see that a let expression is just a pattern match. This allows us to use the IsValidPassword directly, without needing a separate function. Very cool!

By the way, those tick marks for result' and message' on the goodPassword let expression are just because we already have result and message in the line of code above. We’re just avoiding identifier name collisions. The ticks are nothing special.

By now you’re probably in single-case active pattern overload. So let’s move on to the next type of active pattern.

Parameterized Single-Case

As the named suggests, you can create a single-case active pattern that contains extra parameters. Let’s jump right to an example.

There is not much of a difference from a standard single-case active pattern here other than passing a couple of extra parameters to the active recognizer GreaterThanThreshold. The key thing to note is that the value from the match .. with (in this case the text identifier) expression is passed in as the last value to the active recognizer.

Remember too, that we could have done the following:

In the profanityChecker function, we’ve used the function keyword shortcut in place of explicitly declaring the text parameter and then using it in the match text with expression.

There’s not much more to say about parameterized single-case active patterns. As I said, they are exactly the same as normal single-case active patterns except for allowing extra parameters.

Multiple-Case

I’ve heard this active pattern referred to as multiple-case and multiple-total. This active pattern is a nice way to classify some value and optionally decompose it. Let’s do a couple of examples. We’ll start with a very simple one that is sort of the canonical multiple-case active pattern.

The notable difference from the previous examples is that there are two active recognizers in the same definition, (|Even|Odd|), separated by a | (pipe) character. In a way, this resembles a discriminated union. Conveniently enough, the function describeNumber looks like a pattern match on a discriminated union too. That’s probably the easiest way to reason about multiple-case active patterns.

This example was simply classifying the input (the integer parameter) as Even or Odd based on the expression in the multiple-case active pattern.

We can do more interesting things by decomposing the data, however. Let’s do a more complicated example, showing how we can take advantage of multiple-case active patterns with abstractions that represent our domain.

We’ve created four active recognizers in our multiple-case active pattern (you can have a maximum of seven) representing the four seasons of the year. Let’s breakdown what is going on:

Our multiple-case active pattern takes a DateTime object and decomposes it into a tuple representing the Month and Day.
The tuple then goes through the following match patterns:
- If Month is 1 (Day is disregarded) or Month is 2 (Day is disregarded) return Winter.
- If Month is 3 and Day is < 21 return Winter.
- If Month is 3 (Day is disregarded, but we already failed the < 21 check, so it has to be >= 21) or Month is 4 (Day is disregarded) or Month is 5 (Day is disregarded) return Spring.
- Repeat similar pattern for remaining seasons…
- The last pattern is our catchall, which will match when Month is 12 (and Day has to be >= 21 because of the previous pattern). Using the underscore for the final pattern makes the compiler happy since it doesn’t know that the integer representing Month has to be in the 1-12 range (assuming we can trust the .Net base class library).

You may have noticed, I slipped in another tidbit we never covered in part 1. That is the single | (pipe) used to combine match patterns into a single OR expression on one line. This is a convenient way to return the same result for multiple pattern matches. There is one caveat though. All of the patterns must bind to the same variable(s). So that means we couldn’t do the following to the match expressions, even though it is a bit more convenient:

This will result in the following compile error:

The two sides of this ‘or’ pattern bind to different sets of variables.

This is because Day is being bound to _ (which really disregards the day) for the 1 and 2 cases, but d for the third case. Also, even removing the when clause or changing the two _ to a bound identifier other than d will still result in the same error. It cannot be different between the cases, period.

Now with that being said, we could shorten the code by doing the following (I cleaned it up to look nicer and more readable, though I tend not to space things this way. It makes code maintenance harder when you add new cases etc.):

Each pattern is decomposing the Day into the same d variable. It just has to be the same for each OR clause (the same line), not the same for each pattern (separate lines). The downside to this approach is that we are binding Day to the d identifier, but then not using it on the right hand side of ->. It is required to use an identifier for the when clauses however, so we can’t just ditch the d altogether.

So it is a toss up and a matter of preference. This version is probably more readable to most people, but it creates unnecessary bindings.

UPDATE: Aug-11-2020
Thank you Rolf for bringing to my attention an issue with pattern_match2_fsharp_blog_12.fsx. I’ve decided to leave the offending code in since this may be a gotcha worth bringing to your attention.

When multiple OR patterns contain a guard clause, the clause applies to all the patterns, not just the last pattern. So in the case of | 1, d | 2, d | 3, d when d < 21 -> Winter the d < 21 clause would apply to both the 1, d and 2, d cases as well. This means any dates with a day that is greater than or equal to 21 will pass through all of the matches and land in the catch-all of Winter at the bottom.

For this reason, you should stick with the version in pattern_match2_fsharp_blog_10.fsx.

See the last example in the Guards on Patterns section in the MSDN documentation for more details.

Now that we have this nice little multiple-case active pattern to determine the season, let’s make use of it. We’ll even use another OR pattern for good measure.

Here we have a pattern matching function that will take a DateTime and convert it to a Canadian season. In Canada we joke that we really only have two seasons: Construction and winter. So our function will take a match on Spring, Summer or Fall and print Construction season. A match on Winter will print Winter season.
Remember that by using the function keyword shortcut, since this function only contains a pattern match expression, we don’t have to specify the DateTime parameter explicitly or the match .. with.

Now before we move on to the next active pattern, I shall digress for a moment. Since we are talking about the OR match pattern, we should probably look quickly at the AND match pattern.

As you would expect, we use the & (ampersand) character. We’ll go back to a simple combination of single-case active patterns for this example.

This is a solution to the standard FizzBuzz problem. Yes I went there. It should be pretty self-explanatory. If you don’t know what that is and can’t deduce it from the code (and comments), checkout the link above.

There is also a nice solution doing something similar that I found on FsSnip using multiple-case patterns.

While we’re in this digression intermission, lets hit on one final tidbit that didn’t make into part 1. I promise we’ll get back to active patterns after this.

Maybe you’ve heard of partial application for functions. That is where you make a new function that calls another function with some parameters baked-in. This is an effective form of code reuse in F#. Check out this post for an excellent overview.

Active patterns also allow partial application. Let’s look at a rather silly example.

So what are we doing here? First we have created a parameterized single-case active pattern called StartsWith. This takes a string, text, which is then checked to see if it starts with the first parameter find. The partial application happens on line 4 with IsAnFWord. We are baking the string f into a call to StartsWith

The wordCheck function can then use the IsAnFWord active recognizer in a match statement without needing to pass in the first parameter (remember the string f is already included for us), while still using StartsWith (which does requires the first parameter to be passed in).

I hope that made sense. If not, do some reading on regular partial application for functions (there are lots of examples out there). The exact same principles apply to partial application for active patterns.

Ok, intermission is over. Let’s move on to the next active pattern.

Partial Active Pattern

Ironically, partial active pattern sounds similar to partial application for active patterns (what we just talked about), but they are not the same thing. Partial active patterns are also referred to as single-case partial active patterns or a few other variations. There are no multi-case partial active patterns though, so partial active pattern should suffice.

So what does partial active pattern mean? The simplified explanation is that sometimes you will get a match and sometimes you won’t. So what is returned is an option type. If you don’t know about the option type, go check it out. I’ll wait here.

Sometimes option types are a pain, because you need to match against None or Some x (to extract the value and store it in x). Luckily, partial active patterns allow you to use option types for cleaner pattern matching expressions without the extra fuss. Recall our FizzBuzz example from earlier:

In lines 6, 7 and 8 we are forced to match against the value true. This seems like extra noise that would be nice to avoid (unless for some reason we wanted to match against false of course). So how can we make this a little nicer? By using a partial active pattern for both Fizz and Buzz.

Notice we’ve added an _ (underscore) as a second active recognizer for Fizz and Buzz. This signals that we are dealing with a partial active pattern, which then requires us to return an option type (the compiler will complain if you do not). We have done that by returning the given active recognizer wrapped in an option type (calling the Some constructor) when our condition is true and returning None when our condition is not met.

Looking at the fizzBuzz function, you can see that we no longer require true for the match expression.

Let’s try a more interesting example to show off the power of partial active patterns. We’ll do some log parsing.

There is quite a lot going on here in this small amount of code.

First, we have created a discriminated union to hold our types of log messages. We can have an Info, Warning or Error. Each union type holds a DateTime and a Message string. For Error, there is also a Code integer.

Next we have two partial active patterns that are wrapping the DateTime.TryParseExact and Int32.TryParse functions. These will handle invalid data for us by wrapping data in an Option type. Either we have the data we expect in the proper format or we don’t. When we don’t, we’ll get None returned, which then saves us from ugly error handling code.

Now let’s put this code to use with another partial active pattern.

This partial active pattern tokenizes a line and attempts to match the tokens against our expected format for an Error (we’ll see the format shortly).

Notice in the match expression, we can decompose the tokens nicely using an Array match pattern, which in turn decomposes data using our existing Date and Int partial active patterns.

So if we have a proper DateTime, then the string Error, an Integer and then our message string, we will match successfully and create an Error, wrapped in an Option. If any of those pieces don’t match we return None.

We could do the same for Info and Warning, but I’ll leave that out for brevity. Next let’s create a function to parse errors from the log file and then test it out.

Our parseErrors takes a log string sequence and recursively parses each line (we convert the sequence of strings to a list of strings on the initial call to the recursive function on line 9 to take advantage of list pattern matching). If an Error is matched via our partial active pattern for errors, we prepend it to our list of errors. Otherwise, we just continue parsing the next line.

For our test, we have sample sequence of strings (stored in testLog) in the format [DateTime] [Type] [Error Code if applicable] Message. This sequence is to simulate reading lines from a log file. We take that sample log and call parseErrors to produce a list containing only errors.

Ok, on to the final active pattern.

Parameterized Partial Active Pattern

A parameterized partial active pattern is just the same as a partial active pattern, but with one or more extra parameters. Let’s jump straight to our example.

This is very similar to the regular partial active pattern FizzBuzz example. However, we’ve added a MultipleOf partial active pattern that takes two parameters. The first parameter is our multiplicand and the second parameter is our number that we are matching against. With this parameterized partial active pattern, we can then take advantage of partial application for the Fizz and Buzz partial active patterns by baking in the 3 and 5 values. The rest of the code is exactly the same as the previous example used in regular partial active patterns.

So concludes my extremely long winded explanation of active patterns and a few extra tidbits thrown in for good measure. Hopefully you see how they can take pattern matching to the next level to allow your problem domain to shine through your code better.

Thanks for reading. Please leave any questions or suggestions in the comments.

4 thought on “F# Pattern Matching – Part 2 – Active Patterns”

Rolf says:

August 11, 2020 at 2:52 am

Dear Jason

Thank you for your blog post. The example in ‘pattern_match2_fsharp_blog_12.fsx’ does probably not behave as expected. The ‘when’ clause is applied to all of the OR patterns, not just the last one. For example “printCanadianSeason System.DateTime(2020, 04, 24)” will result in “Winter”.
See also the last example in https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/match-expressions

All the best
Rolf

1. Jason Down says:
  
  August 11, 2020 at 10:29 am
  
  Rolf,
  
  You are correct! Thank you for pointing this out. I will update the article with the proper example.
  
Luke Pammant says:

September 17, 2020 at 1:41 am

Not sure where/how I found this series but I’m glad I did. As a newbie to F# I appreciate well thought out content like this! Thank you.

1. Jason Down says:
  
  September 17, 2020 at 10:20 am
  
  Hi Luke,
  
  I’m glad you found the series useful! It was definitely a learning experience writing it as well. I’m hoping to get back to writing some more articles soon.

F# Pattern Matching – Part 2 – Active Patterns