The Neophyte’s Guide to Scala Part 16: Where to Go From Here

As I have already hinted at in the previous article, the Neophyte’s Guide to Scala is coming to an end. Over the last five months, we have delved into numerous language and library features of Scala, hopefully deepening your understanding of those features and their underlying ideas and concepts.

As such, I hope this guide has served you well as a supplement to whatever introductory resources you have been using to learn Scala, whether you attended the Scala course at Coursera or learned from a book. I have tried to cover all the quirks I stumbled over back when I learned the language – things that were only mentioned briefly or not covered at all in the books and tutorials available to me – and I hope that especially those explanations were of value to you.

As the series progressed, we ventured into more advanced territory, covering ideas like type classes and path-dependent types. While I could go on writing about more and more arcane features, I felt like this would go against the idea of this series, which is clearly targeted at aspiring neophytes.

Hence, I will conclude this series with some suggestions of where to go from here if you want more. Rest assured that I will continue blogging about Scala, just not within the context of this series.

How you want to continue your journey with Scala, of course, depends a lot on your individual preferences: Maybe you are now at a point where you would like to teach Scala to others, or maybe you are excited about Scala’s type system and would like to explore some of the language’s more arcane features by delving into type-level programming.

More often than not, a good way to really get comfortable with a new language and its whole ecosystem of libraries is to use it for creating something useful, i.e. a real application that is more than just toying around. Personally, I have also gained a lot from contributing to open source projects early on.

In the following, I will elaborate on these four paths, which, of course, are not mutually exclusive, and provide you with numerous links to recommended additional resources.

Teaching Scala

Having followed this series, you should be familiar enough with Scala to be able to teach the basics. Maybe you are in a Java or Ruby shop and want to get your coworkers excited about Scala and functional programming.

Great, then why not organize a workshop? A nice way of introducing people to a new language is to not do a talk with lots of slides, but to teach by example, introducing a language in small steps by solving tiny problems together. Active participation is key!

If that’s something you’d like to do, the Scala community has you covered. Have a look at Scala Koans, a collection of small lessons, each of which provides a problem to be solved by fixing an initially failing test. The project is inspired by the Ruby Koans project and is a really good resource for teaching the language to others in small collaborative coding sessions.
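
To give you an idea of the format, a koan is essentially a failing test with a blank that the participants fill in to make it pass. The following is only an illustrative sketch of that idea, not code taken from the project:

// illustrative koan-style exercise: the test fails until the learner
// replaces the __ placeholder with the expected value
val __ = -1 // to be filled in by the participant
val list = 1 :: 2 :: Nil
assert((0 :: list).size == __, "replace __ with the size of the new list")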

Another amazing project ideally suited for workshops or other events is Scalatron, a game in which bots fight against each other in a virtual arena. Why not teach the language in a workshop by developing such a bot together, one that will then fight against the computer? Once the participants are familiar enough with the language, organize a tournament in which each participant develops their own bot.

Mastering arcane powers

We have only seen a tiny bit of what the Scala type system allows you to do. If this small hint at the high wizardry that’s possible got you excited and you want to master the arcane powers of type-level programming, a good starting resource is the blog series Type-Level Programming in Scala by Mark Harrah.

After that, I recommend having a look at Shapeless, a library in which Miles Sabin explores the limits of the Scala language in terms of generic and polytypic programming.
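
To give you a small taste of what type-level programming means, here is a minimal sketch you can paste into a REPL: natural numbers encoded as types, and a type class that makes the compiler compute their sum. Everything here is made up for illustration, but it is plain Scala:

// natural numbers as types
sealed trait Nat
sealed trait Zero extends Nat
sealed trait Succ[N <: Nat] extends Nat

// a type class whose instances prove that A + B = Out
trait Sum[A <: Nat, B <: Nat] { type Out <: Nat }
object Sum {
  type Aux[A <: Nat, B <: Nat, C <: Nat] = Sum[A, B] { type Out = C }
  implicit def sumZero[B <: Nat]: Aux[Zero, B, B] =
    new Sum[Zero, B] { type Out = B }
  implicit def sumSucc[A <: Nat, B <: Nat, C <: Nat](
      implicit rest: Aux[A, B, C]): Aux[Succ[A], B, Succ[C]] =
    new Sum[Succ[A], B] { type Out = Succ[C] }
}

type One = Succ[Zero]
type Two = Succ[One]
// this line only compiles if the compiler can prove that 1 + 1 = 2
implicitly[Sum.Aux[One, One, Two]]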

Creating something useful

Reading books, doing tutorials and toying around with a new language is all fine to get a certain understanding of it, but in order to become really comfortable with Scala and its paradigm and learn how to think the Scala way, I highly recommend that you start creating something useful with it – something that is clearly more than a toy application (this is true for learning any language, in my opinion).

By tackling a real-world problem and trying to create a useful and usable application, you will also get a good overview of the ecosystem of libraries and get a feeling for which of those can be of service to you in specific situations.

In order to find relevant libraries or get updates of ones you are interested in, you should subscribe to implicit.ly and regularly take a look at Scala projects on GitHub.

It’s all about the web

These days, most applications written in Scala will be some kind of server applications, often with a RESTful interface exposed via HTTP and a web frontend.

If the actor model for concurrency is a good fit for your use case and you hence choose to use the Akka toolkit, an excellent choice for exposing a REST API via HTTP is Spray Routing. This is a great tool if you don’t need a web frontend, or if you want to develop a single-page web application that will talk to your backend by means of a REST API.
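
To give you an impression of how lightweight this is, a minimal spray-routing service looks roughly like the canonical example from the spray documentation – the actor system name and the route below are of course only placeholders:

import akka.actor.ActorSystem
import spray.routing.SimpleRoutingApp

object RestService extends App with SimpleRoutingApp {
  implicit val system = ActorSystem("rest-service")

  // start an HTTP server with a single GET route
  startServer(interface = "localhost", port = 8080) {
    path("coffees") {
      get {
        complete("""["espresso", "cappuccino"]""")
      }
    }
  }
}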

If you need something less minimalistic, of course, Play, which is part of the Typesafe stack, is a good choice, especially if you seek something that is widely adopted and hence well supported.

Living in a concurrent world

If, after our two parts on actors and Akka, you think that Akka is a good fit for your application, you will likely want to learn a lot more about it before getting serious with it.

While the Akka documentation is pretty exhaustive and thus serves well as a reference, I think that the best choice for actually learning Akka is Derek Wyatt’s book Akka Concurrency, a preliminary version of which is already available as a PDF.

Once you have got serious with Akka, you should definitely subscribe to Let It Crash, which provides you with news and advanced tips and tricks regarding all things Akka.

If actors are not your thing and you prefer a concurrency model allowing you to leverage the composability of Futures, your library of choice is probably Twitter’s Finagle. It allows you to modularize your application as a bunch of small remote services, with support for numerous popular protocols out of the box.
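
To give you a rough idea, with a reasonably recent Finagle version a minimal HTTP service might look like the following sketch – the endpoint and response are made up, and the exact API may differ between Finagle releases:

import com.twitter.finagle.{Http, Service}
import com.twitter.finagle.http.{Request, Response}
import com.twitter.util.{Await, Future}

object HelloService extends App {
  // a service is simply an asynchronous function from Request to Response
  val service = new Service[Request, Response] {
    def apply(req: Request): Future[Response] = {
      val response = Response()
      response.contentString = "Hello from Finagle"
      Future.value(response)
    }
  }
  val server = Http.serve(":8080", service)
  Await.ready(server)
}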

Contributing

Another really great way to learn a lot about Scala quickly is to start contributing to one or more open source projects – preferably to libraries you have been using while working on your own application.

Of course, this is not specific to Scala, but I still think it deserves to be mentioned. If you have only just learned Scala and are not using it at your day job already, it’s nearly the only way to learn from other, more experienced Scala developers.

It forces you to read a lot of Scala code from other people, discovering how to do things differently, possibly more idiomatically, and you can have those experienced developers review your code in pull requests.

I have found the Scala community at large to be very friendly and helpful, so don’t shy away from contributing, even if you think you’re too much of a rookie when it comes to Scala.

While some projects might have their own way of doing things, it’s certainly a good idea to study the Scala Style Guide to get familiar with common coding conventions.

Connecting

By contributing to open source projects, you have already started connecting with the Scala community. However, you may not have the time to do that, or you may prefer other ways of connecting to like-minded people.

Try finding a local Scala user group or meetup. Scala Tribes provides an overview of Scala communities across the globe, and the Scala topic at Lanyrd keeps you up-to-date on any kind of Scala-related event, from conferences to meetups.

If you don’t like connecting in meatspace, the scala-user mailing list/Google group and the Scala IRC channel on Freenode may be good alternatives.

Other resources

Regardless of which of the paths outlined above you follow, there are a few resources I would like to recommend:

Conclusion

I hope you have enjoyed this series and that I could get you excited about Scala. While this series is coming to an end, I seriously hope that it’s just the beginning of your journey through Scala land. Let me know in the comments how your journey went so far and where you think it will go from here.

The Neophyte’s Guide to Scala Part 15: Dealing With Failure in Actor Systems

In the previous part of this series, I introduced you to the second cornerstone of Scala concurrency: The actor model, which complements the model based on composable futures backed by promises. You learnt how to define and create actors, how to send messages to them and how an actor processes these messages, possibly modifying its internal state as a result or asynchronously sending a response message to the sender.

While that was hopefully enough to get you interested in the actor model for concurrency, I left out some crucial concepts you will want to know about before starting to develop actor-based applications that consist of more than a simple echo actor.

The actor model is meant to help you achieve a high level of fault tolerance. In this article, we are going to have a look at how to deal with failure in an actor-based application, which is fundamentally different from error handling in a traditional layered server architecture.

The way you deal with failure is closely linked to some core Akka concepts and to some of the elements an actor system in Akka consists of. Hence, this article will also serve as a guide to those ideas and components.

Actor hierarchies

Before going into what happens when an error occurs in one of your actors, it’s essential to introduce one crucial idea underlying the actor approach to concurrency – an idea that is the very foundation for allowing you to build fault-tolerant concurrent applications: Actors are organized in a hierarchy.

So what does this mean? First of all, it means that every single one of your actors has got a parent actor, and that each actor can create child actors. Basically, you can think of an actor system as a pyramid of actors. Parent actors watch over their children, just as in real life, taking care that they get back on their feet if they stumble. You will see shortly how exactly this is done.

The guardian actor

In the previous article, we only had two different actors, a Barista actor and a Customer actor. I will not repeat their rather trivial implementations, but focus on how we created instances of these actor types:

import akka.actor._
val system = ActorSystem("Coffeehouse")
val barista = system.actorOf(Props[Barista], "Barista")
val customer = system.actorOf(Props(classOf[Customer], barista), "Customer")

As you can see, we create these two actors by calling the actorOf method defined on the ActorSystem type.

So what is the parent of these two actors? Is it the actor system? Not quite, but close. The actor system is not an actor itself, but it has got a so-called guardian actor that serves as the parent of all root-level user actors, i.e. actors we create by calling actorOf on our actor system.

There shouldn’t be a whole lot of actors in your system that are children of the guardian actor. It really makes more sense to have only a few top-level actors, each of them delegating most of the work to their children.

Actor paths

The hierarchical structure of an actor system becomes apparent when looking at the actor paths of the actors you create. These are basically URLs by which actors can be addressed. You can get an actor’s path by calling path on its ActorRef:

barista.path // => akka.actor.ActorPath = akka://Coffeehouse/user/Barista
customer.path // => akka.actor.ActorPath = akka://Coffeehouse/user/Customer

The akka protocol is followed by the name of our actor system, the name of the user guardian actor and, finally, the name we gave our actor when calling actorOf on the system. In the case of remote actors, running on different machines, you would additionally see a host and a port.
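
To make that concrete, the path of a hypothetical remote barista might look roughly like this – the host and port are made up for illustration:

akka.tcp://Coffeehouse@192.168.0.42:2552/user/Barista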

Actor paths can be used to look up another actor. For example, instead of requiring the barista reference in its constructor, the Customer actor could call the actorSelection method on its ActorContext, passing in a relative path to retrieve a reference to the barista:

context.actorSelection("../Barista")

However, while being able to look up an actor by its path can sometimes come in handy, it’s often a much better idea to pass in dependencies in the constructor, just as we did before. Too much intimate knowledge about where your dependencies are located in the actor system makes your system more susceptible to bugs, and it will be difficult to refactor later on.

An example hierarchy

To illustrate how parents watch over their children and what this has got to do with keeping your system fault-tolerant, I’m going to stick to the domain of the coffeehouse. Let’s give the Barista actor a child actor to which it can delegate some of the work involved in running a coffeehouse.

If we really were to model the work of a barista, we would likely give them a bunch of child actors for all the various subtasks. But to keep this article focused, we have to be a little simplistic with our example.

Let’s assume that the barista has got a register. It processes transactions, printing appropriate receipts and incrementing the day’s sales so far. Here is a first version of it:

import akka.actor._
object Register {
  sealed trait Article
  case object Espresso extends Article
  case object Cappuccino extends Article
  case class Transaction(article: Article)
}
class Register extends Actor {
  import Register._
  import Barista._
  var revenue = 0
  val prices = Map[Article, Int](Espresso -> 150, Cappuccino -> 250)
  def receive = {
    case Transaction(article) =>
      val price = prices(article)
      sender ! createReceipt(price)
      revenue += price
  }
  def createReceipt(price: Int): Receipt = Receipt(price)
}

It contains an immutable map of the prices for each article, and an integer variable representing the revenue. Whenever it receives a Transaction message, it increments the revenue accordingly and returns a printed receipt to the sender.

The Register actor, as already mentioned, is supposed to be a child actor of the Barista actor, which means that we will not create it from our actor system, but from within our Barista actor. The initial version of our soon-to-be parent actor looks like this:

object Barista {
  case object EspressoRequest
  case object ClosingTime
  case class EspressoCup(state: EspressoCup.State)
  object EspressoCup {
    sealed trait State
    case object Clean extends State
    case object Filled extends State
    case object Dirty extends State
  }
  case class Receipt(amount: Int)
}
class Barista extends Actor {
  import Barista._
  import Register._
  import EspressoCup._
  import context.dispatcher
  import akka.util.Timeout
  import akka.pattern.ask
  import akka.pattern.pipe
  import concurrent.duration._

  implicit val timeout = Timeout(4.seconds)
  val register = context.actorOf(Props[Register], "Register")
  def receive = {
    case EspressoRequest =>
      val receipt = register ? Transaction(Espresso)
      receipt.map((EspressoCup(Filled), _)).pipeTo(sender)
    case ClosingTime => context.stop(self)
  }
}

First off, we define the message types that our Barista actor is able to deal with. An EspressoCup can have one out of a fixed set of states, which we ensure by using a sealed trait.

The more interesting part is to be found in the implementation of the Barista class. The dispatcher, ask, and pipe imports as well as the implicit timeout are required because we make use of Akka’s ask syntax and futures in our Receive partial function: When we receive an EspressoRequest, we ask the Register actor for a Receipt for our Transaction. This is then combined with a filled espresso cup and piped to the sender, which will thus receive a tuple of type (EspressoCup, Receipt). Delegating subtasks to child actors and then aggregating or amending their work like this is typical of actor-based applications.

Also, note how we create our child actor by calling actorOf on our ActorContext instead of the ActorSystem. By doing so, the actor we create becomes a child actor of the one who called this method, instead of a top-level actor whose parent is the guardian actor.

Finally, here is our Customer actor, which, like the Barista actor, will sit at the top level, just below the guardian actor:

object Customer {
  case object CaffeineWithdrawalWarning
}
class Customer(coffeeSource: ActorRef) extends Actor with ActorLogging {
  import Customer._
  import Barista._
  import EspressoCup._
  def receive = {
    case CaffeineWithdrawalWarning => coffeeSource ! EspressoRequest
    case (EspressoCup(Filled), Receipt(amount)) =>
      log.info(s"yay, caffeine for ${self}!")
  }
}

It is not terribly interesting for our tutorial, which focuses more on the Barista actor hierarchy. What’s new is the use of the ActorLogging trait, which allows us to write to the log instead of printing to the console.

Now, if we create our actor system and populate it with a Barista and two Customer actors, we can happily feed our two under-caffeinated addicts with a shot of black gold:

import Customer._
val system = ActorSystem("Coffeehouse")
val barista = system.actorOf(Props[Barista], "Barista")
val customerJohnny = system.actorOf(Props(classOf[Customer], barista), "Johnny")
val customerAlina = system.actorOf(Props(classOf[Customer], barista), "Alina")
customerJohnny ! CaffeineWithdrawalWarning
customerAlina ! CaffeineWithdrawalWarning

If you try this out, you should see two log messages from happy customers.

To crash or not to crash?

Of course, what we are really interested in, at least in this article, is not happy customers, but the question of what happens if things go wrong.

Our register is a fragile device – its printing functionality is not as reliable as it should be. Every so often, a paper jam causes it to fail. Let’s add a PaperJamException type to the Register companion object:

class PaperJamException(msg: String) extends Exception(msg)

Then, let’s change the createReceipt method in our Register actor accordingly:

def createReceipt(price: Int): Receipt = {
  import util.Random
  if (Random.nextBoolean())
    throw new PaperJamException("OMG, not again!")
  Receipt(price)
}

Now, when processing a Transaction message, our Register actor will throw a PaperJamException in about half of the cases.

What effect does this have on our actor system, or on our whole application? Luckily, Akka is very robust and not affected by exceptions in our code at all. What happens, though, is that the parent of the misbehaving child is notified – remember that parents are watching over their children, and this is the situation where they have to decide what to do.

Supervisor strategies

The whole act of being notified about exceptions in child actors, however, is not handled by the parent actor’s Receive partial function, as that would confound the parent actor’s own behaviour with the logic for dealing with failure in its children. Instead, the two responsibilities are clearly separated.

Each actor defines its own supervisor strategy, which tells Akka how to deal with certain types of errors occurring in its children.

There are basically two different types of supervisor strategy, the OneForOneStrategy and the AllForOneStrategy. Choosing the former means that the way you want to deal with an error in one of your children will only affect the child actor from which the error originated, whereas the latter will affect all of your child actors. Which of those strategies is best depends a lot on your individual application.

Regardless of which type of SupervisorStrategy you choose for your actor, you will have to specify a Decider, which is a PartialFunction[Throwable, Directive] – this allows you to match against certain subtypes of Throwable and decide for each of them what’s supposed to happen to your problematic child actor (or all your child actors, if you chose the all-for-one strategy).
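
To make this a little more concrete, here is a minimal sketch of a parent actor overriding its supervisor strategy – the Manager actor and the choice of exceptions are made up, and the directives used are explained in the next section:

import akka.actor.{Actor, AllForOneStrategy, SupervisorStrategy}
import akka.actor.SupervisorStrategy.{Escalate, Restart}

class Manager extends Actor {
  // if any child throws an IllegalStateException, restart all children;
  // any other exception is escalated to this actor's own parent
  override val supervisorStrategy: SupervisorStrategy =
    AllForOneStrategy() {
      case _: IllegalStateException => Restart
      case _: Exception             => Escalate
    }
  def receive = {
    case _ => () // delegate the actual work to child actors (not shown)
  }
}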

Directives

Here is a list of the available directives:

sealed trait Directive
case object Resume extends Directive
case object Restart extends Directive
case object Stop extends Directive
case object Escalate extends Directive
  • Resume: If you choose to Resume, this probably means that you think of your child actor as a little bit of a drama queen. You decide that the exception was not so exceptional after all – the child actor or actors will simply resume processing messages as if nothing extraordinary had happened.

  • Restart: The Restart directive causes Akka to create a new instance of your child actor or actors. The reasoning behind this is that you assume that the internal state of the child/children is corrupted in some way so that it can no longer process any further messages. By restarting the actor, you hope to put it into a clean state again.

  • Stop: You effectively kill the actor. It will not be restarted.

  • Escalate: If you choose to Escalate, you probably don’t know how to deal with the failure at hand. You delegate the decision about what to do to your own parent actor, hoping they are wiser than you. If an actor escalates, they may very well be restarted themselves by their parent, as the parent will only decide about its own child actors.

The default strategy

You don’t have to specify your own supervisor strategy in each and every actor. In fact, we haven’t done that so far. This means that the default supervisor strategy will take effect. It looks like this:

final val defaultStrategy: SupervisorStrategy = {
  def defaultDecider: Decider = {
    case _: ActorInitializationException => Stop
    case _: ActorKilledException         => Stop
    case _: Exception                    => Restart
  }
  OneForOneStrategy()(defaultDecider)
}

This means that for exceptions other than ActorInitializationException or ActorKilledException, the respective child actor in which the exception was thrown will be restarted.

Hence, when a PaperJamException occurs in our Register actor, the supervisor strategy of the parent actor (the barista) will cause the Register to be restarted, because we haven’t overridden the default strategy.

If you try this out, you will likely see an exception stacktrace in the log, but nothing about the Register actor being restarted.

Let’s verify that this is really happening. To do so, however, you will need to learn about the actor lifecycle.

The actor lifecycle

To understand what the directives of a supervisor strategy actually do, it’s crucial to know a little bit about an actor’s lifecycle. Basically, it boils down to this: when created via actorOf, an actor is started. It can then be restarted an arbitrary number of times, in case there is a problem with it. Finally, an actor can be stopped, ultimately leading to its death.

There are numerous lifecycle hook methods that an actor implementation can override. It’s also important to know their default implementations. Let’s go through them briefly; a small sketch combining all four hooks follows the list:

  • preStart: Called when an actor is started, allowing you to do some initialization logic. The default implementation is empty.
  • postStop: Empty by default, allowing you to clean up resources. Called after stop has been called for the actor.
  • preRestart: Called right before a crashed actor is restarted. By default, it stops all children of that actor and then calls postStop to allow cleaning up of resources.
  • postRestart: Called immediately after an actor has been restarted. Simply calls preStart by default.
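
Here is the small sketch mentioned above: a hypothetical actor that overrides all four hooks just to log what is happening, while preserving the default behaviour by calling the respective super implementations:

import akka.actor.{Actor, ActorLogging}

class LifecycleLogger extends Actor with ActorLogging {
  override def preStart(): Unit = log.info("about to start")
  override def postStop(): Unit = log.info("stopped – clean up resources here")
  override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
    log.info(s"about to restart because of ${reason.getMessage}")
    super.preRestart(reason, message) // stops all children and calls postStop
  }
  override def postRestart(reason: Throwable): Unit = {
    log.info("restarted")
    super.postRestart(reason) // calls preStart
  }
  def receive = {
    case _ => ()
  }
}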

Let’s see if our Register does indeed get restarted upon failure by simply adding some log output to its postRestart method. Make the Register type extend the ActorLogging trait and add the following method to it:

override def postRestart(reason: Throwable) {
  super.postRestart(reason)
  log.info(s"Restarted because of ${reason.getMessage}")
}

Now, if you send the two Customer actors a bunch of CaffeineWithdrawalWarning messages, you should see one or the other of those log outputs, confirming that our Register actor has been restarted.

Death of an actor

Often, it doesn’t make sense to restart an actor again and again – think of an actor that talks to some other service over the network, and that service has been unreachable for a while. In such cases, it is a very good idea to tell Akka how often to restart an actor within a certain period of time. If that limit is exceeded, the actor is instead stopped and hence dies. Such a limit can be configured in the constructor of the supervisor strategy:

import scala.concurrent.duration._
import akka.actor.OneForOneStrategy
import akka.actor.SupervisorStrategy.Restart
OneForOneStrategy(10, 2.minutes) {
  case _ => Restart
}

The self-healing system?

So, is our system running smoothly, healing itself whenever this damn paper jam occurs? Let’s change our log output:

override def postRestart(reason: Throwable) {
  super.postRestart(reason)
  log.info(s"Restarted, and revenue is $revenue cents")
}

And while we are at it, let’s also add some more logging to our Receive partial function, making it look like this:

def receive = {
  case Transaction(article) =>
    val price = prices(article)
    sender ! createReceipt(price)
    revenue += price
    log.info(s"Revenue incremented to $revenue cents")
}

Ouch! Something is clearly not as it should be. In the log, you will see the revenue increasing, but as soon as there is a paper jam and the Register actor restarts, it is reset to 0. This is because restarting indeed means that the old instance is discarded and a new one created as per the Props we initially passed to actorOf.

Of course, we could change our supervisor strategy, so that it resumes in case of a PaperJamException. We would have to add this to the Barista actor:

val decider: PartialFunction[Throwable, Directive] = {
  case _: PaperJamException => Resume
}
override def supervisorStrategy: SupervisorStrategy =
  OneForOneStrategy()(decider.orElse(SupervisorStrategy.defaultStrategy.decider))

Now, the actor is not restarted upon a PaperJamException, so its state is not reset.

Error kernel

So we just found a nice solution to preserve the state of our Register actor, right?

Well, sometimes, simply resuming might be the best thing to do. But let’s assume that we really have to restart it, because otherwise the paper jam will not disappear. We can simulate this by maintaining a boolean flag that says if we are in a paper jam situation or not. Let’s change our Register like so:

class Register extends Actor with ActorLogging {
  import Register._
  import Barista._
  var revenue = 0
  val prices = Map[Article, Int](Espresso -> 150, Cappuccino -> 250)
  var paperJam = false
  override def postRestart(reason: Throwable) {
    super.postRestart(reason)
    log.info(s"Restarted, and revenue is $revenue cents")
  }
  def receive = {
    case Transaction(article) =>
      val price = prices(article)
      sender ! createReceipt(price)
      revenue += price
      log.info(s"Revenue incremented to $revenue cents")
  }
  def createReceipt(price: Int): Receipt = {
    import util.Random
    if (Random.nextBoolean()) paperJam = true
    if (paperJam) throw new PaperJamException("OMG, not again!")
    Receipt(price)
  }
}

Also remove the supervisor strategy we added to the Barista actor.

Now, the paper jam remains forever, until we have restarted the actor. Alas, we cannot do that without also losing important state regarding our revenue.

This is where the error kernel pattern comes in. Basically, it is just a simple guideline you should always try to follow, stating that if an actor carries important internal state, then it should delegate dangerous tasks to child actors, so as to prevent the state-carrying actor from crashing. Sometimes, it may make sense to spawn a new child actor for each such task, but that’s not a necessity.

The essence of the pattern is to keep important state as far at the top of the actor hierarchy as possible, while pushing error-prone tasks as far to the bottom of the hierarchy as possible.

Let’s apply this pattern to our Register actor. We will keep the revenue state in the Register actor, but move the error-prone behaviour of printing the receipt to a new child actor, which we appropriately enough call ReceiptPrinter. Here is the latter:

object ReceiptPrinter {
  case class PrintJob(amount: Int)
  class PaperJamException(msg: String) extends Exception(msg)
}
class ReceiptPrinter extends Actor with ActorLogging {
  import ReceiptPrinter._
  import Barista.Receipt
  import scala.util.Random
  var paperJam = false
  override def postRestart(reason: Throwable) {
    super.postRestart(reason)
    log.info(s"Restarted, paper jam == $paperJam")
  }
  def receive = {
    case PrintJob(amount) => sender ! createReceipt(amount)
  }
  def createReceipt(price: Int): Receipt = {
    if (Random.nextBoolean()) paperJam = true
    if (paperJam) throw new PaperJamException("OMG, not again!")
    Receipt(price)
  }
}

Again, we simulate the paper jam with a boolean flag and throw an exception each time someone asks us to print a receipt while in a paper jam. Other than the new message type, PrintJob, this is really just extracted from the Register type.

This is a good thing, not only because it moves this dangerous operation away from the stateful Register actor, but also because it makes our code simpler and consequently easier to reason about: the ReceiptPrinter actor is responsible for exactly one thing, and the Register actor has become simpler, too, now being responsible only for managing the revenue and delegating the remaining functionality to a child actor:

class Register extends Actor with ActorLogging {
  import akka.pattern.ask
  import akka.pattern.pipe
  import akka.util.Timeout
  import context.dispatcher
  import scala.concurrent.duration._
  import Register._
  import Barista._
  import ReceiptPrinter._
  implicit val timeout = Timeout(4.seconds)
  var revenue = 0
  val prices = Map[Article, Int](Espresso -> 150, Cappuccino -> 250)
  val printer = context.actorOf(Props[ReceiptPrinter], "Printer")
  override def postRestart(reason: Throwable) {
    super.postRestart(reason)
    log.info(s"Restarted, and revenue is $revenue cents")
  }
  def receive = {
    case Transaction(article) =>
      val price = prices(article)
      val requester = sender
      (printer ? PrintJob(price)).map((requester, _)).pipeTo(self)
    case (requester: ActorRef, receipt: Receipt) =>
      revenue += receipt.amount
      log.info(s"revenue is $revenue cents")
      requester ! receipt
  }
}

We don’t spawn a new ReceiptPrinter for each Transaction message we get. Instead, we use the default supervisor strategy to have the printer actor restart upon failure.

One part that merits explanation is the weird way we increment our revenue: first, we ask the printer for a receipt. We map the future to a tuple containing the requester – which is the sender of the Transaction message – and the receipt, and pipe this to ourselves. When processing that message, we finally increment the revenue and send the receipt to the requester.

The reason for that indirection is that we want to make sure that we only increment our revenue if the receipt was successfully printed. Since it is vital to never ever modify the internal state of an actor inside of a future, we have to use this level of indirection. It helps us make sure that we only change the revenue within the confines of our actor, and not on some other thread.

Assigning the sender to a val is necessary for similar reasons: When mapping a future, we are no longer in the context of our actor either – since sender is a method, it would now likely return the reference to some other actor that has sent us a message, not the one we intended.
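
For contrast, here is a sketch of the naive receive we are avoiding – a would-be replacement for the one above that compiles, but violates both of the rules just described:

// DON'T do this: the closure passed to map runs on some future thread,
// outside the actor, once the printer eventually replies
def receive = {
  case Transaction(article) =>
    val price = prices(article)
    (printer ? PrintJob(price)).map { receipt =>
      revenue += price // mutates actor state from another thread
      sender ! receipt // sender is re-evaluated later and may point to a different actor
    }
}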

Now, our Register actor is safe from constantly being restarted, yay!

Of course, the very idea of having the printing of the receipt and the management of the revenue in one place is questionable. Having them together came in handy for demonstrating the error kernel pattern. Yet, it would certainly be a lot better to separate the receipt printing from the revenue management altogether, as these are two concerns that don’t really belong together.

Timeouts

Another thing that we may want to improve upon is the handling of timeouts. Currently, when an exception occurs in the ReceiptPrinter, this leads to an AskTimeoutException, which, since we are using the ask syntax, comes back to the Barista actor in an unsuccessfully completed Future.

Since the Barista actor simply maps over that future (which is success-biased) and then pipes the transformed result to the customer, the customer will also receive a Failure containing an AskTimeoutException.

The Customer didn’t ask for anything, though, so it is certainly not expecting such a message, and in fact, it currently doesn’t handle these messages. Let’s be friendly and send customers a ComebackLater message – this is a message they already understand, and it makes them try to get an espresso at a later point. This is clearly better, as the current solution means they will never know that they will not get their espresso.
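
For this to compile, ComebackLater needs to be a message type both actors know about – presumably a simple case object defined in the Barista companion object alongside EspressoRequest and ClosingTime (it is not shown in the code excerpts here):

// assumed to live in the Barista companion object
case object ComebackLater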

To achieve this, let’s recover from AskTimeoutException failures by mapping them to ComebackLater messages. The Receive partial function of our Barista actor thus now looks like this:

def receive = {
  case EspressoRequest =>
    val receipt = register ? Transaction(Espresso)
    receipt.map((EspressoCup(Filled), _)).recover {
      case _: AskTimeoutException => ComebackLater
    } pipeTo(sender)
  case ClosingTime => context.system.shutdown()
}

Now, the Customer actors know they can try their luck later, and after trying often enough, they should finally get their eagerly anticipated espresso.

Death Watch

Another principle that is important in order to keep your system fault-tolerant is to keep a watch on important dependencies – dependencies as opposed to children.

Sometimes, you have actors that depend on other actors without the latter being their children. This means that they can’t be their supervisors. Yet, it is important to keep a watch on their state and be notified if bad things happen.

Think, for instance, of an actor that is responsible for database access. You will want actors that require this actor to be alive and healthy to know when that is no longer the case. Maybe you want to switch your system to a maintenance mode in such a situation. For other use cases, simply using some kind of backup actor as a replacement for the dead actor may be a viable solution.

In any case, you will need to place a watch on an actor you depend on in order to get the sad news of its passing away. This is done by calling the watch method defined on ActorContext. To illustrate, let’s have our Customer actors watch the Barista – they are highly addicted to caffeine, so it’s fair to say they depend on the barista:

class Customer(coffeeSource: ActorRef) extends Actor with ActorLogging {
  import Customer._
  import Barista._
  import EspressoCup._
  import context.dispatcher
  import scala.concurrent.duration._

  context.watch(coffeeSource)

  def receive = {
    case CaffeineWithdrawalWarning => coffeeSource ! EspressoRequest
    case (EspressoCup(Filled), Receipt(amount)) =>
      log.info(s"yay, caffeine for ${self}!")
    case ComebackLater =>
      log.info("grumble, grumble")
      context.system.scheduler.scheduleOnce(300.millis) {
        coffeeSource ! EspressoRequest
      }
    case Terminated(barista) =>
      log.info("Oh well, let's find another coffeehouse...")
  }
}

We start watching our coffeeSource in our constructor, and we added a new case for messages of type Terminated – this is the kind of message we will receive from Akka if an actor we watch dies.

Now, if we send the Barista a ClosingTime message, causing it to stop, the Customer actors will be notified. Give it a try, and you should see their output in the log.

Instead of simply logging that we are not amused, this could just as well initiate some failover logic, for instance.

Summary

In this part of the series, which is the second one dealing with actors and Akka, you got to know some of the important components of an actor system, all while learning how to put the tools provided by Akka and the ideas behind it to use in order to make your system more fault-tolerant.

While there is still a lot more to learn about the actor model and Akka, we shall leave it at that for now, as this would go beyond the scope of this series. In the next part, which shall bring this series to a conclusion, I will point you to a bunch of Scala resources you may want to peruse to continue your journey through Scala land, and if actors and Akka got you excited, there will be something in there for you, too.

The Neophyte’s Guide to Scala Part 14: The Actor Approach to Concurrency

After several articles about how you can leverage the Scala type system to achieve a great amount of both flexibility and compile-time safety, we are now shifting back to a topic that we already tackled previously in this series: Scala’s take on concurrency.

In these earlier articles, you learned about an approach that allows you to work asynchronously by making use of composable Futures.

This approach is a very good fit for numerous problems. However, it’s not the only one Scala has to offer. A second cornerstone of Scala concurrency is the Actor model. It provides an approach to concurrency that is entirely based on passing messages between processes.

Actors are not a new idea – the most prominent implementation of this model can be found in Erlang. The Scala core library has had its own actors library for a long time, but it is going to be deprecated in the coming Scala version 2.11, ultimately being replaced by the actors implementation provided by the Akka toolkit, which has been the de-facto standard for actor-based development with Scala for quite a while.

In this article, you will be introduced to the rationale behind Akka’s actor model and learn the basics of coding within this paradigm using the Akka toolkit. It is by no means an in-depth discussion of everything you need to know about Akka actors, and in that, it differs from most of the previous articles in this series. Rather, the intention is to familiarize you with the Akka mindset and serve as an initial spark to get you excited about it.

The problems with shared mutable state

The predominant approach to concurrency today is that of shared mutable state – a large number of stateful objects whose state can be changed by multiple parts of your application, each running in their own thread. Typically, the code is interspersed with read and write locks, to make sure that the state can only be changed in a controlled way and prevent multiple threads from mutating it simultaneously. At the same time, we are trying hard not to lock too big a block of code, as this can drastically slow down the application.

More often than not, code like this has originally been written without having concurrency in mind at all – only to be made fit for a multi-threaded world once the need arose. While writing software without the need for concurrency like this leads to very straightforward code, adapting it to the needs of a concurrent world leads to code that is really, really difficult to read and understand.

The core problem is that low-level synchronization constructs like locks and threads are very hard to reason about. As a consequence, it’s very hard to get it right: If you can’t easily reason about what’s going on, you can be sure that nasty bugs will ensue, from race conditions to deadlocks or just strange behaviour – maybe you’ll only notice after some months, long after your code has been deployed to your production servers.

Also, working with these low-level constructs makes it a real challenge to achieve an acceptable performance.

The Actor model

The Actor programming model is aimed at avoiding all the problems described above, allowing you to write highly performant concurrent code that is easy to reason about. Unlike the widely used approach of shared mutable state, it requires you to design and write your application from the ground up with concurrency in mind – it’s not really possible to add support for it later on.

The idea is that your application consists of lots of light-weight entities called actors. Each of these actors is responsible for only a very small task, and is thus easy to reason about. A more complex business logic arises out of the interaction between several actors, delegating tasks to others or passing messages to collaborators for other reasons.

The Actor System

Actors are pitiful creatures: They cannot live on their own. Rather, each and every actor in Akka resides in and is created by an actor system. Aside from allowing you to create and find actors, an ActorSystem provides for a whole bunch of additional functionality, none of which shall concern us right now.

In order to try out the example code, please add the following resolver and dependency to your SBT-based Scala 2.10 project first:

resolvers += "Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases"

libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.2.3"

Now, let’s create an ActorSystem. We’ll need it as an environment for our actors:

import akka.actor.ActorSystem
object Barista extends App {
  val system = ActorSystem("Barista")
  system.shutdown()
}

We created a new instance of ActorSystem and gave it the name "Barista" – we are returning to the domain of coffee, which should be familiar from the article on composable futures.

Finally, we are good citizens and shut down our actor system once we no longer need it.

Defining an actor

Whether your application consists of a few dozen or a few million actors totally depends on your use case, but Akka is absolutely okay with a few million. You might be baffled by this insanely high number. It’s important to understand that there is not a one-to-one relationship between an actor and a thread. You would soon run out of memory if that were the case. Rather, due to the non-blocking nature of actors, one thread can execute many actors – switching between them depending on which of them has messages to be processed.

To understand what is actually happening, let’s first create a very simple actor, a Barista that can receive orders but doesn’t really do anything apart from printing messages to the console:

sealed trait CoffeeRequest
case object CappuccinoRequest extends CoffeeRequest
case object EspressoRequest extends CoffeeRequest

import akka.actor.Actor
class Barista extends Actor {
  def receive = {
    case CappuccinoRequest => println("I have to prepare a cappuccino!")
    case EspressoRequest => println("Let's prepare an espresso.")
  }
}

First, we define the types of messages that our actor understands. Typically, case classes are used for messages sent between actors if you need to pass along any parameters. If all the actor needs is an unparameterized message, this message is typically represented as a case object – which is exactly what we are doing here.

In any case, it’s crucial that your messages are immutable, or else bad things will happen.

Next, let’s have a look at our class Barista, which is the actual actor, extending the aptly named Actor trait. Said trait defines a method receive which returns a value of type Receive. The latter is really only a type alias for PartialFunction[Any, Unit].

Processing messages

So what’s the meaning of this receive method? The return type, PartialFunction[Any, Unit] may seem strange to you in more than one respect.

In a nutshell, the partial function returned by the receive method is responsible for processing your messages. Whenever another part of your software – be it another actor or not – sends your actor a message, Akka will eventually let it process this message by calling the partial function returned by your actor’s receive method, passing it the message as an argument.

Side-effecting

When processing a message, an actor can do whatever you want it to, apart from returning a value.

Wat!?

As the return type of Unit suggests, your partial function is side-effecting. This might come as a bit of a shock to you after we emphasized the usage of pure functions all the time. For a concurrent programming model, this actually makes a lot of sense. Actors are where your state is located, and having some clearly defined places where side-effects will occur in a controllable manner is totally fine – each message your actor receives is processed in isolation, one after another, so there is no need to reason about synchronization or locks.

Untyped

But… this partial function is not only side-effecting, it’s also as untyped as you can get in Scala, expecting an argument of type Any. Why is that, when we have such a powerful type system at our fingertips?

This has a lot to do with some important design choices in Akka that allow you to do things like forwarding messages to other actors, installing load balancing or proxying actors without the sender having to know anything about them and so on.

In practice, this is usually not a problem. With the messages themselves being strongly typed, you typically use pattern matching for processing those types of messages you are interested in, just as we did in our tiny example above.

Sometimes, though, the weakly typed actors can indeed lead to nasty bugs the compiler can’t catch for you. If you have grown to love the benefits of a strong type system and don’t want to give them up at any cost, not even for some parts of your application, you may want to look at Akka’s new experimental Typed Channels feature.

Asynchronous and non-blocking

I wrote above that Akka would let your actor eventually process a message sent to it. This is important to keep in mind: Sending a message and processing it is done in an asynchronous and non-blocking fashion. The sender will not be blocked until the message has been processed by the receiver. Instead, they can immediately continue with their own work. Maybe they expect to get a message from your actor in return after a while, or maybe they are not interested in hearing back from your actor at all.

What really happens when some component sends a message to an actor is that this message is delivered to the actor’s mailbox, which is basically a queue. Placing a message in an actor’s mailbox is a non-blocking operation, i.e. the sender doesn’t have to wait until the message is actually enqueued in the recipient’s mailbox.

The dispatcher will notice the arrival of a new message in an actor’s mailbox, again asynchronously. If the actor is not already processing a previous message, it is now allocated to one of the threads available in the execution context. Once the actor is done processing any previous messages, the dispatcher sends it the next message from its mailbox for processing.

The actor blocks the thread to which it is allocated for as long as it takes to process the message. While this doesn’t block the sender of the message, it means that lengthy operations degrade overall performance, as all the other actors have to be scheduled for processing messages on one of the remaining threads.

Hence, a core principle to follow for your Receive partial functions is to spend as little time inside them as possible. Most importantly, avoid calling blocking code inside your message processing code, if possible at all.

Of course, this is something you can’t prevent doing completely – the majority of database drivers nowadays is still blocking, and you will want to be able to persist data or query for it from your actor-based application. There are solutions to this dilemma, but we won’t cover them in this introductory article.
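
Just to give you a rough idea of what such a solution can look like: one common approach is to wrap the blocking call in a Future that runs on a dedicated dispatcher and pipe the result back. The repository actor and the dispatcher name below are made up for illustration, and the dispatcher would have to be configured in your application.conf:

import akka.actor.Actor
import akka.pattern.pipe
import scala.concurrent.Future

case class FindUser(id: Long)

class UserRepository extends Actor {
  // a dedicated execution context for blocking calls, so the actor's thread is never blocked
  implicit val blockingEc = context.system.dispatchers.lookup("blocking-io-dispatcher")

  def receive = {
    case FindUser(id) =>
      Future(blockingDatabaseLookup(id)) pipeTo sender
  }

  // stand-in for a blocking call, e.g. one made through a JDBC driver
  def blockingDatabaseLookup(id: Long): String = ???
}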

Creating an actor

Defining an actor is all well and good, but how do we actually use our Barista actor in our application? To do that, we have to create a new instance of our Barista actor. You might be tempted to do it the usual way, by calling its constructor like so:

val barista = new Barista // will throw exception

This will not work! Akka will thank you with an ActorInitializationException. The thing is, in order for the whole actor thingie to work properly, your actors need to be managed by the ActorSystem and its components. Hence, you have to ask the actor system for a new instance of your actor:

import akka.actor.{ActorRef, Props}
val barista: ActorRef = system.actorOf(Props[Barista], "Barista")

The actorOf method defined on ActorSystem expects a Props instance, which provides a means of configuring newly created actors, and, optionally, a name for your actor instance. We are using the simplest form of creating such a Props instance, providing the apply method of the companion object with a type parameter. Akka will then create a new instance of the actor of the given type by calling its default constructor.

Be aware that the type of the object returned by actorOf is not Barista, but ActorRef. Actors never communicate with one another directly, and hence there are supposed to be no direct references to actor instances. Instead, actors or other components of your application acquire references to the actors they need to send messages to.

Thus, an ActorRef acts as some kind of proxy to the actual actor. This is convenient because an ActorRef can be serialized, allowing it to be a proxy for a remote actor on some other machine. For the component acquiring an ActorRef, the location of the actor – local in the same JVM or remote on some other machine – is completely transparent. We call this property location transparency.

Please note that ActorRef is not parameterized by type. Any ActorRef can be exchanged for another, allowing you to send arbitrary messages to any ActorRef. This is by design and, as already mentioned above, allows for easily modifying the topology of your actor system without having to make any changes to the senders.

Sending messages

Now that we have created an instance of our Barista actor and got an ActorRef linked to it, we can send it a message. This is done by calling the ! method on the ActorRef:

barista ! CappuccinoRequest
barista ! EspressoRequest
println("I ordered a cappuccino and an espresso")

Calling the ! is a fire-and-forget operation: You tell the Barista that you want a cappuccino, but you don’t wait for their response. It’s the most common way in Akka for interacting with other actors. By calling this method, you tell Akka to enqueue your message in the recipient’s mailbox. As described above, this doesn’t block, and eventually the recipient actor will process your message.

Due to the asynchronous nature, the result of the above code is not deterministic. It might look like this:

I have to prepare a cappuccino!
I ordered a cappuccino and an espresso
Let's prepare an espresso.

Even though we first sent the two messages to the Barista actor’s mailbox, between the processing of the first and second message, our own output is printed to the console.

Answering to messages

Sometimes, being able to tell others what to do just doesn’t cut it. You would like to be able to answer by in turn sending a message to the sender of a message you got – all asynchronously of course.

To enable you to do that and lots of other things that are of no concern to us right now, actors have a method called sender, which returns the ActorRef of the sender of the last message, i.e. the one you are currently processing.

But how does it know about that sender? The answer can be found in the signature of the ! method, which has a second, implicit parameter list:

def !(message: Any)(implicit sender: ActorRef = Actor.noSender): Unit

When called from an actor, its ActorRef is passed on as the implicit sender argument.

Let’s change our Barista so that they immediately send a Bill to the sender of a CoffeeRequest before printing their usual output to the console:

case class Bill(cents: Int)
case object ClosingTime
class Barista extends Actor {
  def receive = {
    case CappuccinoRequest =>
      sender ! Bill(250)
      println("I have to prepare a cappuccino!")
    case EspressoRequest =>
      sender ! Bill(200)
      println("Let's prepare an espresso.")
    case ClosingTime => context.system.shutdown()
  }
}

While we are at it, we are introducing a new message, ClosingTime. The Barista reacts to it by shutting down the actor system, which they, like all actors, can access via their ActorContext.

Now, let’s introduce a second actor representing a customer:

case object CaffeineWithdrawalWarning
class Customer(caffeineSource: ActorRef) extends Actor {
  def receive = {
    case CaffeineWithdrawalWarning => caffeineSource ! EspressoRequest
    case Bill(cents) => println(s"I have to pay $cents cents, or else!")
  }
}

This actor is a real coffee junkie, so it needs to be able to order new coffee. We pass it an ActorRef in the constructor – for the Customer, this is simply its caffeineSource – it doesn’t know whether this ActorRef points to a Barista or something else. It knows that it can send CoffeeRequest messages to it, and that is all that matters to it.

Finally, we need to create these two actors and send the customer a CaffeineWithdrawalWarning to get things rolling:

val barista = system.actorOf(Props[Barista], "Barista")
val customer = system.actorOf(Props(classOf[Customer], barista), "Customer")
customer ! CaffeineWithdrawalWarning
barista ! ClosingTime

Here, for the Customer actor, we are using a different factory method for creating a Props instance: We pass in the type of the actor we want to have instantiated as well as the constructor arguments that actor takes. We need to do this because we want to pass the ActorRef of our Barista actor to the constructor of the Customer actor.

Sending the CaffeineWithdrawalWarning to the customer makes it send an EspressoRequest to the barista who will then send a Bill back to the customer. The output of this may look like this:

Let's prepare an espresso.
I have to pay 200 cents, or else!

First, while processing the EspressoRequest message, the Barista sends a message to the sender of that message, the Customer actor. However, this operation doesn’t block until the latter processes it. The Barista actor can continue processing the EspressoRequest immediately, and does this by printing to the console. Shortly after, the Customer starts to process the Bill message and in turn prints to the console.

Asking questions

Sometimes, sending an actor a message and expecting a message in return at some later time isn’t an option – the most common place where this is the case is in components that need to interface with actors, but are not actors themselves. Living outside of the actor world, they cannot receive messages.

For situations such as these, there is Akka’s ask support, which provides some sort of bridge between actor-based and future-based concurrency. From the client perspective, it works like this:

import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
import scala.concurrent.Future
implicit val timeout = Timeout(2.second)
implicit val ec = system.dispatcher
val f: Future[Any] = barista2 ? CappuccinoRequest
f.onSuccess {
  case Bill(cents) => println(s"Will pay $cents cents for a cappuccino")
}

First, you need to import support for the ask syntax and create an implicit timeout for the Future returned by the ? method. Also, the Future needs an ExecutionContext. Here, we simply use the default dispatcher of our ActorSystem, which is conveniently also an ExecutionContext.

As you can see, the returned future is untyped – it’s a Future[Any]. This shouldn’t come as a surprise, since it’s really a received message from an actor, and those are untyped, too.

For the actor that is being asked, this is actually the same as sending some message to the sender of a processed message. This is why asking our Barista works out of the box without having to change anything in our Barista actor.

Once the actor being asked sends a message to the sender, the Promise belonging to the returned Future is completed.

Generally, telling is preferable to asking, because it’s more resource-sparing. Akka is not for polite people! However, there are situations where you really need to ask, and then it’s perfectly fine to do so.

Stateful actors

Each actor may maintain an internal state, but that’s not strictly necessary. Sometimes, a large part of the overall application state consists of the information carried by the immutable messages passed between actors.

An actor only ever processes one message at a time. While doing so, it may modify its internal state. This means that there is some kind of mutable state in an actor, but since each message is processed in isolation, there is no way the internal state of our actor can get messed up due to concurrency problems.

To illustrate, let’s turn our stateless Barista into an actor carrying state, by simply counting the number of orders:

class Barista extends Actor {
  var cappuccinoCount = 0
  var espressoCount = 0
  def receive = {
    case CappuccinoRequest =>
      sender ! Bill(250)
      cappuccinoCount += 1
      println(s"I have to prepare cappuccino #$cappuccinoCount")
    case EspressoRequest =>
      sender ! Bill(200)
      espressoCount += 1
      println(s"Let's prepare espresso #$espressoCount.")
    case ClosingTime => context.system.shutdown()
  }
}

We introduced two vars, cappuccinoCount and espressoCount, that are incremented with each respective order. This is actually the first time in this series that we have used a var. While vars are to be avoided in functional programming, they are really the only way to allow your actors to carry state. Since each message is processed in isolation, the code above is similar to using AtomicInteger values in a non-actor environment.
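
For comparison, here is a rough sketch of what counting orders safely outside of an actor could look like with AtomicInteger – hypothetical code that is not part of our coffee house example:

import java.util.concurrent.atomic.AtomicInteger

class OrderCounters {
  private val cappuccinoCount = new AtomicInteger(0)
  private val espressoCount = new AtomicInteger(0)
  // incrementAndGet is atomic, so concurrent callers cannot corrupt the counts
  def cappuccinoOrdered(): Int = cappuccinoCount.incrementAndGet()
  def espressoOrdered(): Int = espressoCount.incrementAndGet()
}

Inside the actor, plain vars suffice, because the actor itself serializes access to them.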

Conclusion

And here ends our introduction to the actor programming model for concurrency and how to work within this paradigm using Akka. While we have really only scratched the surface and have ignored some important concepts of Akka, I hope to have given enough of an insight into this approach to concurrency to give you a basic understanding and get you interested in learning more.

In the coming articles, I will elaborate our little example, adding some meaningful behaviour to it while introducing more of the ideas behind Akka actors, among them the question of how errors are handled in an actor system.

P.S. Please note that starting with this article I have switched to a biweekly schedule for the remaining parts of this series.

The Neophyte’s Guide to Scala Part 13: Path-dependent Types

In last week’s article, I introduced you to the idea of type classes – a pattern that allows you to design your programs to be open for extension without giving up important information about concrete types. This week, I’m going to stick with Scala’s type system and talk about one of its features that distinguishes it from most other mainstream programming languages: Scala’s form of dependent types, in particular path-dependent types and dependent method types.

One of the most widely used arguments against static typing is that “the compiler is just in the way” and that in the end, it’s all only data, so why care about building up a complex hierarchy of types?

In the end, having static types is all about preventing bugs by allowing the über-smart compiler to humiliate you on a regular basis, making sure you’re doing the right thing before it’s too late.

Using path-dependent types is one powerful way to help the compiler prevent you from introducing bugs, as it places logic that is usually only available at runtime into types.

Sometimes, accidentally introducing path-dependent types can lead to frustration, though, especially if you have never heard of them. Hence, it’s definitely a good idea to get familiar with them, whether you decide to put them to use or not.

The problem

I will start by presenting a problem that path-dependent types can help us solve: In the realm of fan fiction, the most atrocious things happen – usually, the involved characters will end up making out with each other, regardless of how inappropriate it is. There is even crossover fan fiction, in which two characters from different franchises are making out with each other.

However, elitist fan fiction writers look down on this. Surely there is a way to prevent such wrongdoing! Here is a first version of our domain model:

object Franchise {
  case class Character(name: String)
}
class Franchise(name: String) {
  import Franchise.Character
  def createFanFiction(
    lovestruck: Character,
    objectOfDesire: Character): (Character, Character) = (lovestruck, objectOfDesire)
}

Characters are represented by instances of the Character case class, and the Franchise class has a method to create a new piece of fan fiction about two characters. Let’s create two franchises and some characters:

val starTrek = new Franchise("Star Trek")
val starWars = new Franchise("Star Wars")

val quark = Franchise.Character("Quark")
val jadzia = Franchise.Character("Jadzia Dax")

val luke = Franchise.Character("Luke Skywalker")
val yoda = Franchise.Character("Yoda")

Unfortunately, at the moment we are unable to prevent bad things from happening:

starTrek.createFanFiction(lovestruck = jadzia, objectOfDesire = luke)

Horror of horrors! Someone has created a piece of fan fiction in which Jadzia Dax is making out with Luke Skywalker. Preposterous! Clearly, we should not allow this. Your first intuition might be to somehow check at runtime that two characters making out are from the same franchise. For example, we could change the model like so:

object Franchise {
  case class Character(name: String, franchise: Franchise)
}
class Franchise(name: String) {
  import Franchise.Character
  def createFanFiction(
      lovestruck: Character,
      objectOfDesire: Character): (Character, Character) = {
    require(lovestruck.franchise == objectOfDesire.franchise)
    (lovestruck, objectOfDesire)
  }
}

Now, the Character instances have a reference to their Franchise, and trying to create a fan fiction with characters from different franchises will lead to an IllegalArgumentException (feel free to try this out in a REPL).
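
Note that in this revised model, characters have to be created with a reference to their franchise. Here is a minimal sketch of how the runtime check blows up – the require call fails with the default "requirement failed" message:

val starTrek = new Franchise("Star Trek")
val starWars = new Franchise("Star Wars")
val jadzia = Franchise.Character("Jadzia Dax", starTrek)
val luke = Franchise.Character("Luke Skywalker", starWars)
// throws java.lang.IllegalArgumentException: requirement failed
starTrek.createFanFiction(lovestruck = jadzia, objectOfDesire = luke)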

Safer fiction with path-dependent types

This is pretty good, isn’t it? It’s the kind of fail-fast behaviour we have been indoctrinated with for years. However, with Scala, we can do better. There is a way to fail even faster – not at runtime, but at compile time. To achieve that, we need to encode the connection between a Character and its Franchise at the type level.

Luckily, the way Scala’s nested types work allows us to do that. In Scala, a nested type is bound to a specific instance of the outer type, not to the outer type itself. This means that if you try to use an instance of the inner type outside of the instance of the enclosing type, you will face a compile error:

class A {
  class B
  var b: Option[B] = None
}
val a1 = new A
val a2 = new A
val b1 = new a1.B
val b2 = new a2.B
a1.b = Some(b1)
a2.b = Some(b1) // does not compile

You cannot simply assign an instance of the B that is bound to a2 to the field on a1 – the one is an a2.B, the other expects an a1.B. The dot syntax represents the path to the type, going along concrete instances of other types. Hence the name, path-dependent types.

We can put these to use in order to prevent characters from different franchises making out with each other:

class Franchise(name: String) {
  case class Character(name: String)
  def createFanFictionWith(
    lovestruck: Character,
    objectOfDesire: Character): (Character, Character) = (lovestruck, objectOfDesire)
}

Now, the type Character is nested in the type Franchise, which means that it is dependent on a specific enclosing instance of the Franchise type.

Let’s create our example franchises and characters again:

val starTrek = new Franchise("Star Trek")
val starWars = new Franchise("Star Wars")

val quark = starTrek.Character("Quark")
val jadzia = starTrek.Character("Jadzia Dax")

val luke = starWars.Character("Luke Skywalker")
val yoda = starWars.Character("Yoda")

You can already see from the way our Character instances are created that their types are bound to a specific franchise. Let’s see what happens if we try to put some of these characters together:

starTrek.createFanFictionWith(lovestruck = quark, objectOfDesire = jadzia)
starWars.createFanFictionWith(lovestruck = luke, objectOfDesire = yoda)

These two compile, as expected. They are tasteless, but what can we do?

Now, let’s see what happens if we try to create some fan fiction about Jadzia Dax and Luke Skywalker:

starTrek.createFanFictionWith(lovestruck = jadzia, objectOfDesire = luke)

Et voilà: The thing that should not be does not even compile! The compiler complains about a type mismatch:

found   : starWars.Character
required: starTrek.Character
               starTrek.createFanFictionWith(lovestruck = jadzia, objectOfDesire = luke)
                                                                                   ^

This technique also works if our method is not defined on the Franchise class, but in some other module. In this case, we can make use of dependent method types, where the type of one parameter depends on a previous parameter:

def createFanFiction(f: Franchise)(lovestruck: f.Character, objectOfDesire: f.Character) =
  (lovestruck, objectOfDesire)

As you can see, the type of the lovestruck and objectOfDesire parameters depends on the Franchise instance passed to the method. Note that this only works if the instance on which other types depend is in its own parameter list.
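
To illustrate, calling this method with two characters from the franchise passed in as the first argument compiles, whereas mixing franchises is rejected at compile time again – a quick sketch, assuming the starTrek and starWars instances from above:

createFanFiction(starTrek)(lovestruck = jadzia, objectOfDesire = quark)
createFanFiction(starTrek)(lovestruck = jadzia, objectOfDesire = luke) // does not compile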

Abstract type members

Often, dependent method types are used in conjunction with abstract type members. Suppose we want to develop a hipsterrific key-value store. It will only support setting and getting the value for a key, but in a typesafe manner. Here is our oversimplified implementation:

object AwesomeDB {
  abstract class Key(name: String) {
    type Value
  }
}
import AwesomeDB.Key
class AwesomeDB {
  import collection.mutable.Map
  val data = Map.empty[Key, Any]
  def get(key: Key): Option[key.Value] = data.get(key).asInstanceOf[Option[key.Value]]
  def set(key: Key)(value: key.Value): Unit = data.update(key, value)
}

We have defined a class Key with an abstract type member Value. The methods on AwesomeDB refer to that type without ever knowing or caring about the specific manifestation of this abstract type.

We can now define some concrete keys that we want to use:

trait IntValued extends Key {
  type Value = Int
}
trait StringValued extends Key {
  type Value = String
}
object Keys {
  val foo = new Key("foo") with IntValued
  val bar = new Key("bar") with StringValued
}

Now we can set and get key/value pairs in a typesafe manner:

val dataStore = new AwesomeDB
dataStore.set(Keys.foo)(23)
val i: Option[Int] = dataStore.get(Keys.foo)
dataStore.set(Keys.foo)("23") // does not compile

Path-dependent types in practice

While path-dependent types are not necessarily omnipresent in your typical Scala code, they do have a lot of practical value beyond modelling the domain of fan fiction.

One of the most widespread uses is probably seen in combination with the cake pattern, which is a technique for composing your components and managing their dependencies, relying solely on features of the language. See the excellent articles by Debasish Ghosh and Precog’s Daniel Spiewak to learn more about both the cake pattern and how it can be improved by incorporating path-dependent types.

In general, whenever you want to make sure that objects created or managed by a specific instance of another type cannot accidentally or purposely be interchanged or mixed, path-dependent types are the way to go.

Path-dependent types and dependent method types play a crucial role for attempts to encode information into types that is typically only known at runtime, for instance heterogeneous lists, type-level representations of natural numbers and collections that carry their size in their type. Miles Sabin is exploring the limits of Scala’s type system in this respect in his excellent library Shapeless.

The Neophyte’s Guide to Scala Part 12: Type Classes

After having discussed several functional programming techniques for keeping things DRY and flexible in the last two weeks, in particular function composition, partial function application, and currying, we are going to stick with the general notion of making your code as flexible as possible.

However, this time, we are not looking so much at how you can leverage functions as first-class objects to achieve this goal. Instead, this article is all about using the type system in such a manner that it’s not in the way, but rather supports you in keeping your code extensible: You’re going to learn about type classes.

You might think that this is some exotic idea without practical relevance, brought into the Scala community by some vocal Haskell fanatics. This is clearly not the case. Type classes have become an important part of the Scala standard library and even more so of many popular and commonly used third-party open-source libraries, so it’s generally a good idea to make yourself familiar with them.

I will discuss the idea of type classes, why they are useful, how to benefit from them as a client, and how to implement your own type classes and put them to use for great good.

The problem

Instead of starting off by giving an abstract explanation of what type classes are, let’s tackle this subject by means of an – admittedly simplified, but nevertheless reasonably practical – example.

Imagine that we want to write a fancy statistics library. This means we want to provide a bunch of functions that operate on collections of numbers, mostly to compute some aggregate values for them. Imagine further that we are restricted to accessing an element from such a collection by index and to using the reduce method defined on Scala collections. We impose this restriction on ourselves because we are going to re-implement a little bit of what the Scala standard library already provides – simply because it’s a nice example without many distractions, and it’s small enough for a blog post. Finally, our implementation assumes that the values we get are already sorted.

We will start with a very crude implementation of median, quartiles, iqr, and mean for numbers of type Double:

object Statistics {
  def median(xs: Vector[Double]): Double = xs(xs.size / 2)
  def quartiles(xs: Vector[Double]): (Double, Double, Double) =
    (xs(xs.size / 4), median(xs), xs(xs.size / 4 * 3))
  def iqr(xs: Vector[Double]): Double = quartiles(xs) match {
    case (lowerQuartile, _, upperQuartile) => upperQuartile - lowerQuartile
  }
  def mean(xs: Vector[Double]): Double = {
    xs.reduce(_ + _) / xs.size
  }
}

The median cuts a data set in half, whereas the lower and upper quartile (the first and third element of the tuple returned by our quartiles method) split off the lowest and highest 25 percent of the data, respectively. Our iqr method returns the interquartile range, which is the difference between the upper and lower quartile.

Now, of course, we want to support more than just double numbers. So let’s implement all these methods again for Int numbers, right?

Well, no! First of all, that would be a tiny little bit repetitious, wouldn’t it? Also, in situations such as these, we quickly run into situations where we cannot overload a method without some dirty tricks, because the type parameter suffers from type erasure.

If only Int and Double extended a common base class or implemented a common trait like Number! We could be tempted to change the type required and returned by our methods to that more general type. Our method signatures would look like this:

object Statistics {
  def median(xs: Vector[Number]): Number = ???
  def quartiles(xs: Vector[Number]): (Number, Number, Number) = ???
  def iqr(xs: Vector[Number]): Number = ???
  def mean(xs: Vector[Number]): Number = ???
}

Thankfully, in this case there is no such common trait, so we aren’t tempted to walk this road at all. However, in other cases, that might very well be the case – and still be a bad idea. Not only do we drop previously available type information, we also close our API against future extensions to types whose sources we don’t control: We cannot make some new number type coming from a third party extend the Number trait.

Ruby’s answer to that problem is monkey patching, polluting the global namespace with an extension to that new type, making it act like a Number after all. Java developers who have got beaten up by the Gang of Four in their youth, on the other hand, will think that an Adapter may solve all of their problems:

object Statistics {
  trait NumberLike[A] {
    def get: A
    def plus(y: NumberLike[A]): NumberLike[A]
    def minus(y: NumberLike[A]): NumberLike[A]
    def divide(y: Int): NumberLike[A]
  }
  case class NumberLikeDouble(x: Double) extends NumberLike[Double] {
    def get: Double = x
    def minus(y: NumberLike[Double]) = NumberLikeDouble(x - y.get)
    def plus(y: NumberLike[Double]) = NumberLikeDouble(x + y.get)
    def divide(y: Int) = NumberLikeDouble(x / y)
  }
  type Quartile[A] = (NumberLike[A], NumberLike[A], NumberLike[A])
  def median[A](xs: Vector[NumberLike[A]]): NumberLike[A] = xs(xs.size / 2)
  def quartiles[A](xs: Vector[NumberLike[A]]): Quartile[A] =
    (xs(xs.size / 4), median(xs), xs(xs.size / 4 * 3))
  def iqr[A](xs: Vector[NumberLike[A]]): NumberLike[A] = quartiles(xs) match {
    case (lowerQuartile, _, upperQuartile) => upperQuartile.minus(lowerQuartile)
  }
  def mean[A](xs: Vector[NumberLike[A]]): NumberLike[A] =
    xs.reduce(_.plus(_)).divide(xs.size)
}

Now we have solved the problem of extensibility: Users of our library can pass in a NumberLike adapter for Int (which we would likely provide ourselves) or for any possible type that might behave like a number, without having to recompile the module in which our statistics methods are implemented.

However, always wrapping your numbers in an adapter is not only tiresome to write and read, it also means that you have to create a lot of instances of your adapter classes when interacting with our library.

Type classes to the rescue!

A powerful alternative to the approaches outlined so far is, of course, to define and use a type class. Despite their name, type classes, one of the prominent features of the Haskell language, haven’t got anything to do with classes in object-oriented programming.

A type class C defines some behaviour in the form of operations that must be supported by a type T for it to be a member of type class C. Whether the type T is a member of the type class C is not inherent in the type. Rather, any developer can declare that a type is a member of a type class simply by providing implementations of the operations the type must support. Now, once T is made a member of the type class C, functions that have constrained one or more of their parameters to be members of C can be called with arguments of type T.

As such, type classes allow ad-hoc and retroactive polymorphism. Code that relies on type classes is open to extension without the need to create adapter objects.

Creating a type class

In Scala, type classes can be implemented and used by a combination of techniques. It’s a little more involved than in Haskell, but also gives developers more control.

Creating a type class in Scala involves several steps. First, let’s define a trait. This is the actual type class:

object Math {
  trait NumberLike[T] {
    def plus(x: T, y: T): T
    def divide(x: T, y: Int): T
    def minus(x: T, y: T): T
  }
}

We have created a type class called NumberLike. Type classes always take one or more type parameters, and they are usually designed to be stateless, i.e. the methods defined on our NumberLike trait operate only on the passed in arguments. In particular, where our adapter above operated on its member of type T and one argument, the methods defined for our NumberLike type class take two parameters of type T each – the member has become the first parameter of the operations supported by NumberLike.

Providing default members

The second step in implementing a type class is usually to provide some default implementations of your type class trait in its companion object. We will see in a moment why this is generally a good strategy. First, however, let’s do this, too, by making Double and Int members of our NumberLike type class:

object Math {
  trait NumberLike[T] {
    def plus(x: T, y: T): T
    def divide(x: T, y: Int): T
    def minus(x: T, y: T): T
  }
  object NumberLike {
    implicit object NumberLikeDouble extends NumberLike[Double] {
      def plus(x: Double, y: Double): Double = x + y
      def divide(x: Double, y: Int): Double = x / y
      def minus(x: Double, y: Double): Double = x - y
    }
    implicit object NumberLikeInt extends NumberLike[Int] {
      def plus(x: Int, y: Int): Int = x + y
      def divide(x: Int, y: Int): Int = x / y
      def minus(x: Int, y: Int): Int = x - y
    }
  }
}

Two things: First, you see that the two implementations are basically identical. That is not always the case when creating members of a type class. Our NumberLike type class just covers a rather narrow domain. Later in the article, I will give examples of type classes where there is a lot less room for duplication when implementing them for multiple types. Second, please ignore the fact that we are losing precision in NumberLikeInt by doing integer division. It’s all to keep things simple for this example.

As you can see, members of type classes are usually singleton objects. Also, please note the implicit keyword before each of the type class implementations. This is one of the crucial elements for making type classes possible in Scala, making type class members implicitly available under certain conditions. More about that in the next section.

Coding against type classes

Now that we have our type class and two default implementations for common types, we want to code against this type class in our statistics module. Let’s focus on the mean method for now:

object Statistics {
  import Math.NumberLike
  def mean[T](xs: Vector[T])(implicit ev: NumberLike[T]): T =
    ev.divide(xs.reduce(ev.plus(_, _)), xs.size)
}

This may look a little intimidating at first, but it’s actually quite simple. Our method takes a type parameter T and a single parameter of type Vector[T].

The idea of constraining a parameter to types that are members of a specific type class is realized by means of the implicit second parameter list. What does this mean? Basically, that a value of type NumberLike[T] must be implicitly available in the current scope. This is the case if an implicit value has been declared and made available in the current scope, very often by importing the package or object in which that implicit value is defined.

If and only if no other implicit value can be found, the compiler will look in the companion object of the type of the implicit parameter. Hence, as a library designer, putting your default type class implementations in the companion object of your type class trait means that users of your library can easily override these implementations with their own ones, which is exactly what you want. Users can also pass in an explicit value for an implicit parameter to override the implicit values that are in scope.
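
For example, a user who wants to force a specific instance can simply pass it in explicitly – a small sketch using the default member for Double directly:

val someNumbers = Vector(1.5, 2.5, 3.0)
// the explicitly passed instance takes precedence over anything in implicit scope
Statistics.mean(someNumbers)(Math.NumberLike.NumberLikeDouble)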

Let’s see if the default type class implementations can be resolved:

val numbers = Vector[Double](13, 23.0, 42, 45, 61, 73, 96, 100, 199, 420, 900, 3839)
println(Statistics.mean(numbers))

Wonderful! If we try this with a Vector[String], we get an error at compile time, stating that no implicit value could be found for parameter ev: NumberLike[String]. If you don’t like this error message, you can customize it by annotating your type class trait with the @implicitNotFound annotation:

object Math {
  import annotation.implicitNotFound
  @implicitNotFound("No member of type class NumberLike in scope for ${T}")
  trait NumberLike[T] {
    def plus(x: T, y: T): T
    def divide(x: T, y: Int): T
    def minus(x: T, y: T): T
  }
}

Context bounds

A second, implicit parameter list on all methods that expect a member of a type class can be a little verbose. As a shortcut for implicit parameters with only one type parameter, Scala provides so-called context bounds. To show how those are used, we are going to implement our other statistics methods using those instead:

object Statistics {
  import Math.NumberLike
  def mean[T](xs: Vector[T])(implicit ev: NumberLike[T]): T =
    ev.divide(xs.reduce(ev.plus(_, _)), xs.size)
  def median[T : NumberLike](xs: Vector[T]): T = xs(xs.size / 2)
  def quartiles[T: NumberLike](xs: Vector[T]): (T, T, T) =
    (xs(xs.size / 4), median(xs), xs(xs.size / 4 * 3))
  def iqr[T: NumberLike](xs: Vector[T]): T = quartiles(xs) match {
    case (lowerQuartile, _, upperQuartile) =>
      implicitly[NumberLike[T]].minus(upperQuartile, lowerQuartile)
  }
}

A context bound T : NumberLike means that an implicit value of type NumberLike[T] must be available, and so is really equivalent to having a second implicit parameter list with a NumberLike[T] in it. If you want to access that implicitly available value, however, you need to call the implicitly method, as we do in the iqr method. If your type class requires more than one type parameter, you cannot use the context bound syntax.
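
For instance, the mean method could be rewritten with a context bound as well – a sketch showing how implicitly gives us access to the type class instance (assuming the same import of Math.NumberLike):

def mean[T: NumberLike](xs: Vector[T]): T = {
  // summon the NumberLike[T] instance that the context bound guarantees to be in scope
  val ev = implicitly[NumberLike[T]]
  ev.divide(xs.reduce(ev.plus(_, _)), xs.size)
}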

Custom type class members

As a user of a library that makes use of type classes, you will sooner or later have types that you want to make members of those type classes. For instance, we might want to use the statistics library for instances of the Joda Time Duration type. To do that, we need Joda Time on our classpath, of course:

libraryDependencies += "joda-time" % "joda-time" % "2.1"

libraryDependencies += "org.joda" % "joda-convert" % "1.3"

Now we just have to create an implicitly available implementation of NumberLike (please make sure you have Joda Time on your classpath when trying this out):

object JodaImplicits {
  import Math.NumberLike
  import org.joda.time.Duration
  implicit object NumberLikeDuration extends NumberLike[Duration] {
    def plus(x: Duration, y: Duration): Duration = x.plus(y)
    def divide(x: Duration, y: Int): Duration = Duration.millis(x.getMillis / y)
    def minus(x: Duration, y: Duration): Duration = x.minus(y)
  }
}

If we import the package or object containing this NumberLike implementation, we can now compute the mean value for a bunch of durations:

import Statistics._
import JodaImplicits._
import org.joda.time.Duration._

val durations = Vector(standardSeconds(20), standardSeconds(57), standardMinutes(2),
  standardMinutes(17), standardMinutes(30), standardMinutes(58), standardHours(2),
  standardHours(5), standardHours(8), standardHours(17), standardDays(1),
  standardDays(4))
println(mean(durations).getStandardHours)

Use cases

Our NumberLike type class was a nice exercise, but Scala already ships with the Numeric type class, which allows you to call methods like sum or product on collections for whose type T a Numeric[T] is available. Another type class in the standard library that you will use a lot is Ordering, which allows you to provide an implicit ordering for your own types, available to the sort method on Scala’s collections.
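
Here is a brief sketch of these two standard library type classes in action – Numeric powering sum on a list of Ints, and a hand-rolled Ordering making a custom type sortable (the Song class is made up for this example):

val total = List(1, 2, 3, 4).sum // works because an implicit Numeric[Int] is in scope

case class Song(title: String, plays: Int)
// an Ordering for Song, picked up implicitly by the sorted method
implicit val byPlays: Ordering[Song] = Ordering.by[Song, Int](_.plays)
val chart = List(Song("A", 10), Song("B", 99), Song("C", 3)).sorted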

There are more type classes in the standard library, but not all of them are ones you have to deal with on a regular basis as a Scala developer.

A very common use case in third-party libraries is that of object serialization and deserialization, most notably to and from JSON. By making your classes members of an appropriate formatter type class, you can customize the way your classes are serialized to JSON, XML or whatever format is currently the new black.
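
To give you a rough idea of what such a formatter type class might look like, here is a deliberately tiny, hypothetical sketch – the names are invented for this example and not taken from any particular library:

trait JsonWriter[T] {
  def write(value: T): String
}
object JsonWriter {
  implicit object StringJsonWriter extends JsonWriter[String] {
    def write(value: String): String = "\"" + value + "\""
  }
}
// any type with a JsonWriter in implicit scope can be serialized
def toJson[T](value: T)(implicit writer: JsonWriter[T]): String = writer.write(value)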

Mapping between Scala types and ones supported by your database driver is also commonly made customizable and extensible via type classes.

Summary

Once you start to do some serious work with Scala, you will inevitably stumble upon type classes. I hope that after reading this article, you are prepared to take advantage of this powerful technique.

Scala type classes allow you to develop your Scala code in such a way that it’s open for retroactive extension while retaining as much concrete type information as possible. In contrast to approaches from other languages, they give developers full control, as default type class implementations can be overridden without much hassle, and type class implementations are not made available in the global namespace.

You will see that this technique is especially useful when writing libraries intended to be used by others, but type classes also have their use in application code to decrease coupling between modules.

The Neophyte’s Guide to Scala Part 11: Currying and Partially Applied Functions

Last week’s article was all about avoiding code duplication, either by lifting existing functions to match your new requirements or by composing them. In this article, we are going to have a look at two other mechanisms the Scala language provides in order to enable you to reuse your functions: Partial application of functions and currying.

Partially applied functions

Scala, like many other languages following the functional programming paradigm, allows you to apply a function partially. What this means is that, when applying the function, you do not pass in arguments for all of the parameters defined by the function, but only for some of them, leaving the remaining ones blank. What you get back is a new function whose parameter list only contains those parameters from the original function that were left blank.

Do not confuse partially applied functions with partially defined functions, which are represented by the PartialFunction type in Scala.

To illustrate how partial function application works, let’s revisit our example from last week: For our imaginary free mail service, we wanted to allow the user to configure a filter so that only emails meeting certain criteria would show up in their inbox, with all others being blocked.

Our Email case class still looks like this:

case class Email(
  subject: String,
  text: String,
  sender: String,
  recipient: String)
type EmailFilter = Email => Boolean

The criteria for filtering emails were represented by a predicate Email => Boolean, which we aliased to the type EmailFilter, and we were able to generate new predicates by calling appropriate factory functions.

Two of the factory functions from last week’s article created EmailFilter instances that checked if the email text satisfied a given minimum or maximum length. This time we want to make use of partial function application to implement these factory functions. We want to have a general method sizeConstraint and be able to create more specific size constraints by fixing some of its parameters.

Here is our revised sizeConstraint method:

type IntPairPred = (Int, Int) => Boolean
def sizeConstraint(pred: IntPairPred, n: Int, email: Email) = pred(email.text.size, n)

We also define an alias IntPairPred for the type of predicate that checks a pair of integers (the value n and the text size of the email) and returns whether the email text size is okay, given n.

Note that unlike our sizeConstraint function from last week, this one does not return a new EmailFilter predicate, but simply evaluates all the arguments passed to it, returning a Boolean. The trick is to get such a predicate of type EmailFilter by partially applying sizeConstraint.

First, however, because we take the DRY principle very seriously, let’s define all the commonly used instances of IntPairPred. Now, when we call sizeConstraint, we don’t have to repeatedly write the same anonymous functions, but can simply pass in one of those:

val gt: IntPairPred = _ > _
val ge: IntPairPred = _ >= _
val lt: IntPairPred = _ < _
val le: IntPairPred = _ <= _
val eq: IntPairPred = _ == _

Finally, we are ready to do some partial application of the sizeConstraint function, fixing its first parameter with one of our IntPairPred instances:

val minimumSize: (Int, Email) => Boolean = sizeConstraint(ge, _: Int, _: Email)
val maximumSize: (Int, Email) => Boolean = sizeConstraint(le, _: Int, _: Email)

As you can see, you have to use the placeholder _ for all parameters not bound to an argument value. Unfortunately, you also have to specify the type of those arguments, which makes partial function application in Scala a bit tedious.

The reason is that the Scala compiler cannot infer these types, at least not in all cases – think of overloaded methods where it’s impossible for the compiler to know which of them you are referring to.

On the other hand, this makes it possible to bind or leave out arbitrary parameters. For example, we can leave out the first one and pass in the size to be used as a constraint:

val constr20: (IntPairPred, Email) => Boolean = sizeConstraint(_: IntPairPred, 20, _: Email)
val constr30: (IntPairPred, Email) => Boolean = sizeConstraint(_: IntPairPred, 30, _: Email)

Now we have two functions that take an IntPairPred and an Email and compare the email text size to 20 and 30, respectively, but the comparison logic has not been specified yet, as that’s exactly what the IntPairPred is good for.

This shows that, while quite verbose, partial function application in Scala is a little more flexible than in Clojure, for example, where you have to pass in arguments from left to right, but can’t leave out any in the middle.

From methods to function objects

When doing partial application on a method, you can also decide to not bind any parameters whatsoever. The parameter list of the returned function object will be the same as for the method. You have effectively turned a method into a function that can be assigned to a val or passed around:

val sizeConstraintFn: (IntPairPred, Int, Email) => Boolean = sizeConstraint _

Producing those EmailFilters

We still haven’t got any functions that adhere to the EmailFilter type or that return new predicates of that type – just like sizeConstraint, the minimumSize and maximumSize functions don’t return a new predicate, but a Boolean, as their signatures clearly show.

However, our email filters are just another partial function application away. By fixing the integer parameter of minimumSize and maximumSize, we can create new functions of type EmailFilter:

val min20: EmailFilter = minimumSize(20, _: Email)
val max20: EmailFilter = maximumSize(20, _: Email)

Of course, we could achieve the same by partially applying our constr20 we created above:

val min20: EmailFilter = constr20(ge, _: Email)
val max20: EmailFilter = constr20(le, _: Email)

Spicing up your functions

Maybe you find partial function application in Scala a little too verbose, or simply not very elegant to write and look at. Lucky you, because there is an alternative.

As you should know by now, methods in Scala can have more than one parameter list. Let’s define our sizeConstraint method such that each parameter is in its own parameter list:

def sizeConstraint(pred: IntPairPred)(n: Int)(email: Email): Boolean =
  pred(email.text.size, n)

If we turn this into a function object that we can assign or pass around, the signature of that function looks like this:

val sizeConstraintFn: IntPairPred => Int => Email => Boolean = sizeConstraint _

Such a chain of one-parameter functions is called a curried function, so named after Haskell Curry, who re-discovered this technique and got all the fame for it. In fact, in the Haskell programming language, all functions are in curried form by default.

In our example, it takes an IntPairPred and returns a function that takes an Int and returns a new function. That last function, finally, takes an Email and returns a Boolean.

Now, if we want to bind the IntPairPred, we simply apply sizeConstraintFn, which takes exactly this one parameter and returns a new one-parameter function:

val minSize: Int => Email => Boolean = sizeConstraintFn(ge)
val maxSize: Int => Email => Boolean = sizeConstraintFn(le)

There is no need to use any placeholders for parameters left blank, because we are in fact not doing any partial function application.

We can now create the same EmailFilter predicates as we did using partial function application before, by applying these curried functions:

val min20: Email => Boolean = minSize(20)
val max20: Email => Boolean = maxSize(20)

Of course, it’s possible to do all this in one step if you want to bind several parameters at once. It just means that you immediately apply the function that was returned from the first function application, without assigning it to a val first:

val min20: Email => Boolean = sizeConstraintFn(ge)(20)
val max20: Email => Boolean = sizeConstraintFn(le)(20)

Currying existing functions

It’s not always the case that you know beforehand whether writing your functions in curried form makes sense or not – after all, the usual function application looks a little more verbose than for functions that have only declared a single parameter list for all their parameters. Also, you’ll sometimes want to work with functions in curried form even though a third party wrote them with a single parameter list.

Transforming a function with multiple parameters in one list to curried form is, of course, just another higher-order function, generating a new function from an existing one. This transformation logic is available in the form of the curried method on functions with more than one parameter. Hence, if we have a function sum, taking two parameters, we can get the curried version simply by calling its curried method:

val sum: (Int, Int) => Int = _ + _
val sumCurried: Int => Int => Int = sum.curried

If you need to do the reverse, you have Function.uncurried at your fingertips, which you need to pass the curried function to get it back in uncurried form.
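
A quick sketch of going back and forth between the two forms:

val sumUncurried: (Int, Int) => Int = Function.uncurried(sumCurried)
sumUncurried(1, 2) // 3, the same result as sumCurried(1)(2)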

Injecting your dependencies the functional way

To close this article, let’s have a look at what role curried functions can play in the large. If you come from the enterprisy Java or .NET world, you will be very familiar with the necessity to use more or less fancy dependency injection containers that take the heavy burden of providing all your objects with their respective dependencies off you. In Scala, you don’t really need any external tool for that, as the language already provides numerous features that make it much less of a pain to do all this on your own.

When programming in a very functional way, you will notice that there is still a need to inject dependencies: Your functions residing in the higher layers of your application will have to make calls to other functions. Simply hard-coding the functions to call in the body of your functions makes it difficult to test them in isolation. Hence, you will need to pass the functions your higher-layer function depends on as arguments to that function.

It would not be DRY at all to always pass the same dependencies to your function when calling it, would it? Curried functions to the rescue! Currying and partial function application are among several ways of injecting dependencies in functional programming.

The following, very simplified example illustrates this technique:

case class User(name: String)
trait EmailRepository {
  def getMails(user: User, unread: Boolean): Seq[Email]
}
trait FilterRepository {
  def getEmailFilter(user: User): EmailFilter
}
trait MailboxService {
  def getNewMails(emailRepo: EmailRepository)(filterRepo: FilterRepository)(user: User) =
    emailRepo.getMails(user, true).filter(filterRepo.getEmailFilter(user))
  val newMails: User => Seq[Email]
}

We have a service that depends on two different repositories. These dependencies are declared as parameters to the getNewMails method, each in their own parameter list.

The MailboxService already has a concrete implementation of that method, but is lacking one for the newMails field. The type of that field is User => Seq[Email] – that’s the function that components depending on the MailboxService will call.

We need an object that extends MailboxService. The idea is to implement newMails by currying the getNewMails method and fixing it with concrete implementations of the dependencies, EmailRepository and FilterRepository:

object MockEmailRepository extends EmailRepository {
  def getMails(user: User, unread: Boolean): Seq[Email] = Nil
}
object MockFilterRepository extends FilterRepository {
  def getEmailFilter(user: User): EmailFilter = _ => true
}
object MailboxServiceWithMockDeps extends MailboxService {
  val newMails: (User) => Seq[Email] =
    getNewMails(MockEmailRepository)(MockFilterRepository) _
}

We can now call MailboxServiceWithMockDeps.newMails(User("daniel")) without having to specify the two repositories to be used. In a real application, of course, we would very likely not use a direct reference to a concrete implementation of the service, but have this injected, too.

This is probably not the most powerful and scalable way of injecting your dependencies in Scala, but it’s good to have this in your tool belt, and it’s a very good example of the benefits that partial function application and currying can provide in the wild. If you want to know more about this, I recommend having a look at the excellent slides for the presentation “Dependency Injection in Scala” by Debasish Ghosh, which is also where I first came across this technique.

Summary

In this article, we discussed two additional functional programming techniques that help you keep your code free of duplication and, on top of that, give you a lot of flexibility, allowing you to reuse your functions in manifold ways. Partial function application and currying have more or less the same effect, but sometimes one of them is the more elegant solution.

In the next part of this series, we will continue to look at ways to keep things flexible, discussing the what and how of type classes in Scala.

The Neophyte’s Guide to Scala Part 10: Staying DRY With Higher-order Functions

In the previous articles, I discussed the composable nature of Scala’s container types. As it turns out, being composable is a quality that you will not only find in Future, Try, and other container types, but also in functions, which are first class citizens in the Scala language.

Composability naturally entails reusability. While the latter has often been claimed to be one of the big advantages of object-oriented programming, it’s a trait that is definitely true for pure functions, i.e. functions that do not have any side-effects and are referentially transparent.

One obvious way is to implement a new function by calling already existing functions in its body. However, there are other ways to reuse existing functions: In this blog post, I will discuss some fundamentals of functional programming that we have been missing out on so far. You will learn how to follow the DRY principle by leveraging higher-order functions in order to reuse existing code in new contexts.

On higher-order functions

A higher-order function, as opposed to a first-order function, can have one of three forms:

  1. One or more of its parameters is a function, and it returns some value.
  2. It returns a function, but none of its parameters is a function.
  3. Both of the above: One or more of its parameters is a function, and it returns a function.

If you have followed this series, you have seen a lot of usages of higher-order functions of the first type: We called methods like map, filter, or flatMap and passed a function to it that was used to transform or filter a collection in some way. Very often, the functions we passed to these methods were anonymous functions, sometimes involving a little bit of duplication.

In this article, we are only concerned with what the other two types of higher-order functions can do for us: The first of them allows us to produce new functions based on some input data, whereas the other gives us the power and flexibility to compose new functions that are somehow based on existing functions. In both cases, we can eliminate code duplication.

And out of nowhere, a function was born

You might think that the ability to create new functions based on some input data is not terribly useful. While we mainly want to deal with how to compose new functions based on existing ones, let’s first have a look at how a function that produces new functions may be used.

Let’s assume we are implementing a freemail service where users should be able to configure when an email is supposed to be blocked. We are representing emails as instances of a simple case class:

case class Email(
  subject: String,
  text: String,
  sender: String,
  recipient: String)

We want to be able to filter new emails by the criteria specified by the user, so we have a filtering function that makes use of a predicate, a function of type Email => Boolean to determine whether the email is to be blocked. If the predicate is true, the email is accepted, otherwise it will be blocked:

type EmailFilter = Email => Boolean
def newMailsForUser(mails: Seq[Email], f: EmailFilter) = mails.filter(f)

Note that we are using a type alias for our function, so that we can work with more meaningful names in our code.

Now, in order to allow the user to configure their email filter, we can implement some factory functions that produce EmailFilter functions configured to the user’s liking:

val sentByOneOf: Set[String] => EmailFilter =
  senders => email => senders.contains(email.sender)
val notSentByAnyOf: Set[String] => EmailFilter =
  senders => email => !senders.contains(email.sender)
val minimumSize: Int => EmailFilter = n => email => email.text.size >= n
val maximumSize: Int => EmailFilter = n => email => email.text.size <= n

Each of these four vals is a function that returns an EmailFilter, the first two taking as input a Set[String] representing senders, the other two an Int representing the length of the email body.

We can use any of these functions to create a new EmailFilter that we can pass to the newMailsForUser function:

val emailFilter: EmailFilter = notSentByAnyOf(Set("johndoe@example.com"))
val mails = Email(
  subject = "It's me again, your stalker friend!",
  text = "Hello my friend! How are you?",
  sender = "johndoe@example.com",
  recipient = "me@example.com") :: Nil
newMailsForUser(mails, emailFilter) // returns an empty list

This filter removes the one mail in the list because our user decided to put the sender on their black list. We can use our factory functions to create arbitrary EmailFilter functions, depending on the user’s requirements.

Reusing existing functions

There are two problems with the current solution. First of all, there is quite a bit of duplication in the predicate factory functions above, when initially I told you that the composable nature of functions made it easy to stick to the DRY principle. So let’s get rid of the duplication.

To do that for the minimumSize and maximumSize, we introduce a function sizeConstraint that takes a predicate that checks if the size of the email body is okay. That size will be passed to the predicate by the sizeConstraint function:

type SizeChecker = Int => Boolean
val sizeConstraint: SizeChecker => EmailFilter = f => email => f(email.text.size)

Now we can express minimumSize and maximumSize in terms of sizeConstraint:

val minimumSize: Int => EmailFilter = n => sizeConstraint(_ >= n)
val maximumSize: Int => EmailFilter = n => sizeConstraint(_ <= n)

Function composition

For the other two predicates, sentByOneOf and notSentByAnyOf, we are going to introduce a very generic higher-order function that allows us to express one of the two functions in terms of the other.

Let’s implement a function complement that takes a predicate A => Boolean and returns a new function that always returns the opposite of the given predicate:

def complement[A](predicate: A => Boolean) = (a: A) => !predicate(a)

Now, for an existing predicate p we could get the complement by calling complement(p). However, sentByOneOf is not a predicate, but it returns one, namely an EmailFilter.

Scala functions provide two composing functions that will help us here: Given two functions f and g, f.compose(g) returns a new function that, when called, will first call g and then apply f on the result of it. Similarly, f.andThen(g) returns a new function that, when called, will apply g to the result of f.
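
To see the difference between the two, here is a tiny sketch with two unrelated toy functions on integers:

val double = (x: Int) => x * 2
val addOne = (x: Int) => x + 1
(double compose addOne)(3) // addOne first, then double: (3 + 1) * 2 = 8
(double andThen addOne)(3) // double first, then addOne: (3 * 2) + 1 = 7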

We can put this to use to create our notSentByAnyOf predicate without code duplication:

val notSentByAnyOf = sentByOneOf andThen(g => complement(g))

What this means is that we ask for a new function that first applies the sentByOneOf function to its arguments (a Set[String]) and then applies the complement function to the EmailFilter predicate returned by the former function. Using Scala’s placeholder syntax for anonymous functions, we could write this more concisely as:

val notSentByAnyOf = sentByOneOf andThen(complement(_))

Of course, you will now have noticed that, given a complement function, you could also implement the maximumSize predicate in terms of minimumSize instead of extracting a sizeConstraint function. However, the latter is more flexible, allowing you to specify arbitrary checks on the size of the mail body.

Composing predicates

Another problem with our email filters is that we can currently only pass a single EmailFilter to our newMailsForUser function. Certainly, our users want to configure multiple criteria. We need a way to create a composite predicate that returns true if any, none, or all of the predicates it consists of return true.

Here is one way to implement these functions:

def any[A](predicates: (A => Boolean)*): A => Boolean =
  a => predicates.exists(pred => pred(a))
def none[A](predicates: (A => Boolean)*) = complement(any(predicates: _*))
def every[A](predicates: (A => Boolean)*) = none(predicates.view.map(complement(_)): _*)

The any function returns a new predicate that, when called with an input a, checks if at least one of its predicates holds true for the value a. Our none function simply returns the complement of the predicate returned by any – if at least one predicate holds true, the condition for none is not satisfied. Finally, our every function works by checking that none of the complements to the predicates passed to it holds true.

We can now use this to create a composite EmailFilter that represents the user’s configuration:

val filter: EmailFilter = every(
    notSentByAnyOf(Set("johndoe@example.com")),
    minimumSize(100),
    maximumSize(10000)
  )

Composing a transformation pipeline

As another example of function composition, consider our example scenario again. As a freemail provider, we not only want to allow users to configure their email filter, but we also want to do some processing on emails sent by our users. These are simple functions Email => Email. Some possible transformations are the following:

val addMissingSubject = (email: Email) =>
  if (email.subject.isEmpty) email.copy(subject = "No subject")
  else email
val checkSpelling = (email: Email) =>
  email.copy(text = email.text.replaceAll("your", "you're"))
val removeInappropriateLanguage = (email: Email) =>
  email.copy(text = email.text.replaceAll("dynamic typing", "**CENSORED**"))
val addAdvertismentToFooter = (email: Email) =>
  email.copy(text = email.text + "\nThis mail sent via Super Awesome Free Mail")

Now, depending on the weather and the mood of our boss, we can configure our pipeline as required, either by multiple andThen calls, or, having the same effect, by using the chain method defined on the Function companion object:

val pipeline = Function.chain(Seq(
  addMissingSubject,
  checkSpelling,
  removeInappropriateLanguage,
  addAdvertismentToFooter))
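
Since the resulting pipeline is itself just a function of type Email => Email, applying it runs all four transformations in order. A small usage sketch:

val mail = Email(
  subject = "",
  text = "Check out your dynamic typing",
  sender = "boss@example.com",
  recipient = "me@example.com")
// yields an email with the subject "No subject" and a corrected, censored text plus the footer
val processed = pipeline(mail)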

Higher-order functions and partial functions

I won’t go into detail here, but now that you know more about how you can compose or reuse functions by means of higher-order functions, you might want to have a look at partial functions again.

Chaining partial functions

In the article on pattern matching anonymous functions, I mentioned that partial functions can be used to create a nice alternative to the chain of responsibility pattern: The orElse method defined on the PartialFunction trait allows you to chain an arbitrary number of partial functions, creating a composite partial function. The first one, however, will only pass on to the next one if it isn’t defined for the given input. Hence, you can do something like this:

val handler = fooHandler orElse barHandler orElse bazHandler

Lifting partial functions

Also, sometimes a PartialFunction is not what you need. If you think about it, another way to represent the fact that a function is not defined for all input values is to have a standard function whose return type is an Option[A] – if the function is not defined for an input value, it will return None, otherwise a Some[A].

If that’s what you need in a certain context, given a PartialFunction named pf, you can call pf.lift to get the normal function returning an Option. If you have one of the latter and require a partial function, call Function.unlift(f).
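
A brief sketch of both directions, using a partial function that is only defined for even numbers:

val halve: PartialFunction[Int, Int] = { case n if n % 2 == 0 => n / 2 }
val halveToOption: Int => Option[Int] = halve.lift
halveToOption(4) // Some(2)
halveToOption(3) // None
val halveAgain: PartialFunction[Int, Int] = Function.unlift(halveToOption)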

Summary

In this article, we have seen the value of higher-order functions, which allow you to reuse existing functions in new, unforeseen contexts and compose them in a very flexible way. While in the examples, you didn’t save much in terms of lines of code, because the functions I showed were rather tiny, the real point is to illustrate the increase in flexibility. Also, composing and reusing functions is something that has benefits not only for small functions, but also on an architectural level.

In the next article, we will continue to examine ways to combine functions by means of partial function application and currying.

The Neophyte’s Guide to Scala Part 9: Promises and Futures in Practice

In the previous article in this series, I introduced you to the Future type, its underlying paradigm, and how to put it to use to write highly readable and composable asynchronously executing code.

In that article, I also mentioned that Future is really only one piece of the puzzle: It’s a read-only type that allows you to work with the values it will compute and handle failure to do so in an elegant way. In order for you to be able to read a computed value from a Future, however, there must be a way for some other part of your software to put that value there. In this post, I will show you how this is done by means of the Promise type, followed by some guidance on how to use futures and promises in practice.

Promises

In the previous article on futures, we had a sequential block of code that we passed to the apply method of the Future companion object, and, given an ExecutionContext was in scope, it magically executed that code block asynchronously, returning its result as a Future.

While this is an easy way to get a Future when you want one, there is an alternative way to create Future instances and have them complete with a success or failure. Where Future provides an interface exclusively for querying, Promise is a companion type that allows you to complete a Future by putting a value into it. This can be done exactly once. Once a Promise has been completed, it’s not possible to change it any more.

A Promise instance is always linked to exactly one instance of Future. If you call the apply method of Future again in the REPL, you will indeed notice that the Future returned is a Promise, too:

import concurrent.Future
import concurrent.ExecutionContext.Implicits.global
val f: Future[String] = Future { "Hello world!" }
// REPL output: 
// f: scala.concurrent.Future[String] = scala.concurrent.impl.Promise$DefaultPromise@793e6657

The object you get back is a DefaultPromise, which implements both Future and Promise. This is an implementation detail, however. The Future and the Promise to which it belongs may very well be separate objects.

What this little example shows is that there is obviously no way to complete a Future other than through a Promise – the apply method on Future is just a nice helper function that shields you from this.

Now, let’s see how we can get our hands dirty and work with the Promise type directly.

Promising a rosy future

When talking about promises that may be fulfilled or not, an obvious example domain is that of politicians, elections, campaign pledges, and legislative periods.

Suppose the politicians that then got elected into office promised their voters a tax cut. This can be represented as a Promise[TaxCut], which you can create by calling the apply method on the Promise companion object, like so:

import concurrent.Promise
case class TaxCut(reduction: Int)
// either give the type as a type parameter to the factory method:
val taxcut = Promise[TaxCut]()
// or give the compiler a hint by specifying the type of your val:
val taxcut2: Promise[TaxCut] = Promise()

Once you have created a Promise, you can get the Future belonging to it by calling the future method on the Promise instance:

val taxcutF: Future[TaxCut] = taxcut.future

The returned Future might not be the same object as the Promise, but calling the future method of a Promise multiple times will definitely always return the same object to make sure the one-to-one relationship between a Promise and its Future is preserved.
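
You can verify this in the REPL with the taxcut promise from above:

taxcut.future eq taxcut.future // true: always the very same Future instance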

Completing a Promise

Once you have made a Promise and told the world that you will deliver on it in the foreseeable Future, you had better do your very best to make it happen.

In Scala, you can complete a Promise either with a success or a failure.

Delivering on your Promise

To complete a Promise with a success, you call its success method, passing it the value that the Future associated with it is supposed to have:

taxcut.success(TaxCut(20))

Once you have done this, that Promise instance is no longer writable, and any future attempt to write to it will lead to an exception.

Also, completing your Promise like this leads to the successful completion of the associated Future. Any success or completion handlers on that future will now be called, or if, for instance, you are mapping that future, the mapping function will now be executed.

Usually, the completion of the Promise and the processing of the completed Future will not happen in the same thread. It’s more likely that you create your Promise, start computing its value in another thread and immediately return the uncompleted Future to the caller.

To illustrate, let’s do something like that for our taxcut promise:

object Government {
  def redeemCampaignPledge(): Future[TaxCut] = {
    val p = Promise[TaxCut]()
    Future {
      println("Starting the new legislative period.")
      Thread.sleep(2000)
      p.success(TaxCut(20))
      println("We reduced the taxes! You must reelect us!!!!1111")
    }
    p.future
  }
}

Please don’t get confused by the usage of the apply method of the Future companion object in this example. I’m just using it because it is so convenient for executing a block of code asynchronously. I could just as well have implemented the computation of the result (which involves a lot of sleeping) in a Runnable that is executed asynchronously by an ExecutorService, with a lot more boilerplate code. The point is that the Promise is not completed in the caller thread.

Let’s redeem our campaign pledge then and add an onComplete callback function to our future:

import scala.util.{Success, Failure}
val taxCutF: Future[TaxCut] = Government.redeemCampaignPledge()
println("Now that they're elected, let's see if they remember their promises...")
taxCutF.onComplete {
  case Success(TaxCut(reduction)) =>
    println(s"A miracle! They really cut our taxes by $reduction percentage points!")
  case Failure(ex) =>
    println(s"They broke their promises! Again! Because of a ${ex.getMessage}")
}

If you try this out multiple times, you will see that the order of the console output is not deterministic. Eventually, the completion handler will be executed and run into the success case.

Breaking Promises like a sir

As a politician, you are pretty much used to not keeping your promises. As a Scala developer, you sometimes have no other choice, either. If that happens, you can still complete your Promise instance gracefully, by calling its failure method and passing it an exception:

case class LameExcuse(msg: String) extends Exception(msg)
object Government {
  def redeemCampaignPledge(): Future[TaxCut] = {
    val p = Promise[TaxCut]()
    Future {
      println("Starting the new legislative period.")
      Thread.sleep(2000)
      p.failure(LameExcuse("global economy crisis"))
      println("We didn't fulfill our promises, but surely they'll understand.")
    }
    p.future
  }
}

This implementation of the redeemCampaignPledge() method will lead to lots of broken promises. Once you have completed a Promise with the failure method, it is no longer writable, just as is the case with the success method. The associated Future will now be completed with a Failure, too, so the callback function above would run into the failure case.

If you already have a Try, you can also complete a Promise by calling its complete method. If the Try is a Success, the associated Future will be completed successfully, with the value inside the Success. If it’s a Failure, the Future will be completed with that failure.
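
For example, if your result is already available as a Try, completing the Promise might look like this small sketch:

import scala.util.Try
val pledge = Promise[TaxCut]()
pledge.complete(Try(TaxCut(20)))
// pledge.future is now completed with Success(TaxCut(20))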

Future-based programming in practice

If you want to make use of the future-based paradigm in order to increase the scalability of your application, you have to design your application to be non-blocking from the ground up, which basically means that the functions in all your application layers are asynchronous and return futures.

A likely use case these days is that of developing a web application. If you are using a modern Scala web framework, it will allow you to return your responses as something like a Future[Response] instead of blocking and then returning your finished Response. This is important since it allows your web server to handle a huge number of open connections with a relatively low number of threads. By always giving your web server Future[Response], you maximize the utilization of the web server’s dedicated thread pool.

In the end, a service in your application might make multiple calls to your database layer and/or some external web service, receiving multiple futures, and then compose them to return a new Future, all in a very readable for comprehension, such as the one you saw in the previous article. The web layer will turn such a Future into a Future[Response].
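
As a rough sketch of what such a service method might look like (User, Order, OrderSummary, findUser, and loadOrders are made up purely for illustration, and an implicit ExecutionContext such as the global one imported above is assumed to be in scope):

case class User(id: Long)
case class Order(total: Int)
case class OrderSummary(user: User, total: Int)
// dummy asynchronous data access functions standing in for database or web service calls:
def findUser(id: Long): Future[User] = Future(User(id))
def loadOrders(user: User): Future[List[Order]] = Future(List(Order(10), Order(20)))
// compose the two futures; the web layer would turn this into a Future[Response]:
def orderSummary(userId: Long): Future[OrderSummary] = for {
  user <- findUser(userId)
  orders <- loadOrders(user)
} yield OrderSummary(user, orders.map(_.total).sum)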

However, how do you implement this in practice? There are three different cases you have to consider:

Non-blocking IO

Your application will most certainly involve a lot of IO. For example, your web application will have to talk to a database, and it might act as a client that is calling other web services.

If at all possible, make use of libraries that are based on Java’s non-blocking IO capabilities, either by using Java’s NIO API directly or through a library like Netty. Such libraries, too, can serve many connections with a reasonably-sized dedicated thread pool.

Developing such a library yourself is one of the few places where directly working with the Promise type makes a lot of sense.
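
To give you an idea, here is a rough sketch of how a callback-based client could be wrapped into a Future by means of a Promise; the AsyncHttpClient trait is entirely made up for this illustration:

// entirely hypothetical callback-based client:
trait AsyncHttpClient {
  def get(url: String)(onResult: String => Unit, onError: Throwable => Unit): Unit
}
def getAsFuture(client: AsyncHttpClient, url: String): Future[String] = {
  val p = Promise[String]()
  client.get(url)(
    result => p.success(result),
    error => p.failure(error)
  )
  p.future
}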

Blocking IO

Sometimes, there is no NIO-based library available. For instance, most database drivers you’ll find in the Java world nowadays use blocking IO. If you made a query to your database with such a driver in order to respond to an HTTP request, that call would be made on a web server thread. To avoid that, place all the code talking to the database inside a future block, like so:

// get back a Future[ResultSet] or something similar:
Future {
  queryDB(query)
}

So far, we always used the implicitly available global ExecutionContext to execute such future blocks. It’s probably a good idea to create a dedicated ExecutionContext that you will have in scope in your database layer.

You can create an ExecutionContext from a Java ExecutorService, which means you will be able to tune the thread pool for executing your database calls asynchronously independently from the rest of your application:

import java.util.concurrent.Executors
import concurrent.ExecutionContext
val executorService = Executors.newFixedThreadPool(4)
val executionContext = ExecutionContext.fromExecutorService(executorService)
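
You can then use that context for your database futures, either by passing it explicitly or by making it implicitly available in your database layer (a small sketch, assuming the queryDB function from above):

// pass the dedicated context explicitly:
Future {
  queryDB(query)
}(executionContext)
// or make it the implicit ExecutionContext in your database layer
// (taking care not to also import the global one there):
implicit val dbExecutionContext = executionContext
Future {
  queryDB(query)
}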

Long-running computations

Depending on the nature of your application, it will occasionally have to call long-running tasks that don’t involve any IO at all, which means they are CPU-bound. These, too, should not be executed by a web server thread. Hence, you should turn them into Futures, too:

Future {
  longRunningComputation(data, moreData)
}

Again, if you have long-running computations, having them run in a separate ExecutionContext for CPU-bound tasks is a good idea. How to tune your various thread pools is highly dependent on your individual application and beyond the scope of this article.

Summary

In this article, we looked at promises, the writable part of the future-based concurrency paradigm, and how to use them to complete a Future, followed by some advice on how to put futures to use in practice.

In the next part of this series, we are taking a step back from concurrency issues and examine how functional programming in Scala can help you to make your code more reusable, a claim that has long been associated with object-oriented programming.

The Neophyte’s Guide to Scala Part 8: Welcome to the Future

As an aspiring and enthusiastic Scala developer, you will likely have heard of Scala’s approach to dealing with concurrency – or maybe that was even what attracted you in the first place. Said approach makes reasoning about concurrency and writing well-behaved concurrent programs a lot easier than working with the rather low-level concurrency APIs you are confronted with in most other languages.

One of the two cornerstones of this approach is the Future, the other being the Actor. The former shall be the subject of this article. I will explain what futures are good for and how you can make use of them in a functional way.

Please make sure that you have version 2.9.3 or later if you want to get your hands dirty and try out the examples yourself. The futures we are discussing here were only incorporated into the Scala core distribution with the 2.10.0 release and later backported to Scala 2.9.3. Originally, with a slightly different API, they were part of the Akka concurrency toolkit.

Why sequential code can be bad

Suppose you want to prepare a cappuccino. You could simply execute the following steps, one after another:

  1. Grind the required coffee beans
  2. Heat some water
  3. Brew an espresso using the ground coffee and the heated water
  4. Froth some milk
  5. Combine the espresso and the frothed milk to a cappuccino

Translated to Scala code, you would do something like this:

import scala.util.Try
// Some type aliases, just for getting more meaningful method signatures:
type CoffeeBeans = String
type GroundCoffee = String
case class Water(temperature: Int)
type Milk = String
type FrothedMilk = String
type Espresso = String
type Cappuccino = String
// dummy implementations of the individual steps:
def grind(beans: CoffeeBeans): GroundCoffee = s"ground coffee of $beans"
def heatWater(water: Water): Water = water.copy(temperature = 85)
def frothMilk(milk: Milk): FrothedMilk = s"frothed $milk"
def brew(coffee: GroundCoffee, heatedWater: Water): Espresso = "espresso"
def combine(espresso: Espresso, frothedMilk: FrothedMilk): Cappuccino = "cappuccino"
// some exceptions for things that might go wrong in the individual steps
// (we'll need some of them later, use the others when experimenting
// with the code):
case class GrindingException(msg: String) extends Exception(msg)
case class FrothingException(msg: String) extends Exception(msg)
case class WaterBoilingException(msg: String) extends Exception(msg)
case class BrewingException(msg: String) extends Exception(msg)
// going through these steps sequentially:
def prepareCappuccino(): Try[Cappuccino] = for {
  ground <- Try(grind("arabica beans"))
  water <- Try(heatWater(Water(25)))
  espresso <- Try(brew(ground, water))
  foam <- Try(frothMilk("milk"))
} yield combine(espresso, foam)

Doing it like this has several advantages: You get a very readable step-by-step instruction of what to do. Moreover, you will likely not get confused while preparing the cappuccino this way, since you are avoiding context switches.

On the downside, preparing your cappuccino in such a step-by-step manner means that your brain and body are idle during large parts of the whole process. While waiting for the coffee to be ground, you are effectively blocked. Only when that’s finished are you able to start heating some water, and so on.

This is clearly a waste of valuable resources. It’s very likely that you would want to initiate multiple steps and have them execute concurrently. Once you see that the water and the ground coffee are ready, you’d start brewing the espresso, in the meantime already starting the process of frothing the milk.

It’s really no different when writing a piece of software. A web server only has so many threads for processing requests and creating appropriate responses. You don’t want to block these valuable threads by waiting for the results of a database query or a call to another HTTP service. Instead, you want an asynchronous programming model and non-blocking IO, so that, while the processing of one request is waiting for the response from a database, the web server thread handling that request can serve the needs of some other request instead of idling along.

“I heard you like callbacks, so I put a callback in your callback!”

Of course, you already knew all that - what with Node.js being all the rage among the cool kids for a while now. The approach used by Node.js and some others is to communicate via callbacks, exclusively. Unfortunately, this can very easily lead to a convoluted mess of callbacks within callbacks within callbacks, making your code hard to read and debug.

Scala’s Future allows callbacks, too, as you will see very shortly, but it provides much better alternatives, so it’s likely you won’t need them a lot.

“I know Futures, and they are completely useless!”

You might also be familiar with other Future implementations, most notably the one provided by Java. There is not really much you can do with a Java future other than checking if it’s completed or simply blocking until it is completed. In short, they are nearly useless and definitely not a joy to work with.

If you think that Scala’s futures are anything like that, get ready for a surprise. Here we go!

Semantics of Future

Scala’s Future[T], residing in the scala.concurrent package, is a container type, representing a computation that is supposed to eventually result in a value of type T. Alas, the computation might go wrong or time out, so when the future is completed, it may not have been successful after all, in which case it contains an exception instead.

Future is a write-once container – after a future has been completed, it is effectively immutable. Also, the Future type only provides an interface for reading the value to be computed. The task of writing the computed value is achieved via a Promise. Hence, there is a clear separation of concerns in the API design. In this post, we are focussing on the former, postponing the use of the Promise type to the next article in this series.

Working with Futures

There are several ways you can work with Scala futures, which we are going to examine by rewriting our cappuccino example to make use of the Future type. First, we need to rewrite all of the functions that can be executed concurrently so that they immediately return a Future instead of computing their result in a blocking way:

import scala.concurrent.future
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Random

def grind(beans: CoffeeBeans): Future[GroundCoffee] = Future {
  println("start grinding...")
  Thread.sleep(Random.nextInt(2000))
  if (beans == "baked beans") throw GrindingException("are you joking?")
  println("finished grinding...")
  s"ground coffee of $beans"
}

def heatWater(water: Water): Future[Water] = Future {
  println("heating the water now")
  Thread.sleep(Random.nextInt(2000))
  println("hot, it's hot!")
  water.copy(temperature = 85)
}

def frothMilk(milk: Milk): Future[FrothedMilk] = Future {
  println("milk frothing system engaged!")
  Thread.sleep(Random.nextInt(2000))
  println("shutting down milk frothing system")
  s"frothed $milk"
}

def brew(coffee: GroundCoffee, heatedWater: Water): Future[Espresso] = Future {
  println("happy brewing :)")
  Thread.sleep(Random.nextInt(2000))
  println("it's brewed!")
  "espresso"
}

There are several things that require an explanation here.

First off, there is the apply method on the Future companion object, which expects two arguments:

object Future {
  def apply[T](body: => T)(implicit execctx: ExecutionContext): Future[T]
}

The computation to be performed asynchronously is passed in as the by-name parameter body. The second argument, in its own argument list, is an implicit one, which means we don’t have to specify it if a matching implicit value is defined somewhere in scope. We make sure this is the case by importing the global execution context.

An ExecutionContext is something that can execute our future, and you can think of it as something like a thread pool. Since the ExecutionContext is available implicitly, we only have a single one-element argument list remaining. Single-argument lists can be enclosed with curly braces instead of parentheses. People often make use of this when calling the apply method of Future, making it look a little bit like we are using a feature of the language and not calling an ordinary method. The ExecutionContext is an implicit parameter for virtually all of the Future API.

Furthermore, of course, in this simple example, we don’t actually compute anything, which is why we are putting in some random sleep, simply for demonstration purposes. We also print to the console before and after our “computation” to make the non-deterministic and concurrent nature of our code clearer when trying it out.

The computation of the value to be returned by a Future will start at some non-deterministic time after that Future instance has been created, by some thread assigned to it by the ExecutionContext.

Callbacks

Sometimes, when things are simple, using a callback can be perfectly fine. Callbacks for futures are partial functions. You can pass a callback to the onSuccess method. It will only be called if the Future completes successfully, and if so, it receives the computed value as its input:

grind("arabica beans").onSuccess { case ground =>
  println("okay, got my ground coffee")
}

Similarly, you could register a failure callback with the onFailure method. Your callback will receive a Throwable, but it will only be called if the Future did not complete successfully.
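
For example, a failure callback for our baked beans could look like this:

grind("baked beans").onFailure { case ex =>
  println(s"This grinder needs a replacement: ${ex.getMessage}")
}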

Usually, it’s better to combine these two and register a completion callback that will handle both cases. The input parameter for that callback is a Try:

import scala.util.{Success, Failure}
grind("baked beans").onComplete {
  case Success(ground) => println(s"got my $ground")
  case Failure(ex) => println("This grinder needs a replacement, seriously!")
}

Since we are passing in baked beans, an exception occurs in the grind method, leading to the Future completing with a Failure.

Composing futures

Using callbacks can be quite painful if you have to start nesting callbacks. Thankfully, you don’t have to do that! The real power of the Scala futures is that they are composable.

If you have followed this series, you will have noticed that all the container types we discussed made it possible for you to map them, flat map them, or use them in for comprehensions and that I mentioned that Future is a container type, too. Hence, the fact that Scala’s Future type allows you to do all that will not come as a surprise at all.

The real question is: What does it really mean to perform these operations on something that hasn’t even finished computing yet?

Mapping the future

Haven’t you always wanted to be a traveller in time who sets out to map the future? As a Scala developer you can do exactly that! Suppose that once your water has heated you want to check if its temperature is okay. You can do so by mapping your Future[Water] to a Future[Boolean]:

val temperatureOkay: Future[Boolean] = heatWater(Water(25)).map { water =>
  println("we're in the future!")
  (80 to 85).contains(water.temperature)
}

The Future[Boolean] assigned to temperatureOkay will eventually contain the successfully computed boolean value. Go change the implementation of heatWater so that it throws an exception (maybe because your water heater explodes or something) and watch how "we're in the future!" will never be printed to the console.

When you are writing the function you pass to map, you’re in the future, or rather in a possible future. That mapping function gets executed as soon as your Future[Water] instance has completed successfully. However, the timeline in which that happens might not be the one you live in. If your instance of Future[Water] fails, what’s taking place in the function you passed to map will never happen. Instead, the result of calling map will be a Future[Boolean] containing a Failure.
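
If you want to try this out, a failing variant of heatWater might look like the following sketch, reusing the WaterBoilingException defined earlier:

def heatWater(water: Water): Future[Water] = Future {
  println("heating the water now")
  Thread.sleep(Random.nextInt(2000))
  throw WaterBoilingException("the water heater exploded!")
}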

Keeping the future flat

If the computation of one Future depends on the result of another, you’ll likely want to resort to flatMap to avoid a deeply nested structure of futures.

For example, let’s assume that the process of actually measuring the temperature takes a while, so you want to determine whether the temperature is okay asynchronously, too. You have a function that takes an instance of Water and returns a Future[Boolean]:

def temperatureOkay(water: Water): Future[Boolean] = Future {
  (80 to 85).contains(water.temperature)
}

Use flatMap instead of map in order to get a Future[Boolean] instead of a Future[Future[Boolean]]:

val nestedFuture: Future[Future[Boolean]] = heatWater(Water(25)).map {
  water => temperatureOkay(water)
}
val flatFuture: Future[Boolean] = heatWater(Water(25)).flatMap {
  water => temperatureOkay(water)
}

Again, the mapping function is only executed after (and if) the Future[Water] instance has been completed successfully, hopefully with an acceptable temperature.

For comprehensions

Instead of calling flatMap, you’ll usually want to write a for comprehension, which is essentially the same, but reads a lot clearer. Our example above could be rewritten like this:

val acceptable: Future[Boolean] = for {
  heatedWater <- heatWater(Water(25))
  okay <- temperatureOkay(heatedWater)
} yield okay

If you have multiple computations that can be computed in parallel, you need to take care that you already create the corresponding Future instances outside of the for comprehension.

def prepareCappuccinoSequentially(): Future[Cappuccino] = {
  for {
    ground <- grind("arabica beans")
    water <- heatWater(Water(20))
    foam <- frothMilk("milk")
    espresso <- brew(ground, water)
  } yield combine(espresso, foam)
}

This reads nicely, but since a for comprehension is just another representation for nested flatMap calls, this means that the Future[Water] created in heatWater is only really instantiated after the Future[GroundCoffee] has completed successfully. You can check this by watching the sequential console output coming from the functions we implemented above.

Hence, make sure to instantiate all your independent futures before the for comprehension:

def prepareCappuccino(): Future[Cappuccino] = {
  val groundCoffee = grind("arabica beans")
  val heatedWater = heatWater(Water(20))
  val frothedMilk = frothMilk("milk")
  for {
    ground <- groundCoffee
    water <- heatedWater
    foam <- frothedMilk
    espresso <- brew(ground, water)
  } yield combine(espresso, foam)
}

Now, the three futures we create before the for comprehension start executing immediately and concurrently. If you watch the console output, you will see that it’s non-deterministic. The only thing that’s certain is that the "happy brewing" output will come last: since the brew method requires the values of two other futures, its Future is only created inside our for comprehension, i.e. after those futures have completed successfully.

Failure projections

You will have noticed that Future[T] is success-biased, allowing you to use map, flatMap, filter etc. under the assumption that it will complete successfully. Sometimes, you may want to be able to work in this nice functional way for the timeline in which things go wrong. By calling the failed method on an instance of Future[T], you get a failure projection of it, which is a Future[Throwable]. Now you can map that Future[Throwable], for example, and your mapping function will only be executed if the original Future[T] has completed with a failure.
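
For example, mapping the failure projection of our doomed baked beans grinding might look like this:

val groundFailure: Future[Throwable] = grind("baked beans").failed
groundFailure.map(ex => s"grinding failed, as expected: ${ex.getMessage}")
// a Future[String] that will eventually contain the message of the GrindingException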

Outlook

You have seen the Future, and it looks bright! The fact that it’s just another container type that can be composed and used in a functional way makes working with it very pleasant.

Making blocking code concurrent can be pretty easy by wrapping it in a Future block. However, it’s better to be non-blocking in the first place. To achieve this, one has to make a Promise to complete a Future. This and using futures in practice will be the topic of the next part of this series.

The Neophyte’s Guide to Scala Part 7: The Either Type

In the previous article, I discussed functional error handling using Try, which was introduced in Scala 2.10. I also mentioned the existence of another, somewhat similar type called Either, which is the subject of this article. You will learn how to use it, when to use it, and what its particular pitfalls are.

Speaking of which, at least at the time of this writing, Either has some serious design flaws you need to be aware of, so much so that one might argue about whether to use it at all. So why then should you learn about Either at all?

For one, people will not all migrate their existing code bases to use Try for dealing with exceptions, so it is good to be able to understand the intricacies of this type, too.

Moreover, Try is not really an all-out replacement for Either, only for one particular usage of it, namely handling exceptions in a functional way. As it stands, Try and Either really complement each other, each covering different use cases. And, as flawed as Either may be, in certain situations it will still be a very good fit.

The semantics

Like Option and Try, Either is a container type. Unlike the aforementioned types, it takes not only one, but two type parameters: An Either[A, B] instance can contain either an instance of A, or an instance of B. This is different from a Tuple2[A, B], which contains both an A and a B instance.

Either has exactly two sub types, Left and Right. If an Either[A, B] object contains an instance of A, then the Either is a Left. Otherwise it contains an instance of B and is a Right.

There is nothing in the semantics of this type that specifies one or the other sub type to represent an error or a success, respectively. In fact, Either is a general-purpose type for use whenever you need to deal with situations where the result can be of one of two possible types. Nevertheless, error handling is a popular use case for it, and by convention, when using it that way, the Left represents the error case, whereas the Right contains the success value.

Creating an Either

Creating an instance of Either is trivial. Both Left and Right are case classes, so if we want to implement a rock-solid internet censorship feature, we can just do the following:

import scala.io.Source
import java.net.URL
def getContent(url: URL): Either[String, Source] =
  if (url.getHost.contains("google"))
    Left("Requested URL is blocked for the good of the people!")
  else
    Right(Source.fromURL(url))

Now, if we call getContent(new URL("http://danielwestheide.com")), we will get a scala.io.Source wrapped in a Right. If we pass in new URL("https://plus.google.com"), the return value will be a Left containing a String.

Working with Either values

Some of the very basic stuff works just as you know from Option or Try: You can ask an instance of Either if it isLeft or isRight. You can also do pattern matching on it, which is one of the most familiar and convenient ways of working with objects of this type:

getContent(new URL("http://google.com")) match {
  case Left(msg) => println(msg)
  case Right(source) => source.getLines.foreach(println)
}

Projections

You cannot, at least not directly, use an Either instance like a collection, the way you are familiar with from Option and Try. This is because Either is designed to be unbiased.

Try is success-biased: it offers you map, flatMap and other methods that all work under the assumption that the Try is a Success, and if that’s not the case, they effectively don’t do anything, returning the Failure as-is.

The fact that Either is unbiased means that you first have to choose whether you want to work under the assumption that it is a Left or a Right. By calling left or right on an Either value, you get a LeftProjection or RightProjection, respectively, which are basically left- or right-biased wrappers for the Either.

Mapping

Once you have a projection, you can call map on it:

val content: Either[String, Iterator[String]] =
  getContent(new URL("http://danielwestheide.com")).right.map(_.getLines())
// content is a Right containing the lines from the Source returned by getContent
val moreContent: Either[String, Iterator[String]] =
  getContent(new URL("http://google.com")).right.map(_.getLines)
// moreContent is a Left, as already returned by getContent

Regardless of whether the Either[String, Source] in this example is a Left or a Right, it will be mapped to an Either[String, Iterator[String]]. If it’s called on a Right, the value inside it will be transformed. If it’s a Left, that will be returned unchanged.

We can do the same with a LeftProjection, of course:

val content: Either[Iterator[String], Source] =
  getContent(new URL("http://danielwestheide.com")).left.map(Iterator(_))
// content is the Right containing a Source, as already returned by getContent
val moreContent: Either[Iterator[String], Source] =
  getContent(new URL("http://google.com")).left.map(Iterator(_))
// moreContent is a Left containing the msg returned by getContent in an Iterator

Now, if the Either is a Left, its wrapped value is transformed, whereas a Right would be returned unchanged. Either way, the result is of type Either[Iterator[String], Source].

Please note that the map method is defined on the projection types, not on Either, but it does return a value of type Either, not a projection. In this, Either deviates from the other container types you know. The reason for this has to do with Either being unbiased, but as you will see, this can lead to some very unpleasant problems in certain cases. It also means that if you want to chain multiple calls to map, flatMap and the like, you always have to ask for your desired projection again in between.
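
For example, chaining two mapping steps means asking for the right projection again in between:

val lineCount: Either[String, Int] =
  getContent(new URL("http://danielwestheide.com")).right
    .map(_.getLines())
    .right
    .map(_.size)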

Flat mapping

Projections also support flat mapping, avoiding the common problem of creating a convoluted structure of multiple inner and outer Either types that you will end up with if you nest multiple calls to map.

I’m putting very high requirements on your suspension of disbelief now, coming up with a completely contrived example. Let’s say we want to calculate the average number of lines of two of my articles. You’ve always wanted to do that, right? Here’s how we could solve this challenging problem:

val part5 = new URL("http://t.co/UR1aalX4")
val part6 = new URL("http://t.co/6wlKwTmu")
val content = getContent(part5).right.map(a =>
  getContent(part6).right.map(b =>
    (a.getLines().size + b.getLines().size) / 2))

What we’ll end up with is an Either[String, Either[String, Int]]. Now, content being a nested structure of Rights, we could flatten it by calling the joinRight method on it (you also have joinLeft available to flatten a nested structure of Lefts).
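
For example, applying joinRight to the nested value from above gives us back a flat Either:

val flattened: Either[String, Int] = content.joinRight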

However, we can avoid creating this nested structure altogether. If we flatMap on our outer RightProjection, we get a more pleasant result type, unpacking the Right of the inner Either:

val content = getContent(part5).right.flatMap(a =>
  getContent(part6).right.map(b =>
    (a.getLines().size + b.getLines().size) / 2))

Now content is a flat Either[String, Int], which makes it a lot nicer to work with, for example using pattern matching.

For comprehensions

By now, you have probably learned to love working with for comprehensions in a consistent way on various different data types. You can do that, too, with projections of Either, but the sad truth is that it’s not quite as nice, and there are things that you won’t be able to do without resorting to ugly workarounds, out of the box.

Let’s rewrite our flatMap example, making use of for comprehensions instead:

def averageLineCount(url1: URL, url2: URL): Either[String, Int] =
  for {
    source1 <- getContent(url1).right
    source2 <- getContent(url2).right
  } yield (source1.getLines().size + source2.getLines().size) / 2

This is not too bad. Note that we have to call right on each Either we use in our generators, of course.

Now, let’s try to refactor this for comprehension – since the yield expression is a little too involved, we want to extract some parts of it into value definitions inside our for comprehension:

def averageLineCountWontCompile(url1: URL, url2: URL): Either[String, Int] =
  for {
    source1 <- getContent(url1).right
    source2 <- getContent(url2).right
    lines1 = source1.getLines().size
    lines2 = source2.getLines().size
  } yield (lines1 + lines2) / 2

This won’t compile! The reason will become clearer if we examine what this for comprehension corresponds to once you take away the sugar. It translates to something that is similar to the following, albeit much less readable:

def averageLineCountDesugaredWontCompile(url1: URL, url2: URL): Either[String, Int] =
  getContent(url1).right.flatMap { source1 =>
    getContent(url2).right.map { source2 =>
      val lines1 = source1.getLines().size
      val lines2 = source2.getLines().size
      (lines1, lines2)
    }.map { case (x, y) => (x + y) / 2 }
  }

The problem is that by including a value definition in our for comprehension, a new call to map is introduced automatically – on the result of the previous call to map, which has returned an Either, not a RightProjection. As you know, Either doesn’t define a map method, making the compiler a little bit grumpy.

This is where Either shows us its ugly trollface. In this example, the value definitions are not strictly necessary. If they are necessary in your case, you can work around this problem by replacing the value definitions with generators, like this:

def averageLineCount(url1: URL, url2: URL): Either[String, Int] =
  for {
    source1 <- getContent(url1).right
    source2 <- getContent(url2).right
    lines1 <- Right(source1.getLines().size).right
    lines2 <- Right(source2.getLines().size).right
  } yield (lines1 + lines2) / 2

It’s important to be aware of these design flaws. They don’t make Either unusable, but can lead to serious headaches if you don’t have a clue what’s going on.

Other methods

The projection types have some other useful methods:

You can convert your Either instance to an Option by calling toOption on one of its projections. For example, if you have an e of type Either[A, B], e.right.toOption will return an Option[B]. If your Either[A, B] instance is a Right, that Option[B] will be a Some. If it’s a Left, it will be None. The reverse behaviour can be achieved, of course, when calling toOption on the LeftProjection of your Either[A, B]. If you need a sequence of either one value or none, use toSeq instead.
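
For example:

val source: Option[Source] =
  getContent(new URL("http://danielwestheide.com")).right.toOption
// source is a Some containing the Source; for a blocked URL, it would be None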

Folding

If you want to transform an Either value regardless of whether it is a Left or a Right, you can do so by means of the fold method that is defined on Either, expecting two transform functions with the same result type, the first one being called if the Either is a Left, the second one if it’s a Right.

To demonstrate this, let’s combine the two mapping operations we implemented on the LeftProjection and the RightProjection above:

val content: Iterator[String] =
  getContent(new URL("http://danielwestheide.com")).fold(Iterator(_), _.getLines())
val moreContent: Iterator[String] =
  getContent(new URL("http://google.com")).fold(Iterator(_), _.getLines())

In this example, we are transforming our Either[String, Source] into an Iterator[String], no matter if it’s a Left or a Right. You could just as well return a new Either again or execute side-effects and return Unit from your two functions. As such, calling fold provides a nice alternative to pattern matching.

When to use Either

Now that you have seen how to work with Either values and what you have to take care of, let’s move on to some specific use cases.

Error handling

You can use Either for exception handling very much like Try. Either has one advantage over Try: you can have more specific error types at compile time, while Try uses Throwable all the time. This means that Either can be a good choice for expected errors.

You’d have to implement a method like this, delegating to the very useful Exception object from the scala.util.control package:

import scala.util.control.Exception.catching
def handling[Ex <: Throwable, T](exType: Class[Ex])(block: => T): Either[Ex, T] =
  catching(exType).either(block).asInstanceOf[Either[Ex, T]]

The reason you might want to do that is that while the methods provided by scala.util.control.Exception allow you to catch only certain types of exceptions, the resulting compile-time error type is always Throwable.

With this method at hand, you can pass along expected exceptions in an Either:

import java.net.MalformedURLException
def parseURL(url: String): Either[MalformedURLException, URL] =
  handling(classOf[MalformedURLException])(new URL(url))

You will have other expected error conditions, and not all of them result in third-party code throwing an exception you need to handle, as in the example above. In these cases, there is really no need to throw an exception yourself, only to catch it and wrap it in a Left. Instead, simply define your own error type, preferably as a case class, and return a Left wrapping an instance of that error type when your expected error condition occurs.

Here is an example:

case class Customer(age: Int)
class Cigarettes
case class UnderAgeFailure(age: Int, required: Int)
def buyCigarettes(customer: Customer): Either[UnderAgeFailure, Cigarettes] =
  if (customer.age < 16) Left(UnderAgeFailure(customer.age, 16))
  else Right(new Cigarettes)

You should avoid using Either for wrapping unexpected exceptions. Try does that better, without all the flaws you have to deal with when working with Either.

Processing collections

Generally, Either is a pretty good fit if you want to process a collection where some items might result in a problematic condition that should not directly lead to an exception, since an exception would abort the processing of the rest of the collection.

Let’s assume that for our industry-standard web censorship system, we are using some kind of black list:

type Citizen = String
case class BlackListedResource(url: URL, visitors: Set[Citizen])

val blacklist = List(
  BlackListedResource(new URL("https://google.com"), Set("John Doe", "Johanna Doe")),
  BlackListedResource(new URL("http://yahoo.com"), Set.empty),
  BlackListedResource(new URL("https://maps.google.com"), Set("John Doe")),
  BlackListedResource(new URL("http://plus.google.com"), Set.empty)
)

A BlackListedResource represents the URL of a black-listed web page plus the citizens who have tried to visit that page.

Now we want to process this black list, where our main purpose is to identify problematic citizens, i.e. those that have tried to visit blocked pages. At the same time, we want to identify suspicious web pages – if not a single citizen has tried to visit a black-listed page, we must assume that our subjects are bypassing our filter somehow, and we need to investigate that.

Here is how we can process our black list:

val checkedBlacklist: List[Either[URL, Set[Citizen]]] =
  blacklist.map(resource =>
    if (resource.visitors.isEmpty) Left(resource.url)
    else Right(resource.visitors))

We have created a sequence of Either values, with the Left instances representing suspicious URLs and the Right ones containing sets of problem citizens. This makes it almost a breeze to identify both our problem citizens and our suspicious web pages:

val suspiciousResources = checkedBlacklist.flatMap(_.left.toOption)
val problemCitizens = checkedBlacklist.flatMap(_.right.toOption).flatten.toSet

These more general use cases beyond exception handling are where Either really shines.

Summary

You have learned how to make use of Either, what its pitfalls are, and when to put it to use in your code. It’s a type that is not without flaws, and whether you want to have to deal with them and incorporate it in your own code is ultimately up to you.

In practice, you will notice that, now that we have Try at hand, there won’t be terribly many use cases for it. Nevertheless, it’s good to know about it, both for those situations where it will be your perfect tool and for understanding pre-2.10 Scala code you come across where it’s used for error handling.