Learn You a Haskell for Great Good! Functors, Applicative Functors and Monoids
Submitted for translation by asinitsyn 03.12.2009
Published 2 years ago.
Functors, Applicative Functors and Monoids
Haskell's combination of purity, higher order functions, parameterized algebraic data types, and typeclasses allows us to implement polymorphism on a much higher level than possible in other languages. We don't have to think about types belonging to a big hierarchy of types. Instead, we think about what the types can act like and then connect them with the appropriate typeclasses. An Int can act like a lot of things. It can act like an equatable thing, like an ordered thing, like an enumerable thing, etc.
Typeclasses are open, which means that we can define our own data type, think about what it can act like and connect it with the typeclasses that define its behaviors. Because of that and because of Haskell's great type system that allows us to know a lot about a function just by knowing its type declaration, we can define typeclasses that define behavior that's very general and abstract. We've met typeclasses that define operations for seeing if two things are equal or comparing two things by some ordering. Those are very abstract and elegant behaviors, but we just don't think of them as anything very special because we've been dealing with them for most of our lives. We recently met functors, which are basically things that can be mapped over. That's an example of a useful and yet still pretty abstract property that typeclasses can describe. In this chapter, we'll take a closer look at functors, along with slightly stronger and more useful versions of functors called applicative functors. We'll also take a look at monoids, which are sort of like socks.
Functors redux
We've already talked about functors in their own little section. If you haven't read it yet, you should probably give it a glance right now, or maybe later when you have more time. Or you can just pretend you read it.
Still, here's a quick refresher: Functors are things that can be mapped over, like lists, Maybes, trees, and such. In Haskell, they're described by the typeclass Functor, which has only one typeclass method, namely fmap, which has a type of fmap :: (a -> b) -> f a -> f b. It says: give me a function that takes an a and returns a b and a box with an a (or several of them) inside it and I'll give you a box with a b (or several of them) inside it. It kind of applies the function to the element inside the box.
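To make that type signature a bit more concrete, here's a minimal sketch that maps the same function over a list and over two Maybe values. The names doubledList, doubledJust and doubledNothing are made up for this example.

```haskell
-- fmap applies a function to the value(s) inside a functor,
-- leaving the surrounding structure untouched.

-- Over a list, fmap behaves like map.
doubledList :: [Int]
doubledList = fmap (*2) [1, 2, 3]

-- Over Maybe, the function is applied inside the Just ...
doubledJust :: Maybe Int
doubledJust = fmap (*2) (Just 10)

-- ... and Nothing stays Nothing.
doubledNothing :: Maybe Int
doubledNothing = fmap (*2) Nothing
```

Loading this in GHCi and evaluating the three names should give [2,4,6], Just 20 and Nothing.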
A word of advice. Many times the box analogy is used to help you get some intuition for how functors work, and later, we'll probably use the same analogy for applicative functors and monads. It's an okay analogy that helps people understand functors at first, just don't take it too literally, because for some functors the box analogy has to be stretched really thin to still hold some truth. A more correct term for what a functor is would be computational context. The context might be that the computation can have a value or it might have failed (Maybe and Either a) or that there might be more values (lists), stuff like that.
If we want to make a type constructor an instance of Functor, it has to have a kind of * -> *, which means that it has to take exactly one concrete type as a type parameter. For example, Maybe can be made an instance because it takes one type parameter to produce a concrete type, like Maybe Int or Maybe String. If a type constructor takes two parameters, like Either, we have to partially apply the type constructor until it only takes one type parameter. So we can't write instance Functor Either where, but we can write instance Functor (Either a) where and then if we imagine that fmap is only for Either a, it would have a type declaration of fmap :: (b -> c) -> Either a b -> Either a c. As you can see, the Either a part is fixed, because Either a takes only one type parameter, whereas just Either takes two so fmap :: (b -> c) -> Either b -> Either c wouldn't really make sense.
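Here's a small sketch of the same idea with a type of our own. Result is a hypothetical type shaped like Either (it isn't from any library), and the instance is written for the partially applied Result e, not for Result itself.

```haskell
-- A hypothetical two-parameter type, shaped like Either.
data Result e a = Failure e | Success a
    deriving (Show, Eq)

-- Result has kind * -> * -> *, so we partially apply it:
-- Result e has kind * -> * and can be made a Functor.
instance Functor (Result e) where
    -- Here fmap :: (a -> b) -> Result e a -> Result e b
    fmap _ (Failure e) = Failure e
    fmap f (Success x) = Success (f x)

-- The function only ever touches the second type parameter.
okExample :: Result String Int
okExample = fmap (+1) (Success 41)

errExample :: Result String Int
errExample = fmap (+1) (Failure "boom")
```

The Failure case has to pass its value through untouched, because the function being mapped knows nothing about the fixed type e.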
We've learned by now how a lot of types (well, type constructors really) are instances of Functor, like [], Maybe, Either a and a Tree type that we made on our own. We saw how we can map functions over them for great good. In this section, we'll take a look at two more instances of functor, namely IO and (->) r.
If some value has a type of, say, IO String, that means that it's an I/O action that, when performed, will go out into the real world and get some string for us, which it will yield as a result. We can use <- in do syntax to bind that result to a name. We mentioned that I/O actions are like boxes with little feet that go out and fetch some value from the outside world for us. We can inspect what they fetched, but after inspecting, we have to wrap the value back in IO. By thinking about this box with little feet analogy, we can see how IO acts like a functor.
Let's see how IO is an instance of Functor. When we fmap a function over an I/O action, we want to get back an I/O action that does the same thing, but has our function applied over its result value.
instance Functor IO where
    fmap f action = do
        result <- action
        return (f result)
The result of mapping something over an I/O action will be an I/O action, so right off the bat we use do syntax to glue two actions and make a new one. In the implementation for fmap, we make a new I/O action that first performs the original I/O action and calls its result result. Then, we do return (f result). return is, as you know, a function that makes an I/O action that doesn't do anything but only presents something as its result. The action that a do block produces will always have the result value of its last action. That's why we use return to make an I/O action that doesn't really do anything, it just presents f result as the result of the new I/O action.
We can play around with it to gain some intuition. It's pretty simple really. Check out this piece of code:
main = do line <- getLine
          let line' = reverse line
          putStrLn $ "You said " ++ line' ++ " backwards!"
          putStrLn $ "Yes, you really said " ++ line' ++ " backwards!"
The user is prompted for a line and we give it back to the user, only reversed. Here's how to rewrite this by using fmap:
main = do line <- fmap reverse getLine
          putStrLn $ "You said " ++ line ++ " backwards!"
          putStrLn $ "Yes, you really said " ++ line ++ " backwards!"
Just like when we fmap reverse over Just "blah" to get Just "halb", we can fmap reverse over getLine. getLine is an I/O action that has a type of IO String and mapping reverse over it gives us an I/O action that will go out into the real world and get a line and then apply reverse to its result. Like we can apply a function to something that's inside a Maybe box, we can apply a function to what's inside an IO box, only it has to go out into the real world to get something. Then when we bind it to a name by using <-, the name will reflect the result that already has reverse applied to it.
The I/O action fmap (++"!") getLine behaves just like getLine, only that its result always has "!" appended to it!
If we look at what fmap's type would be if it were limited to IO, it would be fmap :: (a -> b) -> IO a -> IO b. fmap takes a function and an I/O action and returns a new I/O action that's like the old one, except that the function is applied to its contained result.
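The two styles can be put side by side in a small sketch. To keep it runnable without typing anything, it uses fakeGetLine, a hypothetical stand-in for getLine that always yields "hello"; verbose and concise are made-up names too.

```haskell
-- A stand-in for getLine that needs no user input
-- (hypothetical, for illustration only).
fakeGetLine :: IO String
fakeGetLine = return "hello"

-- Binding the result to a name and then transforming it ...
verbose :: IO String
verbose = do
    line <- fakeGetLine
    let line' = reverse line
    return line'

-- ... does the same job as mapping the function over the action.
concise :: IO String
concise = fmap reverse fakeGetLine
```

Both actions yield "olleh" when performed. Neither one runs until it's bound inside another I/O action or executed from main.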
If you ever find yourself binding the result of an I/O action to a name, only to apply a function to that and call that something else, consider using fmap, because it looks prettier. If you want to apply multiple transformations to some data inside a functor, you can declare your own function at the top level, make a lambda function or ideally, use function composition:
import Data.Char
import Data.List
main = do line <- fmap (intersperse '-' . reverse . map toUpper) getLine
          putStrLn line
$ runhaskell fmapping_io.hs
hello there
E-R-E-H-T- -O-L-L-E-H
As you probably know, intersperse '-' . reverse . map toUpper is a function that takes a string, maps toUpper over it, then applies reverse to that result and then applies intersperse '-' to that result. It's like writing (\xs -> intersperse '-' (reverse (map toUpper xs))), only prettier.
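If you'd like to convince yourself that the two spellings really are the same function, here's a minimal sketch that names both of them. The names composed and spelledOut are invented for this example.

```haskell
import Data.Char (toUpper)
import Data.List (intersperse)

-- The point-free version, built with function composition.
composed :: String -> String
composed = intersperse '-' . reverse . map toUpper

-- The same pipeline written out as a lambda.
spelledOut :: String -> String
spelledOut = \xs -> intersperse '-' (reverse (map toUpper xs))
```

Applying either one to "hello there" gives "E-R-E-H-T- -O-L-L-E-H", the same line the program above printed.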
Another instance of Functor that we've been dealing with all along but didn't know was a Functor is (->) r. You're probably slightly confused now, since what the heck does (->) r mean? The function type r -> a can be rewritten as (->) r a, much like we can write 2 + 3 as (+) 2 3. When we look at it as (->) r a, we can see (->) in a slightly different light, because we see that it's just a type constructor that takes two type parameters, just like Either. But remember, we said that a type constructor has to take exactly one type parameter so that it can be made an instance of Functor. That's why we can't make (->) an instance of Functor, but if we partially apply it to (->) r, it doesn't pose any problems. If the syntax allowed for type constructors to be partially applied with sections (like we can partially apply + by doing (2+), which is the same as (+) 2), you could write (->) r as (r ->). How are functions functors? Well, let's take a look at the implementation, which lies in Control.Monad.Instances.
We usually mark functions that take anything and return anything as a -> b. r -> a is the same thing, we just used different letters for the type variables.
instance Functor ((->) r) where
    fmap f g = (\x -> f (g x))
If the syntax allowed for it, it could have been written as
instance Functor (r ->) where
    fmap f g = (\x -> f (g x))
But it doesn't, so we have to write it in the former fashion.
First of all, let's think about fmap's type. It's fmap :: (a -> b) -> f a -> f b. Now what we'll do is mentally replace all the f's, which are the role that our functor instance plays, with (->) r's. We'll do that to see how fmap should behave for this particular instance. We get fmap :: (a -> b) -> ((->) r a) -> ((->) r b). Now what we can do is write the (->) r a and (->) r b types as infix r -> a and r -> b, like we normally do with functions. What we get now is fmap :: (a -> b) -> (r -> a) -> (r -> b).
Hmmm OK. Mapping one function over a function has to produce a function, just like mapping a function over a Maybe has to produce a Maybe and mapping a function over a list has to produce a list. What does the type fmap :: (a -> b) -> (r -> a) -> (r -> b) for this instance tell us? Well, we see that it takes a function from a to b and a function from r to a and returns a function from r to b. Does this remind you of anything? Yes! Function composition! We pipe the output of r -> a into the input of a -> b to get a function r -> b, which is exactly what function composition is about. If you look at how the instance is defined above, you'll see that it's just function composition. Another way to write this instance would be:
instance Functor ((->) r) where
    fmap = (.)
This makes it sort of obvious that using fmap over functions is just composition. Do :m + Control.Monad.Instances, since that's where the instance is defined, and then try playing with mapping over functions. (On recent versions of GHC the instance is already in scope from the Prelude, so you can skip that step.)
ghci> :t fmap (*3) (+100)
fmap (*3) (+100) :: (Num a) => a -> a
ghci> fmap (*3) (+100) 1
303
ghci> (*3) `fmap` (+100) $ 1
303
ghci> (*3) . (+100) $ 1
303
ghci> fmap (show . (*3)) (*100) 1
"300"
We can call fmap as an infix function so that the resemblance to . is clear. In the second input line, we're mapping (*3) over (+100), which results in a function that will take an input, call (+100) on that and then call (*3) on that result. We call that function with 1.
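Since the instance for (->) r already exists, we can't define it a second time ourselves. As a sketch, though, we can wrap functions in a hypothetical newtype called Fn (not a library type) and write the same instance by hand. The sketch assumes a GHC recent enough that the built-in function instance is in scope without extra imports.

```haskell
-- Hypothetical wrapper around r -> a, so we can write the
-- instance ourselves without clashing with the built-in one.
newtype Fn r a = Fn { runFn :: r -> a }

instance Functor (Fn r) where
    -- Mapping f over a wrapped function g is just f composed with g.
    fmap f (Fn g) = Fn (f . g)

-- Adds 100 first, then multiplies the result by 3.
tripleAfterAdd :: Fn Int Int
tripleAfterAdd = fmap (*3) (Fn (+100))

-- The built-in instance agrees with plain composition.
viaFmap, viaCompose :: Int -> Int
viaFmap    = fmap (*3) (+100)
viaCompose = (*3) . (+100)
```

Evaluating runFn tripleAfterAdd 1, viaFmap 1 and viaCompose 1 gives 303 each time, matching the GHCi session above.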
How does the box analogy hold here? Well, if you stretch it, it holds. When we use fmap (+3) over Just 3, it's easy to imagine the Maybe as a box that has some contents on which we apply the function (+3). But what about when we're doing fmap (*3) (+100)? Well, you can think of the function (+100) as a box that contains its eventual result. Sort of like how an I/O action can be thought of as a box that will go out into the real world and fetch some result. Using fmap (*3) on (+100) will create another function that acts like (+100), only before producing a result, (*3) will be applied to that result. Now we can see how fmap acts just like . for functions.
The fact that fmap is function composition when used on functions isn't so terribly useful right now, but at least it's very interesting. It also bends our minds a bit and lets us see how things that act more like computations than boxes (IO and (->) r) can be functors. The function being mapped over a computation results in the same computation but the result of that computation is modified with the function.
Before we go on to the rules that fmap should follow, let's think about the type of fmap once more. Its type is fmap :: (a -> b) -> f a -> f b. We're missing the class constraint (Functor f) =>, but we left it out here for brevity, because we're talking about functors anyway so we know what the f stands for. When we first learned about curried functions, we said that all Haskell functions actually take one parameter. A function a -> b -> c actually takes just one parameter of type a and then returns a function b -> c, which takes one parameter and returns a c. That's how if we call a function with too few parameters (i.e. partially apply it), we get back a function that takes the number of parameters that we left out (if we're thinking about functions as taking several parameters again). So a -> b -> c can be written as a -> (b -> c), to make the currying more apparent.
In the same vein, if we write fmap :: (a -> b) -> (f a -> f b), we can think of fmap not as a function that takes one function and a functor and returns a functor, but as a function that takes a function and returns a new function that's just like the old one, only it takes a functor as a parameter and returns a functor as the result. It takes an a -> b function and returns a function f a -> f b. This is called lifting a function. Let's play around with that idea by using GHCI's :t command:
ghci> :t fmap (*2)
fmap (*2) :: (Num a, Functor f) => f a -> f a
ghci> :t fmap (replicate 3)
fmap (replicate 3) :: (Functor f) => f a -> f [a]
The expression fmap (*2) is a function that takes a functor f over numbers and returns a functor over numbers. That functor can be a list, a Maybe, an Either String, whatever. The expression fmap (replicate 3) will take a functor over any type and return a functor over a list of elements of that type.
When we say a functor over numbers, you can think of that as a functor that has numbers in it. The former is a bit fancier and more technically correct, but the latter is usually easier to get.
This is even more apparent if we partially apply, say, fmap (++"!") and then bind it to a name in GHCI.
You can think of fmap as either a function that takes a function and a functor and then maps that function over the functor, or you can think of it as a function that takes a function and lifts that function so that it operates on functors. Both views are correct and in Haskell, equivalent.
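Here's a small sketch of the lifting view. It partially applies fmap to (++"!") and gives the result a name, shout (a name made up for this example), and then uses that one lifted function on three different functors.

```haskell
-- Lift an ordinary String -> String function so that it
-- works on any functor that holds Strings.
shout :: Functor f => f String -> f String
shout = fmap (++ "!")

-- The same lifted function, used on three different functors.
shoutedMaybe :: Maybe String
shoutedMaybe = shout (Just "hey")

shoutedList :: [String]
shoutedList = shout ["hey", "ho"]

shoutedEither :: Either Int String
shoutedEither = shout (Right "hey")
```

The results are Just "hey!", ["hey!","ho!"] and Right "hey!". Which fmap implementation runs is decided by the type of the argument we hand to shout.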
The type fmap (replicate 3) :: (Functor f) => f a -> f [a] means that the function will work on any functor. What exactly it will do depends on which functor we use it on. If we use fmap (replicate 3) on a list, the list's implementation for fmap will be chosen, which is just map. If we use it on a Maybe a, it'll apply replicate 3 to the value inside the Just, or if it's Nothing, then it stays Nothing.
ghci> fmap (replicate 3) [1,2,3,4]
[[1,1,1],[2,2,2],[3,3,3],[4,4,4]]
ghci> fmap (replicate 3) (Just 4)
Just [4,4,4]
ghci> fmap (replicate 3) (Right "blah")
Right ["blah","blah","blah"]
ghci> fmap (replicate 3) Nothing
Nothing
ghci> fmap (replicate 3) (Left "foo")
Left "foo"
Next up, we're going to look at the functor laws. In order for something to be a functor, it should satisfy some laws. All functors are expected to exhibit certain kinds of functor-like properties and behaviors. They should reliably behave as things that can be mapped over. Calling fmap on a functor should just map a function over the functor, nothing more. This behavior is described in the functor laws. There are two of them that all instances of Functor should abide by. They aren't enforced by Haskell automatically, so you have to test them out yourself.
The first functor law states that if we map the id function over a functor, the functor that we get back should be the same as the original functor. If we write that a bit more formally, it means that fmap id = id. So essentially, this says that if we do fmap id over a functor, it should be the same as just calling id on the functor. Remember, id is the identity function, which just returns its parameter unmodified. It can also be written as \x -> x. If we view the functor as something that can be mapped over, the fmap id = id law seems kind of trivial or obvious.
Let's see if this law holds for a few values of functors.
ghci> fmap id (Just 3)
Just 3
ghci> id (Just 3)
Just 3
ghci> fmap id [1..5]
[1,2,3,4,5]
ghci> id [1..5]
[1,2,3,4,5]
ghci> fmap id []
[]
ghci> fmap id Nothing
Nothing
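Typing these checks by hand gets old fast, so here's a minimal sketch of a helper that does the comparison for us. identityLawHolds is a hypothetical name, and the helper only tests the particular values we give it, so it can hint that the law holds but can't prove it.

```haskell
-- True when mapping id over a value gives back the same value.
-- This only tests individual values; it is not a proof of the law.
identityLawHolds :: (Functor f, Eq (f a)) => f a -> Bool
identityLawHolds x = fmap id x == id x
```

For example, identityLawHolds (Just 3) and identityLawHolds [1..5] both evaluate to True.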
If we look at the implementation of fmap for, say, Maybe, we can figure out why the first functor law holds.
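As a sketch of that reasoning, here's the instance written for Option, a hypothetical copy of Maybe that we define ourselves so it doesn't clash with the instance the real Maybe already has. The equations mirror the usual ones for Maybe.

```haskell
-- A hypothetical copy of Maybe, so the instance does not
-- clash with the one already defined for the real Maybe.
data Option a = None | Some a
    deriving (Show, Eq)

instance Functor Option where
    fmap f (Some x) = Some (f x)
    fmap _ None     = None

-- Substituting id for f:
--   fmap id (Some x) = Some (id x) = Some x
--   fmap id None     = None
-- Both cases hand back their input unchanged, so fmap id = id.
```

There are only two cases a value can fall into, and in each one fmap id returns exactly what it was given.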
© Miran Lipovača. License: creative commons attribution noncommercial blah blah blah ... license
