Lifting monads to Kleisli categories with distributive laws

It seems my first non-trivial post on my new website is going to be something fairly technical and likely not of any interest to anyone in particular. However, it’s related to something I’m working on at the moment, and a little perfunctory searching seems to indicate that there’s some sort of hole in the literature here, at least as far as my understanding goes. This is partly also a means to justify my faffing about to get MathJax working on this site. I assume familiarity with functors and natural transformations, but not much category theory beyond that.

A monad (on some category 𝒞\mathcal{C}), historically also known as a triple, consists of three pieces of data:

  1. a functor T:𝒞𝒞\left. T:\mathcal{C}\rightarrow\mathcal{C} \right.;
  2. a natural transformation η=ηT:1T\left. \eta = \eta^{T}:1\rightarrow T \right. called the unit; and
  3. a natural transformation μ=μT:TTT\left. \mu = \mu^{T}:TT\rightarrow T \right. called multiplication

such that μTμ=μμT\mu \circ T\mu = \mu \circ \mu T (associativity) and μTη=μηT=1\mu \circ T\eta = \mu \circ \eta T = 1 (unit law), famously summarised as ‘a monad is a monoid in the category of endofunctors’1. Given a monad TT, we may form its Kleisli category 𝐊𝐥(T){\mathbf{K}\mathbf{l}}(T), which has the same objects as the underlying category and where an arrow XYX\nrightarrow Y is a 𝒞\mathcal{C}-arrow XTY\left. X\rightarrow TY \right.. I’ll use this different notation for Kleisli arrows because it’s important to keep straight the distinction between them and arrows in the underlying category. The identities are given by the unit maps ηX\eta_{X} and composition of f:XYf:X\nrightarrow Y and g:YZg:Y\nrightarrow Z uses the multiplication:

XfTYTgTTZμZTZX\overset{f}{\rightarrow}TY\overset{Tg}{\rightarrow}TTZ\overset{\mu_{Z}}{\rightarrow}TZ

This construction is well-known from its use in functional programming to model effects.

Distributive laws come up a lot in the world of string diagrams and SMTs, where we use them to combine two theories together subject to some laws governing their interaction. The most familiar example of a distributive law is multiplication over addition:

a(b+c)=ab+aca(b + c) = ab + ac

Beck (p95) extended the notion to monads, defining a law distributing a monad SS over a monad TT to be a natural transformation λ:STTS\left. \lambda:ST\rightarrow TS \right. satisfying certain coherence laws. In the paper, he in fact gives a way to recover the multiplication/addition law back from this construction, which is pretty cool.

The main theorem related to distributive laws of monads is the equivalence of

  1. a distributive law of SS over TT;
  2. a coherent multiplication on TSTS;
  3. a lifting of TT to the category of SS-algebras (which I haven’t introduced, since it isn’t really relevant to this post); and
  4. a lifting of SS to the Kleisli category of TT.

Now, this certainly is true, but this is where the hole appears. Beck’s original paper omits this last condition (perhaps because Kleisli categories weren’t widespread back then?). On the other hand, Cheng implicitly assumes the result for Kleisli categories. I have a feeling that it possibly follows out of the treatment of adjoint functors in Section 3 in the Beck paper, but I can’t quite get my head around it, and in any case I needed to write the construction down from first principles for my application.

To warm up, let’s look at the most well-known use for a distributive law λ:STTS\left. \lambda:ST\rightarrow TS \right.: giving the composite functor TSTS a monad structure. Simply composing the units ηTηS\eta^{T} \circ \eta^{S} gives a natural transformation which will be the unit of the composite monad, but the bit which actually requires a distributive law is defining a multiplication μTS:TSTSTS\left. \mu^{TS}:TSTS\rightarrow TS \right.. It’s one of those times when there’s essentially only one thing you can write down, and that is exactly what you need:

TSTSTλSTTSSμTμSTSTSTS\overset{T\lambda S}{\rightarrow}TTSS\overset{\mu^{T}\mu^{S}}{\rightarrow}TS

and all the relevant coherence conditions match up to satisfy the monad laws.

For the Kleisli category, we need to define a monad S̃\widetilde{S} on 𝐊𝐥(T){\mathbf{K}\mathbf{l}}(T). Even just to define the functor part we already need the distributive law: the objects are the same, but for a map f:XYf:X\nrightarrow Y, we need to define

S̃f:SXTSY=SXSfSTYλYTSY\left. \widetilde{S}f:SX\rightarrow TSY = SX\overset{Sf}{\rightarrow}STY\overset{\lambda_{Y}}{\rightarrow}TSY \right.

So handwaving over the proofs that this is a proper functor, we now just need a unit and multiplication. The unit is again just a composite of the constituent units:

ηX:XS̃X=XηXSSXηSXTTSX\eta_{X}:X\nrightarrow\widetilde{S}X = X\overset{\eta_{X}^{S}}{\rightarrow}SX\overset{\eta_{SX}^{T}}{\rightarrow}TSX

and the multiplication is a little more fiddly, but still straightforward:

μX:S̃S̃XS̃X=SSXμXSSXSηXTSTXλXTSX\mu_{X}:\widetilde{S}\widetilde{S}X\nrightarrow\widetilde{S}X = SSX\overset{\mu_{X}^{S}}{\rightarrow}SX\overset{S\eta_{X}^{T}}{\rightarrow}STX\overset{\lambda_{X}}{\rightarrow}TSX

And, again glossing over all the annoying coherence proofs, we’re done!

There are quite a few pieces I’ve left out of all of this, notably going back the other way, from a lifting of SS to 𝐊𝐥(T){\mathbf{K}\mathbf{l}}(T) to a distributive law STTS\left. ST\rightarrow TS \right., and most of the verifications that all these things satisfy the laws they are supposed to, but this is mostly what I wanted to figure out. Hopefully it’s useful to someone other than me. Any mistakes or comments, send me an email or tweet and I’d be happy to discuss.