November 10, 2008

`It's more likely than you think.`

A monoid is a simple algebraic structure defined as a set **M** equipped with an operator ***** and a neutral element **1** such that the following properties hold:

The operator might (but isn't required to) obey other properties like:associativity∀x,y,z ∈ M: x*(y*z)=(x*y)*zidentity∀x ∈ M: x*1=1*x=xclosure∀x,y ∈ M ∃z ∈ M: x*y=z

idempotency∀x ∈ M: x*x=xcommutativity∀x,y ∈ M: x*y=y*x

In programming we often encounter *free monoids*, where elements of the set M are zero-or-more instances from some other underlying set. Free monoids are often described as strings over some alphabet (where the "alphabet" might consist of characters, integers, or WhizBangObjects).

To see how a data structure can behave like a monoid, let's consider integer lists under the concatenation operation `++`

. To be a monoid we need a neutral element in the set of lists. Empty lists fit the bill (since concatenating with an empty list is an identity operation). Formally, we can call this the monoid (

) or alternatively the free monoid over **List<int>**, **++**, **[]**`int`

values.

Are lists also idempotent or commutative? It's easy to see that they're not.example[1,2] ++ [3,4] = [1,2,3,4]

associativity∀L_{1},L_{2},L_{3}∈ List<int>: (L_{1}++L_{2})++L_{3}=L_{1}++(L_{2}++L_{3})identity∀L ∈ List<int>: L++[] = []++L = Lclosure∀L_{1},L_{2}∈ List<int> ∃L_{3}∈ List<int>: L_{1}++L_{2}=L_{3}

Now, let's dream a little dream and say we magically made thenot idempotent[7] ++ [7]=[7,7] [7]≠[7,7]not commutative[10] ++ [20]=[10,20] [20] ++ [10]=[20,10] [10,20]≠[20, 10]

`++`

commutative. That is:
[20] ++ [10] ++ [20]What an odd situation. Clearly, we're not dealing with lists anymore. In fact, just by twiddling a single algebraic knob we've discovered multisets. If we were feeling particularly nerdy we could drop the word "multiset" altogether and just say "commutative free monoid". This is usually a winning strategy at parties.=[20,10,20]=[10,20,20]=[20,20,10]!

Anyhow, let's push this bizarre situation a little further, and assert that `++`

should be both commutative **and** idempotent. Now we get:

[20] ++ [10] ++ [20] = [20] ++ [20] ++ [10] = [20] ++ [10] = [20,10]Look ma, it's a set!

Datatype | Operator | Neutral Element | Algebraic Properties |
---|---|---|---|

List<T> | List<T>.concat | empty list | |

Multiset<T> | Multiset<T>.union | empty multiset | commutative |

Set<T> | Set<T>.union | empty set | commutative, idempotent |

cgibbard presented a really neat use of monoids to combine comparison operators.

sigfpe wrote:

*The most important insight I ever had relating mathematics and computating was when I suddenly realised that when a mathematician talks about 'abstraction' they mean exactly the same thing as a computer scientist. It's a shared interface onto a bunch of different objects.*

*A monoid is a great example: the interface consists of an object called the identity and a function taking two arguments and returning one. There are then some rules about how these things interact. But the point is that both mathematics and computer science are full of things that 'expose' this interface. Groups, monoids, categories, rings, these are all just interfaces. And what's cool is that for each of these interfaces there are many objects that expose that interface, meaning that if you write code for the interface, the same code is reusable with all of those objects.
*

AlanCrowe pointed out that I'm writing about something that's been named the Bloom Hierarchy

otto_s argued (convincingly) that functions under composition aren't isomorphic to containers of functions (but rather it's the syntactic terms associated with functions that are). To avoid confusion, I dropped the isomorphism from the table below.

Datatype | Operator | Neutral Element | Algebraic Properties | Isomorphic Monoids |
---|---|---|---|---|

String | String.concat | empty string | List<Char> | |

Nullable<T> |
op(x,y) =
if (x.HasValue) then x else y |
null | idempotent | |

functions T→T | composition | id(x) = x | inherits algebraic properties of functions used (ie, if all functions used are commutative, then the composition monoid is also commutative) |