(A really quick post today). Following a recent post, Run length encoding in Python, I thought it would be nice to look at this simple encoding in Haskell. The RLE idea is simple, given some input:
"aaaabbaaa"
compress it by taking the length of each run of characters:
"4a2b3a"
Running this in GHCi:
> encode "aaaabbaaa" [(4,'a'),(2,'b'),(3,'a')] > decode (encode "aaaabbaaa") "aaaabbaaa"
So there’s a simple QuickCheck property we’ll want to establish. First, let’s look at the Python implementations:
def encode(l): return [(len(list(group)),name) for name, group in groupby(l)] def decode(l): return sum([length * [item] for length,item in l],[])
Ok, so group each run into a list, then pair up the length, with the list item, and to decode, replicate the character by its length, and concatenate. Easy in Haskell:
encode = map (length &&& head) . group decode = concatMap (uncurry replicate)
Who says static typing implies verbosity?
We also get a generic implementation for free, given that Haskell infers that the code never inspects the elements in the list directly, our `encode’ function will just as happily encode Strings as lists of numbers or any other list. That is, ‘encode’ will work on any type `a’ that looks like it can do (==).
encode :: Eq a => [a] -> [(Int, a)]
So:
> encode [1,1,1,1,2,2,2,7,7] [(4,1),(3,2),(2,7)] > encode ["foo", "foo", "bar", "foo", "foo", "foo"] [(2,"foo"),(1,"bar"),(3,"foo")]
Polymorphism rocks.
The original poster used a couple of unit tests, but in Haskell we’d usually specify generic properties the function must satisfy:
prop_id :: String -> Bool prop_id xs = decode (encode xs) == xs
QuickCheck then generates input for us:
> quickCheck prop_id OK, passed 100 tests.
So go out and sketch your little algorithms in (shorter, faster, safer) pseudo-Python :-)
Thanks to psykotic for the idea!