While reading through Real World Haskell a short code snippet got me thinking about folds. Chapter 4, devoted to functional programming in general, uses a brief example built around the tails function to illustrate Haskell's as-patterns. This function makes it's home in the Data.List module and, given an input string s, will return a list of all results of tail x where x is a sublist of s. It's a bit convoluted to describe in prose, but the example offered by O'Sullivan et. al. makes it clear:

[@varese ~]$ ghci

...

*Main> :m +Data.List

*Main Data.List> tails "foobar"

["foobar","oobar","obar","bar","ar","r",""]

Maybe it was the fact that this function was introduced just after a lengthy discussion on left and right folds in the same chapter. Maybe it was due to an uptick in my use of folds when working with Scala. Whatever the cause, a thought grabbed hold of me: this function should be easy to implement by folding, shouldn't it? Unfortunately my previous attempt to sit down with Haskell code didn't quite work out as expected. Throw in a new FC 15 install (and all the goodies it contains) and there was no end of temptation for the unfocused. Fortunately good sense prevailed: GHC was installed, Chicken Scheme was not and it was time to start.

My intuition turned out to be correct; a version using foldl came together fairly quickly:

The overall algorithm used here is fairly straightforward. Beginning with an empty list we move left to right and do two things with every character c we find:

- Create a new string in our accumulator list containing c
- Add c to every string already in the accumulator

Visualizing the "left to right" evaluation of foldl gives us something like the following:

- r1 = append 'f' to every string in [] and add string 'f'
- r2 = append 'o' to every string in r1 and add string 'o'
- r3 = append 'o' to every string in r2 and add string 'o'

The result is exactly what we expect:

*Main> mytailsl "foobar"

["foobar","oobar","obar","bar","ar","r",""]

So we now have foldl... what about foldr? At this point I must offer a confession; left folds have always seemed intuitive to me while right folds most definitely have not. A left fold appears to be equivalent to a simple traversal of a list from "start" to "finish", gathering data as you move along. A right folds seems artificial somehow, like some kind of inside-out list traversal. And while a simple mental model for right folds remains somewhat elusive working through this sample did ease the pain somewhat. It took a bit of work, but after getting a handle on the order of evaluation a working implementation emerged:

This algorithm is a mirror image of the one used with foldl; since we're moving right to left we have to build member strings from back to front. For every new character c we build a new string by appending c to the largest string in the accumulator. We then return a new list with the new string for a head and the old accumulator for a tail. Note that we have to account for the empty list case when evaluating the first expression. Using this version on our test string we get:

*Main> mytailsr "foobar"

["foobar","oobar","obar","bar","ar","r",""]

Interestingly, while struggling to come to terms with right folds a related function presented itself. Taking the algorithm used in the left fold case and applying it directly to a right fold yields a function that might be called antitails. More precisely, for an input string s this function returns a list of strings (s - t) for every t in tails s.

*Main> reverse (antitails "foobar")

["","f","fo","foo","foob","fooba","foobar"]

*Main> mytailsl "foobar"

["foobar","oobar","obar","bar","ar","r",""]

The logic behind this operation looks something like the following:

Finally, it should be noted that the actual implementation of tails in GHC appears to be a recursive definition built around... as-patterns.

## No comments:

## Post a Comment