How to do Merge Sort in Haskell

Overview

Haskell is a functional programming language. It is a programming paradigm where we change data through expressions that don’t have side effects. These expressions are often called pure functions. Let’s see how we can implement one of the most popular sorting algorithms, Merge Sort, in Haskell.

Algorithm

Merge Sort uses the divide and conquer technique for sorting. We divide the input list into smaller lists and solve the sorting problem for each of them. Then, we combine the results to form the final solution. We use the following algorithm:

Divide the input list into two halves until we reach the base case (list of size 1).
Conquer by trying to sort the smaller lists when the base has reached.
Combine the two sorted lists into one.

Program

Let’s translate the algorithm above into a Haskell program:

-- Divide a list into two halves (even and odd positions)
divide :: [Int] -> ([Int], [Int])
divide = \list -> 
    case list of
        [] -> ([], [])
        x:xs -> 
            let (odds, evens) = divide xs
            in (x:evens, odds)
-- Merge two lists in a sorted fashion
merge :: [Int] -> [Int] -> [Int]
merge = \s1 -> \s2 ->
    case s1 of
        [] -> s2
        x:xs ->
            case s2 of
                [] -> s1
                y:ys | x>y -> y:merge s1 ys
                _ -> x:merge xs s2
-- Sort the list using the merge sort algorithm
mergesort :: [Int] -> [Int]
mergesort = \list ->
    case list of
        [] -> []
        [x] -> [x]
        _ ->
            let (evens, odds) = divide list
            in merge (mergesort evens) (mergesort odds)
-- main = print (mergesort [4,4,1,8,9,2])
main = print (merge [1,4] []) -- Testing merge

Explanation

The `divide` function

The divide function divides a list into two halves based on odd and even index positions.

Line 2: divide :: [Int] -> ([Int], [Int]) is the type annotation for our function called divide. Haskell is statically typed, and every function or variable has a type. In this case, divide is a function that takes a list of the Int type and returns a tuple of lists (type Int).
Line 3: We have the function definition, which shows that our function takes a list. It also takes a case expression on the list that implements the recursive formulation of the problem.
Line 5: The first equation shows that if there is any empty list, then return a tuple of empty lists is returned.
Line 6: The second equation implements the general case. x:xs is a syntax in Haskell that neatly provides us with the head of the list, x, and the remainder list xs. We do a recursive call of the divide function on the list xs. We can safely assume it correctly provides the solution in odds and evens. For the complete list x:xs, we attach the head x with evens and return the tuple (evens, odds) for the final solution.

The `merge` function

The merge function solves our problem’s ‘conquer’ step. It merges two lists into one such that the returned list is sorted.

Line 11: merge :: [Int] -> [Int] -> [Int] is the type annotation for the function merge. The type annotation shows that our function takes two lists of the Int type as arguments and returns a list of the Int type (the merged list).
Line 12: We have the function definition, which shows that our functions takes two lists, s1 s2. This is followed by a nested case expression on these lists that implements the problem’s recursive formulation.
Line 14: The first equation tells that if the first list is empty, we return the other list, s2. This is because there’s nothing to merge.
Line 17: We have the general case where the first list is not empty. This is followed by a case expression on the second list, s2. If the second list is empty, the first list s1 is returned. For the cases where both lists are not empty, we compare the head of both lists x and y, respectively. If x is greater than y, then y should be the head of the merged list. We call the merge function again on list s1 and ys (the remainder of the second list) for the remaining elements.
The last equation represents the flip case which follows the same idea as above. x is the head of the merged list. We call the merge function on the lists s2 and xs (the remainder of the second list) for the remaining elements.

Note: In this explanation, we avoid visualizing what happens during the program execution of a recursive function. Instead, we treat the recursive function as a black box to solve minor valid problems.

Visualizing the recursive workflow

The Merge Sort function

Let’s put all the pieces together in the final function.

Line 22: mergesort :: [Int] -> [Int] is the type annotation for the mergesort function. The type annotation shows that our function takes a list of the Int type as an argument. It also returns a list of the Int type (the sorted list).
Line 23: We have the function definition that shows that our function takes a list. This is followed by a case expression on the list that implements the problem’s recursive formulation. The first two equations implement simple cases. If the input list is empty, return an empty list. If a list contains a single element, the same list is returned.
Line 27: The final equation represents the general solution for all lists whose size is greater than 1. We call our divide function, which divides the list into two halves. We call our mergesort function recursively on each list. We can safely assume that it correctly sorts each individual list. Finally, we call our merge function that combines the two sorted lists into a single list and returns it.