Are your Functions Total?

In recent years, and for a variety of reasons, functional programming has become an increasingly popular topic. Languages like Haskell, Scala, Clojure, and others, have had huge gains in popularity and consequently there have been many large-scale projects written in some of these languages. There has also been increasing pressure to move programming more towards the rigor of mathematics, something which I strongly support.

One thing which could get more attention in the context of functional programming is the concept of Total Functional Programming, which uses the concepts of total functions combined with specific types of recursion in order to guarantee that all your programs terminate. Astute readers will note that this means, since the Halting Problem is not decidable over Turing machines, that a Total Functional Programming language is not Turing Complete. In practical use, this is not an undesirable property and in fact systems written in Turing complete languages are often restricted to be Turing incomplete by, for example, adding termination conditions to programs so that they do not run indefinitely. If you’ve ever been working on some kind of numerical program and added something like if loopCounter > limit then return in order to stop a loop which has sufficiently converged, you have done exactly this.

Another benefit of Total Functional Programing arises due to the Curry-Howard Isomorphism, which tells us that programs are representations of mathematical proofs, and therefore if we had a Turing complete language in which these programs (proofs) did not terminate then we would have significant logical inconsistencies. In the total functional programming context, these consistencies are avoided. This can have some nice implications from the theoretical computer science perspective.

In practical terms though, the benefits of Total Functional Programming, or at least making sure that functions are always total, is that software is more likely to be correct, easier to reason about, and because its construction will be more rigorous we can assume that the results will be of a higher quality. This is especially important in certain classes of systems, but as these tools are becoming more available and known to more developers it makes sense to use them where appropriate. Total functional programming comprises two design aspects, total functions and limited recursion.

Total functions

A total function is a function which is defined for all elements of its domain. This is in contrast to a partial function which is defined only on a subset of its domain. As an example, division as an operation is not total because it is not defined for $0 \in \mathbb{R}$. However, the division function can be made total by specifying a return value for 0.

The Haskell wiki has a page specifically devoted to avoiding partial functions which is worth a read.

The benefit to keeping your functions total where possible (and it’s almost always possible) is that you can eliminate the problem of functions returning values for which there is no proper definition or handling, because the function is defined on all possible values in its domain. In strong and statically typed languages, this allows for the compilers to provide lots of extra support in developing error-free programs since you can have the type checker confirm your functions are not doing to end up in some undefined or error state. The Haskell wiki is helpful when it comes to further examples of non-total functions:

A good example of this is head. You shouldn’t use this function or write functions like it. The problem is in the type, it says [a] -> a which is actually impossible as far as total functions are concerned. Why? Because you might have an empty list! Instead, a more honest type that lets you write a total function would be [a] -> Maybe a. This makes the possibility of not getting a result more explicit and keeps your functions total.

Another way, besides adding further definitions for elements in your domain for which your function is not defined, would be to exclude those elements from the domain of your functions altogether and to have this exclusion be enforced by the type system. This is possible with Refinement Types, which are essentially just types paired with some type of predicate that causes a compile-time error when not satisfied. Note, refinement types are related to Behavioral Subtyping (Liskov Substitution Principle, the L in SOLID). For the example of the division operator, we could simply exclude 0 as a valid element from the domain of our function. This is straightforward, as you can see in this example using Liquid Haskell:

{-@ divide :: Int -> {v: Int | v != 0 } -> Int @-}

Simply by making all your functions total, you make huge amounts of progress along the path of more stable and correct programs.

Restricted recursion

This point is a bit more subtle, but necessary in order to achieve guaranteed termination of programs. In order for this termination to occur, we must ensure that there is no infinite recursion. There are a few ways to go about this, the simplest of which is probably substructural recursion, which is simply writing functions which only recurse on a subset of the data structure on which it operates (that is, only using primitive recursive functions). So in the case of a list, the recursion will always occur on a subset of the original list. As an example, if you were summing the even elements of a list then you could do it in the following way (in Haskell):

sumEven :: [Int] -> Int
sumEven [] = 0
sumEven (x:xs)
	| even x = x + sumEven xs
	| otherwise = sumEven xs

Note also that the way you’d implement this in idiomatic Haskell would probably be to simply use :code:sumEven = sum $ filter even, which in the end comes down to a right fold:

sumEven' :: [Int] -> Int
sumEven' = foldr (+) 0 . filter even

And also note that the implementation of foldr in the Haskell Prelude uses the standard form of substructural recursion with a helper function, traditionally called go:

foldr k z = go
	where
		go []     = z
		go (y:ys) = y `k` go ys

Note also that the foldr implementation is tail recursive while our original sumEven implementation is not.

Conclusion

With only these two techniques, we can make use of the benefits of total functional programming. Compile-time checking that functions will never encounter any arguments for which they aren’t defined, and confirmation that all computations are terminating, can bring huge amounts of safety and reliability to software. These tools and techniques exist widely and are easy to use in all languages, so there’s really no reason not to make use of them wherever you can. It will completely eliminate runtime errors of that sort. Are your functions total? If not, why not?