Many of us have heard a lot of talk about functional programming and its benefits, especially when it comes to highly concurrent applications where thread locking and synchronization would be necessary in order to avoid mangling state among processes.
Languages like Scala, Erlang, and Clojure are increasingly coming to mind as people look for ways to take advantage of multiple-core processors without the headaches threading can create. While I'm not quite ready to take the plunge and completely eliminate state from my programming, I'm looking for more and more ways to get rid of it when possible.
In doing so, I've come across several situations where a simple refactoring to a more functional style can yield rewards in readability and testability, and I'd like to talk about one of them here.
while loops are one of the imperative programmer's most basic tools,
and it might seem difficult to eliminate the mutable state they rely on. However, a little
persistence can pay off. Let's start with the following Scala code:
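A minimal sketch of such a function, assuming the name printUpTo, the parameter limit, and the "i = " prefix that come up later in this post (the exact body is an assumption):

```scala
// Imperative version: a mutable counter drives the loop.
def printUpTo(limit: Int): Unit = {
  var i = 0                 // var: this value is reassigned on every pass
  while (i <= limit) {      // inclusive upper bound, matching (0 to limit)
    println("i = " + i)
    i += 1                  // forgetting this line would loop forever
  }
}
```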
It's clear that state is changing inside this function, highlighted by the use of Scala's
var keyword, which marks a variable that is allowed to change
(val variables are unmodifiable).
While this state change isn't visible from the outside (a new copy of the variable is
created each time the function is entered),
a var often means there's
unnecessary clutter in the code and that it can be simplified. And indeed it can.
First of all, any programmer worth her salt would change this to a
for loop. A Java implementation might look like this:
Scala's for loops look more like Java's foreach syntax:
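As a sketch, the same function written with a for loop over a range might look like this (the name printUpTo and the "i = " prefix are carried over from the rest of the post):

```scala
// The range (0 to limit) is inclusive, so limit itself is printed too.
def printUpTo(limit: Int): Unit = {
  for (i <- 0 to limit) {
    println("i = " + i)
  }
}
```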
The iterating structure is Java-like (with a left arrow instead of :), and the range
(0 to limit) is a Scala built-in, but there are also libraries for ranges
in Java, Ruby, and plenty of other languages.
The code in the last example seems to more clearly encapsulate the structure of the function.
The first and last lines in the while-based version
are just boilerplate code, and while almost any programmer can easily follow the structure of
the code, we should always be striving for improvement.
In general, I find that I'm less likely to have bugs with a smaller codebase. Certainly there are exceptions (regular expressions come to mind), but assuming that two pieces of code both read well, I'd rather have less to read.
In the preceding Scala for loop syntax,
i is implicitly a val
within the context of the loop, which means it can't be reassigned to a new value. This
is a great step forward, as we could initially have created an infinite loop by manipulating
the value of
i inside the while loop. We have now insulated ourselves against
this type of change.
Another functional way to write essentially the same loop would be to use Scala's
foreach, a method available on Lists and Ranges that takes a function as a parameter.
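A sketch of the foreach form, again assuming the printUpTo name and "i = " prefix used elsewhere in the post:

```scala
// foreach is called on the range and receives a function literal.
def printUpTo(limit: Int): Unit = {
  (0 to limit).foreach { i =>
    println("i = " + i)
  }
}
```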
This seems more expressive to me, as it emphasizes the importance of the range object in the loop.
It would also have given us the ability to use a shorthand if we'd not needed the
"i =" text:
But there is also something to be said for the familiarity of the more traditional
for loop.
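For reference, the shorthand mentioned above might be sketched as follows. The name printBareUpTo is hypothetical, chosen only to make clear that this variant prints the bare numbers without the "i = " prefix:

```scala
// println is passed directly as the function; each element of the
// range becomes its argument, so no explicit parameter is needed.
def printBareUpTo(limit: Int): Unit =
  (0 to limit).foreach(println)
```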
The code above is still not completely functional, however, because the
println statement depends upon the definition of standard output at the time (imagine redefining
it to a
java.io.ByteArrayOutputStream, for example). This also makes the function
hard to test.
Now, if we're going to build an application to work at the console, at some point
we're going to have to actually use
println if we want the user to know that the
application is working. Does this mean that the functional approach completely eschews input and
output streams? Certainly not, but we have two problems in the example at hand:
printUpTo is responsible both for constructing each of the output strings and for actually printing them. The function therefore has two reasons to change: we might want to change the information that gets printed on each line or its formatting, and we might also want to change how the output gets to the user.
printUpTo is awkward to test. In order to verify that the correct output goes to the user, we'll need to either find a mocking library to change the way the Console object behaves, or redirect standard output to another stream that we can read to verify its contents afterwards (maybe a java.io.ByteArrayOutputStream).
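To make the second problem concrete, here is a minimal sketch of the redirection approach, written as a script. Console.withOut is part of Scala's standard library; the captureOutput helper is a hypothetical name introduced for illustration, and printUpTo is assumed to be the for loop version.

```scala
import java.io.ByteArrayOutputStream

def printUpTo(limit: Int): Unit =
  for (i <- 0 to limit) println("i = " + i)

// Hypothetical helper: runs body with Scala's Console output
// redirected into an in-memory buffer and returns what was printed.
def captureOutput(body: => Unit): String = {
  val buffer = new ByteArrayOutputStream()
  Console.withOut(buffer) { body }
  buffer.toString
}

// The test has to parse console text just to check three strings.
val lines = captureOutput(printUpTo(2)).split("\\r?\\n").toList
assert(lines == List("i = 0", "i = 1", "i = 2"))
```

It works, but every test of the string format has to go through this plumbing.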
I'd prefer a further refactoring in order to alleviate these issues. The problem of verifying
console output in our test code won't be completely eliminated, but we'll at least isolate it
to its own test. Here, we're extracting
println from the string-building part of printUpTo.
Now we can test
applyUpTo by passing our own custom action in, and tests for
println-based behavior can be kept separate, confining the necessary
mocking and redirection to one place.
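A sketch of how this extraction might look, written as a script. The names applyUpTo, printUpTo, and action come from the text; the exact signature (a String => Unit action as the second parameter) is an assumption.

```scala
import scala.collection.mutable.ListBuffer

// Builds each line and hands it to whatever action the caller supplies.
def applyUpTo(limit: Int, action: String => Unit): Unit =
  (0 to limit).foreach(i => action("i = " + i))

// The console-specific piece shrinks to a one-liner.
def printUpTo(limit: Int): Unit = applyUpTo(limit, println)

// A test can now pass its own action and skip the console entirely.
val seen = ListBuffer.empty[String]
applyUpTo(2, line => seen += line)
assert(seen.toList == List("i = 0", "i = 1", "i = 2"))
```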
As a word of explanation, the
action passed in needs to take a String
as an argument and return the Unit value,
(). Now we can also move
printUpTo to a different class if we so desire (assuming this function is part of
a larger application), eliminating the need to recompile that class each
time the string format changes.
I'd most likely put the
applyUpTo call on its own in a
if there's really nothing more to the application.
There is another problem with the code above, however: both of the functions are returning
Unit values (the equivalent of Java's
void). This means that both of these functions rely
on side effects to get their jobs done.
If we were to pass a side effect-free function into
applyUpTo, we wouldn't
actually have any output from
applyUpTo to tell us that we did something. A side
effect is any effect of a function beyond its return value, so we can say that if a Scala function
has a return type of
Unit, it either has side effects or is a trivial function
that we can safely eliminate.
We can make the helper function side effect-free like so:
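A sketch of what this might look like. The buildUpTo name and the Iterable return type come from the text that follows; the bodies are assumptions.

```scala
// Pure: returns the lines instead of printing them.
def buildUpTo(limit: Int): Iterable[String] =
  (0 to limit).map("i = " + _)

// The only remaining side effect lives here.
def printUpTo(limit: Int): Unit =
  buildUpTo(limit).foreach(println)
```

A test for buildUpTo is then a plain comparison, for example checking that buildUpTo(2).toList equals List("i = 0", "i = 1", "i = 2").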
In this case, we're relying on a bit of syntactic sugar for passing the iterating value of
foreach to println, and a placeholder (_) for
map's iterating value.
The buildUpTo function is now more functional and more easily testable (we can
do a fairly simple comparison with the returned Iterable), but it's important to note that
memory usage might be a concern, as the mapped collection of all the values could become pretty large.
Another issue with this is speed: in the more functional version, we're iterating through the collection twice rather than just once. We don't often get things for free, and indeed there is a trade-off here between memory/speed and functionality.
The example here has been on a very small scale, and perhaps a bit contrived, but we can extrapolate the ideas to a wide range of refactorings. We want to remove as much code with side effects and mutable state as we can from the rest of our application.
This way we can divide up the functional parts of our application among processors in order to take advantage of the multiple cores that we all have in our machines these days. There are numerous other situations in which functional programming can lead to cleaner code, and there are also places where it might be inappropriate or impossible.
It's important to remember the most common reason we might want to write functional code:
concurrency. If I'm only going to be running this small application on my local machine,
from the command line, and printing is going to be the limiting factor speed-wise (and it
probably will in this case), I would most likely stop at the simplest
version above and be done with it.
It will be efficient with memory and fast. But when the function
becomes more complex, or we need to run numerous instances of the application, that's where
concurrency, and functional programming, are best applied. Our goal should be to find, and
use, the right tool for the job.