Saturday, 20 February 2010

Thursday, 18 February 2010

Object to Classes - Use Maps

In the 10 years that I have worked as a computer programmer I have worked almost exclusively with object-oriented languages: primarily Java, but also Ruby, Groovy, and Perl (though object-oriented Perl wasn't much fun!). Thinking in objects therefore comes very naturally to me. Part of the challenge of learning Clojure is that it isn't an object-oriented language. I'm having to re-wire my brain to work in the functional paradigm, which is both a challenge and great fun.

I read about abstraction barriers in SICP and wanted to apply the idea to the tool I'm currently developing in Clojure. I created a new namespace to encapsulate the concept of a 'version' (as in a software version; I need functions to manipulate version strings like 1.0.1-b01-SNAPSHOT). Based on what I had read in SICP I created functions like 'make-version' to act as the abstraction barrier. But then the question arose: what should this actually return? Initially, again inspired by SICP, make-version created a closure with a message-passing dispatch function:
;; Returns a closure that dispatches on a selector keyword (message passing).
(defn make-version [major minor patch build snapshot]
  (fn [selector]
    (cond
     (= :major selector) major
     (= :minor selector) minor
     (= :patch selector) patch
     :else (throw (IllegalArgumentException. "Selector not recognized")))))
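To make the shape of this first attempt concrete, here is a small usage sketch. It repeats the closure-based constructor so the snippet runs on its own; the sample values ("b01" as the build, true for snapshot) are only assumptions for illustration.

```clojure
;; Closure-based constructor, repeated here so the snippet is self-contained.
(defn make-version [major minor patch build snapshot]
  (fn [selector]
    (cond
      (= :major selector) major
      (= :minor selector) minor
      (= :patch selector) patch
      :else (throw (IllegalArgumentException. "Selector not recognized")))))

;; Sample values are illustrative assumptions.
(def v (make-version 1 0 1 "b01" true))

(v :major)   ;; => 1
(v :minor)   ;; => 0

;; The "version" is an opaque function, not a collection.
(fn? v)      ;; => true
(map? v)     ;; => false

;; Anything the dispatch function does not list is unreachable:
;; (v :build) throws IllegalArgumentException
;; (keys v)   throws, because a function is not a map
```

Every field has to be exposed by hand in the dispatch function, and none of the generic collection functions can look inside the value.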
But it occurred to me that by doing this I would lose all the benefits of the data structures Clojure provides with respect to concurrency and so on. So I thought about the implementation a bit more, and did some reading on the Clojure Google group. After searching the group for "data abstraction" I found a post by Rich Hickey that switched on a light bulb in my head:
"I know people usually think of collections when they see vector/map/set, and they think classes and types define something else. However, the vast majority of class and type instances in various languages are actually maps, and what the class/type defines is a specification of what should be in the map. Many of the languages don't expose the instances as maps as such and in failing to do so greatly deprive the users of the language from writing generic interoperable code.
Classes and types usually create desert islands. When you say:
//Java 
class Foo {int x; int y; int z;} 

--Haskell 
Foo = Foo {x :: int, y :: int, z :: int}
you end up with types with a dearth of functionality. Sure, you might get hashCode and equals for free, or some other free stuff by deriving from Eq or Show, but the bottom line is you are basically starting from scratch every time. No existing user code can do anything useful with your instances."
[...snip...]
"I guess I want to advocate - don't merely replicate the things with which you are familiar. Try to do things in the Clojure way. If your logical structure is a mapping of names to values, please use a map. Positional data is fragile, non-self-descriptive and unmanageable after a certain length - look at function argument lists. Note that using maps doesn't preclude also having positional constructors, nor does it dictate a space cost for repeating key names - e.g. structmaps provide positional constructors and shared key storage. "
As a follow-up, Stuart Sierra posted a link to a blog entry he had written about how to model data. In it he directly contrasts the OO mindset with the Clojure / functional mindset:
"So here’s a slightly radical notion: don’t use classes to model the real world. Treat data as data. Every modern programming language has at least a few built-in data structures that usually provide all the semantics you need."
It all boils down to this:
"It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." (Alan Perlis)
In OO programming we create data structures at the drop of a hat, but that's not the Clojure way. The Clojure way is to use the data structures provided by the language, and the vast library of functions that know how to manipulate them.

So yes, it's fine to have an abstraction barrier with functions like make-version. But there is no need to create custom data structures using closures, message-passing, and dispatch functions. Just use a map. It's simple, it works, and it's the Clojure way.
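As a minimal sketch of what that can look like, make-version can keep its positional signature and simply return a map. The key names, the field types (build as a string, snapshot as a boolean) and the version->string helper are assumptions made for illustration, not code from the tool itself.

```clojure
(defn make-version
  "Positional constructor: the abstraction barrier stays, the result is a plain map."
  [major minor patch build snapshot]
  {:major major :minor minor :patch patch :build build :snapshot snapshot})

;; Hypothetical helper: renders the map as a version string.
(defn version->string
  [{:keys [major minor patch build snapshot]}]
  (str major "." minor "." patch
       (when build (str "-" build))
       (when snapshot "-SNAPSHOT")))

(def v (make-version 1 0 1 "b01" true))

;; Keywords act as selectors, so no dispatch function is needed.
(:major v)                        ;; => 1

;; The existing library of map functions works on it unchanged.
(select-keys v [:major :minor])   ;; => {:major 1, :minor 0}
(:patch (update-in v [:patch] inc)) ;; => 2
(:patch v)                        ;; => 1, the original map is untouched

(version->string v)               ;; => "1.0.1-b01-SNAPSHOT"
```

Callers still go through make-version, so the representation could change later, but in the meantime assoc, update-in, select-keys and friends all come for free.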

Wednesday, 17 February 2010

mvn clojure:swank throws java.lang.NumberFormatException: Invalid number: 2009-09-14

For some reason the 'swank' goal of the clojure-maven-plugin isn't working for me on Windows. At home, on Ubuntu, it works fine. At work, with the same code base, it fails with the following exception:

Exception in thread "main" clojure.lang.LispReader$ReaderException: java.lang.NumberFormatException: Invalid number: 2009-09-14

I decided to work around this by extending my 'clj' starter script. I added a --pom switch which tells the script to build the classpath using the pom file in the current directory. It makes use of the build-classpath goal of the maven-dependency-plugin. So now at work I can start up a Swank server simply by typing:

clj --pom "c:\dev\nwalex.com\clojure-scripts\start-swank.clj"

I have this aliased to 'swank' in my .aliases file in Cygwin.

Anyway, here's the full script:
#!/bin/bash

function init_classpath {
    # set up the classpath dynamically. Note this includes the jline jar
    CLASSPATH=""
    for jarfile in `ls -l ~/.clojure-classpath/ | pcol 11`; do
        JAR=`cygpath --windows $jarfile`
        CLASSPATH="$CLASSPATH;$JAR"
    done
}

function init_classpath_from_pom {
    echo "Initializing classpath from pom.xml..."

    # create the classpath file, discarding both stdout and stderr
    mvn dependency:build-classpath -Dmdep.outputFile=classpath > /dev/null 2>&1

    # store it and remove the file
    CLASSPATH=$(cat classpath)
    rm classpath

    # also add everything under the clojure source directories
    CLASSPATH="$CLASSPATH;./src/main/clojure;./src/test/clojure"
}
 
if [ $# -eq 0 ] ; then
    init_classpath
    stty -icanon min 1 -echo
    java -Djline.terminal=jline.UnixTerminal -cp "$CLASSPATH" jline.ConsoleRunner clojure.main
else
    TMPFILE=""
    while [ $# -gt 0 ] ; do
        case "$1" in
        --pom)
            init_classpath_from_pom
            ;;
        -cp|--classpath)
            CLASSPATH="$CLASSPATH;$2"
            shift
            ;;
        -e)
            TMPFILE="/tmp/$(basename "$0").$$.tmp"
            /bin/echo "$2" > "$TMPFILE"
            ARGS=$TMPFILE
            break
            ;;
        *)
            ARGS="$ARGS $1"
            ;;
        esac
        shift
    done

    if [ "$CLASSPATH" == "" ] ; then
        init_classpath
    fi

    if [ "$ARGS" != "" ]; then 
        ARGS=`cygpath --windows $ARGS`
    fi
 
    java -cp "$CLASSPATH" clojure.main $ARGS
    if [ "$TMPFILE" != "" ] ; then
        rm "$TMPFILE"
    fi
fi
start-swank.clj is simply:
(require 'swank.swank)
(swank.swank/start-server "nul" :encoding "utf-8" :port 4005)
Update 1: Updated script to include the src/main/clojure and src/test/clojure directories on the classpath.