Using dependent types in the deeper functions and
requiring a Proxy to reach them meant we required
dictionary passing to get the Nats. This made the
pad and crop layers almost 1000 times slower than
they should have been.
Changes shapes to get rid of the Vector, all data is
now held in contiguous memory.
Add fast c implementations for pooling layers.
Now does mnist on my laptop in 12 minutes.
I had a basic monadic interface on, thinking that it might be nice
to use for dropout and the like.
In retrospect I think that was too heavy. Being a purely functional
heterogeneous list, substituting layers is easy, so it one wants to
do that using MonadRandom it can still be done.