[Haskell-cafe] Automated Differentiation of Matrices (hmatrix)

Wed Apr 10 19:32:30 CEST 2013

The problem is both Repa and Hmatrix, (and most of the Vector types) want
to know something about the data type we're storing in their shapes, so
they can smash it flat and unbox it, but for most AD modes that value isn't
actually something you can smash flat like that.

newtype Tower a = Tower [a]

data Dual a = Dual a a

data Tape a t

    = Zero
    | Lift !a
    | Var !a {-# UNPACK #-} !Int
    | Binary !a a a t t
    | Unary !a a t
    deriving (Show, Data, Typeable)

newtype Kahn a s = Kahn (Tape a (Kahn a s))

etc.

Of those only Dual -- the simplest one-derivative Forward mode -- can be
made to fit in a container that wants to unbox it.

(Technically the new Reverse mode can also be made to work as it is a
bounded number of numbers and Ints indexing into a tape)

With ad 3.4 you can't even choose to unbox that one because we pun between
the notion of the infinitesimal we're protecting you from confusing and the
AD Mode we're using to prevent you from using particulars about any given
AD mode. It simplified the signature, but in retrospect made it harder to
extend AD with more primitive operations with non-trivial known
derivatives, and to take advantage of knowledge about storage like we need
here.

With ad 4.0 you can at least write the appropriate Element-like instances
for a mode like Dual to put it in those containers.

In the more distant future I'd like to release a version where instead we
can work with well behaved containers by wrapping them through a future
version of linear or another more BLAS-like binding possibly based on the
work I have going into github.com/analytics.

In that scenario instead of having the vector of AD values, we'd have an
AD'd vector. Analogous to switching from the matrix of complex values vs. a
matrix + an imaginary matrix scenario I described earlier, where here we're
using dual numbers. Interestingly if you think about it the Unboxed Vector
instances already do that split for us, but only if we can split things up
equally. I may borrow ideas from the wavelet tree machinery I'm building
for analytics to improve on that eventually though.

-Edward

On Wed, Apr 10, 2013 at 12:39 PM, Dominic Steinitz <dominic at steinitz.org>wrote:

> Hi Edward,
>
> I see now that the issues are deeper than performance.
>
> I took another package that supports matrix operations: repa.
>
> data MyMatrix a = MyMatrix
>   {
>     myRows :: Int
>   , myCols :: Int
>   , myElts :: [a]
>   } deriving (Show, Functor, Foldable, Traversable)
>
> f (MyMatrix r c es) = sum es
>
> g (MyMatrix r c es) = head $ toList $ sumS $ sumS n
>   where
>     n   = fromListUnboxed (Z :. r :. c) es
>
>
> I can take the grad of f but not of g even though they are the same
> function:
>
> *Main> :t grad f
> grad f :: Num a => MyMatrix a -> MyMatrix a
> *Main> :t grad g
>
> <interactive>:1:6:
>     Could not deduce (repa-3.2.3.1:Data.Array.Repa.Eval.Elt.Elt
>                         (ad-3.4:Numeric.AD.Internal.Types.AD s a))
>       arising from a use of `g'
>     from the context (Num a)
>       bound by the inferred type of
>                it :: Num a => MyMatrix a -> MyMatrix a
>       at Top level
>     or from (Numeric.AD.Internal.Classes.Mode s)
>       bound by a type expected by the context:
>                  Numeric.AD.Internal.Classes.Mode s =>
>                  MyMatrix (ad-3.4:Numeric.AD.Internal.Types.AD s a)
>                  -> ad-3.4:Numeric.AD.Internal.Types.AD s a
>       at <interactive>:1:1-6
>     Possible fix:
>       add an instance declaration for
>       (repa-3.2.3.1:Data.Array.Repa.Eval.Elt.Elt
>          (ad-3.4:Numeric.AD.Internal.Types.AD s a))
>     In the first argument of `grad', namely `g'
>     In the expression: grad g
>
>
> 2. By monomorphic, do you mean that I can do:
>
> g :: MyMatrix Double -> Double
> g (MyMatrix r c es) = head $ toList $ sumS $ sumS n
>   where
>     n   = fromListUnboxed (Z :. r :. c) es
>
>
> but not get a type error as I currently do:
>
> *Main> :t grad g
>
> <interactive>:1:6:
>     Couldn't match type `Double'
>                   with `ad-3.4:Numeric.AD.Internal.Types.AD s a0'
>     Expected type: MyMatrix (ad-3.4:Numeric.AD.Internal.Types.AD s a0)
>                    -> ad-3.4:Numeric.AD.Internal.Types.AD s a0
>       Actual type: MyMatrix Double -> Double
>     In the first argument of `grad', namely `g'
>     In the expression: grad g
>
>
> If so that would help as at least I could then convert between e.g. repa
> and structures that ad can play with. Of course, better would be that I
> could just apply ad to e.g. repa.
>
> Dominic.
>
> On 9 Apr 2013, at 23:03, Edward Kmett <ekmett at gmail.com> wrote:
>
> hmatrix and ad don't (currently) mix.
>
> The problem is that hmatrix uses a packed structure that can't hold any of
> the AD mode variants we have as an Element. =(
>
> I've been working with Alex Lang to explore in ad 4.0 ways that we can
> support monomorphic AD modes and still present largely the same API. We've
> seen a number of performance gains off of this refactoring already, but it
> doesn't go far enough to address what you need.
>
> A goal a bit farther out is to support AD on vector/matrix operations, but
> it is a much bigger refactoring than the one currently in the pipeline. =/
>
> To support automatic differentiation on vector-based operations in a form
> that works with standard BLAS-like storage like the packed matrix rep used
> in hmatrix we need to convert from a 'matrix of AD variables' to an 'AD
> mode over of matrices'. This is similar to the difference between a matrix
> of complex numbers and a real matrix plus an imaginary matrix.
>
> This is a long term goal, but not one you're likely to see support for out
> of 'ad' in the short term.
>
> I can't build AD on hmatrix itself due in part to licensing restrictions
> and differing underlying storage requirements, so there are a lot of little
> issues in making that latter vision a reality.
>
> -Edward
>
>
> On Tue, Apr 9, 2013 at 10:46 AM, Dominic Steinitz <dominic at steinitz.org>wrote:
>
>> Hi Cafe,
>>
>> Suppose I want to find the grad of a function then it's easy I just
>> use http://hackage.haskell.org/package/ad-3.4:
>>
>> import Numeric.AD
>> import Data.Foldable (Foldable)
>> import Data.Traversable (Traversable)
>>
>> data MyMatrix a = MyMatrix (a, a)
>>   deriving (Show, Functor, Foldable, Traversable)
>>
>> f :: Floating a => MyMatrix a -> a
>> f (MyMatrix (x, y)) = exp $ negate $ (x^2 + y^2) / 2.0
>>
>> main :: IO ()
>> main = do
>>   putStrLn $ show $ f $ MyMatrix (0.0, 0.0)
>>   putStrLn $ show $ grad f $ MyMatrix (0.0, 0.0)
>>
>> But now suppose I am doing some matrix calculations
>> http://hackage.haskell.org/package/hmatrix-0.14.1.0 and I want to find
>> the grad of a function of a matrix:
>>
>> import Numeric.AD
>> import Numeric.LinearAlgebra
>> import Data.Foldable (Foldable)
>> import Data.Traversable (Traversable)
>>
>> g :: (Element a, Floating a) => Matrix a -> a
>> g m = exp $ negate $ (x^2 + y^2) / 2.0
>>   where r = (toLists m)!!0
>>         x = r!!0
>>         y = r!!1
>>
>> main :: IO ()
>> main = do
>>   putStrLn $ show $ g $ (1 >< 2) ([0.0, 0.0] :: [Double])
>>   putStrLn $ show $ grad g $ (1 >< 2) ([0.0, 0.0] :: [Double])
>>
>> Then I am in trouble:
>>
>> /Users/dom/Dropbox/Private/Whales/MyAD.hs:24:21:
>>     No instance for (Traversable Matrix) arising from a use of `grad'
>>     Possible fix: add an instance declaration for (Traversable Matrix)
>>     In the expression: grad g
>>     In the second argument of `($)', namely
>>       `grad g $ (1 >< 2) ([0.0, 0.0] :: [Double])'
>>     In the second argument of `($)', namely
>>       `show $ grad g $ (1 >< 2) ([0.0, 0.0] :: [Double])'
>>
>> /Users/dom/Dropbox/Private/Whales/MyAD.hs:24:26:
>>     Could not deduce (Element
>>                         (ad-3.4:Numeric.AD.Internal.Types.AD<http://numeric.ad.internal.types.ad/>s Double))
>>       arising from a use of `g'
>>     from the context (Numeric.AD.Internal.Classes.Mode s)
>>       bound by a type expected by the context:
>>                  Numeric.AD.Internal.Classes.Mode s =>
>>                  Matrix (ad-3.4:Numeric.AD.Internal.Types.AD<http://numeric.ad.internal.types.ad/>s Double)
>>                  -> ad-3.4:Numeric.AD.Internal.Types.AD<http://numeric.ad.internal.types.ad/>s Double
>>       at /Users/dom/Dropbox/Private/Whales/MyAD.hs:24:21-26
>>     Possible fix:
>>       add an instance declaration for
>>       (Element (ad-3.4:Numeric.AD.Internal.Types.AD<http://numeric.ad.internal.types.ad/>s Double))
>>     In the first argument of `grad', namely `g'
>>     In the expression: grad g
>>     In the second argument of `($)', namely
>>       `grad g $ (1 >< 2) ([0.0, 0.0] :: [Double])'
>>
>> What are my options here? Clearly I can convert my matrix into a list
>> (which is traversable), find the grad and convert it back into a
>> matrix but given I am doing numerical calculations and speed is an
>> important factor, this seems undesirable.
>>
>> I think I would have the same problem with:
>>
>> http://hackage.haskell.org/package/repa
>> http://hackage.haskell.org/package/yarr-1.3.1
>>
>> although I haven'¯t checked.
>>
>> Thanks, Dominic.
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20130410/162cb1b2/attachment-0001.htm>