[Haskell-cafe] Interpreting profiling results

Patrick LeBoutillier patrick.leboutillier at gmail.com
Tue Jan 5 19:31:34 EST 2010


On Mon, Jan 4, 2010 at 10:05 AM, Daniel Fischer
<daniel.is.fischer at web.de> wrote:
> Am Montag 04 Januar 2010 02:17:06 schrieb Patrick LeBoutillier:
>
>> Hi,
>
>>
>
>> This question didn't get any replies on the beginners list, I thought
>
>> I'd try it here...
>
> Sorry, been occupied with other things. I already took a look, but hadn't
> anything conclusive enough to reply yet.

No sweat... I didn't mean to be pushy :)

Thanks a lot for all the pointers, they have speeded up my code a lot.

Patrick


>
>>
>
>> I've written (and improved using other solutions I've found on the
>
>> net) a simple sudoku solver which I'm trying to profile. Here is the
>
>> code:
>
>>
>
>>
>
>> import Array
>
> Better
>
> import Data.Array.Unboxed
>
> *much* faster
>
>> import List (transpose, nub, (\\))
>
>> import Data.List
>
>>
>
>> data Sudoku = Sudoku { unit :: Int, cells :: Array (Int, Int) Int,
>
> cells :: UArray (Int,Int) Int
>
>> holes :: [(Int, Int)] }
>
>>
>
>> cell :: Sudoku -> (Int, Int) -> Int
>
>> cell s i = (cells s) ! i
>
>>
>
>> instance Read Sudoku where
>
>>  readsPrec _ s = [(Sudoku unit cells holes, "")]
>
>>    where unit = length . words . head . lines $ s
>
>>          cells = listArray ((1, 1), (unit, unit)) (map read . words $ s)
>
>>          holes = [ c | c <- indices cells, (cells ! c) == 0]
>
>>
>
>> instance Show Sudoku where
>
>>  show s = unlines [unwords [show $ cell s (x,y) | x <- [1 .. unit s]]
>
>>
>
>> | y <- [1 .. unit s]]
>
>>
>
>> genNums :: Sudoku -> (Int, Int) -> [Int]
>
>> genNums s c@(i,j) = ([1 .. u] \\) . nub $ used
>
>>  where
>
> nub isn't nice. It's quadratic in the length of the list. Use e.g.
>
> map head . group . sort
>
> or
>
> Data.[Int]Set.toList . Data.[Int]Set.fromList
>
> if the type is in Ord (and you don't need the distinct elements in the order
> they come in). That gives an O(n*log n) nub with a sorted result.
>
> And (\\) isn't particularly fast either (O(m*n), where m and n are the
> lengths of the lists). If you use one of the above instead of nub, you can
> use the O(min m n) 'minus' for sorted lists:
>
> xxs@(x:xs) `minus` yys@(y:ys)
>
> | x < y = x : xs `minus` yys
>
> | x == y = xs `minus` ys
>
> | otherwise = xxs `minus` ys
>
> xs `minus` _ = xs
>
> Here, you can do better:
>
> genNums s c@(i,j) = nums
>
> where
>
> nums = [n | n <- [1 .. u], arr!n]
>
> arr :: [U]Array Int Bool
>
> arr = accumArray (\_ _ -> False) True (0,u) used
>
>>    used = (row s u i j) ++ (col s u i j) ++ (square s sq u i j)
>
>>    u = unit s
>
> Not good to calculate sq here. You'll use it many times, calculate once and
> store it in s.
>
>>    sq = truncate . sqrt . fromIntegral $ u
>
>>
>
>> row s u i j = [cell s (i, y) | y <- [1 .. u]]
>
>>
>
>> col s u i j = [cell s (x, j) | x <- [1 .. u]]
>
>>
>
>> square s sq u i j = [cell s (x, y) | y <- [1 .. u], x <- [1 .. u], f x i,
>> f
>
>> y j] where f a b = div (a-1) sq == div (b-1) sq
>
> Test for f y j before you generate x to skip early.
>
> square s sq u i j = [cell s (ni+x,nj+y) | x <- [1 .. sq], y <- [1 .. sq]]
>
> where
>
> qi = (i-1) `div` sq
>
> qj = (j-1) `div` sq
>
> ni = qi*sq
>
> nj = qj*sq
>
>>
>
>> solve :: Sudoku -> [Sudoku]
>
>> solve s =
>
>>  case holes s of
>
>>    [] -> [s]
>
>>    (h:hs) -> do
>
>>      n <- genNums s h
>
>>      let s' = Sudoku (unit s) ((cells s) // [(h, n)]) hs
>
>>      solve s'
>
>>
>
>> main = print . head . solve . read =<< getContents
>
>>
>
>>
>
>> When I compile as such:
>
>>
>
>> $ ghc -O2 --make Sudoku.hs -prof -auto-all -caf-all -fforce-recomp
>
>>
>
>> and run it on the following puzzle:
>
>>
>
>> 0 2 3 4
>
>> 3 4 1 0
>
>> 2 1 4 0
>
>> 0 3 2 1
>
>>
>
>> I get the following profiling report:
>
>>
>
>>        Fri Jan  1 10:34 2010 Time and Allocation Profiling Report  (Final)
>
>>
>
>>           Sudoku +RTS -p -RTS
>
>>
>
>>        total time  =        0.00 secs   (0 ticks @ 20 ms)
>
> That means the report is basically useless. Not entirely, because the
> allocation figures may already contain useful information. Run on a 9x9
> puzzle (a not too hard one, but not trivial either).
>
> Also, run the profiling with -P instead of -p, you'll get more info about
> time and allocation then.
>
>>        total alloc =     165,728 bytes  (excludes profiling overheads)
>
>>
>
>> COST CENTRE                    MODULE               %time %alloc
>
>>
>
>> CAF                            GHC.Handle             0.0   10.7
>
>> CAF                            Text.Read.Lex          0.0    2.1
>
>> CAF                            GHC.Read               0.0    1.2
>
>> square                         Main                   0.0    2.8
>
>> solve                          Main                   0.0    1.3
>
>> show_aVx                       Main                   0.0    3.7
>
>> readsPrec_aYF                  Main                   0.0   60.6
>
>> main                           Main                   0.0    9.6
>
>> genNums                        Main                   0.0    5.0
>
>> cell                           Main                   0.0    1.2
>
>>
>
>>
>
>>
>
>>                        individual    inherited
>
>> COST CENTRE              MODULE
>
>>       no.    entries  %time %alloc   %time %alloc
>
>>
>
>> MAIN                     MAIN
>
>>         1           0   0.0    0.3     0.0  100.0
>
>>  main                    Main
>
>>       186           1   0.0    9.6     0.0   85.6
>
>>  show_aVx               Main
>
>>       196           2   0.0    3.7     0.0    3.7
>
>>   cell                  Main
>
>>       197          16   0.0    0.0     0.0    0.0
>
>>  solve                  Main
>
>>       188           5   0.0    1.3     0.0   11.8
>
>>   genNums               Main
>
>>       189           8   0.0    5.0     0.0   10.4
>
>>    square               Main
>
>>       194          88   0.0    2.8     0.0    3.2
>
>>     cell                Main
>
>>       195          16   0.0    0.4     0.0    0.4
>
>>    col                  Main
>
>>       192           4   0.0    0.7     0.0    1.1
>
>>     cell                Main
>
>>       193          16   0.0    0.4     0.0    0.4
>
>>    row                  Main
>
>>       190           4   0.0    0.7     0.0    1.1
>
>>     cell                Main
>
>>       191          16   0.0    0.4     0.0    0.4
>
>>  readsPrec_aYF          Main
>
>>       187           3   0.0   60.6     0.0   60.6
>
>>  CAF                     GHC.Read
>
>>       151           1   0.0    1.2     0.0    1.2
>
>>  CAF                     Text.Read.Lex
>
>>       144           8   0.0    2.1     0.0    2.1
>
>>  CAF                     GHC.Handle
>
>>       128           4   0.0   10.7     0.0   10.7
>
>>  CAF                     GHC.Conc
>
>>       127           1   0.0    0.0     0.0    0.0
>
>>
>
>> Does the column 'entries' represent the number of times the function
>
>> was called?
>
> Number of times it was 'entered', not quite the same as the number of times
> it was called.
>
> I think (Warning: speculation ahead, I don't *know* how the profiles are
> generated) it's thus:
>
> Say you call a function returning a list. One call, first entry. It finds
> the beginning of the list, the first k elements and hands them to the
> caller. Caller processes these, asks "can I have more, or was that it?".
> Same call, second entry: f looks for more, finds the next m elements, hands
> them to caller. Caller processes. Repeat until whatever happens first,
> caller doesn't ask whether there's more or callee finds there's nothing more
> (or hits bottom).
>
>> If so, I don't understand how the 'square' function could
>
>> be called 88 times when it's caller is only called 8 times. Same thing
>
>> with 'genNums' (called 8 times, and solve called 5 times)
>
>>
>
>> What am I missing here?
>
>>
>
>> Patrick



-- 
=====================
Patrick LeBoutillier
Rosemère, Québec, Canada


More information about the Haskell-Cafe mailing list