[Haskell-cafe] Code Review: Sudoku solver

Sat Apr 8 19:21:42 EDT 2006

Am Samstag, 8. April 2006 02:20 schrieb Daniel Fischer:
> Am Freitag, 7. April 2006 17:33 schrieben Sie:
> > > Just out of curiosity, speed was not the objective when I wrote my
> > > solver, I wanted to avoid guesswork (as much as possible), but in
> > > comparison with Cale Gibbard's and Alson Kemp's solvers (which are much
> > > more beautifully coded), it turned out that mine is blazingly fast, so
> > > are there faster solvers around (in Haskell, in other languages)?
> >
> > if I modify your solver to produce similar output to mine (input/first
> > propagation, solved puzzle, number and list of guesses made), your's
> > takes about a third of the time of mine (solving 36628 17hint puzzles
> > in 6m vs 17m, 2GHz Pentium M), and I wasn't exactly unhappy with
> > my solver before I did this comparison!-)
>
> Mine's even faster now (at least on my computer, would you care to test it
> on your's? If you don't want to get EnumSet, just change DiffArray to
> Array, worked wonders for me), I'll dig into yours tomorrow to see what I
> can get out of it to improve my algorithm.

Unforunately, no new inference rules :-(
Two things I don't like about your code:
1. no type declarations
2. too much name shadowing, that makes following the code difficult

apart from that: clever
>
> > like you, I've been trying to remove guesses, and the speed came as a
> > welcome bonus (I'm still using lists all over the place, with lots of not
> > nice adhoc code still remaining; not all propagators are iterated fully
>
> lists and adhoc code tend to be performance killers, I doubled the speed of
> mine by de-adhoccing the code (and that although I introduced the
> speed-killer DiffArray)

I believe if you change the representation of puzzles from [(pos,range)]
to an Array, you'll get a significant speedup
>
> > yet because I only recently removed a logic bug that slowed down the
> > search instead of speading it up; ..). so the more interesting bit is
> > that our solvers disagree on which are the most difficult puzzles
> > (requiring the largest number of guesses):
> >
> > df
> > puzzles involving guesses: 5319
>
> If that's not a typo, I'm baffled. My original needed to guess in 5309

Rot! Typo in _my_ previous message, 5319 is correct.

> puzzles, and I can't imagine what inference I could have dropped when
> cleaning up the code.
>
> > largest number of guesses:
> >     10 (#36084), 11 (#22495)
> >
> > cr
> > puzzles involving guesses: 8165
> > largest number of guesses:
> >     10 (#9175), 10 (#17200), 10 (#29823), 10 (#30811)
> >
> > df's solver needs 0/0/3/0 for cr's trouble spots, while cr's solver needs
> > 5/9 guesses for df's. lots of potential for interesting investigations,

We use different guessing strategies (plus, I also have group-inference).
But the given number of guesses is the number of guesses in the successful 
branch, and I think a better measure for the nefariousness of a puzzle is the 
accumulated number of guesses in all branches or the number of branches 
(still better is the time needed to solve the puzzle).

This is the list of the 30 puzzles with the most branches:
puzzle  #branches
  3992   213
  7475   120
 12235   117
  5341    69
 11815    60
  9402    60
 11544    59
  9184    54
 10403    50
 31110    48
  8575    48
  1489    45
  2732    40
 11523    39
  6730    39
 10929    38
   960    35
 19474    32
  6412    31
  1599    30
 36084    29
 21832    29
 22495    28
  4657    28
 34747    27
 10404    27
 29931    26
   942    25
   563    24

the top 30 in CPUTime (in milliseconds, cpuTimePrecision = 10^10)
  3992  6480
  9184  1520
 31110  1470
 10403  1310
 12235  1260
  7475  1130
  2732  1080
   960  1050
  5341   990
 11544   960
 11815   930
  1395   730
 10929   710
  1863   710
  1330   700
 20807   630
  4181   610
 10634   570
 34401   550
   959   550
 34747   520
  1599   520
 14912   510
 29282   500
  7983   500
 29273   480
 23958   470
  2245   460
  2232   440
 36425   430

so puzzle 3992 is outstandingly bad in both respects (I fed it into my old 
step by step solver and boy, failure is detected _very_ late in practically 
all branches) and from a couple of tests I have the vague impression that the 
correlation between the number of guesses in the successful branch and time 
is not very strong (3992 has 6, 9184 and 2732 only 3, 31110 has 5, 10403 8, 
12235 9, 7475 6 and 960 7), but I don't think I'll do a statistical analysis, 
I'll stick to time as the measure.

Here's the meanest puzzle:

0 0 0 0 4 0 0 5 9
3 0 0 2 0 0 0 0 0
1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 7 0 0
0 4 6 0 0 0 0 0 0
9 5 0 0 0 0 0 0 0
0 0 0 0 5 6 0 4 0
0 0 0 8 0 0 3 0 0
0 0 0 0 0 0 0 0 0

and that's so mean that David Place's incrsud beats my solver on this by a 
factor of 5.5!

> > though mostly for me!-)
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^
> I'm not sure about that :-)
>
> > cheers,
> > claus
>
Cheers again,
Daniel

-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
	-- Blair P. Houghton