# [Haskell-cafe] Code Review: Sudoku solver

Sun Apr 9 06:18:23 EDT 2006

```I just ran a simple metric for the dancing-links solver.

The only real metric I could use was the number of "coverOthers" calls which
counts the number of selections made (there is no distinction between certainty
and guessing).

So the minimum # of selections is 81 (of which 17 are the free hints, and 64
were real moves).  Each selection beyond 81 involved backtracking.

Of the 36628 puzzles, there were 21395 puzzles that were solved with 81
selections, and therefore no backtracking.  So 58% of them were (some with luck)
solved instantly.  This low percentage is why this is not a by logic solver.

puzzles      selections each  selections  excess/above 81
21395 minimal  81 (17+64*1)     1732995        0
9841 up to   145 (17+64*2)     1043838   246717 (== 1043838 - 9841*81)
3192 up to   273 (17+64*4)      611275   352723
1215 up to   529 (17+64*8)      453302   354887
576 up to  1041 (17+64*16)     415496   368840
248 up to  2065 (17+64*32)     354028   333940
105 up to  4113 (17+64*64)     300909   292404
42 up to  8209 (17+64*128)    248645   245243
10 up to 16401 (17+64*256)    120724   119914
3 up to 32785 (17+64*512)     62319    62076
1 used  32875                 32875    32794 (puzzle # 10995)
-----                           -------  -------
36628                           5376406  2409538

I think the important thing about the list is that the work in each group is
double the one before it, but the count of puzzles is always less than half the
one before it.  Thus the total selections decreases.  So the hard puzzles are
time consuming but not catastrophic.

Anyway, 94% of the puzzles were solved with no more than 17+64*4 selections.

And only 50.7% of the selections beyond the 17 hints required backtracking.
(Computed from 2409538 / ( 5376406 - 36628 * 17 ) ).  That's an average of 65.8
bad selections per puzzle that requires 64 good selections to solve.  This seems
like an excellent benchmark for comparing brute force methods.

A final point about dancing links is that this is a generic data structure and
algorithm for solving any "binary cover" problem.   (Such as arranging
pentaminos).  It is not specialized to knowing anything about Sudoku.  The
initialization tells it "there are 324 constraints" and "this is a comprehensive
set of moves, limited only by the initial hints".  And it runs only 50% slower
than bit-twiddling specialized logic.

The funny thing is that Sudoku is too small for this method, by which I mean
that the constant cost of initializing the data structure that is used can
become comparable to running the algorithm.  I had to profile my code while
rewriting it and optimize the initialization code to have a much smaller
overhead than Knuth's code.  Most non-Sudoku problems that one needs to brute
force with this will run much longer than any setup costs.

Finally, here were the worst 30 puzzles:

selections puzzle#
32875   10995

23577   412
20707   27267
18035   26632

14350   21765
14247   33677
13966   10992
13660   7250
13453   28608
12243   19106
12181   10993
9300    18177
8686    24590
8638    12445

8161    19387
8157    7921
7916    35782
7900    33131
7850    36162
7663    21951
7427    31110
7321    16608
7268    33554
7200    1259
7108    9750
6805    28607
6781    21919
6716    14969
6432    26234
6361    15697

I don't notice any overlap with the list of hard puzzles to solve with the other
programs.

--
Chris
```