[Haskell-cafe] Open Kattis Problem Srednji: Hints to improve my algorithm

Thu Sep 24 09:59:18 UTC 2020

On Wed, Sep 23, 2020 at 03:57:33AM -0400, Viktor Dukhovni wrote:

> The best algorithm that comes to mind runs in linear time in the length
> of the list, and requires linear (2N) additional space.  No sorting
> (that would not be linear) or complex testing of candidates is required,
> just some counting and O(N) book-keeping.

On my machine the constant factor seems to be about 0.7 seconds for a
randomly "desorted" list of length 10 million numbers, in which choosing
the desired median to be 5 million yields 9,979,641,307 possible
combinations of sequences:

    9979641307
       2,160,167,416 bytes allocated in the heap
             106,240 bytes copied during GC
          80,053,184 bytes maximum residency (2 sample(s))
             744,512 bytes maximum slop
                  80 MiB total memory in use (0 MB lost due to fragmentation)

                                         Tot time (elapsed)  Avg pause  Max pause
      Gen  0       960 colls,     0 par    0.005s   0.005s     0.0000s    0.0001s
      Gen  1         2 colls,     0 par    0.005s   0.005s     0.0026s    0.0050s

      INIT    time    0.000s  (  0.000s elapsed)
      MUT     time    0.652s  (  0.698s elapsed)
      GC      time    0.010s  (  0.010s elapsed)
      EXIT    time    0.000s  (  0.000s elapsed)
      Total   time    0.663s  (  0.708s elapsed)

      %GC     time       0.0%  (0.0% elapsed)

      Alloc rate    3,311,613,717 bytes per MUT second

      Productivity  98.4% of total user, 98.5% of total elapsed

The tuned up algorithm uses 1*N+constant space, which for 10 million
64-bit Ints in an Unboxed Vector works out to the reported 80 MB.

Most of the CPU time (and heap allocaton) is likely spent reading and
converting the input stream of decimal integers.  The actual CPU time
spent solving the problem is likely a fraction of that cost.

The RTS stats for 100M numbers confirm the linear scaling in time and
space (this time 103,749,385,441 ways to place the median):

    103749385441
      21,701,903,208 bytes allocated in the heap
             935,496 bytes copied during GC
         800,053,240 bytes maximum residency (2 sample(s))
              67,592 bytes maximum slop
                 766 MiB total memory in use (0 MB lost due to fragmentation)

                                         Tot time (elapsed)  Avg pause  Max pause
      Gen  0      9610 colls,     0 par    0.052s   0.052s     0.0000s    0.0001s
      Gen  1         2 colls,     0 par    0.049s   0.049s     0.0247s    0.0491s

      INIT    time    0.000s  (  0.000s elapsed)
      MUT     time    6.811s  (  7.297s elapsed)
      GC      time    0.101s  (  0.102s elapsed)
      EXIT    time    0.000s  (  0.000s elapsed)
      Total   time    6.912s  (  7.399s elapsed)

      %GC     time       0.0%  (0.0% elapsed)

      Alloc rate    3,186,237,035 bytes per MUT second

      Productivity  98.5% of total user, 98.6% of total elapsed

-- 
    Viktor.