[Haskell-cafe] looking for a good algorithm
caseyh at istar.ca
Thu Nov 12 14:42:18 EST 2009
To: Casey Hawthorne <caseyh at istar.ca>
Subject: Re: [Haskell-cafe] looking for a good algorithm
From: Casey Hawthorne <caseyh at istar.ca>
Date: Thu, 12 Nov 2009 11:14:02 -0800
On third thought, convert the table to a 2D array of bits (or a 1D
array of bits mapped to a 2D coordinate system).
The bit is true for either an Int or Double.
If space is a concern and one can tolerate a low rate of false
positives, than a 2D Bloom Filter might be appropriate, or a 1D Bloom
Filter with mapping to 2D coordinates.
Then on to simmulated annealing, possibly.
Note: Donald Knuth's new fascicle is out:
Volume 4 Fascicle 1, Bitwise Tricks & Techniques; Binary Decision
Diagrams (2009), xiii+261pp. ISBN 0-321-58050-8
On Wed, 11 Nov 2009 15:32:35 -0800, you wrote:
>So, as I understand it, you have a very large sparse table, thousands
>of rows and hundreds of columns, of which each cell within a column of
>type String, Int, or Double can contain one of those types or nothing.
>Then you to want to shuffle the rows to maximize the number of columns
>whose first 100 rows have at least one number (Int or Double), given a
>list of preferred column names since there is no guarantee that every
>number column will have at least one number in its first 100 rows
>I'm wondering about hashing on the rows and hashing on the columns,
>then the column hash has the number of Int's or Double's (don't need
>the String's) in that column and the rows they are in.
>The row hash would have the number of Int's and Double's in that row
>and what column's they are in.
>Then scan the row hash and sort into descending order, and by tagging
>those rows, not by actually moving them.
>Then I think your ready for simmulated annealing.
More information about the Haskell-Cafe