[Haskell-cafe] Announce: hlcm 0.2.2 - Parallel closed frequent
Alexandre.Termier at imag.fr
Wed Jun 16 09:11:40 EDT 2010
I'm pleased to announce the release of hlcm on Hackage:
hlcm is a data mining tool for computing closed frequent itemsets.
This problem is well known as "market basket analysis":
- given a list of transactions:
- and a minimal frequency threshold in [1..4], let's say 2: we want
items that are sold together in at least 2 transactions
hlcm will tell you that ["bread","butter","chocolate"] appears in 3
transactions and that ["bread","butter"] appears in 4 transactions.
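To make the expected result concrete, here is a naive brute-force sketch in Haskell. It is not the LCM algorithm and it does not use hlcm's API: it simply enumerates every subset of items, so it is only usable on toy inputs. The transaction list exampleDb is made up for illustration (it is merely consistent with the counts quoted above), and all names in the block are hypothetical. An itemset is kept when it is frequent (contained in at least minSupport transactions) and closed (equal to the intersection of all transactions that contain it).

```haskell
import Data.List (intersect, nub, sort, subsequences)

type Item = String
type Transaction = [Item]

-- Hypothetical transactions, invented for this sketch.
exampleDb :: [Transaction]
exampleDb =
  [ ["bread", "butter", "chocolate"]
  , ["bread", "butter", "chocolate", "milk"]
  , ["bread", "butter", "chocolate", "jam"]
  , ["bread", "butter"]
  ]

-- Transactions that contain every item of the given itemset.
covering :: [Item] -> [Transaction] -> [Transaction]
covering itemset = filter (\t -> all (`elem` t) itemset)

-- Closure of an itemset: the items shared by all transactions covering it,
-- returned sorted and without duplicates.
closure :: [Item] -> [Transaction] -> [Item]
closure itemset db =
  case covering itemset db of
    [] -> itemset
    ts -> sort (nub (foldr1 intersect ts))

-- All non-empty closed itemsets with support >= minSupport, paired with
-- their support. Exponential in the number of distinct items.
closedFrequent :: Int -> [Transaction] -> [([Item], Int)]
closedFrequent minSupport db =
  [ (itemset, support)
  | itemset <- subsequences universe
  , not (null itemset)
  , let support = length (covering itemset db)
  , support >= minSupport
  , closure itemset db == itemset
  ]
  where
    universe = sort (nub (concat db))
```

Loaded in GHCi, closedFrequent 2 exampleDb evaluates to [(["bread","butter"],4),(["bread","butter","chocolate"],3)]. Note that ["bread"] alone is frequent but not closed, because every transaction containing bread also contains butter.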
You can find many more entertaining applications with your own data, for
example log analysis, mining words in web pages, etc.
You can see details on getting started with the program here:
The library documentation is here:
hlcm is based on the most efficient algorithm for closed frequent
itemset mining, LCM, which is much, much faster than the well-known
Apriori algorithm (more details when following the pointers from the
hlcm can also exploit parallelism through Strategies, with promising
speedups. We still have more work to do in order to beat existing C/C++
implementations, but you can have a look at the paper that we submitted
to the Haskell Symposium this year for a detailed experimental study:
Don't miss the section about the influence of RTS parameters on
Feel free to send me an e-mail if you have any questions about hlcm.
LIG (Laboratoire d'Informatique de Grenoble)
Université Joseph Fourier
681 rue de la Passerelle
B.P. 72, 38402 Saint Martin d'Hères (FRANCE)
Phone: +33 4 76 82 72 07
Fax: +33 4 76 82 72 87