[Haskell-cafe] Announce: hlcm 0.2.2 - Parallel closed frequent
itemsets mining
Alexandre Termier
Alexandre.Termier at imag.fr
Wed Jun 16 09:11:40 EDT 2010
Dear all,
I'm pleased to announce the release of hlcm on Hackage:
http://hackage.haskell.org/package/hlcm-0.2.2
hlcm is a data mining tool for computing closed frequent itemsets.
This problem is well known as "market basket analysis":
- given a list of transactions:
[["bread", "butter","chocolate","tomato"]
,["bread","butter"]
,["bread","pencil","butter","chocolate"]
,["bread","butter","book"]]
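- and a minimal frequency threshold in [1..4], let's say 2: we want
sets of items that are sold together in at least 2 transactions
hlcm will tell you that ["bread","butter","chocolate"] appears in 2
transactions and that ["bread","butter"] appears in 4 transactions.
Smaller sets such as ["bread"] are not listed separately: they occur in
exactly the same transactions as a larger set, so they are not closed.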
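To make the example concrete, here is a minimal brute-force sketch of
what "closed frequent itemset" means on the transactions above. It is
not the LCM algorithm and it does not use the hlcm API: the names
support, closure, closedFrequent and exampleDb are hypothetical and
chosen only for illustration. It enumerates every subset of items, so
it is only usable on toy data.

```haskell
import Data.List (intersect, nub, sort, subsequences)

type Item = String
type Transaction = [Item]

-- The four transactions from the example (hypothetical name).
exampleDb :: [Transaction]
exampleDb =
  [ ["bread", "butter", "chocolate", "tomato"]
  , ["bread", "butter"]
  , ["bread", "pencil", "butter", "chocolate"]
  , ["bread", "butter", "book"]
  ]

-- Transactions that contain every item of the given itemset.
covering :: [Transaction] -> [Item] -> [Transaction]
covering db itemset = [t | t <- db, all (`elem` t) itemset]

-- Support: how many transactions contain the itemset.
support :: [Transaction] -> [Item] -> Int
support db = length . covering db

-- Closure: the items shared by all transactions containing the itemset.
-- An itemset is closed when it is equal to its own closure.
closure :: [Transaction] -> [Item] -> [Item]
closure db itemset =
  case covering db itemset of
    [] -> sort (nub itemset)            -- unsupported: nothing to intersect
    ts -> sort (nub (foldr1 intersect ts))

-- Brute force: take every non-empty frequent itemset, replace it by its
-- closure, and drop duplicates. Assumes minSup >= 1.
closedFrequent :: Int -> [Transaction] -> [([Item], Int)]
closedFrequent minSup db =
  sort . nub $
    [ (closure db s, support db s)
    | s <- subsequences (nub (concat db))
    , not (null s)
    , support db s >= minSup
    ]
```

Evaluating closedFrequent 2 exampleDb in GHCi gives
[(["bread","butter"],4),(["bread","butter","chocolate"],2)].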
You can find many more fun applications with your own data, for example
log analysis, mining words in web pages, etc.
You can see details on getting started with the program here:
http://membres-liglab.imag.fr/termier/HLCM/hlcm/hlcm/Main.html
The library documentation is here:
http://membres-liglab.imag.fr/termier/HLCM/hlcm/HLCM.html
hlcm is based on the most efficient algorithm for closed frequent
itemset mining, LCM, which is much, much faster than the well-known
Apriori algorithm (more details when following the pointers from the
homepage: http://membres-liglab.imag.fr/termier/HLCM/hlcm.html).
hlcm can also exploit parallelism through Strategies, with promising
speedups. We still have more work to do in order to beat existing C/C++
implementations, but you can have a look at the paper that we submitted
to the Haskell Symposium this year for a detailed experimental study:
http://membres-liglab.imag.fr/termier/HLCM/hlcm.pdf
Don't miss the section about the influence of RTS parameters on
parallel performance.
Feel free to send me an e-mail if you have any questions about hlcm.
Alexandre
--
_____________________________________________________________
Alexandre Termier
LIG (Laboratoire d'Informatique de Grenoble)
Université Joseph Fourier
681 rue de la Passerelle
B.P. 72, 38402 Saint Martin d'Hères (FRANCE)
Phone: +33 4 76 82 72 07
Fax: +33 4 76 82 72 87
http://membres-liglab.imag.fr/termier/
_____________________________________________________________