[Haskell-beginners] Question on Lazy Evaluation, Maps & Funktions

Leonhard Applis Leonhard.Applis at protonmail.com
Tue Dec 17 10:31:29 UTC 2019


Hi there, 

I currently have a task at hand which requires to find a n-tuple of sentences s, where the biggest distance of sentences is searched(So the most dis-similar sentences shall be chosen). 

Especially with growing n, this problem is causing me trouble, as in a naive implementation the number I need to calculate the distances grows exponentially by n and linear by s.
The distance function itself is drop-in distance :: Text -> Text -> Double 
For the questions the distance can be expected to behave in constant time, and the list of n-tuples to be deepseq ready. 

Given my real world problems, this is giving me a hard time waiting. 

Coming from a more imperative environment, one approach I'd take would be to calculate a dictionary of predefined distances, so instead of calculating the distance new I'd only have to lookup the distance.
This would - in my understanding - mean that I create the dictionary which takes s * (s-1) steps to calculate the full dictionary, and then for s^n only dictionary lookups. 

But from the little I remember "Real World Haskell", the functions and variables are only evaluated once and hang around until the garbage collector comes around. 
Question 1.)is this right, and does this mean, that if there is no GC in between, the function distance(a,b) is only calculated once, and therefore the dictionary approach is mostly useless?

Question 2.) 
The function distance is symmetric, is there a way to help my Haskell (and the evaluation) to understand that it only needs to be run once? 
Iff not, does it make sense to iterate the over the tuples first and "order" them, to help distances only occur in a certain order? Is there a way to create unique, ordered n-tuples in the first place?

Additionally, I am grateful for any kind of help regarding performance-tricks to this case. 
I know that this is a typical case for multi threading, but I first want to try and find "the best" base-algorithm. Also I think it's a nice question :) 

Thank you very much
Leonhard
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20191217/812914f9/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: publickey - Leonhard.Applis at protonmail.com - 0x807FDDF3.asc
Type: application/pgp-keys
Size: 1843 bytes
Desc: not available
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20191217/812914f9/attachment.key>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 477 bytes
Desc: OpenPGP digital signature
URL: <http://mail.haskell.org/pipermail/beginners/attachments/20191217/812914f9/attachment.sig>


More information about the Beginners mailing list