[web-devel] Join support in persistent

Michael Snoyman michael at snoyman.com
Sun Apr 3 20:42:53 CEST 2011


On Sun, Apr 3, 2011 at 10:08 AM, Greg Weber <greg at gregweber.info> wrote:
>
>
> On Sat, Apr 2, 2011 at 11:50 PM, Michael Snoyman <michael at snoyman.com>
> wrote:
>>
>>
>> On Sun, Apr 3, 2011 at 9:40 AM, Greg Weber <greg at gregweber.info> wrote:
>>>
>>> [snip]
>>>>
>>>> I think you're solving a different problem. Are you talking about the
>>>> fact that the EntryAuthorIn constructor takes a list instead of a Set?
>>>> That's not where the slowdown comes from. Actually, for the current
>>>> backends, a set would needlessly slow things down, since the In constructor
>>>> simply converts things to SQL and lets the database do the work.
>>>> I'm not sure what you're suggesting here to be honest, can you clarify?
>>>
>>> An O(m + n) implementation instead of O(m * n) by using constant lookups
>>> instead of repeatedly searching through a list.
>>
>> But where will you use the Set? The slowdown is because we end up with two
>> lists: [(Key one, one)] and [(Key many, many)]. For each (Key one, one), we
>> need to filter the entire [(Key many, many)] to find the records which match
>> each one. As far as the code is concerned, doing the initial filter to the
>> relevant (Key many) values is constant*.
>
>  We end up with 2 lists because we are building 2 lists. We should build
> data structures that allow for an efficient implementation of this function.
> Greg
>
>  [snip]

Ahh... I see what you mean. I just pushed a commit that uses a Map to
hopefully improve efficiency. I think we end up with some kind of
crazy formula like O(m + m * log n)... which is of course just O(m *
log n), but whatever it is, definitely seems better than O(m * n).
Thanks for pointing it out.

I also implemented the "outer join" feature. I think the next step is
to figure out a standard way to clean up the interface so we don't
have 15 parameters floating around. This seems like a general question
in Haskell: how do you encode named parameters + default values. We
can do it with a different data type for each function, or even a
single type class + multiple data types, but I'm wondering if someone
out there can point to some prior art that does a good job.

Michael



More information about the web-devel mailing list