[web-devel] Join support in persistent

Sun Apr 3 08:50:44 CEST 2011

On Sun, Apr 3, 2011 at 9:40 AM, Greg Weber <greg at gregweber.info> wrote:

> [snip]
>
>  I think you're solving a different problem. Are you talking about the
>> fact that the EntryAuthorIn constructor takes a list instead of a Set?
>> That's not where the slowdown comes from. Actually, for the current
>> backends, a set would needlessly slow things down, since the In constructor
>> simply converts things to SQL and lets the database do the work.
>>
>> I'm not sure what you're suggesting here to be honest, can you clarify?
>>
>
> An O(m + n) implementation instead of O(m * n) by using constant lookups
> instead of repeatedly searching through a list.
>
> But where will you use the Set? The slowdown is because we end up with two
lists: [(Key one, one)] and [(Key many, many)]. For each (Key one, one), we
need to filter the entire [(Key many, many)] to find the records which match
each one. As far as the code is concerned, doing the initial filter to the
relevant (Key many) values is constant*.

* Well, it's "constant" in the sense that the database backend handles it.
In the case of SQL, it's converted to a "Entry"."author" IN [1,2,3]. How the
database handles that code is outside the scope of this optimization.

>
> [snip]
>
>
>>> It would be best for it to also be clear from the function names or
>>> arguments..
>>>
>>> I actually thought it *was* clear that it would be an inner join and not
>> an outer join. But how would you change the names? I don't want to end up
>> with selectJoiningOneToManyRelationshipUsingInnerJoin ;)
>>
>
> How do I know it is an inner join? If there is just one function I would
> assume it is an outer and I can get an inner by filtering the results, not
> that there is no way to do an outer. If there is a second function that is
> called *Outer, then it might be clear that the other is an inner. The other
> approach is instead of EntryAuthorEq to have something like InnerJoin (if
> possible) or otherwise EntryAuthorInner.
>
> [snip]
>
>
>>
>>>> I agree, I am not trying to say that we need to elegantly handle every
>>> possible query. I am just pushing that for those that we are currently
>>> handling to be elegant. Persistent integration with directly writing SQL
>>> should probably be a high priority.
>>>
>>>
>> Can I get some feedback on what's missing for this? In the
>> Database.Persist.GenericSql.Raw module[1], there are two functions (withStmt
>> and execute) that let you run any SQL command against the DB you want. I've
>> used this myself when I needed to do something that Persistent didn't allow
>> (a full text search in my case).
>>
>> I know that the functions are neither pretty nor well documented, but
>> what's missing that is preventing people from dropping down to SQL now? If
>> it's just a documentation issue, I'll address it.
>>
>> Michael
>>
>>
> Yes we need docs. Is it possible after a raw query to build the data
> structures back up? -  achieve the same return result of selectList? The
> other aspect of integration would be the ability to mix SQL fragments with a
> normal persistent query. For example, for the sql join function under
> discussion the user could supply the raw join SQL to make it an outer join.
>

I think by exposing (and documenting) some of the functions that Persistent
uses internally, we can get pretty far here. For example, for marshaling,
the most useful function is likely fromPersistValues[1], which anyone can
use right now. However, the user needs to be careful to pass in the right
set of fields.

Michael

[1]
http://hackage.haskell.org/packages/archive/persistent/0.4.2/doc/html/Database-Persist-Base.html#v:fromPersistValues
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/web-devel/attachments/20110403/1a1a0256/attachment.htm>