[web-devel] Join support in persistent

Greg Weber greg at gregweber.info
Sun Apr 3 05:40:33 CEST 2011


That is wonderful- application joins or database joins. Lets compare to
Rails:

selectOneMany [AuthorIsPublicEq True] [AuthorNameAsc] [EntryIsPublicEqTrue]
[EntryPublishedDesc] EntryAuthorEq

Author.where(:isPublic => true).order("name").includes(:entries) &
Entry.where(:isPublic => true).order("published DESC")

Note that a Rails query is lazy and the & is combining the queries. However,
when there are no filtering criteria on the association, Rails prefers to
perform 2 queries- one to retrieve the authors, and then one to retrieve the
entries based on the author ids:

SELECT "entries".* FROM "entries" WHERE ("entries".author_id IN
(51,1,78,56,64,84,63,60))

Originally rails always did do a SQL level join, but then decided to switch
to preferring app-level, largely because it reduced the number of Ruby
objects that needed to be allocated, resulting in much better performance
for large data sets [1].

It appears that persistent.Join is instead performing an n + 1 query- one
query per each author. We should avoid these kinds of queries, and then
there will not be much point to an outer join in the db.

Looking at the behavior of Rails for joins, I don't like how it decides
between types of joins. The sql produced by Rails is not in fact identical
to the persistent one: it will do an outer join with the entry filtering in
a WHERE clause, not as part of the JOIN conditions.
If we are to support joins it needs to be very apparent which type of join
is performed.


selectOneMany doesn't have an offset and limit. If we added it we end up
with queries like this:

selectOneMany [] [] [] [] EntryAuthorEq 0 0

This function with 5+ required arguments is somewhat awkward/difficult to
use and to read. Rails is composable in a readable way because it copied
haskellDB. I would like to get away from the empty optional arguments.
I am all for adding these changes for now, I just hope we can move to a more
composable API in the future.
I thought the API that Aur came up with was a better effort in that
direction, although there are definitely practical issues with it.

[1]
http://akitaonrails.com/2008/05/25/rolling-with-rails-2-1-the-first-full-tutorial-part-2

Greg Weber

On Sat, Apr 2, 2011 at 2:50 PM, Michael Snoyman <michael at snoyman.com> wrote:

> Hey all,
>
> After a long discussion with Aur Saraf, I think we came up with a good
> approach to join support in Persistent. Let's review the goals:
>
> * Allow for non-relational backends, such as Redis (simple key-value
> stores)
> * Allow SQL backends to take advantage of the database's JOIN abilities.
> * Not force SQL backends to use JOIN if they'd rather avoid it.
> * Keep a simple, straight-forward, type-safe API like we have
> everywhere else in Persistent.
> * Cover the most common (say, 95%) of use cases out-of-the-box.
>
> So our idea (well, if you don't like it, don't blame Aur...) is to
> provide a separate module (Database.Persist.Join) which provides
> special functions for the most common join operations. To start with,
> I want to handle a two-table one-to-many relationship. For
> demonstration purposes, let's consider a blog entry application, with
> entities Author and Entry. Each Entry has precisely one Author, and
> each Author can have many entries. In Persistent, it looks like:
>
> Author
>    name String Asc
>    isPublic Bool Eq
> Entry
>    author AuthorId Eq
>    title String
>    published UTCTime Desc
>    isPublic Bool Eq
>
> In order to get a list of all entries along with their authors, you
> can use the newly added[1] selectOneMany function:
>
>    selectOneMany [AuthorIsPublicEq True] [AuthorNameAsc]
> [EntryIsPublicEqTrue] [EntryPublishedDesc] EntryAuthorEq
>
> This will return a value of type:
>
>    type AuthorPair = (AuthorId, Author)
>    type EntryPair = (EntryId, Entry)
>    [(AuthorPair, [EntryPair])]
>
> In addition to Database.Persist.Join, there is also a parallel module
> named Database.Persist.Join.Sql, which has an alternative version of
> selectOneMany that is powered by a SQL JOIN. It has almost identical
> semantics: the only catch comes in when you don't fully specify
> ordering. But then again, if you don't specify ordering in the first
> place the order of the results is undefined, so it really *is*
> identical semantics, just slightly different behavior.
>
> Anyway, it's almost 1 in the morning, so I hope I haven't rambled too
> much. The basic idea is this: Persistent 0.5 will provide a nice,
> high-level approach to relations. I'll be adding more functions to
> these modules as necessary, and I'd appreciate input on what people
> would like to see there.
>
> Michael
>
> [1]
> https://github.com/snoyberg/persistent/commit/d2b52a6a7b7a6af6234315492f24f821a0ea7ce4#diff-2
>
> _______________________________________________
> web-devel mailing list
> web-devel at haskell.org
> http://www.haskell.org/mailman/listinfo/web-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/web-devel/attachments/20110402/93e9776f/attachment.htm>


More information about the web-devel mailing list