Adding Network.URI.escape

Gwern Branwen gwern0 at gmail.com
Mon Jan 4 09:53:39 EST 2010


On Mon, Jan 4, 2010 at 6:15 AM, Graham Klyne <GK-lists at ninebynine.org> wrote:
> It's clearly *possible*, but where do we stop?  That said, I guess a *small*
> number of special cases would make sense, e.g.

Yes, a few is what I mean. There are no doubt endless possible cases
and layers of interpretation, but in real life two or three use cases
tend to dominate. Libraries can only hope to bundle the common special
cases, after all.

(I recall one information-theoretic paper arguing that, over the
space of all possible programs, libraries are a net negative in terms
of size; but of course, as with type safety or the absence of
pointers, we don't *want* to write all possible programs.)

>  escapeHttpOrFileUri
>
> with carefully written health warnings in the associated documentation; e.g.
> "This function applies URI escaping to an http: or file: URI on the
> assumption that the individual path segments within the URI do not contain
> '/' or '?' or '#' or [...] characters.  If any of these characters are
> present in any path segment then the URI components and path segments should
> be escaped separately before being assembled into a final URI, and no
> further escaping should be applied once the URI has been constructed (cf.
> RFC 3986 [...])", etc., etc.
>
> My point here is that crafting a clear description of when the provided
> escaping is correct to use will be somewhat harder than writing the
> functions.  Also, I suspect that escaping function will need to be a little
> more subtle than 'escapeURIString isUnescapedInURI', at least to the extent
> of splitting off the query and fragment before escaping the pieces
> separately, then re-assembling.
>
> Maybe the greatest value in doing this would be to demonstrate concretely
> the complexities inherent in URI escaping, and provide some code that can be
> adapted for different schemes and circumstances.
>
> #g

Yes, it does demonstrate that the task is difficult and messy. As it
is, someone who just wants to get a job done can easily breeze past
the danger - and that's just not the Haskell way!
Can we leave the task to you? Even just your escapeHttpOrFileUri would
be an improvement.
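
To make the suggestion concrete, here is a rough sketch of the
split-escape-reassemble approach, under the assumptions stated in the
quoted health warning. escapeSegment and escapePath are hypothetical
names, and this escapeHttpOrFileUri is only one guess at what the
proposed function might look like; none of the three is part of
Network.URI. Only escapeURIString, isUnescapedInURI and isUnreserved
are existing library functions.

```haskell
import Data.List (intercalate)
import Network.URI (escapeURIString, isUnescapedInURI, isUnreserved)

-- Hypothetical helper (not in Network.URI): escape one path segment,
-- including any '/', '?' or '#' it contains.
escapeSegment :: String -> String
escapeSegment = escapeURIString isUnreserved

-- Hypothetical helper: assemble a path from raw, unescaped segments.
escapePath :: [String] -> String
escapePath = intercalate "/" . map escapeSegment

-- Sketch of the proposed function: split off the fragment and query,
-- escape each piece, then re-assemble. Assumes no path segment
-- contains '/', '?' or '#', and that the input is not already escaped.
escapeHttpOrFileUri :: String -> String
escapeHttpOrFileUri uri = esc base ++ esc query ++ escFragment frag
  where
    (beforeFrag, frag) = break (== '#') uri
    (base, query)      = break (== '?') beforeFrag
    esc                = escapeURIString isUnescapedInURI
    -- Keep the leading '#', but escape any later '#' in the fragment.
    escFragment ('#' : rest) =
      '#' : escapeURIString (\c -> isUnescapedInURI c && c /= '#') rest
    escFragment other = esc other
```

With this sketch, escapeHttpOrFileUri "http://example.org/a b/c?q=x y"
should give "http://example.org/a%20b/c?q=x%20y", while a segment that
really does contain a slash has to go through escapePath instead, e.g.
escapePath ["a/b", "c d"]. Note that '%' is itself escaped by these
predicates, so input that is already escaped would be escaped twice,
which is exactly the "no further escaping" caveat above.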

-- 
gwern

