[Haskell-cafe] Re: the Network.URI parser
Peter Gammie
peteg42 at gmail.com
Tue May 27 22:23:49 EDT 2008
On 27/05/2008, at 6:08 PM, Neil Mitchell wrote:
>> It most certainly is a security flaw.
>
> In the src of an img, yes, probably. In the href of a link, it's a
> completely valid thing to do - and one that I've done loads of times.
> The URI is fine, it's just the particular location that is dodgy.
Sure, but for other reasons (potential inaccessibility) I am quite
happy to ban JavaScript from URIs. (Not all URIs, just the ones coming
from untrusted users.)
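
As a rough illustration of that kind of restriction (not the code running on the forum), the sketch below accepts a URI from an untrusted user only if it is absolute and its scheme is on a whitelist. The names allowedSchemes, schemeOf and isUntrustedUriAllowed are hypothetical, and the choice of http and https is an assumption. It uses only base so that it stands alone; with Network.URI one would instead inspect uriScheme on the result of parseURI.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit, toLower)

-- Hypothetical whitelist: the only schemes accepted from untrusted users.
allowedSchemes :: [String]
allowedSchemes = ["http", "https"]

-- Extract the scheme following RFC 3986:
--   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
-- Returns Nothing for relative references and malformed schemes.
schemeOf :: String -> Maybe String
schemeOf s =
  case span isSchemeChar s of
    (sch@(c : _), ':' : _) | isAsciiLetter c -> Just (map toLower sch)
    _                                        -> Nothing
  where
    isAsciiLetter ch = isAsciiLower ch || isAsciiUpper ch
    isSchemeChar ch  = isAsciiLetter ch || isDigit ch || ch `elem` "+-."

-- Whitelist check: anything without a recognised, allowed scheme is rejected.
isUntrustedUriAllowed :: String -> Bool
isUntrustedUriAllowed s = maybe False (`elem` allowedSchemes) (schemeOf s)
```

Whitelisting rather than blacklisting means that javascript: in any capitalisation, and obfuscated variants containing whitespace, are rejected simply because they never match an allowed scheme.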
>> whole pile of dodgy URIs. Most get culled (in my case) by the HaXml parser
>> and/or XHTML 1.0 Strict validation, and now I hope to eliminate the rest by
>> carefully handling the URIs.
>
> I don't think that's possible. A URI can validly have javascript, and
> can validly be a lot of things which are unsafe.
Sure, I now realise my notion of an allowable URI is stricter than the
RFC's (it is an additional restriction on top of it).
One way to show the URI is valid is to fetch what it is pointing to,
and ensure it is an image or whatever.
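
A minimal sketch of the "ensure it is an image" half of that idea, assuming the bytes have already been fetched by some other means (the fetching itself is not shown). It compares the start of the body against the leading byte signatures of PNG, JPEG and GIF. The names imageSignatures and sniffImage are hypothetical, and a matching signature only shows what the data claims to be, not that it is harmless.

```haskell
import Data.List (isPrefixOf)
import Data.Word (Word8)

-- Leading byte signatures for a few common image formats.
imageSignatures :: [(String, [Word8])]
imageSignatures =
  [ ("png",  [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
  , ("jpeg", [0xFF, 0xD8, 0xFF])
  , ("gif",  [0x47, 0x49, 0x46, 0x38])  -- "GIF8", covers GIF87a and GIF89a
  ]

-- Hypothetical helper: name the image format the fetched body starts
-- with, or Nothing if it matches none of the known signatures.
sniffImage :: [Word8] -> Maybe String
sniffImage body =
  case [name | (name, sig) <- imageSignatures, sig `isPrefixOf` body] of
    (name : _) -> Just name
    []         -> Nothing
```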
>> On that topic, does anyone have any good advice for handling these things?
>
> My advice is that you are targeting security at the wrong level. You
> shouldn't be cleaning the HTML to get a secure page, you should be
> having the level that interprets the HTML be secure regardless of the
> input.
I am taking comments on a web forum from arbitrary people. The
interpretation of the HTML occurs at the user's browser. A lot of
people will be using outdated browsers (IE 5.5 / 6), ergo security (at
the source) becomes my problem. I cannot force them to upgrade their
browsers.
>> If anyone knows of the state-of-the-art in this area, I'd appreciate a
>> pointer.
>>
>> http://htmlpurifier.org/live/smoketests/printDefinition.php
>>
>> doesn't seem to think the style attribute is unsafe. Have they not been
>> following the MySpace fiascos?
>
> Safety is a property of the HTML viewer, not of the HTML or CSS.
Well, yes and no. I am heavily restricting the XHTML I accept (e.g. no
scripts, no style attribute, ...), in an attempt to keep things
visually accessible and avoid phishing attacks. I was alluding to the
use of absolute positioning in CSS. If I had a CSS parser I might
allow the style attribute.
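
To make the shape of such a restriction concrete, here is a small sketch of a whitelist check over a parsed document tree. The Node type is a stand-in invented for this example, not HaXml's representation, and the element and attribute lists in allowedElements are assumptions rather than the actual policy. Anything not listed, including script elements and style attributes, causes the whole fragment to be refused.

```haskell
-- Hypothetical, simplified document tree (not HaXml's types).
data Node
  = Text String
  | Element String [(String, String)] [Node]  -- name, attributes, children
  deriving (Eq, Show)

-- Assumed policy: each allowed element with the attributes it may carry.
-- Neither "script" nor the "style" attribute appears anywhere.
allowedElements :: [(String, [String])]
allowedElements =
  [ ("p", [])
  , ("em", [])
  , ("strong", [])
  , ("blockquote", [])
  , ("a", ["href"])
  ]

-- Accept a fragment only if every element and attribute is whitelisted.
-- Attribute values (such as the href URI) still need their own checks.
acceptable :: Node -> Bool
acceptable (Text _) = True
acceptable (Element name attrs kids) =
  case lookup name allowedElements of
    Nothing      -> False
    Just okAttrs -> all ((`elem` okAttrs) . fst) attrs && all acceptable kids
```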
Safety for me involves making sure that what is displayed is
trustworthy and easily identifiable as such. This is not something the
HTML viewer can always help with.
I think we're off-topic enough for me to stop here. Thanks for your
comments.
cheers
peter