[Haskell-cafe] Parsing Atom feed link
fr33domlover
fr33domlover at riseup.net
Sat Sep 26 07:59:12 UTC 2015
Hello,
I'm using the 'feed' package to parse news feeds, and I noticed that it fails to
parse the item links of many feeds. I investigated, and this appears to be the cause.
Many Atom feeds in practice seem to publish their item links like this:
<link href="http://....."/>
But in the 'feed' package, a link is taken to be an item link only if its
relation is alternate, i.e. only if it carries the attribute rel="alternate".
So for these feeds 'getItemLink' returns 'Nothing'.
A quick workaround is, of course, to go over the links manually, which solves the
issue locally, but I think it should be solved more generally.
Is the problem in 'feed' which recognizes links incorrectly, or do those Atom
feeds simply use the wrong way to publish item links? Are they required to use
"alternate"?
I looked at the examples at https://tools.ietf.org/html/rfc4287 and I just
checked Octopress (Ruby) and Ikiwiki (Perl) generated feeds. The RFC's examples
have item links without "alternate" (as the only item link provided) and both
generators I mentioned create feed item links without "alternate".
Should 'feed' be fixed to recognize them?
From the RFC:

    atom:link elements MAY have a "rel" attribute that indicates the
    link relation type. If the "rel" attribute is not present, the
    link element MUST be interpreted as if the link relation type
    is "alternate".
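
For illustration, here is a minimal sketch of the manual workaround with the
RFC's defaulting rule applied. The 'Link' type and the names 'effectiveRel' and
'itemLink' are hypothetical stand-ins made up for this example, not the 'feed'
package's actual types or API, and the URLs are placeholders.

```haskell
import Data.Maybe (fromMaybe, listToMaybe)

-- Hypothetical stand-in for an Atom link (NOT the 'feed' package's type).
data Link = Link
  { linkHref :: String
  , linkRel  :: Maybe String  -- Nothing when the "rel" attribute is absent
  } deriving (Show, Eq)

-- Per the RFC text quoted above: a link without "rel" is interpreted
-- as if its relation type were "alternate".
effectiveRel :: Link -> String
effectiveRel = fromMaybe "alternate" . linkRel

-- Pick the href of the first link whose effective relation is "alternate".
itemLink :: [Link] -> Maybe String
itemLink links =
  listToMaybe [ linkHref l | l <- links, effectiveRel l == "alternate" ]
```

With these definitions, itemLink [Link "http://example.org/post" Nothing]
gives Just "http://example.org/post", whereas a check that demands an explicit
rel="alternate" would give Nothing for the same input.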
--fr33