[Haskell-cafe] Accumulating related XML nodes using HXT
Daniel McAllansmith
dm.maillists at gmail.com
Mon Oct 30 16:10:56 EST 2006
Hello.
I have some HTML from which I want to extract records.
Each record is spread across several <tr> nodes, and the <tr> nodes of all
records share the same parent node.
Everything I've tried so far gives me the Cartesian product of the record
fields, so for the HTML fragment included below I end up with:
[ Prod "Television" 17 "/prod17" "A very nice telly."
, Prod "Television" 17 "/prod17" "Mind your fillings."
, Prod "Cyclotron" 24 "/prod24" "A very nice telly."
, Prod "Cyclotron" 24 "/prod24" "Mind your fillings."
]
instead of:
[ Prod "Television" 17 "/prod17" "A very nice telly."
, Prod "Cyclotron" 24 "/prod24" "Mind your fillings."
]
How should I go about accumulating related <tr> nodes into individual records?
Thanks
Daniel
HTML fragment follows:
...
<tr>
<tr>
<td><strong>Product:</strong></td>
<td><strong><a href="/prod17">Television</a></strong> (code: 17)</td>
</tr>
<tr>
<td><strong>Description:</strong></td>
<td>A very nice telly.</td>
</tr>
<tr>
<td><hr color="#00000"></td>
</tr>
<tr>
<td><strong>Product:</strong></td>
<td><strong><a href="/prod24">Cyclotron</a></strong> (code: 24)</td>
</tr>
<tr>
<td><strong>Description:</strong></td>
<td>Mind your fillings.</td>
</tr>
<tr>
<td><hr color="#00000"></td>
</tr>
</tr>
...
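
The effect described above can be reproduced without HXT. What follows is only a rough sketch on plain lists: the Row type, the rows value and the function names crossProduct and grouped are hypothetical stand-ins for the parsed <tr> nodes, not part of HXT or of the code I actually tried. Selecting each field independently over all rows behaves like crossProduct, whereas walking the rows in document order and consuming a product row together with the description row that follows it behaves like grouped.

```haskell
-- Hypothetical stand-in for the parsed <tr> rows; this is not HXT's node type.
data Row
  = ProductRow String Int String   -- name, code, href
  | DescriptionRow String          -- description text
  | SeparatorRow                   -- the <hr> row that ends a record
  deriving (Eq, Show)

data Prod = Prod String Int String String
  deriving (Eq, Show)

-- The rows of the HTML fragment, in document order.
rows :: [Row]
rows =
  [ ProductRow "Television" 17 "/prod17"
  , DescriptionRow "A very nice telly."
  , SeparatorRow
  , ProductRow "Cyclotron" 24 "/prod24"
  , DescriptionRow "Mind your fillings."
  , SeparatorRow
  ]

-- Each generator ranges over all rows independently, so every product row
-- is paired with every description row: the Cartesian product.
crossProduct :: [Row] -> [Prod]
crossProduct rs =
  [ Prod name code href desc
  | ProductRow name code href <- rs
  , DescriptionRow desc <- rs
  ]

-- Walk the rows in order and consume a product row together with the
-- description row immediately after it; any other row is skipped.
grouped :: [Row] -> [Prod]
grouped (ProductRow name code href : DescriptionRow desc : rest) =
  Prod name code href desc : grouped rest
grouped (_ : rest) = grouped rest
grouped [] = []
```

Loaded into GHCi, crossProduct rows gives the four combinations listed at the top of this message, and grouped rows gives the two intended records. The sketch assumes a description row always directly follows its product row, as it does in the fragment.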