<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div></div><div><br></div><div><br>On Aug 30, 2018, at 11:21, Olaf Klinke &lt;<a href="mailto:olf@aatal-apotheke.de">olf@aatal-apotheke.de</a>&gt; wrote:<br><br></div><blockquote type="cite"><br><span>[*] To the parser experts on this list: How much time should a parser take that processes a 50MB, 130000-line text file, extracting 5 values (String, UTCTime, Int, Double) from each line?</span><br><br></blockquote><br><div>attoparsec combined with a streaming adapter for pipes/conduit/streaming should easily handle tens of megabytes per second and hundreds of thousands of lines per second. </div><div><br></div><div>For an example, check out <a href="https://github.com/wyager/Callsigns/blob/master/Callsigns.hs">https://github.com/wyager/Callsigns/blob/master/Callsigns.hs</a></div><div><br></div><div>It parses a pipe-separated-value file from the FCC pretty quickly. As I recall, it goes through a &gt;100MB file in under three seconds, and it has to do a bunch of other work besides. </div><div><br></div><div>I also ported the above code to use Streaming instead of Pipes. As I recall, with Streaming master, the parser I use to read the dictionary:</div><div><br></div><div><span style="background-color: rgba(255, 255, 255, 0);">takeTill isEndOfLine <span class="pl-k" style="box-sizing: border-box;">&lt;*</span> endOfLine</span></div><div><span style="background-color: rgba(255, 255, 255, 0);"><br></span></div><div><span style="background-color: rgba(255, 255, 255, 0);">handles about 3 million lines per second. I can’t remember what the number is for Pipes, but it’s probably similar. 
That’s really good for such a simple thing to write!</span></div><div><br></div><div>Unfortunately, there is a performance bug in Streaming; the fix is in master but has sat unreleased for a number of months :-/</div><div><br></div><div>—Will</div></body></html>