[Haskell-cafe] Slow IO

Daniel Fischer daniel.is.fischer at web.de
Mon Sep 11 10:05:38 EDT 2006


Am Sonntag, 10. September 2006 02:29 schrieben Sie:
> Hello,
>
>   Try Don Stewart's ByteString library
> (http://www.cse.unsw.edu.au/~dons/fps.html). It is much faster than
> the standard Haskell IO and now has lazy.
>
> -Jeff

Yay, that's really an improvement!
Depending on the size of the file/graph data, using ByteString.Lazy.Char8 
reduces the run time by ranging from 'just a little' to a factor of over 30 
-- and some data that made the programme segfault before now run in a couple 
of seconds.
Thanks Bryan, Simon, David, Don & Duncan.

However, the programme still spends the overwhelming part of the time doing 
IO, and I've started thinking about it. 
The problem spec states that the input file contains about 500 test cases, 
each given by between 1 and 100,000 lines, each line containing a single word 
of between 2 and 1000 letters.
So the file should be about 12.5G on average.
A time limit of 7s is given. 
So even if we just counted newlines, we would have to scan 1,700 million 
(1.7*10^9) chars per second. 
Could any ordinary computer of today do that, using whatever language?

Cheers,
Daniel

>
> On 9/9/06, Daniel Fischer <daniel.is.fischer at web.de> wrote:
> > Hello all,
> > Now I have an IO-problem, too.
> > SPOJ problem 41 asks basically to determine whether a directed graph
> > allows a path that uses every edge exactly once. The data from which the
> > graphs are to be constructed are contained in a (huge) file, every item
> > (number of test cases, size of test case, edges) on a single line. I've
> > created my own test case file (much smaller, but already 217 MB, 2.2
> > million lines) to check my programme's performance. Now profiling reveals
> > that over 90% of time and allocation are consumed by reading the file
> > (line by line, reading the whole file as a lazy String and then chopping
> > that to pieces is rather worse).
> >
> > So how do I quickly
> > 1. read an Integer n from the file,
> > 2. read the next n lines and
> > 3. pass the first and last letter of those to the function that builds
> > and checks the graph?
> >
> > When that is achieved, I could see whether my algorithm is good.
> >
> > Thanks for any help,
> > Daniel
> > --
> >

-- 

"In My Egotistical Opinion, most people's C programs should be
indented six feet downward and covered with dirt."
	-- Blair P. Houghton



More information about the Haskell-Cafe mailing list