[Haskell-cafe] Is it safe to use unsafePerformIO here?

Daniel Fischer daniel.is.fischer at web.de
Thu Sep 17 16:01:01 EDT 2009


On Thursday 17 September 2009 21:07:28, Cristiano Paris wrote:
> On Tue, Sep 15, 2009 at 11:31 PM, Daniel Fischer
> <daniel.is.fischer at web.de> wrote:
> > ...
> > Yeah, you do *not* want the whole file to be read here, except, as
> > above, for testing purposes.
>
> That's not true. Sometimes I want to, sometimes don't.

The "for the case of sorting by metadata" was tacitly assumed :)

> But I want to use the same code for reading files and exploit laziness
> to avoid reading the body.
>
> > Still, ByteStrings are probably the better choice (if you want the body
> > and that can be large).
>
> That's not a problem for now.
>
> > To avoid reading the body without unsafePerformIO:
> >
> > readBit fn
> >    = Control.Exception.bracket (openFile fn ReadMode) hClose
> >          (\h -> do
> >                l <- hGetLine h
> >                let i = read l
> >                bdy <- hGetContents h
> >                return $ Bit i bdy)
>
> Same problem with the "withFile" version: nothing gets printed if I
> try to print out the body; that's why I used seq.

Ah, yes. The file is closed too soon.
>
> I'm starting to think that the only way to do this without using
> unsafePerformIO is to make the body an IO action: under Haskell's
> assumptions, it simply isn't possible to write it otherwise, because
> Haskell enforces safety above all.

Well, what about

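readBit fn = do
    -- readFile is lazy: the body is only read as far as it is demanded
    txt <- readFile fn
    -- split at the first newline (the pattern assumes the file contains one)
    let (l,_:bdy) = span (/= '\n') txt
    return $ Bit (read l) bdy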

?
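
A self-contained sketch of the readFile approach above, for anyone who wants
to try it. The Bit record (with index as an Int) is an assumption, since the
thread only shows that index and body exist; parseBit is a hypothetical helper
that isolates the pure splitting step, and it uses drop 1 instead of the
(l,_:bdy) pattern so that a file without a newline yields an empty body rather
than a pattern-match failure.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)
import System.Environment (getArgs)

-- Assumed shape of the record; only 'index' and 'body' appear in the thread.
data Bit = Bit { index :: Int, body :: String }

-- Hypothetical helper: split the text at the first newline.
-- The body stays an unevaluated suffix of the input until someone demands it.
parseBit :: String -> Bit
parseBit txt = Bit (read l) (drop 1 rest)
  where (l, rest) = span (/= '\n') txt

-- Lazy readFile means only as much of the file is read as is demanded.
readBit :: FilePath -> IO Bit
readBit fn = parseBit <$> readFile fn

main :: IO ()
main = do
  args <- getArgs
  bits <- if null args
            -- No files given: in-memory demo. The first body is undefined,
            -- which shows that sorting by index never touches it.
            then return [parseBit ("3\n" ++ undefined), parseBit "2\nACGT\n"]
            else mapM readBit args
  mapM_ (print . index) (sortBy (comparing index) bits)
```

Run without arguments, this prints 2 and then 3 without ever evaluating a
body. One caveat with lazy readFile: each file stays open until its contents
are fully consumed or the handle is garbage collected, so this may not suit
a very large number of files.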

With

main = do
    args <- getArgs
    let n = case args of
                (a:_) -> read a
                _ -> 1000
    bl <- mapM readBit ["file1.txt","file2.txt"]
    mapM_ (print . index) $ sortBy (comparing index) bl
    mapM_ (putStrLn . take 20 . drop n . body) bl



./cparis3 30 +RTS -sstderr
2
3
CCGGGCGCGGTGGCTCACGC
CCGGGCGCGGTGGCTCACGC
         408,320 bytes allocated in the heap
           1,220 bytes copied during GC
          34,440 bytes maximum residency (1 sample(s))
          31,096 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)


./cparis3 20000 +RTS -sstderr
2
3
AAAATTAGCCGGGCGTGGTG
AAAATTAGCCGGGCGTGGTG
       1,069,168 bytes allocated in the heap
         105,700 bytes copied during GC
         137,356 bytes maximum residency (1 sample(s))
          27,344 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

./cparis3 2000000 +RTS -sstderr
2
3
CCTGGCCAACATGGTGAAAC
CCTGGCCAACATGGTGAAAC
      80,939,296 bytes allocated in the heap
       8,925,240 bytes copied during GC
         137,056 bytes maximum residency (2 sample(s))
          45,528 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

  %GC time      38.5%  (27.0% elapsed)

  Alloc rate    1,264,577,704 bytes per MUT second

  Productivity  61.5% of total user, 38.8% of total elapsed

./cparis3 20000000 +RTS -sstderr
2
3
CAGAGCGAGACTCCGTCTCA
CAGAGCGAGACTCCGTCTCA
     806,034,756 bytes allocated in the heap
      76,775,944 bytes copied during GC
         136,876 bytes maximum residency (2 sample(s))
          43,324 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:  1536 collections,     0 parallel,  0.35s,  0.35s elapsed
  Generation 1:     2 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.53s  (  0.67s elapsed)
  GC    time    0.35s  (  0.36s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.88s  (  1.02s elapsed)

  %GC time      40.0%  (34.9% elapsed)

  Alloc rate    1,526,482,681 bytes per MUT second

  Productivity  60.0% of total user, 51.7% of total elapsed

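Seems to work as desired: maximum residency stays at about 137 KB even when
dropping 20,000,000 characters, so the body is streamed rather than held in
memory.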
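
For comparison, the alternative raised earlier in the thread, making the body
an IO action, could be sketched roughly as follows. The names LazyBit,
lbIndex, lbBody, readLazyBit and bodyOf are all hypothetical and do not come
from the thread. The idea is that hGetLine is strict, so the handle can be
closed immediately after the first line, and the file is reopened only if the
body action is actually run.

```haskell
import System.Environment (getArgs)
import System.IO (IOMode (ReadMode), hGetLine, withFile)

-- Hypothetical record: the body is an action to run on demand, not a String.
data LazyBit = LazyBit { lbIndex :: Int, lbBody :: IO String }

-- Pure helper: everything after the first newline.
bodyOf :: String -> String
bodyOf = drop 1 . dropWhile (/= '\n')

readLazyBit :: FilePath -> IO LazyBit
readLazyBit fn = do
  -- hGetLine returns the whole first line before withFile closes the handle.
  l <- withFile fn ReadMode hGetLine
  -- The stored action reopens the file; nothing is read until it is run.
  return (LazyBit (read l) (bodyOf <$> readFile fn))

main :: IO ()
main = do
  files <- getArgs
  bits <- mapM readLazyBit files
  mapM_ (print . lbIndex) bits
  -- Bodies are read only here, when each action is executed.
  mapM_ (\b -> lbBody b >>= putStrLn . take 20) bits
```

The cost of this design is that every consumer of the body has to live in IO,
which is the trade-off the readFile version avoids.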

>
> Cristiano


