Space problem: reading a list of records from a binary database

Eray Ozkural erayo@cs.bilkent.edu.tr
Tue, 28 May 2002 15:24:45 +0300


--Boundary-00=_Nc388AtInGnOEDC
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline

Hi there,

While trying to read a list of records from a binary database I think I've 
encountered a space leak but I'm not proficient enough to detect the cause of 
the error.

The culprit is the readSessionDB function in SessionDB module which I attach 
with other relevant code.

What I do in Session.readSession is using FFI to read a fixed size record from 
an input stream. That far, okay. (I have no idea if I'm doing it right, 
though)

The problem is, combined with my approach in Session.readSessionDB, it results 
in huge amounts of space allocation, about 1MB, whereas the raw data would 
consume about 50K. It would be a [Session] with about 3000 elements, each of 
which contains 3 Ints and 6 Floats.

I wonder if the following code in SessionDB is the proper way to read a list 
of records from a file. I get the feeling that the (s:rest) in the last 
return statement causes my problems. 

readSessionList :: Handle -> IO [Session]
readSessionList h =
  do session <- readSession h
     case session of
          Nothing -> return []
	  Just s -> do rest <- readSessionList h
		       return (s:rest)

Comments appreciated,

-- 
Eray Ozkural (exa) <erayo@cs.bilkent.edu.tr>
Comp. Sci. Dept., Bilkent University, Ankara
www: http://www.cs.bilkent.edu.tr/~erayo  Malfunction: http://mp3.com/ariza
GPG public key fingerprint: 360C 852F 88B0 A745 F31B  EA0F 7C07 AE16 874D 539C

--Boundary-00=_Nc388AtInGnOEDC
Content-Type: text/plain;
  charset="us-ascii";
  name="SessionDB.hs"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="SessionDB.hs"

module SessionDB
where
import IO
import System
import IOExts
import Array
import Foreign
import CForeign
import SMA
import Session

readSessionDB :: String -> IO [Session]
readSessionDB file =
  do loc <- locateStock file
     case loc of
          Nothing -> do return []
	  Just fname -> do h <- openFileEx fname (BinaryMode ReadMode)
			   putStrLn ("reading " ++ (show fname))
			   list <- readSessionList h
			   return list

readSessionList :: Handle -> IO [Session]
readSessionList h =
  do session <- readSession h
     case session of
          Nothing -> return []
	  Just s -> do rest <- readSessionList h
		       return (s:rest)

--Boundary-00=_Nc388AtInGnOEDC
Content-Type: text/plain;
  charset="us-ascii";
  name="Session.hs"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="Session.hs"

module Session
where
import IO
import System
import IOExts
import Array
import Foreign
import CForeign

data Session = Session {stockId, date, sessionId :: Int,
			lowestValue, highestValue, lastValue, averageValue,
			lot, volume :: Float} deriving (Eq, Show)

emptySession = Session {
			stockId = 0,
			date = 0,
			sessionId = 0,
			lowestValue = 0,
			highestValue = 0,
			lastValue = 0,
			averageValue = 0,
			lot = 0,
			volume = 0
		       }

readSession :: Handle -> IO (Maybe Session)
readSession h = 
 do eof <- hIsEOF h
    if eof then return Nothing else 
      do buf <- mallocArray 73:: IO (Ptr CChar)
	 readChars <- hGetBuf h buf 73
-- 	 putStrLn ("read " ++ (show readChars))
	 let sidptr = castPtr buf :: Ptr CShort
	     dateptr = castPtr (plusPtr buf 2) :: Ptr CUInt
	     sesptr = (plusPtr buf 6) :: Ptr CChar
	     ptr = castPtr (plusPtr buf 7) :: Ptr CLLong
          in do stockid <- peek sidptr
		date <- peek dateptr
		sessid <- peek sesptr
		rec <- peekArray 8 ptr
--  		destructArray 73 buf
		let list  = map (\x -> (fromIntegral x) / 10000) rec
			  :: [Float]
 		    vals = array (1,8) (zip [1..8] list)
   		    in return (Just Session {stockId = (fromIntegral stockid),
   					     date = (fromIntegral date),
   					     sessionId = (fromIntegral sessid),
   					     lowestValue = vals!1,
   					     highestValue = vals!2,
   					     lastValue = vals!3,
   					     averageValue = vals!4,
   					     lot = vals!5,
   					     volume = vals!6
   					    })

--Boundary-00=_Nc388AtInGnOEDC--