[Haskell-cafe] Another Question
Richard O'Keefe
ok at cs.otago.ac.nz
Thu Feb 3 22:20:25 CET 2011
On 3/02/2011, at 8:18 PM, Navin Rustagi wrote:
> hi ,
>
> I am stuck in the following problem.
>
> I am maintaining a list of tuples of the form
> ([Char],Int,Int, Int,Int) . The purpose of maintaining the tuples is that the
> program reads from a file line by line , Matches the contents with the first element of the tuple and updates the tuple respectively.
>
> The precise function I am using is as follows
>
>
> tupup::Bool->[Char]->Int->Int->Int->Int->Char->([Char],Int,Int,Int,Int)
> tupup val elf els elr ell elx ys= if val then
> case ys of 'A' -> (elf, els+1,elr,ell,elx)
> 'G' -> (elf,els, elr +1,ell,elx)
> 'C' -> (elf,els,elr, ell +1,elx)
> 'T' -> (elf,els,elr,ell, elx +1)
> else (elf,els,elr,ell,elx)
>
>
> uptable::[[Char]]->[([Char],Int,Int,Int,Int)]->[([Char],Int,Int,Int,Int)]
> uptable (xf:xs) main_array = map (\(x,y,z,r,t)-> tupup (x==xf) x y z r t (secvalue xs) ) main_array
>
>
> It gives the error ERROR - Control stack overflow. I assume it is because of the lazy evaluation .
The lazy evaluation of the ech+1 expressions, to be specific.
By the way, could I make a plea for narrower lines in your source code?
Ninety-nine columns is much wider than I find comfortable.
A little white space helps readability too.
I want to offer you a completely Haskell 98 solution.
Step 1.
Let's look at the types. Trust me, this WILL pay off. Twice.
The type (String,Int,Int,Int,Int) occurs several times here.
In fact this appears to represent a "maplet" String :-> (Int,Int,Int,Int).
We'll refactor that and make a data type for the counts.
data Codon_Counts = Codon_Counts Int Int Int Int
type Codon_Map = [(String, Codon_Counts)]
Step 2.
Two things are intermingled here: a control structure for applying a function
to the second part of maplets whose first part equals a given key, and what
to do to those second parts. Let's factor out the control structure.
map_matching_maplets :: Eq k => k -> (a -> a) -> [(k, a)] -> [(k, a)]
map_matching_maplets k f =
map (\maplet@(key,val) -> if key == k then (key,f val) else maplet)
Step 3.
Write an 'update_codon_map' function.
update_codon_map :: [String] -> Codon_Map -> Codon_Map
update_codon_map (what:how) =
map_matching_maplets what $
case secvalue how of
'A' -> \(Codon_Counts a g c t) -> Codon_Counts (a+1) g c t
'G' -> \(Codon_Counts a g c t) -> Codon_Counts a (g+1) c t
'C' -> \(Codon_Counts a g c t) -> Codon_Counts a g (c+1) t
'T' -> \(Codon_Counts a g c t) -> Codon_Counts a g c (t+1)
Step 4.
Add strictness annotations. In Haskell 98, the only place you
can put strictness annotations is in types. That's one of the
reasons I introduced the Codon_Counts type. Change just one line
so the declaration reads
data Codon_Counts
= Codon_Counts !Int !Int !Int !Int
Step 5.
I suspect that there is still a big performance problem.
I *think* that this list of tuples is *really* a map from
some key to a set of codon counts, so that a call to
update_codon_map is expected to change just one maplet.
If that's so, then having a Codon_Map type has a big payoff, because
you can do
import qualified Data.Map as Map
type Codon_Map = Map.Map String Codon_Counts
update_codon_map (what:how) map = Map.update f what
where f (Codon_Counts a g c t) =
Just $ case secvalue how of
'A' -> Codon_Counts (a+1) g c t
'G' -> Codon_Counts a (g+1) c t
'C' -> Codon_Counts a g (c+1) t
'T' -> Codon_Counts a g c (t+1)
If there are n items in the map, the list based code is O(n),
while this is O(log n).
More information about the Haskell-Cafe
mailing list