[Haskell-beginners] first open source haskell project and a mystery to boot
Brent Yorgey
byorgey at seas.upenn.edu
Thu Oct 13 18:18:40 CEST 2011
On Wed, Oct 12, 2011 at 11:59:30AM -0700, Alia wrote:
> --------------------------------------------------------------------
> -- Testing Area
> --------------------------------------------------------------------
> outlook s
> | s == "sunny" = 1
> | s == "overcast" = 2
> | s == "rain" = 3
>
> temp :: (Real a, Fractional n) => a -> n
> temp i = (realToFrac i) / (realToFrac 100)
>
> humidity :: (Real a, Fractional n) => a -> n
> humidity i = (realToFrac i) / (realToFrac 100)
>
>
> windy x
> | x == False = 0
> | x == True = 1
>
> -- attributes
> a1 = Discrete outlook
> a2 = Continuous temp
> a3 = Continuous humidity
> a4 = Discrete windy
>
> outlookData = ["sunny","sunny","overcast","rain","rain","rain","overcast","sunny","sunny","rain","sunny","overcast","overcast","rain"]
> tempData = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
> humidityData = [85, 90, 78, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 80]
> windyData = [False, True, False, False, False, True, True, False, False, False, True, True, False, True]
> outcomes = [0,0,1,1,1,0,1,0,1,1,1,1,1,0]
>
> d1 = zip outlookData outcomes
> d2 = zip tempData outcomes
> d3 = zip humidityData outcomes
> d4 = zip windyData outcomes
>
> t1 = id3 [a1] d1
> t2 = id3 [a2] d2
> t3 = id3 [a3] d3
> t4 = id3 [a4] d4
>
> --t5 = id3 [a1,a2,a3,a4] [d1,d2,d3,d4]
> -- doesn't work because you can't mix strings and numbers in a list
> --
Even if you could mix strings and numbers in a list, this still
wouldn't work, because [d1,d2,d3,d4] has the wrong type: d1, d2,
etc. are each lists of pairs, so [d1,d2,d3,d4] is a list of lists of
pairs, whereas id3 expects a single list of pairs as its second
argument.

I think what you really want is to combine all the data for each
observation into a single structure. Something like this:

-- zipWith4 lives in Data.List; this import goes with the others at
-- the top of the module
import Data.List (zipWith4)

-- one observation: outlook, temperature, humidity, windy
data Item = Item String Double Double Bool

outlook (Item "sunny" _ _ _) = 1
outlook (Item "overcast" _ _ _) = 2
outlook (Item "rain" _ _ _) = 3

temp (Item _ i _ _) = (realToFrac i) / (realToFrac 100)

humidity (Item _ _ i _) = (realToFrac i) / (realToFrac 100)

windy (Item _ _ _ False) = 0
windy (Item _ _ _ True) = 1

-- attributes
a1 = Discrete outlook
a2 = Continuous temp
a3 = Continuous humidity
a4 = Discrete windy

outlookData =
  ["sunny","sunny","overcast","rain","rain","rain","overcast",
   "sunny","sunny","rain","sunny","overcast","overcast","rain"]
tempData = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
humidityData = [85, 90, 78, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 80]
windyData = [False, True, False, False, False, True, True, False,
             False, False, True, True, False, True]
outcomes = [0,0,1,1,1,0,1,0,1,1,1,1,1,0]

-- one (Item, outcome) pair per observation
d = zip (zipWith4 Item outlookData tempData humidityData windyData)
        outcomes

t1 = id3 [a1] d
t2 = id3 [a2] d
t3 = id3 [a3] d
t4 = id3 [a4] d
t5 = id3 [a1,a2,a3,a4] d

Now t5 works just fine.
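
If you want to see why the attributes can now share one list, here is
a small standalone sketch you can load into GHCi without the rest of
your module. Note that the Attribute type below is only my guess at
what your Discrete and Continuous constructors look like (I haven't
seen the real definition), and evalAttr and featureRows are made-up
helpers for illustration, not part of your id3 code. It uses just the
first four observations from your data. I also added a catch-all case
to outlook so an unexpected string gives a readable error.

```haskell
import Data.List (zipWith4)

-- One observation: outlook, temperature, humidity, windy.
data Item = Item String Double Double Bool
  deriving (Show, Eq)

-- ASSUMPTION: a hypothetical stand-in for the project's attribute type.
-- The real definition is not shown in this thread.
data Attribute a
  = Discrete (a -> Int)
  | Continuous (a -> Double)

outlook :: Item -> Int
outlook (Item "sunny" _ _ _)    = 1
outlook (Item "overcast" _ _ _) = 2
outlook (Item "rain" _ _ _)     = 3
outlook (Item other _ _ _)      = error ("unknown outlook: " ++ other)

temp :: Item -> Double
temp (Item _ t _ _) = t / 100

humidity :: Item -> Double
humidity (Item _ _ h _) = h / 100

windy :: Item -> Int
windy (Item _ _ _ w) = if w then 1 else 0

-- Every attribute is a function from the same Item type, so they all
-- have type Attribute Item and fit in a single list.
attributes :: [Attribute Item]
attributes =
  [Discrete outlook, Continuous temp, Continuous humidity, Discrete windy]

-- First four observations from the original data, one list per column.
outlookData :: [String]
outlookData = ["sunny", "sunny", "overcast", "rain"]

tempData, humidityData :: [Double]
tempData     = [85, 80, 83, 70]
humidityData = [85, 90, 78, 96]

windyData :: [Bool]
windyData = [False, True, False, False]

outcomes :: [Int]
outcomes = [0, 0, 1, 1]

-- zipWith4 turns the four columns into one Item per row; zip then
-- pairs each Item with its outcome.
observations :: [(Item, Int)]
observations =
  zip (zipWith4 Item outlookData tempData humidityData windyData) outcomes

-- Hypothetical helper: apply an attribute to an item, converting
-- discrete values to Double so every column has the same type.
evalAttr :: Attribute a -> a -> Double
evalAttr (Discrete f)   x = fromIntegral (f x)
evalAttr (Continuous f) x = f x

-- Hypothetical helper: one row of attribute values per observation.
featureRows :: [[Double]]
featureRows = [map (`evalAttr` item) attributes | (item, _) <- observations]
```

Typing observations or featureRows at the GHCi prompt shows the
combined rows. The point is that the string, the numbers, and the
Bool all live inside an Item, so the list itself only ever contains
one type.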
-Brent