[Haskell-beginners] Parse file with existentials
Dmitriy Matrosov
sgf.dma at gmail.com
Tue Dec 13 09:24:32 UTC 2016
> {-# LANGUAGE GADTs #-}
> {-# LANGUAGE DataKinds #-}
> {-# LANGUAGE RankNTypes #-}
> {-# LANGUAGE KindSignatures #-}
> {-# LANGUAGE StandaloneDeriving #-}
> {-# LANGUAGE FlexibleInstances #-}
> {-# LANGUAGE DeriveDataTypeable #-}
>
> import Prelude hiding (getLine)
> import Data.Maybe
> import Data.List
> import Data.Typeable
> import Control.Monad.Identity
> import Control.Monad.Trans.Identity
> import Control.Monad.Writer
> import Control.Applicative
> import System.FilePath
Hi.
I ask for an opinion about interface (implemented below) for parsing
`rsync` filter files. The parser does not parse full syntax, i wrote
it for determining rsync filter dependencies, when installing them
using `shake`.
So, i distinguish two kinds of lines: include of another filter file,
which looks like
. file
and any other. I want to distinguish them at type level, so e.g. a
record function for one constructor can't be applied to another, etc.
I know, that i can prevent this at runtime by exporting only smart
constructor, but i want a type check.
> data RsyncFilterT = IncludeT | LineT
> type IncludeT = 'IncludeT
> type LineT = 'LineT
>
> -- Particular rsync filters distinguishable at type-level.
> data RsyncFilter :: RsyncFilterT -> * where
> Include :: {getInclude :: FilePath} -> RsyncFilter 'IncludeT
> Line :: {getLine :: String} -> RsyncFilter 'LineT
> deriving instance Show (RsyncFilter a)
> deriving instance Typeable RsyncFilter
For accessing records i use lenses redefined here. I redefine them
with `Applicative` instead of `Functor` to make modify/set work even
if value does not have a required record (by returning original
(unmodified) value using `pure`).
> type LensA a b = forall f. Applicative f => (b -> f b) -> a -> f a
>
> viewA :: LensA a b -> a -> b
> viewA l = fromJust . getLast . getConst . l (Const . Last . Just)
> viewAmaybe :: LensA a b -> a -> Maybe b
> viewAmaybe l = getLast . getConst . l (Const . Last . Just)
>
> modifyA :: LensA a b -> (b -> b) -> a -> a
> modifyA l f = runIdentity . l (Identity . f)
>
> modifyAA :: Applicative t => LensA a b -> (b -> t b) -> a -> t a
> modifyAA l f = runIdentityT . l (IdentityT . f)
>
> setA :: LensA a b -> b -> a -> a
> setA l s = modifyA l (const s)
Here're lenses for `RsyncFilter` (its constructors are distinguishable
at type-level, so i don't really need `Applicative` lenses here):
> includeL :: LensA (RsyncFilter 'IncludeT) FilePath
> includeL f z at Include {getInclude = x} =
> fmap (\x' -> z{getInclude = x'}) (f x)
> lineL :: LensA (RsyncFilter 'LineT) FilePath
> lineL f z at Line {getLine = x} =
> fmap (\x' -> z{getLine = x'}) (f x)
The order of lines (may) matter, so i need to store all `RsyncFilter
a` values in a list in original file order. But now the values are of
different type. So.. i use existential container:
> -- Generic container for any type of rsync filter.
> data AnyFilter = forall (a :: RsyncFilterT). Typeable a =>
> AnyFilter (RsyncFilter a)
> deriving instance Show AnyFilter
> deriving instance Typeable AnyFilter
And still i want to work on values of certain type to have some
guarantees against misuse, so i need to cast `AnyFilter` back into
`RsyncFilter` value:
> -- Extract rsync filter from AnyFilter.
> getFilter :: (forall (a :: RsyncFilterT). Typeable a =>
> RsyncFilter a -> b) -> AnyFilter -> b
> getFilter f (AnyFilter x) = f x
and here i also want to use lenses, but now the value may be of
different type, that the lens expect, so i really need `Applicative`
lenses here:
> rsyncIncludeL' :: LensA AnyFilter (RsyncFilter 'IncludeT)
> rsyncIncludeL' f z = maybe (pure z) (fmap AnyFilter . f)
(getFilter cast z)
> rsyncIncludeL :: LensA AnyFilter FilePath
> rsyncIncludeL = rsyncIncludeL' . includeL
>
> rsyncLineL' :: LensA AnyFilter (RsyncFilter 'LineT)
> rsyncLineL' f z = maybe (pure z) (fmap AnyFilter . f)
> (getFilter cast z)
> rsyncLineL :: LensA AnyFilter String
> rsyncLineL = rsyncLineL' . lineL
Then i define another Read/Show class just to be able to keep default
Read/Show instances:
> class Serialize a where
> fromString :: String -> Maybe a
> toString :: a -> String
>
> instance Serialize (RsyncFilter 'LineT) where
> fromString = Just . Line
> toString (Line xs) = xs
>
> -- RULE and PATTERN separator is space (`_` not supported).
> -- Only short rule names without modifiers are supported.
> instance Serialize (RsyncFilter 'IncludeT) where
> fromString = go . break (== ' ')
> where
> go :: (String, String) -> Maybe (RsyncFilter 'IncludeT)
> go (r, _ : x : xs)
> | r == "." = Just (Include (x : xs))
> go _ = Nothing
> toString (Include xs) = ". " ++ xs
>
> instance Serialize AnyFilter where
> fromString x =
> fmap AnyFilter (fromString x :: Maybe (RsyncFilter
'IncludeT))
> <|> fmap AnyFilter (fromString x :: Maybe (RsyncFilter 'LineT))
> toString x = fromMaybe "" $
> fmap toString (viewAmaybe rsyncIncludeL' x)
> <|> fmap toString (viewAmaybe rsyncLineL' x)
and a lens from String to AnyFilter, which effectively parses file and
writes it back:
> rsyncAnyL :: LensA String AnyFilter
> rsyncAnyL f z = maybe (pure z) (fmap toString . f) (fromString z)
And here is how i use this:
> -- | Replace path prefix, if matched.
> replacePrefix :: FilePath -> FilePath -> FilePath -> FilePath
> replacePrefix old new x = maybe x (combine new . joinPath) $
> -- For ensuring that path prefix starts and ends at path
> -- components (directories) boundaries, i first split them.
> stripPrefix (splitDirectories old) (splitDirectories x)
>
> -- | Rewrite path in rsync inlcude line @line@ from source path
> -- @srcdir@ to install path @prefix@
> --
> -- > usedIncludes srcdir prefix line
> --
> -- and collect (rewritten) rsync include pathes in @Writer@ monad.
> -- Other lines return as is.
> usedIncludes :: FilePath -- ^ Source path.
> -> FilePath -- ^ Install path.
> -> String -- ^ Line from rsync filter file.
> -> Writer [FilePath] String
> usedIncludes srcdir prefix =
> modifyAA (rsyncAnyL . rsyncIncludeL) $ \x -> do
> let x' = replacePrefix srcdir prefix x
> tell [x']
> return x'
and then a shake rule:
-- | Add file rule for instaling rsync filters with extension
-- @ext@, rewriting source path @srcdir@ to install path @prefix@
-- in any rsync includes:
--
-- > rsyncFilter ext srcdir prefix
--
rsyncFilter :: String -- ^ Extension.
-> FilePath -- ^ Install path.
-> FilePath -- ^ Source path.
-> Rules ()
rsyncFilter ext prefix srcdir = prefix ++ "//*" <.> ext %> \out -> do
let src = replacePrefix prefix srcdir out
ls <- readFileLines src
let (rs, incs) = runWriter $ mapM (usedIncludes srcdir prefix) ls
need incs
putNormal $ "> Write " ++ out
writeFileChanged out . unlines $ rs
I probably won't think too much about this API, if i haven't read
[Luke Palmer's post about existentials][1] . And now i doubt, did i
fall into the same trap with existentials and does not see an obvious
solution with functions?
[1]:
https://lukepalmer.wordpress.com/2010/01/24/haskell-antipattern-existential-typeclass/
--
Dmitriy Matrosov
More information about the Beginners
mailing list