GHC API: monad and error handling

Thomas Schilling nominolo at googlemail.com
Mon Jul 14 12:28:23 EDT 2008


On 14 Jul 2008, at 16:58, Henning Thielemann wrote:
>
> Another instance of mixing errors (ApiError) and exceptions  
> (GhcError, GhcIOException)? How should I handle errors that I made  
> myself by calling the GHC API the wrong way? Of course, I must  
> correct those calls to GHC instead. The clean design would be to  
> drop the ApiError constructor and indicate wrong uses of the GHC  
> library with simple 'error'.

Ok, I read your articles.  While I must say the terminology is a bit  
confusing, I agree with the overall distinction.  However, for an  
API, using 'error' should be done very seldom, preferably never.   
Thus I probably should reconsider the use case and think about how to  
restructure the API in a way that this error never occurs.  Oleg and  
Chung-chieh's "Lightweight Static Capabilites"[1] show some  
interesting and simple ideas how one could do that.

   [1]: http://okmij.org/ftp/papers/lightweight-static-capabilities.pdf

>> Note that we have to wrap IO exceptions and propagate them  
>> separately in the Ghc monad.(**)
>
> That's a good thing. All (IO) exceptions should be handled this way.
>  (See the extensible exception thread on this list.)
>
>
>>  For the latter It is, however, not always clear what is a compile  
>> error and what is not.  Consider, for example, the following  
>> function:
>>
>> -- | Takes a 'ModuleName' and possibly a 'PackageId', and consults  
>> the
>> -- filesystem and package database to find the corresponding  
>> 'Module',-- using the algorithm that is used for an @import@  
>> declaration.
>> findModule :: ModuleName -> Maybe PackageId -> Ghc Module
>> findModule mod_name maybe_pkg = withSession $ \hsc_env ->
>> let
>>       dflags = hsc_dflags hsc_env
>>       hpt    = hsc_HPT hsc_env
>>       this_pkg = thisPackage dflags
>> in
>> case lookupUFM hpt mod_name of
>>   Just mod_info -> return (mi_module (hm_iface mod_info))
>>   _not_a_home_module -> do
>>         -- XXX: should we really throw IO exceptions here?
>> 	  res <- io $ findImportedModule hsc_env mod_name maybe_pkg
>> 	  case res of
>> 	    Found _ m | modulePackageId m /= this_pkg -> return m
>> 		      | otherwise -> throwDyn (CmdLineError (showSDoc $
>> 					text "module" <+> pprModule m <+>
>> 					text "is not loaded"))
>> 	    err -> let msg = cannotFindModule dflags mod_name err in
>> 		   throwDyn (CmdLineError (showSDoc msg))
>
>> This either returns a Module or throw an exception by looking it  
>> up in the home package table.  Otherwise, it throws a CmdLineError  
>> exception, which is intended for reporting failure inside GHCi.  I  
>> guess the proper way for this function would be to throw an  
>> 'ApiError', i.e., we expect the looked-up module to be existing in  
>> the home package table and to be loaded.
>
> Can you please explain, what you are doing here?
>
> If a module cannot be found on disk, this is certainly an (IO)  
> exception, since you cannot enforce the existence of a file. If a  
> module cannot be found in an internal table, although it should be  
> there, this is an 'error'. However one should try to minimize such  
> situations, may by organizing the lookup in another way.

Ah sorry.  This is an existing function, that tries to find a module  
in the set of installed packages (or only within a specified  
package).  I think it can reasonably be expected to fail.  For  
example, the package could be hidden or the module is not part of an  
existing package.  Well, actually, the original type was

   findModule :: Session -> ModuleName -> Maybe PackageId -> IO Module

and throwing an exception was used as error the reporting mechanism.   
I think Simon's Extensible Exception paper actually mentions that  
they are sometimes used as error reporting mechanism "for  
convenience".  Of course, we can and should normalise this behaviour  
at the API-level, so I will study your list of articles and the  
mailing list thread (I was hoping I could avoid reading the whole  
thread ;) )
>
>> (*) an alternative implementation that would probably be a bit  
>> more efficient in case of errors could use continuation-passing  
>> style:
>>
>> newtype Ghc a
>>    = Ghc { unGhc :: forall ans.
>>                     Session
>>                  -> (GhcError -> IO ans)  -- failure continuation
>>                  -> (a -> IO ans)         -- success continuation
>>                  -> IO ans
>
> Why do you think it would be more efficient?
>

It performs less matching on constructors.  Consider the monad  
instance of the Either variant:

   instance Monad Ghc where
     return x = Ghc $ \_ -> return (Right x)
     m >>= k  = Ghc $ \s -> do rslt <- runGhc m s
                               case rslt of
                                 Left err  -> return (Left err)
                                 Right a   -> runGhc (k a) s

Here, every >>= immediately deconstructs the value constructed by the  
monad.  If an error is thrown, all >>= calls will merely deconstruct  
the value of the previous call, and reconstruct it immediately.

The CPS variant looks like this:

   instance Monad Ghc where
     return x = Ghc $ \s fk k -> k x
     m >>= f  = Ghc $ \s fk k ->
                  runGhc' m s fk (\a -> runGhc' (f a) s fk k)

This simply adjusts the continuation and the failure continuation is  
just passed through and is called directly in case of an error

   throw err = Ghc $ \_ fk _ -> fk err

i.e., the CPS immediately jumps to the error handler and aborts the  
current continuation.

/ Thomas

--
Once upon a time is now.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 194 bytes
Desc: This is a digitally signed message part
Url : http://www.haskell.org/pipermail/libraries/attachments/20080714/4badad0e/PGP.bin


More information about the Libraries mailing list