[Haskell-cafe] OverloadedStrings, but with compile-time invariants

Max Ulidtko ulidtko at gmail.com
Fri Dec 8 15:50:58 UTC 2023


Greetings @cafe,

Transparent wrapping of string literals into custom types, driven 
automatically by type inference, is of course very useful. I'm getting 
huge mileage off GHC's OverloadedStrings when working with e.g. the 
Text type:

class IsString a where
  fromString :: String -> a

instance IsString Text {- ... -}

greeting :: Text
greeting = "Hello"

But I hope there's possibly room for improvement. I've got a bag of 
several use-cases; perhaps the simplest is the following.

Consider a type called NonEmptyText which is a (wrapper around) Text, 
but with an added invariant that the contained text is never the empty 
string "".

Depending on codebase, such a type can be very useful too, allowing to 
offload the burden of keeping track of the invariant onto machine; the 
typechecker will show me all the spots to handle, current or future 
ones. IOW, this type enables writing a non-boolean-blind null check, 
toNonEmptyText :: ToText str => str -> Maybe NonEmptyText — which 
makes not only the programmer, but also the compiler, aware of the 
null-check result.

I'm focusing the example on Text to somewhat maintain generality of the 
idea. NonEmpty Char can be an alternative formulation of "non-empty 
String"; length-indexed vectors could possibly handle this too. But 
there're all sorts of invariants besides non-emptiness one may wish to 
express.

I'll omit other examples; but to summarize, I've got plenty of 
practical motivation to get "smart literals".

And to the point: IsString as it stands, forces us to write much 
unsatisfactory instances:

instance IsString NonEmptyText where
  fromString (c:cs) = NonEmptyText.new c cs
  fromString [] = error "unavoidable runtime panic" -- argh

... despite fromString being invoked, presumably with 
OverloadedStrings, on a string literal — which is statically known at 
compile-time.

I'd imagine upgrading the interface with GHC's own TypeLits:

class KnownSymbol lit => IsString' str lit where
  fromString' :: proxy lit -> str

This 2-parameter formulation of IsString enables pretty straightforward 
user code, along the lines of:

type family MyInvariant (s :: Symbol) :: Constraint where
  MyInvariant "" = TypeError ('Text "Empty string literal")
  MyInvariant _s = ()

instance (MyInvariant lit, KnownSymbol lit) => IsString' NonEmptyText 
lit where
  fromString' = nothingImpossible . NeT.fromText . toText . symbolVal 
where
    nothingImpossible (Just x) = x
    nothingImpossible Nothing = error "typelevel invariant violated"

text1, text2 :: NonEmptyText
text1 = fromString' $ Proxy @"hello, types"
text2 = fromString' $ Proxy @""            -- compile error here, as 
expected!

With the primes, this is possible to write today. With 
AllowAmbiguousTypes, the unwieldy syntax can be abbreviated somewhat, 
e.g. smartLiteral @"overloaded string literal with compiler-checked 
invariant" — but is still rather awkward... in the same way that 
ubiquitous T.pack is annoying enough to want to enable 
OverloadedStrings and stop seeing it all over the place.

The question is, could this IsString' somehow become the IsString that 
OverloadedStrings is friends with?
I guess, not soon, with compatibility in mind... But in principle? Is 
it even worth trying to pursue the GHC Proposal path?
Perhaps a new extension OverloadedTypelevelStrings?

Relatedly... Can a GHC Plugin do this?

And overall, to zoom out of "XY Problem" pitfalls: am I missing a 
neater way to have "partial" IsString instances, those that can reject 
some literals at compile-time?

Having this code work is the goal:

text1, text2 :: NonEmptyText
text1 = "hello, types"
text2 = "" -- compile error

Best regards,
Max

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20231208/b4908c9c/attachment.html>


More information about the Haskell-Cafe mailing list