[Haskell-cafe] Towards a better time library (announcing tz)

Mihaly Barasz klao at nilcons.com
Mon Apr 7 15:49:50 UTC 2014

Hello Renzo and list,

Since my last email I have implemented the two things that I mentioned:
* a Template Haskell function that lets you include just the few
  `TZ`s you need in your binary at compile time
* a module that exposes all of the shipped time zones through a pure
  interface. If you use this module, all of the zone files are compiled
  into your binary, so you can use them completely independently of
  the platform/machine you are running on. (This is appropriate for a
  webapp, for example, where you need to display times to many users in
  many different time zones.)
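
To make the second point concrete, here is a minimal sketch of the idea of
a compiled-in, pure zone table. It is not the code of the actual module:
the `Zone` type, the `zones` table, `zoneByName`, `offsetAt` and the zone
names and numbers are all made up for illustration, and it uses a simple
linear scan where a real implementation would presumably use something
faster.

```haskell
import Data.Int (Int64)

-- Hypothetical stand-in for the library's TZ type: a list of
-- (transition instant in POSIX seconds, UTC offset in seconds),
-- sorted by instant. Not the real representation.
newtype Zone = Zone [(Int64, Int)]

-- Hypothetical compiled-in table. In a real module the entries would be
-- generated from the zone files; the names and numbers here are made up.
zones :: [(String, Zone)]
zones =
  [ ("Example/Fixed",   Zone [(minBound, 3600)])
  , ("Example/Shifted", Zone [(minBound, 3600), (1000, 7200)])
  ]

-- Pure lookup by name: no IO, no dependency on the host's zoneinfo files.
zoneByName :: String -> Maybe Zone
zoneByName name = lookup name zones

-- UTC offset in effect at a given POSIX time: the offset of the last
-- transition at or before that time (0 if there is none).
offsetAt :: Zone -> Int64 -> Int
offsetAt (Zone ts) t = foldl pick 0 ts
  where
    pick acc (at, off) = if at <= t then off else acc
```

With these definitions, `fmap (\z -> offsetAt z 1000) (zoneByName "Example/Shifted")`
evaluates to `Just 7200`, and an unknown name gives `Nothing`. Everything
stays outside of IO because the table is ordinary data in the binary.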

Renzo, please take a look at the Data.Time.Zones.All module.
I tried to implement it with your comments in mind. If you think it is
close to what you need and what you had in mind, I can release it as a
separate package with your versioning scheme.


On Tue, Apr 1, 2014 at 11:55 AM, Mihaly Barasz <klao at nilcons.com> wrote:
> On Tue, Apr 1, 2014 at 3:21 AM, Renzo Carbonara <gnuk0001 at gmail.com> wrote:
>> I'm currently a user of the `timezone-olson` and `timezone-series`
>> libraries, so I can understand the requirement for better time zone
>> representations than the one offered by `time`. I think it would be
>> best for the community if your work and the one in `timezone-olson`
>> and `timezone-series` could be integrated somehow, as there doesn't
>> seem to be a need to have two different implementations for parsing
>> the Olson file format. As far as I know, Yitzchak is willing to
>> improve his libraries, I CC him here.
> It seems that we have quite different views on things. :)  But of
> course, I agree that one good solution is much better than two
> half-baked ones; I'm just not exactly sure how to proceed. I think
> it's best if we give it some time and slowly figure out a way to
> compromise and cooperate.
>> Given that you are shipping a timezone database with your package, as
>> a user I'd really prefer `loadTZFromDB` to be `String -> Maybe TZ`
>> instead of `String -> IO TZ` so that it can be used in pure code.
>> You'd probably need to hardcode the contents of the `*.zone` files you
>> are including within a Haskell module, but that could be done
>> automatically using some script.
> This is in the works, I just didn't get to it yet.
> One problem with this is that once you use the module providing this
> pure interface, _all_ of the time zone definitions will be compiled
> into your binary, increasing its size substantially (by a few hundred
> kilobytes). It's not a problem for many uses, but this is why I don't
> want it as the only mechanism.
>> Moreover, an idea I've been toying with is writing a program that
>> parses the tzdata information files and generates a type-safe pure API
>> for interacting with the tz data in memory. For example, given the
>> tzdata information files you would generate types, values and pure
>> functions such as the following:
>>     data TimeZone
>>        = Europe__Paris
>>        | America__Buenos_Aires
>>        | ...
>>     timeZoneName :: TimeZone -> Text
>>     timeZoneName Europe__Paris = "Europe/Paris"
>>     timeZoneName America__Buenos_Aires = "America/Buenos_Aires"
>>     timeZoneName ... = ...
>>     timeZoneFromName :: Text -> Maybe TimeZone
>>     timeZoneFromName "Europe/Paris" = Just Europe__Paris
>>     timeZoneFromName "America/Buenos_Aires" = Just America__Buenos_Aires
>>     timeZoneFromName ... = ...
>>     timeZoneFromName _ = Nothing
>>     timeZoneInfo :: TimeZone -> TZ
>>     timeZoneInfo Europe__Paris = ... some hardcoded value (the `TZ` in your library) ...
>>     timeZoneInfo ... = ...
> Interesting idea, I have to think about it a bit more, but right now I
> don't see why not.
>> As a minor tidbit: If one is going to do this, I think a good idea is
>> to devise a package versioning system that somewhat follows the tz
>> database versions. This is what I had in mind for the `tzdata` package
>> I was planning to create (unless someone else does it first):
>>   The version number of the package is always @YYYYMMDD.B.C@, where
>>   @YYYYMMDD@ are the year (@YYYY@), the month (@MM@) and the day (@DD@)
>>   of the @tzdata@ release this particular version was designed for. For
>>   example, @tzdata 2013i@ was officially released on @2013-12-17@, so a
>>   version of this package that was designed for @tzdata 2013i@ will
>>   carry the version number @20131217.B.C@. However, that doesn't mean
>>   that this library won't work with versions of the @tzdata@ library
>>   different than @2013i@, it's just that support for new or old data
>>   (say, the name of a new time zone that was introduced) can be missing.
>>   The @B@ and @C@ values in the version number of this library are
>>   treated as the /major/ and /minor/ versions respectively, as suggested
>>   in the /Haskell Package Version Policy/ page:
>>   http://www.haskell.org/haskellwiki/Package_versioning_policy
> Hmm, I see the reasoning behind this, but I think it would be annoying
> for most of the users. Or not?
> If we separate only the data in a separate package (on which `tz`
> depends), then very few people would want to explicitly specify its
> version... Yeah, this might work. :)
> Mihaly
>> Maybe the `timezone-olson`, `timezone-series`, this `tzdata` idea and
>> `tz` could all live together; with `tz` depending on the others? Just
>> a thought :)
>> Regards,
>> Renzo Carbonara.
>> On Mon, Mar 31, 2014 at 3:15 PM, Mihaly Barasz <klao at nilcons.com> wrote:
>>> I would like to propose reforming the 'time' [1] library.
>>> Initially, I was just planning to announce my new 'tz' [2] library, but
>>> realized that I have a more important agenda. Namely: Haskell needs a
>>> better time library!
>>> Let me summarize what are, in my view, the biggest deficiencies of
>>> 'time':
>>> 1. Inefficient data structures and implementations.
>>> 2. Ad-hoc API which is hard to remember and frustrating to work with.
>>> 3. Conceptually wrong representations and/or missing concepts.
>>> The wonderful thyme [3] package (by Liyang HU) improves a lot on #1 by
>>> choosing better data structures and careful implementations and on #2
>>> by lensifying the API.
>>> But, it was the #3 that caused me the most frustration lately; most
>>> importantly the time zone handling.
>>> There is a TimeZone data type in 'time', but it is a misnomer: it
>>> basically represents a fixed time difference (with a label and a DST
>>> flag). 'time' basically adopts the broken approach from libc: you can
>>> work with one time zone at a time, which is defined globally for your
>>> program (via the TZ environment variable). So, the transformation
>>> between UTCTime and LocalTime which should have been a pure
>>> function can only be done in IO now. Like this:
>>>     do tz <- getTimeZone ut
>>>        return $ utcToLocalTime tz ut
>>> Oh, and just to hammer down on the point #1 from the list above. This
>>> code runs in about 6100 ns on my machine. The drop-in replacement from
>>> tz: utcToLocalTimeTZ [4] (which is actually pure) runs in 2300 ns.
>>> While this is a significant improvement, it's easy to miss the point:
>>> the bulk of the inefficiency comes from the data structures. In
>>> my main project we represent times as Int64 (raw nanoseconds since
>>> UNIX epoch; and similar representation for zoned times). And to
>>> convert those to and from different time zones we need 40 ns. That's
>>> right, a 150 _times_ improvement!  (There are many other interesting
>>> benchmark results that I could mention. An exciting bottom line: we
>>> can actually beat the libc in many use-cases!)
>>> The 'tz' package is still very much in flux. I will try to solidify
>>> the API soon, but until then it should be considered more of a proof
>>> of concept. There is some missing functionality, for example. On the
>>> other hand, there are the 'timezone-series' [5] and 'timezone-olson'
>>> [6] packages that together provide about the same functionality as
>>> 'tz' (minus the efficiency), and I'd like to explore if we could
>>> remove some of the overlap. But all kinds of suggestions and requests
>>> are welcome!
>>> More importantly, I'd like to hear the opinions of the community about
>>> the general issue of a better time library! Do we need one?  How
>>> should we proceed?  I think Haskell could potentially have
>>> one of the best time libraries, but the current de-facto standard is
>>> mediocre at best. Unfortunately, designing a good time library is very
>>> far from trivial, as many existing examples demonstrate. And I
>>> definitely don't know enough for it. (I understand time zone info
>>> files, now that I wrote tz, but that's just a tiny fraction of what's
>>> needed.)  So, if you think you can contribute to the design (have
>>> important use-cases in mind, know good examples of API, have some
>>> experience working with dates and time, etc. etc.) - speak up!
>>> Mihaly
>>> Footnotes:
>>> [1]  http://hackage.haskell.org/package/time
>>> [2]  http://hackage.haskell.org/package/tz
>>> [3]  http://hackage.haskell.org/package/thyme
>>> [4]  http://hackage.haskell.org/package/tz-
>>> [5]  http://hackage.haskell.org/package/timezone-series
>>> [6]  http://hackage.haskell.org/package/timezone-olson