[Haskell-cafe] Towards a better time library (announcing tz)
Mihaly Barasz
klao at nilcons.com
Tue Apr 1 09:55:57 UTC 2014
On Tue, Apr 1, 2014 at 3:21 AM, Renzo Carbonara <gnuk0001 at gmail.com> wrote:
> I'm currently a user of the `timezone-olson` and `timezone-series`
> libraries, so I can understand the requirement for better time zone
> representations than the one offered by `time`. I think it would be
> best for the community if your work and the one in `timezone-olson`
> and `timezone-series` could be integrated somehow, as there doesn't
> seem to be a need to have two different implementations for parsing
> the Olson file format. As far as I know, Yitzchak is willing to
> improve his libraries, I CC him here.
It seems that we have quite different views on things. :) But of
course, I agree that one good solution is much better than two
half-baked ones; I'm just not exactly sure how to proceed. I think
it's best if we give it some time and slowly figure out a way to
compromise and cooperate.
> Given that you are shipping a timezone database with your package, as
> a user I'd really prefer `loadTZFromDB` to be `String -> Maybe TZ`
> instead of `String -> IO TZ` so that it can be used in pure code.
> You'd probably need to hardcode the contents of the `*.zone` files you
> are including within a Haskell module, but that could be done
> automatically using some script.
This is in the works; I just haven't gotten to it yet.
One problem with this is that once you use the module providing this
pure interface, _all_ of the time zone definitions will be compiled
into your binary, increasing its size substantially (by a few hundred
kilobytes). It's not a problem for many uses, but this is why I don't
want it as the only mechanism.
> Moreover, an idea I've been toying with is writing a program that
> parses the tzdata information files and generates a type-safe pure API
> for interacting with the tz data in memory. For example, given the
> tzdata information files you would generate types, values and pure
> functions such as the following:
>
> data TimeZone
>   = Europe__Paris
>   | America__Buenos_Aires
>   | ...
>
> timeZoneName :: TimeZone -> Text
> timeZoneName Europe__Paris = "Europe/Paris"
> timeZoneName America__Buenos_Aires = "America/Buenos_Aires"
> timeZoneName ... = ...
>
> timeZoneFromName :: Text -> Maybe TimeZone
> timeZoneFromName "Europe/Paris" = Just Europe__Paris
> timeZoneFromName "America/Buenos_Aires" = Just America__Buenos_Aires
> timeZoneFromName ... = ...
> timeZoneFromName _ = Nothing
>
> timeZoneInfo :: TimeZone -> TZ
> timeZoneInfo Europe__Paris = ... some hardcoded value (the `TZ` in
> your library) ...
> timeZoneInfo ... = ...
Interesting idea, I have to think about it a bit more, but right now I
don't see why not.
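
A minimal sketch of how such a generated module could be kept small: if
the type derives Enum and Bounded, only timeZoneName has to be generated
case by case, and the reverse lookup can be derived from it. This is an
illustration only, not part of tz; it uses String instead of Text to
stay within base, and it lists just the two zones from the example
above.

```haskell
-- Illustrative stand-in for the generated type; a real generator would
-- emit one constructor per zone found in the tzdata files.
data TimeZone
  = Europe__Paris
  | America__Buenos_Aires
  deriving (Eq, Show, Enum, Bounded)

-- The only function that needs one generated equation per zone.
timeZoneName :: TimeZone -> String
timeZoneName Europe__Paris         = "Europe/Paris"
timeZoneName America__Buenos_Aires = "America/Buenos_Aires"

-- Every constructor, obtained from the Enum/Bounded instances.
allTimeZones :: [TimeZone]
allTimeZones = [minBound .. maxBound]

-- Reverse lookup derived from timeZoneName, so the two cannot drift apart.
-- Linear search is fine for a sketch; a real module would likely use a Map.
timeZoneFromName :: String -> Maybe TimeZone
timeZoneFromName name = lookup name [(timeZoneName z, z) | z <- allTimeZones]
```

With this shape, timeZoneFromName "Europe/Paris" gives Just Europe__Paris
and any unknown name gives Nothing.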
> As a minor tidbit: If one is going to do this, I think a good idea is
> to devise a package versioning system that somewhat follows the tz
> database versions. This is what I had in mind for the `tzdata` package
> I was planning to create (unless someone else does it first):
>
> The version number of the package is always @YYYYMMDD.B.C@, where
> @YYYYMMDD@ are the year (@YYYY@), the month (@MM@) and the day (@DD@)
> of the @tzdata@ release this particular version was designed for. For
> example, @tzdata 2013i@ was officially released on @2013-12-17@, so a
> version of this package that was designed for @tzdata 2013i@ will
> carry the version number @20131217.B.C@. However, that doesn't mean
> that this library won't work with versions of the @tzdata@ library
> different than @2013i@, it's just that support for new or old data
> (say, the name of a new time zone that was introduced) can be missing.
> The @B@ and @C@ values in the version number of this library are
> treated as the /major/ and /minor/ versions respectively, as suggested
> in the /Haskell Package Version Policy/ page:
> http://www.haskell.org/haskellwiki/Package_versioning_policy
Hmm, I see the reasoning behind this, but I think it would be annoying
for most of the users. Or not?
If we split only the data out into a separate package (on which `tz`
depends), then very few people would want to explicitly specify its
version... Yeah, this might work. :)
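
To make the ordering implied by the YYYYMMDD.B.C scheme concrete, here is
a small sketch using Data.Version from base. The helper name
tzdataPackageVersion and the B and C values are made up for illustration;
only the 2013-12-17 release date of tzdata 2013i comes from the quoted
proposal.

```haskell
import Data.Version (Version, makeVersion, showVersion)

-- Hypothetical helper: build a YYYYMMDD.B.C version from the tzdata
-- release date (as a single number) plus major and minor components.
tzdataPackageVersion :: Int -> Int -> Int -> Version
tzdataPackageVersion releaseDate major minor =
  makeVersion [releaseDate, major, minor]

-- A package version designed for tzdata 2013i (released 2013-12-17).
-- The B and C values here are placeholders.
v2013i :: Version
v2013i = tzdataPackageVersion 20131217 0 0

-- Render the version the way it would appear in a .cabal file.
v2013iText :: String
v2013iText = showVersion v2013i
```

Since versions compare component by component, any package built for a
later tzdata release sorts above every package built for an earlier one,
whatever B and C are.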
Mihaly
>
> Maybe the `timezone-olson`, `timezone-series`, this `tzdata` idea and
> `tz` could all live together; with `tz` depending on the others? Just
> a thought :)
>
>
> Regards,
>
> Renzo Carbonara.
>
>
> On Mon, Mar 31, 2014 at 3:15 PM, Mihaly Barasz <klao at nilcons.com> wrote:
>>
>> I would like to propose reforming the 'time' [1] library.
>>
>>
>> Initially, I was just planning to announce my new 'tz' [2] library, but
>> realized that I have a more important agenda. Namely: Haskell needs a
>> better time library!
>>
>> Let me summarize what are - in my view - the biggest deficiencies of
>> 'time':
>>
>> 1. Inefficient data structures and implementations.
>>
>> 2. Ad-hoc API which is hard to remember and frustrating to work with.
>>
>> 3. Conceptually wrong representations and/or missing concepts.
>>
>> The wonderful thyme [3] package (by Liyang HU) improves a lot on #1 by
>> choosing better data structures and careful implementations and on #2
>> by lensifying the API.
>>
>> But, it was the #3 that caused me the most frustration lately; most
>> importantly the time zone handling.
>>
>> There is a TimeZone data type in 'time', but it is a misnomer: it
>> basically represents a fixed time difference (with a label and a DST
>> flag). 'time' basically adopts the broken approach from libc: you can
>> work with one time zone at a time, which is defined globally for your
>> program (via the TZ environment variable). So, the transformation
>> between UTCTime and LocalTime which should have been a pure
>> function can only be done in IO now. Like this:
>>
>>   do tz <- getTimeZone ut
>>      return $ utcToLocalTime tz ut
>>
>>
>> Oh, and just to hammer home point #1 from the list above: this
>> code runs in about 6100 ns on my machine. The drop-in replacement from
>> tz: utcToLocalTimeTZ [4] (which is actually pure) runs in 2300 ns.
>> While this is a significant improvement, it's easy to miss the point
>> that the bulk of the inefficiency comes from the data structures. In
>> my main project we represent times as Int64 (raw nanoseconds since
>> UNIX epoch; and similar representation for zoned times). And to
>> convert those to and from different time zones we need 40 ns. That's
>> right, a 150 _times_ improvement! (There are many other interesting
>> benchmark results that I could mention. An exciting bottom line: we
>> can actually beat the libc in many use-cases!)
>>
>> The 'tz' package is still very much in flux. I will try to solidify
>> the API soon, but until then it should be considered more of a proof
>> of concept. There is some missing functionality, for example. On the
>> other hand, there are the 'timezone-series' [5] and 'timezone-olson'
>> [6] packages that together provide about the same functionality as
>> 'tz' (minus the efficiency), and I'd like to explore if we could
>> remove some of the overlap. But all kinds of suggestions and requests
>> are welcome!
>>
>> More importantly, I'd like to hear the opinions of the community about
>> the general issue of a better time library! Do we need one? How
>> should we go about it? I think Haskell could potentially have
>> one of the best time libraries, but the current de-facto standard is
>> mediocre at best. Unfortunately, designing a good time library is very
>> far from trivial, as many existing examples demonstrate. And I
>> definitely don't know enough for it. (I understand time zone info
>> files, now that I wrote tz, but that's just a tiny fraction of what's
>> needed.) So, if you think you can contribute to the design (have
>> important use-cases in mind, know good examples of API, have some
>> experience working with dates and time, etc. etc.) - speak up!
>>
>>
>> Mihaly
>>
>>
>> Footnotes:
>>
>> [1] http://hackage.haskell.org/package/time
>> [2] http://hackage.haskell.org/package/tz
>> [3] http://hackage.haskell.org/package/thyme
>> [4] http://hackage.haskell.org/package/tz-0.0.0.1/docs/Data-Time-Zones.html#v:utcToLocalTimeTZ
>> [5] http://hackage.haskell.org/package/timezone-series
>> [6] http://hackage.haskell.org/package/timezone-olson