Trac to Phabricator (Maniphest) migration prototype

Matthew Pickering matthewtpickering at gmail.com
Wed Dec 21 10:12:56 UTC 2016


Dear devs,

I have completed writing a migration which moves tickets from trac to
phabricator. The conversion is essentially lossless. The trac
transaction history is replayed which means all events are transferred
with their original authors and timestamps. I welcome comments on the
work I have done so far, especially bugs as I have definitely not
looked at all 12000 tickets.

http://ec2-52-213-249-242.eu-west-1.compute.amazonaws.com

All the user accounts are automatically generated. If you want to see
the tracker from your perspective then send me an email or ping me on
IRC and I can set the password of the relevant account.

NOTE: This is not a decision. The prototype exists to show that the
migration is feasible in a satisfactory way and to remove
hypothetical arguments from the discussion.

I must also thank Dan Palmer and Herbert who helped me along the way.
Dan was responsible for the first implementation and setting up much
of the infrastructure at the Haskell Exchange hackathon in October. We
extensively used the API bindings which Herbert had been working on.

Further information below!

Matt

=====================================================================

Reasons
=======

Why this change? The main argument is consolidation. Having many
different services is confusing for new and old contributors.
Phabricator has proved effective as a code review tool. It is modern
and actively developed with a powerful feature set which we currently
only use a small fraction of.

Trac is showing signs of its age. It is old and slow, and users
regularly lose comments by accidentally refreshing their browser.
Further to this, the integration with other services is quite poor.
Commits do not close tickets which mention them and the only link to
commits is a comment. Querying the tickets is also quite difficult; I
usually resort to using Google search or my emails to find the
relevant ticket.


Why is Phabricator better?
==========================

Through learning more about Phabricator, there are many small things
that I think it does better which will improve the usability of the
issue tracker. I will list a few but I urge you to try it out.

* Commits which mention ticket numbers are currently posted as trac
comments. There is better integration in phabricator as linking to
commits has first-class support.
* Links with differentials are also more direct than the current
custom field which means you must update two places when posting a
differential.
* Fields are verified so that misspelling user names is not possible
(see #12623 where Ben misspelled his name for example)
* This is also true for projects and other fields. Inspecting these
fields on trac you will find that the formatting on each ticket is
often quite different.
* Keywords are much more useful as the set of used keywords is discoverable.
* Related tickets are much more substantial as the status of related
tickets is reflected on the parent ticket.
(http://ec2-52-213-249-242.eu-west-1.compute.amazonaws.com/T7724)

Implementation
==============

Keywords are implemented as projects. A project is a combination of a
tag which can be used with any Phabricator object, a workboard to
organise tasks and a group of people who care about the topic. Not all
keywords are migrated. Only keywords with at least 5 tickets were
added to avoid lots of useless projects. The state of keywords is
still a bit unsatisfactory but I wanted to take this chance to clean
them up.
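
As a rough illustration of the threshold described above, here is a
minimal Python sketch. It is not the code used in the migration. The
function name, the input shape (ticket number mapped to the raw
keywords string), the comma/whitespace splitting and the
case-insensitive comparison are all assumptions made for the example.

```python
from collections import Counter

MIN_TICKETS = 5  # threshold mentioned in the text


def keywords_to_migrate(tickets, min_tickets=MIN_TICKETS):
    """Return the keywords used on at least `min_tickets` tickets.

    `tickets` maps a ticket number to its raw keywords string.
    Assumption: keywords are separated by commas and/or whitespace
    and are compared case-insensitively.
    """
    counts = Counter()
    for raw in tickets.values():
        # Count each keyword at most once per ticket.
        counts.update({kw.lower() for kw in raw.replace(",", " ").split()})
    return {kw for kw, n in counts.items() if n >= min_tickets}
```

Only the keywords returned by such a filter would become projects.
The rest would be dropped or cleaned up by hand.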

Custom fields such as architecture and OS are replaced by *projects*
just like keywords. This has the same advantage as other projects.
Users can be subscribed to projects and receive emails when new
tickets are tagged with a project. The large majority of tickets have
very little additional metadata set. I also implemented these as
custom fields but found the result to be less satisfactory.

Some users who have Trac accounts do not have phab accounts.
Fortunately, it is easy to create new accounts for these users. The
accounts have empty passwords and can be recovered through the
appropriate email address. This means tickets can be properly
attributed in the migration.

The ticket numbers are maintained. I still advocate moving the
infrastructure tickets in order to maintain this mapping, especially
as there has been little activity in the last year.

Tickets are linked to the relevant commits, differentials and other
tickets. There are 3000 dummy differentials which are used to test
that the linking works correctly. Of course with real data, the proper
differential would be linked.
(http://ec2-52-213-249-242.eu-west-1.compute.amazonaws.com/T11044)

There are currently a couple of issues with the migration. A few are
in the parser which converts Trac markup to Remarkup. Most comments
are very simple, with just paragraphs and code blocks, but complex
items like lists are sometimes parsed incorrectly. Definition lists
are converted to tables as there is no equivalent in Remarkup. Trac
ticket links are converted to phab ticket links.
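
To make the last point concrete, here is a small Python sketch of the
ticket link rewrite. It is an illustration only, not the parser used
in the prototype. It assumes that ticket numbers are preserved, so
that Trac's "#123" and "ticket:123" forms map directly to "T123". The
function name is made up, and the sketch ignores code blocks and
Trac's "!" escape.

```python
import re

# Matches Trac's "#123" and "ticket:123" shorthand ticket links.
# The lookbehind skips things like "foo#12" or the HTML entity "&#123;".
_TICKET_LINK = re.compile(r"(?<![\w#&])(?:#|ticket:)(\d+)\b")


def convert_ticket_links(text):
    """Rewrite Trac ticket links to Phabricator task references."""
    return _TICKET_LINK.sub(r"T\1", text)


print(convert_ticket_links("see #12623 and ticket:7724."))
# prints "see T12623 and T7724."
```

A real converter would have to apply this only outside code blocks,
which is one reason a regex alone is not enough.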

The ideal time to migrate is before the end of January. The busiest
time for the issue tracker is before and after a new major release.
With 8.2 planned for around April this gives the transition a few
months to settle. We can close the trac issue tracker and continue to
serve it or preferably redirect users to the new ticket. I don't plan
to migrate the wiki at this stage as I do not feel that the parser is
robust enough although there are now few other technical challenges
blocking this direction.

