[Haskell-cafe] Downloading Haskell repos from GitHub

Gwern Branwen gwern0 at gmail.com
Fri Apr 30 11:38:49 EDT 2010


Along the lines of
http://blog.patch-tag.com/2010/03/13/mirroring-patch-tag/ for
downloading all patch-tag.com repositories, I've begun to wonder how
to download all the Haskell repositories on GitHub, since more and
more people seem to be using it.

Nothing in http://develop.github.com/ seems especially useful for
grabbing the git:// URLs of all repos by language; it only seems to
list them by user.

The only real list of repos by language seems to be available at
http://github.com/languages/Haskell/updated or
http://github.com/languages/Haskell/created . (You might think
http://github.com/languages/Haskell would be good, but no, it's just a
few random repos by interest and not a full listing.)

I looked at the HTML, and it looks possible to use tagsoup to fetch all
98 pages, parse the entries to get the HTTP URLs of the repos, and
then turn *those* into git:// URLs suitable for shelling out to
'git clone', but I can't help but wonder if maybe there's a better
approach someone more familiar with GitHub would know.
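
For concreteness, here is a rough sketch of the last two steps (listing
the pages to fetch, and turning repo URLs into git:// URLs), using only
base. It is untested against the live site and rests on assumptions:
the "?page=N" query parameter is a guess at how the listing paginates,
the "git://github.com/user/repo.git" form is assumed to be the public
clone URL, and the names listingPages, toGitUrl, splitOn and
cloneCommands are made up for illustration. The tagsoup step that
pulls the repo links out of each page is left out, since it depends on
the page markup.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- | URLs of the first n listing pages.
-- ASSUMPTION: the listing paginates via a "?page=N" query parameter.
listingPages :: Int -> [String]
listingPages n =
  [ "http://github.com/languages/Haskell/updated?page=" ++ show i
  | i <- [1 .. n] ]

-- | Split a string on a separator character, keeping empty chunks.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn c rest

-- | Turn "http://github.com/user/repo" into
-- "git://github.com/user/repo.git" (ASSUMED clone URL form).
-- Anything that is not exactly a user/repo URL gives Nothing.
toGitUrl :: String -> Maybe String
toGitUrl url = do
  path <- stripPrefix "http://github.com/" url
  case splitOn '/' path of
    [user, repo]
      | not (null user), not (null repo) ->
          Just ("git://github.com/" ++ user ++ "/" ++ repo ++ ".git")
    _ -> Nothing

-- | One 'git clone' command line per recognisable repo URL;
-- URLs that do not look like repos are silently dropped.
cloneCommands :: [String] -> [String]
cloneCommands = map ("git clone " ++) . mapMaybe toGitUrl
```

Loaded into ghci, cloneCommands can be fed whatever list of HTTP URLs
the scraping step produces; links that are not of the user/repo shape
(such as the language listing pages themselves) are filtered out
rather than passed to git.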

-- 
gwern

