Thursday, July 31, 2008

git clones of Apache codebases

Over the past few months we've been discussing, on the infrastructure-dev mailing list, various ways of extending or improving the version control functionality available to Apache projects. One of the main themes of the discussion has been making it easier to access Apache codebases using git or other distributed SCM tools.

The recently announced svn.eu.apache.org mirror supports git-svn when accessed as an authenticated user over https. Unfortunately, that access is limited to Apache committers, and git-svn can be notoriously slow when making initial clones of complex svn codebases.
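
For committers, a first git-svn clone against the mirror might look roughly like the sketch below. The helper names (asf_svn_url, clone_with_git_svn) and the example project path are only illustrations, and --stdlayout assumes the project follows the conventional trunk/branches/tags layout, which not every codebase does.

```shell
#!/bin/sh
# Base URL of the svn mirror, accessed over https.
ASF_SVN_BASE="https://svn.eu.apache.org/repos/asf"

# Hypothetical helper: build the svn URL for a project path such as "xalan/java".
asf_svn_url() {
    printf '%s/%s\n' "$ASF_SVN_BASE" "$1"
}

# Hypothetical helper: clone a project with git-svn into a target directory.
# --stdlayout assumes trunk/, branches/ and tags/ directories (an assumption).
# git-svn asks for committer credentials when the server requires them.
clone_with_git_svn() {
    project=$1
    target=$2
    git svn clone --stdlayout "$(asf_svn_url "$project")" "$target"
}

# Example (can take a long time on codebases with a long history):
#   clone_with_git_svn xalan/java xalan-java
```

Nothing runs until clone_with_git_svn is called, so the URL helper can be checked on its own before starting a long clone.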

To work around these issues I set up a collection of git mirrors of selected Apache codebases on my server. You can find these unofficial mirrors at http://jukka.zitting.name/git/. The mirrors are automatically updated daily.

The mirrors work pretty much like normal git repositories in that you don't need git-svn or any other svn tools to work with them. The only significant difference from normal git repositories is that svn tags are mapped to branches named "tags/..." in the mirrors, because git and svn handle tags differently. Also, settings like svn:ignore, svn:eol-style, etc. are not replicated in these git mirrors.
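
If you would rather have real git tags in your own clone, the "tags/..." branches can be turned into local tags. The following is a minimal sketch, not part of the mirror setup itself: the helper names (tag_name_for_branch, import_svn_tags) are made up for illustration, and it assumes the mirror was cloned with the default remote name "origin".

```shell
#!/bin/sh
# Hypothetical helper: map a remote-tracking branch such as "origin/tags/1.0"
# to the tag name "1.0".
tag_name_for_branch() {
    printf '%s\n' "${1#*/tags/}"
}

# Hypothetical helper: create a local lightweight tag for every "tags/..."
# branch of the given remote (default: origin). Run it inside a clone.
import_svn_tags() {
    remote=${1:-origin}
    git for-each-ref --format='%(refname:short)' "refs/remotes/$remote/tags" |
    while read -r branch
    do
        tag=$(tag_name_for_branch "$branch")
        # Leave tags that already exist untouched.
        git rev-parse -q --verify "refs/tags/$tag" > /dev/null ||
        git tag "$tag" "$branch"
    done
}

# Example, inside a clone of one of the mirrors:
#   import_svn_tags origin
```

Each tag created this way simply points at the tip commit of the corresponding "tags/..." branch, so the tags stay local to your clone and do not affect the mirror.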

Let me know if you're interested in seeing other Apache codebases mirrored. I'm also interested in other feedback or ideas related to these git mirrors.

27 comments:

  1. Someone created a clone of your repo at repo.or.cz:

    http://repo.or.cz/w/stdcxx.git

    It can be cloned over HTTP (which works better through firewalls).

  2. Good point about HTTP. I've also enabled it on my server.

  3. Thank you so much. I kept getting bitten by trying to clone things from svn.apache.org. This is enormously useful to me so I can keep track of what's going on in various projects we use.

  4. How cool is that?!? Could you add a git clone of the logging projects (log4j, log4cxx, etc.) too?

  5. Done. You'll now find log4j, log4cxx, log4net, chainsaw, and commons-logging included.

  6. Thanks a lot for this!

    How exactly do you have git-svn configured to export the normally remote branches? Are you using a post-update script to update the remote refs to be local?

  7. I use the following script to copy the remote git-svn branches to local branches:

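    # Point master at the git-svn trunk.
    git update-ref refs/heads/master refs/remotes/trunk
    # Copy every other remote ref to a local branch of the same name,
    # skipping refs whose names contain @.
    git for-each-ref refs/remotes | cut -d / -f 3- | grep -v -x trunk |
    grep -v @ | while read -r ref
    do
        git update-ref "refs/heads/$ref" "refs/remotes/$ref"
    done
    # Delete local branches whose remote ref no longer exists.
    git for-each-ref refs/heads | cut -d / -f 3- | grep -v -x master |
    while read -r ref
    do
        git rev-parse "refs/remotes/$ref" > /dev/null 2>&1 ||
        git update-ref -d "refs/heads/$ref" "refs/heads/$ref"
    done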

  8. Would you be so kind as to add Xalan-Java (svn.eu.apache.org/repos/asf/xalan/java)?

    Regards and thanks in advance

  9. One mirror of Xalan Java coming up... Have fun!

  10. Could you add cxf?

  11. Sure, the CXF mirror is now being created.

  12. Strange...why does it show the last change was 5 months ago?

  13. The initial sync is still running. Once it's done you'll also see the latest commits.

  14. Pluto is now mirrored (and the CXF mirror is complete). Have fun!

  15. Woah that was quick! Now I gotta go throw out my svn checkout and clone - cheers!

    Btw - got any links to the discussion about Apache adopting Git?

  16. The archives of the infrastructure-dev list where most of the recent git discussion at Apache has taken place can be browsed at http://dir.gmane.org/gmane.comp.apache.infrastructure.devel

  17. Can you set up an 'email notification' option for the comments on your posts?

  18. > Can you set up an 'email notification' option for the comments on your posts?

    I don't know how to do that (or if the feature is even available) on wordpress.com.

    You can subscribe to the feed at http://jukkaz.wordpress.com/2008/07/31/git-clones-of-apache-codebases/feed/ to keep up with comments on just this post.

  19. Hi Jukka!

    It was nice meeting you at ApacheCon last week. Any way you can add Mahout to the git repo?

    Thanks!

  20. Thanks for the reminder! Apache Mahout is now mirrored.

  21. Sweet! Thanks again!

  22. Hi Jukka,

    Can you add OpenJPA to the list please?

    Also, do you have a link to anything about best practices for committers using git + Apache's svn repo?

    Thanks!

  23. The OpenJPA mirror is now coming up. The initial sync will probably take a while (hours or even days). I'll ping you when it's ready.

    I've been planning to write up a summary of the git@Apache talk I gave a month ago at the BarCamp Apache, but I haven't yet gotten it done. :-( I'll try to do that in the next few days.

    If you haven't already seen them, you should check out the archives of the infrastructure-dev@ list for some good discussion on the various things people have been doing with git.

  24. Sorry. Subscribing to comments...
