This post originated from an RSS feed registered with Ruby Buzz
by Christian Neukirchen.
Original Post: My DVCS wishlist
Feed Title: chris blogs: Ruby stuff
Feed URL: http://chneukirchen.org/blog/category/ruby.atom
Feed Description: a weblog by christian neukirchen - Ruby stuff
After last week’s intermezzo with Git, my
curiosity about distributed version control systems (DVCS) has been
rekindled. I also imported the Ruby CVS history into
Monotone, which has a pretty fast
CVS importer, and Mercurial,
whose CVS importer seemed to be even faster
(cvs20hg),
but unfortunately is not complete yet. However, Mercurial can also
import from Git, so I went that way.
My projects will continue to be kept in
Darcs for the near future, but so far no DVCS has really
convinced me. Wondering about what each one lacked, I thought it
would be useful to write up what I want to have. So far, I tried:
Darcs,
Git/Cogito,
Mercurial and
Monotone. I also dabbled in
Bazaar (seems to be discontinued),
Bazaar-NG,
FastCST (seems to be
discontinued) and SVK (IMO just a hack).
So, here is my wishlist (roughly in order of decreasing importance):
Prefer file storage over patch storage; it’s just easier to deal
with in practice. It took me a long time to figure this out, but I
actually think it’s the more pragmatic solution. I noticed this
when I saw how the Git repository just merged with the Gitk
repository, even though the two didn’t share a single revision. Darcs, on
the other hand, even had problems doing merges which were factually
the same, but just couldn’t be arranged the right way. The theory
of patches sounds nice, but it doesn’t work out.
Note that this doesn’t exclude diff storage, which of course should
be used to save disk space and bandwidth.
Provided by: Bazaar-NG (I think), Git/Cogito, Mercurial, Monotone.
Revisions need to be identified by a globally unique identifier, e.g.
a SHA1-hash or a GUID.
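As a rough illustration of the hash variant, a revision identifier can be derived from the revision's own content, so that two people who record the same change on top of the same parents arrive at the same name without asking a central server. The sketch below is only an assumption about how such an ID could be built; the function name `revision_id` and the byte layout are hypothetical and do not match the on-disk format of any of the systems mentioned.

```python
import hashlib


def revision_id(parent_ids, tree_bytes, message):
    """Derive a 40-character hex ID from a revision's content.

    Hypothetical layout: one line per parent ID, one line for the
    commit message, a blank line, then the raw snapshot bytes.
    """
    h = hashlib.sha1()
    for parent in parent_ids:
        h.update(b"parent " + parent.encode("ascii") + b"\n")
    h.update(b"message " + message.encode("utf-8") + b"\n")
    h.update(b"\n")
    h.update(tree_bytes)
    return h.hexdigest()


root = revision_id([], b"hello\n", "initial import")
child = revision_id([root], b"hello world\n", "extend greeting")
```

Because the parent IDs are part of the hashed input, the ID of `child` also pins down its whole history, which is what makes the identifier safe to use across repositories.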
Revision storage should be implemented as write-once files. Once a
file has been written, it should not be touched afterwards. This
eases incremental backup and generally improves safety.
Alternatively, if files are append-only, this is acceptable too.
Changing files leaves a bad taste. (It’s okay for index files and
other unessential information.)
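A minimal sketch of what write-once storage could look like, assuming objects are named by the SHA-1 of their content and kept as individual files in one directory. The function name `store_object` and the flat directory layout are assumptions made for this example, not a description of an existing tool.

```python
import hashlib
import os


def store_object(objects_dir, data):
    """Store data under its SHA-1 name; never touch an existing file."""
    name = hashlib.sha1(data).hexdigest()
    path = os.path.join(objects_dir, name)
    os.makedirs(objects_dir, exist_ok=True)
    try:
        # O_EXCL makes the open fail if the file already exists,
        # so an object that was written once is never rewritten.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o444)
    except FileExistsError:
        # Same name means same content, so there is nothing to do.
        return name
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return name
```

Since existing files keep their modification time, an incremental backup only has to copy the files that are new since the last run. A more careful version would write to a temporary file and rename it into place, so that a crash cannot leave a half-written object behind.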
File permissions must be saved, at least the executable bit. Also,
the VCS shouldn’t touch the contents of the files at all (no
newline conversion, no keywords by default).
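To show how little is needed for the executable bit, here is a small sketch for a POSIX system. The helper names `is_executable` and `restore_executable` are made up for this example; a real system would record the flag in its own metadata at commit time and apply it again on checkout.

```python
import os
import stat


def is_executable(path):
    """Return True if the owner-executable bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def restore_executable(path, executable):
    """Set or clear the execute bits, leaving all other bits alone."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if executable:
        # Give execute permission to everyone who may read the file.
        mode |= (mode & 0o444) >> 2
    else:
        mode &= ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    os.chmod(path, mode)
```

The file contents themselves would be read and written in binary mode (`open(path, "rb")`), so that no newline conversion can sneak in.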
Easy setup of repositories: Setting up a new repository needs to be
possible with a single command, usually xxx init, which
turns the current directory into a fresh repository (or even imports
the files of the current directory, as Cogito does).
Support multiple heads of development in a single repository.
This encourages microbranching and eases incremental development without
keeping loads of working directories around.
Provided by: Git/Cogito, Monotone.
It has to be possible to export patches with full metadata (e.g. renames)
as ASCII files, e.g. to send via mail or share in other ways. It
needs to support binary files, too. (Think of contributing graphics
to a game.)
Provided by: Bazaar-NG, Darcs (very good), Git/Cogito (no binary,
renames partly), Mercurial (bundles, but they are not ASCII, renames
partly), Monotone (packets, good).
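As an illustration of the requirement (ASCII-only, carries renames, survives binary content), here is a sketch of a made-up exchange format. It is an assumption for this example only: it uses JSON with Base64-encoded file contents and stores whole files instead of diffs, which none of the systems above does in this form. The names `export_patch` and `import_patch` are hypothetical.

```python
import base64
import json


def export_patch(renames, files):
    """Serialize a change as pure ASCII text.

    renames: list of (old_path, new_path) pairs
    files:   dict mapping path -> new content as bytes
    """
    doc = {
        "renames": [list(pair) for pair in renames],
        "files": {
            path: base64.b64encode(data).decode("ascii")
            for path, data in files.items()
        },
    }
    # ensure_ascii escapes any non-ASCII path names.
    return json.dumps(doc, ensure_ascii=True, indent=1, sort_keys=True)


def import_patch(text):
    """Inverse of export_patch: return (renames, files)."""
    doc = json.loads(text)
    renames = [tuple(pair) for pair in doc["renames"]]
    files = {p: base64.b64decode(b) for p, b in doc["files"].items()}
    return renames, files
```

The output can be pasted into a mail body, and a contributed graphic comes out byte for byte identical on the other side.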
It needs to be possible to contribute patches via mail.
This is the way most non-regular committers send patches.
Serving repositories over dumb HTTP: This is essential to let
people easily set up repositories on their cheap webspace.
Systems that require CGIs would be acceptable here too (Mercurial
without old-http); opening new ports isn’t. It doesn’t need to be
the most efficient way of accessing a repository, but it must not be unreasonably
inefficient.
It needs to provide a GUI repository viewer that can show change history
as a tree and diffs for each revision. I’ve found such a tool
indispensable since I discovered Gitk, especially if you
microbranch a lot.
It needs a good and fast tool to import CVS trees. I’ve found
this absolutely necessary to convert legacy repositories and capture
the history of older projects locally.
If you find any mistakes or misattributions, please post a comment and
I’ll correct them.
Writing a good DVCS is not that hard in theory, but very hard in
practice, and not only for technical reasons. Implementing a DVCS is a
community effort; I’d even say it’s pointless today to start yet
another VCS unless you are a celebrity who already has a big
community behind you (cf. Git).
NP: The Smiths—You Just Haven’t Earned It Yet Baby