What Is Subversion?

Subversion is a free/open source version control system. That is, Subversion manages files and directories, and the changes made to them, over time. This allows you to recover older versions of your data or examine the history of how your data changed. In this regard, many people think of a version control system as a sort of “time machine.

Subversion的版本库可以通过网络访问,从而使用户可以在不同的电脑上进行操作。从某种程度上来说,允许用户在各自的空间里修改和管理同一组数据可以促进团队协作。因为修改不再是单线进行,开发速度会更快。此外,由于所有的工作都已版本化,也就不必担心由于错误的更改而影响软件质量—如果出现不正确的更改,只要撤销那一次更改操作即可。

Some version control systems are also software configuration management (SCM) systems. These systems are specifically tailored to manage trees of source code and have many features that are specific to software development—such as natively understanding programming languages, or supplying tools for building software. Subversion, however, is not one of these systems. It is a general system that can be used to manage any collection of files. For you, those files might be source code—for others, anything from grocery shopping lists to digital video mixdowns and beyond.

Is Subversion the Right Tool?

If you're a user or system administrator pondering the use of Subversion, the first question you should ask yourself is: "Is this the right tool for the job?" Subversion is a fantastic hammer, but be careful not to view every problem as a nail.

If you need to archive old versions of files and directories, possibly resurrect them, or examine logs of how they've changed over time, then Subversion is exactly the right tool for you. If you need to collaborate with people on documents (usually over a network) and keep track of who made which changes, then Subversion is also appropriate. This is why Subversion is so often used in software development environments— programming is an inherently social activity, and Subversion makes it easy to collaborate with other programmers. Of course, there's a cost to using Subversion as well: administrative overhead. You'll need to manage a data repository to store the information and all its history, and be diligent about backing it up. When working with the data on a daily basis, you won't be able to copy, move, rename, or delete files the way you usually do. Instead, you'll have to do all of those things through Subversion.

Assuming you're fine with the extra workflow, you should still make sure you're not using Subversion to solve a problem that other tools solve better. For example, because Subversion replicates data to all the collaborators involved, a common misuse is to treat it as a generic distribution system. People will sometimes use Subversion to distribute huge collections of photos, digital music, or software packages. The problem is that this sort of data usually isn't changing at all. The collection itself grows over time, but the individual files within the collection aren't being changed. In this case, using Subversion is “overkill.[2] There are simpler tools that efficiently replicate data without the overhead of tracking changes, such as rsync or unison.

Subversion的历史

In early 2000, CollabNet, Inc. (http://www.collab.net) began seeking developers to write a replacement for CVS. CollabNet offers a collaboration software suite called CollabNet Enterprise Edition (CEE), of which one component is version control. Although CEE used CVS as its initial version control system, CVS's limitations were obvious from the beginning, and CollabNet knew it would eventually have to find something better. Unfortunately, CVS had become the de facto standard in the open source world largely because there wasn't anything better, at least not under a free license. So CollabNet determined to write a new version control system from scratch, retaining the basic ideas of CVS, but without the bugs and misfeatures.

In February 2000, they contacted Karl Fogel, the author of Open Source Development with CVS (Coriolis, 1999), and asked if he'd like to work on this new project. Coincidentally, at the time Karl was already discussing a design for a new version control system with his friend Jim Blandy. In 1995, the two had started Cyclic Software, a company providing CVS support contracts, and although they later sold the business, they still used CVS every day at their jobs. Their frustration with CVS had led Jim to think carefully about better ways to manage versioned data, and he'd already come up with not only the name “Subversion,” but also with the basic design of the Subversion data store. When CollabNet called, Karl immediately agreed to work on the project, and Jim got his employer, Red Hat Software, to essentially donate him to the project for an indefinite period of time. CollabNet hired Karl and Ben Collins-Sussman, and detailed design work began in May 2000. With the help of some well-placed prods from Brian Behlendorf and Jason Robbins of CollabNet, and from Greg Stein (at the time an independent developer active in the WebDAV/DeltaV specification process), Subversion quickly attracted a community of active developers. It turned out that many people had encountered the same frustrating experiences with CVS and welcomed the chance to finally do something about it.

The original design team settled on some simple goals. They didn't want to break new ground in version control methodology, they just wanted to fix CVS. They decided that Subversion would match CVS's features and preserve the same development model, but not duplicate CVS's most obvious flaws. And although it did not need to be a drop-in replacement for CVS, it should be similar enough that any CVS user could make the switch with little effort.

After 14 months of coding, Subversion became “self-hosting” on August 31, 2001. That is, Subversion developers stopped using CVS to manage Subversion's own source code and started using Subversion instead.

While CollabNet started the project, and still funds a large chunk of the work (it pays the salaries of a few full-time Subversion developers), Subversion is run like most open source projects, governed by a loose, transparent set of rules that encourage meritocracy. CollabNet's copyright license is fully compliant with the Debian Free Software Guidelines. In other words, anyone is free to download, modify, and redistribute Subversion as he pleases; no permission from CollabNet or anyone else is required.

Subversion的特性

在讲解Subversion为版本控制领域带来的特性时,我们会经常通过Subversion对CVS的改进进行说明。如果不熟悉CVS,了解所有Subversion的特性会有一定的困难。而如果根本就不熟悉版本控制,你就只有干瞪眼的份儿了。因此,最好首先阅读一下第 1 章 基本概念,这一章简单介绍了一些版本控制的基本思想和概念。

Subversion支持:

版本化的目录

CVS tracks only the history of individual files, but Subversion implements a “virtual” versioned filesystem that tracks changes to whole directory trees over time. Files and directories are versioned.

真实的版本历史

由于只能跟踪单个文件的变更,CVS无法支持如文件拷贝和改名这些常见的操作—这些操作改变了目录的内容。同样,在CVS中,一个目录下的文件只要名字相同即拥有相同的历史,即使这些同名文件在历史上毫无关系。而在Subversion中,可以对文件或目录进行增加、拷贝和改名操作,也解决了同名而无关的文件之间的历史联系问题。

原子提交

A collection of modifications either goes into the repository completely or not at all. This allows developers to construct and commit changes as logical chunks and prevents problems that can occur when only a portion of a set of changes is successfully sent to the repository.

版本化的元数据

每一个文件和目录都有自己的一组属性—键和它们的值。可以根据需要建立并存储任何键/值对。和文件本身的内容一样,属性也在版本控制之下。

可选的网络层

Subversion has an abstracted notion of repository access, making it easy for people to implement new network mechanisms. Subversion can plug into the Apache HTTP Server as an extension module. This gives Subversion a big advantage in stability and interoperability, and instant access to existing features provided by that server—authentication, authorization, wire compression, and so on. A more lightweight, standalone Subversion server process is also available. This server speaks a custom protocol that can be easily tunneled via SSH.

一致的数据操作

Subversion用一个二进制差异算法描述文件的变化,对于文本(可读)和二进制(不可读)文件其操作方式是一致的。这两种类型的文件压缩存储在版本库中,而差异信息则在网络上双向传递。

高效的分支和标签操作

The cost of branching and tagging need not be proportional to the project size. Subversion creates branches and tags by simply copying the project, using a mechanism similar to a hard link. Thus these operations take only a very small, constant amount of time.

可修改性

Subversion没有历史负担,它以一系列优质的共享C程序库的方式实现,具有定义良好的API。这使得Subversion非常容易维护,和其它语言的互操作性很强。

Subversion的架构

图 1 “Subversion's architecture”给出了Subversion设计总体上的“俯视图”。

图 1. Subversion's architecture


图中的一端是保存所有版本数据的Subversion版本库,另一端是Subvesion的客户程序,管理着所有版本数据的本地影射(称为“工作拷贝”),在这两极之间是各种各样的版本库访问(RA)层,某些使用电脑网络通过网络服务器访问版本库,某些则绕过网络服务器直接访问版本库。

Subversion的组件

安装好的Subversion由几个部分组成,下面将简单的介绍一下这些组件。下文的描述或许过于简略,不易理解,但不用担心—本书后面的章节中会用更多的内容来详细阐述这些组件。

svn

命令行客户端程序。

svnversion

此工具用来显示工作拷贝的状态(用术语来说,就是当前项目的修订版本)。

svnlook

直接查看Subversion版本库的工具。

svnadmin

A tool for creating, tweaking, or repairing a Subversion repository.

svndumpfilter

过滤Subversion版本库转储数据流的工具。

mod_dav_svn

Apache HTTP服务器的一个插件,使版本库可以通过网络访问。

svnserve

一个单独运行的服务器程序,可以作为守护进程或由SSH调用。这是另一种使版本库可以通过网络访问的方式。

svnsync

一个通过网络增量镜像版本库的程序。

What's New in Subversion

The first edition of this book was released in 2004, shortly after Subversion had reached 1.0. Over the following four years Subversion released five major new versions, fixing bugs and adding major new features. While we've managed to keep the online version of this book up to date, we're thrilled that the second edition from O'Reilly now covers Subversion up through release 1.5, a major milestone for the project. Here's a quick summary of major new changes since Subversion 1.0. Note that this is not a complete list; for full details, please visit Subversion's web site at http://subversion.tigris.org.

Subversion 1.1 (September 2004)

Release 1.1 introduced FSFS, a flat-file repository storage option for the repository. While the BerkeleyDB back-end is still widely used and supported, FSFS has since become the “default” choice for newly-created repositories due to its low barrier to entry and minimal maintenance requirements. Also in this release came the ability to put symbolic links under version control, auto-escaping of URLs, and a localized user interface.

Subversion 1.2 (May 2005)

Release 1.2 introduced the ability to create server-side “locks” on files, thus serializing commit access to certain resources. While Subversion is still a fundamentally concurrent version control system, certain types of binary files (art assets, for example) cannot be merged together. The locking feature fulfills the need to version and protect such resources. With locking also came a complete “auto-versioning” implementation, allowing Subversion repositories to be mounted as network folders. Finally, Subversion 1.2 began using a new, faster binary-differencing algorithm to compress and retrieve old versions of files.

Subversion 1.3 (December 2005)

Release 1.3 brought path-based authorization controls to the svnserve server, matching a feature formerly found only in the Apache server. The Apache server, however, gained some new logging features of its own, and Subversion's API bindings to other languages also made great leaps forward.

Subversion 1.4 (September 2006)

Release 1.4 introduced a whole new tool—svnsync— for doing one-way repository replication over a network. Major parts of the working copy metadata were revamped to no longer use XML (resulting in client side speed gains), while the BerkeleyDB repository back-end gained the ability to automatically “recover” itself after a server crash.

Subversion 1.5 (June 2008)

Release 1.5 took much longer to finish than prior releases, but the headliner feature was gigantic: semi-automated tracking of branching and merging. This was a huge boon for users, and pushed Subversion far beyond the abilities of CVS and into the ranks of commercial competitors such as Perforce and Clearcase. Subversion 1.5 also introduced a bevy of other user-focused features, such as interactive resolution of file conflicts, partial checkouts, client side management of changelists, powerful new syntax for the svn:externals feature, and SASL authentication support for the svnserve server.



[2] Or as a friend puts it, “swatting a fly with a Buick.