Anatomy of a repository

Author: Edward Z. Yang <ezyang@mit.edu>

Wizard is all about using Git’s excellent directed acyclic graph (DAG) model of history, both to perform file-system merges and to keep track of user changes on top of ours. If you are not familiar with the way Git represents commits internally, I highly recommend reading Git for Computer Scientists first.

Wizard takes a simplified view of upstream: from the point of view of the pristine branch pointer, history should be a straightforward progression of versions. Internal development history is discarded, and there is a one-to-one mapping between releases and commits.

digraph pristine_dag {
node [shape=square]
subgraph cluster_pristine {
    c -> b -> a
    a [label="1.0"]
    b [label="1.1"]
    c [label="2.0"]
    label = "pristine"
    color = white
}
}

From here, we build “scriptsified” versions of the application, which correspond to the master branch. Every time upstream releases an update, we import it into our pristine branch, and then merge the changes into master.

digraph master_dag {
node [shape=square]
subgraph cluster_master {
    cs -> bs -> as
    as [label="1.0-scripts"]
    bs [label="1.1-scripts"]
    cs [label="2.0-scripts"]
    label = "master"
    color = white
}
subgraph cluster_pristine {
    c -> b -> a
    a [label="1.0"]
    b [label="1.1"]
    c [label="2.0"]
    label = "pristine"
    color = white
}
as -> a
bs -> b
cs -> c
}
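The import-then-merge cycle shown in the graph can be modelled in a few lines of Python. This is only an illustrative sketch of the resulting commit graph, not Wizard's actual code: the `Commit` class and the `import_release` helper are hypothetical names introduced here, and no real Git repository is touched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """A node in the history graph; parents point backwards in time."""
    label: str
    parents: tuple = ()


def import_release(pristine_tip, master_tip, version):
    """Model one upstream release (hypothetical helper, not a Wizard API).

    Adds exactly one commit to pristine, then one commit to master whose
    parents are the previous master tip (if any) and the new pristine commit.
    """
    pristine_parents = (pristine_tip,) if pristine_tip else ()
    new_pristine = Commit(version, pristine_parents)

    master_parents = ((master_tip,) if master_tip else ()) + (new_pristine,)
    new_master = Commit(version + "-scripts", master_parents)
    return new_pristine, new_master


# Replay the three releases from the graph above.
pristine = master = None
for version in ["1.0", "1.1", "2.0"]:
    pristine, master = import_release(pristine, master, version)

print([p.label for p in master.parents])  # ['1.1-scripts', '2.0']
```

Each entry in the loop corresponds to one release, so the pristine chain keeps the one-to-one mapping between releases and commits, while every master commit records which pristine version it was built from.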

If there was an error in a deployed scripts version, you might see a structure like this:

digraph scripts2_dag {
node [shape=square]
subgraph cluster_master {
    cs -> bs2 -> bs -> as
    as [label="1.0-scripts"]
    bs [label="1.1-scripts",style=dashed]
    bs2 [label="1.1-scripts2"]
    cs [label="2.0-scripts"]
    label = "master"
    color = white
}
subgraph cluster_pristine {
    c -> b -> a
    a [label="1.0"]
    b [label="1.1"]
    c [label="2.0"]
    label = "pristine"
    color = white
}
as -> a
bs -> b
cs -> c
}

But such occasions should be rare. In this particular graph, 1.1-scripts was defective, and 1.1-scripts2 was the fixed version.

There is another layer to this graph, which is not visible from the repository: it contains the user’s commits and is unique for each user.

digraph user_dag {
node [shape=square]
subgraph cluster_user {
    node [shape=ellipse]
    u -> x -> y -> z
    u [style=filled,fillcolor=red,fontcolor=white,color=red]
    label = "master"
    color = white
}
subgraph cluster_master {
    bs -> as
    as [label="1.0-scripts"]
    bs [label="1.1-scripts"]
    color = white
}
subgraph cluster_pristine {
    b -> a
    a [label="1.0"]
    b [label="1.1"]
    label = "pristine"
    color = white
}
as -> a
bs -> b
x -> bs
z -> as
}

The red node u represents uncommitted changes that may exist in a user’s checkout at any given time. The untagged commits x, y and z each have a particular story: z is the commit generated when the install took place and the user’s specific configuration was versioned; y is the pre-upgrade commit, generated so that we could then perform a merge; and x is the resulting merge commit.
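To see why the graph structure matters for the upgrade, it helps to find the common ancestor that a three-way merge of y and 1.1-scripts would start from. The sketch below encodes the graph above as a plain dictionary and computes that ancestor by hand. It is an illustration of the general merge-base idea under that encoding, not a description of how Wizard or Git implements it; the function names are made up for this example.

```python
# Parent edges taken from the graph above (child -> list of parents).
PARENTS = {
    "x": ["y", "1.1-scripts"],
    "y": ["z"],
    "z": ["1.0-scripts"],
    "1.1-scripts": ["1.0-scripts", "1.1"],
    "1.0-scripts": ["1.0"],
    "1.1": ["1.0"],
    "1.0": [],
}


def ancestors(commit, parents):
    """Return the commit itself plus everything reachable through parent edges."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(a, b, parents):
    """Return the common ancestors of a and b that are not ancestors of
    another common ancestor (the 'best' common ancestors)."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other, parents) for other in common)
    }


print(merge_bases("y", "1.1-scripts", PARENTS))  # {'1.0-scripts'}
```

With 1.0-scripts as the base, the user's changes are the difference from 1.0-scripts to y, and our changes are the difference from 1.0-scripts to 1.1-scripts; the merge commit x combines the two.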

All user repositories are initialized with --shared, which means they take up almost no space at the very beginning, because they borrow objects from the canonical repository instead of copying them. However, this also makes it vitally important that the canonical repository in the scripts locker not lose revisions.
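A repository cloned with `git clone --shared` records the object stores it borrows from in `objects/info/alternates` inside its Git directory, one path per line. The following sketch, which assumes that standard layout, lists those stores and reports any that have gone missing. The function names are hypothetical and this is not part of Wizard; it is only meant to show where the dependency on the canonical repository lives on disk.

```python
from pathlib import Path


def borrowed_object_stores(git_dir):
    """List the object directories named in objects/info/alternates.

    Relative entries are resolved against the objects directory;
    absolute entries are returned unchanged.
    """
    objects_dir = Path(git_dir) / "objects"
    alternates = objects_dir / "info" / "alternates"
    if not alternates.exists():
        return []
    stores = []
    for line in alternates.read_text().splitlines():
        line = line.strip()
        if line:
            stores.append(objects_dir / line)
    return stores


def missing_object_stores(git_dir):
    """Return the borrowed object stores that no longer exist on disk."""
    return [store for store in borrowed_object_stores(git_dir) if not store.is_dir()]
```

Calling `missing_object_stores` on the `.git` directory of a user's checkout returns an empty list when every borrowed store is still present. A non-empty result means the checkout refers to objects it can no longer reach.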
