TODO

The Git Autoinstaller

TODO NOW:

- Make it faster
    - Certain classes of error will continually fail, so they should be
      put in a separate "seen" file that also skips them, unless
      we have some sort of gentle force
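
A minimal sketch of the two-file skip logic described above.  Everything
named here is hypothetical and not part of Wizard: `should_skip`,
`load_locations`, the `seen` and `failed` sets, and the `force` flag
standing in for the "gentle force" option.

```python
def should_skip(location, seen, failed, force=False):
    """Decide whether a mass run should skip an install.

    seen   -- locations that were already processed successfully
    failed -- locations that hit an error class expected to fail again
    force  -- hypothetical "gentle force" flag: retry known failures,
              but still skip installs that already succeeded
    """
    if location in seen:
        return True
    if location in failed and not force:
        return True
    return False


def load_locations(path):
    """Read one install location per line; a missing file means an empty set."""
    try:
        with open(path) as f:
            return {line.strip() for line in f if line.strip()}
    except FileNotFoundError:
        return set()
```

Keeping the persistent failures in their own file means a forced run can
retry them without re-running installs that already succeeded.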

- Keep my sanity when upgrading 1000 installs
    - Distinguish between errors(?)
    - Custom merge algo: convert absolute php.ini symlinks to relative
      symlinks (this does not seem to have been a problem in practice)
    - Custom merge algo: check if the file has extra \r's in it,
      and dos2unix it if it does, before performing the merge
    - Use `vos exa` to check what a person's quota is.  We can
      figure out roughly how big the upgrade is going to be by
      doing a size comparison of the tars: `git pull` MUST NOT
      fail, otherwise things are left conflicted and not easy to fix.
    - Prune -7 call errors and automatically reprocess them (with a
      strike-out counter of 3); this requires better error parsing
    - Snap-in conflict resolution teaching:
        1. View the merge conflicts after doing a short run
        2. Identify common merge conflicts
        3. Copy and paste the conflict markers to the application.  Scrub
           user-specific data; this may mean removing the entire
           upper bit which is the user-version.
        4. Specify which section to keep.  /Usually/ this means
           punting the new change, but if the top was specified
           it means we get a little more flexibility.  Try to
           minimize wildcarding: those things need to be put into
           subpatterns and then reconstituted into the output.
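
As a rough illustration of step 4 above, the sketch below resolves every
conflict hunk in a file by keeping one side.  It covers only the
"specify which section to keep" part; pattern matching and subpatterns
are left out.  The function name `resolve_conflicts` and the
"upper"/"lower" vocabulary are assumptions made for this example.

```python
def resolve_conflicts(text, keep):
    """Resolve all conflict hunks in text by keeping one section.

    keep="upper" keeps the section above the ======= line (the user's
    version); keep="lower" keeps the section below it (the new change).
    Caveat: a content line that itself starts with ======= inside the
    upper section would be mistaken for the separator.
    """
    if keep not in ("upper", "lower"):
        raise ValueError("keep must be 'upper' or 'lower'")
    out = []
    state = "normal"  # one of: normal, upper, lower
    for line in text.splitlines(keepends=True):
        if state == "normal" and line.startswith("<<<<<<<"):
            state = "upper"
        elif state == "upper" and line.startswith("======="):
            state = "lower"
        elif state == "lower" and line.startswith(">>>>>>>"):
            state = "normal"
        elif state == "normal" or state == keep:
            out.append(line)
    return "".join(out)
```

Lines outside any hunk pass through untouched, so the same call can be
applied to a whole conflicted file.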

- Distinguish between logging and reporting (so we can easily send mail
  to users)
    - Logs aren't actually useful, /because/ most operations are idempotent.
      Thus, scratch the logfile and make our report files more useful:
      error.log needs error information; we don't care too much about
      machine readability.  All report files should be overwritten on the
      next run, since we like using --limit to incrementally increase the
      number of things we run.  Note that if we add soft ignores, you /do/
      lose information, so there needs to be some way to also have the soft
      ignore report a "cached error".
    - Report the identifier number at the beginning of all of the stdout logs
    - Don't really care about having the name in the logfile name, but
      have a lookup txt file
    - Figure out a way of collecting blacklist data from .scripts/blacklisted
      and aggregating it
    - Failed migrations should be wired so that wizard commands run in them
      automatically log to the relevant file.  In addition, the seen file
      should get updated when one of them gets fixed.
    - Failed migrations should report how many unmerged files there are
      (so we can auto-punt if it's over a threshold)
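
One way the unmerged-file count from the last item could be obtained is
by parsing the output of `git status --porcelain`, which marks unmerged
paths with a fixed set of two-letter codes.  This is only a sketch: the
helper names and the threshold value are hypothetical, and the caller
is assumed to have captured the command output already.

```python
# Two-letter status codes that `git status --porcelain` uses for
# unmerged paths.
UNMERGED_CODES = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def count_unmerged(porcelain_output):
    """Count the unmerged paths in `git status --porcelain` output."""
    return sum(
        1 for line in porcelain_output.splitlines()
        if line[:2] in UNMERGED_CODES
    )


def should_auto_punt(porcelain_output, threshold):
    """True when there are more unmerged files than the threshold allows."""
    return count_unmerged(porcelain_output) > threshold
```

Working on captured text rather than calling Git directly keeps the
counting logic easy to check against a saved report.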

- Let users use Wizard when ssh'ed into Scripts
    - Make single-user mass-migrate work when not logged in as root

- Make the rest of the world use Wizard
    - Make parallel-find.pl use `sudo -u username git describe --tags`
      to determine the version, and give this greater precedence.
      This also means, however, that we get full
      mediawiki-1.2.3-2-abcdef names (Have patch, pending testing and commit)
    - Make the deployed installer use 'wizard install' /or/ do a migration
      after doing a normal install (the latter makes mass-rollbacks
      easier).
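
To make the "full names" concrete, here is a sketch that splits a name
of the form shown above into its parts.  The pattern is an assumption
based on that single example.  `git describe` normally writes the
abbreviated hash with a leading "g", so the sketch accepts both
spellings.  The function name `parse_install_name` is hypothetical.

```python
import re

_NAME_RE = re.compile(
    r"^(?P<app>[A-Za-z][A-Za-z0-9_]*)"              # application name
    r"-(?P<version>\d+(?:\.\d+)*)"                   # upstream version
    r"(?:-(?P<commits>\d+)-g?(?P<rev>[0-9a-f]+))?$"  # optional describe suffix
)


def parse_install_name(name):
    """Split e.g. 'mediawiki-1.2.3-2-abcdef' into its components.

    Returns (app, version, commits_since_tag, rev), or None when the
    name does not follow the assumed pattern.
    """
    m = _NAME_RE.match(name)
    if m is None:
        return None
    commits = int(m.group("commits")) if m.group("commits") else 0
    return (m.group("app"), m.group("version"), commits, m.group("rev"))
```

A plain name without the suffix parses with a commit count of 0 and no
revision, so both old and new inventory entries can go through the same
function.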

- Pre-emptively check if daemon/scripts-security-upd
  is not on the scripts-security-upd list (/mit/moira/bin/blanche)

- Redo WordPress conversion, with an eye toward automating everything
  possible (such as downloading the tarball and unpacking)

- Pay back code debt
    - Genericize callAsUser and drop_priviledges in shell
    - Summary script should be more machine friendly, and should not
      output summary charts when I increase specificity
    - Summary script should do something intelligent when distinguishing
      between old-style and new-style installs

- Other stuff
    - Don't use the scripts heuristics unless we're on scripts with the
      AFS patch.  Check with `fs sysname`
    - Make 'wizard summary' generate nice pretty graphs of installs by date
      (more histograms, will need to check actual .scripts-version files.)
    - Wizard should be able to handle installs like Django, where one
      component gets installed in web_scripts and another directory gets
      installed in Scripts.
    - ACLs are a starting point for sending mail to users, but they have
      several failure modes:
        - Old maintainers who no longer care but are still on the ACL
        - Private AFS groups that aren't mailing lists and that we
          can't get to
      A question is whether or not sending mail actually helps us:
      many users will probably have to come back to us for help; many
      other users won't care.

PULLING OUT CONFIGURATION FILES IN AN AUTOMATED MANNER

advancedpoll: Template file to fill out
django: Noodles of template files
gallery2: Multistage install process
joomla: Template file
mediawiki: One-step install process
phpbb: Multistage install process
phpical: Template file
trac: NFC
turbogears: NFC
wordpress: Multistage install process

PHILOSOPHY ABOUT LOGGING

Logging is most useful when performing a mass run.  This
includes things such as mass-migration as well as running
summary reports.  An interesting property of mass-migration
and mass-upgrade, however, is that they are idempotent, so if
one fails, the individual case can be debugged simply by running
the single-install equivalent with --debug on.  (This, indeed,
may be easier to do than sifting through a logfile.)

It is a different story when you are running a summary report:
you are primarily bound by your AFS cache and how quickly you can
iterate through all of the autoinstalls.  A report that checks
whether a file exists may take several minutes on a cold AFS
cache; on a hot cache the same report may take a mere 3 seconds.
When you get to more computationally expensive calculations,
however, even having a hot AFS cache is not enough to cut down
your runtime.

There are certain calculations that someone may want to be
able to perform on manipulated data.  As such, this data should
be cached on disk if the process for extracting it takes
a long time.  Also, for usability's sake, Wizard should generate
the common-case reports.

Ensuring that machine-parseable reports are made, and then making
the machinery to reframe this data, increases complexity.  Therefore,
the recommendation is to assume that if you need to run iteratively,
you'll have a hot AFS cache at your fingertips, and if that's not
fast enough, then cache the data.
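
The "then cache the data" fallback could be as small as the sketch
below, which stores the result of an expensive extraction as JSON and
reuses it on later runs.  The helper name `cached`, the choice of JSON,
and the absence of any invalidation are all assumptions made for the
example, not a description of Wizard.

```python
import json
import os
import tempfile


def cached(cache_path, compute):
    """Return compute()'s JSON-serializable result, reusing the copy
    stored at cache_path when one already exists."""
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    result = compute()
    # Write to a temporary file in the same directory and rename it into
    # place, so an interrupted run never leaves a truncated cache behind.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(result, f)
    os.replace(tmp, cache_path)
    return result
```

Deleting the cache file is the only way to force a recomputation here,
which fits reports that are regenerated wholesale on each run.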

COMMIT MESSAGE FIELDS:

Installed-by: username@hostname
Pre-commit-by: Real Name <username@mit.edu>
Upgraded-by: Real Name <username@mit.edu>
Migrated-by: Real Name <username@mit.edu>
Wizard-revision: abcdef1234567890
Wizard-args: /wizard/bin/wizard foo bar baz
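
For illustration, fields like these can be appended to a commit message
as "Key: value" lines after a blank line.  The sketch below assumes
that layout.  The function name `format_commit_message` and the summary
line in the test are hypothetical.

```python
def format_commit_message(summary, fields):
    """Build a commit message: a summary line, a blank line, then one
    "Key: value" line per field, in the order given."""
    lines = [summary, ""]
    for key, value in fields:
        if "\n" in value:
            raise ValueError("field %r must be a single line" % key)
        lines.append("%s: %s" % (key, value))
    return "\n".join(lines) + "\n"
```

Taking the fields as an ordered list of pairs keeps the output
deterministic, which makes the messages easier to grep later.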

GIT COMMIT FIELDS:

Committer: Real Name <username@mit.edu>
Author: lockername locker <lockername@scripts.mit.edu>

NOTES:

- It is not expected or required for update scripts to exist for all
  intervening versions that were present pre-migration; only for them
  to work on the most recent migration.

- Currently all repositories are initialized with --shared, which
  means they have basically ~no space footprint.  However, it
  also means that /mit/scripts/wizard/srv MUST NOT lose revs after
  deployment.

- Full-fledged logging options. Namely:
  x all loggers (delay implementing this until we actually have debug stmts)
    - default is WARNING
    - debug     => loglevel = DEBUG
  x stdout logger
    - default is WARNING (see below for exception)
    - verbose   => loglevel = INFO
  x file logger (creates a dir and lots of little logfiles)
    - default is OFF
    - log-file  => loglevel = INFO
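
One possible reading of the option table above, written against
Python's standard `logging` module.  This is a sketch under
assumptions: the function name, the logger name "wizard-sketch", and
the choice to let `debug` raise every handler to DEBUG are mine, and a
single log file stands in for the directory of little logfiles.

```python
import logging
import sys


def setup_logging(verbose=False, debug=False, log_file=None):
    """Configure a logger following the option table in the notes."""
    logger = logging.getLogger("wizard-sketch")  # hypothetical name
    # Start from a clean slate so repeated calls do not stack handlers.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()
    logger.setLevel(logging.DEBUG)  # let the handlers do the filtering

    # stdout logger: WARNING by default, INFO with verbose.
    stdout_level = logging.WARNING
    if verbose:
        stdout_level = logging.INFO
    if debug:
        stdout_level = logging.DEBUG
    stdout_handler = logging.StreamHandler(sys.stdout)
    stdout_handler.setLevel(stdout_level)
    logger.addHandler(stdout_handler)

    # file logger: off by default, INFO once a log file is requested.
    if log_file is not None:
        file_handler = logging.FileHandler(log_file)
        file_handler.setLevel(logging.DEBUG if debug else logging.INFO)
        logger.addHandler(file_handler)
    return logger
```

Filtering at the handlers rather than on the logger lets stdout stay
quiet while a log file, when enabled, still receives INFO messages.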

OVERALL PLAN:

* Some parts of the infrastructure will not be touched, although I plan
  on documenting them.  Specifically, we will be keeping:

    - parallel-find.pl, and the resulting
      /mit/scripts/.htaccess/scripts/sec-tools/store/scriptslist

* The new procedure for generating an update is as follows:
  (check out the mass-migration instructions for something in this spirit,
  although uglier in some ways; A indicates the step /should/ be automated)

    0. ssh into not-backward, temporarily give the daemon.scripts-security-upd
       bits by blanching it on system:scripts-security-upd, and run parallel-find.pl

    1. Have the Git repository and working copy for the project on hand.

/- wizard prepare-pristine --

A   2. Check out the pristine branch

A   3. Remove all files from the working copy.  Use `wipe-working-dir`

A   4. Download the new tarball

A   5. Extract the tarball over the working copy (`cp -R a/. b` works well;
       remember that the working copy is empty; this needs some intelligent
       input)

A   6. Check for empty directories and add stub files as necessary.
       Use `preserve-empty-dir`

\---
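
Step 6 exists because Git does not track empty directories.  The sketch
below shows one way such a pass could work.  It is not the actual
`preserve-empty-dir` tool, and the stub file name `.preserve-dir` is a
hypothetical choice for this example.

```python
import os


def preserve_empty_dirs(root, stub_name=".preserve-dir"):
    """Create a stub file in every empty directory under root, skipping
    the .git directory, and return the sorted list of stubs created."""
    created = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Decide emptiness before pruning, so a working copy that holds
        # nothing but .git is not treated as an empty directory.
        is_empty = not dirnames and not filenames
        if ".git" in dirnames:
            dirnames.remove(".git")  # never descend into repository metadata
        if is_empty:
            stub = os.path.join(dirpath, stub_name)
            open(stub, "w").close()
            created.append(stub)
    return sorted(created)
```

Running it a second time creates nothing new, since the directories it
touched are no longer empty.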

    7. Git add it all, and then commit as a new pristine version (v1.2.3)

    8. Check out the master branch

    9. [FOR EXISTING REPOSITORIES]
       Merge the pristine branch in. Resolve any conflicts that our
       patches have with new changes. Do NOT let Git auto-commit the
       merge; pass --no-commit (otherwise, you will want to
       git commit --amend to keep our history clean).

       [FOR NEW REPOSITORIES]
       Check if any patches are needed to make the application work
       on Scripts (ideally, there shouldn't be any).

/- wizard prepare-new --

    The .scripts directory is currently not used for anything besides
    parallel-find.pl, but we reserve the right to place files in it
    in the future.

A       mkdir .scripts
A       echo "Deny from all" > .scripts/.htaccess

\---

   10. Check if there are any special update procedures, and update
       the wizard.app.APPNAME module accordingly (or create it, if
       need be).

   11. Run 'wizard prepare-config' on a scripts server while in a checkout
       of this newest version.  This will prepare a new version of the
       configuration file based on the application's latest installer.
       Manually merge back in any custom changes we may have made.
       Check if any of the regular expressions need tweaking by inspecting
       the configuration files for user-specific gunk, and modify
       wizard.app.APPNAME accordingly.

   12. Commit your changes, and tag as v1.2.3-scripts (or scripts2, if
       you are amending an install without any upstream changes)

      NOTE: These steps should be run on a scripts server

   13. Test the new update procedure using our test scripts.  See integration
       tests for more information on how to do this.

        http://scripts.mit.edu/wizard/testing.html#acceptance-tests

      GET APPROVAL BEFORE PROCEEDING ANY FURTHER

      NOTE: The following commands are to be run on not-backward.mit.edu.
      You'll need to add daemon.scripts-security-upd to
      scripts-security-upd to get bits to do this.  Make sure you remove
      these bits when you're done.

A  14. Run `wizard research appname`, which uses Git commands to check
       how many working copies apply the change cleanly, and writes out
       a logfile with the working copies that don't apply cleanly.  It
       also tells us about "corrupt" working copies, i.e. working copies
       that have over a certain threshold of changes.

A  15. Run `wizard mass-upgrade appname`, which applies the update to all working
       copies possible, and sends mail to users for whom the update
       did not apply cleanly.

   16. Run parallel-find.pl to update our inventory

* For mass importing into the repository, there are a few extra things:

    * Many applications had patches associated with them.  Be sure to
      apply them, so later merges work better.

        # the following operation might require -p1
        patch -p0 < ../app-1.2.3/app-1.2.3.patch  # [FIDDLY BIT]

    * When running updates, if the patch has changed you will have to
      do a special procedure for your merge:

        git checkout pristine
        # NOTE: Now, the tricky part (this is different from a real update)
        git symbolic-ref HEAD refs/heads/master
        # NOTE: Now, we think we're on the master branch, but we have
        # pristine copy checked out
        # NOTE: -p0 might need to be twiddled
        patch -p0 < ../app-1.2.3/app-1.2.3.patch
        git add .
        # reconstitute .scripts directory
        git checkout v1.2.2-scripts -- .scripts
        git add .scripts
        # NOTE: Fake the merge
        git rev-parse pristine > .git/MERGE_HEAD

      You could also just try your luck with a manual merge using the patch
      as your guide.

* The repository for a given application will contain the following files:

    - The actual application's files, as from the official tarball

    - A .scripts directory, with the intent of holding Scripts-specific files
      if they become necessary.

        * .scripts/lock (generated) which locks an autoinstall during upgrade
