The Git Autoinstaller

TODO NOW:

- Symlinked rerere to get awesomeness.  Problems:
    - Permissions
    - Might not make a huge difference; how does it handle the empty-file
      and removed-file cases?
    - Need to manually run `git rerere` subsequently to reap the benefits
    - The majority of resolutions have to happen pre-merge (see below)
- Consider this workflow: run a wizard mass-upgrade, and then begin
  resolving working copies one by one.  Each time we resolve a copy, it
  should cause other copies to start magically resolving.  So the ordering
  should be:
    1. Perform the merge.
    2. If it fails, merge the rr-cache with the central rr-cache (this
       operation needs to be atomic) and replace it with a symlink.  File
       permissions should preferably be made correct, but don't have to
       be, since only root will be touching them subsequently.  If the
       hash already exists, don't do anything (maybe record this for the
       benefit of Mister Kite, a.k.a. so we don't have to do a full
       traversal; this optimization might be essential).
    3. When a human is resolving the merges, they are "low concurrency",
       i.e. only one commit recording a rerere resolution will happen at a
       time.  This means that the rr-cache does not need to be
       concurrency-safe.  Some number of hashes in the rr-cache will start
       having postimages; we'll use a full scan to figure that out, then
       cross-reference those with the recorded pending resolutions and
       figure out which checkouts we can run rerere on (this gets tricky,
       permissions-wise).  Alternative plan: require the user to manually
       run some sort of retry command that does this as root; presumably
       they'd run it every ten installs or so.  A user can run
       `git rerere` to get a resolution early.

  This requires some new data structures:
    - Besides the merge.txt file (which should never, ever change), we
      should have an outstanding.txt file which gets modified as our
      scripts do resolutions behind our back.
      Those modifications might be a little annoying for a human to keep
      up with, so we should recommend something like
      `watch -n2 "head file"`.
    - We need to keep track of the hashes and the cross-referencing.  A
      very small SQLite database might be a good idea here, although the
      type of information we're interested in makes for a somewhat
      unnatural query.  Alternatively, we could just use a very simple
      text file.
- Make it possible to say that certain classes of missing files are OK.
- Wizard needs a correct arch/ setup.
- The wizard command, when not on scripts, should automatically SSH to
  scripts and start executing there?
- Write the code to make WordPress figure out its URL from the database.
- Remerges aren't reflected in the parent files, so `git diff` output is
  spurious.  Not sure how to fix this without tree hackery.
- Sometimes users remove files.  If those files later change, they
  automatically get marked as conflicted.  Maybe we should say for certain
  files, "if they're gone, they're gone forever"?  What is the proper
  resolution?
- Parse output HTML for class="error" and give those errors back to the
  user (done), then boot them back into configure so they can enter
  something different.
- Replace gaierror with a more descriptive name (it is a DNS error).
- Pre-emptively check whether daemon.scripts-security-upd is missing from
  the scripts-security-upd list (/mit/moira/bin/blanche).
- If you try to do an install on scripts without SQL, it will sign you up
  but fail to write the sql.cnf file.  This sucks.
- A web application for installing autoinstalls has a hard problem with
  credentials (as do installations that are not conducted on an Athena
  machine).  We have some crazy ideas involving a signed Java applet that
  uses JSch to SSH into athena.dialup and perform operations.
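The rr-cache merging in step 2 above could be sketched roughly as follows. This is only a sketch: the paths (`example-checkout`, `central-rr-cache`) and the toy conflict entry are fabricated so it runs standalone, and it ignores the permissions question entirely.

```shell
# Sketch only: toy paths, not the real wizard layout.
set -e
WC=example-checkout                 # hypothetical working copy
CENTRAL="$PWD/central-rr-cache"     # hypothetical shared cache (absolute path)

# Fabricate one rr-cache conflict entry so the sketch is self-contained.
mkdir -p "$WC/.git/rr-cache/0123abc" "$CENTRAL"
echo "conflict preimage" > "$WC/.git/rr-cache/0123abc/preimage"

# Merge each conflict hash into the central cache, skipping hashes that
# already exist there; staging into a temp dir and renaming keeps the
# publish step atomic.
for entry in "$WC"/.git/rr-cache/*/; do
    hash=$(basename "$entry")
    if [ ! -d "$CENTRAL/$hash" ]; then
        tmp=$(mktemp -d "$CENTRAL/tmp.XXXXXX")
        cp -p "$entry"/* "$tmp/"
        mv "$tmp" "$CENTRAL/$hash" 2>/dev/null || rm -rf "$tmp"
    fi
done

# Replace the per-copy cache with a symlink to the central one, so future
# resolutions recorded anywhere become visible to this copy.
rm -rf "$WC/.git/rr-cache"
ln -s "$CENTRAL" "$WC/.git/rr-cache"
```

The rename-into-place trick is what makes the "needs to be atomic" requirement plausible: a concurrent reader either sees the hash directory completely or not at all.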
- Pay back code debt:
    - Tidy up the common code in callAsUser and drop_priviledges in shell;
      namely, cooking up the sudo and environment variable lines.
    - The summary script should be more machine-friendly, and should not
      output summary charts when I increase specificity.
    - The report code in wizard/command/__init__.py is ugly as sin.  Also,
      the Report object should operate at a higher level of abstraction so
      we don't have to manually increment fails (in fact, that should
      probably be called something different).  The by-percent errors
      should also be automated.
    - Move the resolutions in mediawiki.py to a text file?  (The parsing
      overhead may not be worth it.)
    - PHP allows the semicolon to be omitted at end of file, which can
      result in a parse error if merge resolutions aren't careful.
      `php -l` can be a quick stopgap.
- Other stuff:
    - Figure out why Sphinx sometimes fails to crossref :func: but will
      crossref :meth:, even though the destination is very clearly a
      function.  Example: :func:`wizard.app.php.re_var`
    - The todo extension for Sphinx doesn't properly force a full rebuild.
    - Code annotation!
    - Make single-user mass-migrate work when not logged in as root.  The
      primary difficulty is making the parallel-find information easily
      accessible to individual users; perhaps we can do a single-user
      parallel-find on the fly.
    - Don't use the scripts heuristics unless we're on scripts with the
      AFS patch.  Check with `fs sysname`.
    - Make `wizard summary` generate nice pretty graphs of installs by
      date (more histograms; this will need to check actual
      .scripts-version files).
    - It should be able to handle installs like Django, where one
      component gets installed in web_scripts and another directory gets
      installed in Scripts.
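The `php -l` stopgap mentioned above might look something like this. The demo repository and file names are fabricated so the sketch runs standalone, and the lint step simply counts as a pass when PHP isn't on the PATH.

```shell
# Sketch only: lint the .php files staged by a merge resolution before
# committing it.
set -e
DEMO=php-lint-demo                  # fabricated repo for the demonstration
mkdir "$DEMO"
git -C "$DEMO" init -q
printf '<?php\n$greeting = "hi";\n' > "$DEMO/config.php"
git -C "$DEMO" add config.php
git -C "$DEMO" -c user.email=demo@example.com -c user.name=Demo \
    commit -qm "base"

# Simulate a merge resolution touching config.php.
printf '<?php\n$greeting = "hello";\n' > "$DEMO/config.php"
git -C "$DEMO" add config.php

# Lint every staged PHP file; a file only fails if PHP is present and
# `php -l` reports a parse error (a missing PHP binary counts as a pass).
: > lint-report.txt
for f in $(git -C "$DEMO" diff --cached --name-only HEAD -- '*.php'); do
    if command -v php >/dev/null 2>&1 && ! php -l "$DEMO/$f" >/dev/null 2>&1; then
        echo "PARSE ERROR $f" >> lint-report.txt
    else
        echo "OK $f" >> lint-report.txt
    fi
done
cat lint-report.txt
```

A real version would presumably refuse to record the resolution (or at least warn loudly) when any file lands in the PARSE ERROR bucket.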
- ACLs are a starting point for sending mail to users, but they have
  several failure modes:
    - Old maintainers who don't care but are still on the ACL
    - Private AFS groups that aren't mailing lists and that we can't get
      to

  A question is whether or not sending mail actually helps us: many users
  will probably have to come back to us for help; many other users won't
  care.

PULLING OUT CONFIGURATION FILES IN AN AUTOMATED MANNER

    advancedpoll: Template file to fill out
    django:       Noodles of template files
    gallery2:     Multistage install process
    joomla:       Template file
    mediawiki:    One-step install process
    phpbb:        Multistage install process
    phpical:      Template file
    trac:         NFC
    turbogears:   NFC
    wordpress:    Multistage install process

COMMIT MESSAGE FIELDS:

    Installed-by: username@hostname
    Pre-commit-by: Real Name
    Upgraded-by: Real Name
    Migrated-by: Real Name
    Wizard-revision: abcdef1234567890
    Wizard-args: /wizard/bin/wizard foo bar baz

GIT COMMIT FIELDS:

    Committer: Real Name
    Author: lockername locker

NOTES:

- It is neither required nor expected that update scripts exist for all
  intervening versions that were present pre-migration; they only need to
  work on the most recent migration.
- Currently all repositories are initialized with --shared, which means
  they have basically no space footprint.  However, it also means that
  /mit/scripts/wizard/srv MUST NOT lose revs after deployment.

OVERALL PLAN:

* Some parts of the infrastructure will not be touched, although I plan on
  documenting them.  Specifically, we will be keeping:
    - parallel-find.pl, and the resulting
      /mit/scripts/.htaccess/scripts/sec-tools/store/scriptslist
* The new procedure for generating an update is as follows (check out the
  mass-migration instructions for something in this spirit, although
  uglier in some ways; an A indicates the step /should/ be automated):

    0. SSH into not-backward, temporarily give daemon.scripts-security-upd
       bits by blanching it on system:scripts-security-upd, and run
       parallel-find.pl.
    1.
       [ see doc/upgrade.rst ]

[ENTER HERE FROM CREATING A NEW REPO]

    9. Push all of your changes to a public place, and encourage others to
       test, using --srv-path and a full path.

[ XXX: doc/deploy.rst ]

GET APPROVAL BEFORE PROCEEDING ANY FURTHER; THIS IS PUSHING THE CHANGES TO
THE PUBLIC

NOTE: The following commands are to be run on not-backward.mit.edu.
You'll need to add daemon.scripts-security-upd to scripts-security-upd to
get bits to do this.  Make sure you remove these bits when you're done.

    10. Run `wizard research appname`, which uses Git commands to check
        how many working copies apply the change cleanly, and writes out a
        logfile listing the working copies that don't apply cleanly.  It
        also tells us about "corrupt" working copies, i.e. working copies
        that have over a certain threshold of changes.
    11. Run `wizard mass-upgrade appname`, which applies the update to all
        working copies possible.
    12. Run parallel-find.pl to update our inventory.

[ XXX: doc/upgrade.rst ]

* For mass importing into the repository, there are a few extra things:
    * When mass-producing updates, if the patch has changed you will have
      to follow a special procedure for your merge:

        git checkout pristine
        # NOTE: Now, the tricky part (this is different from a real update)
        git symbolic-ref HEAD refs/heads/master
        # NOTE: Now we think we're on the master branch, but we have the
        # pristine copy checked out
        # NOTE: -p0 might need to be twiddled
        patch -p0 < ../app-1.2.3/app-1.2.3.patch
        git add .
        # reconstitute the .scripts directory
        git checkout v1.2.2-scripts -- .scripts
        git add .scripts
        # NOTE: Fake the merge
        git rev-parse pristine > .git/MERGE_HEAD

      You could also just try your luck with a manual merge, using the
      patch as your guide.

[ XXX: doc/layout.rst ]

* The repository for a given application will contain the following files:
    - The actual application's files, as from the official tarball
    - A .scripts directory, with the intent of holding Scripts-specific
      files if they become necessary.
    - .scripts/dsn, overriding the database source name.
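Step 10's "does the change apply cleanly?" probe could plausibly be built on `git apply --check`, which reports success or failure without touching the working tree. The working copy, file, and patch below are fabricated for illustration; they are not the real wizard layout.

```shell
# Sketch only: probe each working copy with `git apply --check` and log
# which copies would take the upgrade patch cleanly.
set -e
mkdir -p research-demo/copy1
git -C research-demo/copy1 init -q
printf 'hello\nworld\n' > research-demo/copy1/index.txt
git -C research-demo/copy1 add index.txt
git -C research-demo/copy1 -c user.email=d@example.com -c user.name=Demo \
    commit -qm "base install"

# A fabricated upstream patch (changes "world" to "planet").
cat > upgrade.patch <<'EOF'
diff --git a/index.txt b/index.txt
--- a/index.txt
+++ b/index.txt
@@ -1,2 +1,2 @@
 hello
-world
+planet
EOF

# --check validates the patch against each copy without applying it;
# copies that fail go in the log for human follow-up.
: > research.log
for wc in research-demo/*/; do
    if git -C "$wc" apply --check "$PWD/upgrade.patch" 2>/dev/null; then
        echo "clean: $wc" >> research.log
    else
        echo "conflict: $wc" >> research.log
    fi
done
cat research.log
```

A corruption heuristic along the lines described above could be layered on top by counting lines in `git -C "$wc" diff --stat` and flagging copies over some threshold.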