Greg Hudson’s MIT blog


Minerva: mod_authz_mitgroup

Posted in minerva by ghudson on the August 22nd, 2007

I’ve written an authz module for authorizing access based on MIT group membership on a standalone Apache 2.x web server.  It currently uses ldap.mit.edu as a back end.

In order to use it, you first need an Apache auth mechanism which produces a username like ghudson@mit.edu or just ghudson.  The simplest way I know of to do that is to use mod_auth_sslcert from the scripts.mit.edu project.  A future option will be to use Shibboleth, which is expected to be piloted soon; I haven’t tried that yet (but I plan to).

So, the details:

1. Setting up mod_auth_sslcert (until Shibboleth becomes an option)
I’ve stashed a copy of the source at:

http://web.mit.edu/minerva-dev/src/mod_auth_sslcert/mod_auth_sslcert.c

or you can grab it from the scripts.mit.edu repository.  Make sure you have the appropriate httpd devel package installed for your OS (or have your path set properly if you built httpd from source) and run:

apxs -c -i -a mod_auth_sslcert.c

which will compile the source, install it in the httpd modules directory, and add a LoadModule directive to your httpd.conf.  You then configure it in some suitably global section of httpd.conf:

AuthSSLCertVar SSL_CLIENT_S_DN_Email

which will produce usernames like ghudson@MIT.EDU.  If for other reasons you’d rather the username look like just ghudson, you can do that with:

AuthSSLCertStripSuffix "@MIT.EDU"

You also need the web server configured to be able to verify MIT client certificates (see http://web.mit.edu/apache-ssl/www/README.certificate for instructions on getting a server certificate; those are written for Apache 1.3, so you’ll probably need to store the certificate elsewhere for your Apache 2.x server), and to have an area of your server configured with:

SSLVerifyClient require

It’s traditional to use a separate port for the portion of the web space which requires client certificates, but with Apache 2.x you can actually just put that directive inside a <Location> or <Directory> block, and the server will do an SSL renegotiation once it detects that the requested URL is part of the affected area.
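
Putting section 1 together, the httpd.conf fragment might look something like this sketch (the CA file path and the /private location are placeholders of mine; AuthSSLCertStripSuffix is only needed if you want bare usernames):

# Derive the username from the client certificate's email field
AuthSSLCertVar SSL_CLIENT_S_DN_Email
# Optional: turn ghudson@MIT.EDU into just ghudson
AuthSSLCertStripSuffix "@MIT.EDU"

# CA certificate(s) used to verify MIT client certificates (placeholder path)
SSLCACertificateFile /etc/httpd/conf/mitca.pem

# Only this part of the web space demands a client certificate
<Location /private>
SSLVerifyClient require
</Location>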

2. Setting up mod_authz_mitgroup itself

Get the source from:

http://web.mit.edu/minerva-dev/src/mod_authz_mitgroup/mod_authz_mitgroup.c

and install it with:

apxs -c -i -a mod_authz_mitgroup.c

In a .htaccess file or Location directive for a resource you want to control, you would restrict to a specific group with:

AuthType SSLCert
require mitgroup minerva-dev

The first line is specific to using mod_auth_sslcert for authentication; with Shibboleth you’d do something different.
LDAP queries performed by this module will be cached for ten minutes by default.  You can change that with the LDAPCacheTTL directive, e.g. "LDAPCacheTTL 300" for five-minute caching.
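
So a complete .htaccess for a certificate-protected, group-restricted directory might read as follows (a sketch; I’m assuming LDAPCacheTTL is accepted per-directory, and 300 seconds is just an example):

AuthType SSLCert
require mitgroup minerva-dev
LDAPCacheTTL 300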

3. Doing the same thing with mod_authnz_ldap
If you’re willing to accept a hackier syntax and a closer tie-in to LDAP as a back end, you can do the same thing with mod_authnz_ldap, which is distributed with httpd 2.2.  You still need mod_auth_sslcert or equivalent to get the username set up.  Your per-resource access restriction directives would look like:

AuthType SSLCert
AuthLDAPUrl ldap://ldap.mit.edu/dc=mit,dc=edu?mail
AuthLDAPGroupAttribute uniquemember
require ldap-group cn=minerva-dev,ou=groups,dc=mit,dc=edu

(If you configured mod_auth_sslcert to strip the “@MIT.EDU” suffix, remove the “?mail” at the end of the URL so that mod_authnz_ldap uses its default attribute, uid, instead.)  You can put AuthLDAPGroupAttribute in a global place, but don’t put AuthLDAPUrl there, or every resource will become inaccessible if mod_authnz_ldap can’t determine the user’s DN.
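
To spell out the stripped-suffix variant mentioned above, the URL line would simply become:

AuthLDAPUrl ldap://ldap.mit.edu/dc=mit,dc=edu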

cobwebs: Close to a first draft

Posted in minerva by ghudson on the May 25th, 2007

Frustratingly, I’m having to pull myself away from cobwebs work (in favor of Openfire work) just as I feel like I’m close to having a configuration I can show around and start getting feedback on.  Mostly, I need to set up SSL, write scripts to provision new user accounts, and write a little placeholder front-page content.  If anyone is really interested, I could manually set up some accounts and let them poke around.  Some assorted ramblings below.
LVS was a little confusing to set up, partly due to out-of-date documentation.  Like scripts, cobwebs will use the “direct” (or “gateway”) packet-forwarding mode, which means the web server nodes have an arpless interface alias for the virtual IP address which they share with the arpful interface alias on the LVS node.

Setting up an arpless interface alias in Linux is sort of a black art; the mechanism has changed across kernel releases and it still isn’t all that pretty.  Today you can “ifconfig interfacename -arp” which might or might not scope properly for an interface alias; regardless, the RHEL 5 init infrastructure doesn’t do that if you say ARP=no in an ifcfg file, but instead does “ip link set dev devname arp no” on the parent interface, which is the wrong scope.  The workaround is to stick your arpless interface aliases on the lo device instead of the eth0 device, since it doesn’t matter if the true lo device arps or not.

Also like scripts, cobwebs will use the “source hash” scheduling mechanism, which will tend to route the same requestor IP address to the same server.  That’s probably good for performance over GFS, but it might actually be better to route requests for the same domain (username.cobwebs.mit.edu) to the same server.  That’s much harder to do, though.
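
For concreteness, the lo-alias workaround amounts to an ifcfg file like the following on each web server node (a sketch; 192.0.2.10 stands in for the real virtual IP address):

# /etc/sysconfig/network-scripts/ifcfg-lo:0
DEVICE=lo:0
IPADDR=192.0.2.10
NETMASK=255.255.255.255
ONBOOT=yes
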
Out of curiosity, I looked into how scripts.mit.edu was dealing with sharded /tmp; apparently it isn’t ideal if a node goes down, and having an unshared /tmp might be better even if it breaks the illusion of a single machine.  On cobwebs, I can just bind-mount /home/cobwebs/tmp over /tmp and get proper behavior if a node goes down.  If I’m betting the farm on GFS, I may as well get something out of it.
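
The bind mount itself is a one-liner (or the equivalent fstab entry with the bind option):

mount --bind /home/cobwebs/tmp /tmp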

I asked jis for a relatively exotic X.509 certificate for cobwebs (listing *.cobwebs.mit.edu as well as cobwebs.mit.edu, cobwebs-apache1.mit.edu, and cobwebs-apache2.mit.edu as common names).  Haven’t heard back.  I may fall back to asking for a more commonplace certificate in order to get things going, but it would be cool if https://username.cobwebs.mit.edu/ could work.

For provisioning user accounts, I decided to make web server node #1 the master and require that provisioning take place there (so it won’t be a high-availability service).  This is because the web server nodes have to all agree on the same uid for an account, and unlike scripts, I’m not determining the uid from a higher authority like hesiod or AFS.   So someone has to be the decider.

While requesting keytabs, I ran across an odd corner case: MIT is still handing out srvtabs (yum, krb4 inertia) and if you convert a srvtab to a keytab with stock RHEL krb5 configuration, you get keytabs with the wrong hostname (cobwebs.athena.mit.edu).  Easily fixed, but perplexing until I figured out what was going on.

cobwebs: PHP session directories and related stuff

Posted in minerva by ghudson on the May 15th, 2007

Since I last posted, I’ve been working on the cobwebs db guest image and learning about MySQL.  There’s not much to say about that since it pretty much just works.  I need to write a little bit of machinery to create users and to create databases for users, but it doesn’t seem hard.  For the moment I’m hand-creating databases as I need them.
I’ve also tried deploying a few PHP web apps onto my test apache instance.  The results are pretty encouraging.  WordPress and Drupal worked with no problems.  I do notice that all these web apps assume that the vast majority of web host sites use “localhost” as the MySQL host.  That seems a little odd; I would expect most web hosts to want to separate out the database server pretty quickly.  I wonder if they use redirectors.
When I tried out MediaWiki, I ran into a small hurdle: it wants the PHP session directory to be writable, which it isn’t, since /etc/php.ini sets session.save_path to /var/lib/php/session, which is writable by group apache but not by random user IDs.  I can configure PHP’s session.save_path in /etc/php.ini to point to a world-writable sticky directory like /tmp.  The planned default umask (072; all users share the same group) would work to protect session data from other users, I think.  But it feels unsafe; I’d rather have a separate session directory for each user.  I don’t think I can do that in a single global php.ini; the documentation and source code don’t reveal any kind of substitution going on in session.save_path which I could use to point it at the current user’s home directory or some such.

scripts.mit.edu creates a php.ini alongside its web app auto-deployments which sets session.save_path.  I could do the same, but that only works for web apps I have auto-installers for.  I’d like stuff to work out of the box as much as possible.
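
Such a dropped-in php.ini is tiny; a sketch, using a made-up per-user session directory:

; php.ini placed alongside the web app
session.save_path = "/home/cobwebs/ghudson/.php-sessions"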

On a similar note, how does scripts.mit.edu configure PHP to automatically notice php.ini files dropped in alongside PHP scripts?  I couldn’t immediately figure that out, and it didn’t seem to happen on its own.

Cobwebs: falling back to mod_suphp

Posted in minerva by ghudson on the May 10th, 2007

I decided to fall back to mod_suphp for now and steal the scripts.mit.edu PHP FastCGI infrastructure later.

mod_suphp has a weird limitation: in order to get it to handle PHP scripts, you have to issue (among other things) the directive “suPHP_AddHandler x-httpd-php”, and the suPHP_AddHandler directive is only allowed within .htaccess files, not within httpd.conf.  People on the suphp mailing list have asked about this lots of times and have never gotten a clear answer.  Apparently, everyone uses a patch which allows the directive in httpd.conf, so I did that too.  Weird, but I hope to wash my hands of the whole thing in the long term.
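
For reference, the httpd.conf fragment for this setup ends up looking roughly like the following (a sketch, assuming the patched module that accepts suPHP_AddHandler in httpd.conf; the AddHandler line and module path are my guesses at a standard suphp configuration):

LoadModule suphp_module modules/mod_suphp.so

# Map .php files to the handler name that suphp is told to act on
AddHandler x-httpd-php .php
suPHP_AddHandler x-httpd-php
suPHP_Engine on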

Also, it’s much easier to get suphp working when you don’t unwittingly have mod_php installed.  Oops.  (It was being loaded and configured via a config file in /etc/httpd/conf.d; I’ve decided to disable the loading of those files for now.)

Anyway: PHP scripts are now automatically executed with no special configuration, as if mod_php were installed, but they run as the content owner’s uid.  Ready to move on; one more task added to the long list of eventual cleanups.

cobwebs: The suexec security model

Posted in minerva by ghudson on the May 9th, 2007

I did some reading and some thinking about the suexec security model.  You can find an exact description of the checks suexec performs here, but there isn’t really an explanation of the rationale behind those checks.  Here is my attempt to reverse-engineer the rationale:

suexec allows the apache user to take actions on behalf of other users.  Without enough restrictions on what suexec will run, apache would become a superuser, or at least a sort of almost-superuser who can’t modify system files or bind to reserved ports but can execute arbitrary code with any non-system uid it chooses.  The suexec restrictions implement the following contract between regular users and the apache account: the apache account can (in effect) cause you to run any executable program you own which you have placed in your public_html directory, with any arguments, as long as it doesn’t look like you flubbed the permissions enough to let someone else modify that executable.

Naturally, it’s very easy for a naive user to accidentally extend this contract to allowing the apache account to take any action.  “cp /bin/sh $HOME/public_html” would be enough.  But: savvy users are protected; users who have no dynamic content are protected; and users who simply install a ready-made web app with dynamic content are probably safe since those files are intended to be executed via the web anyway.
Under the cobwebs model, no one should have an easy time getting the apache user to do anything nefarious.  But the httpd server runs as apache (rather than root) for a reason: the httpd code is really complex, and a security compromise in that code should not allow an attacker to run rampant over every user’s files.  So I don’t want to relax the suexec contract too much and turn the apache account into a dangerously mutant uid.

How, then, can I make it so that the apache account can execute PHP files without requiring the PHP scripts to be executable and prefixed with #!/usr/bin/php?  I see two options:

1. mod_suphp.  It’s a little like suexec, but since it knows it’s running the php script interpreter, it can do its checking on the filename argument to php rather than on the path to the interpreter itself.  I’ve audited mod_suphp a little, and I don’t like it for three reasons: it’s horribly over-engineered (25 small C++ files to implement what should be a straightforward single-function C program), it appears to have a giant hole in the form of the PHP configuration file (apache gets to choose the ini file location and there are no restrictions on what it is), and it carries none of the efficiencies of FastCGI.

2. Modify the suexec program to extend the contract specifically for PHP: the Apache user can cause you to run /usr/bin/php on any php script you own in your public_html directory that meets all the usual criteria.   If suexec is run with /usr/bin/php as the executable argument, it can do all its checks on the next argument on the command line (except for the check that the file is executable), instead of on the command itself.

Option #2 seems very attractive.  There is another issue which andersk has raised, though: using a different fastcgi process for each user and for each distinct php.ini file.  To solve that problem, I need to learn more about how fcgid decides whether to reuse processes–and while I’m at it, how user content can determine the php.ini file location.  That will be tomorrow’s task, most likely.
On a personal note, I feel like I–and the scripts.mit.edu people–are solving a problem which should have been solved several years ago.  The Apache httpd people clearly care about the mass virtual hosting problem to some degree, and they have some developers with good understandings of Unix security, but at some point there was a failure of vision.  They just haven’t assembled the right pieces to make secure mass virtual hosting possible.  I feel sorry for the people who aren’t security experts or C programmers who are trying to do this and dealing with the inevitable compromises that result.

cobwebs: FastCGI, suexec, and PHP

Posted in minerva by ghudson on the May 7th, 2007

Here’s another bit of web server configuration lossage, as yet unsolved.
We can’t run mod_php on cobwebs for security reasons I’ve mentioned before.  Instead, we need to configure *.php files to be handled by the php interpreter through CGI or FastCGI, probably the latter.

However, Apache suexec balks at running the php interpreter from /usr/bin/php.  It wants all of the binaries it executes to live under the user’s public_html directory.  To live within the suexec security restrictions, every user’s public_html will need a copy of the php binary (which will need to be updated when php is updated, presumably) and the .php file handler will need to be pointed at the per-user copy.  The latter can be accomplished with mod_rewrite magic, which I’d rather ignore; the former sounds like an even bigger hassle.

(scripts.mit.edu doesn’t have to worry about this problem because in their model, php scripts are directly executable, and binfmt_misc figures out that it should run them through /usr/bin/php after suexec has washed its hands of the issue.)

So, do I install a modified suexec?  I need to understand why the hierarchy restriction is in place.  Possibilities include:

1. To support a model where users can only execute “safe” programs of the system administrator’s choosing.  I guess in this model, public_html/cgi-bin would be a root-owned subdir and would be the only one which can be configured for script execution.  Except I don’t think that works since suexec will refuse to run root-owned programs.  So I don’t think that’s the issue.

2. To somehow prevent user A from taking over user B’s account using suexec.  suexec will refuse to run if the executing user’s uid isn’t apache, so we have to posit that user A has taken over the apache uid somehow.  That should be difficult in our model (since we don’t run mod_php or similar), but supposing it were possible, the hierarchy restrictions limit what programs user A can execute as user B’s uid.  However, they do not limit what arguments the web server can pass to those programs, so there still seems to be a lot of wiggle room.

3. Something else.

Cobwebs: Apache and the hosting of untrusted users

Posted in minerva by ghudson on the May 7th, 2007

Before I started prototyping this web hosting project, I did a lot of research into the details of securing (or trying to secure) a hosting system for untrusted users. After a great deal of digging, I came to the conclusion that commercial services aren’t typically doing a very good job of insulating users from each other. PHP code typically runs as the same uid for all users, and users typically have access to each others’ database passwords (either from the shell or from dynamic web content) in the default configurations of web apps. It’s a big mess; I’m a little surprised it doesn’t generate a lot of high-profile security incidents, but I guess security compromises are so common across so many fronts these days that they don’t generate much news or pressure to fix things. Also, you’re typically sharing a web host with only a few thousand other users who don’t know you and don’t have much motivation to steal or muck with your database information; a limited user population means less risk of attack.

Part of the reason for the bad state of web hosting security is that the available tools don’t encourage a good security architecture–specifically, Apache httpd and PHP. PHP is most commonly run as a module, but httpd doesn’t have a multiprocessing model which allows module code to be executed with per-user uids; barring some very gross and unsupported hacks, all module code has to be executed as the web server uid. So, right off the bat you have to choose between the common configuration and a secure one. PHP has a maze of configuration options to try to form a security cordon between users sharing the same uid, but they are full of holes and most of them are desupported in the forthcoming PHP 6. Moreover, if you turn enough of them on, you’ll wind up breaking common web apps.

So I’ll need to abandon the common configuration and run PHP scripts through FastCGI or regular CGI, which support per-user execution uids for user content. Fair enough, but httpd has issues in this department as well. The only built-in mechanism for picking a per-user execution uid is mod_userdir, which translates http://servername/~username/blah into /homedir-of-username/public_html/blah (where the middle path component is configurable). That turns out to be horribly insecure from a browser perspective; all http://servername/~username URLs live in the same browser security context, allowing them to keylog each other’s input, steal each other’s Basic auth credentials, and do other nasty things.

To give each user a separate browser security context, I’ll want to avoid mod_userdir and use per-user subdomains instead: http://username.cobwebs.mit.edu/blah, where we have a wildcard DNS record mapping *.cobwebs.mit.edu to the cobwebs IP address. This is called “mass name-based virtual hosting”, because all the virtual hosts share the same IP address and because there are too many of them to maintain configurations for by hand in httpd.conf. There are a few ways to automate the URL-to-path translation for mass virtual hosting, such as mod_rewrite and mod_vhost_alias, but neither one of them supports picking an execution uid for CGI scripts based on the domain name.
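
To make the contrast concrete, the path-translation half of the problem takes only a couple of lines with mod_vhost_alias; it’s the uid selection that has no hook.  A sketch, with a hypothetical home directory layout:

UseCanonicalName Off
# %1 is the first dotted component of the requested hostname, i.e. the username
VirtualDocumentRoot /home/cobwebs/%1/public_html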

After consulting with the scripts.mit.edu maintainers, I determined it was time to write some code. I made a copy of mod_userdir.c, stripped out most of the configuration logic and the existing path translation, borrowed a few small bits of code from mod_vhost_alias, and produced mod_userdomain.c which does the necessary translation and uid selection. I also sent mail to the httpd maintainers asking if they are interested in mod_userdir extensions to support this kind of thing, but have received no response.

Cobwebs: trials and travails (part 2)

Posted in minerva by ghudson on the May 7th, 2007

So, I have a bunch of non-updated RHEL 5 guests (when I would have preferred up-to-date FC6 guests) with the Red Hat Cluster Suite packages available on them.  There’s documentation on setting up RHEL 5 clusters; I’m not doing anything too out of the ordinary from Red Hat’s point of view.  Should be straightforward, right?  Well, not so much.

First, I had a choice between the old GUI for administering clusters (system-config-cluster) and the new web-based one, Conga.  Conga contains two parts, a back-end named ricci which you install on the cluster nodes, and a piece of HTTP middleware named luci which can be installed on a cluster node or another machine; it doesn’t really matter which.  My first warning sign with Conga was that the luci RPM contains the entire Zope framework.  I’m sure there’s some important practical reason for that, but it suggests that something went awry in the release process and the software might not be fully baked.  My second warning sign was that luci wanted me to input a list of all the cluster nodes and their root passwords.  I’m not really comfortable with that kind of machine administration.  system-config-cluster it is.

So I set up a cluster config file on a few nodes (I started with the block device exporter and two web server nodes) and attempted to bring up the cluster software.  It wouldn’t come up on any nodes–this turns out to be because I hadn’t enabled the requisite ports on the firewall, but the Red Hat documentation didn’t bother to mention any of that, and the log messages weren’t very conducive to understanding.  As of today I’m still not sure what ports I need to enable (there’s a list in the linux-cluster FAQ, but it doesn’t say whether it’s for the RHEL 4 code or the RHEL 5/FC6 code, so I can’t trust it); I simply disabled the firewall for now.  The cluster software also ran afoul of SELinux; this time, the error messages thrown were much more helpful, although they didn’t indicate the exact filename aisexec was failing to manipulate.  I decided to disable SELinux enforcement and come back and clean that up later.

After making those allowances, everything pretty much worked.  gnbd (the block device exporter) doesn’t appear to come with any startup scripts, so I’ll have to write my own.  But when started by hand, it works fine, as does gfs so far.  I’m adding to a long list of stuff to come back and clean up later, but I’m ready to configure Apache on the web server nodes.
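
(For the record, the by-hand gnbd startup amounts to roughly the following; the device path, export name, and exporter hostname are placeholders.)

# On the node exporting the block device:
gnbd_serv
gnbd_export -d /dev/VolGroup00/cobwebs -e cobwebs

# On each web server node, which then sees the device as /dev/gnbd/cobwebs:
gnbd_import -i cobwebs-gnbd.mit.edu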

Cobwebs: trials and travails (part 1)

Posted in minerva by ghudson on the April 27th, 2007

So, that’s what I’ve been working on: getting the Red Hat clustering software working under virtual Xen images on a single RHEL5 machine.

First, I was stalled on network addresses.  Our machine was deployed into a colo facility on a full network and we couldn’t get more than one IP address for a machine.  Due to miscommunications, it took weeks to break that roadblock.  Xen images can operate via NAT, but the libvirt software that RHEL5 built on top of Xen doesn’t work with it, so I wound up beating my head against that problem futilely until I got more IP addresses and stopped having to worry about it.

The next big problem was software packaging.  The aforementioned clustering software is supposedly packaged in RHEL5 (with the right kind of license) and in FC6.  I want to use FC6 for my guest images, but key pieces of the clustering software (like gnbd and gfs) seem to be missing in FC6.  My queries about this on the mailing list got one confusing answer and no followup, so I gave up.  If necessary, I’ll compile from source, but I don’t want to go into that level of bit-fiddling while I’m still trying to get a test environment up and running.  So, I went to RHEL5.  For various aggravating political and technical reasons, we don’t have convenient access to the clustering software packages in RHEL 5 (we can’t get updates to those packages right now and we don’t have access to the packages via RHN, but we at least have copies of them in AFS from the RHEL5 media), but at least we have them at all.

The Cobwebs web-hosting plan

Posted in minerva by ghudson on the April 27th, 2007

Cobwebs is the temporary name for a shared web-hosting service which is expected to be part of the Minerva service offerings.  (Much like scripts.mit.edu is.)
Shared web-hosting is a security nightmare.  We expect people to be running poorly-written PHP web apps which are subject to frequent compromise.  A user account compromise is no big deal, but if that escalates into a root compromise, that’s a large problem.

In order to prevent root escalations, we need to close known security holes quickly.  The scripts.mit.edu people discovered that a one-day response time is too slow.  Most of the time this is pretty easy: update a vulnerable package while the machine is running and keep going.  A few times a year, however, someone finds a local security hole in the Linux kernel.  Having a mid-day outage to reboot the machine after a kernel upgrade is unpleasant.

So, right from the start we want cobwebs to be a highly-available service, at least in the sense of being able to deliberately shut down one web server instance for a kernel upgrade with no user-visible interruption.  As long as we’re going to have multiple web server instances, we’d like them to do load-sharing for better scalability.

Fortunately, Red Hat has done a lot of work in this area, although it isn’t all completely polished yet.  Their stuff is called the linux-cluster project (which is a slightly confusing name, since there are also Beowulf clusters and the like).  The most complex piece in there is a filesystem called GFS which can be mounted from multiple server nodes which share a block storage device.  But there’s also a lot of generic cluster infrastructure which detects node failures, “fences” failed nodes off from shared storage to avoid corruption, and monitors services.

(scripts.mit.edu doesn’t have to worry about as much of this stuff because it uses AFS.  We don’t want to use AFS, for reasons I can go into separately.)

Right now we can’t justify the hardware expenditure of a 5-6 machine farm for cobwebs, so I’m using virtual Xen images on a RHEL 5 host.  There are a couple of guest images for the web server and user logins (these are the images which have to be kept safe from account-to-root escalations), one guest image to export a block device to the web server images, and a couple of guest images which will run LVS to load-balance connections between the two web server images.  And another one to run MySQL databases.  (I haven’t been thinking too hard about the MySQL end of this service yet, but I’ll have to at some point.)