Opened 8 years ago

Last modified 5 years ago

#321 new enhancement

Switch static-cat to a blacklist instead of whitelist

Reported by: adehnert
Owned by:
Priority: normal
Milestone:
Component: web
Keywords: opinionated, static-cat, haskell
Cc:

Description

Currently, static-cat (and related infrastructure) has a list of extensions that it will read (see http://scripts.mit.edu/faq/50/), and it refuses to handle any other extension. We should switch from whitelisting certain "safe" extensions (and incrementally increasing that list) to blacklisting certain "dangerous" extensions (and possibly incrementally increasing that list).

(N.B.: If you want a fun ticket to spend five hours discussing before you fail to reach a consensus on the desirability of doing it, here's a great one.)

As Mitch writes in #92:

Regardless of what you'd expect with a normal web server, remember that many of our users have no clue how a web server works. And if they checked to make sure that their file wasn't visible on the web, that's a completely reasonable thing to expect not to change.

At the moment, if somebody decides to put, say, secret-info.rtf, secrets.txt.gz, secrets.txt.bz2, secrets.json, secrets.yaml, secrets.sql, or presumably some huge number of other extensions in their locker, they might verify that loading them didn't work, and conclude that they're safely protected. As is, however, at some point in the indefinite future, a request from another scripts user may result in us unblocking any of those extensions (with variable amounts of angst about the security implications of doing so).

Our original user would have no particular reason to expect their files to have suddenly become accessible. Using a blacklist would help protect against this --- as long as we never remove things from the blacklist, a user who checks once can be reasonably assured that their files will never become visible.

Naturally, the reverse of this is that a user who checks once is not reasonably assured that a page will remain visible. Fortunately, site breakage tends to be fairly obvious, and a scripts.mit.edu user is liable to notice (or get an email from a user) if part of their (actively used) site spontaneously breaks, at which point they can fix it. The reverse does not hold --- if part of their site spontaneously becomes viewable, they may not find out for a lengthy period of time, even for an actively-used site.
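
To make the asymmetry concrete, here is a minimal Python sketch. It is not static-cat's actual code: the function names and every extension list below are hypothetical, chosen only to illustrate what happens to a file a user checked once when either kind of list later grows.

```python
import os


def extension(path):
    """Return the lowercased final extension, e.g. '.gz' for 'secrets.txt.gz'."""
    return os.path.splitext(path)[1].lower()


def served_with_whitelist(path, whitelist):
    """Whitelist policy: serve only extensions that are explicitly listed."""
    return extension(path) in whitelist


def served_with_blacklist(path, blacklist):
    """Blacklist policy: serve everything except explicitly listed extensions."""
    return extension(path) not in blacklist


# Hypothetical lists at the time the user checks their file...
whitelist_then = {".html", ".css", ".png"}
blacklist_then = {".php", ".cgi", ".json"}

# ...and after each list has grown at some later date.
whitelist_later = whitelist_then | {".json"}  # unblocked at another user's request
blacklist_later = blacklist_then | {".sql"}   # a newly identified "dangerous" extension
```

Under these assumed lists, secrets.json is refused by the whitelist at first and served once the whitelist grows, which is the silent exposure described above. With the blacklist, secrets.json is refused both before and after the list grows; the cost is that a file such as dump.sql goes from served to refused, which is the visible kind of breakage.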

There is a risk that some software uses an obscure extension for its configuration files and does not expect that extension to be served. However, I believe that most web hosts default to serving anything that lacks a configured handler, so I would expect any off-the-shelf software to address this risk itself, for example by using the same extension as its source files (so that they are executed instead of displayed), by shipping .htaccess rules that disable access, or by storing the files outside the docroot. (Note that all of these techniques except the last depend upon Apache being uncompromised, and the last assumes that static-cat checks that files are in the docroot. This may be unacceptable.)
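
For reference, a docroot containment check of the kind assumed above could look roughly like the following Python sketch. Whether static-cat performs such a check, and how, is not established in this ticket; the function name and the paths in the usage note are hypothetical.

```python
import os


def is_inside_docroot(path, docroot):
    """Return True if path resolves to the docroot itself or something beneath it.

    Both arguments are resolved with os.path.realpath first, so '..'
    components and symlinks cannot be used to escape the docroot.
    """
    real_root = os.path.realpath(docroot)
    real_path = os.path.realpath(path)
    # commonpath compares whole path components, so a sibling directory such
    # as '/srv/site-old' is not mistaken for being inside '/srv/site'.
    return os.path.commonpath([real_root, real_path]) == real_root
```

With a hypothetical docroot of /srv/web_scripts/example, a request for /srv/web_scripts/example/index.html would pass this check, while /srv/web_scripts/example/../config/secrets.yaml would not, because it resolves to a path outside the docroot.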

This is related to:

  • #92 (make robots.txt viewable)
  • #319 (make .txt files viewable)
  • #320 (make static-cat per-user customizable)

There are lengthy discussions of the desirability of doing this at:

  • -c scripts -i sipb, on 2011-03-21
  • -c scripts -i 1685762, on 2011-08-03

Change History (1)

comment:1 Changed 5 years ago by andersk

  • Keywords opinionated added; opionated removed