Custom Query (196 matches)

Results (64 - 66 of 196)

Ticket Resolution Summary Owner Reporter
#142 fixed Add monitoring for failure of the backend network mitchb
Description

We don't presently have a Nagios test that will alert us if there's a failure of the backend network switch, or the backend interface on an individual server. All the probes for sql.mit.edu will still pass because they run over the public network.

We should use some plugin to run a 'select 1;' or something similarly trivial on each scripts server.
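A minimal sketch of such a probe, not the plugin the ticket ended up using. It assumes the check is handed a zero-argument `connect` callable that returns a Python DB-API connection; in a real deployment that callable would wrap a MySQL driver pointed at the server's backend interface, which is an assumption here. The function name `check_select_one` and the message text are hypothetical. The return codes follow the standard Nagios plugin convention (0 = OK, 2 = CRITICAL).

```python
import sqlite3  # stand-in driver for the demo only; the ticket concerns MySQL
import sys

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def check_select_one(connect):
    """Run 'SELECT 1' over a DB-API connection; return (exit_code, message).

    `connect` is a hypothetical zero-argument callable returning a DB-API
    connection. Any failure to connect or query is reported as CRITICAL.
    """
    try:
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            row = cur.fetchone()
        finally:
            conn.close()
    except Exception as exc:  # connection refused, timeout, driver error, ...
        return CRITICAL, f"SQL CRITICAL - query failed: {exc}"
    if row is None or row[0] != 1:
        return CRITICAL, f"SQL CRITICAL - unexpected result: {row!r}"
    return OK, "SQL OK - SELECT 1 returned 1"


if __name__ == "__main__":
    # Demo against an in-memory SQLite database, purely to exercise the logic.
    code, message = check_select_one(lambda: sqlite3.connect(":memory:"))
    print(message)
    sys.exit(code)
```

Because the probe only passes if the query round-trips, it would fail when the path it connects over is down, which is the gap the public-network probes leave open.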

#143 fixed Monitor postfix queue size adehnert
Description

Today we had over a million emails in our Postfix queues due to a misbehaving script or something. We should have Nagios alert when the size of the queues gets over a hundred or something on a server, so that we notice these problems *before* running ls in the queue directory (much less actually doing something with the messages) becomes annoyingly slow*.

  * It looks like about four minutes, unless that was how long until my Ctrl-C registered after I decided I didn't feel like waiting.
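A rough sketch of the queue-size check described above, under stated assumptions rather than the check that was actually deployed. It counts files under the Postfix `incoming`, `active`, and `deferred` queue directories and maps the total to a Nagios exit code. The warning threshold of 100 comes from the ticket; the critical threshold of 1000, the spool path `/var/spool/postfix`, and the function names are assumptions for illustration.

```python
import os
import sys

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes

# Queue subdirectories to count; which ones matter is a policy choice.
QUEUES = ("incoming", "active", "deferred")


def count_queued(spool, queues=QUEUES):
    """Count message files under the given queue subdirectories of `spool`.

    os.walk is used because some queues are split into hashed
    subdirectories. Missing directories simply contribute zero.
    """
    total = 0
    for queue in queues:
        for _dirpath, _dirnames, filenames in os.walk(os.path.join(spool, queue)):
            total += len(filenames)
    return total


def queue_status(count, warn=100, crit=1000):
    """Map a queue size to (exit_code, message); thresholds are assumptions."""
    if count > crit:
        return CRITICAL, f"QUEUE CRITICAL - {count} messages queued"
    if count > warn:
        return WARNING, f"QUEUE WARNING - {count} messages queued"
    return OK, f"QUEUE OK - {count} messages queued"


if __name__ == "__main__":
    # Assumed default spool location; reading it usually requires privileges.
    code, message = queue_status(count_queued("/var/spool/postfix"))
    print(message)
    sys.exit(code)
```

Walking the directory tree is itself slow once the queue is huge, which is the reason to alert at a low threshold: the check stays cheap as long as it fires early.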
#145 fixed update scripts.mit.edu/web text for pony kaduk
Description

The text on scripts.mit.edu/web still directs users to email us for a mit.edu hostname. We should update this to mention pony, and possibly also point at FAQ 14, which has text about checking for availability.
