Custom Query (196 matches)
Results (115 - 117 of 196)
Ticket | Resolution | Summary | Owner | Reporter |
---|---|---|---|---|
#142 | fixed | Add monitoring for failure of the backend network | mitchb | |
Description |
We don't presently have a Nagios test that will alert us if there's a failure of the backend network switch, or the backend interface on an individual server. All the probes for sql.mit.edu will still pass because they run over the public network. We should use some plugin to run a 'select 1;' or something similarly trivial on each scripts server. |
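One way the 'select 1;' probe suggested in #142 might look is sketched below. This is not the check that was eventually deployed. It assumes the stock `mysql` command-line client is installed on the monitoring host, that credentials come from a client option file, and that the caller passes each server's backend address. The function names and the script name `check_backend_sql.py` are hypothetical.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def build_command(backend_host, timeout=5):
    """Build a mysql client invocation that runs 'select 1;' against backend_host.

    backend_host is assumed to be the server's address on the backend network,
    so the query cannot succeed by travelling over the public network.
    """
    return [
        "mysql",
        "--host=" + backend_host,
        "--connect-timeout=" + str(timeout),
        "--batch",
        "--skip-column-names",
        "--execute=select 1;",
    ]


def evaluate_query(returncode, stdout):
    """Map the client's exit status and output to a Nagios (code, message) pair."""
    if returncode != 0:
        return CRITICAL, "CRITICAL - query over backend network failed"
    if stdout.strip() != "1":
        return WARNING, "WARNING - unexpected reply: %r" % stdout.strip()
    return OK, "OK - 'select 1;' succeeded over backend network"


def check(backend_host, timeout=5):
    """Run the probe once and return a Nagios (code, message) pair."""
    try:
        proc = subprocess.run(
            build_command(backend_host, timeout),
            capture_output=True,
            text=True,
            timeout=timeout + 5,  # hard stop if the client itself hangs
        )
    except FileNotFoundError:
        return UNKNOWN, "UNKNOWN - mysql client not installed"
    except subprocess.TimeoutExpired:
        return CRITICAL, "CRITICAL - query timed out"
    return evaluate_query(proc.returncode, proc.stdout)


def main(argv):
    """Entry point for a wrapper script: print the status line, return the code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_backend_sql.py BACKEND_HOST")
        return UNKNOWN
    code, message = check(argv[1])
    print(message)
    return code
```

A wrapper script would finish with `sys.exit(main(sys.argv))` so that Nagios sees the exit code. Because the probe targets the backend address explicitly, it fails when the backend switch or interface is down even if the public-network probes still pass.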
|||
#143 | fixed | Monitor postfix queue size | adehnert | |
Description |
Today we had over a million emails in our postfix queues due to a misbehaving script or something. We should have Nagios alert when the size of the queues gets over a hundred or something on a server, so that we notice these problems *before* running ls in the queue directory (much less actually doing something with the messages) becomes annoyingly slow. |
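The queue-size alert requested in #143 could be approximated with a small check like the following sketch. It assumes the default spool location `/var/spool/postfix`, enough privileges to read the queue directories, and a warning threshold of 100 taken from the ticket. The critical threshold of 1000 and all function names are illustrative assumptions, not part of the original request.

```python
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Postfix queue directories that hold message files.
QUEUES = ("incoming", "active", "deferred", "hold")


def count_queue_files(spool_dir, queues=QUEUES):
    """Count message files per queue by walking each queue directory.

    os.walk descends into the hashed subdirectories Postfix may create,
    and it does not sort entries, so it stays usable on very large queues.
    A missing queue directory simply counts as zero.
    """
    counts = {}
    for queue in queues:
        total = 0
        for _dirpath, _dirnames, filenames in os.walk(os.path.join(spool_dir, queue)):
            total += len(filenames)
        counts[queue] = total
    return counts


def evaluate_queue(counts, warn=100, crit=1000):
    """Map per-queue counts to a Nagios (code, message) pair.

    warn=100 follows the ticket's "over a hundred"; crit=1000 is an assumption.
    """
    total = sum(counts.values())
    detail = ", ".join("%s=%d" % (q, n) for q, n in sorted(counts.items()))
    if total > crit:
        return CRITICAL, "CRITICAL - %d queued messages (%s)" % (total, detail)
    if total > warn:
        return WARNING, "WARNING - %d queued messages (%s)" % (total, detail)
    return OK, "OK - %d queued messages (%s)" % (total, detail)


def main(spool_dir="/var/spool/postfix"):
    """Print the status line and return the exit code for a wrapper script."""
    code, message = evaluate_queue(count_queue_files(spool_dir))
    print(message)
    return code
```

A wrapper script would end with `sys.exit(main())`. Counting with a directory walk rather than `ls` avoids the sorting cost that makes listing a million-entry queue slow.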
|||
#145 | fixed | update scripts.mit.edu/web text for pony | kaduk | |
Description |
The text on scripts.mit.edu/web still directs users to email us for an mit.edu hostname. We should update this to mention pony, and possibly also point at FAQ 14, which has text about checking for availability. |