Opened 17 years ago
Closed 16 years ago
#20 closed defect (fixed)
scripts LVS design issues
| Reported by: | andersk | Owned by: | |
|---|---|---|---|
| Priority: | minor | Milestone: | |
| Component: | web | Keywords: | |
| Cc: | | | |
Description (last modified by andersk)
(Imported from help.mit.edu #431727.)
Now that Nagios doesn't suck, we can actually see the scripts outage caused by the AFS server restart every Sunday morning. This made me realize a few things:
- Our fallback to hodge-podge isn't just an exceptional condition; it happens every week. Thus it's an even worse idea than I thought it was. Viewers will get confused, and search engines may remove pages from their indexes, if they happen to get a 404 error from hodge-podge at the wrong moment.
- Since the heartbeat script is in the scripts locker, the AFS server that serves it (aegisthus) is a single point of failure. Ideally LVS would check multiple heartbeat scripts in lockers on several different AFS servers, and continue routing connections if any of them respond.
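The "continue routing if any heartbeat responds" idea in the second point could look roughly like the sketch below. It is only an illustration: the URLs are hypothetical placeholders (the ticket does not name the other lockers or AFS servers), treating HTTP 200 as "alive" is an assumption, and how the result would be hooked into the LVS health check is not shown.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable

# Hypothetical heartbeat URLs, one per locker on a different AFS server.
HEARTBEAT_URLS = [
    "http://scripts.example.com/~locker-a/heartbeat",
    "http://scripts.example.com/~locker-b/heartbeat",
    "http://scripts.example.com/~locker-c/heartbeat",
]


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (404, 500, ...).
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def any_heartbeat_ok(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Report the real server as healthy if at least one heartbeat returns 200.

    A single unavailable AFS server then only takes out one heartbeat
    instead of failing the whole check.
    """
    return any(fetch(url) == 200 for url in urls)
```

A monitor would call `any_heartbeat_ok(HEARTBEAT_URLS)` and keep the real server in rotation while it returns `True`. The `fetch` parameter exists only so the logic can be exercised without network access.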
Change History (3)
comment:1 Changed 17 years ago by andersk
- Description modified (diff)
comment:2 Changed 17 years ago by andersk
- Description modified (diff)
comment:3 Changed 16 years ago by quentin
- Resolution set to fixed
- Status changed from new to closed
The LVS directors now run a local sorry-server that responds to all requests with a 500 error.
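For illustration, a sorry-server of this kind can be sketched in a few lines. This is not the code running on the directors, which the ticket does not show; the handler name, response text, and default port are assumptions, and only the common HTTP methods are covered.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class SorryHandler(BaseHTTPRequestHandler):
    """Answer every handled request with a 500 error, whatever the path."""

    def _sorry(self) -> None:
        body = b"Service temporarily unavailable.\n"  # assumed wording
        self.send_response(500)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(body)

    # Route the common methods to the same response.
    do_GET = do_HEAD = do_POST = do_PUT = do_DELETE = _sorry


def make_sorry_server(host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Build the server without starting it; the port is a placeholder."""
    return HTTPServer((host, port), SorryHandler)
```

Calling `make_sorry_server().serve_forever()` starts it. Unlike the 404 responses from hodge-podge described above, every path then gets a server error status.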