This near-standing-room-only session outlined the events leading to a 1996 system crash, how orders, serials, claims, and invoices were handled during the prolonged downtime, and the clean-up efforts afterwards.
In addition to the session material included in the IUG conference notebook, Ms. Cerqua distributed a handout with background information on the State Library of Ohio and several examples of procedures implemented during and after the system crash.
An equipment malfunction (a disk failure) occurred October 2, 1996. Because of problems with the backup tapes and reindexing, the system was offline until October 30 for staff functions and November 4 for the OPAC. When the crash occurred, $50,000 of federal grant money had just been encumbered, and orders had to be received and paid for very quickly. The highest priority was to keep materials moving.
Because the backup was corrupted, reindexing was necessary. Approximately 2% of the database was corrupted, and all acquisitions data from September 21, 1996 to October 2, 1996 was missing. Orders, check-in records, payment records, and routing slips were all gone and could not be recovered. Innovative was contacted, as were acquisitions vendors, to alert them to the situation. Baker & Taylor's B&T Link program was implemented as a backup ordering system. Because the State Library is an OhioLINK member, the collection was mirrored and staff could see the collection to confirm titles and locations.
In assessing the situation, the following points were key to survival. Staff skill levels had to be identified, with untapped skills utilized in the short term. Internal and external customers had to be contacted and apprised of the problem. Work-arounds for key processes were set up (serials, firm orders, receiving, invoices/financial, fiscal). Priorities for each point were established. Spreadsheets and vendor order confirmation reports were used to track orders.
Innovative reindexed the database and the record reload took one week. All stray orders, check-ins, and items were moved to dummy records. When requested, Innovative created more dummy records and moved order records around. About 350 stray order records were attached to four bibliographic records. All serials data had to be rekeyed.
After the disaster, backup procedures were changed significantly. It was learned that the disk crash was caused by a tape load failure that followed an error message being ignored during the backup procedure. As a result, a tape broke, overheated the unit, and shut down the system. The responsibility for backups was changed. Full backups are now done twice a week, and tapes are refreshed more often. One month of backup tapes is kept in storage, with one week's tapes kept on site. Daily backups are also done. E-mail confirmations of orders now go to a non-INNOPAC address. A log is kept of all system glitches, burps, etc., no matter how insignificant they may seem at the time.
Costs associated with the crash included overtime (about 60 hours) and lost records (order, bib, serial, etc.). In assessing the intangibles of the disaster, it was noted that there was a loss of confidence in the database, and that more worry is now given to little glitches that might occur with the system. The Library staff is now more aggressive when dealing with Automation staff in resolving problems.