IUG 2001 Conference Proceedings

Session: F5/P2

URL Maintenance in Innopac: A User's Toolbox

Tom Tyler, University of Denver Library

Accompanying slides can be found at: http://www.du.edu/~ttyler/iug2001/

Tom Tyler opened the presentation by explaining his interest in URL maintenance. While his current position at the University of Denver includes responsibility for the licensing of electronic resources, it has been his work with the identification of electronic versions of U.S. government documents and support of the MarcXGen software used by many Innopac libraries that has been the basis for his current involvement in URL maintenance issues.

Mr. Tyler acknowledged that while the new Millennium Access Plus (MAP) product relies on dynamic and contextual linking instead of conventional URLs embedded in MARC records, there will continue to be a need for static URL verification in the foreseeable future. Librarians and staff involved in this work can use the tools described in this presentation to make their work more systematic, efficient, and effective.

From Mr. Tyler's perspective, URL maintenance has become part of a larger URL management effort. Within libraries, a cottage industry dealing with URL issues has developed in recent years. Reference personnel may be responsible for maintaining proxy server resources; serials staff often add and maintain URLs for e-journal access; cataloging staff create new 856 field URLs or edit existing ones; systems staff attempt to ensure accurate IP addressing for the clientele served by the library; and others negotiate licenses that communicate IP parameters to vendors or their representatives. In the past year, libraries have seen URLs in their catalogs increase significantly as the result of increased linking to electronic publications by the Government Printing Office (GPO), locally initiated links to electronic full-text journals made available to subscribers or licensees of aggregator products, and the purchase of large sets of records with 856 URLs for collections such as netLibrary and Early English Books.

Innovative's URL checker, as reviewed by Mary Strouse and Tom Tyler at IUG-8 in Philadelphia, continues to be problematic for many libraries. First, for libraries with a significant number of URLs in their catalogs, the loading time and reloading time of the URLVerify report are painfully slow. Second, redirects are untested, so that if a URL changes, the Innopac product doesn't check the new address for its validity. Third, there are many good URLs that are reported as errors.

Mr. Tyler's three freeware programs continue to be used by a number of Innopac libraries. MarcXGen is used with third-party link checkers such as LinkBot and Xenu's Link Sleuth. URLVExpt and MkItBetr are used with the Innovative URLVerify product.

Libraries using the Innovative URL Checker can make the product more effective by doing two things. First, they can take advantage of the URL Block process, an undocumented feature, which can be set up through Innovative's help desk. URL Block consists of a file containing domains that will not be checked by URLVerify.
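
The idea behind a domain block list can be illustrated with a short Python sketch. This is not Innovative's implementation, and the actual URL Block file format is not described in the presentation; the one-domain-per-line layout, the `#` comment convention, and the function names `load_blocked_domains`, `is_blocked`, and `urls_to_check` are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def load_blocked_domains(lines):
    """Parse a block list, assumed here to hold one domain per line.

    Blank lines and lines starting with '#' are skipped (an assumed convention).
    """
    domains = set()
    for line in lines:
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            domains.add(entry)
    return domains


def is_blocked(url, blocked_domains):
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)


def urls_to_check(urls, blocked_domains):
    """Keep only the URLs whose hosts are not on the block list."""
    return [u for u in urls if not is_blocked(u, blocked_domains)]
```

Matching on the parsed host name rather than on a substring of the whole URL keeps a blocked domain from accidentally excluding unrelated hosts whose names merely contain it.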

The second way the Innovative URLVerify report can be made more useful is to use either MarcXGen or URLVExpt as the basis for secondary testing of URLs from the error report. This generally reduces the number of URLs requiring maintenance attention by fifty to seventy-five percent.
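
Secondary testing of this kind can be sketched in Python as follows. This is not how MarcXGen, URLVExpt, or the third-party link checkers work internally; it is a minimal illustration that assumes the reported URLs have already been extracted into a list. The names `probe` and `retest` are hypothetical. The standard library's `urlopen` follows redirects on its own, so a URL that has moved is judged by the address it now resolves to.

```python
import http.client
import urllib.error
import urllib.request


def probe(url, timeout=10):
    """Fetch a URL, following redirects; return (final_url, status).

    Status is None when no HTTP response could be obtained at all.
    """
    request = urllib.request.Request(url, headers={"User-Agent": "url-recheck-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.geturl(), response.status
    except urllib.error.HTTPError as err:
        return url, err.code  # the server answered, but with an error status
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return url, None      # DNS failure, refused connection, malformed URL, etc.


def retest(reported_urls, probe_fn=probe):
    """Split URLs from an error report into recovered and still-failing lists."""
    recovered, still_bad = [], []
    for url in reported_urls:
        final_url, status = probe_fn(url)
        if status is not None and 200 <= status < 400:
            recovered.append((url, final_url))
        else:
            still_bad.append((url, status))
    return recovered, still_bad
```

Only the `still_bad` list would then need staff attention; a `recovered` entry whose final URL differs from the original is a candidate for updating the 856 field to the new address.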

In the last half of the presentation, Mr. Tyler showed several illustrations of how files of delimited data from Innopac and similar files from electronic resource vendors could be combined using a relational database to create an environment for URL maintenance and management. Mr. Tyler also showed a method for harvesting URLs en masse from the Internet. He spoke highly of the NoteTab text editor as a tool to extract URLs from web pages and to reformat text to a delimited format that could be used in a database.
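
The general workflow can be approximated in Python. The presentation used NoteTab and a relational database product that this summary does not name; the sketch below substitutes the standard library's HTML parser and an in-memory SQLite database, and the table layouts, column names, and function names (`harvest`, `missing_from_catalog`) are assumptions chosen for illustration.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkHarvester(HTMLParser):
    """Collect (absolute URL, link text) pairs from the anchors of a web page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())  # collapse whitespace
            self.links.append((self._href, title))
            self._href = None


def harvest(html, base_url):
    """Return the list of (url, title) pairs found in an HTML string."""
    parser = LinkHarvester(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def missing_from_catalog(catalog_rows, vendor_rows):
    """Return vendor (url, title) rows whose URL is not yet in the catalog.

    catalog_rows: (record_number, url) pairs, e.g. from a delimited export.
    vendor_rows:  (url, title) pairs, e.g. harvested or supplied by a vendor.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE catalog (record_number TEXT, url TEXT)")
    db.execute("CREATE TABLE vendor (url TEXT, title TEXT)")
    db.executemany("INSERT INTO catalog VALUES (?, ?)", catalog_rows)
    db.executemany("INSERT INTO vendor VALUES (?, ?)", vendor_rows)
    rows = db.execute(
        "SELECT v.url, v.title FROM vendor v "
        "LEFT JOIN catalog c ON c.url = v.url "
        "WHERE c.url IS NULL ORDER BY v.url"
    ).fetchall()
    db.close()
    return rows
```

Once both sources are in tables, the same join can be turned around to list catalog URLs that no longer appear in a vendor's current file, which is one way such a database supports ongoing maintenance as well as one-time loading.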

Following a brief illustration of how Innovative's text and graphical editors could be used to create or edit URLs in MARC records, he concluded with a summary of the University of Denver's "MarcOver" project, which, using the tools he had just described and a locally created load table for overlaying records, resulted in the addition of more than twenty-five thousand new URLs to the university's Innopac catalog.

Innopac libraries that have sent staff for Innovative's load table training have the capability to create custom loaders. Mr. Tyler noted that while the creation and use of an overlay loader is not for the faint of heart, it is a very powerful and effective tool for large-scale projects such as the one successfully attempted by his library to add URLs to many thousands of records.

Mr. Tyler illustrated this with a diagram that outlined the process used at the University of Denver.

Before the MarcOver project, the University of Denver catalog contained about 7,000 records with about 8,500 URLs. After the project, the catalog contained more than 57,000 URLs in 35,000 records.

The library was able to incorporate more than 22,000 URLs linking to the GPO Access databases for congressional publications, about 9,000 URLs linking to legislative histories from the Library of Congress' Thomas web site, links to almost 1,000 National Academy of Sciences publications, and links to about 500 CIA maps from the University of Texas library.

