
Sun Microsystem, 27 October 2000



Library Trends and Technology Leading Libraries into the Next Century


By Graeme Wilson, M.Sc. Information Management, University of Strathclyde; Bid Manager, EOS International

The intention of this paper is not necessarily to give answers to the challenges facing information managers and librarians at this time, but rather to point in a few directions or, if you prefer, towards some notable trends.

Let's look very briefly at adoption of technology within the library and information world from a historical viewpoint. Considering what has happened previously often helps us put into perspective what we have to deal with now.

Enabling technologies, that is, technologies that were both of the time and affordable to use, moved libraries and information centers into the computer age. The first part of this automation process was computerization of the circulation system, as this was perceived as a real need. A move towards automating authority retrieval, and thereafter the card catalog, followed a few years later. Meanwhile, work had been underway on information retrieval in the database and online world.

Previously, libraries had to depend largely on their own staff to prepare in-house catalog cards, or used one of the commercial services for this activity. The need for standardization, even in the manual age, led to cataloguing standards such as AACR2 [Anglo-American Cataloguing Rules, 2nd edition] and the ISBD [International Standard Bibliographic Description] format.

Computerization of these bibliographic descriptions led to a further requirement for standards of bibliographic description in MAchine-Readable Cataloguing (MARC) formats. This requirement was born, in part, out of non-standardization of data entry facilities in differing library automation systems but, more importantly, out of the requirement for an "exchange" medium for bibliographic data.

Computerization initially took place in large scale libraries. Disk space for storage of data was still a rare commodity and it was usually only larger libraries or profit-based corporations that were able to take advantage of computerization. But why did they do it? Was it immediately seen as a means of competitive advantage over other libraries? Such competition would surely only be warranted in a corporate-type environment and a significant number of first generation library automation sites were in the academic sector. Observations indicate that automation as a competitive advantage resource truly came into maturity once there could be value-added information stored with the bibliographic references. More of that later.

There has always been a spirit of cooperation within the academic field. Sure, there is high competition for grants and financial endowments at almost every level of academic life, but the concept of resource sharing is an old one in the academic community. It was realized that a centrally located store of bibliographic records in machine readable format could be used as a resource by many libraries. Libraries participating in the original cataloguing could maximize their investment in staff time: for a title held in print by their own library but already catalogued by a librarian at another site, they could simply import the existing bibliographic record. The cost benefits of this could even be realized by those libraries newer to automation who had to pay for the MARC records.

Time was the real benefit. Libraries are primarily service-oriented institutions - the level of service offered by librarians to their borrowers and patrons differs greatly depending on the type of library and the amount of resource available for offering the service. Some libraries are lucky enough to be able to charge fees to non-institutional patrons to augment their budget. Others are prohibited from doing so by law. The immediate time benefits were two-fold. Library automation systems had a data entry module, enabling library staff to enter records, holdings, etc. or to import records from other sources, and they also supplied an information retrieval engine, so that information could be found in the system. The types and quality of search available in early library automation software varied quite widely. It is generally agreed that the retrieval capabilities were limited at best. The major stumbling block, although not initially perceived as such, was that the ability to search was limited to the librarian or to those patrons who had been trained to use computer-style retrieval commands. Whatever the limitations, searching against a card catalogue could not stand comparison with the speed at which librarians were now able to retrieve relevant information for their patrons.

Library automation systems became firmly established and recognized as a beneficial technology for the librarian. As computer power increased and prices correspondingly decreased, the automation providers increased the scope of the library automation functions. This led to the integrated library management system (ILS). These systems enable library staff to perform almost all of their functions "on-line", often meaning that data entered in one part of an integrated system can be used again elsewhere, thus saving time and money and further ensuring accuracy. A typical ILS will supply a cataloguing module, OPAC, circulation control module, purchasing module, serials management module, import module and reports module. Others may also include facilities for inter-library loans.

By now, there has been a general homogenization of functionality between library systems; that is, a feature that you find in one system will probably be present in another. Of course, some systems have notable strengths in certain modules and, as such, have attracted more of a certain type of library than others. Academic libraries will be attracted to systems with excellent circulation control functionality. Libraries who are working within a consortium and who have multi-site branches may lean more towards a system that can take care of their special purchasing and distribution needs.

As the technology advanced, so did the library automation systems. The main focus was on improving the ability of the borrower to retrieve information from the library automation system. Command-driven retrieval was replaced by menu-driven retrieval; OPAC terminals were set up in libraries with options to perform simplified search strategies. The majority of systems were either on mainframe computers or on vendor-specific bespoke hardware. Those libraries with systems on mainframe computers could generally claim to be consistent with other hardware being used in their organization. Access to the database was through an OPAC terminal in the library. This was a dumb terminal with a monochrome display, typically green or orange. It was effective for retrieval and handling of text-based information and was treated as a success.

The next advance was to give desktop computer users access to the library over the organization-wide network. This meant that querying of the library database could be done remotely. Libraries had hitherto been running (and in some cases still are running) a suite of electronic online services for their patrons. There was access to the local catalog; in some cases there was electronic access to a citations index; online service provision was really only accessible to trained information scientists; and subject specialist libraries would often have a CD-ROM terminal set up to enable users to perform more specific content searches. The point is that these services were in almost all cases being offered from different points of access, sometimes locking a library into a bespoke system that, although providing mission critical services to its organization, was not within the scope and control of the in-house systems department. This could often cause problems in support and downtime of services.

In fact, it has proved to be the systems departments that have indirectly led to changes in the way the technology has affected libraries; technology-led solutions gradually became replaced by marketplace demand-driven solutions. For example, a systems department manager may impose the following constraints on a librarian:

Your library system must work within our current network setup. We are a UNIX-based operation. We demand that your system runs in client-server mode to optimize our investment in PCs and all your client software must be running in a Windows 95 environment. Your system must have sophisticated networking ability and be able to take advantage of a BootP or RARP server set-up. The system must work with or at least interface with our current e-mail system and it must be scalable so that the systems department budget can be spread, if necessary. Most importantly, this organization only runs with the Oracle RDBMS so we will automatically favor any automation vendor whose system uses Oracle.

That is just an example. I could have said Windows NT or a PC-based Novell LAN instead of UNIX. I could have used Sybase or Informix instead of Oracle. The point is, libraries are being forced to take account of these parameters and conditions. In some cases these constraints may even push them into a technical area that means the library system actually costs more than they might have budgeted. This is the trade-off when trying to conform to a now growing set of standards. These standards are not just the standards set by the library community - AACR2, MARC, Z39.50, ISO 10162/10163, etc. Librarians are having to pay attention to and conform to standards across the board; that is, all standards emerging in the library, publishing, information services, networking and communications, and hardware sectors.

This is the situation now. While this need to conform to standards has affected librarians and libraries to the extent that standards are dictating what they use in the way of automation, there has also been an explosive growth in the ways librarians can afford access to information for their users. This is, of course, access to information through the Internet.

Access to information resources on the Internet has been around for a long time. The Internet as we know it today has evolved from the ARPANET and other such Inter-Networking services that have grown since. Access to cooperative database services over the Internet was not very user friendly. In fact, the user generally had to be well versed in running a telnet session in order to be able to access information. Once accessed, a user would have to learn how to use the local search engine of the database in order to retrieve relevant information.

The practical use of this access medium in terms of patrons in libraries was (and still is) to enable patrons to access other libraries to search for materials that may assist in collating as much relevant information as may be required for a research paper, for example. To assist the patrons, the librarians generally set up instructions at the terminal or menu access through the network to specific libraries, known by the librarian to be likely to carry titles of interest to their patrons. In most cases, these libraries would already be in some kind of interlibrary loan agreement with the originating library, so as to make these online searches practical. We must remember that these searches were generally carried out as telnet (text-based) sessions.

The information community decided that it would be an advantage for differing types of retrieval system to be able to query one another without the user having to learn the search language of the remote retrieval system. A telnet session actually places you inside the target library database, and the user must use the remote database's search language. This is not always an easy thing to master. Instead, why not have a common search interface, or at least a protocol that would align all search engines to work similarly? Thus the Z39.50 standard was born. Z39.50 is an ANSI/NISO (US) standard, and there is a corresponding pair of international standards covering the same common retrieval parameters: ISO 10162 and ISO 10163. In fact, the Z39 series to which it belongs covers a wider set of standards that apply to libraries, information services and publishing in general, not just to library automation.

Access to a library collection using the Z39.50 retrieval standard can be via a direct connection or by the Internet. It depends on the modes of access allowed to you by your network and by the host library.
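The idea of a common search interface can be illustrated with a minimal Python sketch. It is not the Z39.50 protocol itself, which defines its own message formats and attribute sets; the two target systems, their command syntaxes and all names below are hypothetical, invented purely to show how one neutral query might be translated into differing local search languages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonQuery:
    """A system-neutral query: which field to search, and for what."""
    field: str  # "author", "title" or "subject"
    term: str


def to_system_a(query):
    # Hypothetical command-driven syntax, e.g. FIND AU=Wilson
    codes = {"author": "AU", "title": "TI", "subject": "SU"}
    return f"FIND {codes[query.field]}={query.term}"


def to_system_b(query):
    # Hypothetical prefix syntax, e.g. a/Wilson
    prefixes = {"author": "a", "title": "t", "subject": "s"}
    return f"{prefixes[query.field]}/{query.term}"


# One translator per remote system; the searcher only ever writes a CommonQuery.
TRANSLATORS = {"system_a": to_system_a, "system_b": to_system_b}


def translate(query, target):
    """Turn the neutral query into the search language of the target system."""
    return TRANSLATORS[target](query)
```

The searcher learns one query form; each remote system needs only a translator, which is the burden a shared protocol removes from the patron.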

While the information industry has been paying attention to retrieval standards, the online industry has been busy pushing ahead with technology developments for Internet users. The Internet, as we all know, does not just contain library catalogues. It contains full text databases, marketing lists, special interest discussion forums and online shopping. Online "browsers" were developed as a means for users to narrow their search criteria while keeping them in the familiar Windows environment. Users could subscribe to online hosts and use their browse engines. The benefit of this was that the online hosts would set up several other services for the user to access. These would be tailored specifically for certain user types and be very easy to access. This is the direction of agencies such as America Online, Prodigy, etc. Other Internet users were keener on the power within the browser itself, Netscape Navigator, for example, and what these browsers could do for them.

By now it is fairly standard practice to be able to place several search definitions in a template and then to choose your search engine of choice - Magellan, Alta Vista or Excite, for example. These will generally return anything between several hundred and several tens of thousands of potential results, usually displayed from the sites with the highest relevancy down to those with the lowest. So again we are faced with a problem - that is, actually finding information that really is relevant to the searcher. Because the browsers can handle frames, certificates, Java and JavaScript, the content appears to be more exciting. This may very well be the case, but it does not mean, per se, that the information found over the Internet is any more relevant than what a searcher could find in their own catalogue.

Librarians have not been slow to react to the enormous potential of the Internet as a resource provider. After all, librarians, like any other professionals, have to justify their service to their management in terms of quality and cost. Managing a library is no longer only about handling printed materials. Librarians must maintain their own print collections, of course, and I am not predicting the disappearance of print as the main source of information for users. But users are becoming much more demanding and sophisticated in their requirements. They require information in machine readable format; they require access to video, sound and all sorts of digitized media. Now, had librarians not been on top of the situation, management of non-print collections could have become the domain of a new sub-set of information professionals, whereby management and packaging of this information could have been organized as an activity separate from that of the library. I think you will agree that there would be no sense in that scenario. Who, after all, is better equipped than a trained librarian, who already has the requisite training, the technical know-how, and the ability to categorize, classify and search? It is now accepted that printed collections and audio-visual and sound collections must also share the arena with digitized information of all kinds. Previously, a librarian had control over their collection. They developed it through purchasing of new titles or through subscriptions, and relied on inter-library loans as a means of sourcing information not existing in their collection. This plethora of digitized information to manage in addition to the hard copy is causing librarians and information managers to become pivotal in how organizations seek data.
Technology now allows us to use a one stop access point for information retrieval, but unless the librarian can at least suggest some type of control for search parameters, organizations may be finding that their users are spending a lot of time on web browsers, having fun, but not actually getting much out of it.

That is precisely why IT departments and libraries are building towards developing organization intranets and organization extranets as primary sources of information retrieval for their users and "loose" searches over the Internet as a second option.

Making use of other enabling technologies is becoming critical to a library or information center's effectiveness. Those libraries and information centers currently subscribing to the centralized method of data storage and retrieval may want to think about what to do in the future. No longer are you going to be relying on your library holdings database as your sole source of information. You may want to think about how your data is physically stored. On one server or on an array of servers? What is the most effective way for your organization to handle data for retrieval purposes? Is it possible to divide resources in the organization to the extent that some servers are used as primary database machines, others as processors and others as communication servers in order to optimize performance wherever it is needed, whenever it is needed? How can your organization benefit from fat or thin clients? Does your organization plan to make use of emerging JavaStation/NetPC technologies and the enormous cost savings which they imply? What does this mean in terms of network traffic? Talking of networks, how does your organization expect to manage its intranets and issues of access to the Internet? These are the issues you must now consider.

Much has been written about intranets. They are of immense benefit to an organization's "off-site" workers, who have password access to their organization's intranet section over the web. They are also a benefit to the organization's information providers, who can be sure that they need only release the information once for all relevant parties to be able to access it.

Extranets have actually been around for a long time, although the word "extranet" is, in itself, a newer term. Extranets allow extension of intranet "privileges" to a select crowd of users, giving them access to certain areas within your own intranet for mutual benefit and allowing disparate organizations with non-competitive agendas to share information for a common good. As with intranets, this information is only accessible to those with permission to access it. By definition these are usually (but not always) secure sites. The information set up on these extranet "sites" can be searched in the same way as general Internet information - with browsers and Internet search engines. If an Internet user is allowed to access such a site, it is because the information is likely to be relevant to their requirements.

An Extranet service can also be defined as one allowing users who subscribe to the service to access information. This definition pre-supposes that an organization has identified a certain web site as being one that should be a primary source of specific information for Internet users in their organization. Such a service is offered by EOS International and Information Quest with our Q and IQ browser products. The practical use of the Q web browser permits library patrons to browse not only their own collection over the Internet but those of other libraries and Internet sources. The networking set-up of the Q browser will determine whether the access is one point access or ``menu access'' to Internet databases. The aim of IQ is to provide a single point of access to the world of electronic information. In fact, the aim is a little more focused than that. IQ offers access primarily, but not exclusively, to scientific, technical, medical and business journals in electronic form through the Internet, providing extensive research capabilities sought by individuals and organizations. IQ has the ability to search on any word within the article content of a publication. IQ is able to search the original journal text and then is able to deliver the selected original journal article, including any graphics, right to the user's desktop - either through existing subscription to the electronic journal or by a one-off purchase of an electronic article through the web browser's totally secure purchasing system.

IQ's content includes a database of over 7,000,000 tables of content from over 12,000 journals dating back to 1990. In addition, through existing relationships with over 25,000 publishers worldwide, Dawson, the parent company of IQ and EOS International, is adding electronic journal content as fast as publishers can make it available.

Besides the immense database at its disposal, what other powerful features can the EOS browsers boast? Our browsers use the Excalibur Search Engine, which uses two powerful indexing and retrieval technologies: Adaptive Pattern Recognition Processing and Natural Language Processing. They support query by example searches and allow building of successive queries based on results of previous searches. IQ supports individual and group user password administration, as well as IP security. It generates detailed user reports. It is designed to be able to link into existing library OPACs and EDI/EDIFACT on-line journal subscription systems.

So, what are the benefits that Q offers apart from being a one-stop point of access to information? Let us take the Excalibur search engine as a starting point. Q Series and IQ each offer a web browser, so the usual author, title and subject searches are available as standard. This type of search is available on most bibliographic web browsers and so there are no learning requirements for users. The browsers also offer a keyword search based on all content within the database. This will search through titles, tables of contents, abstracts and full-text articles for occurrences of the keyword. They will then display a list of hits for the user to browse. IQ is able to identify whether the user or the user's organization currently subscribes to the electronic version of the identified title. If they do, then the electronic content can be downloaded free of charge. If they do not, but want the article, the IQ copyright management system allows the user to download the electronic content and charges according to a pre-arranged agreement. This can be by secure online credit card processing, by deposit account or by billing (invoice) procedure. Organizations, in order to control indiscriminate downloading of journals by staff, can set up accounts to be accessible by approved staff only. In any case, usage statistics from all users are available and so the library staff are able to monitor which journals are being searched on a regular basis. This, in turn, can inform the library's purchasing decisions and thus maximize the power of the budget for printed materials.
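The download rules described above amount to a small decision procedure, which the following Python sketch tries to capture. It is an illustration only and not IQ's actual implementation: the class, field and function names are hypothetical, and the single flat price stands in for whatever pre-arranged agreement an organization has.

```python
from dataclasses import dataclass, field


@dataclass
class Organization:
    # Journals for which the organization holds an electronic subscription.
    subscribed_journals: set = field(default_factory=set)
    # Staff allowed to make one-off article purchases on the account.
    approved_purchasers: set = field(default_factory=set)
    # Hypothetical flat rate standing in for the pre-arranged agreement.
    price_per_article: float = 0.0


def download_decision(org, user, journal):
    """Return (allowed, charge) for one article request."""
    if journal in org.subscribed_journals:
        # Subscribed titles can be downloaded free of charge by any user.
        return True, 0.0
    if user in org.approved_purchasers:
        # Non-subscribed titles are charged, and only approved staff may buy.
        return True, org.price_per_article
    return False, 0.0
```

Logging each call to `download_decision`, whether or not it is allowed, would give the kind of usage statistics that can feed back into purchasing decisions.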

On displaying the list of hits from a web search, IQ will also indicate whether a particular title is part of the organization's current print holdings. This is possible because, as an added feature, IQ is able to store an organization's holdings list and match it against any search results.
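Matching a stored holdings list against search results could be sketched as follows. This assumes, purely for illustration, that both the hits and the holdings list are keyed by ISSN; the paper does not say which identifier IQ uses, and the function and field names are hypothetical.

```python
def normalize_issn(issn):
    """Strip the hyphen and unify case so that 1234-5678 equals 12345678."""
    return issn.replace("-", "").upper()


def flag_print_holdings(hits, holdings_issns):
    """Return copies of the hits, each marked with whether it is held in print."""
    held = {normalize_issn(issn) for issn in holdings_issns}
    return [
        {**hit, "in_print_holdings": normalize_issn(hit["issn"]) in held}
        for hit in hits
    ]
```

Each hit is copied rather than modified, so the original result list can still be reused for other displays.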

These browsers also allow the user to perform a query by example search. To make the nature of the search understandable to the average user, the query by example search is activated by pressing the "More like this" button. This works on occurrences of keywords, authors, etc. in the chosen article and searches the databases for other similar titles or titles by the same author. The hits are then displayed in order of relevancy for the user to browse. The power of this type of intuitive searching should not be underestimated.
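A query by example search can be approximated with a very simple keyword-overlap measure, sketched below in Python. This is a swapped-in technique (Jaccard similarity over title words) chosen for brevity, not the method Excalibur uses; the stop-word list and function names are hypothetical.

```python
import re

# A tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "and", "for", "with", "during"}


def keywords(text):
    """Lower-case the text and keep words longer than two letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOP_WORDS}


def more_like_this(example, candidates, top_n=5):
    """Rank candidate titles by keyword overlap with the chosen example."""
    target = keywords(example)
    scored = []
    for title in candidates:
        words = keywords(title)
        union = target | words
        # Jaccard similarity: shared keywords divided by all keywords.
        score = len(target & words) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, title))
    # Highest relevancy first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]
```

The user never sees the keywords or the scores; pressing one button on a chosen article is enough to produce a ranked list.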

Excalibur will also permit searches and return hits on misspelled words. This is because Excalibur has Binary Adaptive Pattern Recognition and so, for example, searching on human ceratonin will retrieve titles containing the proper spelling, human serotonin. This works on author names too, which can be of immense benefit to users having to search for authors of international origin.
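The effect of tolerant matching can be demonstrated with Python's standard `difflib` module. To be clear, this is ordinary string-similarity matching and not Excalibur's Adaptive Pattern Recognition Processing; the vocabulary, the cutoff value and the function name are assumptions made for the sketch.

```python
import difflib


def correct_query(query, vocabulary, cutoff=0.7):
    """Replace each query word with its closest indexed term, if one is close enough."""
    corrected = []
    for word in query.lower().split():
        # get_close_matches returns the best matches scoring at least `cutoff`.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)
```

With an index vocabulary containing "human" and "serotonin", `correct_query("human ceratonin", vocabulary)` would map the misspelled word onto the indexed term before the search is run. The same approach can be applied to a list of author names.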

But perhaps the greatest strength behind the Excalibur search engine is the ability for the user to enter queries in natural language. This natural language processing is also described as CONCEPT searching. Concept searching is not just about allowing users to enter a search as a standard sentence; the search engine is looking for concepts and word relationships, not just words. This allows users to enter phrases as well, so a search could be performed on a complete sentence like:

Why do ganglia degenerate during normal development?

and also on a phrase like

effects of modifying endocrine levels in mammals

In the process of executing natural language searches you can choose, if you wish, to turn on a feature called WORD EXPANSION. This enables you, the user, to make intellectual decisions, based on your knowledge of your chosen subject, to further define filters in the search. Thus the user is able to specify that in the search "Why do ganglia degenerate during normal development?" the word degenerate is a verb, and to discount any hits with degenerate as a noun. The principal benefit of defining words in WORD EXPANSION is to prevent false hits. To researchers, finding too many articles, most of which are not relevant, is a bigger problem than not finding enough articles.

Although I have concentrated on EOS International and its products in order to describe these latest trends, the criteria I discuss for retrieving and managing information are general. I hope this paper has helped the reader to understand the uses of current, and a little future, technology in the information world.

Copyright© April 1997 Graeme Wilson: gwilson@eosintl.com


©2000 InfoPerpus. Any request? Please send to: librarian@infoperpus.8m.com