Digital Libraries: Chapter 5 (1999)

Chapter 5
People, organizations, and change

People and change

This is the first of four chapters that examine the relationship between people, organizations, and digital libraries. This chapter examines how individuals and organizations are responding to the changes brought by new technology. Chapter 6 looks more specifically at economic and legal issues. It is followed by a chapter on the management of access to digital libraries, and related issues of security. Finally, Chapter 8 examines user interfaces, which are the boundary between people and computers.

The story of digital libraries is one of change. Authors, readers, librarians, publishers and information services are adopting new technology with remarkable speed; this is altering relationships among people. Every organization has some members who wish to use the most advanced systems (even when not appropriate) and others who demand the traditional ways of doing things (even when the new ones are superior). This is sometimes called the "generation gap", but the term is a misnomer, since people of all ages can be receptive to new ideas. The pace of change, also, differs widely amongst organizations and disciplines. Some corporate libraries, such as those of drug companies, already spend more than half their acquisitions budgets on electronic materials and services, while, for the foreseeable future, humanities libraries will be centered around collections of printed material, manuscripts, and other tangible items.

Perhaps the most fundamental change is that computing is altering the behavior of the creators and users of information. Tools on personal computers allow individuals with a modicum of skill to carry out processes that previously required skilled craftsmen. Word processing and desktop publishing have made office correspondence almost as elegant as professionally designed books. Figures, graphs, and illustrations can be created in full color. Not every creator wishes to learn these techniques, and some privately produced efforts have miserably poor design, but many people create elegant and effective materials with no professional assistance. This has an impact on every information professional - publisher, librarian, archivist, indexer and cataloguer, or webmaster.

A few people argue that the new technology removes the need for professionals to manage information. That is naive. Publishers and libraries perform functions that go far beyond the management of physical items. Services such as editing and refereeing, or abstracting and indexing are not tied to any particular technology. Although the web permits users to mount their own information, most people are pleased to have support from a professional webmaster. The overall need for information professionals will continue, perhaps grow, even as their specific practices change with the technology, but the new forms of organizations and the new types of professional that will emerge are open to speculation.

Digital libraries created by the users

Some of the most successful digital libraries were created by researchers or groups of professionals for themselves and their colleagues, with minimal support from publishers or librarians. Chapter 2 described two of these, the Physics E-Print Archives at the Los Alamos National Laboratory and the Internet RFC series. The panels in this section describe three more: the Netlib library of mathematical software, the data archives of the Inter-university Consortium for Political and Social Research, and the Perseus collections of classical texts. These digital library collections are well-established and heavily used. They employ professionals, but the leadership and most of the staff come from the respective disciplines of physics, computing, applied mathematics, the social sciences, and classics.

Digital libraries that were created by the user communities are particularly interesting because the services have been constructed to meet the needs of the disciplines, without preconceived notions of how collections are conventionally managed. When creators and users develop the systems that they want for their own work, they encounter the normal questions about organizing information, retrieving it, quality control, standards, and services that are the lifeblood of publishing and libraries. Sometimes, they find new and creative answers to these old questions.

Panel 5.1.
Netlib

Netlib is a digital library that provides high-quality mathematical software for researchers. It was founded in 1985 by Jack Dongarra and Eric Grosse, who have continued as editors-in-chief. It is now maintained by a consortium based at Bell Laboratories, the University of Tennessee, and Oak Ridge National Laboratory, with mirror sites around the world.

The original focus of Netlib was the exchange of software produced from research in numerical analysis, especially software for supercomputers with vector or parallel architectures. The collections now include other software tools, technical reports and papers, benchmark performance data, and professional information about conferences and meetings. Most of the material in Netlib is freely available to all, but some programs have licensing restrictions, e.g., payment is required for commercial use.

The technical history of Netlib spans a period of rapid development of the Internet. Beginning with an electronic mail service, at various times Netlib has provided an X-Windows interface, anonymous FTP, CD-ROMs, and Gopher services. Currently, it uses web technology. The Netlib team continues to be among the leaders in developing advanced architectures for organizing and storing materials in digital libraries.

The organization of the Netlib collections is highly pragmatic. It assumes that the users are mathematicians and scientists who are familiar with the field and will incorporate the software into their own computer programs. The collections are arranged in a hierarchy, with software grouped by discipline, application, or source. Each collection has its own editor and the editors use their knowledge of the specific field to decide the method of organization. Netlib has developed a form of indexing record that is tailored to its specialized needs and the collections are also classified under the Guide to Available Mathematical Software (GAMS), which is a cross-index provided by the National Institute of Standards and Technology.

Netlib is a success story. Hundreds of thousands of software programs are downloaded from its sites every year, contributing to almost every branch of scientific research.

Panel 5.2
Inter-university Consortium for Political and Social Research

The Inter-university Consortium for Political and Social Research (ICPSR), based at the University of Michigan, has been in continuous operation since 1962. The archive provides social scientists with a digital library to store collections of data that they have gathered, for others to use. Data that was expensive to gather will lose its usefulness unless it is organized, documented, and made available to researchers. The ICPSR collects research data in a broad range of disciplines, including political science, sociology, demography, economics, history, education, gerontology, criminal justice, public health, foreign policy, and law.

About two hundred data sets are added to the archive every year, with several thousand files of data. Some of these data sets are very large. In addition to the data itself, the archive stores documentation about the data, and codebooks, which explain the design of the study, decisions made by the original researchers, how they gathered the data, any adjustments made, and the technical information needed to use it in further research. The collection is organized hierarchically by discipline for easy browsing. Each data set is described by a short record which contains basic cataloguing information and an abstract.

The archive has been in existence through many generations of computer system. Currently it has a web-based user interface, which can be used for browsing or searching the catalog records. Data sets are delivered over the Internet by FTP, with selected data available on CD-ROM.

ICPSR is a not-for-profit consortium. Hundreds of colleges and universities in the United States and around the world are members. Their annual subscriptions provide access to the collections for their faculty and students. People whose organizations do not belong to ICPSR can pay for the use of individual data sets.

Libraries and museums play a special role for users in the humanities, because they provide the raw material on which the humanities are based. Digital libraries can provide much wider access to these materials than could ever be provided by physical collections. A university with a fine library will always have an advantage in humanities teaching and research, but it need not have exclusive use of unique items. The British Library is beginning to digitize treasures such as the Magna Carta and the manuscript of Beowulf, and to mount them on the web. In the past, access to such documents was restricted to scholars who visited the library or to occasional, expensive facsimile editions. In the future, everybody will be able to see excellent reproductions. Panel 5.3 describes Perseus, a digital library of classical materials organized to be accessible to people who are not specialists in the field.

Panel 5.3.
Perseus

Some of the leading projects in electronic information have been led by established faculty in the humanities, but many have been maverick projects with little institutional support. Sometimes junior faculty members have pursued new ideas against the opposition of senior members in their departments. In the mid-1980s, while a junior faculty member in classics at Harvard University, Gregory Crane began work on the project known as Perseus, which uses hyperlinks to relate sources such as texts and maps with tools such as dictionaries. In particular, Crane aimed to give the general student an appreciation of the poems of the Greek poet Pindar. From this early work has emerged one of the most important digital libraries in the humanities.

The collections now have comprehensive coverage of the classical Greek period and are extending steadily into other periods of Greek history, the Roman period, and beyond. Source materials include texts, in both the original language and in translation, and images of objects, such as vases, and architectural sites. However, perhaps the greatest resource is the effort that has been made in structuring the materials and the database of links between items.

In Perseus, an emphasis on the content has enabled the collections to migrate through several generations of computer system. Classical texts are fairly stable, though new editions may have small changes, and supporting works such as lexicons and atlases have a long life. Therefore, the effort put into acquiring accurate versions of text, marking them up with SGML, and linking them to related works is a long-term investment that will outlive any computer system. Perseus has never had more than one programmer on its staff, but relies on the most appropriate computer technology available. It was an early adopter of Apple's Hypercard system, published a high-quality CD-ROM, and quickly moved to the web when it emerged. The only elaborate software that the project has developed is a set of rule-based systems to analyze the morphology of inflected Greek and Latin words.

The long-term impact of Perseus is difficult to predict, but its goals are ambitious. In recent years, academic studies in the humanities have become increasingly esoteric and detached. It is typical of the field that Crane was unable to continue his work at Harvard, because it was not considered serious scholarship; he moved to Tufts University. Perseus may not be Harvard's idea of scholarship but it is certainly not lightweight. The four million words of Greek source texts include most of the commonly cited texts; when there are no suitable images of a vase, Perseus has been known to take a hundred new photographs; the user interface helps the reader with easy access to translations and dictionaries, but has a strong focus on the original materials. Perseus is a treasure trove for the layman and is increasingly being used by researchers as an excellent resource for traditional studies. Hopefully, Perseus's greatest achievement will be to show the general public the fascination of the humanities and to show the humanities scholar that popularity and scholarship can go hand in hand.

The motives of creators and users

Creators

An understanding of how libraries and publishing are changing requires an appreciation of the varied motives that lead people to create materials and others to use them. A common misconception is that people create materials primarily for the fees and royalties that they generate. While many people make their livelihood from the works that they create, other people have different objectives.

Look at the collections in any library; large categories of material were created for reasons where success is measured by the impact on the readers, not by revenue. Charles Darwin's The Origin of Species was written to promulgate ideas; so were Tom Paine's The Rights of Man, Karl Marx's Das Kapital, and St. Paul's Epistle to the Romans. Since classical times, books, manuscripts, pictures, musical works and poems were commissioned for personal aggrandizement; many of the world's great buildings, from the pyramids to the Bibliothèque de France, were created because of an individual's wish to be remembered. Photographs, diaries, poems, and letters may be created simply for the private pleasure of the creator, yet subsequently be important library items. Few activities generate so many fine materials as religion; conversely, few activities create as much of dubious merit. The Christian tradition of fine writing, art, music, and architecture is repeated by religions around the world. The web has large amounts of material that are advertising or promotional, some of which will eventually become part of digital libraries.

The act of creation can be incidental to another activity. A judge who delivers a legal opinion is creating material that will become part of digital libraries. So is a museum curator preparing an exhibition catalog, or a drug researcher filing a patent claim. Materials are created by government agencies for public use. They range from navigational charts and weather forecasts, to official statistics, treaties, and trade agreements. Many of the materials in libraries were created to provide a record of some events or decisions. They include law reports, parish records, government records, official histories, and wartime photographs. Preserving an official record is an important function of archives.

People who convert material to digital formats from other media can also be considered creators; conversion activities range from an individual who transcribes a favorite story and mounts it on the web, to projects that convert millions of items. The actual act of creation can even be carried out by a machine, as when images are captured by a satellite circling the earth.

A second misconception is that creators and the organizations they work for have the same motives. Some works are created by teams, others by individuals; a feature film must be a team effort, but nobody would want a poem written by a committee. Hence, while some creators are individuals, such as free-lance writers, photographers, or composers, others belong to an organization and the materials that they create are part of the organization's activities. When somebody is employed by an organization, the employer often directs the act of creation and owns the results. This is called a "work for hire". In this case, the motivations of the individual and the organization may be different. A corporation that makes a feature film could be motivated by profit, but the director might see an opportunity to advocate a political opinion, while the leading actress has artistic goals.

Creators whose immediate motive is not financial usually benefit from the widest possible exposure of their work. This creates a tension with their publishers, whose business model is usually to allow access only after payment. Academic journals are an important category of materials where the author's interests can be in direct conflict with those of the publisher. Journal articles combine a record of the author's research with an opportunity to enhance the writer's reputation. Both objectives benefit from broad dissemination. The tension between creators who want wide distribution and the publishers' need for revenue is one of the themes of Chapter 6.

Users

Library users are as varied as creators in their interests and levels of expertise. Urban public libraries serve a particularly diverse group of users. For some people, a library is a source of recreational reading. For others, it acts as an employment center, providing information about job openings, and bus timetables for commuters. The library might provide Internet connections that people use as a source of medical or legal information. It might have audio-tapes of children's stories, and reference materials for local historians, which are used by casual visitors and by experts.

Individuals are different and a specific individual may have different needs at different times. Even when two users have similar needs, their use of the library might be different. One person uses catalogs and indexes extensively, while another relies more on links and citations. Designers of digital libraries must resist the temptation to assume a uniform, specific pattern of use and create a system specifically for that pattern.

Among the diversity of users, some broad categories can be distinguished, most of which apply to both digital and conventional libraries. One category is that people use libraries for recreation; in digital libraries this sometimes takes the form of unstructured browsing, colloquially known as "surfing". Another common use of a library is to find an introductory description of some subject: an engineer begins the study of a technical area by reading a survey article; before traveling, a tourist looks for information about the countries to be visited. Sometimes a user wants to know a simple fact. What is the wording of the First Amendment? What is the melting point of lead? Who won yesterday's football game? Some of these facts are provided by reference materials, such as maps, encyclopedias and dictionaries; others lie buried deep within the collections. Occasionally, a user wants comprehensive knowledge of a topic: a medical researcher wishes to know every published paper that has information about the effects of a certain drug; a lawyer wishes to know every precedent that might apply to a current case.

In many of these situations, the user does not need specific sources of information. There will be several library objects that would be satisfactory. For example, in answering a geographic question, there will be many maps and atlases that have the relevant information. Only for comprehensive study of a topic is there less flexibility. These distinctions are important in considering the economics of information (Chapter 6), since alternative sources of information lead to price competition, and in studying information retrieval (Chapter 10), where comprehensive searching has long been given special importance.

The information professions and change

Librarians and change

As digital information augments and sometimes replaces conventional methods, the information professions are changing. Librarians and publishers, in particular, have different traditions, and it is not surprising that their reactions to change differ. To examine how change affects librarians, it is useful to examine four aspects separately: library directors, mid-career librarians, the education of young librarians, and the increasing importance in libraries of specialists from other fields, notably computing.

Library directors are under pressure. To be director of a major library used to be a job for life. It provided pleasant work, prestige, and a good salary. The prestige and salary remain, but the work has changed dramatically. Digital libraries offer long-term potential but short-term headaches. Libraries are being squeezed by rising prices across the board. Conservative users demand that none of their conventional services be diminished, while other users want every digital service immediately. Many directors do not receive the support that they deserve from the people to whom they report, whose understanding of the changes in libraries is often weak. Every year a number of prominent directors decide not to spend their working life being buffeted by administrative confusion and resign to find areas where they have more control over their own destiny.

Mid-career librarians find that digital libraries are both an opportunity and a challenge. There is a serious shortage of senior librarians who are comfortable with modern technology. This means that energetic and creative individuals have opportunities. Conversely, people who are not at ease with technology can find that they get left behind. Adapting to technical change is more than an issue of retraining. Training is important, but it fails if it merely replaces one set of static skills with another. Modern libraries need people who are aware of the changes that are happening around them, inquisitive, and open to discovering new ideas. Panel 5.4 describes one attempt to educate mid-career librarians, the Ticer summer school run by Tilburg University in the Netherlands.

Panel 5.4
The Ticer Summer School

In 1996, Tilburg University in the Netherlands introduced a summer school to educate senior librarians about digital libraries. The program was an immediate success and has been fully subscribed every year. Students for the two-week course come from around the world. The largest numbers come from northern Europe, but, in 1998, there were students from Malaysia, Japan, India, South Korea and many other countries. Most are senior staff in academic or special libraries.

Tilburg University has been a leader in digital library implementation for many years; the course reflects the combination of strategic planning and practical implementation that has marked the university's own efforts. Many of the lecturers at the summer school are members of the libraries or computing services at Tilburg. In addition, a range of visiting experts provide breadth and visibility for the program. The fat book of lecture notes that every student receives is a major resource.

Some other features have led to the success of the Ticer school. The costs have been kept reasonably low, yet a high standard of facilities is provided. A pleasant social program enhances the value of fifty people from around the world living and working together for two weeks. Ticer has a close relationship with Elsevier Science, the large Dutch publisher. Elsevier staff from around the world attend as students and senior Elsevier personnel give lectures. Finally, in a country where the weather is unpredictable, Ticer always seems to provide warm summer weather where people can sit outside, relax and work together.

Ticer demonstrates how international the field of digital libraries has become and the privileged position of the English language. The Ticer program is so thoroughly English-language that the publicity materials do not even mention that English is the language of the summer school, yet few students come from the English-speaking countries.

Ticer is considering other programs. In 1998, several students expressed frustration that, while they were learning a great deal from the summer school, the directors of their libraries needed to hear the same story. Perhaps Ticer could offer a shortened program for executives.

The education of young librarians revolves around library schools. Librarianship is a profession. In the United States, a master's degree from a library school is a requirement for many library jobs. For years, the curriculum of library schools was rather pedestrian, centered around the basic skills needed by mid-level librarians. In many universities, the library school was one of the weaker schools academically. Over the past few years, universities have realized that digital libraries provide opportunities for a new type of library school, with a new curriculum and a vigorous program of research. Some library schools are being rebuilt to focus on the modern world. Panel 5.5 describes one of them, at the University of California at Berkeley.

Panel 5.5
The School of Information Management and Systems at the University of California at Berkeley

Early in the 1990s, several leading universities questioned the quality of their library schools. Other professional schools, such as law and business, were attracting the nation's brightest students and faculty, contributing outstanding research, and making a hefty profit. Library schools were drifting. The curriculum was more suitable for a trade school than a university. The faculty were underpaid and unproductive. Worst of all, the educational programs were not creating the leaders that the new digital libraries and electronic publishing would require.

Faced with this situation, Columbia University simply closed down its library school. The programs were considered to be less valuable than the buildings that the school occupied. The University of Chicago also closed its library school. The University of California at Berkeley and the University of Michigan went the other way and completely refurbished their schools. Perhaps the most dramatic change was at Berkeley.

The decision to create a new school

Many universities claim that academic decisions are made by the faculty. In most universities, this is pretense, but at Berkeley the academic community has great power. Berkeley's first instinct was to follow Columbia and to close down its library school, but enough people were opposed that the university changed its mind and set up a high-level planning group. The report of this group, Proposal for a School of Information Management and Systems, was released in December 1993. In classic bureaucratic doublespeak, the fundamental recommendation was to "disestablish" the existing school and "reconstitute" a new school from its ashes.

In creating a new academic program, good universities look at two factors. The first is the academic content. Is this an area with deep intellectual content that will attract first-rate faculty, whose scholarship will be a credit to the university? The second is education. Will the program attract able students who will go out and become leaders? The report answered these questions emphatically, but only if certain criteria were met.

As a starting point the report explicitly rejected the traditional library school curriculum and the master of library science degrees. With remarkable frankness, the report stated, "The degree to be awarded by this program ... is not designed to meet American Library Association requirements; rather, it will serve as a model for the development of accreditation criteria for the emerging discipline upon which the School is focused."

An inter-disciplinary program

The report was accepted and the school set out to recruit faculty and students. From the start the emphasis was on inter-disciplinary studies. The planning report is full of references to joint programs with other schools and departments. The program announcement for the new master's program mentioned, "aspects of computer science, cognitive science, psychology and sociology, economics, business, law, library/information studies, and communications." Students were required to have significant computer skills, but no other constraints were placed on their previous academic background.

The appointment of faculty was equally broad. The first dean, Hal Varian, is an economist. Other faculty members include a prominent scholar in copyright law and a former chair of the Berkeley computer science department. Many of the appointments were joint appointments, so that the faculty teach and carry out research across traditional departmental lines.

Success

It is too early to claim success for this new school or the similar effort at the University of Michigan, but the first signs are good. The school has met the basic requirements of high-quality faculty and students. The research programs have grown fast, with large external grants for research on interesting topics. Hopefully, a few years from now the first graduates will be emerging as the leaders of the next generation of information professionals.

Digital libraries require experts who do not consider themselves professional librarians, such as computer specialists and lawyers. Fitting such people into the highly structured world of libraries is a problem. Libraries need to bring in new skills, yet libraries are hesitant to recruit talent from outside their ranks, and needlessly restrict their choices by requiring a library degree from candidates for professional positions. Few of the young people who are creating digital libraries see library school on their career path. Compared with other professions, librarianship is notable for how badly the mid-level professionals are paid. Top-class work requires top-class people, and in digital libraries the best people have a high market value. It is troubling to pay a programmer more than a department head, but it is even more troubling to see a good library deteriorate because of a poor computing staff. Part of the success of the Mercury project at Carnegie Mellon was that the technical staff were administratively members of the university's computing center. Their offices were in the library and their allegiance was to the digital library, but they were supervised by technical managers, worked their usual irregular hours, had excellent equipment, and were paid the same salary as other computing professionals. Few libraries are so flexible.

Publishers and change

The changes that are happening in publishing are as dramatic as those in libraries. Since the fifteenth century, when printing was introduced in Europe, publishing has been a slow-moving field. The publisher's task has been to create and distribute printed documents. Today, the publishing industry still draws most of its revenue from selling physical products - books, journals, magazines, and more recently videos and compact disks - but many publishers see their future growth in electronic publications.

Publishing is a mixture of cottage industry and big business. Large publishers, such as Time Warner, Reed Elsevier, and the Thomson group, rank as some of the biggest and most profitable corporations in the world, yet most of the 50,000 publishers in the United States publish fewer than ten items per year. Academic publishing has the strange feature that some publishers are commercial organizations, in business to make profits for their shareholders, while others are not-for-profit societies and university presses whose primary function is to support scholarship.

Success in publishing depends upon the editors who select the materials to be published, work with the creators, and oversee each work as it goes through production. Publishing may be a business, but many of the people who come to work in publishing are not looking for a business career. They enter the field because they are interested in the content of the materials that they publish.

Publishers use sub-contractors extensively. A few, such as West Publishing, the big legal publisher, run their own printing presses, but most printing is by specialist printers. Services that support the editor, such as copy-editing, are often carried out by free-lance workers. Books are sold through booksellers, and journals through subscription agents. This use of contractors gives publishers flexibility that they would not have if everything were carried out in-house. When Elsevier moved to producing journals in SGML mark-up, they could withdraw contracts from those printers who were not prepared to change to SGML. Recently, the publishing industry has had a wave of corporate takeovers. This has created a few huge companies with the wealth to support large-scale computing projects. The future will tell whether size is necessarily a benefit in electronic publishing. Web technology means that small companies can move rapidly into new fields, and small companies sometimes have an advantage in developing close relationships between editors and authors. Some observers considered that West Publishing, which had been privately held, decided to be sold out of fear that electronic publishing might weaken its dominance of the legal market.

Almost every university has a university press, but university publishing is less central to the university than its library. A few, such as Oxford University Press and the University of Chicago Press, have much in common with commercial publishers of academic materials. They publish large numbers of general interest, reference, and scholarly books. These presses operate on a sound financial footing and give no priority to authors from their own universities. Other university presses have a different role. Most scholarly monographs in the humanities have such narrow appeal that only a few hundred copies are sold. Such books would never be considered by commercial publishers. They are published by university presses, which operate on shoestring budgets, usually with subsidies from their host universities. As universities have been short of money during the past few years, these subsidies have been disappearing.

Computer professionals, webmasters, and change

Computing professionals are as much a part of digital libraries as are librarians and publishers. Success requires cooperation between these professions, but the cultural differences are great. The members of the Internet community are a wonderful resource, but they have an unorthodox style of working. Many have grown up immersed in computing. As in every discipline, some people are so knowledgeable about the details that they see nothing else and have difficulty in explaining them to others. A few people are deliberately obscure. Some technical people appear to define merit as knowledge of the arcane. Fortunately, deliberate obscurity is rare. Most people would like others to know what they do, but technical people often have genuine difficulty in describing technology to non-specialists.

The Internet community has its foundation in the generation of computer scientists who grew up on the Unix operating system. Much of the success of Unix came from a tradition of openness. The Unix, Internet, and web communities share their efforts with each other. They have discovered that this makes everybody more productive. An appreciation of this spirit of openness is fundamental to understanding how computer scientists view the development of digital libraries. This attitude can become a utopian dream, and a few idealists advocate a completely uncontrolled information society, but this is an extreme viewpoint. Most members of the Internet community have a healthy liking for money; entrepreneurship is part of the tradition, with new companies being formed continuously.

Even in the rapidly changing world of computing, the emergence of webmaster as a new profession has had few parallels. At the beginning of 1994, the web was hardly known, yet, in summer 1995, over one thousand people attended the first meeting of the Federal Webmasters in Washington, DC. This is an informal group of people whose primary job is to maintain web sites for the U.S. government. The emergence of the name "webmaster" was equally rapid. It appeared overnight and immediately became so firmly entrenched that attempts to find a word that applies equally to men and women have failed. Webmaster must be understood as a word that applies to both sexes, like librarian or publisher.

A webmaster is a publisher, a librarian, a computer administrator, and a designer. The job requires a range of expertise that includes the traditional publishing skills of selection of material and editing, with the addition of user interface design and the operation of a high-performance computer system. A web site is the face that an organization presents to the world. The quality of its graphics and the way that it presents the material on the site are as important as any public relations material that the organization issues. At CNRI, the webmaster refers to herself, half-jokingly, as "the Art Department". She is a professional librarian who is highly skilled technically, but she spends much of her time carrying out work that in other contexts would be called graphic design. Her designs are the first impression that a user sees on visiting CNRI's web site.

In some organizations, the webmaster selects, and even creates, the materials that appear on a web site. More commonly, the materials are generated by individuals or groups within the organization. The webmaster's task is to edit and format individual items, and to organize them within the web site. Thus the overall shape of the site is created by the webmaster, but not the individual items. For example, CNRI manages the web site of the Internet Engineering Task Force (IETF). The content is firmly controlled by the IETF secretariat, while the webmaster contributed the design of the home page and the links with other items on the web site, so that a user who arrives at any part of the site has a natural way to find any of the material.

Webmasters vary in their range of computing skills. The material on a web site can range from simple HTML pages to complex ones. Some user interface methods, such as the Java programming language, require skilled programmers. Some web sites are very large. They replicate information on many powerful computers, which are distributed across the world. A really popular site, such as Cable News Network, has over one hundred million hits every day. Designing and monitoring these systems and their network connections is a skilled computing job. If the web site handles commercial transactions, the webmaster needs expertise in network security.

Many of the organizations that contribute to digital libraries have computing departments. The webmasters can rely on them for the occasional technical task, such as setting up a web server or a searchable index to the site. In other organizations, the webmaster must be a jack of all trades. Many web sites serve organizations that have no computing staff. A plethora of companies provide web services for such organizations. They design web pages, run server computers, and perform administrative tasks, such as registration of domain names.

New forms of organization

Consortia

Managing large online collections is expensive and labor-intensive. Libraries can save effort by working together in consortia to acquire and mount shared collections. This saves effort for the libraries and also for publishers since they have fewer customers to negotiate with and support. In the United States, consortia have been organized by states, such as Ohio, or by academic groupings. In Europe, where most universities are state run, there are well-established national consortia that provide digital library services for the entire academic community.

Panel 5.6 describes MELVYL, which is a good example of a collaborative effort. MELVYL was established by the University of California, before the days of the web, to provide services that concentrated on sharing catalog and indexing records. When the web emerged and publishers began to supply the full text of journals, the organization and technical structures were in place to acquire these materials and deliver them to a large community of users. This panel also describes the California Digital Library, a new venture built on the foundations of MELVYL.

Panel 5.6
MELVYL

The nine campuses of the University of California often act as though they were nine independent universities. Campuses such as the University of California, Berkeley, and the University of California, Los Angeles (UCLA), rank as major universities in their own right, but organizationally they are parts of a single huge university. They have shared a digital library system, MELVYL, for many years. For much of its life, MELVYL was under the vigorous leadership of Clifford Lynch.

At the center of MELVYL is a computer-based catalog of holdings of all libraries in the nine campuses, the California State Library in Sacramento, and the California Academy of Sciences. This catalog has records of more than 8.5 million monographic titles, representing 13 million items. In addition to book holdings, it includes materials such as maps, films, musical scores, data files, and sound recordings. The periodicals database has about 800,000 unique titles of newspapers, journals, proceedings, etc., including the holdings of other major California libraries. MELVYL also provides access to numerous article abstracting and indexing files, including the entire Medline and Inspec databases. In 1995, MELVYL added bit-mapped images of the publications of the Institute of Electrical and Electronics Engineers (IEEE). The images were linked through the Inspec database. Users who accessed the Inspec database could see the message "Image available" for records with linked IEEE bit-mapped images. The user could then request the server to open a window on the user's workstation to display the bit-mapped images. Use of the books and periodicals files is open to everybody. Use of the article databases is limited to members of the University of California.

MELVYL has consistently been an early adopter of new digital library technology. Much of the development of Z39.50 has been associated with the service. The MELVYL team was also responsible for creating the communications network between the University of California campuses.

The California Digital Library

The success of MELVYL helped bring about the creation in 1998 of an even bolder project, the California Digital Library. This is the University of California's approach to the organizational challenges posed by digital libraries.

Each of the nine campuses has its own library and each recognizes the need to provide digital library services. After a two-year planning process, the university decided in 1997 to create a tenth library, the California Digital Library, which will provide digital library services to the entire university. Organizationally this digital library is intended to be equal to each of the others. Its director, Richard Lucier, ranks equally with the nine other directors; its initial budget was about $10 million and is expected to rise sharply.

The university could easily have justified central digital library services through arguments of economies of scale, a focus for licensing negotiations, and leveraged purchasing power. For these reasons, the campuses have historically shared some library services, notably MELVYL, which is incorporated in the new digital library. The California Digital Library is much more ambitious, however. The university anticipates that digital libraries will transform scholarly communication. The digital library is explicitly charged with being an active part of this process. It is expected to have a vigorous research program and to work with organizations everywhere to transform the role of libraries in supporting teaching and research. These are bold objectives, but the organization of the library almost guarantees success. At the very least, the University of California will receive excellent digital library services; at the best the California Digital Library will be a catalyst that changes academic life for everybody.

Secondary information providers and aggregators

The term secondary information covers a wide range of services that help users find information, including catalogs, indexes, and abstracting services. While many publishers are household names, the suppliers of secondary information are less well known. Some, such as Chemical Abstracts, grew out of professional societies. Others have always been commercial operations, such as ISI, which publishes Current Contents and Science Citation Index, and Bowker, which publishes Books in Print. OCLC has a special niche as a membership organization that grew up around shared cataloguing.

These organizations are simultaneously vulnerable to change and well-placed to expand their services into digital libraries. Their advantages are years of computing experience, good marketing, and large collections of valuable data. Many have strong financial reserves or are subsidiaries of conglomerates with the money to support new ventures. Almost every organization sees its future as integration between secondary information and the primary information. Therefore there are numerous joint projects between publishers and secondary information services.

Aggregators are services that assemble publications from many publishers and provide them as a single package to users, usually through sales to libraries. Some had their roots in the early online information systems, such as Dialog and BRS. These services licensed indexes and other databases, mounted them on central computers with a specialized search interface, and sold access. Nowadays, the technical advantage that aggregators bring is comparatively small, but they provide another advantage. A large aggregator might negotiate licenses with five thousand or more publishers. The customer has a single license with the aggregator.
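
The licensing advantage can be made concrete with a little arithmetic. The sketch below is an illustration only: the figure of five thousand publishers comes from the text, while the number of customer libraries is a hypothetical value, and the calculation assumes that every library wants material from every publisher.

```python
def licenses_without_aggregator(publishers: int, libraries: int) -> int:
    """Every library negotiates its own license with every publisher."""
    return publishers * libraries


def licenses_with_aggregator(publishers: int, libraries: int) -> int:
    """Each publisher and each library holds one license with the aggregator."""
    return publishers + libraries


publishers = 5000  # figure quoted in the text
libraries = 100    # hypothetical number of customer libraries

direct = licenses_without_aggregator(publishers, libraries)
via_aggregator = licenses_with_aggregator(publishers, libraries)
print(direct, via_aggregator)
```

Under these assumptions the number of agreements grows with the product of the two numbers when libraries deal directly with publishers, but only with their sum when an aggregator stands in the middle.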

Universities and their libraries

Changes in university libraries

Like most organizations, universities have difficulty in handling change. Universities are large, conservative organizations. The senior members, the tenured faculty, are appointed for life to work in narrowly defined subject areas. Universities are plagued by caste distinctions that inhibit teamwork. The cultural divide between the humanities and the sciences is well-known, but an equally deep divide lies between scholars, librarians, and computing professionals. Faculty treat non-faculty colleagues with disdain, librarians have a jargon of their own, and computing professionals consider technical knowledge the only measure of worth.

The academic department dominated by tenured faculty is a powerful force toward retaining the status quo. To close a department, however moribund, is seen as an act of academic vandalism. When a corporation closes a factory, its stock price usually goes up. When Columbia University closed its library school, nobody's stock went up. There is little obvious incentive and much vocal disincentive to change. Until recently, Oxford University still had more professors of divinity than of mathematics.

Yet, universities are a continual source of innovation. Chapters 2 and 3 included numerous examples where universities developed and deployed new technology, long before the commercial sector. The flow of high-technology companies that fuels the American economy is driven by a small number of research universities, such as Stanford and Berkeley near San Francisco, Harvard and M.I.T. in Boston, and Carnegie Mellon in Pittsburgh.

Innovation in a large organization requires strategic reallocation of resources. New initiatives require new funding. Resources can be found only by making hard choices. The process by which resources are allocated at a research university appears arcane. Moving resources from one area to build up another is fraught with difficulties. The director of the Ashmolean Museum in Oxford once mused that the best strategy for the museum might be to sell part of its collection to provide funds to look after the rest; he doubted whether he would retain his position if he carried out such a strategy. Few deans would advocate cutting faculty numbers to provide resources that will make the remaining faculty more productive. Yet funds are reallocated. Year-by-year the portion of the budget that goes into computers, networks and support staff increases, one percent at a time.

In the long term, it is not clear whether such reallocations will bring more resources to existing library organizations, or whether universities will develop new information services outside their libraries. The signals are mixed. By a strange paradox, good information has never been more important than it is today, yet the university library is declining in importance relative to other information sources. The university library, with its collections of journals and monographs, is only one component in the exchange of academic information. Fifty years ago, faculty and students had few sources of information. Today they have dozens of methods of communication. New technology, from desk-top computing and the Internet, to air travel and video conferences, allows individual scholars to exchange large amounts of information. The technology has become so simple that scholars are able to create and distribute information with less help from professionals and outside the formal structure of libraries.

If libraries are to be the center for new information services, they have to reallocate resources internally. Discussions of library budgets usually focus on the rising cost of materials, overcrowding in the buildings, and the cost of computing, but the biggest cost is the staff. Few universities make an honest effort to estimate the costs of their libraries, but a true accounting of a typical research library would show that about twenty-five percent of the cost is in acquisitions and fifty percent in staff costs. The other costs are for building and equipment. If libraries are to respond to the opportunities brought by electronic information, while raising salaries to attract excellent staff, there is only one option. They will have to reorganize their staff. This is not simply a question of urging people to work harder or streamlining internal processes; it implies fundamental restructuring.
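
The proportions quoted above imply that roughly a quarter of the true cost goes to buildings and equipment. The following sketch simply applies those approximate shares to a total; the total of $20 million is a hypothetical figure chosen for illustration, not a number from any real library.

```python
# Approximate shares quoted in the text; the remainder is buildings and equipment.
ACQUISITIONS_SHARE = 0.25
STAFF_SHARE = 0.50
BUILDING_EQUIPMENT_SHARE = 1.0 - ACQUISITIONS_SHARE - STAFF_SHARE


def cost_breakdown(total_cost: float) -> dict:
    """Split a library's total annual cost using the approximate shares above."""
    return {
        "acquisitions": total_cost * ACQUISITIONS_SHARE,
        "staff": total_cost * STAFF_SHARE,
        "building_and_equipment": total_cost * BUILDING_EQUIPMENT_SHARE,
    }


# Hypothetical total annual cost, for illustration only.
breakdown = cost_breakdown(20_000_000)
for category, amount in breakdown.items():
    print(f"{category}: ${amount:,.0f}")
```

On these rough figures, staff costs are twice the acquisitions budget, which is why the argument turns on reorganizing staff rather than on the price of materials.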

Buildings for digital libraries

An area of change that is difficult for all libraries, but particularly universities, is how to plan for library buildings. While digital libraries are the focus of research and development around the world, for many libraries the biggest problem is seen as the perennial lack of space. For example, in December 1993, the funding councils for higher education in the United Kingdom released a report on university libraries, known as the Follett Report. In terms of money, the biggest recommendation from the Follett Committee was the need for a major building program. This need was especially acute in Britain, because the numbers of students at universities had grown sharply and much of the space was required to provide study space on campus.

The expense of a new building to house rarely-used paper is hard to justify, but old habits are slow to change. Imposing library buildings are being built around the world, including new national libraries in London and Paris. Many libraries retain a card catalog in elegant oak cabinets to satisfy the demands of a few senior users, even though the catalog is online and no new cards have been filed for years.

To use a traditional library, the user almost invariably goes to the library. Some libraries provide services to deliver books or photocopies to offices of privileged users, but even these users must be near the library and known to the library staff. Users of a digital library have no incentive to visit any particular location. The librarians, webmasters, and other professionals who manage the collections have offices where they work, but there is no reason why they should ever see a user. The New York Public Library must be in New York, but the New York digital library could store its collections in Bermuda.

The dilemma is to know what will make a good library building in future years. The trials and tribulations of the new British Library building in London show the problems that happen without good planning. Library buildings typically last at least fifty years, but nobody can anticipate what an academic library will look like even a few years from now. Therefore the emphasis in new library buildings must be on flexibility. Since modern library buildings must anticipate communications needs that are only glimpsed today, general purpose network wiring and generous electrical supplies must be led to all spaces. Yet the same structures must be suitable for traditional stacks.

Panel 5.7
The renovation of Harvard Law Library

Langdell Hall is the main library of Harvard Law School. As such, it is the working library of a large law school and one of the great collections of the history of law. During 1996/97, the library was fully renovated. The project illustrates the challenges of building for the long term during a period of rapid change.

At the time when the renovations were being planned, Harvard made two bold proposals. The first was that the library should provide no public computers. The second was that every working space should support laptop computers. It was assumed that people who visit the library will bring their own computers. In the new library, every place where users might work has a power outlet and a network connection. Users can use any of 540 Ethernet ports at tables, carrels, lounges, or study rooms throughout the building. In total, 1,158 network connections are installed within the building; almost all were activated immediately, the rest kept in reserve. In practice, the idea of having no public computers was relaxed slightly. There are about one hundred public computers, including one computer classroom and two small training labs.

Even for Harvard Law School, the cost of this project, $35 million, is a great deal of money. The school has effectively gambled on its assumptions that all users will have laptop computers and that the network installation has enough flexibility to adapt to changes with time. However, the biggest gamble was probably never stated explicitly. The school actually increased the amount of space devoted to library users. The assumption is that people will continue to come to the library to study. Law school faculty are known to prefer working in their offices rather than walk to the library. Many years ago, the librarian, Harry S. Martin III, stated, "Our aim is to keep the faculty out of the library." This witticism describes a commitment to provide faculty with service in their offices, including both online information and an excellent courier service. However, the attention given to reading spaces in Langdell implies a belief that, for many years, legal scholars and law school students will come to the library to do serious work.

Not all of the money went to technology. Langdell is an elegant old building, with historic books and valuable art. Much of the budget went into elevators, heating, air-conditioning and plumbing. The dignity of the library was not forgotten. Old tables were restored, chandeliers replaced an elderly and ineffective dropped ceiling, Latin inscriptions were re-located, and bas relief symbols of the law highlighted. This vision of a great law library combines the traditional view of a library with access to modern technology. Hopefully, Harvard has judged correctly and will be proud of its new law library for many years.

Buildings are one of many aspects of libraries that are long-term investments. Collections, catalogs, and indexes also represent sustained efforts over many years. Digital libraries are forcing rapid changes which provide both opportunities and challenges. This is not easy for individuals or for organizations, but the manner in which they react to digital libraries will determine which of them flourish in future years.

Last revision of content: January 1999
Formatted for the Web: December 2002
(c) Copyright The MIT Press 2000