Intermediation and the Digital Library

Joseph Jones


Paper presented at 1998 joint international conference,
Association for Literary and Linguistic Computing
Association for Computers and the Humanities
5-10 July 1998, Lajos Kossuth University, Debrecen, Hungary


Keywords: librarianship as a profession, online searching, end-user
searching, subject access



    Little Jack Horner
    Sat in the corner,
    Eating a Christmas pie;
    He put in his thumb,
    And pulled out a plum,
    And said, What a good boy am I!



The Question of Intermediation

    Until I thought about it, I never realized how weird Little Jack
Horner is. So, he eats with his hands — he's little, he's a kid,
that's OK. But it's Christmas and he's off by himself; in a corner —
which is usually the place of punishment; he has a whole pie to
himself, but no turkey or anything else (maybe it's dessert time and
he got tired of the adults at the table?); and he's latched onto a
plum and tells us he has done well — which is redundant, because we
all know what "plum" implies.

    What is really weird is that Little Jack Horner exemplifies an
ideal of information acquisition. The pie is a well-defined
territory that he is working over, his thumb is an instrument that
provides direct and effective access, the plum is the good stuff
that he is extracting, and he is happy in his activity — he feels
good about himself and his situation. Perhaps best of all, he can do
it alone and he knows what he is doing. The self-sufficiency
exhibited here is a stage familiar to anyone who has raised a child.

    The question to be explored is whether Little Jack Horner can
stay in his corner as print resources are supplemented — and
replaced — by digital counterparts. Put in other words, and to move
into a less analogical mode, does the digital library place its user
in a position of increasing dependence on human assistance? Are we
entering a stage where even a sophisticated library user will be
unable or unlikely to make adequate use of what is available?
Reflection on two decades of experience with transition in public
and academic libraries leads me to propose that the answer is "yes".


The Library Profession

    Before developing support for this thesis, it seems appropriate
to say something about the library profession. It emerged, in the
present common understanding, in the latter half of the nineteenth
century. Practitioners in this field, as in others like nursing and
primary education, have tended to be female, with corresponding
status and compensation. There has been a great divide between what
are called public services and technical services (epitomized by the
activities of reference and cataloguing). Libraries themselves have
been classified into four general types: school, public, academic,
and special. For a variety of social and technological reasons,
these distinctions are becoming less rigid and less definite.
Concurrently there is emerging another category which can variously
be described as the freelance librarian, contract researcher, or
information consultant.

    The credential of the librarian, a master's degree in library
science, is prevalent in the profession. It does not, however, enjoy
the degree of exclusionary power exercised by parallel
qualifications in fields like education, accounting, law, medicine,
and engineering. In part this difference relates to the considerable
but informal role played by apprenticeship in becoming qualified to
practice. Also significant is the public funding and service nature
of much of the work, which generates a perception that what is
provided is "free" and consequently of small value. Although
incorrect or incomplete information can have serious consequences,
accountability and liability tend to be far greater in other
professions. Credentials, finances, legal status, health, and safety
seem much more immediate and more important than information does.
The contract implicit in a client's direct purchase of a service
contributes to the importance assigned to results. A corollary is
that much of the consultation provided by a librarian is on-demand
and real-time, with less allowance for supporting research and a
reflective weighing of possibilities and alternatives.


From Print to Digital: One Example

    In the latter days of the print era, some twenty years ago, when
even the fruits of data processing usually reached the end user as a
printed product, getting at specific information (finding the plum
on the dinner table) still was largely intuitive and a much more
integrated exercise. That intuition was based on the transmission of
habits developed over centuries of dealing with print-based
information. The user's ability to navigate was in step with the
education system's methods and with publishers' conventions. An
illustrative comparison of subject indexes for periodical articles
provides some evidence for this assertion. The printed indexes
published by the H.W. Wilson Company will be set alongside the
databases made available by SilverPlatter.

    The indexing practices of Wilson provide coordination and easy
crossover among the broad range of subjects covered by their
separate indexes. As well, there is considerable correspondence to
the general thesaurus of the Library of Congress Subject Headings,
which provides subject access to the monograph collections of most
academic and public libraries in North America. Beyond this
important general consistency lies much other work done for the user
before any search for information is undertaken:

    Careful selection of material to be indexed
    Uniform and good typographic design
    Control of subject terminology
    Precoordination of subject terms
    Repetition of entry under appropriate headings
    Embedding of cross-references

    Ironically, this paragon of print did not migrate as rapidly or
as smoothly as it might have into electronic delivery, in no small
measure because the printed product was so good. And because it was
print.

    Wilson developed its own software as an extension of its primary
business of providing information. SilverPlatter, on the other hand,
has emerged as a data retailer, providing a deceptively uniform and
relatively simple interface for a wide variety of producers.
Databases whose range and depth would not have been economical in
the print era now provide users with citations to grey literature
that is not available in most libraries and may be unobtainable.
This company may be seen as a CD-ROM/LAN successor to centralized
connect-charge database vendors like Dialog. The software itself is
"free", with revenue included in the purchase price of the databases
selected. In some respects, the uniformity they offer simplifies
training and user support, reducing the need for intermediation.

    There are significant limitations to this last observation,
however. One is that the apparent uniformity of the SilverPlatter
interface tends to mask significant differences in the underlying
data which are (or were) much more apparent in the producer's
printed products. Here are three examples:

• Journal of Economic Literature relies heavily on a classified
  approach to the information; approaching it online through a
  general subject heading can be difficult.

• Psychological Abstracts (PsycLit) has a strong and well-developed
  thesaurus which is still most easily consulted in its printed form.
  The subject terms themselves, and the thesaurus which controls
  them, can easily remain hidden to an online user.

• In more detail, consider the change in subject indexing introduced
  by MLA International Bibliography in 1981. An electronic searcher
  can naively combine the two files and search them in what appears
  to be the same way. However, the only really useful subject term
  (as opposed to title keyword) available for searching in the older
  file is the name of the literary author. This is readily apparent
  in the printed product, but almost invisible online.
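
    The effect described in the third example can be sketched in a
few lines of Python. The records, titles, and subject terms below are
invented for illustration and are not drawn from any real database;
the only assumption taken from the text is that the older file is
indexed by literary author alone. A combined subject search returns a
plausible-looking result, and only a per-file breakdown shows that
the older file contributed nothing.

```python
# Invented records standing in for two files of one bibliography.
# Assumption from the text: the older file carries only the literary
# author as a subject term; the newer file adds topical terms.
OLDER_FILE = [
    {"title": "Irony in the late novels", "subjects": ["Author A"]},
    {"title": "Letters and drafts", "subjects": ["Author A"]},
]
NEWER_FILE = [
    {"title": "Narrative distance", "subjects": ["Author A", "Irony"]},
    {"title": "The comic mode", "subjects": ["Author B", "Irony"]},
]


def subject_hits(records, term):
    """Return titles of records indexed under the given subject term."""
    return [r["title"] for r in records if term in r["subjects"]]


def hits_by_file(term):
    """Count subject hits separately for the older and newer files."""
    return {
        "older": len(subject_hits(OLDER_FILE, term)),
        "newer": len(subject_hits(NEWER_FILE, term)),
    }


# Searching both files at once looks like a single, uniform search.
combined = subject_hits(OLDER_FILE + NEWER_FILE, "Irony")
# The breakdown reveals that every hit came from the newer file,
# even though an older title is plainly about irony.
breakdown = hits_by_file("Irony")
```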

    Another weakness of the uniformity offered by SilverPlatter is
that it is only a part of the information universe, one (albeit
major) interface among many others. All of these share the
instability inherent in continued development and migration. Further
discussion below will be devoted to the multiplicity of interfaces.


Aspects of the Transition

    It may be that we are progressing toward the look and feel of a
common interface, some combination of windows, menus, and hypertext
that will begin to allow intuitive navigation similar to that of the
printed page. But this vision appears on the screen very dimly. Much
more in evidence are users who experience the following:

    Lining up to get at a machine
    A less stable environment with more down time
    New interfaces that do not match old habits (even electronic
       ones)
    Having to seek training and choose appropriate sessions when
       they do not know what they do not know
    Selecting from a proliferating, overlapping, and
       ever-changing set of appropriate resources (print and
       electronic) and finding their way to them, whether networked
       or standalone
    Trying to navigate unfamiliar systems on the basis of screen
       helps, tip sheets, and manuals
    At the least, flying solo through a ready-made result set
    Variety and complexity associated with data capture/printing
       through download, session logging, email, etc. (in contrast
       with the directness and simplicity of photocopying)

    Beyond these difficulties lie three changes that call for
special comment.

    First is a multiplicity of interfaces. It is increasingly
possible to access the same data by different paths, in different
media, and through different software and hardware, with
significant variation in the results obtainable. Underlying some of
this is a transition from Telnet and proprietary clients to HTTP,
with the robustness and functionality of the latter still falling
noticeably short of what the older technology has provided. Also
contributing to the multiplicity is vendor competition.

    Second is an extension from surrogate to full text. Traditional
library catalogues, bibliographies, and indexes are ultimately
spatial conventions — and in some instances, products of print
poverty. While there is often a useful spatial convention in the
physical arrangement of printed texts (e.g. the classification of
books on a shelf), the order is one-dimensional and normally
excludes alternatives (like shelving multiple copies wherever they
might be desired). A catalogue or index is unconstrained by physical
linearity and provides multiple pointers to the same item, whether
by description (author, title, etc.) or subject. It has also been
possible for a library to provide reference to many more printed
items than it could ever hope to own, house, or directly make
available. With recent decreases in the cost of electronic storage
and the technical ability to access all content, the bibliographic
record moves from necessary means (a record that makes it possible
to locate the complete text) to useful adjunct (header information
on an electronic text retrieved through direct searching). One is a
pointer, the other an identifier. The challenges of coping with full
text may be most dramatically illustrated by Lexis-Nexis, with its
one and one-third billion documents (growing at a rate approaching
ten million per week), representing over eighteen thousand data
sources.
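
    The difference between a one-dimensional shelf order and a
catalogue with multiple pointers can be sketched as a small data
structure. This is a minimal illustration in Python, not a model of
any actual library system; the shelf marks, authors, titles, and
headings are all hypothetical.

```python
from collections import defaultdict

# Invented holdings: each item occupies exactly one place on the shelf.
# Shelf marks, names, and headings are hypothetical.
SHELF = {
    "A-001": {"author": "Author One", "title": "Harbour Towns",
              "subjects": ["Fisheries", "Local history"]},
    "A-002": {"author": "Author Two", "title": "Nets and Quotas",
              "subjects": ["Fisheries", "Economic policy"]},
}


def build_catalogue(shelf):
    """Map every access point (kind, value) to the shelf marks it points at."""
    catalogue = defaultdict(set)
    for shelf_mark, item in shelf.items():
        catalogue[("author", item["author"])].add(shelf_mark)
        catalogue[("title", item["title"])].add(shelf_mark)
        for heading in item["subjects"]:
            catalogue[("subject", heading)].add(shelf_mark)
    return catalogue


def pointers_to(catalogue, shelf_mark):
    """List every access point that leads to one physical item."""
    return sorted(p for p, marks in catalogue.items() if shelf_mark in marks)


catalogue = build_catalogue(SHELF)
# One subject heading gathers items that sit apart on the shelf.
fisheries = sorted(catalogue[("subject", "Fisheries")])
# One item is reachable by author, by title, and by each subject.
routes_to_first = pointers_to(catalogue, "A-001")
```

    Each item has a single shelf position but several entries in the
catalogue, which is the sense in which the record serves as a pointer
rather than an identifier.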

    Third, and perhaps most significant, is a widespread and
unreflective move to raw keyword searching for data retrieval. (This
is related to, but not a necessary consequence of, the increasing
availability of full text.) What the computer can do with keywords
is very like what a shotgun can do to a barrel of goldfish in a dark
closet. Most users seem happy to hit a few surface swimmers almost
at random, while the big smart ones continue to circle at the
bottom. The sense of power delights, and there is not much rational
evaluation of what pulling the trigger has done. This raises doubts
about a good deal of what is called "research". Cumbersome and
limiting as they were, the card catalogue and the printed subject
index did force users to approach the data through structured
subject terminology. In general, keywords were not available. It
seems safe to say that most data users remain largely unconscious of
things like the principles of source selection in a database, the
structure of records and fields, the types of data these contain,
and the indexing practices adopted. Even setting this assumption
aside, it is rare to find a user possessing the self-conscious
linguistic sophistication required to construct a natural-language
equivalent for a thesaurus-controlled term.
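
    The gap between raw keyword searching and a thesaurus-controlled
term can be shown with a toy comparison in Python. The records and
the subject terms assigned to them are invented, and the matching
rule (a case-insensitive substring match on titles) is only one
simple assumption about what a keyword search does.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    title: str
    subjects: set = field(default_factory=set)  # indexer-assigned terms


# Invented sample records (illustrative only).
RECORDS = [
    Record("Teenagers and the shopping mall", {"Adolescents"}),
    Record("Youth unemployment in rural towns", {"Adolescents", "Unemployment"}),
    Record("Adolescents and peer pressure", {"Adolescents"}),
    Record("The adolescent phase of a young industry", {"Industrial development"}),
]


def keyword_search(records, word):
    """Raw keyword search: case-insensitive substring match on titles."""
    word = word.lower()
    return [r.title for r in records if word in r.title.lower()]


def subject_search(records, term):
    """Controlled search: exact match on an indexer-assigned subject term."""
    return [r.title for r in records if term in r.subjects]


keyword_hits = keyword_search(RECORDS, "adolescent")
subject_hits = subject_search(RECORDS, "Adolescents")

# Relevant records the keyword never reached, and an irrelevant one it did.
missed = [t for t in subject_hits if t not in keyword_hits]
false_hits = [t for t in keyword_hits if t not in subject_hits]
```

    To match the controlled term with keywords alone, the searcher
would have to think of "teenagers" and "youth" unaided and also
recognize the metaphorical use of "adolescent" as noise.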

    It is sobering to consider that there are only two ways of
organizing information for direct human access, with raw keyword
searching well toward the weaker end of the spectrum. The two are
classification (semantic order, generally with hierarchy) and
alphabetization (a conventional and purely arbitrary order), most
familiar perhaps in the book as analytic table of contents and
index. Of course, the alphabet itself can allow for semantic random
access (assuming the mind can generate the point of entry). And
semantic order is likely to embed itself within alphabetic
organization through controlled indexing terminology, provision of
cross references, and distinctions grouped under a concept as
subheadings. Ultimately these constitute a thesaurus.
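
    How semantic order embeds itself within an alphabetic sequence
can be sketched as follows. The thesaurus entries are invented, and
the field names ("use_for", "narrower") are hypothetical labels
chosen for this illustration rather than those of any particular
standard or product.

```python
# A tiny invented thesaurus: each preferred term lists the entry
# vocabulary it replaces ("use_for") and its narrower terms.
THESAURUS = {
    "Adolescents": {"use_for": ["Teenagers", "Youth"], "narrower": ["Runaways"]},
    "Runaways": {"use_for": [], "narrower": []},
}


def resolve(thesaurus, heading):
    """Follow a 'see' reference from an entry term to its preferred term."""
    for term, info in thesaurus.items():
        if heading == term or heading in info["use_for"]:
            return term
    return None  # heading is not part of the controlled vocabulary


def alphabetical_index(thesaurus):
    """Lay the thesaurus out as an alphabetical list with cross-references."""
    entries = {}
    for term, info in thesaurus.items():
        entries[term] = term
        for alternative in info["use_for"]:
            entries[alternative] = term
    lines = []
    for heading in sorted(entries, key=str.lower):
        preferred = entries[heading]
        if heading == preferred:
            lines.append(heading)
            for narrower in sorted(thesaurus[heading]["narrower"]):
                lines.append(f"    narrower: {narrower}")
        else:
            lines.append(f"{heading}  see {preferred}")
    return lines


index_lines = alphabetical_index(THESAURUS)
```

    The alphabet supplies the point of entry, while the "see" and
"narrower" lines carry the semantic structure.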


Conclusion

    Implicit in the foregoing examples and analyses is an academic
perspective, one that assumes an orientation toward research rather
than casual and uncritical browsing of information. Given the
reality that communities of discourse rely, probably far too
heavily, on personal acquaintance and informal means, the
information superstructure may become significant only at the
boundaries of those communities. Nevertheless, a scientific approach
to knowledge assumes that truth does not require the imprimatur of
the familiar, that subject areas cannot be closed and cosy, and that
research must be more than conversation and/or extrapolation from
randomly acquired information.

    This leads to the question of who should gather information, how
it should be gathered, and whether the gathering itself is integral
to the process of dealing with it appropriately. As regards the
last of these, scholarly habits, particularly in the humanities, have
reflected a belief that the gathering cannot be delegated, although
some aspects of the work may be performed by a research assistant.
At a simpler level, the question becomes one of the role of
information gathering/library research in a student's education
(understood as gaining basic familiarity with the methods, tools,
and data of a particular discipline).

    Practicalities aside, it is here that we come to a tradeoff that
is being exacerbated by the shift from print to electronic delivery
of information. Will the quest for data continue to be primarily a
do-it-yourself undertaking, with increasing time and energy required
to cope with (if not master) the variety, changes, magnitudes,
overlaps, and proliferations of the digital library? Or will a need
for efficiency and a desire for good results compel most researchers
to rely on the intermediation of a specialist in information
management?

    As the costs of human support increase, and public funding
declines, a library user can look forward to less direct assistance
in an environment that grows more complex and unfamiliar. The
effectiveness of group training divorced from particular questions
and needs is limited, but it is the only practical way of teaching
methods, strategies, and procedures. Opaque technologies have
increased the importance of understanding what is being done.
However, most users operate on immediate need, and are much more
concerned with end than means. Consequently, information seekers
should develop a critical self-consciousness regarding what they do
not know. When help is provided one-on-one and on demand, it is
likely to take the form of a request whose derivation is not
explained. This may require greater trust in the information
provider, and could lead the user to a willingness to invest
sufficient time to develop information-finding skills.

    In retrospect, the current situation may seem peculiarly
transitional, with the durability, stability, accessibility, and
universality of the printed page manifested somehow once again in
the conventions of an electronic screen.



Suggestions for Further Reading:

    Basch, Reva. Secrets of the super searchers. Wilton, CT : Eight
Bit Books, 1993

    Bledstein, Burton J. The culture of professionalism : the middle
class and the development of higher education in America. New York :
Norton, c1976

    Clausen, Helge. "The future information professional: old wine
in new bottles? Part one" Libri 40:4 (Dec 1990) 265-277 ; " ... Part
two" Libri 41:1 (Mar 1991) 22-36

    Connell, Tschera Harkness. "Subject searching in online
catalogs: metaknowledge used by experienced searchers" Journal of
the American Society for Information Science 46:7 (Aug 1995) 506-518

    Dalrymple, Prudence W., and Jennifer A. Younger. "From authority
control to informed retrieval: framing the expanded domain of
subject access" College & research libraries 52:2 (Mar 1991) 139-149

    Foskett, A.C. The subject approach to information. 5th ed.
London : Library Association, 1996

    Hagler, Ronald. The bibliographic record and information
technology. 3rd ed. Chicago : American Library Association, 1997

    Harter, Stephen P., and Anne Rogers Peters. "Heuristics for
online information retrieval: a typology and preliminary listing"
Online review 9:5 (Oct 1985) 407-424

    Park, Bruce. "Libraries without walls; or, librarians without a
profession" American libraries 23:9 (Oct 1992) 746-747

    Quint, Barbara. "Disintermediation" Searcher 4 (Jan 1996) 4,6

    Studwell, William E. "Will intermediaries be an essential
component of online subject access in the future?" Technicalities
13:8 (Aug 1993) 8-9

    Tenopir, Carol. "Generations of online searching" Library
journal 121:14 (1 Sep 1996) 128, 130

    White, Herbert S. "Information intermediation: a fancy name for
reference work" Library journal 120:5 (15 Mar 1995) 44-45

 


 

For contact information see Joseph Jones.

 

Presented July 1998
 
Reformatted for migration July 2013
 

 
