The first article, by Arlene Taylor and Tina Gross,
entitled “What Have We Got to Lose? The Effect of Controlled Vocabulary on Keyword
Searching Results,” points to the important role that subject headings and
subject searches play in facilitating retrieval of relevant resources for
catalog users. Subject headings not only provide a controlled method for
searching, but also work indirectly, because their terms are matched by keyword
searches. According to the study done by Taylor and Gross, if subject headings
were to be eliminated from the catalog, or excluded from keyword searches,
approximately one-third of the relevant records would no longer be retrieved. This
is a significant number of items that would elude the user’s search.
Additionally, since the terms in question are located in the subject field,
they are more likely to lead to resources relevant to the user’s
search than terms found in, say, the summary field or table of contents (which
are sometimes entered into a record). This is not to dismiss keyword
searching, however. Users at times have difficulty understanding how to search
by controlled-vocabulary subjects, or may not know exactly which
terms to use for a given search. Because keyword searches are in some
ways easier to understand, users sometimes prefer them. So it is not so
much a matter of choosing between keyword searches and controlled-vocabulary searches
as of ensuring that we infuse the precision of the latter into the
usability of the former.
In “On the Subject of Subjects,” Arlene Taylor further advocates for the
importance of utilizing subject headings and controlled vocabularies, as well
as extending their use into the electronic domain and the Internet. Taylor
points out that despite the pre-eminence of keyword searching among
Internet users, keywords continue to be a poor option where precision of search
results is desired. Not only does controlled vocabulary counteract this
through the principle of specific entry, but it also widens the
search options available to a user by relating broader and
narrower terms for a subject, as well as synonyms, near synonyms, and other valid
variations. Controlled, subject-based description is not without its
faults, however. Taylor acknowledges that there is certainly a place for
keyword-based techniques, owing to their simplicity, lower cost to create, ease
of maintenance (since they can be automated), and ability to stay current. What she advocates for
the digital environment is a system in which less important, ephemeral
resources are indexed automatically using keyword technologies, while those
intended for long-term use fall under controlled vocabulary systems.
It is interesting to note that since the latter article’s publication almost 20
years ago, subject-based vocabulary control does not seem to have taken a
strong foothold in the online environment, likely for obvious reasons: keyword indexing wins on cost, speed, currency, and the ability to automate. While resources such as WorldCat and the Digital Public Library of America, among others,
show what online, linked resources can accomplish with
controlled vocabularies, the world of digital information in general exhibits a low degree of accurate description, much to the chagrin of searchers faced with millions upon millions of hits. Is there a better way? Surely, now that Pandora's box has been ripped open, it will be all but impossible to stuff its contents back in.
-Gross, Tina, and Arlene G. Taylor. 2005. "What Have We Got to Lose? The Effect of Controlled Vocabulary on Keyword Searching Results." College & Research Libraries 66, no. 3: 212-230. Social Sciences Citation Index, EBSCOhost (accessed February 18, 2014).
-Taylor, Arlene G. 1995. "On the Subject of Subjects." Journal of Academic Librarianship 21: 484-491. Education Full Text (H.W. Wilson), EBSCOhost (accessed February 18, 2014).
Subject description is always a pain ... it's so costly and it's difficult to program a computer to do it!
However, Google's indexing algorithm does a pretty good job computing what a document is about, but those are documents from the open Web where there's lots of redundancy. I'm interested in the long term viability of human subject indexing in the context of projects like DPLA...
--Dr. MacCall