Monday, January 13, 2014

Metadata: Metauseful or Metacrap?

The notion of widespread use of descriptive metadata for everything from webpages and search engines to digital media and online stores is undoubtedly an information organizer’s dream. All manner of internet navigation being guided by the use of accurate and consistent metadata would streamline search processes, eliminate undesirable results, and bring some semblance of order to the world of chaos that can often characterize the web.

As all of us have doubtless experienced, however, such ideas belong to the realm of fantasy. Web searches are messy, retrieving millions of results, and most users lack the time or endurance to parse for relevant entries after the first couple of pages. Sometimes we are deliberately led astray, receiving hits for sites that clearly have nothing to do with the item being searched for. Why does this happen? Why has the potential of metadata failed to bring order to the digital environment when its possibilities are so promising?


I find that journalist Cory Doctorow’s 2001 essay, “Metacrap: Putting the torch to seven straw-men of the meta-utopia,” provides a fairly succinct, accurate, and humorous summary of the reasons why the meta-dreamland is one not likely to ever reach the shores of reality. I won’t summarize the whole of the essay here, but will comment on a couple of points he makes.


For me, one of the more problematic areas that Doctorow discusses is the misuse of metadata for the purpose of attracting our attention. Whether malicious and outright false, or simply exaggerating or “enhancing” the properties of an object or site, the fact is that metadata on the internet, at times, takes on all the properties of intrusive and misleading advertising we’ve witnessed across other forms of media. From mail spam with maliciously-intended titles to the deliberate misuse of high priority/relevance metadata terms, the digital environment is anxiously competing for our attention. That the purposeful misuse of terms destroys the integrity of the metadata in question is of no concern to the creator, so long as it gets us to their site/product. A sad result of this trend is that website metadata fields are virtually ignored by most major search engines of today, depriving internet users of a tool that could have been invaluable in enabling expedient and highly accurate search results.


Another area of the essay I found particularly relatable dealt with how people's laziness contributes to inferior metadata quality. Sometimes it is simple negligence or omission of information; sometimes it is not taking a few moments to ensure that the entered data is correct. Either way, the “casual” approach to metadata can play a significant role in compromising its quality. In my work for a digital archive in the local area, I was struck by the difference between the quality of metadata used by the archive itself, and the lack thereof in the document repository utilized elsewhere by the company to house all its current files. The lack of standard naming protocols, visible dates, file naming methods, attribution of the content creator, and even a standard file format spells N-I-G-H-T-M-A-R-E for the person eventually responsible for collecting and archiving the documents. Establishing such protocols and standards beforehand, and ensuring that content creators understand and follow them, would have averted much future work.


“Metacrap” is a good read, and won’t take but a few minutes of your time, so be sure to check out the link provided above. I know this article has been popular with my fellow LS 566 travelers. Would love to hear some of your thoughts and insights on whether the meta-utopia is truly as unachievable as Doctorow would have us believe. If you’re not of the SLIS crowd, your opinions and insights are equally welcome, as it’s always great to hear how people outside the field view the topic.

2 comments:

  1. Unfortunately, some types of purposeful metacrap are profitable :(

    http://websearch.about.com/od/seononos/a/spamseo.htm

    -Dr. MacCall

  2. I liked your analysis of this piece and your point that "the digital environment is anxiously competing for our attention." While I do agree with Doctorow that a "meta-utopia" is not really realistic, I do still feel overall that metadata is getting better all the time and that maybe in the future we can find work-arounds for purposely falsified metadata, so that we can get to the point of a semi-utopia.

    And I feel your pain about having to deal with lazy/stupid mistakes in your own institution. I work in cataloging, and every week I find large errors on records that were created about a decade ago by certain people in my department and in the archives department. Apparently a volunteer used to enter some of the records and did not bother to proofread, and in archives there was a person who was not very good at looking up the correct LC subject headings. It's ridiculous how much time I waste correcting records some weeks so that they can be found. But I feel like the increase in people who are trained to be metadata professionals (at least in my institution, hopefully in others) should alleviate some of this for the future.
