Lost in the Stacks: March 2014

Sunday, March 30, 2014

(Mis) Understanding the Semantic Web, Part II

In my last post, I took a quick look at the idea of the Semantic Web, as well as some impressions I've gained from readings as to what the concept truly represents. In this second post on Semantic Web, I will be gathering some definitions and descriptions of the topic--some helpful, some perhaps not--that will hopefully start bringing some clarity to the topic for me and any of you that are also trying to unwind the infospeak it's often shrouded in.

Wikipedia describes Semantic Web as a "collaborative movement" which "aims at converting the current web, dominated by unstructured and semi-structured documents into a 'web of data.'" This process "extends the network of hyperlinked human-readable web pages by inserting machine-readable metadata about pages and how they are related to each other, enabling automated agents to access the Web more intelligently and perform tasks on behalf of users."

The World Wide Web Consortium (W3C) identifies the concept as a "Web of linked data...technologies [that] enable people to create data stores on the Web, build vocabularies, and write rules for handling data." The goal of Semantic web is "to enable computers to do more useful work and to develop systems that can support trusted interactions over the network."

Steven Miller, in Metadata for Digital Collections, uses an earlier Wikipedia entry for Semantic Web as his definition, citing it as "a group of methods and technologies to allow machines to understand the meaning--or 'semantics'--of information on the World Wide Web" (2011, 304).

Arlene G. Taylor and Daniel N. Joudrey, in The Organization of Information, characterize the Semantic Web as an entity where "data on the Web will be defined semantically and linked to relevant data for the purpose of more effective discovery of information..." (2009, 17). They additionally state that Semantic Web will allow information to be "defined in such a way that its meaning or semantics can be discernible, shared, and processed by automated tools as well as by people" (2009, 112).

While these descriptions trend toward the overly general, there are at least some commonalities that can help us start piecing together a sense of what Semantic Web is. Linked data, resource discovery, and data readability are all concepts that are fairly straightforward to understand, even if the overriding idea of semantics is a bit ambiguous. Based on the context of the other concepts, we can perhaps jump to the assumption that semantics refers to the underlying meaning of data, a notion that humans are fairly capable of deciphering, but computers are pretty terrible at (e.g., did my search REALLY result in 9 million RELEVANT hits?). If we accept that idea, then maybe we can cobble together a fairly general definition for Semantic Web:
Linked data and associated technologies that are employed to aid machines in discerning the meaning of information, or data, on the web, in order to more effectively facilitate the discovery and retrieval of information resources.
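To make "linked data" a little less abstract, here is a toy sketch of my own in Python. It assumes that facts can be stored as simple subject-predicate-object statements (the general shape that linked data takes); the `ex:` identifiers and the two helper functions are made up for illustration and are not taken from any of the sources quoted above.

```python
# A toy "web of data": each fact is a (subject, predicate, object) statement.
# The ex: identifiers are hypothetical; dc:creator and dc:date borrow
# Dublin Core element names purely as labels.
TRIPLES = [
    ("ex:AnimalFarm", "dc:creator", "ex:GeorgeOrwell"),
    ("ex:AnimalFarm", "dc:date", "1945"),
    ("ex:NineteenEightyFour", "dc:creator", "ex:GeorgeOrwell"),
    ("ex:GeorgeOrwell", "ex:name", "George Orwell"),
]


def objects_of(subject, predicate):
    """Return every object stated for a given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects_with(predicate, obj):
    """Return every subject that has the given predicate/object pair."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


# Because both books point at the same identifier for their creator,
# a program can follow the link instead of guessing from matching text.
works_by_orwell = subjects_with("dc:creator", "ex:GeorgeOrwell")
creator_name = objects_of("ex:GeorgeOrwell", "ex:name")
```

The point of the sketch is only that the "meaning" lives in explicit, shared identifiers rather than in free text, which is what lets a machine answer "what else did this person create?" without a keyword search.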

There's undoubtedly quite a bit more to Semantic Web than that, but hopefully we're on the right track. In the final part of this 3-part series, we'll see if we can home in on this concept in a little bit more detail.

Taylor, Arlene G., and Daniel N. Joudrey. 2009. The organization of information. 3rd ed. Westport, Connecticut: Libraries Unlimited.

Miller, Steven J. 2011. Metadata for digital collections: a how-to-do-it manual. New York: Neal-Schuman Publishers.

Wednesday, March 26, 2014

(Mis) Understanding the Semantic Web, Part I

The information world throws around a lot of concepts and terminology that can be confusing for those outside the field (let's face it, they confuse a lot of us within the field as well). The Semantic Web is one of those concepts that I seem to run into quite often during class reading, and while the term is talked about and referred to frequently, I can't say that I've seen it defined in terms that your average internet/computer user would understand, or at least be able to explain simply. Even my attempts to use Google searches for strings such as "What the heck is Semantic Web?", "I don't understand Semantic Web", or even "Semantic Web for Dummies" haven't turned up much help (unless I want to buy the book Semantic Web for Dummies). So just what is the Semantic Web, and why the heck should we care?

Breaking the term into its pieces at least creates an expectation of what the Semantic Web may be about. Web is immediately identifiable with the internet as a whole, perhaps more specifically in its role of facilitating links between information resources. Semantics...well, that usually deals with the concept of meaning. OK, good building blocks, I suppose, but not a great deal else to go on.

Some of my (possibly wrong) impressions of Semantic Web based on readings I've done include:

-improvement of web interoperability

-emphasis on linked data and metadata

-a certain degree of standardization (format? programming languages?)

-there are allegedly some "great" things that can be done with it

-most people will work with the external manifestations of it rather than the "under the hood" aspects

In the hopes of improving clarity and understanding of this topic, and verifying whether I'm anywhere close to the mark, next time we'll explore some definitions and descriptions of Semantic Web to see if we can start homing in on what it really is.

Monday, March 24, 2014

Putting together my indexing guidelines

Now that decisions have been made regarding the usefulness of our individual metadata elements to the class indexing project, the next task is to create guidelines for the use of our element. While the implementation and use of the Date element would seem to be rather straightforward, there are still some questions to sort out, such as what date formats should be utilized and the scope of the instructions I will need to include. The latter mostly relates to my concern over whether I should simply provide instructions for information that I know will be represented, or try to set up guidelines for other contingencies. For example, though the Date element is likely to be restricted to describing dates concerning image creation and digitization for this project, with specific dates associated with each, do I include formatting guidelines for information such as date ranges, even though they are unlikely to be used in conjunction with the Date element in this project?

Dr. MacCall gave my first draft of these guidelines some pretty good feedback, and influenced me to not try to overstep the bounds of my element in the context of this project, so I have tried to pass on information that will only pertain to the specific needs of the Date element. Further, while I always feel it necessary to justify and support my writing, where possible, I was encouraged to keep the indexing instructions as simple and minimal as possible, in order to allow indexers to find the relevant information they need quickly. As such, additional information has been moved to the "Notes" section of my guidelines. I know that there is still some tweaking and editing to go, but think that I'm at least on the right track with this second version of instructions for using the Date element. Feedback is definitely welcome.

DC.Date

Label: Date

Description: The Date element is utilized to identify temporal information associated with the life cycle of a resource, including (but not limited to) its creation and digitization.

Required: No

Repeatable: Yes

Guidelines:

Dates should be entered as specifically as possible.

If portions of the date are not known, the indexer should use the level of specificity that matches the amount of temporal information that is known.

Full Date: YYYY-MM-DD

Month and Year: YYYY-MM

Year: YYYY

Examples:

Full Date: YYYY-MM-DD

Ex. Date="2014-03-20"

Month and Year: YYYY-MM

Ex. Date="2014-03"

Year: YYYY

Ex. Date="2014"

Notes:

Date entry follows the format prescribed in the W3C note "Date and Time Formats" (http://www.w3.org/TR/NOTE-datetime) and the sections for “Years” and “Calendar Dates” under ISO 8601.

The use of other formats may not be prohibited by the software, but can prove problematic for human and artificial users alike. Please conform to the above guidelines to ensure consistency in data entry and presentation, and to allow for potential aggregation of data into other discovery services and catalogs.
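For anyone who wants to check entries automatically, the three formats above can be sketched as a small Python validator. This is only an illustration under my own assumptions: the function name `date_precision` is hypothetical, and the check is not part of the class software or of the W3C note itself.

```python
import re
from datetime import date

# YYYY, optionally followed by -MM, optionally followed by -DD.
_DATE_PATTERN = re.compile(r"([0-9]{4})(?:-([0-9]{2})(?:-([0-9]{2}))?)?")


def date_precision(value):
    """Return 'year', 'month', or 'day' for a valid entry, otherwise None."""
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        return None  # wrong shape, e.g. 03/20/2014 or March 2014
    year, month, day = match.groups()
    if month is None:
        return "year"
    if not 1 <= int(month) <= 12:
        return None  # no such month
    if day is None:
        return "month"
    try:
        date(int(year), int(month), int(day))  # rejects dates like 2014-02-30
    except ValueError:
        return None
    return "day"
```

Returning the level of specificity, rather than a plain yes/no, mirrors the guideline that indexers should record only as much of the date as is actually known.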

Tuesday, March 18, 2014

Making sense of the metadata world

Having come into my LIS program from the perspective of an elementary school librarian, I had expected to have my view of the field open up drastically, and to learn about concepts I had never been exposed to before. Metadata was one of those concepts, which even for the first year or so of my studies boiled down inside my head to "it's like cataloging, but, not for books, right?" Needless to say, my Metadata class has opened my eyes a great deal as to the versatility of the metadata concept, and the numerous tools, ideas, and structures that are associated with it. But still, for someone who is focusing on mastering Dublin Core elements and their uses, there is a minefield of confusion and frustration as concepts and aspects of metadata pile upon each other in my mind. Schema, descriptive standards, content standards, digitization standards, left-side, right-side, refinements, qualifiers, MODS, MADS, LCSH, EAD, VRA, MeSH...the list goes on and on, with my sometimes overwhelmed brain trying to sort it all out and make sense of it.

My LS 566 classmate, Michele, has posted an excellent recent blog entitled "Big Picture," which tackles some of the concepts listed above, and helps to break down some of the distinctions between the various types of standards and how they are used, as well as listing some examples of each. I found the post to be extremely helpful in developing and reinforcing my own understanding of many of these various ideas, and would highly recommend giving it a look. As for metadata's adherence to the EHAA (Everything Has An Acronym) standard that similarly bogs down the field of education I currently work in, well, that probably can't be helped.

Sunday, March 16, 2014

Dublin Core: Identifier Element

Though I have already been assigned my personal element for our class Digital Indexing Project, I've felt compelled to spend some extra time browsing through the other DC elements to get a better understanding of what they all do. Possibly one of the more underestimated elements, especially in the digital environment, is the Identifier element, which contains unique identifying information for a particular resource. There are a wide variety of possible identifiers, including ISBNs, file names, accession numbers, and URLs, and each can play a role in helping users locate the exact document or resource that they are searching for.

One of the more useful classes of identifiers that may show up in this metadata field is the DOI, or Digital Object Identifier. To help explain what a DOI does, consider the much-experienced scenario of a user having finally found the information resource they've been searching for, and when they click on the link to reach it...Object Not Found HTTP 404. Don't you hate that? To help avoid such frustration, some information resources possess a DOI, which is a unique character string by which the resource can be searched for and found, wherever in the ether of the web it may have moved to. While not all informational objects possess a DOI, the practice is becoming increasingly common, especially with items such as journal articles.
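As a rough illustration of how that "searched for and found" step works in practice, a DOI can be turned into a link by appending it to the public resolver at https://doi.org/, which then redirects to wherever the resource currently lives. The Python sketch below only builds that link; the function name `doi_to_url` is my own, the shape check is deliberately loose, and the sample string in the comment is just a placeholder.

```python
from urllib.parse import quote

# Common ways a DOI gets written down; stripped before building the link.
_KNOWN_PREFIXES = ("https://doi.org/", "http://dx.doi.org/", "doi:")


def doi_to_url(doi):
    """Build a resolver link for a DOI string such as '10.1000/182'."""
    doi = doi.strip()
    for prefix in _KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Loose sanity check only: DOIs start with "10." and contain a slash
    # separating the registrant prefix from the suffix.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode anything unsafe in a URL, but keep the slashes.
    return "https://doi.org/" + quote(doi, safe="/")
```

The useful part for a metadata record is that the Identifier value stays the same even when the publisher moves the page; only the resolver's redirect changes.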

For another take on the Identifier element, head over to my classmate Kasie's blog, where she provides some brief information as well as some helpful links on the topic.

Wednesday, March 12, 2014

Metadata Survivor

Had a real interesting experience in my Metadata class tonight, as we all made our pitches for why the elements assigned to us should be considered and utilized for the collection. Date, my element, which can cover a wide variety of temporal information related to an object, as discussed in an earlier post, made the cut...as did most of the elements. The notable castaways included Contributor, for which a strong case just could not be made for inclusion, as well as Language. These will likely be replaced by elements covering identification of the responsible indexing or entry person for the metadata, as well as one considering the football players likely to be found in the Bama football images we are describing.

It was intriguing to hear the wide variety of justifications and possible uses for elements provided by my classmates, and it was an excellent opportunity to think critically about the importance of specific elements to the description of digital collections. I also found it challenging to consider how elements are employed for digitized objects versus ones that are born digital, and how that distinction can alter the usefulness of those elements. This has been a very educational project so far, and we're just getting started on it. More to come as we get further along.

Tuesday, March 11, 2014

Controlled Vocabulary

One of the topics that I find myself returning to again and again in examining metadata and cataloging is controlled vocabulary. Steven J. Miller, in our textbook, Metadata for Digital Collections, broadly describes a controlled vocabulary as "any standardized list of terms that have been selected for consistent use in describing or indexing information resources" (2011, 129). By ensuring the consistent use of terms to describe information objects, a metadata creator can more easily facilitate their retrieval by potential users. While the idea of a controlled vocabulary is nearly synonymous, for librarians, with tools such as LCSH (the Library of Congress Subject Headings), there are a variety of types and forms, including lists, thesauri, taxonomies, and classification schemes, among others.
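Here is a minimal Python sketch of what "consistent use of terms" can look like in practice, assuming the simplest type of controlled vocabulary: a flat list of preferred terms with a few variant entry terms pointing to them. The terms and the function name `preferred_term` are invented for illustration and do not come from LCSH or any real vocabulary.

```python
# Hypothetical vocabulary: each preferred term lists the variants that
# indexers or searchers might type instead.
VOCABULARY = {
    "Football players": ["footballers", "football player"],
    "Stadiums": ["stadium", "stadia"],
    "Coaches": ["coach", "head coaches"],
}

# Build one lookup table from every accepted spelling to its preferred term.
_LOOKUP = {}
for preferred, variants in VOCABULARY.items():
    for spelling in [preferred, *variants]:
        _LOOKUP[spelling.casefold()] = preferred


def preferred_term(entry):
    """Return the preferred term for an entry, or None if it is not listed."""
    key = " ".join(entry.split()).casefold()  # ignore case and extra spaces
    return _LOOKUP.get(key)
```

Returning None for an unlisted term, instead of quietly accepting it, is the "controlled" part: the indexer has to pick a listed term or ask for the vocabulary to be extended.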

Trying to distinguish between the different types of controlled vocabularies can be daunting, even for information professionals, so any resource that can facilitate understanding is greatly appreciated. For those of you interested in learning more about the differences between these various descriptive tools, my LS 566 classmate, Cassandra, recently posted a blog that included a very helpful link, entitled Controlling your Language: A Directory of Metadata Vocabularies. Courtesy of Jisc Digital Media, this guide walks you through the basics of controlled vocabularies, and also describes the important features and uses of the major types that may be encountered. It's a quick read, and definitely worthwhile if you have interest in subject description, so be sure to check it out if you have some time to spare.

Miller, Steven J. 2011. Metadata for digital collections: a how-to-do-it manual. New York: Neal-Schuman Publishers. University of Alabama Libraries’ Classic Catalog, EBSCOhost (accessed March 11, 2014).

But wait...there's so much more

A little help from Organization Monkey for this entry.

Marie Kennedy, http://orgmonkey.net/?p=266, Jul 13, 2008

One of the real eye-openers for me as part of the SLIS Program at the University of Alabama, and through my LS 566 Metadata class, has been a recognition of the incredible opportunities that await library and information professionals in the Digital Age. Digital archives and records repositories, online libraries, digital image and music collections...the needs for information organization in the online environment are countless. Understanding the principles of information description, and being able to apply them, whether to a traditional OPAC or to any variety of metadata formats, will only become more useful as more and more information migrates into exclusively digital forms. Now, trying to make sure that people see the value in bringing human organization to all that data, and trying to dispel the archaic notion that librarian=person behind desk in building...well, that's another matter.

Monday, March 10, 2014

Metadata Games

There's no question that metadata is around us wherever we go in the digital world. Whether it is the personal information that notes our virtual journey to a site, the administrative metadata embedded in a webpage, or the descriptive metadata that tells us about a selected informational object, we are adrift in a sea of metadata. Now, it is even there for us when we're ready to unwind and do something fun, in the form of Metadata Games. The product of cooperative efforts from organizations including Dartmouth College's Tiltfactor Game Laboratory, the National Endowment for the Humanities, and the American Council of Learned Societies, Metadata Games gives aspiring metadata creators and taggers the opportunity to help construct real metadata descriptions for real digitally archived material through several pieces of interactive software. By drawing on the elements of crowd-sourcing and mobile gaming, Metadata Games seeks to use the population at large to solve the problem of description-less digital collections. It is definitely an interesting idea, though questions abound about how to uphold traditional descriptive practices such as specific entry and controlled vocabulary.

Saturday, March 8, 2014

Guidelines for Metadata Creation

Being responsible for metadata creation can be a bit daunting at times, even if many of us are involved in the practice on an almost daily basis. I suppose much of that depends on the context, as when we're doing it during our daily work tasks, or in the comfort of our homes, it's part of an intuitive routine, while as part of a class assignment, it is something that you are definitely going to be assessed on, and you have to make sure it is done correctly. Since practices and requirements for creating metadata can vary wildly from situation to situation, it's good to have some solid guidelines to refer back to every now and again.

My LS 566 classmate, Tamara, recently blogged about some "Practical Principles for Metadata Creation and Maintenance," as passed on by the J. Paul Getty Trust. Though the emphasis of the guidelines seems geared more toward institutional and corporate metadata creation, there is still insight to be gained for all types of metadata creators, and they are definitely worth a look as we begin to approach our imaging project.

Wednesday, March 5, 2014

Choosing a Digital Repository

In addition to the Digital Imaging project talked about over the last few blogs, I will also be responsible for the presentation of a digital repository in my LS 566 Metadata class. Now there are a ton of really great digital repositories out there across the web, covering a wide variety of subject matter, and differing greatly in the manner in which images are presented and described, so the task of choosing just one can be a little bit overwhelming. Another consideration is the set of requirements for our presentation, which include identifying information such as the metadata schema and content standards utilized, any digitization standards noted, and whether the metadata records are included in any online catalogs or aggregators. This means that any "ideal" choice for this assignment will probably include quite a bit of detailed, "under-the-hood" type information on the site itself, or at least be prominent enough that those characteristics would be mentioned by any outside sources discussing the site. So how to go about choosing?

With the aid of a few links from our professor, I was off in search of a good digital image repository. As someone who has an avid interest in history, I was hoping to find something with a distinctly historical bent, maybe with pictures from the Civil War, or a similar era. After a few keyword searches, though, I came up with a collection that sounded fairly intriguing, housing a variety of governmental posters from World War II. The World War II Poster Collection from the Government and Geographic Information and Data Services Department at the Northwestern University Library (isn't that a mouthful?) is a digital collection of over 300 posters issued by various governmental agencies during the Second World War that were intended to help maintain the morale and resolve of the American people during that great conflict. I always find it interesting to see the values and priorities that were emphasized in documents from the past, so browsing through this repository has been an enjoyable experience so far. Also interesting, from a library and information studies perspective, is the integration of the collection's records with the cataloging systems of both Northwestern University and OCLC. Taking a closer look at the mechanics underlying these records, and the makeup of the repository, will be a big part of my assignment going forward, and I will be sure to post my progress in this direction.

If you've never taken a look at World War II posters before, they are definitely an interesting part of Americana, so take a few minutes to browse if you get a chance.

Monday, March 3, 2014

On Subject Description: "Ofness and Aboutness"

One of the major challenges that catalogers and metadata creators confront in subject description is identifying the intended meaning behind the object being examined. While many books, images, and recordings are relatively straightforward in the themes and topics they explore, there are also many instances in which the intended meaning of the object is distinctly different than its superficial characteristics. George Orwell's Animal Farm, for instance, is superficially the tale of animal interaction on a fictional farm, though the meaning behind this story is a satire and critique of the Russian Revolution and the totalitarianism of the Soviet Union. In the cataloging and metadata world, these distinctions represent the difference between the "ofness" of a work, and the "aboutness" of a work. Being able to identify both is a key component in ensuring that a bibliographic item is correctly described.

For a slightly different look at this topic, check out this recent post from one of my LS 566 classmates, which examines "ofness" and "aboutness" in the context of a Pablo Picasso painting, as well as some links providing a bit more insight into the subject.

Saturday, March 1, 2014

Looking at my assigned Dublin Core elements

Date, Creator, Contributor...these will be the Dublin Core elements that my two groupmates and I will be responsible for as part of our LS 566-Metadata digital imaging assignment. It's a pretty interesting grouping, being fairly straightforward in the type of information being entered--there's not a huge amount of room for interpretation, as opposed to something like the Subject field. Nevertheless, these are fields where consistency in data entry will be supremely important, as the lack of uniform dates or author names, for instance, can really wreak havoc for users of the collection. As part of our first task for the assignment, my group was asked to look at the three elements and to discern whether they would be appropriate, or necessary, for the needs of the collection being described. I've included my brief thoughts below.

Date (Art Images)-This is the element I am personally responsible for, and one that I discussed in a little more depth during an earlier blog post. Date would seem to be an essential metadata element for nearly any collection, and is likely to have multiple uses in the description of digital art images, including the item creation date and the date of digitization, as well as the copyright date. The foremost considerations for using the Date element will be the use of a consistent format for date entries, and the question of how the dates can be best associated with the information they represent. The use of qualifiers or refinements, where possible, should significantly help with the latter.

Creator (Football Images)-The Creator element represents the individual or corporate entity responsible for the creation of the item being described, such as the author, photographer, or artist. This would also seem to be an obvious element to be utilized in the description of digital images, so long as the specific photographer(s) are known. As with Date, the issue of controlling entry terms to ensure consistency will be one of the important considerations when working with the element. Determining a standard order by which Creator names will be entered would go a long way toward easing our work with this element.
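One possible "standard order" is the inverted form familiar from library catalogs (Last, First). The Python sketch below is only an illustration of that idea under a big assumption, namely that the final word of a personal name is the surname; the function name `inverted_name` is hypothetical, and corporate names, suffixes such as Jr., and multi-word surnames would all need real authority control rather than this shortcut.

```python
def inverted_name(name):
    """Return a personal name in 'Last, First Middle' order.

    Assumes the final word is the surname, which is a simplification.
    """
    name = " ".join(name.split())  # collapse stray whitespace
    if "," in name:
        # Already inverted: just tidy the spacing around the comma.
        last, _, rest = name.partition(",")
        return f"{last.strip()}, {rest.strip()}"
    parts = name.split(" ")
    if len(parts) == 1:
        return name  # single-word names are left untouched
    return f"{parts[-1]}, {' '.join(parts[:-1])}"
```

Whatever order is chosen, applying it the same way to every record is what makes all of one photographer's images collocate under a single Creator heading.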

Contributor (Football Images)-My group struggled with trying to determine the relevance of this element to our project. Commonly used for persons or entities such as illustrators, translators, or anyone responsible for adding to the content being described, Contributor may not be needed when dealing with this collection of digital photographs. It is possible that a person responsible for graphically editing the content of the photos, or perhaps their digitization, could be listed under this element, if their identity is known.

So there is a brief rundown of what I will be working on for the digital imaging project. It should be an interesting experience, and I'm especially looking forward to using some elements that will require a bit of authority control in their entry. I'm also eager to hear how the rest of the class feels about their assigned elements.