Describing Diasporan Digital Information Objects

The current global information environment has increased competition for places, such as libraries, where people have traditionally turned for access to information. As a result of this competition, it is crucial for managers of large bodies of information (curators, librarians and archivists) to make the “search and find” process seamless.

Because of the array of search engines available to the everyday user (Google, Yahoo, Bing, etc.), web users are more frequently bypassing library systems and using alternative data resources. Since 2005, libraries have found themselves in the position of having to compete for users of their services. Increasingly, they find that they must publish semi-structured and structured data on the Web. “In essence, publishing interlinking data marks a shift in thinking from publishing data in human readable HTML documents to machine readable documents. That means that machines can do a little more of the thinking work for us.” (http://www.linkeddatatools.com) The goal of this added resource description is to enable next-generation library users to benefit from embedded information designed to ease the interchange of multidisciplinary documents.

Libraries aiming to appeal to web-savvy users have found that a common way to specify meaning among webpages is to use the Resource Description Framework (RDF), an essential part of publishing Linked Data in the Semantic Web. Linked Data builds directly on the architecture of the Web and applies a democratic, decentralized approach to the task of sharing data on a global scale.

“Linked Data is about using the Web to connect related data that wasn’t previously linked, or using the Web to lower the barriers to linking data currently linked using other methods” (http://www.linkeddatatools.com). Because the linkages are created through particular perspectives, it is important to note that the majority of the digital archives and repositories we access as users are located and maintained in racialized, neo-European diasporan locations. Historically, libraries have been the curators and preservers of middle- and upper-class materials. Thus, they structure linkages through specific class and racialized frameworks.

I propose that using the Resource Description Framework can help enrich the current Web environment by spreading awareness through the formal publication of relationships across more diverse networks. The aim of this paper is to demonstrate what such a diverse relationship might look like by looking at Afro Latin Music. I also seek to establish an understanding of the value and the challenges that Afro Latin Music digital objects hold when represented as Linked Data versus traditional hyperlinked data. Linked Data has the potential to play an important role in establishing a Web standard that could enhance Web users’ cultural awareness.

Linked Data goes beyond linking and allows for contributions of meaning. It can assist with information integration, help create new paths to information discovery (Patuelli, 2012) and influence the cultural competency levels of web surfers. Linked Data can help to liberate information from its silos, opening up the web to artificial intelligence processes. By adding meaning to each web address we can change the nature of ‘the link’. For example, whereas the link between web documents has no meaning other than “link”, a Linked Open Data link itself has specific meaning. Take the example of a library record located at the URL: http://vfrbr.info/work/12502. You don’t know anything about that URL other than what you can determine from the text in the web address. By embedding data in the links associated with the URL, information managers can characterize each link with a meaning such as “subject of the work”, “stylistic origin”, “preferred title of the work” and “composer”.
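To make the contrast concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how any library system actually stores its records. A plain hyperlink is modeled as a bare pair of addresses, and a Linked Data link as a subject, relationship, object triple. The relationship labels come from the examples above; the `example.org` addresses and the `links_with_meaning` helper are hypothetical placeholders.

```python
# A plain hyperlink only says "this page points to that page".
plain_link = ("http://vfrbr.info/work/12502", "http://example.org/some-page")

# A Linked Data link is a triple: the middle element names the relationship.
# The object addresses below are hypothetical placeholders, not catalog data.
WORK = "http://vfrbr.info/work/12502"
typed_links = [
    (WORK, "preferred title of the work", "http://example.org/title/1"),
    (WORK, "composer", "http://example.org/person/1"),
    (WORK, "subject of the work", "http://example.org/subject/1"),
    (WORK, "stylistic origin", "http://example.org/style/1"),
]


def links_with_meaning(triples, subject, meaning):
    """Return every object linked from `subject` through the named relationship."""
    return [o for s, p, o in triples if s == subject and p == meaning]


print(links_with_meaning(typed_links, WORK, "composer"))
```

With the plain link there is nothing to ask beyond “what does this page point to?”. With the typed links, a program can ask for the composer, the subject or the stylistic origin by name.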

A good example of the possibilities that Linked Data offers is the FRBR model (Functional Requirements for Bibliographic Records), a metadata standard developed and endorsed by the International Federation of Library Associations and Institutions. Key concepts in the FRBR model are Entities, Attributes and Relationships. Entities are the things users are searching for (bodies of works, people and concepts that users are interested in obtaining). Attributes (characteristics of some concept, person or body of work) and Relationships are most important to users in formulating searches. For example, a composer’s work (http://vfrbr.info/work/121502) and person (http://vfrbr.info/person/3880) entities are associated through a “created by” (frbrer:P2009) authorship relationship in the library linked data record below.

Through a search interface, a user of a library catalog can search for Afro-Cuban composer Tania Leon’s orchestral work “Bata” using descriptive links: composer, title and recording (or score). Each link is dereferenceable and is defined in a namespace (a virtual container for controlled vocabularies). The link attributes used in the above example come from the “frbrer” and “frad” metadata standards published at http://metadataregistry.org. The definitions for the “roles”, “Elements”, and “vfrbr” vocabularies are located at: http://www.dlib.indiana.edu/projects/vfrbr/data/rdf/Ontology/owl_vfrbr.rdf.
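The role a namespace plays can be sketched in a few lines of Python. A prefixed name such as frbrer:P2009 is shorthand: the prefix stands for a namespace address, and joining the two gives the full, dereferenceable link. The namespace addresses in the table below are assumptions made for illustration and should be checked against the published vocabularies. The “vfrbr” prefix is left out because the text gives only the location of its definition file, not its namespace address. The `expand` helper is hypothetical.

```python
# Prefix -> namespace address. These addresses are assumptions for
# illustration; consult the published vocabularies for authoritative values.
NAMESPACES = {
    "frbrer": "http://iflastandards.info/ns/fr/frbr/frbrer/",
    "frad": "http://iflastandards.info/ns/fr/frad/",
}


def expand(prefixed_name, namespaces=NAMESPACES):
    """Turn a prefixed name such as 'frbrer:P2009' into a full address."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return namespaces[prefix] + local


print(expand("frbrer:P2009"))  # the "created by" relationship
print(expand("frad:P3050"))    # one of the subject properties
```

Because every community can publish its own namespace, a new vocabulary can sit alongside existing ones simply by adding another prefix to the table.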

What makes the above relationships successful is that they were designed for a specific community. This item was described with the information needs of the music students at Indiana University in mind. This data could, however, help other scholars looking to work with the same material. For example, an Afro-Latin digital humanist looking to make historical, sociological, and anthropological connections between the contents of these web resources and other datasets specific to Afro-Latin American research could use this particular linked data if the records were enhanced.

Let us look more closely at Tania Leon’s piece. The very title of this orchestral work, “Bata”, can have various meanings depending on the researcher: Afro-Latin America, Cuba, drums, religious musical accompaniment, Santeria, etc. Yet this information is missing from the example above. Thus, the record serves a purpose for music scholars, but it obscures the work’s cultural identity and any Afro-Latin references. It illustrates the ways in which racialized identities are silenced. Specifically, the subject metadata in this record lacks any formal indication that the subject of the work is Afro-Cuban. The vocabularies used to describe subject information (frad:P3050 and vfrbr:subjectOfTheWork) do not allow for common terms or language terms that might have meaning to those familiar with Afro Latin culture. These omissions may unintentionally prevent a user from locating records based on cultural identity or race. Without specialized knowledge of Afro Cuban composers or of searching for music items digitally, this cultural data might remain hidden to scholars unfamiliar with Afro Latin culture.

Keeping in mind the advantages of community-specific relationships (e.g., styles of the Afro Cuban Rumba cycle: Columbia, Yambu, Guaguanco), creating an Afro Latin Music specific metadata standard could help facilitate broader access to the previously mentioned Tania Leon record in the Variations Music Library. For those already looking to make connections with large datasets, the Linked Open Data community offers a growing body of data with which to work. For example, http://dbpedia.org has published a multi-domain vocabulary with enough instance data to enhance the relationships associated with the Tania Leon library record and allow users to reach external datasets using related terms like “Santeria”, “Religion_in_Cuba”, “ethnic_groups” and “Afro-Latin_American”. The Geonames vocabulary provides us with the tools to link our existing geographical data with information about Cuba.
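What such an enhancement might look like can be sketched in Python. This is a rough illustration under stated assumptions, not the Variations project’s data or method. It assumes that the DBpedia terms named above resolve under http://dbpedia.org/resource/, and it uses the Dublin Core “subject” and “title” properties as stand-ins for a future community vocabulary. The `enrich` and `works_about` helpers are hypothetical.

```python
DBPEDIA = "http://dbpedia.org/resource/"           # assumed address pattern
DCT_SUBJECT = "http://purl.org/dc/terms/subject"   # stand-in subject property
DCT_TITLE = "http://purl.org/dc/terms/title"       # stand-in title property
WORK = "http://vfrbr.info/work/121502"

# The record as the text describes it: a title, but no cultural subject links.
record = [(WORK, DCT_TITLE, "Bata")]


def enrich(triples, work, terms):
    """Return a new list with one subject link per DBpedia term added."""
    return triples + [(work, DCT_SUBJECT, DBPEDIA + term) for term in terms]


def works_about(triples, term):
    """Find works that carry a subject link to the given DBpedia term."""
    target = DBPEDIA + term
    return sorted({s for s, p, o in triples if p == DCT_SUBJECT and o == target})


enriched = enrich(record, WORK, ["Santeria", "Religion_in_Cuba", "Afro-Latin_American"])

print(works_about(record, "Santeria"))    # before enrichment: nothing is found
print(works_about(enriched, "Santeria"))  # after enrichment: the work is found
```

The same search that returns nothing against the original record finds the work once the cultural subject links exist, which is the access gap a community vocabulary would close.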

Standardized formats (e.g., DTD, XML, RDF) suggest that a specific community adopt a controlled vocabulary (as mentioned above), which in turn encourages more accurate representation in a variety of languages. Because there is no single kind of metadata for documents or other information objects and because there is no generic vocabulary to draw from, it becomes important for those working in the Afro-Latin American space to collaborate. They can join some of the other communities that have invested resources in facilitating the integration and reuse of web content by using and enhancing previously published structured data (see the linked open data cloud for examples).

As people whose cultural identities are central to how we think about data, we can help other scholars across disciplines find the hidden jewels in existing datasets by developing our own domain-specific vocabularies. In providing information, we must address issues of power and representation when preparing to describe any aspect of Diasporan cultural heritage objects. Metadata generation could serve as a community-driven platform for information producers, enablers and controllers alike, as a means to contend with the power relations between the West and the African Diaspora research community (Konadu, 2011).

Digital humanists working in the Afro-Latin American space can benefit from the adoption of metadata standards to make better informed decisions with the data they find and to encourage its reuse. We can all do our part to establish an African Diaspora controlled vocabulary and dataset. Ever try to foaf yourself? Try it… metadata is fun. Then try linking it to a paper/recording/video of yours. It doesn’t have to stop there.
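For anyone who wants to try, here is one small way to “foaf yourself”, sketched in Python. It writes a short description in the Turtle notation using the FOAF vocabulary (http://xmlns.com/foaf/0.1/) and its Person, name and made terms. The person address, the name and the paper address are hypothetical placeholders to be swapped for your own, and the `foaf_profile` helper is an illustration rather than a standard tool.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def foaf_profile(person_iri, name, made_iris=()):
    """Build a small Turtle document describing a person and things they made."""
    # Escape backslashes and quotes so the name stays a valid string literal.
    safe_name = name.replace("\\", "\\\\").replace('"', '\\"')
    statements = ["a foaf:Person", f'foaf:name "{safe_name}"']
    if made_iris:
        made = ", ".join(f"<{iri}>" for iri in made_iris)
        statements.append(f"foaf:made {made}")
    body = " ;\n    ".join(statements)
    return f"@prefix foaf: <{FOAF}> .\n\n<{person_iri}>\n    {body} .\n"


# Hypothetical placeholders: replace with your own address, name and work.
print(foaf_profile(
    "http://example.org/people/me#i",
    "A. Researcher",
    ["http://example.org/papers/1"],
))
```

Each address in the foaf:made list is itself a link, so the description of the paper, recording or video can go on to name its own subjects, places and people.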

Acknowledgement: Data from the Variations/FRBR project of the Digital Library Program at Indiana University was used as primary data for this paper.

See the original post on http://afrolatinoproject.org/

_______________

Works Cited

Coyle, Karen (2005). Understanding metadata and its purpose. Journal of Academic Librarianship, 31(2), 160-163.

Konadu, Kwasi (2011). Accessing the archives: sources, subjects and subjugation in the African world. In Benjamin Talton and Quincy T. Mills (Eds.), Black subjects in Africa and its diasporas (pp. 179-201). New York: Palgrave Macmillan.

Patuelli, Christina M. (2012). Personal name vocabularies as linked open data: a case study of jazz artist names. Journal of Information Science. DOI:10.1177/0165551512455989
