Update 3/11/2010: This post is getting a lot of traffic so I thought I could at least mention a recent related blog post on open data in archaeology from the Open Knowledge Foundation Blog. I’ve collected more links to resources on this topic in my delicious bookmarks, and will hopefully be adding more soon (I have to integrate them from another delicious account I was using while working on the paper discussed in this blog post).
I spent my stay-cation this past week trying to plow through all the articles and books I’ve amassed for my term paper on metadata for digital collections of archaeological materials. I use the vague word “materials” because one of the things I need to decide is whether I’m going to discuss things like datasets and 3D models and other fun things that might differentiate archaeology collections from art collections. I thought I was going to focus on descriptive metadata for images. It could be interesting to consider the differences between images of artifacts and “art” images. Usually if an artifact is important enough to get its own metadata, it’s probably moved into the realm of “art”, right? But what about all the photographs and other imagery generated during excavations? I need to figure out how people are currently putting this stuff online, how it fits in with the hard data, and whether there are any standard practices for describing these things, either as a unit or individually.
Everything I’ve read so far indicates that there’s a lack of standards (both for digital and physical collections), partly because there’s no consistency in the types of data collected by various archaeological projects, and partly because of differences in recording protocols, terms, measurement units, and language (Styliadis et al., 2009; Snow et al., 2006). The March 2009 issue of The SAA Archaeological Record, published by the Society for American Archaeology, has numerous articles devoted to the topic of international curation standards for archaeological collections. In her article, “Creating Digital Access to Archaeological Collections,” Julia A. King writes:
…while most archaeologists now use digital technologies in their work (for report production and image capture, for example) minimal consideration has been given to the long-term preservation and accessibility of the materials generated through this work (and, by accessibility, I don’t mean just the ability to ‘find’ objects or records within a repository. I also mean the ability to get relatively quick access to the data represented by these materials for research and interpretive purposes). The archaeological collections management literature, which has enjoyed considerable growth covering a wide range of topics in the last 20 years, has yet to consider the challenges of managing digital collections in the kind of detail afforded physical collections.
Earlier in the same issue, in an article entitled “From the Dust to the Disk”, David Bibby writes that each excavator collecting data in their own way “has lead to a myriad of variations…The key to successful data preservation is structured data collection. There has to be some common denominator, even if at only a very basic level — safeguards to ensure data integrity and security as well as some guarantee that future users of the excavation data will have an approximate knowledge of what to expect.” (17). He goes on to describe a recommended data structure designed to work with any sort of excavation data.
The most interesting articles I’ve read address the problem by proposing the use of concept ontologies and mapping, so that archaeologists, curators, and everyone else are not required to use a single data model. The goal is “cyberinfrastructure”. Snow et al. (2006) advocate developing database mediation services that would encompass the various perspectives in archaeology, but would also “facilitate future efforts within the archaeological community to establish common, minimal standards for metadata descriptions of artifacts, sites, maps, and other academic resources”. Kintigh (2006) and Sugimoto, Felicetti, Perlingieri, & Hermon (2007) discuss semantic data integration for archaeology using an ontological approach.
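To make the mapping idea concrete for myself, here is a minimal sketch in Python. It is not taken from any of the papers above: the two project schemas, the field names, and the three "shared concepts" are all invented for illustration, and a real system would map to a published ontology rather than a hand-made set of strings. The point is only that each project keeps its own fields and units, and a per-project mapping translates records into common terms.

```python
# Hypothetical shared concepts. These names are made up for this sketch;
# they are NOT real classes or properties from CIDOC-CRM or any other standard.
SHARED_CONCEPTS = {"object_type", "material", "length_mm"}

# Each (hypothetical) project keeps its own field names and units.
# A mapping pairs a local field with a shared concept and a conversion function.
PROJECT_A_MAPPING = {
    "artifact_class": ("object_type", str.lower),
    "fabric": ("material", str.lower),
    "len_cm": ("length_mm", lambda v: float(v) * 10.0),  # centimetres to millimetres
}
PROJECT_B_MAPPING = {
    "Typ": ("object_type", str.lower),
    "Material": ("material", str.lower),
    "Laenge_mm": ("length_mm", float),
}


def to_shared(record, mapping):
    """Translate one local record into shared-concept terms."""
    shared = {}
    for field, value in record.items():
        if field not in mapping:
            continue  # unmapped local fields are simply left out of the shared view
        concept, convert = mapping[field]
        if concept not in SHARED_CONCEPTS:
            raise ValueError(f"mapping targets unknown concept: {concept}")
        shared[concept] = convert(value)
    return shared


# Two records describing similar objects in two different local schemas.
record_a = {"artifact_class": "Bowl", "fabric": "Ceramic", "len_cm": "12.5", "trench": "T4"}
record_b = {"Typ": "bowl", "Material": "ceramic", "Laenge_mm": 125}

merged = [
    to_shared(record_a, PROJECT_A_MAPPING),
    to_shared(record_b, PROJECT_B_MAPPING),
]
```

Once both records are expressed in the shared terms they can be searched and compared together, even though neither project changed how it records data. The local field "trench" has no mapping in this sketch, so it drops out of the shared view, which hints at the real trade-off: whatever is not mapped is invisible to cross-project queries.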
I am just scratching the surface of this, and I wonder how much not being an IT person is going to impede me. I have many, many things to investigate:
- what sort of metadata is required to facilitate semantic data integration?
- which thesauri and classification systems best support data interoperability, and are those systems being used on archaeological data?
- I need a better understanding of XSLT, OAI-PMH, and RDF (and, let’s face it, XML too).
- I need a better understanding of CIDOC-CRM, MIDAS, SPECTRUM (UK Museum Documentation Standard) and other museum data standards.
- I need to look at the websites of FISH and EPOCH, and look more deeply at the ADS website.
- I need to play around with any online collections of archaeology data I can find. tDAR (prototype?), ADS catalog?, …
- What is the most recent work that has been done on this? What is the current status of the much hoped-for archaeology cyberinfrastructure?
- Check out some links from the page of this Archaeology and Cultural Heritage Application Working Group.
- Find some of the papers that were presented at VAST 2009.
Can I please move to Europe?
Bibby, D. (2009). From the Dust to the Disk: Collection and Preservation of Digital Excavation Data in Baden-Württemberg. The SAA Archaeological Record, 9 (2), 17-20.
King, J. (2009). Creating Digital Access to Archaeological Collections. The SAA Archaeological Record, 9 (2), 25-30.
Kintigh, K. (2006). The Promise and Challenge of Archaeological Data Integration. American Antiquity, 71 (3), 567-578.
Snow, D., Gahegan, M., Giles, C. L., Hirth, K. G., Milner, G. R., Mitra, P., & Wang, J. Z. (2006). Cybertools and Archaeology. Science, 311 (5763), 958-959.
Styliadis, A.D., Akbaylar, I. I., Papadopoulou, D. A., Hasanagas, N. D., Roussa, S. A., & Sexidis, L. A. (2009). Metadata-based heritage sites modeling with e-learning functionality. Journal of Cultural Heritage, 10 (2), 296-312.
Sugimoto, G., Felicetti, A., Perlingieri, C., & Hermon, S. (2007). CIDOC-CRM Spider: Stonehenge as an Example of Semantic Data Integration. In D. Arnold, F. Niccolucci, & A. Chalmers (Eds.), VAST 2007: 8th International Symposium on Virtual Reality, Archaeology, and Intelligent Cultural Heritage (pp. 47-54). Aire-La-Ville, Switzerland: Eurographics Association.