While technology affords creation of digital collections, and promises access to all, the reality is that many cultural data collections exist in a precarious ecosystem, where erratic funding, fragmented support, and disconnected expertise threaten their continued existence. As a significant branch of the broader information ecosystem, cultural data collections range in size and scope, from national institutions to bespoke local collections supported by individuals. This exploratory, qualitative study engaged cultural data experts in Australia, Canada, and the United Kingdom to map the broad cultural data ecosystem and to identify opportunities for healthier growth. The development and maintenance of cultural data collections require integration across the spheres of expertise of creators, curators, subject matter experts, information science, and computing and technology. The foundational structural elements of the ecosystem include funding, policies, access to existing data, community context, and technological infrastructure. The key elements of a healthy data ecosystem are clarity of purpose, user-focused design, sustainability, allied coproduction, and reciprocal interconnection. A healthier cultural data ecosystem means more collections and initiatives will have positive impacts for research, knowledge, and diverse communities, contributing positively to the broader information ecosystem and to society, at large.
Cultural data initiatives preserve and provide access to the historic artifacts of our communities. While national libraries, galleries, museums, and archives gather historically significant works that tell the stories of a country's broad, collective experience, individuals and small groups (such as regional historical societies) collect historical items documenting rich, localized activities. These organizations, along with the many creators and curators of cultural data artifacts, comprise a complex ecosystem that has been transformed significantly through digitization and the use of online platforms. This is what Burkey (2022) refers to as a “new memory ecosystem, whereby heritage communities are invited to contribute, participate with, and share more of what they are interested in collectively remembering, rather than simply accepting the authoritative narratives of heritage institutions” (p. 185).
Globally, the cultural data ecosystem produces, manages, preserves, and facilitates interaction with collections that are extremely diverse in the materials they include, across myriad topics, for varied audiences. The development and maintenance of these collections rely on expertise from across various academic and practice disciplines, including information science, humanities, computing, social sciences, and cultural industries, among others. The diversity in data scope, creator type, material format, longevity of retention, and availability of access points (to name a few examples) makes the cultural data ecosystem one of the most complex and rich segments of the broader information ecosystem incorporating human-generated knowledge artifacts. Terras et al. (2021) describe this ecosystem as “a patchwork of small to large scale content, held in different locations, formats and under different reuse licenses, with different institutional approaches to risk, public engagement and entrepreneurship” (p. 11).
Thus, while cultural data initiatives are growing in number, globally, they lack a cohesive, sustainable, and healthy ecosystem to enable collaboration and sharing across related contexts. Using an exploratory, qualitative design, this study demonstrates that the lack of a healthy ecosystem for cultural data initiatives results in fragmented, ineffective approaches, where collections have significant design limitations, are not sustainable, and lack clarity of purpose. The potential for cultural data to be irrecoverably lost is very real due to limited funding, insufficient infrastructure, and a precarious workforce that lacks the integrated, sustained, interdisciplinary approach needed for collections' long-term viability.
This paper examines cultural data initiatives and practices as critical components of the broader information ecosystem; it identifies their interrelated parts and articulates the features that constitute a healthy cultural data ecosystem that, in turn, benefit society. The research draws on Nardi and O'Day's (1999) concept of a “complex system of parts and relationships”—that is, an ecosystem—characterized by the presence of multiple essential elements and qualities (p. 50) and “marked by strong interrelationships and dependencies among its different parts” (p. 51). These parts are diverse and include “keystone species” necessary for the survival of the ecosystem, which often coevolve together for the benefit of the whole (p. 52). An important example of a keystone species in information ecosystems is “people who build bridges across institutional boundaries and translate across disciplines” (p. 54). Cultural data initiatives are well-suited to ecological analysis because their character as ecosystems is highly evident, relying on diverse experts committed to building bridges and coevolving as technologies and other circumstances change.
The cultural data ecosystem is a distinctive branch of the larger global information ecosystem. It includes collections managed with standardized protocols within galleries, libraries, archives, and museums, and those curated by universities, cultural organizations, businesses, special interest groups, and individuals, such as performance companies, music collectors, artists, and local history associations. The goals of these collections vary across preservation, public access, research, education, and commercial purposes. Interactive technologies that enable public interaction and annotation of collections have also contributed to this “new ecosystem of commemorative practices and collective remembering” (Burkey, 2022, p. 186).
While digitization facilitates increased access and reuse of cultural data (Terras, 2015), it also creates challenges for reuse and interoperability, and “a complex, interleaving, network of issues regarding training and upskilling, licensing and copyright, access to computational resources, access to data and consideration of the place of technological development within the cultural and creative sectors, and how this sits alongside existing or inherited activities, resources and cultural policy” (Terras et al., 2021, p. 11). While there are moves to address the barriers to improved interoperability within libraries and archives, this requires significant investment and professional expertise that is not reflected across the other organizations creating cultural data collections (Hawkins, 2022; Zhang, 2022).
While assessing and addressing the health of the overarching information ecosystem is critical, it is also significantly challenging to analyze such an immense ecosystem, across data types, formats, and intention of design. Nardi and O'Day (1999) argue that examining “the biggest picture possible” can be difficult, and even pessimism-inducing, because macrolevel processes can “seem impenetrable” (p. 57). They argue that examining a specific, locally rooted, instance can provide a “viable point of intervention in a larger system” (p. 57). Examining the health of the cultural data ecosystem provides a critical window into the interrelated workings of the individuals, groups, and data sources that comprise this significant branch of the broader information ecosystem.
Cultural datasets include information that is among the richest, most diverse, and the most ephemeral, in existence. Examples discussed in this paper include recorded performances of poetry readings and circus events, national literatures, government debates, videogame code, and resources supporting linguistic and cultural resurgence. These collections preserve and provide public access to artifacts of the arts, heritage, and culture, which “help shape reflective individuals, produce engaged citizens, impact cities and urban life, improve health and well-being, and have distinctive economic benefits” (Terras et al., 2021, p. 2). Cultural data encompass artifacts that reflect and represent what it means to be human. This is what sets these collections apart from other types of data collections. As such, there is an urgent need, and great potential, for the cultural data ecosystem to reflect the diversity of human experience, expression, and creativity, beyond the Western canon and the formats included in many library collections.
A healthy cultural data ecosystem will strengthen the broader information ecosystem by democratizing access to cultural knowledge and ensuring appropriate standards are applied by content creators and curators. In this way, the information ecosystem benefits from heightened resilience against systems of misinformation and disinformation, as abundant, interoperable, and accessible cultural data strengthen our understandings and analyses of history and society. One well-known example of the power of such a cultural data approach is highlighted by Verwayen et al. (2011), who describe how the online proliferation of fake and low-quality images of Vermeer's painting The Milkmaid led the Rijksmuseum to release their cultural data, openly. The museum discovered that there were more than ten thousand fake images of the painting circulating online, causing a situation where “people simply didn't believe the postcards in our museum shop were showing the original painting [and this] was the trigger for us to put high-resolution images of the original work with open metadata on the web ourselves. Opening up our data is our best defence against [fakes that mislead the public]” (2011, p. 2).
A healthy cultural data ecosystem is also more capable of responding to engagements with systems of power, advantage, and disadvantage, as diverse cultural data support the correction of biased historical narratives. As Burkey (2022) notes, “heritage communities can utilize digital heritage initiatives as a nexus of information, where more voices can be brought together in providing a richer set of perspectives and a multitude of conversations instead of a particular narrative” (p. 196). This echoes Montenegro's (2019) critiques of the universalization of knowledge representation where, for example, “metadata and analyses [of traditional Indigenous knowledge are] generated by professionals and authorities… external to those communities, resulting in… incorrect information about Indigenous people's histories and realities” (p. 74). Indeed, Burkey (2022) explains that there is now a “consensus [that] a wider variety of channels equates to more voices in the conversation and at least the potential for increased involvement, broader interpretations, and more democratized versions of remembering through digital heritage initiatives” (p. 192). This makes visible the untold stories of marginalized peoples who would otherwise be absent from historical records. The research results presented here demonstrate that with sufficient, consistent care and appropriate resources, the social dynamics that challenge the sustainability of cultural data initiatives can be overcome. This, in turn, enables the cultural data ecosystem to contribute to the betterment of society.
2 GLOBAL CHALLENGES IN SUSTAINABILITY OF CULTURAL DATA PRACTICES
The gathering and management of cultural data are dispersed across government agencies, libraries, museums, galleries, universities, and independent collections held by cultural organizations or individuals. This work occurs within a broad spectrum of organizational contexts and available infrastructure and resources, ranging from large-scale, formalized collections with significant, long-term investment (e.g., mandated depository collections in national libraries), to small, bespoke collections developed by individuals with limited resources (e.g., an individual researcher gathering cultural materials over the span of their career). While countries have different approaches for providing funding and support opportunities for cultural data initiatives, there are many similarities, particularly regarding long-term sustainability and accessibility challenges.
In Australia, for example, many initiatives are undertaken by the Australian Research Data Commons (ARDC), which provides high-capacity data storage, hosting for many discipline-based collections, and project funding (https://ardc.edu.au/). Research Data Australia (https://researchdata.edu.au/), an ARDC data discovery service, provides access to data collections held by over 100 organizations. Yet, many initiatives focus on large consortia, not small-scale collections. At the individual level, researchers and small organizations rely on discrete, time-limited and highly competitive funding, influenced by government priorities. For example, the Australian Research Council's (ARC's) Linkage Infrastructure, Equipment and Facilities (LIEF) grants enable university researchers to form cooperative data partnerships (see https://www.arc.gov.au/funding-research/funding-schemes/linkage-program/linkage-infrastructure-equipment-and-facilities).
Canada also provides competitive funding to support cultural data initiatives, including through the Social Sciences and Humanities Research Council (SSHRC) (https://www.sshrc-crsh.gc.ca/home-accueil-eng.aspx) and the Canada Council for the Arts (https://canadacouncil.ca/). SSHRC's Research Data Management Capacity Building Initiative, for example, supports development of skills and adoption of data management tools (see https://www.sshrc-crsh.gc.ca/funding-financement/programs-programmes/data_management-gestion_des_donnees-eng.aspx). The National Heritage Digitization Strategy (https://ccdh-cnpc.ca/) is a 10-year project (from 2016) supporting collaboration among Canada's memory institutions to preserve and provide access to heritage materials. Library and Archives Canada funds the Documentary Heritage Communities Program (https://library-archives.canada.ca/eng/services/funding-programs/dhcp/pages/dhcp.aspx), enabling community organizations (e.g., historical societies) to increase access to collections and foster preservation.
The United Kingdom (UK) provides similar, incremental funding schemes, with similar challenges for long-term sustainability (Wright & Gray, 2022). Interdisciplinary data initiatives are supported by UK Research and Innovation's Arts and Humanities Research Council (AHRC) (https://www.ukri.org/councils/ahrc/), and by international organizations such as the European Research Infrastructure Consortium's Digital Research Infrastructure for the Arts and Humanities (DARIAH) (https://www.dariah.eu/). However, the time-limited nature of these schemes, and their focus on projects instead of programmatic funding, do not provide sustainable infrastructure and skills development. Yet, several national initiatives do focus on longer-term strategies: the Creative Industries Clusters Programme (https://creativeindustriesclusters.com/), driving innovation, commercialization, and skills development; Towards a National Collection (https://www.nationalcollection.org.uk/), a project designed to remove collection boundaries and enable accessibility; and the Museum Data Service (https://artuk.org/about/museum-data-service), a partnership between Art UK, the Collections Trust, and University of Leicester to create a data repository for national museums and public collections.
3 THE NEED FOR A HEALTHY CULTURAL DATA ECOSYSTEM
Where initiatives rely on short-term funding, and small research teams, the fragility of the cultural data ecosystem is particularly evident. A healthy cultural data ecosystem relies on a mature funding and support model, where maintenance and expansion are sustained. The current approach, globally, means many artifacts are not discoverable by users, collections are developed in isolation and lack interoperability, and all phases of collecting work are managed with a precarious workforce. These aspects of the current cultural data ecosystem raise significant questions for long-term viability and sustainability.
- A reconsideration of methodologies and developing a shared understanding across the technical and humanities/arts disciplines.
- Technical infrastructure that is neither project-dependent nor time-limited, to avoid fragmentation.
- Research funding to support and recognize the conditions needed for interdisciplinary research including cross-council schemes, recognition of hybrid roles and institutional resources to enable researchers to bridge disciplinary gaps.
- Training for humanities researchers and professionals through degree programs and short courses.
Several reviews by government agencies and disciplinary peak bodies identify the need for national policies and resourcing to support consistency and interoperability (e.g., Academy of the Social Sciences in Australia, 2022; Tindall & Duncan, 2020). These reports highlight the risks to cultural data preservation due to lack of consistent policy, inadequate resourcing, and lack of skills, globally. As Terras et al. (2021) note, the “legacy of 30 years of investment in cultural heritage digitization is a patchwork of small to large scale content, held in different locations, formats and under different reuse licenses, with different institutional approaches to risk, public engagement and entrepreneurship” (p. 11).
The National Library of Australia's Trove is an excellent example of the sustainability challenges faced even by large organizations mandated to collect and provide access to cultural data. Trove provides public access to more than 6 billion cultural data items from hundreds of partners, including libraries, museums, galleries, media organizations, government, and community organizations. Yet, Trove suffers from a severe lack of funding, which culminated in a financial crisis in 2022. As Jones and Verhoeven (2022) note, this is a symptom of a larger funding crisis across a sector that needs “sustainable, recurrent funding”; this speaks to the fragility of cultural data initiatives, broadly.
4 CHALLENGES WORKING ACROSS DISCIPLINES AND PRACTICE ENVIRONMENTS RELEVANT TO CULTURAL DATA
Indeed, cultural data initiatives extend well beyond established institutions (such as libraries), or disciplines (like digital humanities). These data arise from research and practice activities in various communities and disciplines. They are curated and maintained by professionals in varied settings and contexts, with design and implementation informed by expertise across the humanities, social sciences, and creative industries, as well as computing and information science. They rely on expertise in preservation, collection, access, commercialization, decolonization, representation, knowledge organization, information behavior, user experience, human–computer interaction, and software engineering, among others.
Maintenance of digital resources falls between traditional roles and areas of expertise, and therefore responsibility for it remains ambiguous… to guarantee that both data and interface will be available in anything approximating perpetuity is an exceptionally difficult and expensive promise to make, and it is therefore no surprise that the long-term fate of so many digital humanities projects remains uncertain. (p. 61)
5 WHO ENGAGES IN THE CULTURAL DATA ECOSYSTEM?
Despite a shared interest in creating and maintaining cultural data, academic and practice-based entities are often siloed and singular in their focus. This means the various actors who need to be involved in the ecosystem may not be integrated with—or even aware of—other expertise required for short-term activities or long-term sustainability. Computing, humanities, and social sciences students and academics typically sit in different College or Faculty structures. Academic librarians may support similar disciplines, without enabling interdisciplinary investigations of cultural data needs. Funding agencies target initiatives to disciplines, reinforcing siloed approaches to handling cultural data.
Within this broader social ecosystem, cultural data initiatives are conceived of, built, and maintained by many different experts. A healthy cultural data ecosystem requires these various groups to be interconnected and willing to coevolve, to share similar goals, and to be supported by appropriate experts. There are five spheres of expertise relevant to this work, all of whom must work together for the ecosystem to thrive: creators, curators, subject matter experts, information science experts, and computing and technology experts.
5.1 Creators
Without cultural data creators there would be no collections in the broader information ecosystem. Creators include artists, performers, writers, and other professionals who create objects, performances, recordings, images, texts, or other cultural artifacts. This category also includes “broadcasters, publishers, …, innovators, creatives, exhibition creators and, developers” (Trove Strategy, n.d.). Creators may be contemporaneous, such as coders on independent video games, or historical, such as Renaissance composers. Creators generate cultural data through their creative work, and these data are integrated into various types of informing collections. For their data to endure, creators must accumulate, retain, or share them at the point of creation. Creators generally lack formal training in collection management, curation, or preservation. Rather, they may preserve their work because of its informative value, for professional or legacy connections, or due to an instinctive impulse.
5.2 Curators
Where a creator amasses cultural data over a lifetime of creative practice, a curator then shapes that cultural data into an organized, described, and now often digitized or born-digital, collection. These collections, in turn, feed into the broader information ecosystem, to represent experiences across local and global societies. This category includes all types of information professionals, including librarians, museum curators, and archivists, who are formally trained in collection, description, presentation, storage, and preservation practices. Curators navigate an “interactive relationship” between users and collections (Walter, 1996). Digital collections extend beyond the walls of information institutions and “show the potential for greater interactivity, transforming how narratives around heritage spaces and objects are constructed and interpreted” (Evans, 2016, p. 51). In the cultural data ecosystem, curators create databases, catalogues, metadata schemes, and user interfaces. They contribute expertise around system interoperability, copyright, information ethics, and record management and retention. They preserve the historical record and ensure equitable information access through education and research, providing critical content and infrastructure that intersects with the broader information ecosystem.
5.3 Subject matter experts
Subject matter specialists connect with cultural data through disciplinary expertise or practice contexts, including research and teaching. While these experts may be creators, many are not; they are often historians, linguists, musicologists, dramaturgs, or literary scholars, who may also contribute to other branches across the broader information ecosystem. These experts may not start out thinking about data, but over time, they come to think about their texts as data. This category includes experts in digital humanities, which continues to evolve as a field of practice and research (Callaway et al., 2020). It has been defined as “interdisciplinary” (Wymer, 2021) or “a field, a methodological tool kit, a discipline, a subdiscipline, and a paradiscipline” (Risam, 2021), and “an array of convergent practices” (Schnapp & Presner, 2009). Increasing integration of technology into humanities research practices has involved development of new approaches to data analysis, modeling, visualization, and conceptualization (Camlot et al., 2020; Luhmann & Burghardt, 2022; Terras et al., 2021).
5.4 Information science experts
Two information science specializations contribute significantly to cultural data initiatives, as well as to the broader information ecosystem. First, information behavior scholars bring expertise in user needs and practices. They examine how people engage with data and systems, including “information experiences, in diverse circumstances and settings, and across various personal activities and outcomes” (Given et al., 2023). They explore how people navigate information, including purposeful, question-driven information seeking, and more serendipitous, often leisure-driven, information exploration. Second, knowledge organization experts study the organization and structure of information in support of discoverability and access, with expertise in classification, indexing, abstracting, taxonomies, and metadata. Knowledge organization's contribution to the cultural data ecosystem includes resolving issues where “information is being organized ad hoc, often resulting in systems that underperform and even effectively prevent access to data, information and knowledge” (Golub & Liu, 2022, p. 17).
5.5 Computing and technology experts
Cultural data initiatives require the expertise of system designers, software engineers, and often, experts in computational analysis. These experts contribute technical expertise, rather than expertise rooted in the subject domain of a data initiative; they contribute critical skills to all branches of the information ecosystem. Their research and practices are usually focused on the computing-related organization, structure, retrieval, and management of cultural data, rather than its substance, qualitative meanings, or underlying organizational principles. Some computing and technology experts bring formal expertise in other disciplines, as well; for example, many linguists work in computational methods, having developed technical expertise in addition to subject specialization. This category includes research software engineers, who support research in the humanities and other disciplines; they often contribute to research projects by developing specific technical solutions for data collection, analysis, management, digitization, system design, and maintenance (Hettrick, 2016).
The integration of all five spheres of expertise is essential for a healthy cultural data ecosystem to deliver on intended outcomes. Yet, knowledge siloing remains widespread, with project teams often including experts from only a few of these categories. For the ecosystem to thrive, and to support long-term viability of cultural data initiatives, we need to better understand the perspectives and experiences of these groups. The results of this exploratory, empirical study identify the key elements required for a healthy cultural data ecosystem. These include elements that can enable experts, who are often trained (and work) in isolation, to collaborate more effectively in a shared commitment to allied coproduction (Figure 1).
6 METHODS AND PARTICIPANTS
This research used qualitative, in-depth, exploratory key informant interviews to map the current state of the cultural data ecosystem, globally, and to formulate recommendations for a healthier ecosystem, in future. Using a maximum variation sampling approach (Palys, 2008; Stebbins, 2008), we interviewed nine experts in three countries (Australia, Canada, United Kingdom); participants were identified using a mix of purposive, convenience, and snowball sampling. Interviews were semistructured and ranged between ~40 and ~120 min, based on participant availability; six interviews were conducted online (via Zoom or MS Teams) and three were in-person. Participants reflected the range of roles present within the cultural data ecosystem; they reflected on their own cultural data experiences and work, and they provided views and insights on global trends (including challenges and opportunities) across the sector. As appropriate in a study of experts' experiences and views, participants were given the option to be anonymized, or to be referenced by name.
- David, a professor and writer with long-term involvement in building and sustaining the Living Archive of Circus Oz, a contemporary Australian circus collective.
- Andy, a senior leader in a national archival institution in Canada.
- Jessie, a senior leader in a national archival institution in the United Kingdom.
- Sean, a digital curation librarian in the Canadian research library sector.
- Sharon, a head librarian in a research-intensive institution in Canada, whose responsibilities focus on metadata.
- Taylor, a research fellow affiliated with a major Australian cultural data initiative.
- Jason, a professor of literature and director of the SpokenWeb Partnership Network, based in Canada.
- Melanie, a historian of computing in Australia, who is closely involved in leading large-scale software preservation initiatives.
- Daniel, a data scientist based at a large Australian university.
Interviews were transcribed, verbatim, and analyzed using an inductive approach where themes were identified as they emerged from the coding process (Fox, 2008). As this research is exploratory and inductive, its results cannot be decontextualized or isolated from interpretation, and so findings and discussion are presented together (Richardson, 2000). Due to the ecosystem perspective taken in the study design, we sensitized our analysis not only to the experiences of our participants as individuals, but also to what their accounts reveal about the health of cultural data initiatives as part of a broader, global information ecosystem.
7 FINDINGS AND DISCUSSION
Across the participant group, no two experts work with the same cultural data collections. Formats in participants' cultural data work include photographs, digitized print, born-digital texts, music recordings, nonmusical sound recordings, video recordings, software, and social media and web content. As contextualized documents, these formats represent family, community, and colonial histories; books, magazines, newspapers, letters, government records, and other text-based documents; professional and amateur music; poetry readings, spoken word, oral histories, interviews, and conversations; musical scores; films and documentaries; video games and complex media artifacts such as architectural files; and the contents of social media platforms such as Twitter. As participant Sharon, head of metadata at a large Canadian research library, points out, these documents embody “intangible cultural heritage,” because they make possible the discovery of “intangible things, as captured in tangible things.” By speaking with experts with different spheres of expertise, and who work with divergent cultural data collections, this study has identified common elements and challenges.
First, our analysis finds five predominant structural elements underpinning the cultural data ecosystem. In the following sections we outline these elements, including characteristics that lead them to function in unhealthy or healthy ways. Second, our analysis also reveals rich distinctions between the five spheres of expertise described previously, identifying considerations for assembling effective teams for the collection, preservation, and study of cultural data. Third, participants' accounts identify five core signs of health needed within the cultural data ecosystem and, by extension, individual projects. A healthy cultural data ecosystem is depicted in Figure 2, a visualization of our findings. As in any ecosystem, all elements are essential for the ongoing vitality of the system; each is discussed, in turn, in the sections that follow.
7.1 Structural elements
7.1.1 Funding environment
Funding, as an essential but uncertain resource, is the most significant structural element of the cultural data ecosystem. The cyclical nature of grant funding can bring projects to a premature end and cause lost progress when projects languish due to a lack of resources. For Taylor, a research fellow working on a large university-based Australian cultural data initiative, the funding status quo is marked by “bursts of investment […] There's kind of a finite end to these projects, and then you sort of have to start again.” David, a professor and writer with long-term involvement in building and sustaining the Circus Oz Living Archive, confirms Taylor's observation. The Circus Oz Living Archive is a portal showcasing and enabling research into the performance videos of Circus Oz, a prominent Australian circus company. David recounts initial investments into the planning and construction of the Living Archive, but then, as investment ceased, the Living Archive degraded (starting in 2014), until it ceased to function. Once the Archive became a cultural data emergency, urgent efforts by other researchers revived it; but it remains precarious in the absence of dedicated funding and institutional commitment. Today, David observes that “hosting and maintenance of the Living Archive itself is still very much…in doubt.”
Insufficient funding also causes instability for project staff, who often work on finite contracts. Experts in data science roles are difficult to retain as they can earn larger salaries in industry roles. Taylor observes “there can be some precarity around the people in those roles because they can get paid better [elsewhere] […] Who wants to be attached to, like, a complex, difficult, you know, challenging project on a kind of mediocre salary?” The funding cycle affects the ability of cultural data projects to build collections and systems over time. The grants funding this work are extremely competitive, and projects funded in the past are not guaranteed to be funded in the future.
Achievements that take years to build can be compromised quickly. Andy mentions the example of establishing relationships with communities historically affected by colonization; these relationships are key to knowledgeably and responsibly collecting certain cultural data, such as historical photos of Indigenous peoples whose names have not been documented. This important work is easily destabilized by government austerity measures.
We also have so much authority and credit in some spaces that if we were to say, ‘we do this [metadata modernization] now,’ a lot of folks would be, like, great. In fact, they're looking to us to do a lot of that. But […] you know, the next government could come in and be like, ‘you guys get twelve dollars a year.’
In other words, SSHRC's requirements were an imperative to develop crucial partnerships and material commitments.
You have to have a certain degree of matching funds with [partnering institutions] to put the money to digitize the collections on the table. […] Before year one even began, I felt like I had made a big win, you know, because it meant that all those institutions had committed.
7.1.2 Policy environment
Cultural data collections and research exist within policy contexts, with many forms of regulation and guidelines affecting this work. Around metadata, for example, Sharon cites the current importance of the FAIR principles for digital data management (FAIR Principles, 2016), and the CARE Principles for Indigenous Data Governance (GIDA, 2018). In Canada, the Calls to Action from the Truth and Reconciliation Commission have motivated improvements in how Indigenous cultural data collections are managed and described (Truth and Reconciliation Commission of Canada, 2015).
This EaaSI project enables libraries and archives to make emulation available to their users, and to share collected, preserved software with each other's users. This is an example of beneficial alignment between a project's goals—software preservation and access—and the current copyright environment.
I was glancing through the act [and] thought Section 113J looked pretty promising. But I wanted to make sure, and so I sought expert IP legal advice. […] And I was right. […] It's legal for libraries and archives, as defined in the act, to make a preservation copy of content that's in their collection, and to make that available to research purposes, in the library or archive, or in another library or archive. And that's the really exciting bit.
7.1.3 Extant data
Cultural data initiatives always hinge on the question of what data exist, affected by historical collecting decisions made by individual private collectors or staff in collecting institutions. Forms of cultural data of interest today are often unaligned with past mandates of collecting institutions, so projects must sometimes draw on varied strategies to build a collection. Melanie must continually address this challenge. In collecting 1980s and 1990s video games, Melanie's work relies on eBay vendors. She reports “finding them really is tricky. […] We have bought a lot of the games that we've targeted for acquisition. […] That's just a matter of, you know, watching and waiting and finding a good copy and spending the money.” Melanie rightly argues that acquiring software not collected by libraries or archives is an ongoing challenge. Cases frequently arise of contemporary software becoming unusable. As Melanie observes, “This is not just about the past. This is about the now, and contemporary memory.” Librarians, archivists, and others involved in collecting decisions have significant power to shape future cultural data collections, which can only be built from data artifacts that are retained.
7.1.4 Institutional or community context
All cultural data collections emerge from, and exist within, an institutional or community context. Context is fundamental to the understanding of collections as data in the first place. Context is also a core influence on the intentions and imagined purpose of cultural data collections, including which research questions are imagined for them, and how a community will benefit from supporting them. Institutional or community context also frames thinking around who may be the intended, potential, or current users of cultural data. To satisfy funder requirements, cultural data initiatives are often positioned as beneficial for the public. In fact, initiatives are more likely to have specific users, such as researchers, or a specific Indigenous community focused on linguistic and cultural reclamation.
Sharon, who also contributes to cultural data collection management in Inuvik, within the Inuvialuit Settlement Region of northern Canada, adds that research emerging from institutional contexts need not be at odds with a collection's community purposes. She observes “the more access people have to things […] everything from a local genealogy researcher to somebody who does text mining […] Just making things available and accessible for people, I think creates opportunities there.”
Everything we were doing, even on the development side of this project was a research question, right? […] From a literary studies perspective, research only begins with the content, you know. And it'll be like, ohhh, it's with the text, but really everything in the SpokenWeb Network sort of became a research question about the management of data, like so in a lot of ways this whole project is essentially looking in different ways at how we work with, how we create, how we imagine, how we structure, and how we use data.
7.1.5 Technological infrastructure
Sean's work illustrates the considerations required before asking questions of infrastructure.
A lot of my role is helping [community groups] figure out rights, ethics, access to those materials, preservation, metadata, description, and then often we're trying to see if we can help them with infrastructure. So, you know, can you? Put your materials in our repositories? […] And sometimes that works, sometimes it doesn't, and it…may lead to, you know, we've consulted, given advice, connected.
The task for Jessie's team is to “move the infrastructures” into the future. Andy, another leader in a national archival institution, agrees, while pointing out that although technological infrastructure is always important, “it's not actually the hard part. It's figuring out all the thinking around it, getting a coordinated vision of what folks want. It's looking at, you know, what are the drivers?” Technology must follow the drivers, as Andy puts it, to avoid outcomes whose only benefit is being “shiny.”
You have to kind of do workarounds and bolt this bit on and bolt that bit on […] we have this sort of organic growth of systems such that we now have multiple systems and […] everything has to be squished together and remodelled and remodelled and remodelled […] We have some, you know, quite old relational databases, fundamentally […] But relational databases is not where it's at when it comes to your data-driven website.
7.2 Spheres of expertise
Cultural data collecting and research requires people with widely divergent forms of expertise. The interview data provide rich understandings of key distinctions among the spheres of expertise within the ecosystem, which can inform positive team formation and overcome barriers to collaboration. People with common expertise often relate to one another more than they may relate across spheres. As Jessie observes, when people collaborate across spheres it can feel “like being able to talk more than one language.”
7.2.1 Creators
David has an abiding, multifaceted interest in the original circumstances of the creation of the Circus Oz cultural data (i.e., performance videos), the evolution of the company over time, and the data's storytelling potential. He illustrates that a creator can make a rich contribution to a cultural data collection, without having created the dataset.
As a performing arts practitioner myself and as a video practitioner myself […] I was interested in memory studies, and I was interested in notions of storytelling and how this collection of video recordings could be seen to be telling the story of this company and this large ensemble of people who had made up the company across many years because it's a very particular form of cultural production, which began as a collective and then morphed over the years, but in a very organic way. […] So it was really as a creative artist, I suppose, that I was involved in this project, and interested in trying to mesh the desires of the company with thinking about the interaction design issues as well.
7.2.2 Curators
The curators in this study are true to form with their commitment to data access, organization, description, and preservation. Andy, working in a national archival institution, voices a common sentiment for curators: “Access is the key to understanding.” Curators know their work is antecedent to the pursuit of research or other engagement with cultural data. Unlike other spheres of expertise, curators are usually working on multiple collections, with responsibilities that traverse any single collection. This creates challenges for curators who wish to develop closer involvement with collections of interest. Curators' roles are often conceived of as functional, with disciplinary knowledge deprioritized.
Curators are considering how their roles will evolve over time. Jessie, reflecting on her work as a metadata expert in a national archival organization, asks: “How can we maintain some of those traditional [archival] principles, but actually put those principles into practice in a different context, because it's a very different context now?” Similarly, Sean observes a shift toward a more proactive role: “It's always a question of who should take the first step, in a way, right, and maybe that's not a conventional orientation for librarians and archivists. You know, where there's a mindset of, you know, we'll be here when you need us.” Curators think deeply about the supports they provide and the systems they build, and they are committed to access as a principle. At the same time, because curators are situated within institutions, such as libraries or archives, their roles, including with cultural data, are always intertwined with that institutional context.
7.2.3 Subject matter experts
Subject matter experts can sustain cultural data collections over long periods of time. At the same time, for some humanities and social sciences scholars, there can be a perceived legitimacy benefit to working with cultural data, simply because it carries the language and potential of data. Taylor, the Australian research fellow, recalls a period when:
There were all sorts of data awakenings […] I became increasingly interested in sort of thinking about telling literary history through this other medium, sound, which introduced me to all kinds of thinking about, from a practical perspective, but also a theoretical perspective about what a sound recording was as an entity, you know, as a research object.
Taylor's observation is that this period of collecting data for its own sake has largely passed, at least in Australia, but the sentiment is important to consider.
Everyone felt the need to, either literally or indirectly, say that they are doing something that involves a quantitative method. Because that could be tethered to a sense of, you know, scientific rigor or impact. And so, the idea that you're collecting all of this data seemed in and of itself, a useful thing to do. But what we know is that it's only as good as what people actually use it for and do with it.
7.2.4 Information science, computation, and technology experts
What is most telling in the interview data is the relative absence of discussion by cultural data experts (i.e., creators, curators, and subject matter experts) on the roles of information science experts and computation and technology experts. While the participants in this study do have some background and expertise in these areas, their focus in the discussions was primarily technical and functionalist with respect to these domain areas. None of the participants discussed the influence of research on their practice within these areas of expertise. This indicates a significant gap contributing to the lack of a healthy ecosystem for cultural data, as research in these domains is critical for long-term viability through enhanced knowledge organization practices and evidence-based understandings of people's information needs.
While the interviews demonstrated a level of understanding and sophistication around knowledge organization expertise (e.g., metadata required for collection descriptions), information behavior expertise was not discussed at all. While the participants regularly refer to the concept of “users,” the interviewees primarily demonstrate a surface understanding of the implications of collection work on end-users' experiences, needs, and understandings of their communities. This significant gap requires additional research, as well as integration of expertise from this subfield of information science, across the cultural data ecosystem.
7.3 Signs of health
We have demonstrated how cultural data, considered holistically, benefits from an ecosystem perspective. All cultural data initiatives are shaped by multiple structural elements that are much larger than individual projects, such as the availability of funding, which influences the scale and pace at which the work can proceed, as well as the past collecting practices of individuals and institutions, which determine what documents and artifacts have endured until now and are available to be contemplated as data. We have also described the five predominant spheres of expertise that are required for cultural data projects to proceed successfully.
In the preceding sections, we referenced challenges articulated by participants. These challenges often represent persistent quandaries faced by many people involved in cultural data work. They can be thought of as signs of “unhealthiness” within the current cultural data ecosystem. The purpose of this section is to articulate five predominant qualities that signify health within a cultural data project and, if consistently present across projects, signify health within the broader cultural data ecosystem. A healthy cultural data ecosystem (Figure 2) is one that contains these qualities, favorable structural elements, and participation representative of the predominant spheres of expertise.
7.3.1 Clarity of purpose
In a healthy cultural data ecosystem, a project proceeds with a shared sense of internal alignment, and its aims are clear. Clarity of purpose does not mean an absence of disagreement; contributors may bring divergent goals or research questions to a project based on their own interests and specializations. However, a healthy project radiates a sense of shared aspirations. Contributors can articulate the purpose of the project. This, over time, enables them to identify the insights and impacts that are being created. Contributors can document that intended research, community benefits, or other outcomes are in fact being enabled through the project.
For example, in discussing cultural data initiatives within Indigenous communities, Sharon observes a need for people working in this space to create “community-driven, sort of social contact-driven spaces in a lot of ways […] really thinking more critically about this space and trying to open to a bit more kind of a fluid, dynamic space.” Similarly, Taylor describes the thinking behind a large Australian cultural data initiative: “We thought that it would be more effective with the resourcing that we had, to deeply consider what kind of effort, skills, teams, and so on, intelligences, are required to take cultural data that's sitting in robust but actually fragile collections and see what needs to be done to it to make it immediately useful for some kind of analysis.” These examples of intentionality illustrate how a healthy cultural data project embodies the clarity brought by a reflective orientation to the work.
Clarity of purpose also creates space for unexpected benefits, or positive outcomes that cannot be predicted precisely. This is evident in how Jason, the director and chair of the SpokenWeb Partnership Network, describes this initiative's broad benefits. Having grown out of a smaller project based on the study of a single set of reel-to-reel tapes, scaling SpokenWeb upward involved circulating a call across Canada to locate other literary recording collections. The clear need addressed by the initiative led to the identification of collections that had not been shared before, and that diversify our understanding of people's involvements with literature in Canada in the 20th century. Jason observes: “Some of the most exciting parts of the discovery process of new collections has been like, of queer communities that were recording their gatherings, you know; of Black communities in Montreal, for example, like where we found interesting collections; and just all kinds of communities that, like, were completely invisible, from even the many diverse collections that we have in our university archives. You know, they just weren't there at all.” SpokenWeb's credibility and clarity of purpose have made it appealing to new partners, including those who can contribute rare and little-known collections to the Partnership Network.
7.3.2 User-focused design
In a healthy cultural data ecosystem, consideration is given to users from the outset, whether they are future researchers, members of specific communities, or the public at large. A user focus is embedded into both the design and the implementation of cultural data projects; users are not an afterthought. There is an ongoing commitment to the discoverability and accessibility of a healthy cultural data collection.
There is great potential in the cultural data sphere for user-focused design to be more prominent and normalized, and for information behavior research to inform cultural data practices. Taylor points out that user considerations often come up toward the end of a cultural data project, “when it's like ‘what do we do with this collection?’ […] And I think there's been a big shift in the last five years with, at least to get funding, a lot of people would say, well, ‘I'm building a database.’” However, Taylor also points to shifting attitudes, at least in Australia, in recognition of the need for purposeful approaches that integrate consideration of users; simply “building a database” is “not as in vogue anymore for a whole range of reasons, the main one being, like, who cares? Everyone's got data. So, you're gonna build a database that's not actually in and of itself an impactful output, unless you can demonstrate precisely who might be using it, what they're using it for, whether it's interoperable with other data sources, and so on.” Understanding the anticipated user, or the community, for a project is not an afterthought, but a foundational consideration.
A user-focused approach also includes awareness-raising and outreach, and the need to resource these important activities sufficiently. As Sharon puts it, “How do you let people know this thing [a collection] is here for them to find?” She points out how, around cultural data collections, people often imagine “the one stop shop. You know, one portal to rule them all. But people don't know the portal.” Healthy cultural data initiatives incorporate ongoing outreach as part of iterative, user-focused design, and they ensure that this design is supported by research evidence from information behavior studies.
7.3.3 Sustainability
Sustainability refers to the provision of appropriate investment over time, including but not limited to regularized funding. Adequate and secure long-term funding enables staffing to be less precarious, strengthening teams and ensuring project continuity and integrity. Programmatic investment enables collaborators to develop a vision for long-term maintenance and the less “shiny” work of ensuring that cultural data collections endure and remain usable over time. Sharon voices a common sentiment among participants that “right when the money happens, you get it, create a thing, and then it sits and then it just never goes anywhere or continues to thrive. […] There's so much support and infrastructure and resources for building the new thing.” By contrast, it can be difficult to secure resources to support the essential work “that's not the jazzy, interesting new stuff.” These less “jazzy” elements of cultural data work, such as infrastructure maintenance, are essential to a project's sustainability, but they must be provisioned adequately with resources and leadership.
It was not until several years later when the Living Archive's decline was noted by some of the original project team that efforts began to revitalize it.
There wasn't funding for the ongoing maintenance of the platform, and Circus Oz themselves had some changes in personnel and some changes in strategic direction, I suppose, and so they became both less interested and less able to maintain the Living Archive themselves, and it gradually, over the ensuing years from 2014, it fell from daily use and eventually got to the point where it was no longer accessible.
7.3.4 Allied coproduction
In a healthy cultural data ecosystem, there is a sense of open collaboration and mutual understanding among experts. Connections across spheres of expertise are particularly important, where ongoing respectful engagement is necessary as very different experts work together toward shared aims. Cultural data work is interdisciplinary and must be appreciated as an inherently shared enterprise.
Participants consistently emphasize the importance of not only working together well, but also maintaining awareness of the divergent expertise necessary for a cultural data project to succeed. Jessie observes that “patience and persistence” are needed, and that, at times, “it's like being able to talk more than one language, like I constantly find myself having to switch language.” Taylor echoes the importance of “having the right knowledge within a team.” Similarly, Daniel, a data specialist at a large Melbourne university, describes international collaborations with data-driven musicologists. He observes that over time, “people find a way to work together and find a common ground. […] The community develops a shared vocabulary which informs the technical terms used to describe things, but it also informs the way that people are able to talk to one another, sort of across disciplinary boundaries.”
Cultural data project teams can be highly diverse in terms of skills and expertise. To that end, participants observe that their role on a team is always partially pedagogical. Everyone must devote intentional effort to communicating the complexities of a project from their point of view. This includes complexities around technical requirements and possibilities, metadata schema, formats, and matters of ethics, ownership, and copyright; it also includes the grounding knowledge domain that informs a project's overarching structure, purpose, and research questions.
Jason, describing the leadership model for the SpokenWeb Partnership Network, points out that the Network's governance committee meets every 2 weeks, and has done so for 6 years running: “The people involved are just, you know, great. And we really like each other. […] And that's a huge, a huge important part of it, it sounds maybe glib or something, but it really has been such an important element of the success of the research network, that we care about each other. And we are interested in hearing each other's concerns and adapting to them.” A well-allied team, bringing together the right knowledge, is essential to the health of a cultural data project.
7.3.5 Reciprocal interconnectedness
Each cultural data project is unique and local, to some extent. At the same time, the health of the larger ecosystem, and the success of individual projects, benefit greatly when collaborators maintain some awareness of other projects, innovations, and communities of practice around the world. Reciprocal interconnectedness includes technical and metadata interoperability, which are key to the growth, flexibility, and sustainability of cultural data efforts. However, it also includes a commitment to, and practice of, staying acquainted with other relevant work. In a healthy ecosystem, there is a balance between minimizing redundancy (i.e., not “reinventing the wheel”) and identifying local needs and unique features.
Andy, drawing on experience in a leadership role in a national archival institution, emphasizes the potential benefits of reciprocal interconnectedness, as well as the future hazards of not attending to it. Andy argues that a key opportunity for those working in cultural data is to “really leverage technology for access in new and exciting ways. […] There's a lot coming that is being built at universities and other non-profit things that we'll be able to leverage. […] The next 10 years is going to be really big for that, because holy shit, the potential disaster of everybody going off and creating all that stuff and then later, how are we going to tie it all back together?” Andy also emphasizes that interconnectedness should be both international and intergovernmental: “Not only should Canada and Australia be talking about things, but, like, Higher Ed and Cultural Heritage [federal government departments], often because of where they live in funding models. [They] don't talk to each other, but there's so much overlap there. […] I worry about that a lot and currently in [my country] I have many worries about [multiple peak bodies] […] I'm just like, ah, everybody stop and do it together.” More fragmentary, competitive approaches, and their attendant instability, undermine the benefits of more robust, normalized interconnectedness.
Calls for interconnectedness on the largest scale, such as among those who argue for unified national repositories to ingest many data types, must be questioned so that loss of local specificity is clearly understood and can be factored into decision-making. As with any part of an ecosystem, too much interconnectedness may weaken the overall system and compromise other priorities.
A lot of the work that we're doing, we're asking fairly unique questions of the data that often require a very bespoke data model. One of the issues you can have with things like Dublin Core is that […] you can easily lose information as you sort of transform things. You can easily end up with a situation where anyone who had anything to do with, like, the creation of a song ends up being a creator, which is totally useless if you're interested in looking at like the poetry, the authorship, and other things.
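The information loss this participant describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record, the contributor names, the role labels, and the function names are all hypothetical, and the crosswalk shown (mapping every contributor to a single `dc:creator` element) is one possible flattening, not the mapping used by any project in this study.

```python
from collections import defaultdict

# Hypothetical bespoke record: each contributor keeps a specific role.
song = {
    "title": "Example Song",
    "contributors": [
        {"name": "A. Poet", "role": "lyricist"},
        {"name": "B. Composer", "role": "composer"},
        {"name": "C. Singer", "role": "performer"},
    ],
}


def to_simple_dublin_core(record):
    """Flatten a role-aware record into simple Dublin Core style elements.

    Assumption: the crosswalk maps every contributor to dc:creator,
    which discards the role attached to each name.
    """
    return {
        "dc:title": [record["title"]],
        "dc:creator": [c["name"] for c in record["contributors"]],
    }


def names_by_role(record):
    """Group contributor names by role using the bespoke model."""
    grouped = defaultdict(list)
    for contributor in record["contributors"]:
        grouped[contributor["role"]].append(contributor["name"])
    return dict(grouped)


flattened = to_simple_dublin_core(song)
roles = names_by_role(song)

# The bespoke model can still answer "who wrote the words?"
print(roles.get("lyricist"))
# The flattened record lists everyone as a creator, with no roles left.
print(flattened["dc:creator"])
```

In the flattened record, a question about authorship of the lyrics can no longer be answered from the metadata alone, which is the kind of loss that motivates a bespoke data model.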
Although many cultural data collections remain at risk, and initiatives are often precarious for reasons documented in this paper, there is great potential for these collections to have diverse positive impacts on society. Taylor sees a significant opportunity in getting “robust, clean, useful quantitative cultural data in the hands of decision-makers and government. […] It actually has very specific things to say that speak to a range of industries and contexts.” Andy cites the potential of cultural data to contribute to repairing historical injustices, in an inclusive, participatory way: “We have this big initiative where we have all these photos of [Indigenous] folks and we're reaching out to communities to be like, do you know any of these people? Because we want to name these people in our collection.” Jason highlights SpokenWeb's success at increasing the visibility of voices not previously known to Canadian literature: “It's really diversifying our understanding of what literary activity and ‘the literary’ meant in Canada, really from the 50s to now, and bringing in a lot of different personal voices.” With this study, we have gathered the perspectives of experts from across five spheres of expertise, and through their accounts, we have mapped the elements of a healthy cultural data ecosystem for the first time. With more stable investment, and more diverse, expert teams, there is great potential for many more cultural data collections to have widespread positive impacts like those documented here.
It is also important to note that, as one branch of the broader information ecosystem, the cultural data ecosystem holds many potential lessons for other branches. This study was limited to one branch and the experiences of experts working within only three countries; as an exploratory study, it makes a significant contribution to mapping the current landscape for cultural data work in Australia, the UK, and Canada, and laying the groundwork for future research. The long-term sustainability of collections, for example, is not limited to cultural data; government records, genealogical materials, scientific research data, and other information sources may face similar challenges, threats, and opportunities to those identified in this study of cultural data initiatives. Similarly, ensuring that curatorial initiatives draw on expertise from all types of individuals working within the area is critical to ensuring the long-term health of the specific ecosystem branch, as well as that of the overarching information ecosystem. Additional research is needed across various branches, to understand the specific needs and complexities that must be addressed.
As the information ecosystem is global in nature, it is also important to examine these issues across nations, systems, and cultures, to ensure that the ecosystem can work for the betterment of society. While this research examined cultural data experiences in three countries, building on these results with new perspectives, particularly from Indigenous peoples and non-Western countries, will further extend our understandings of the importance of cultural data initiatives, globally, and their impact on the broader information ecosystem. By ensuring that the information ecosystem, as a whole, includes perspectives and artifacts representing all people, we can ensure that society will benefit from its collective wisdom long into the future.
The authors acknowledge the support of Australian Research Council Linkage Infrastructure, Equipment and Facilities (LIEF) Program Grant LE210100021, entitled ACD-Engine: Enriching cultural data for research, industry and government. Open access publishing facilitated by RMIT University, as part of the Wiley - RMIT University agreement via the Council of Australian University Librarians.
- Academy of the Social Sciences in Australia. (2022). Australia's data-enabled research future: The social sciences. Author. Retrieved from https://socialsciences.org.au/data-enabled-reseach-future/
- 2022). From bricks to clicks: How digital heritage initiatives create a new ecosystem for cultural heritage and collective remembering. Journal of Communication Inquiry, 46(2), 185–205. https://doi.org/10.1177/01968599211041112
- 2020). The push and pull of digital humanities: Topic modeling the “what is digital humanities?” genre. Digital Humanities Quarterly, 14, 1.
- 2020). The afterlife of performance. English Studies in Canada, 46(2–4), 19–46. https://doi.org/10.1353/esc.2020.a903545
- 2015). Collaboration and infrastructure. In S. Schreibman, R. Siemens, & J. Unsworth (Eds.), A new companion to digital humanities (pp. 54–65). John Wiley & Sons. https://doi.org/10.1002/9781118680605.ch4
- 2016). Curating the language of letters: Historical linguistic methods in the museum. In M. Hayler & G. Griffin (Eds.), Research methods for creating and curating data in the digital humanities (pp. 44–61). Edinburgh University Press. https://doi.org/10.1515/9781474409667-004
- FAIR Principles. (2016). GO FAIR. Retrieved from https://go-fair.org/fair-principles
- 2008). Induction. In L. M. Given (Ed.), The SAGE encyclopedia of qualitative research methods (pp. 429–430). SAGE. https://doi.org/10.4135/9781412963909
- GIDA. (2018). CARE Principles for Indigenous data governance. Retrieved from https://www.gida-global.org/care
- 2023). Looking for information: Examining research on how people engage with information. Emerald.
- K. Golub, & Y.-H. Liu (Eds.). (2022). Information and knowledge organisation in digital humanities: Global perspectives. Taylor & Francis. https://doi.org/10.4324/9781003131816
- 2022). Digital collections audit: Commissioned report. UKRI, Arts and Humanities Research Council.
- 2022). Archives, linked data and the digital humanities: Increasing access to digitised and born-digital archives via the semantic web. Archival Science, 22(3), 319–344. https://doi.org/10.1007/s10502-021-09381-0
- 2016). A not-so-brief history of research software engineers [Github]. Software Sustainability Institute. Retrieved from https://www.software.ac.uk/blog/2016-08-17-not-so-brief-history-research-software-engineers-0
- 2022, December 23). Trove's funding runs out in July 2023—and the National Library is threatening to pull the plug. It's time for a radical overhaul. The Conversation. Retrieved from https://theconversation.com/troves-funding-runs-out-in-july-2023-and-the-national-library-is-threatening-to-pull-the-plug-its-time-for-a-radical-overhaul-197025
- 2022). Digital humanities—A discipline in its own right? An analysis of the role and position of digital humanities in the academic landscape. Journal of the Association for Information Science and Technology, 73(2), 148–171. https://doi.org/10.1002/asi.24533
- 2020). The challenges and prospects of the intersection of humanities and data science: A White Paper from The Alan Turing Institute. Retrieved from https://figshare.com/articles/online_resource/The_challenges_and_prospects_of_the_intersection_of_humanities_and_data_science_A_White_Paper_from_The_Alan_Turing_Institute/12732164
- 2019). Subverting the universality of metadata standards: The TK labels as a tool to promote Indigenous data sovereignty. Journal of Documentation, 75(4), 731–749. https://doi.org/10.1108/JD-08-2018-0124
- 1999). Information ecologies: Using technology with heart. MIT Press.
- 2008). Purposive sampling. In L. M. Given (Ed.), The SAGE encyclopedia of qualitative research methods (pp. 698–699). SAGE. https://doi.org/10.4135/9781412963909
- 2000). Writing: A method of inquiry. In N. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (pp. 923–948). SAGE.
- 2021). Digital humanities. In N. B. Thylstrup, D. Agostinho, A. Ring, C. D'Ignazio, & K. Veel (Eds.), Uncertain archives: Critical keywords for big data. MIT Press.
- 2009). Digital humanities manifesto 2.0. Retrieved from http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf
- 2008). Exploratory research. In L. M. Given (Ed.), The SAGE encyclopedia of qualitative research methods (pp. 327–330). SAGE. https://doi.org/10.4135/9781412963909
- 2015). Opening access to collections: The making and using of open digitised cultural content. Online Information Review, 39(5), 733–752. https://doi.org/10.1108/OIR-06-2015-0193
- 2021). The value of mass-digitised cultural heritage content in creative contexts. Big Data & Society, 8(1), 205395172110061. https://doi.org/10.1177/20539517211006165
- 2020). Humanities, arts and social sciences research data commons: Final report. Australian Research Data Commons. https://humanities.org.au/wp-content/uploads/2022/06/Australias-Data-Enabled-Research-Future-Humanities.pdf
- Trove Strategy. (n.d.). National Library of Australia. Retrieved from https://www.nla.gov.au/about-us/corporate-documents/corporate-strategies/trove-strategy
- Truth and Reconciliation Commission of Canada. (2015). Calls to action. Retrieved from https://ehprnh2mwo3.exactdn.com/wp-content/uploads/2021/01/Calls_to_Action_English2.pdf
- 2011). The problem of the yellow milkmaid: A business model perspective on open metadata (Europeana White Paper No. 2). Retrieved from https://pro.europeana.eu/files/Europeana_Professional/Publications/Whitepaper_2-The_Yellow_Milkmaid.pdf
- 1996). From museum to morgue? Electronic guides in Roman Bath. Tourism Management, 17(4), 241–245. https://doi.org/10.1016/0261-5177(96)00015-5
- 2022). Culture is digital and the shifting terrain of UK cultural policy. International Journal of Cultural Policy, 28(7), 799–812. https://doi.org/10.1080/10286632.2022.2137149
- 2021). Introduction to digital humanities: Enhancing scholarship with the use of technology. Taylor & Francis.
- 2022). Empowering linked data in cultural heritage institutions: A knowledge management perspective. Data and Information Management, 6(3), 100013. https://doi.org/10.1016/j.dim.2022.100013