Changing Gears: Fast-Lane Design for Accelerated Innovation in Memory Organisations

Paper
Johan Oomen, Netherlands Institute for Sound and Vision, The Netherlands, Maarten Brinkerink, Netherlands Institute for Sound and Vision, Netherlands, Bouke Huurnink, Netherlands Institute for Sound and Vision, Netherlands, Josefien Schuurman, Netherlands Institute for Sound and Vision, Netherlands

Published paper: Changing gears: Fast-lane design for accelerated innovation in memory organisations

Audiovisual archives are embracing the opportunities offered by digitisation for managing their work processes and offering new services to a wide array of user groups. Organisation strategy, working processes, and software development need to be able to support a culture where innovation can flourish. Some institutions are beginning to adopt the concept of “two-speed IT.” The core strategy aims to accommodate two tracks simultaneously: foundational but “slow,” and innovative but flexible and “fast.” This paper outlines the rationale behind the two-speed IT strategy. It highlights a specific implementation at the Netherlands Institute for Sound and Vision, a large audiovisual archive and museum. Two-speed IT is enabling Sound and Vision to reach its business objectives.

Bibliography:
Changing Gears: Fast-Lane Design for Accelerated Innovation in Memory Organisations

Johan Oomen – Netherlands Institute for Sound and Vision
Maarten Brinkerink - Netherlands Institute for Sound and Vision
Lora Aroyo – VU University Amsterdam
Bouke Huurnink - Netherlands Institute for Sound and Vision
Roeland Ordelman - Netherlands Institute for Sound and Vision
Josefien Schuurman - Netherlands Institute for Sound and Vision
Summary

Audiovisual archives are embracing the opportunities offered by digitisation for managing their work processes and offering new services to a wide array of user groups. Organisation strategy, working processes and software development need to be able to support a culture where innovation can flourish. Some institutions are beginning to adopt the concept of “two-speed IT”. The core strategy aims to accommodate two tracks simultaneously: foundational but “slow”, and innovative but flexible and “fast”. This paper highlights the rationale behind the two-speed IT strategy, its implementation and the main lessons learned that will help other museums and memory organisations to think about creating and maintaining the technical prerequisites to flourish in an online networked environment.

1. Introduction

Museums benefit from fostering a ‘culture of innovation’ as a way to effectively manage the ever-changing expectations of user groups and, at the same time, make the most of new opportunities offered by technology. The fundamental challenge is how to achieve their public missions, i.e. supporting a myriad of users in utilising heritage collections so that they can actively learn, experience and create. As Douglas Rushkoff notes, “It’s not about how digital technology changes us, but how we change ourselves and one another now that we live so digitally” (Rushkoff 2013). For this, it is essential for museums to have access to technical infrastructure that allows them not only to manage digital assets but also to “pursue contemporary objectives” (Johnson 2015). Examples include using new channels for content distribution (e.g. YouTube, Instagram) to engage with new user groups, using technologies (e.g. linked open data, NLP) to enrich and optimise work processes, and allowing for creative ways to access collections. In this paper, we describe how innovation can be fostered by introducing the concept of ‘two-speed IT’ and the organisational structure needed to realise it in heritage organisations.

This paper is structured as follows. We first zoom in on some of the most urgent challenges audiovisual archives face today. We then highlight the necessity for audiovisual archives to invest in their innovative capital to ensure long-term impact. Next, we introduce the concept of two-speed IT as a way to ensure uptake of innovative IT solutions in a production environment, and describe how it is put into practice at Sound and Vision.

2. Audiovisual archives and the future

Audiovisual materials will be as much a part of the future fabric of information as text-based materials are today. As creation continues to expand, archives will be storing and managing increasingly large collections of assets. Archives operate within a dynamic and multifaceted context. They will grow to become nodes in a network of communities, along with other content providers and a variety of stakeholders from industries ranging from education and research to the creative industries (publishing, broadcasting, game industry), tourism, journalism and so on. Recent studies indicate that by ~2025 analogue carriers will need to have been digitised. After that date it will be impossible to transfer the carriers, either due to technical obsolescence of the playback devices or due to the state of the physical carriers.

For many archives, managing born-digital material is already the norm, with analogue collections only growing through donations or acquisitions. So, the future of audiovisual archives is digital. Multiple formats will need to be supported, from the highest industry standards to emerging open video formats and wrappers. Content, in various formats, will continue to be managed through specialised asset management systems. Metadata will be fine-grained, allowing access at shot or scene level. Standards will be adopted to allow interchange between collections (RDF, SKOS, PID, schema.org, etc.) and to maintain a record of provenance or metadata records as content is distributed online. Navigation across the combination of semantic data and a diverse range of media types is essential. In terms of the value chain of media consumption and production, the position of archives and the roles of archive staff will evolve. Already today, we see the transformation of the traditional role of archivists and cataloguers. The future archivist plays a role as media manager, managing assets from their inception all the way through to distribution and long-term storage.

In summary, we envision the future audiovisual archive to be smart, connected and open. Smart: using smart technologies to optimise workflows for annotation and content distribution, and collaborating with third parties to co-design and co-develop new technologies in order to act as frontrunners rather than followers (Brynjolfsson 2014). Connected: linked to other sources of information (other collections, contextual sources) and to a variety of often niche user communities, researchers and the creative industries, and embracing the use of standards defined by external bodies rather than by the cultural heritage communities themselves. Open: fully embracing ‘open’ as the default in order to have maximum impact in society, by applying open licences for content delivery, using open source software and open standards wherever possible, promoting open access to publications, and so on.

Asset management systems will need to be able to manage various streams of metadata: (1) metadata exported from production systems, (2) expert annotations, (3) machine-generated metadata, (4) crowdsourced annotations and other sources, and (5) knowledge extracted from secondary sources related to content.
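
As an illustration only, the five metadata streams could be modelled so that every value keeps a record of the stream it came from. The sketch below is a minimal Python example under our own assumptions: the class names, field names and sample values are hypothetical and do not describe any actual asset management system.

```python
from dataclasses import dataclass
from enum import Enum


class Stream(Enum):
    """The five metadata streams named in the text."""
    PRODUCTION = "production system export"
    EXPERT = "expert annotation"
    MACHINE = "machine-generated"
    CROWD = "crowdsourced annotation"
    SECONDARY = "secondary source"


@dataclass(frozen=True)
class Annotation:
    item_id: str    # hypothetical catalogue identifier
    field: str      # e.g. "person", "place", "title"
    value: str
    stream: Stream  # provenance: which stream supplied the value


def group_by_item(annotations):
    """Collect annotations per item, preserving their original order."""
    grouped = {}
    for annotation in annotations:
        grouped.setdefault(annotation.item_id, []).append(annotation)
    return grouped


def values_for(annotations, field):
    """Return (value, stream) pairs for one field so provenance stays visible."""
    return [(a.value, a.stream) for a in annotations if a.field == field]


# Hypothetical sample data, for illustration only.
annotations = [
    Annotation("item-1", "person", "Speaker A", Stream.MACHINE),
    Annotation("item-1", "person", "Speaker A", Stream.EXPERT),
    Annotation("item-1", "title", "Evening news", Stream.PRODUCTION),
    Annotation("item-2", "place", "Hilversum", Stream.CROWD),
]
by_item = group_by_item(annotations)
```

Keeping the stream alongside each value allows a later step to decide, for instance, whether a machine-generated label has been confirmed by an expert.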

With respect to ensuring long-term storage, archives need to make fundamental choices between storing content on servers they own, using cloud storage or opting for mixed models. Other choices relate to the type of storage media (tape, optical, solid state, hard drives) and the adoption of standardised working processes to ensure digital durability.

The dynamics between the creative industries (producers, broadcasters, distributors) and archives will change. Archive staff and ‘creatives’ will be working more closely together than ever before. This results in ample opportunities, for instance playing a more proactive role in the production process and suggesting topics for new programmes based on gems from the archive. This relates to the future role of archives as curators of vast bodies of content. Filters need to be applied to provide meaningful access to vast collections. These filters can be created by machines (recommender systems), by experts or by a smart combination of both.

2. Fostering a ‘culture of innovation’

The legal foundation of archives differs from organisation to organisation. Some organisations are established by law as separate entities (legal deposits), others are part of larger organisations like museums, libraries, universities or broadcasters. In many cases, audiovisual collections are maintained by public bodies and in effect serve public missions, but not exclusively. Commercial footage libraries and other commercial entities (e.g. search engines and video platforms) are also looking after growing bodies of audiovisual heritage, albeit with other primary motivations than providing access to gain knowledge or to support creative processes. Private archives are a force of growing importance, notably those created by the billions of people carrying smartphones that allow for high-quality multimedia recording. Personal archiving is starting to be addressed, but is still a huge area of research. Established archives are investigating to what extent they can help ensure long-term access to these collections. Many commercial players are active in this domain, from social networks to cloud storage providers. Given this context, it is key for ‘traditional’ archives to educate their constituents about the value they bring to society through securing the sharing of knowledge, a prerequisite for democracies to function. But also, perhaps more down to earth, to educate and entertain communities and individuals and to facilitate the exchange of ideas between various stakeholders.

Audiovisual archives are in a challenging position: operating as custodians of in-copyright works whilst also managing the public’s expectations in providing online access. Copyright rules need to be modified in order to allow memory organisations to provide access to their collections. A balance needs to be found between giving creators remuneration for the use of their works and allowing the guardians of their works to provide access for various user groups. As a fundamental rule, content added to the public domain should stay in the public domain. Memory organisations should adopt an ‘open by default’ access policy, so as to lead by example. Regulations should also be in place to make it possible to provide access to commercially unviable (i.e. out-of-commerce) content. In addition, modernisation of copyright regulations should look at collective licensing and at other ways to decrease the burden of obtaining copyright permissions. With respect to newly created material, creators should be encouraged to use Creative Commons licences to foster a culture of innovation and creativity. For works commissioned by public institutions, the use of open licences could be made compulsory.

Impact needs to be measurable and measured wherever and whenever possible, not only because archives are asked to be accountable for how resources are spent, but also to build solid business cases that will enable future investments, be it in services or supporting infrastructures. Following the Balanced Value Impact Model, we can distinguish between Internal, Innovation, Economic and Social Impact. Impact metrics also need to take into account new types of use. Already, material from archives is shared using open licences, e.g. on platforms such as Wikipedia. Use on these third-party platforms needs to be monitored where possible or, alternatively, qualitative evidence needs to be gathered. Audio and video fingerprinting can be used to track content usage over various platforms.

3. Introducing two-speed-IT

As a result of digitisation, archives and their users are sharing the same information space. To fulfil their potential, archives will ensure that their collections are available where users reside. A practical implication of this truism is that the institutionally maintained access point, such as a searchable catalogue, should not be the only access point to the collections. On the web, content likes to travel and archives will embrace this fact, for instance by providing API access to the catalogue and by adopting machine-readable copyright labels to facilitate access. These preconditions make it possible for third parties to ‘build upon’ online accessible collections, for example publishers that integrate resources in learning environments. Following this ‘liberalisation’, a new ecosystem emerges, in which archives can focus their efforts on “super serving” niche communities such as filmmakers, media scholars and amateur historians.

Archives benefit from fostering a ‘culture of innovation’ as a way to effectively manage the ever-changing expectations of user groups and, at the same time, make the most of new opportunities offered by technology (Mckeown 2012). For this, it is essential for archives to have access to technical infrastructure that allows them not only to manage digital assets but also to pursue contemporary objectives in line with user expectations. Examples include using new channels for content distribution (e.g. YouTube, Instagram) to engage with new user groups, using technologies (e.g. linked open data, natural language processing) to enrich and optimise work processes, and allowing for creative ways to access collections. A ‘culture of innovation’ will also open up possibilities to increase the level of cooperation with academia, in areas ranging from digital humanities to computer science.

Managing digital assets and embracing innovation differ considerably along several dimensions: in terms of standards used, partnerships, managing investments over time, accountability, staff expertise and so on. In order to bridge these two ecosystems, the two-speed IT concept can be adopted (Bossert 2015). This strategy accommodates two tracks simultaneously: a ‘slow’ foundational and a ‘fast’ innovative ecosystem. Below, we introduce the concept as it is used in the heritage domain.

Figure 1. The Two-speed IT ecosystems

The ‘fast’ ecosystem features mostly tailor-made solutions that cater for very specific user requirements and are used to experiment with new technologies. The applications do not have very stringent uptime requirements and can be maintained by the developers themselves. In the ‘slow’ ecosystem, standardised and off-the-shelf solutions are used to secure 24/7 service. The solutions are updated regularly following service level agreements with suppliers. Given the impact, the frequency of updating is not high and is measured in months rather than weeks.

Both ecosystems have their own specific infrastructure, applications, development and staging environments, and suppliers. As highlighted in Figure 1, they partly overlap, for instance when the ecosystems make use of similar underlying streams of data. In practice, the ‘conversion’ from Slow to Fast is a process, driven by business requirements. The key is to optimise systems and processes.

4. Two-speed-IT in practice

Our illustrative use case is the Netherlands Institute for Sound and Vision - a leading audiovisual archive with a growing, fully digitised collection of 1.9 million objects (including film, television and radio broadcasts, music recordings and web videos) and a museum that attracts ~250,000 visitors annually. Born-digital assets are ingested into a state-of-the-art digital repository and are accessible online and in the museum.

Sound and Vision has ensured the successful transition to the digital domain after completing a seven-year, 90 million euro programme to digitise its analogue assets. Today, it has one of the largest collections of digital heritage assets in the world, totalling over 15 Terabytes. Recently, a multi-annual innovation agenda was adopted, consisting of five research themes: (1) automatic metadata extraction and big data analysis, (2) exploring new access paradigms, (3) understanding users, (4) ensuring digital durability, and (5) studying the impact of media.

As an integral part of the transition to the digital domain, a new mission statement, a new strategic plan (covering 2016-2020) and a new organisational structure were defined and implemented. A guiding principle was the conviction that the success of memory organisations lies in their ability to make the above-mentioned notions of ‘smart’, ‘connected’ and ‘open’ an integral part of their strategies (Oomen 2011, Ridge 2014). Sound and Vision adopted ‘two-speed IT’ as one of the key design principles. This is mirrored in the structure of the organisation. Three departments are (jointly) responsible for the successful execution of two-speed IT (see Figure 2):
Research and Development - implementing the research agenda through participation in research projects. Software development by scientific programmers.
Development - translating business requirements into functional requirements. Development following Scrum. Evaluating output of R&D projects.
Production & Maintenance - ensuring the uptime of applications, installing new versions and patches from third-party suppliers, according to set service level agreements.

Figure 2. Departments working on Two-speed IT

It needs to be noted that these three departments are not responsible for the business requirements and functional ownership of the services developed. The business units Archive (responsible for collection management and access) and Museum (operating the Sound and Vision museum) bear final responsibility.

4.1 First adoptions: speaker labeling and entity extraction

In the case of Sound and Vision, an off-the-shelf asset management system (named DAAN - Digital Audiovisual Archive Netherlands) from supplier Vizrt forms the foundation of the ‘slow’ ecosystem, next to a more agile ‘fast’ ecosystem of tailor-made solutions for distinct functionalities, notably open source search and automatic metadata extraction. This is the layer where output of research can be implemented in production workflows.

Following the two-speed IT approach, Sound and Vision successfully deployed automatic speaker labelling (a result of a research project with Radboud University) in 2014, speeding up the annotation process and offering a new access point to the collections. In 2015, technology to extract names of people, places, events and organisations from subtitle files was introduced. This was originally developed in collaboration with the University of Amsterdam.
In both cases, spin-off companies from universities are playing an important role in the process: they were involved in developing the initial demonstrators with academics and are currently working under a service level agreement with the Production & Maintenance department.
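
To give a rough idea of what extracting names from subtitle files involves, the sketch below reads subtitle text and counts runs of capitalised words as candidate names. It is a deliberately naive stand-in and not the technology deployed at Sound and Vision: the SubRip (.srt) layout, the function names and the sample text (including the person name) are all assumptions made for this illustration.

```python
import re
from collections import Counter

# Hypothetical subtitle fragment in SubRip (.srt) layout, for illustration only.
SRT_SAMPLE = """1
00:00:01,000 --> 00:00:03,000
Welcome to Hilversum.

2
00:00:03,500 --> 00:00:06,000
Today we visit Sound and Vision
with Jan Jansen.
"""

TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")
CAPITALISED_RUN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def subtitle_text(srt):
    """Drop cue numbers, timing lines and blank lines; keep the spoken text."""
    lines = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or TIMING.match(line):
            continue
        lines.append(line)
    return lines


def candidate_names(lines):
    """Count runs of capitalised words, skipping the first word of each line."""
    counts = Counter()
    for line in lines:
        # The first word is skipped because it is often capitalised
        # for grammatical reasons rather than because it is a name.
        rest = line.split(" ", 1)[1] if " " in line else ""
        counts.update(CAPITALISED_RUN.findall(rest))
    return counts


names = candidate_names(subtitle_text(SRT_SAMPLE))
```

On this sample the heuristic splits "Sound and Vision" into two candidates because of the lower-case "and", which shows why production systems rely on trained entity recognisers and links to a thesaurus rather than on capitalisation alone.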

4.2 Scaling up Two-speed IT for online access

Plans are underway for a major revision of the access strategy. Using the off-the-shelf asset management system as the backend, distinct access scenarios will be supported. This will be the ultimate test case for the ‘two-speed IT’ approach, as it impacts multiple departments and requires significant investments in the infrastructure.

Revising the access strategy started with identifying three forms of online availability that are in line with the strategic plan for 2016-2020:
1. Public Accessibility - providing access to the collections for everyone. This is one of the core missions of the institute. It ensures everyone can search through the entire catalogue online and - if possible - also directly consult the material. If the material isn’t digitized yet, or if rights are not cleared for online consultation, an alternative possibility to access the material will be presented to the user.
2. Online Presentation - highlighting core parts and specific items from the collection, in high quality. This allows the institute to enact its role as a museum in an online environment. It involves highlighting items to offer a perspective on the collection through interpretation and contextualisation. The possibility to experiment with different online forms of presentation is key here. Active participation of the online audience will play an important role in ensuring that the production of meaning is not limited to the perspective of the museum curator.
3. Open Reusability - enabling reuse of the ‘open’ metadata and content for individuals and through APIs (for programmatic access). This supports the ‘open, unless’ access policy that the institute adheres to, allowing third parties to reuse collections that fall in the Public Domain, or where relevant IPR is held by Sound and Vision.
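
The way the three forms depend on the status of an item can be sketched as simple decision logic. The example below is a minimal Python illustration under our own assumptions: the field names and option labels are hypothetical, and the real catalogue will record digitisation and rights status in more detail.

```python
from dataclasses import dataclass


@dataclass
class CatalogueItem:
    # All field names are hypothetical and used for illustration only.
    digitised: bool
    online_rights_cleared: bool  # rights allow online consultation
    reuse_permitted: bool        # Public Domain, or IPR held by the institute


def access_options(item):
    """List what can be offered for one item under the three forms."""
    # Public Accessibility: the metadata of every item is searchable.
    options = ["search catalogue metadata"]
    if item.digitised and item.online_rights_cleared:
        options.append("consult online")
    else:
        # Not digitised or rights not cleared: offer an alternative route.
        options.append("alternative access")
    # Open Reusability: only for digitised items that may be reused.
    if item.digitised and item.reuse_permitted:
        options.append("reuse via API")
    return options
```

Online Presentation is left out of the sketch on purpose, since selecting items to highlight is an editorial decision rather than a property of a single catalogue record.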


Figure 3: Schematic overview of the three forms of online availability

These three forms (see Figure 3) together clarify what online availability constitutes for the institute. Subsequently, they provide a starting point for the formulation of functional and technical requirements for the infrastructure that is required to support them.

The three forms of online availability are not tied to a particular service or organisational department, as they transcend the business units Museum, Archive and R&D. Furthermore, they provide insight into the different ways in which the institute achieves its tasks and goals in relation to online availability of its collections. The different forms of online availability are complementary to each other, and certain aspects of one are preconditions for the other. The formulation of these three forms of online availability is an important first step towards a more holistic online strategy for Sound and Vision. Therefore, it is crucial to prevent these forms from being realised separately from each other. This would lead to fragmentation of the online services provided by the institute, and would not reflect their complementary nature and the manner in which these three forms can be mutually reinforcing.

The next step in defining the online strategy was making an inventory of the functional and technical requirements for the three forms of online availability:
For Public Accessibility, it is of prime importance that the entire collection can be found easily. This requires proper integration into popular search engines, detailed metadata and search functionality that meets the needs of the user. Navigating the collection must be intuitive and insightful, for instance supported with automated recommendations and contextualised with links to internal and external collections. Where it is possible to offer the option to consult the material online, it is important that the quality and user experience meet the expectations of everyday consumers of online media services. These are constantly evolving. For items where only metadata is available for online consultation, alternative options to consult that material (in situ consultation or ordering a physical copy) need to be offered. All of this is heavily reliant on sound administration of the relevant features and information in the catalogue.
Online Presentation is about highlighting specific items from the collection and placing them in a rich, meaningful context. This is managed through curation, research and editorial work and is also supported by automatic, data-driven curation mechanisms. It is the ambition of the Museum unit to experiment and innovate in the online presentation forms in which this is presented to the user. The user’s own voice in all of this is of great importance. An important requirement is therefore to support public participation, to enable a dialogue with the user and to give the audience an active role in the production of meaning.
The Open Reusability of the collection is primarily supported by capturing specific (IPR) information about the collection in the catalogue, particularly the application of open licenses or the fact that an item is part of the Public Domain. This information will then be made explicit in the appropriate form for both humans and (search) machines. To support various types of use of the content, it should be available in high-quality and open media formats (both consultation copies and source files). For automated forms of reuse by third parties, programmatic access to the collection (metadata and content) through an open API - without login - is a requirement.
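
The requirement to make rights information explicit for both humans and machines can be illustrated with a small serialisation step. The sketch below is an assumption-laden example and not the institute's API: the internal rights codes and the function name are hypothetical, the record uses the schema.org `VideoObject` type with its `license` property, and the two licence URLs are the Creative Commons Public Domain Mark 1.0 and Attribution-ShareAlike 4.0 pages.

```python
import json

# Hypothetical internal rights codes mapped to public licence URLs.
OPEN_LICENCES = {
    "public-domain": "https://creativecommons.org/publicdomain/mark/1.0/",
    "cc-by-sa": "https://creativecommons.org/licenses/by-sa/4.0/",
}


def to_open_record(item_id, title, rights_code):
    """Return a JSON string for openly reusable items, or None otherwise."""
    licence_url = OPEN_LICENCES.get(rights_code)
    if licence_url is None:
        # Items without an open licence are not exposed for reuse.
        return None
    record = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "identifier": item_id,
        "name": title,
        "license": licence_url,  # machine-readable rights statement
    }
    return json.dumps(record, sort_keys=True)
```

Because the licence is expressed as a URL in a standard property, both search engines and third-party applications can filter on it without needing to interpret free-text rights notes.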

Note that the three forms of online availability provide opportunities to strengthen each other, which currently remain underutilised. For example, the contextualisation (automated or curated) that results from the Online Presentation could greatly enhance the intuitive navigation of the online catalogue that is provided under Public Accessibility, and vice versa.

To meet the functional and technical requirements for the three forms of online availability summarized above, several new components need to be introduced into the digital infrastructure of Sound and Vision. The three most important ones are listed below:
Transcoding and Online Consultation Copies
Online consultation copies are intended as end products that enable the user to consult the archival material online (and also to reuse the material in the case of Open Reusability). Playback quality and the file formats that are utilised must be in line with the expectations of everyday consumers of online media services (like YouTube and Netflix), which are constantly evolving. It is important to note that online consultation copies need to be generated for only a small part of the entire archive, namely the part where the IPR status permits online consultation. In the current situation this is an estimated 0.18% of the entire collection (and 0.6% of the digital collection). The proportion available for reuse is estimated to be 0.032% of the entire collection. The nature of the online consultation copies comes with higher demands on the playback quality (for an optimal user experience) and the variety of file formats (for different purposes and devices, in a rapidly changing technical environment). This differs from the so-called low-res preview proxies that have already been generated for the entire digital archive, for use in the media asset management (MAM) system.
Data processing and enrichment
Online availability of the Sound and Vision archive is reliant on four sources. Source files are stored in the Digital Archive, which can be accessed via the MAM. The MAM also serves as a catalogue and contains the metadata of all objects. The Thesaurus ensures consistent use of descriptive terms, and serves as a bridge to link to the various collections in the archive and (external) context sources. To assist navigation and automated contextualisation, a media and data analysis component that builds on the above-mentioned sources is required. Automatic annotation and linking is performed by various algorithms. The resulting annotations are - if possible - put back into the MAM catalogue (as metadata enrichments).
Flexible backend
The flexible backend is a collection of functionality that combines data from multiple internal and external collections and context sources. It functions as an intermediate layer in the digital infrastructure, in order to extend the standard functionality provided by the MAM with specific functionality required for the three forms of online availability. The resulting combined data is exposed via various APIs to which online platforms and third parties can connect (depending on the goal).
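
The role of the flexible backend as an intermediate layer can be sketched as a function that combines a catalogue record with thesaurus labels and enrichments into one view for an API. This is a minimal illustration under our own assumptions: the in-memory dictionaries stand in for the MAM catalogue, the Thesaurus and the analysis output, and all names and sample values are hypothetical.

```python
# Hypothetical in-memory stand-ins for the underlying sources.
CATALOGUE = {"item-1": {"title": "Evening news", "subject_ids": ["t1", "t9"]}}
THESAURUS = {"t1": "Broadcasting"}
ENRICHMENTS = {"item-1": [{"type": "speaker", "label": "Speaker A"}]}


def build_view(item_id, catalogue, thesaurus, enrichments):
    """Combine catalogue metadata, thesaurus labels and enrichments."""
    base = catalogue[item_id]
    return {
        "id": item_id,
        "title": base["title"],
        # Resolve subject identifiers to labels; keep unknown ids as-is.
        "subjects": [thesaurus.get(t, t) for t in base.get("subject_ids", [])],
        # Enrichments are optional; items without any get an empty list.
        "enrichments": list(enrichments.get(item_id, [])),
    }


view = build_view("item-1", CATALOGUE, THESAURUS, ENRICHMENTS)
```

Keeping this combination step outside the MAM means new enrichment sources can be added in the ‘fast’ ecosystem without changing the off-the-shelf system underneath.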

The above-mentioned functional and technical requirements and proposed components also offer new possibilities beyond the three forms of online availability. First, the flexible backend can support experimentation, innovation and incubation based on the Sound and Vision archive, by combining direct access to the digital archive and collection metadata with new forms of media and data analysis and/or audience participation. This is relevant for internal R&D, but also for experimentation and innovation on the basis of the archive by (research) partners and startups. It allows the various parties to experiment (casually, exploratively, or in a defined lab setting) with the enrichment of the archive. This corresponds with the desirable behaviour in ecosystem ‘fast’ of two-speed IT. Second, the infrastructure described above offers opportunities for new ways to provide access to and/or present the digital collection within the walls of the institution. Besides exploiting the digital archive and collection metadata, the (automated) enrichment, contextualisation and linking of the items can be reused in the museum context. This provides opportunities for better integration of online and onsite museum experiences.

On the one hand, continuity of the museum exhibits during the opening hours of the Museum is of high importance, and the exhibits should therefore reside in ecosystem ‘slow’ of two-speed IT. On the other hand, to provide a more contextualised, dynamic and personalised experience to the visitor, it is also important to be able to expose the museum visitor to the innovations happening in ecosystem ‘fast’ from the outset. This allows for incremental development of new museum experiences and also provides opportunities for audience engagement, beyond what ecosystem ‘slow’ can offer.

5. Conclusions

After the successful implementation of two-speed IT in strategy and organisational structure, speaker identification and automatic entity extraction were successfully implemented in production systems in 2014 and 2015. This year, an ambitious programme on online access (see Section 4.2) is planned.

Over the past two years, we have learned a lot regarding two-speed IT and find it a very suitable strategy to ensure that outcomes of research and innovation projects can find their way to production systems. With the experience gained over the past years, we look forward to implementing the new access strategy.

References
Bossert, Oliver (2015). “Organizing for digital acceleration: Making a two-speed IT operating model work.” Consulted 12 January 2016. Available http://www.mckinsey.com/insights/high_tech_telecoms_internet/organizing_for_digital_acceleration_making_a_two_speed_it_operating_model_work
Brynjolfsson, E., & McAfee, A. (2014). “The second machine age: Work, progress, and prosperity in a time of brilliant technologies.”
Johnson, L., Adams Becker, S., Estrada, V., and Freeman, A. (2015). “NMC Horizon Report: 2015 Museum Edition.” Austin, Texas: The New Media Consortium.
Mckeown, M. (2012). “Adaptability: The art of winning in an age of uncertainty.” London: Kogan Page.
Oomen, J., & Aroyo, L. (2011). “Crowdsourcing in the Cultural Heritage Domain: Opportunities and Challenges.” In C&T '11, Proceedings of the 5th International Conference on Communities and Technologies. Brisbane: ACM.
Ridge, M. (2014). “Crowdsourcing our cultural heritage.” London: Ashgate.
Rushkoff, D. (2013). “Present shock: when everything happens now.” New York: Current.