<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article article-type="research-article" xml:lang="EN" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">LIBER</journal-id>
<journal-title-group>
<journal-title>LIBER QUARTERLY</journal-title>
</journal-title-group>
<issn pub-type="epub">2213-056X</issn>
<publisher>
<publisher-name>openjournals.nl</publisher-name>
<publisher-loc>The Hague, The Netherlands</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">lq.13579</article-id>
<article-id pub-id-type="doi">10.53377/lq.13579</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Automating Subject Indexing at ZBW: Making Research Results Stick in Practice</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-1019-3606</contrib-id>
<name>
<surname>Kasprzik</surname>
<given-names>Anna</given-names>
</name>
<email>a.kasprzik@zbw-online.eu</email>
<xref ref-type="aff" rid="aff1"/>
</contrib>
<aff id="aff1">ZBW &#x2013; Leibniz Information Centre for Economics, Hamburg/Kiel, Germany</aff>
</contrib-group>
<pub-date pub-type="epub">
<month>10</month>
<year>2023</year>
</pub-date>
<volume>33</volume>
<fpage>1</fpage>
<lpage>17</lpage>
<permissions>
<copyright-statement>Copyright 2023, The copyright of this article remains with the author</copyright-statement>
<copyright-year>2023</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://www.liberquarterly.eu/article/10.53377/lq.13579"/>
<abstract>
<p>Subject indexing, i.e., the enrichment of metadata records for textual resources with descriptors from a controlled vocabulary, is one of the core activities of libraries. Due to the proliferation of digital documents, it is no longer possible to annotate every single document intellectually, which is why we need to explore the potential of automation on every level.</p>
<p>At ZBW, the efforts to partially or completely automate the subject indexing process started as early as 2000 with experiments involving external partners and commercial software. The conclusion of that first exploratory period was that commercial, supposedly shelf-ready solutions would not suffice to cover the requirements of the library. In 2014 the decision was made to do the necessary applied research in-house, a decision that was successfully implemented by establishing a PhD position. However, the prototypical machine learning solutions developed over the following years were yet to be integrated into productive operations at the library. Therefore, in 2020 an additional position for a software engineer was established and a pilot phase was initiated (planned to last until 2024) with the goal of completing the transfer of our solutions into practice by building a suitable software architecture that allows for real-time subject indexing with our trained models and their integration into the other metadata workflows at ZBW.</p>
<p>In this paper we address the question of how to transfer results from applied research into a productive service, and we report on the milestones we have reached so far and on those yet to be reached on an operational level. We also discuss the challenges we faced on a strategic level, the measures and resources (computing power, software, personnel) that were needed in order to effect the transfer, and those that will be necessary to subsequently ensure the continued availability of the architecture and to enable continuous development during running operations.</p>
<p>We conclude that there are still no shelf-ready open source systems for the automation of subject indexing &#x2013; existing software has to be adapted and maintained continuously which requires various forms of expertise. However, the task of automation is here to stay, and librarians are witnessing the dawn of a new era where subject indexing is done at least in part by machines, and the respective roles of machines and human experts may shift even further and more rapidly in a not-so-distant future. We argue that in general, the format of &#x201C;project&#x201D; and the mindset that goes with it may not suffice to secure the commitment that an institution and its decision-makers and the library community as a whole will have to bring to the table in order to face the monumental task of the digital transformation and automation in the long run. We also highlight the importance of all parties &#x2013; applied researchers, software engineers, stakeholders &#x2013; staying involved and continuously communicating requirements and issues back and forth in order to successfully create and establish a productive service that is suitable and equipped for operation.</p>
</abstract>
<kwd-group>
<kwd>subject indexing</kwd>
<kwd>automation</kwd>
<kwd>machine learning</kwd>
<kwd>artificial intelligence</kwd>
<kwd>metadata</kwd>
<kwd>IT infrastructure</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1. Introduction &#x2013; Context</title>
<p>Subject indexing, i.e., the semantic enrichment of metadata records with descriptors, is one of the core activities of libraries. Obviously, due to the proliferation of digital documents it is no longer possible to annotate every single document intellectually &#x2013; which is why we need to explore the potential of automation on every level. Automation can start with small measures such as using simple scripts and routines for metadata manipulation and go all the way to the use of methods from Artificial Intelligence, notably from the domain of Machine Learning.</p>
<p>At ZBW, the efforts to automate the subject indexing process started as early as 2000. Two projects with external partners and/or commercial software yielded some insights into the state of the art at the time but mostly showed that the evaluated solutions would not suffice to cover the requirements of the library and that there were still several hurdles to overcome both with respect to the quality of the output and to the technical implementation. However, abandoning the endeavour was not an option since the need for automation became ever more obvious and pressing over time. A reorientation phase around 2014 led to the decision that from then on, the necessary applied research should be done in-house and only open source software should be used and created. For this purpose, a full-time position for a research engineer with the option to obtain a PhD in computer science was established within the library. The first phase of activities after this reorientation was called &#x201C;project AutoIndex&#x201D; and lasted until 2018. After a personnel change in 2018 the role of coordinating the automation of subject indexing was upgraded to a permanent full-time position and filled with a computer scientist with additional library training.<xref ref-type="fn" rid="fn1"><sup>1</sup></xref></p>
<p>However, the prototypical machine learning solutions that were developed in project AutoIndex were not yet ready to be integrated into productive operations at the library. In order to be able to take on this challenge properly, several additional adjustments were made on the strategic level: Most importantly, the automation of subject indexing at ZBW was declared no longer a project but a permanent task (dubbed &#x201C;AutoSE&#x201D;). This in turn prompted the initiation of a pilot phase (starting in 2020, planned to last until 2024) with the goal of transferring results from applied research in the AutoSE context into a productive service by building a suitable software architecture that allowed for real-time subject indexing with the trained AutoSE models and their integration into the other metadata workflows at ZBW. In order to meet these requirements, AutoSE was granted one more full-time position: since the beginning of the pilot phase, the team has consisted of three people, covering the roles of lead/coordination, applied research, and software development/architecture.</p>
</sec>
<sec id="s2">
<title>2. Applied Research and Productive Operations</title>
<sec id="s2a">
<title>2.1. Applied Research &#x2013; Methods</title>
<p>From the machine learning point of view, subject indexing is a so-called multi-label classification task, i.e., a task in which several labels (&#x223C;subjects) can be assigned to each publication. Since the end of the last AI winter (around 2012), more and more &#x2013; actually usable! &#x2013; machine learning models for this task have emerged, and a large portion of them are available as open source software. In the precursor project, AutoIndex, a prototypical fusion approach towards automated subject indexing at ZBW had been developed that joined several methods and then filtered their combined output using additional rules (<xref ref-type="bibr" rid="r15">Toepfer &#x0026; Seifert, 2018a</xref>). At the same time, a team at the National Library of Finland (NLF) started creating the open source toolkit Annif (<xref ref-type="bibr" rid="r13">Suominen et al., 2023a</xref>) which offers various machine learning models for automated subject indexing and also allows the integration of one&#x2019;s own models. The two institutions were in contact and exchanged information about their respective developments.</p>
<p>At the beginning of the pilot phase the AutoSE team adopted Annif as a framework in order to combine several state-of-the-art models &#x2013; currently the following four are used: two variants (<italic>parabel</italic> and <italic>bonsai</italic>) of <italic>omikuji</italic> (<xref ref-type="bibr" rid="r3">Dong &#x0026; Suominen, 2022</xref>),<xref ref-type="fn" rid="fn2"><sup>2</sup></xref> which are tree-based machine learning algorithms, <italic>fastText</italic> (<xref ref-type="bibr" rid="r6">Facebook Inc., 2022</xref>; <xref ref-type="bibr" rid="r9">Joulin et al., 2016</xref>),<xref ref-type="fn" rid="fn3"><sup>3</sup></xref> which uses word embeddings, and <italic>stwfsa</italic> (<xref ref-type="bibr" rid="r18">ZBW, 2022a</xref>),<xref ref-type="fn" rid="fn4"><sup>4</sup></xref> a lexical algorithm based on finite-state automata, which was developed at ZBW and is optimised for the &#x201C;Standard-Thesaurus Wirtschaft&#x201D; (STW) (<xref ref-type="bibr" rid="r21">ZBW, 2023</xref>), the thesaurus for the economics domain hosted and used for subject indexing at ZBW, but can be used with other vocabularies as well. The output of all of these methods is then combined via another method &#x2013; <italic>nn-ensemble</italic> (<xref ref-type="bibr" rid="r14">Suominen et al., 2023b</xref>) &#x2013; which balances them out, yielding as final result a set of subjects that have all passed a given confidence threshold. For AutoSE the models are trained with short texts from the metadata records underlying the ZBW research portal EconBiz (<ext-link ext-link-type="uri" xlink:href="https://www.econbiz.de/">https://www.econbiz.de/</ext-link>), specifically titles and (if available) author keywords<xref ref-type="fn" rid="fn5"><sup>5</sup></xref> of publications in English. 
In parallel, applied research continues in order to explore other machine learning methods beyond the classical ones, including approaches from Deep Learning in the form of large language transformer models, notably pretrained ones (GPT (<xref ref-type="bibr" rid="r12">Radford et al., 2018</xref>) is a prominent example), which are particularly promising for multi-lingual subject indexing (<xref ref-type="bibr" rid="r17">&#x201C;Transformer (machine learning model)&#x201D;, 2023</xref>).</p>
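<p>For illustration, the ensemble step described above can be sketched in a few lines of Python. This is only a conceptual stand-in: Annif&#x2019;s <italic>nn-ensemble</italic> learns its combination weights with a neural model, whereas the sketch below simply averages hypothetical per-model confidence scores and then applies a confidence threshold. All model names and scores are made-up examples.</p>

```python
# Conceptual sketch of an ensemble step for automated subject indexing:
# average the confidence scores of several backends and keep only the
# subjects that pass a threshold. Scores are hypothetical; Annif's
# actual nn-ensemble is a trained neural combiner, not a plain average.

def combine_scores(per_model_scores, threshold=0.5):
    """per_model_scores: list of dicts mapping subject -> confidence."""
    pooled = {}
    for scores in per_model_scores:
        for subject, score in scores.items():
            pooled.setdefault(subject, []).append(score)
    n_models = len(per_model_scores)
    # Missing predictions implicitly count as 0.0 (we divide by n_models).
    averaged = {s: sum(v) / n_models for s, v in pooled.items()}
    return {s: round(sc, 3) for s, sc in averaged.items() if sc >= threshold}

# Hypothetical output of four backends for one publication title:
omikuji_parabel = {"monetary policy": 0.9, "inflation": 0.7, "theory": 0.4}
omikuji_bonsai = {"monetary policy": 0.8, "inflation": 0.6}
fasttext = {"monetary policy": 0.7, "USA": 0.5}
stwfsa = {"monetary policy": 1.0, "inflation": 0.8}

result = combine_scores([omikuji_parabel, omikuji_bonsai, fasttext, stwfsa])
print(result)  # only "monetary policy" and "inflation" pass the threshold
```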
<p>The AutoSE team is actively involved in the continuous advancement of Annif, checking with NLF at regular intervals if results from the AutoSE context can be integrated as new functionalities, assisting NLF with giving tutorials, and other institutions with advice on how to deploy Annif in practice.<xref ref-type="fn" rid="fn6"><sup>6</sup></xref> The team has complemented the ZBW instance of Annif with their own components for setting up experiments, hyperparameter optimisation, various additional quality control mechanisms (see <xref ref-type="sec" rid="s3">Section 3</xref>), and APIs in order to communicate with internal and external metadata workflows.</p>
</sec>
<sec id="s2b">
<title>2.2. Productive Operations &#x2013; Data Flows</title>
<p>A first version of a productive AutoSE service went into operation in 2021. The software runs on a Kubernetes cluster of five virtual machines and technologies such as <italic>helm</italic>, GitLab, <italic>prometheus</italic> and <italic>grafana</italic> are used for software deployment, continuous integration, and monitoring. As the applied research continues and the team is integrating more and more of the original requirements as well as supplementary enhancements, the architecture is constantly evolving and its modular design keeps it adaptable to future developments beyond the pilot phase.</p>
<p>The output of the service is used for two purposes at present: The first is fully automated subject indexing, for publications in English that would otherwise not be annotated with any subjects from the STW thesaurus at all. The EconBiz database is checked every hour for new eligible metadata records; these are then enriched by AutoSE with STW subjects and immediately written back into the database. If a publication happens to belong to the core set of literature that is earmarked to be annotated by human specialists at ZBW, then the AutoSE subjects are subsequently suppressed both in the search index and in the single display page for this publication once the intellectual subject indexing has taken place. The connection between AutoSE and the EconBiz database was activated in July 2021, and in the first six months of operations, over 100,000 machine-annotated metadata records were entered into the database via direct write access.<xref ref-type="fn" rid="fn7"><sup>7</sup></xref> The total number of records enriched by AutoSE methods in the database is higher, as the team also processes large amounts of records retroactively, which are then written back into the database via a batch process.<xref ref-type="fn" rid="fn8"><sup>8</sup></xref> As of December 2022, the EconBiz database contains around 1.3 million records with AutoSE subject indexing, which corresponds to about a quarter of the ZBW holdings.</p>
<p>The second purpose of the output of the service is machine-assisted subject indexing: the subjects generated by AutoSE are made available as suggestions to the platform used for intellectual subject indexing at ZBW (&#x201C;Digitaler Assistent&#x201D;; DA-3)<xref ref-type="fn" rid="fn9"><sup>9</sup></xref> via an API. This connection was the first one implemented, in 2020. Within DA-3, AutoSE suggestions are marked as machine-generated for reasons of transparency, and they can be adopted by a single click on an &#x201C;add&#x201D; button during the annotation of a publication. Freshly annotated records are stored in the union catalogue and mirrored back into the EconBiz database where the AutoSE team collects them and computes the F1 score (<xref ref-type="bibr" rid="r5">&#x201C;F-score&#x201D;, 2023</xref>) from the difference in order to monitor the performance of the current productive backend (also see <xref ref-type="sec" rid="s3">Section 3</xref>). <xref ref-type="fig" rid="fg001">Figure 1</xref> shows an overview of the corresponding data flows.</p>
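<p>The record-level F1 score mentioned above can be computed directly from the difference between the two subject sets. A minimal sketch (the subject labels are hypothetical examples, not actual STW descriptors):</p>

```python
# Sketch: F1 score between machine-generated and intellectually assigned
# subject sets for one metadata record, as used to monitor the backend.

def f1_score(machine_subjects, human_subjects):
    machine, human = set(machine_subjects), set(human_subjects)
    if not machine or not human:
        return 0.0
    true_positives = len(machine & human)
    precision = true_positives / len(machine)  # share of machine subjects confirmed
    recall = true_positives / len(human)       # share of human subjects found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

auto = {"inflation", "monetary policy", "theory"}
intellectual = {"inflation", "monetary policy", "central banking"}
print(round(f1_score(auto, intellectual), 2))  # 2 of 3 overlap on each side -> 0.67
```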
<fig id="fg001">
<label>Fig. 1:</label>
<caption><p>Data flows of machine-generated subject indexing using the AutoSE service.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/LIBER_2023_33_Kasprzik_fig1.jpg"/></fig>
<p>Milestones yet to be completed within the pilot phase include:</p>
<list list-type="bullet"><list-item><p>Preparing for the use of abstracts and potentially also tables of content in addition to titles and author keywords &#x2013; besides gathering the necessary amount of training data for experiments in order to develop models that are optimised for these kinds of text materials, the team also had to clarify the situation with respect to text mining rights for abstracts since most licences do not mention the use of abstracts for non-commercial productive purposes such as the AutoSE service, even if the use for research purposes is explicitly allowed.</p></list-item>
<list-item><p>Preparing the integration of solutions for languages other than English &#x2013; approaches include an upstream machine translation before subject indexing and, as the most promising option, the use of BERT-like transformer models (<xref ref-type="bibr" rid="r17">&#x201C;Transformer (machine learning model)&#x201D;, 2023</xref>).</p></list-item>
<list-item><p>Finalising and publishing a web user interface which provides an interactive demo of the productive backend, statistics concerning the current and past performance of the AutoSE service and additional information about the methods comprised in the backend.</p></list-item>
<list-item><p>Automating various machine learning processes such as hyperparameter optimisation and training, in order to be able to retrain the models more easily when enough new metadata records have accumulated or a new version of the STW thesaurus is available.</p></list-item>
<list-item><p>Documenting the requirements of (future) productive operations (also see <xref ref-type="sec" rid="s4">Section 4</xref>).</p></list-item>
</list>
<p>Plans beyond the pilot phase include extending the architecture to integrate automated metadata extraction workflows in order to generate more input for AutoSE, and combining machine learning with symbolic approaches, i.e., incorporating more semantic information from STW and from external sources as a way to check the plausibility of the output of our trained models.</p>
</sec>
</sec>
<sec id="s3">
<title>3. Quality Management</title>
<sec id="s3a">
<title>3.1. Various Approaches to a Quality Assurance Concept</title>
<p>The automation of subject indexing is a change prompted by new technological possibilities, but it also affects subject indexing practices on a cultural level. In an automation endeavour such as this, quality control is key &#x2013; both because of the (positive or negative) effects of metadata quality on retrieval and because the approval of the output of the service among the stakeholders (i.e., in particular subject indexing experts) is vital in order to make sure that it will be accepted and used long-term.</p>
<p>The AutoSE team is working on a comprehensive quality assurance concept using different approaches in order to be able to guarantee an overall subject indexing quality that is as high as possible (also see <xref ref-type="bibr" rid="r10">Kasprzik (2022)</xref>). On the technical side this includes working with metrics commonly used in the machine learning domain (we currently aim to maximize the F1 score but plan to evaluate differently weighted combinations of precision and recall, as well as other metrics such as Normalized Discounted Cumulative Gain or metrics that take the hierarchy of the thesaurus into account) and identifying reasonable thresholds (e.g., the minimum level of confidence required). After the automated subject indexing process proper, those thresholds are applied to the output (along with other filters and if-then rules such as blacklists and mappings, see 3.2). Since 2022, quality control for AutoSE also features the application of a machine-learning-based approach for the prediction of overall subject indexing quality for a given metadata record. More precisely, the method <italic>qualle</italic> predicts the recall for that record by drawing on confidence scores for individual subjects plus additional heuristics such as text length, special characters, and a comparison of the expected number of labels to the number of labels that were actually suggested. <italic>Qualle</italic> is based on a prototype described in <xref ref-type="bibr" rid="r16">Toepfer and Seifert (2018b)</xref> &#x2013; however, in order to be usable in productive operations, the code had to be re-implemented from scratch (<xref ref-type="bibr" rid="r19">ZBW, 2022b</xref>). 
Before launching <italic>qualle</italic> the team asked ZBW subject indexing experts for an intellectual review of its output in order to make sure that it would outperform the previous method &#x2013; up until that point a much coarser semantic filter had been applied for quality control on the metadata record level (rule &#x201C;min2VB&#x201D;): the output had to contain at least two subjects from one of the two economic core domains, modelled as two sub-thesauri in STW. This heuristic neglects other domains associated with economics, whereas <italic>qualle</italic> simply learns from the training data what an appropriate subject indexing should look like, without discriminating between sub-thesauri. This shows that, if trained on suitable data, a machine-learning-based method can be more flexible than an intellectually postulated rule.</p>
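<p>For illustration, the coarse &#x201C;min2VB&#x201D; rule amounts to a simple record-level check. In the sketch below, the sub-thesaurus membership sets are hypothetical placeholders for the two STW core domains, not actual STW content:</p>

```python
# Sketch of the coarse "min2VB" record-level filter: accept a
# machine-generated subject set only if at least two of its subjects
# come from one of the two economic core sub-thesauri of STW.
# The membership sets below are hypothetical placeholders.

CORE_ECONOMICS = {"inflation", "monetary policy", "fiscal policy"}
CORE_BUSINESS = {"marketing", "controlling", "management"}

def passes_min2vb(subjects):
    subjects = set(subjects)
    return (len(subjects & CORE_ECONOMICS) >= 2
            or len(subjects & CORE_BUSINESS) >= 2)

print(passes_min2vb({"inflation", "monetary policy", "ecology"}))  # True
print(passes_min2vb({"inflation", "marketing", "ecology"}))        # False
```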
</sec>
<sec id="s3b">
<title>3.2. The &#x201C;Human in the Loop&#x201D;</title>
<p>However, one of the most essential components of quality assurance is and will remain the human element. The machine learning domain has coined the phrase &#x201C;human in the loop&#x201D; for this paradigm, which addresses &#x201C;the right ways for humans and machine learning algorithms to interact to solve problems&#x201D; (<xref ref-type="bibr" rid="r11">Monarch &#x0026; Manning, 2021</xref>). Possible interpretations and implementations may include:</p>
<list list-type="bullet"><list-item><p>the fact that training data is typically annotated by humans (which is also the case for AutoSE)</p></list-item>
<list-item><p>the fact that knowledge organisation systems and mappings between them are usually created and maintained by humans (which applies to STW as well)</p></list-item>
<list-item><p>machine-assisted subject indexing (such as machine-generated suggestions in DA-3, see above)</p></list-item>
<list-item><p>and various ways of feeding intellectual feedback into approaches such as Online Learning (where a machine retrains itself directly, for example on the basis of intellectual feedback data) and Active Learning (where a machine can interactively request annotations or assessments for individual data points from a human at certain points).</p></list-item>
</list>
<p>With respect to ways of gathering intellectual feedback, several strategies have been used in the AutoSE context. One such strategy has been conducting an intellectual review about once a year where a group of ZBW subject indexing experts assess the quality of machine-generated subjects for a sample of around 1,000 publications by assigning one of four quality levels both to each individual subject and to the sum of subjects for the publication in question. If experts found a subject missing, then they could enter that into the form as well. For this kind of review the team used an interface that was developed in project AutoIndex which allows experts to view the relevant metadata, to access the full text via a link, and to navigate in the records assigned to them (<xref ref-type="bibr" rid="r20">ZBW, 2022c</xref>). After every review, the team conducted an extensive debriefing where the experts could also report individual observations and perceived biases in the output of AutoSE. Over the last several reviews, this has helped to identify and to remedy systematic divergences from the desired outcome &#x2013; for example, due to overrepresentation in the training data, the subjects for &#x201C;theory&#x201D; and &#x201C;USA&#x201D; wrongly appeared in the output more often than other subjects. As a temporary fix, the subject for &#x201C;USA&#x201D; was subsequently blocked if it was not contained in the title or the author keywords explicitly, whereas &#x201C;theory&#x201D; was blocked if a subject from a list with more specific subjects pertaining to economic theories compiled by the domain experts was also present in the (candidate) output. However, this approach is tedious and error-prone and such short- to medium-term solutions should be superseded by improvements in machine learning methods in the long run.</p>
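<p>The temporary blocking rules described above amount to simple if-then filters applied to the candidate output. In the sketch below, the rule logic is paraphrased from the text; the label strings and the list of more specific theory subjects are hypothetical placeholders:</p>

```python
# Sketch of the temporary filter rules described above: "USA" is blocked
# unless it appears explicitly in the title or author keywords; "theory"
# is blocked if a more specific theory subject is also among the
# candidates. SPECIFIC_THEORIES is a hypothetical stand-in for the list
# compiled by the domain experts.

SPECIFIC_THEORIES = {"game theory", "growth theory", "price theory"}

def apply_filter_rules(candidates, title, author_keywords):
    # Tokenise title and keywords so "usa" only matches as a whole word.
    words = set((title + " " + " ".join(author_keywords)).lower().split())
    filtered = set(candidates)
    if "usa" in filtered and "usa" not in words:
        filtered.discard("usa")
    if "theory" in filtered and filtered & SPECIFIC_THEORIES:
        filtered.discard("theory")
    return filtered

out = apply_filter_rules(
    {"usa", "theory", "game theory", "inflation"},
    title="Inflation dynamics and game theory",
    author_keywords=[],
)
print(sorted(out))  # both "usa" and "theory" are blocked here
```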
<p>Another way of gathering feedback is comparing AutoSE output to intellectual subject indexing, where available &#x2013; every time a ZBW subject indexing expert adds STW subjects to a metadata record that had already been enriched by AutoSE, the AutoSE system is notified and the F1 score is computed from the difference between the two sets of subjects. This enables the team to gather evidence that a new backend performs better than the previous one before launching it into productive operations, for example &#x2013; see <xref ref-type="fig" rid="fg002">Figure 2</xref> for a visualisation of an A/B test where two backends were operated in parallel for a certain period of time in order to compare them with respect to this metric.</p>
<fig id="fg002">
<label>Fig. 2:</label>
<caption><p>Comparison of the F1 scores computed from subsequent intellectual subject indexing for two backends over time.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/LIBER_2023_33_Kasprzik_fig2.jpg"/></fig>
<p>However, while the F1 score is an accepted performance indicator in classification tasks, note that this is merely a &#x201C;binary&#x201D; sort of feedback: it only allows one to determine whether a machine-generated subject is also present in the intellectually generated set and whether an intellectually generated subject is missing from the machine-generated set, but not, for example, whether a machine-generated subject not chosen by the human indexer is just too general or completely incorrect. Therefore, and since annual reviews yield only little feedback data due to a lack of personnel resources and consequently small samples, in early 2022 the team collaborated with the provider of the DA-3 platform in order to integrate a solution into DA-3 so that subject librarians could give graded feedback. As a consequence, subject indexing experts are now able and strongly encouraged to submit quality assessments via DA-3 continuously during their everyday work, without having to change clients. As in previous reviews, they can rate subjects individually and their sum on the metadata record level. Missing subjects are computed from differences between AutoSE suggestions and the intellectual subject indexing that the experts enter into a record. The larger amount of assessment data collected this way affords the team a much better overview of AutoSE performance (as perceived by subject indexing experts) and enables them to improve their portfolio of methods in a more targeted way. Dynamically generated visualisations of this data (such as the one in <xref ref-type="fig" rid="fg002">Figure 2</xref>) will also be displayed via a web user interface in the future<xref ref-type="fn" rid="fn10"><sup>10</sup></xref> to increase transparency.</p>
<p>Future plans with respect to the implementation of a more advanced &#x201C;human in the loop&#x201D; relationship include exploring if this feedback data can be used for incremental learning (Online Learning). Another intriguing concept to pursue is that of Active Learning (see above). So far, automated and intellectual subject indexing represent quasi-separate lanes &#x2013; machine-generated subjects are discarded as soon as human-generated ones are available (even if the latter may be inspired by the former). We would like to explore the possibilities of a more interactive mode for machines and humans to solve the task of subject indexing together that also exploits their respective strengths (better) &#x2013; currently automated solutions are still designed to emulate intellectual ones as closely as possible although machines may be able to identify subtle patterns and differences where traditional rules for intellectual subject indexing are too coarse. Moreover, once the process is at least partly automated, this may also pave the way for the switch towards cataloguing and subject indexing practices that are based on entities and on formalised relationships between them, and not on (string-based) entries in a database, which in turn will facilitate the application of more advanced semantic technologies. Naturally, all these potential approaches have to be submitted to carefully designed studies as to their feasibility and actual usefulness in order to make sure that the suggested changes in information processing practices in the library are sustainable and tailored to the needs of the various users and stakeholders in a constructive way.</p>
</sec>
</sec>
<sec id="s4">
<title>4. Lessons Learned</title>
<sec id="s4a">
<title>4.1. Commitment and Communication</title>
<p>There are many hurdles to overcome in the transfer of prototypical software results into practice &#x2013; see <xref ref-type="bibr" rid="r1">Battistella et al. (2015)</xref> for a comprehensive literature review about the challenges for technology transfer in general, many of which apply to our case as well.</p>
<p>The author of this paper has time and again heard librarians (at this institution and others) express frustration about the impression that while there may be a number of promising automation solutions from applied research activities or other institutional projects, these solutions rarely make it into productive services that they can actually use in their everyday work. One reason for such an incomplete transfer into the infrastructure of an institution may be a lack of resources, especially of personnel with suitable expertise, both for applied research and for the implementation of research results into usable software. Decision-makers in libraries often seem reluctant to commit to a larger digital transformation attempt long-term and rather prefer to label tentative in-house automation activities as &#x201C;projects&#x201D; in order to avoid tying down substantial amounts of resources (including <italic>permanent</italic> positions for highly qualified staff) for many years. This is why for AutoSE the official switch from project status to a permanent task was not just a symbolic step but essential for breaking the barrier towards going live with a first version of the service because it did signal the necessary commitment from and to everybody involved. At ZBW, the acquisition of the necessary software and computing power and the creation of an additional position for a software architect were direct consequences of that initial commitment. Nevertheless, it goes without saying that the process did require and still requires a constant renegotiation of human and material resources and dealing with shifting strategic and financial circumstances within the institution and in the world around it. 
With regard to staffing, experiences from the AutoSE pilot phase have shown that, at an absolute minimum, the roles that have to be filled in order to ensure permanent productive operations and a continued development of the service are those of coordination,<xref ref-type="fn" rid="fn11"><sup>11</sup></xref> applied research, software development, and IT administration. The AutoSE team covers these roles with three people at the moment. Best practice also shows that at the very least the latter two types of expertise should be distributed over more than one person in order for productive operations to be fail-safe. Increasing the staff within the team would obviously be an attractive option (and a luxury at most institutions), but it also helps if the technology stack of the institution as a whole is as homogeneous as possible, because that facilitates the exchange of expertise with other departments &#x2013; which is why inter-departmental communication and coordination are so important for the activities of AutoSE.<xref ref-type="fn" rid="fn12"><sup>12</sup></xref></p>
<p>However, commitment is not just a requirement for decision-makers. If the transfer from research all the way into productive operations is to be successful, all parties involved have to stay involved until the results are satisfactory. This also pertains to researchers &#x2013; which is often problematic because the membrane between the academic world and the world of real-life use cases still seems to act as an obstacle to fruitful cooperation. Research processes in computer science can be categorised very roughly into two types: fundamental research yields theoretical findings which hold true no matter how they are applied, while applied research yields prototypical software which then has to be transformed into productive services. The execution of that latter step depends heavily on the application context and is not a one-way street but requires many cycles to adjust the outcome to the use case. These subsequent stages involve not only additional scientific tests, but also the development and testing of software for productive operations, usability tests with the target users in order to find further issues and to ensure acceptance, and so on.<xref ref-type="fn" rid="fn13"><sup>13</sup></xref> Typically, these are the stages where the transfer process is prone to seizing up and where the people involved get frustrated.</p>
<p>One cause for this could be that in the past the latter stages have been underestimated by both decision-makers and researchers, and as a consequence success in this domain is not rewarded with the same prestige as academic success. Sadly, there is currently no real incentive for researchers to stay involved beyond the prototype stage &#x2013; on the contrary, the pressure caused by academic key performance indicators (&#x201C;number of publications&#x201D;, &#x201C;amount of third-party funding acquired&#x201D;, etc.) is so intense that it actively keeps them from doing so. AutoSE has been fortunate in having a full PhD position assigned within the team, so that this staff member could focus both on their research and on the application context and thus create as much synergy as possible &#x2013; however, when collaborating with researchers outside of the team, these conflicting priorities remain a challenge that can sometimes get in the way.<xref ref-type="fn" rid="fn14"><sup>14</sup></xref> A (partial) solution would be for policy-makers to create key performance indicators that measure and reward successful research transfer activities (both for institutions and for the individuals involved in such activities) and to attribute the same significance to them as to the other indicators.</p>
<p>Shifting more attention towards the relevance and complexities of the latter stages would hopefully a) cause decision-makers to provide sufficient human resources to fill all the essential roles (see above), and b) increase the extent to which the skills necessary for those latter stages are taught during the education of prospective applied researchers. This includes both the technical skills needed to participate in large-scale software development projects (cooperative programming, testing, deployment, and so on) and the soft skills needed for project management and for communicating with stakeholders, so that researchers are endowed with the necessary toolkit to find out where the practical challenges lie and to help solve them. In essence, all parties &#x2013; applied researchers, software engineers, stakeholders &#x2013; have to continuously communicate requirements and issues back and forth in order to effect a successful transfer of research results into a productive service that is suitable and equipped for permanent operation.</p>
</sec>
<sec id="s4b">
<title>4.2. Conclusion</title>
<p>Experiences from the pilot phase to date have shown the following: as yet, there are no shelf-ready open-source automation systems for subject indexing &#x2013; existing software has to be adapted and maintained continuously, which requires various forms of expertise. The step of leaving the project format behind is worth the effort &#x2013; the search for automation solutions for subject indexing and other related processes is a permanent task that will stay with libraries for many years to come. Accordingly, productive operations in line with this task have to be based on a thoroughly established long-term concept and accompanied by adequate resources (personnel, software, computing power).</p>
<p>We have found it greatly beneficial that applied research and software development for AutoSE are done <italic>within</italic> the library part of ZBW (and not in a separate research or IT development department), because this allows close collaboration and communication with subject librarians. It is essential to include subject indexing experts as stakeholders in the process &#x2013; both for their expertise in the areas of information and knowledge organisation and to increase acceptance, since transparency helps to dissipate reservations and to establish a basic trust in the technology, and especially in the ways the team is going to use it. The implementation of methods from Artificial Intelligence can assist libraries in their continued mission to prepare and provide information resources while remodelling their information processing practices in a novel way. The concept of <italic>human in the loop</italic> offers a possible approach for retaining intellectual subject indexing expertise while combining it with machine-learning-based methods, thus transferring it into a form that is better adapted to the potential of the state-of-the-art technology available today.</p>
</sec>
</sec>
</body>
<back>
<fn-group>
<title>Notes</title>
<fn id="fn1"><p>The author of this paper.</p></fn>
<fn id="fn2"><p>See <ext-link ext-link-type="uri" xlink:href="https://github.com/NatLibFi/Annif/wiki/Backend&#x0025;3A-Omikuji">https://github.com/NatLibFi/Annif/wiki/Backend&#x0025;3A-Omikuji</ext-link> for its integration into Annif.</p></fn>
<fn id="fn3"><p>See <ext-link ext-link-type="uri" xlink:href="https://github.com/NatLibFi/Annif/wiki/Backend&#x0025;3A-fastText">https://github.com/NatLibFi/Annif/wiki/Backend&#x0025;3A-fastText</ext-link> for its integration into Annif.</p></fn>
<fn id="fn4"><p>See <ext-link ext-link-type="uri" xlink:href="https://github.com/NatLibFi/Annif/wiki/Backend&#x0025;3A-STWFSA">https://github.com/NatLibFi/Annif/wiki/Backend&#x0025;3A-STWFSA</ext-link> for its integration into Annif.</p></fn>
<fn id="fn5"><p>Experiments in the AutoSE context have shown that author keywords improve the F1 score (<xref ref-type="bibr" rid="r5">&#x201C;F-score&#x201D;, 2023</xref>) to 0.55 on average as opposed to 0.47 when only using titles.</p></fn>
<fn id="fn6"><p>For example, there had been a regular exchange of ideas prior to the German National Library launching their own Annif-based &#x201C;cataloguing machine&#x201D; and associated AI project, see <xref ref-type="bibr" rid="r8">Grote (2022)</xref> and <xref ref-type="bibr" rid="r7">German National Library (2022)</xref>.</p></fn>
<fn id="fn7"><p>The number for intellectual subject indexing at ZBW is around 30,000 records per year.</p></fn>
<fn id="fn8"><p>&#x223C;147,000 in 2021; &#x223C;500,000 in 2020.</p></fn>
<fn id="fn9"><p>For a short description see <xref ref-type="bibr" rid="r4">Eurospider Information Technology AG (2023)</xref>.</p></fn>
<fn id="fn10"><p>(January 2023) A prototypical version exists &#x2013; the next step is to launch the UI internally.</p></fn>
<fn id="fn11"><p>In <xref ref-type="bibr" rid="r1">Battistella et al. (2015)</xref> this role is called the &#x201C;intermediary&#x201D;: The intermediary &#x201C;acts as a third party agent assuming the role of facilitation/mediation between the parties in order to facilitate the relational context and with the aim of supporting the development of the process in its criticalities, addressing enabling or constraining factors.&#x201D;</p></fn>
<fn id="fn12"><p>The goal is to increase the so-called <xref ref-type="bibr" rid="r2">&#x201C;bus factor&#x201D; (2023)</xref> of the team which is &#x201C;the minimum number of team members that have to suddenly disappear from a project before the project stalls due to lack of knowledgeable or competent personnel&#x201D;.</p></fn>
<fn id="fn13"><p>&#x201C;Technology transfer is a bilateral process between sender and receiver: there is a &#x201A;process of feed-back&#x2018; from the sender to the receiver, which allows interested parties to obtain more information (knowledge) on the use of technology transferred&#x201D; (<xref ref-type="bibr" rid="r1">Battistella et al., 2015</xref>) &#x2013; the same is true for applied research results.</p></fn>
<fn id="fn14"><p>In terms of the challenges compiled by <xref ref-type="bibr" rid="r1">Battistella et al. (2015)</xref> this could also be interpreted as an issue of &#x201C;cultural distance&#x201D;, i.e., the transfer gets harder if there is a certain lack of a shared vision or a common goal.</p></fn></fn-group>
<ref-list>
<title>References</title>
<ref id="r1"><mixed-citation>Battistella, C., De Toni, A., &#x0026; Pillon, R. (2015). Inter-organisational technology/knowledge transfer: a framework from critical literature review. <italic>The Journal of Technology Transfer</italic>, <italic>41</italic>, 1195&#x2013;1234. <ext-link ext-link-type="doi" xlink:href="10.1007/s10961-015-9418-7">https://doi.org/10.1007/s10961-015-9418-7</ext-link></mixed-citation></ref>
<ref id="r2"><mixed-citation>Bus factor. (2023, July 5). In <italic>Wikipedia</italic>. <ext-link ext-link-type="uri" xlink:href="https://en.wikipedia.org/w/index.php?title=Bus_factor&#x0026;oldid=1154218987">https://en.wikipedia.org/w/index.php?title&#x003D;Bus_factor&#x0026;oldid&#x003D;1154218987</ext-link></mixed-citation></ref>
<ref id="r3"><mixed-citation>Dong, T., &#x0026; Suominen, O. (2022). <italic>Omikuji</italic> [Computer Software]. <ext-link ext-link-type="uri" xlink:href="https://github.com/tomtung/omikuji">https://github.com/tomtung/omikuji</ext-link></mixed-citation></ref>
<ref id="r4"><mixed-citation>Eurospider Information Technology AG. (2023, January 17). <italic>Subject indexing using the DA</italic>. <ext-link ext-link-type="uri" xlink:href="https://www.eurospider.com/en/relevancy-product/digital-assistant-da-3">https://www.eurospider.com/en/relevancy-product/digital-assistant-da-3</ext-link></mixed-citation></ref>
<ref id="r5"><mixed-citation>F-score. (2023, January 17). In <italic>Wikipedia</italic>. <ext-link ext-link-type="uri" xlink:href="https://en.wikipedia.org/wiki/F-score">https://en.wikipedia.org/wiki/F-score</ext-link></mixed-citation></ref>
<ref id="r6"><mixed-citation>Facebook Inc. (2022). <italic>fastText</italic> [Computer Software]. <ext-link ext-link-type="uri" xlink:href="https://fasttext.cc/">https://fasttext.cc/</ext-link></mixed-citation></ref>
<ref id="r7"><mixed-citation>German National Library. (2022, October). <italic>Automatic Cataloguing System</italic>. <ext-link ext-link-type="uri" xlink:href="https://www.dnb.de/EN/Professionell/ProjekteKooperationen/Projekte/KI/ki_node.html">https://www.dnb.de/EN/Professionell/ProjekteKooperationen/Projekte/KI/ki_node.html</ext-link></mixed-citation></ref>
<ref id="r8"><mixed-citation>Grote, C. (2022, May 11). <italic>German National Library launched its &#x201C;cataloguing machine&#x201D;</italic> [Online forum post]. Annif Users. <ext-link ext-link-type="uri" xlink:href="https://groups.google.com/g/annif-users/c/KVQB-hvLrbA/m/I9RwM9EPBgAJ">https://groups.google.com/g/annif-users/c/KVQB-hvLrbA/m/I9RwM9EPBgAJ</ext-link></mixed-citation></ref>
<ref id="r9"><mixed-citation>Joulin, A., Grave, E., Bojanowski, P., &#x0026; Mikolov, T. (2016). <italic>Bag of tricks for efficient text classification</italic>. arXiv preprint. <ext-link ext-link-type="doi" xlink:href="10.48550/arXiv.1607.01759">https://doi.org/10.48550/arXiv.1607.01759</ext-link></mixed-citation></ref>
<ref id="r10"><mixed-citation>Kasprzik, A. (2022, July 26&#x2013;29). <italic>Get everybody on board and get going &#x2013; the automation of subject indexing at ZBW</italic> [Conference presentation] 87<sup>th</sup> IFLA World Library and Information Congress (WLIC), Satellite Meeting: Information Technology &#x2013; New Horizons in Artificial Intelligence in Libraries, Dublin, Ireland. <ext-link ext-link-type="uri" xlink:href="https://repository.ifla.org/handle/123456789/2047">https://repository.ifla.org/handle/123456789/2047</ext-link></mixed-citation></ref>
<ref id="r11"><mixed-citation>Monarch, R.M., &#x0026; Manning, C.D. (2021). <italic>Human-in-the-loop machine learning &#x2013; active learning and annotation for human-centered AI</italic>. Manning Publications. <ext-link ext-link-type="uri" xlink:href="https://livebook.manning.com/book/human-in-the-loop-machine-learning/">https://livebook.manning.com/book/human-in-the-loop-machine-learning/</ext-link></mixed-citation></ref>
<ref id="r12"><mixed-citation>Radford, A., Narasimhan, K., Salimans, T., &#x0026; Sutskever, I. (2018). <italic>Improving language understanding by generative pre-training.</italic> OpenAI. <ext-link ext-link-type="uri" xlink:href="https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf">https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf</ext-link></mixed-citation></ref>
<ref id="r13"><mixed-citation>Suominen, O., Inkinen J., Virolainen, T., F&#x00FC;rneisen, M., Kinoshita, B.P., Veldhoen, S., Sj&#x00F6;berg, M., Umstein, P., Neatherway, R., &#x0026; Lehtinen, M. (2023a). <italic>Annif</italic> [Computer Software]. <ext-link ext-link-type="doi" xlink:href="10.5281/zenodo.5654173">https://doi.org/10.5281/zenodo.5654173</ext-link></mixed-citation></ref>
<ref id="r14"><mixed-citation>Suominen, O., Inkinen J., Virolainen, T., F&#x00FC;rneisen, M., Kinoshita, B.P., Veldhoen, S., Sj&#x00F6;berg, M., Umstein, P., Neatherway, R., &#x0026; Lehtinen, M. (2023b). <italic>nn-ensemble</italic> [Computer Software].</mixed-citation></ref>
<ref id="r15"><mixed-citation>Toepfer, M., &#x0026; Seifert, C. (2018a). Fusion architectures for automatic subject indexing under concept drift. <italic>International Journal on Digital Libraries, 21</italic>, 169&#x2013;189. <ext-link ext-link-type="doi" xlink:href="10.1007/s00799-018-0240-3">https://doi.org/10.1007/s00799-018-0240-3</ext-link></mixed-citation></ref>
<ref id="r16"><mixed-citation>Toepfer, M., &#x0026; Seifert, C. (2018b). Content-based quality estimation for automatic subject indexing of short texts under precision and recall constraints. In E. M&#x00E9;ndez, F. Crestani, C. Ribeiron, G. David, &#x0026; J. Correia Lopes (Eds.), <italic>Lecture Notes in Computer Science: Vol. 11057. Digital Libraries for Open Knowledge</italic> (pp. 3&#x2013;15). Springer. <ext-link ext-link-type="doi" xlink:href="10.1007/978-3-030-00066-0_1">https://doi.org/10.1007/978-3-030-00066-0_1</ext-link></mixed-citation></ref>
<ref id="r17"><mixed-citation>Transformer (machine learning model). (2023, January 17). In <italic>Wikipedia</italic>. <ext-link ext-link-type="uri" xlink:href="https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)">https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)</ext-link></mixed-citation></ref>
<ref id="r18"><mixed-citation>ZBW &#x2013; Leibniz Information Centre for Economics. (2022a). <italic>stwfsapy</italic> [Computer Software]. <ext-link ext-link-type="uri" xlink:href="https://github.com/zbw/stwfsapy">https://github.com/zbw/stwfsapy</ext-link></mixed-citation></ref>
<ref id="r19"><mixed-citation>ZBW &#x2013; Leibniz Information Centre for Economics. (2022b). <italic>qualle</italic> [Computer Software]. <ext-link ext-link-type="uri" xlink:href="https://github.com/zbw/qualle">https://github.com/zbw/qualle</ext-link></mixed-citation></ref>
<ref id="r20"><mixed-citation>ZBW &#x2013; Leibniz Information Centre for Economics. (2022c). <italic>releasetool</italic> [Computer Software]. <ext-link ext-link-type="uri" xlink:href="https://github.com/zbw/releasetool">https://github.com/zbw/releasetool</ext-link></mixed-citation></ref>
<ref id="r21"><mixed-citation>ZBW &#x2013; Leibniz Information Centre for Economics. (2023, January 17). <italic>STW Thesaurus for Economics</italic>. <ext-link ext-link-type="uri" xlink:href="https://zbw.eu/stw/version/latest/about.en.html">https://zbw.eu/stw/version/latest/about.en.html</ext-link></mixed-citation></ref>
</ref-list>
</back>
</article>