<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article article-type="research-article" xml:lang="EN" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">LIBER</journal-id>
<journal-title-group>
<journal-title>LIBER QUARTERLY</journal-title>
</journal-title-group>
<issn pub-type="epub">2213-056X</issn>
<publisher>
<publisher-name>openjournals.nl</publisher-name>
<publisher-loc>The Hague, The Netherlands</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">lq.10940</article-id>
<article-id pub-id-type="doi">10.53377/lq.10940</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Article</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Dawning of a New Age&#x003F; Economics Journals&#x2019; Data Policies on the Test Bench</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-7905-4209</contrib-id>
<name>
<surname>Vlaeminck</surname>
<given-names>Sven</given-names>
</name>
<email>s.vlaeminck@zbw.eu</email>
<xref ref-type="aff" rid="aff1"/>
</contrib>
<aff id="aff1">ZBW Leibniz-Information Centre for Economics, Hamburg, Germany</aff>
</contrib-group>
<pub-date pub-type="epub">
<month>8</month>
<year>2021</year>
</pub-date>
<volume>31</volume>
<fpage>1</fpage>
<lpage>29</lpage>
<permissions>
<copyright-statement>Copyright 2021, The copyright of this article remains with the author</copyright-statement>
<copyright-year>2021</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See <uri xlink:href="http://creativecommons.org/licenses/by/4.0/">http://creativecommons.org/licenses/by/4.0/</uri>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://www.liberquarterly.eu/article/10.53377/lq.10940"/>
<abstract>
<p>In the field of social sciences and particularly in economics, studies have frequently reported a lack of reproducibility of published research. Most often, this is due to the unavailability of data reproducing the findings of a study. However, over the past years, debates on open science practices and reproducible research have become stronger and louder among research funders, learned societies, and research organisations. Many of these have started to implement data policies to overcome these shortcomings. Against this background, the article asks if there have been changes in the way economics journals handle data and other materials that are crucial to reproduce the findings of empirical articles. For this purpose, all journals listed in the Clarivate Analytics Journal Citation Reports edition for economics have been evaluated for policies on the disclosure of research data. The article describes the characteristics of these data policies and explicates their requirements. Moreover, it compares the current findings with the situation some years ago. The results show significant changes in the way journals handle data in the publication process. Research libraries can use the findings of this study for their advisory activities to best support researchers in submitting and providing data as required by journals.</p>
</abstract>
<kwd-group>
<kwd>data policies</kwd>
<kwd>journals</kwd>
<kwd>economics</kwd>
<kwd>open science</kwd>
<kwd>reproducibility</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1. Introduction</title>
<p>Journals are at the forefront of the scientific ecosystem. Through their peer-review processes, they play an important role in ensuring scientific quality and integrity. Against the background of an ever-growing number of publications<xref ref-type="fn" rid="fn1">1</xref> (<xref ref-type="bibr" rid="r13">Johnson et al., 2018</xref>) and public debates on &#x2018;fake journals&#x2019; and &#x2018;predatory publishing&#x2019; (cf. <xref ref-type="bibr" rid="r11">Hern &#x0026; Duncan, 2018</xref>), quality assurance mechanisms such as peer review should ensure that only well-founded research is published in the pages of scholarly journals. Research libraries have also developed a variety of services to help researchers find appropriate journals in which to publish their findings. Peer review and adherence to good scientific practice are key requirements for selecting a suitable journal.</p>
<p>Nevertheless, there are serious reservations about the role of peer-review procedures when it comes to the inclusion of data and calculations in the publication process. Specifically, discussions about useful editorial procedures for dealing with empirical or other data-driven contributions have been raised frequently over the last decades. Since the seminal paper of <xref ref-type="bibr" rid="r7">Dewald et al. (1986)</xref>, in which the authors systematically attempted to reproduce the results of published papers, the editorial procedures of journals in dealing with data have been under fire. The study of Dewald et al. suggested that errors in published empirical articles are &#x201C;a commonplace rather than a rare occurrence&#x201D; (<xref ref-type="bibr" rid="r7">Dewald et al., 1986</xref>, p. 587f). These findings have been widely regarded as a serious issue. However, twenty years later, the US economist McCullough still noted: &#x201C;Results published in economic journals are accepted at face value and rarely subjected to the independent verification that is the cornerstone of the scientific method. Most results published in economics journals cannot be subjected to verification, even in principle, because authors typically are not required to make their data and code available for verification&#x201D; (<xref ref-type="bibr" rid="r18">McCullough et al., 2006</xref>, p. 1093f).</p>
<p>To assess the results of empirical publications, reviewers and would-be replicators need the data and additional information on the methodology used. The individual steps of the analysis and the instructions given in economic experiments are also crucial for assessing the robustness of research findings. Reproducibility of research is a key pillar of the scientific method. Without the ability to check the findings of applied economic research for robustness or potential methodological errors, publications do not meet a basic requirement for scientific discoveries.</p>
<p>In the light of intensified debates on open science, this paper provides an updated analysis of the editorial policies of economics journals with respect to data. The paper asks how journals in economics today deal with the question of including data and calculations in the publication process. Specifically, the paper asks if there are editorial policies that request these files.</p>
<p>The paper addresses three dimensions: First, the article asks how many journals listed in Clarivate Analytics&#x2019; Journal Citation Reports 2017 edition for economics (in the following abbreviated to JCR ECON 2017) have a data policy and classifies the different types of policies found in the sample. Second, the characteristics of these data policies are examined, including an analysis of the materials requested from authors, the share of mandatory and voluntary data policies, how proprietary and confidential data are handled, and a description of journals&#x2019; recommendations for publishing research data and replication files. In addition, the paper shows how the requirements of the data policies differ from publishing house to publishing house and characterises these differences. Third, the article compares the findings of this study with a similar survey published in 2015. The goal is to determine potential changes since then in the number of journals with a data policy and in their demands.</p>
<p>The paper starts with an outline of previous studies&#x2019; findings on journal data policies in economics and describes the current state of research. From there, it discusses the reasons why journals hesitate to implement data policies and why authors are reluctant to comply with them. Following this theoretical classification, the paper clarifies the methodology of the study and the data collection, which is based on a content analysis. It then presents the outcome of the survey and the specifications of journal data policies before comparing them with the findings of a survey published in 2015. Finally, it summarises and discusses the outcome of the study with respect to the daily work of research libraries.</p>
</sec>
<sec id="s2">
<title>2. Literature Review</title>
<p>Studies have dealt with questions of reproducible research, and particularly with the data policies and data archives of economics journals, since the late 1980s. The publication of <xref ref-type="bibr" rid="r7">Dewald et al. (1986)</xref> set the starting point for the ongoing debate. Their paper presented the findings of a project in which the authors collected programs and data from authors of the Journal of Money, Credit and Banking (JMCB). For the project, the JMCB adopted a data policy that bound authors to make their programs and data available on request. In one of the author&#x2019;s previous papers, this type of policy is labelled an &#x2018;author responsibility policy&#x2019; (abbreviated to ARP), as it leaves the responsibility for providing replication files to other researchers with the original authors (<xref ref-type="bibr" rid="r23">Vlaeminck &#x0026; Herrmann, 2015a</xref>). In their study, Dewald et al. requested the replication files of published papers from the authors of the JMCB. They reported a response rate of 67% (with an average response time of 217 days) for authors whose papers had already been published. Of these respondents, 48% could not (or did not want to) provide their data and program code (e.g. the computer programs to run an experiment and to analyse the data). <xref ref-type="bibr" rid="r17">McCullough and Vinod (2003)</xref> experienced similar problems when trying to replicate the findings of a full issue of the American Economic Review (AER), which also had an ARP at that time: half of the authors did not honour the journal&#x2019;s data policy. More recent studies also report ongoing problems with data policies that rely on authors&#x2019; support: <xref ref-type="bibr" rid="r20">Savage and Vickers (2009)</xref> found that only one in ten researchers contacted provided their data. The studies of <xref ref-type="bibr" rid="r14">Krawczyk and Reuben (2012)</xref> and <xref ref-type="bibr" rid="r21">Stodden et al. (2018)</xref> reached results comparable to Dewald et al.&#x2019;s study from 1986.</p>
<p>These studies demonstrate a low response rate among researchers to requests for data and program code. ARPs like these do not work because of a lack of incentives to share data with others. <xref ref-type="bibr" rid="r9">Feigenbaum and Levy (1993)</xref> have outlined theoretically why there are very limited incentives for authors to comply with such policies. <xref ref-type="bibr" rid="r8">Duvendack et al. (2015)</xref> argue that the costs of compiling data and code into an easily reusable form are high, while authors receive little or no credit for this time-consuming work. This time could be spent more &#x201C;productively&#x201D; on other tasks that offer greater benefits in terms of one&#x2019;s academic career. In addition, data sharing involves the risk of damaging one&#x2019;s own scientific reputation should the data or program code contain errors.</p>
<p>These arguments illustrate why researchers are reluctant to share their data and why parts of the research community delay or even prevent a replication of their research (cf. <xref ref-type="bibr" rid="r14">Krawczyk &#x0026; Reuben, 2012</xref>; <xref ref-type="bibr" rid="r20">Savage &#x0026; Vickers, 2009</xref>).</p>
<p>For journals, too, the incentives to implement data policies are ambivalent. Editors might fear that a data policy encourages authors to look for alternative journals without one. Should the data contain errors, editors might also fear that sharing data and program code can have reputational consequences for journals similar to those for authors. Journals can guard against this by including data and program code in the review process. This, however, imposes high costs on editors and reviewers: additional communication and workflows need to be set up, and for reviewers, checking the data may be more time-consuming and challenging than reviewing the paper itself. Consequently, implementing a data policy can also pose serious obstacles for journals.</p>
<p>Nevertheless, in reaction to the reported failings of ARPs, journals have slowly begun to implement data availability policies (DAP). Among the first journals to implement a mandatory DAP was the American Economic Review (AER) (<xref ref-type="bibr" rid="r3">Bernanke, 2004</xref>). The crucial difference compared to ARPs is that authors have to submit their data and program code prior to publication of an article to the editorial office or, nowadays, to a recognised data repository. Accountability for publishing or providing data to would-be replicators is thus shifted from authors to editorial offices or third parties such as data repositories. This policy change is a big step forward.</p>
<p>Little by little, other (top-)journals followed the lead of the AER (<xref ref-type="bibr" rid="r18">McCullough et al., 2006</xref>). But the number of journals with a solid data policy remained marginal compared to the overall number of journals in economics: <xref ref-type="bibr" rid="r16">McCullough (2009)</xref> mentioned 17 journals that had implemented mandatory archives for data and program code since 1993. All of these journals also have data policies informing authors about their requirements.</p>
<p>Since that time, several studies have analysed journal data policies and how their demands have changed over time. While there are many publications about journal data policies in different fields of research (most often in the sciences), there are just a few analyses for economics journals: <xref ref-type="bibr" rid="r22">Vlaeminck (2013)</xref> analysed a sample of 141 economics journals and found 29 journals (20.6%) with a DAP, while 11 journals (7.8%) had an ARP. Almost 83% of the DAPs were mandatory. The study also includes a discussion of the specific demands of journal data policies. <xref ref-type="bibr" rid="r23">Vlaeminck and Herrmann (2015a)</xref> analysed a sample of 346 journals from economics and management for the availability of data policies. They found 49 journals with a DAP (14.2%); another 22 (6.4%) employed an ARP. 61.2% of those DAPs were mandatory. Again, the requirements of the data policies are described in the paper. <xref ref-type="bibr" rid="r12">H&#x00F6;ffler (2017)</xref> analysed all economics journals listed in the Journal Citation Reports (JCR) for the year 2015. Of the 343 journals in the study, 26 had a mandatory DAP (7.5%), while 110 had voluntary policies (32.1%). Another 15 (4.4%) employed an ARP, and 34 (9.9%) offered both voluntary data deposit and making data available upon request. One of the most striking results of H&#x00F6;ffler&#x2019;s study is that, for the first time since studies began to focus on economics journals, a majority of journals in economics hold a policy on data. A study by <xref ref-type="bibr" rid="r5">Chin and Dong (2019)</xref> generally confirmed H&#x00F6;ffler&#x2019;s findings. They analysed the data policies of 74 journals listed in the Tilburg University Top 100 Worldwide Economics Schools Research Ranking. They found 58 journals with a data availability policy (75.7%), while the remaining 18 journals had no data policy (24.3%). Of those 58 journals, 34 had a mandatory policy (58.6%).</p>
<p><xref ref-type="bibr" rid="r4">Chang and Li (2015)</xref> and <xref ref-type="bibr" rid="r25">Vlaeminck and Podkrajac (2017)</xref> have examined how voluntary and mandatory data policies perform. The authors concluded that mandatory data policies perform better because the probability of finding the data necessary to conduct reproductions is higher than for journals with voluntary data policies.</p>
</sec>
<sec id="s3">
<title>3. Methodology &#x0026; Data</title>
<p>In order to determine the share of economics journals with a data policy and to illustrate the demands of these policies, this paper uses the methodological approach of content analysis. According to Neuendorf, &#x201C;content analysis is a research technique for making replicable and valid inferences from texts (or other meaningful matter) to the contexts of their use&#x201D; (<xref ref-type="bibr" rid="r19">Neuendorf, 2002</xref>, p. 18). Neuendorf splits the framework of a content analysis into nine conceptual steps. The first steps (theory and rationale, conceptualisations, operationalisations, creation of a coding scheme, and sampling) deal with the development of the research instrument. The subsequent steps (training and pilot reliability, coding, final reliability, tabulation, and reporting) deal with the collection and analysis of the data.</p>
<p>By using a content analysis, a structured, systematic coding scheme is applied to selected text passages. As a result, latent and manifest conclusions can be drawn about the frequency of certain topics, concepts or meanings. Central to the method is the selection of the text passages to be analysed, a precise definition of the variables for the analysis, their accurate operationalisation in a codebook, and the careful application of the coding in data acquisition.</p>
<p>While theory and research questions have already been described previously, the construction of the research instrument and the sampling need some explanation. To determine the share of journals with data policies in the most relevant journals in economics, all journals listed in the category &#x2018;economics&#x2019; of the 2017 edition of Clarivate Analytics&#x2019; Journal Citation Reports (<xref ref-type="bibr" rid="r6">Clarivate Analytics, 2018</xref>) &#x2013; in the following abbreviated to JCR ECON 2017 &#x2013; have been examined. In total, the JCR ECON 2017 itemises 353 journals in this category.<xref ref-type="fn" rid="fn2">2</xref> The publishing houses, the impact factor, and the ranking of the journals within the category &#x2018;economics&#x2019; of the JCR ECON 2017 have been added to the data. Subsequently, the journals have been categorised according to their methodological orientation.</p>
<p>In particular, it was of interest whether a journal generally accepts and publishes empirical, applied, or other data-driven contributions (such as experiments, simulations, or other forms of computational economics). To obtain this information, the sections &#x2018;aims and scope&#x2019; and &#x2018;about&#x2019; and the general introductory text of a journal have been checked. In cases of doubt, up to four published issues have been evaluated manually to identify potential empirical papers. Only those journals that publish contributions based on data remained in the sample for further analysis. Journals that do not accept or publish data-driven contributions were excluded, as these journals do not need a data policy.</p>
<p>Subsequently, the webpages of the remaining journals have been examined for instructions that regulate the handling of data and other materials essential to reproduce the findings of an article. Typically, this information is part of the &#x2018;information for authors&#x2019; section. In a few cases, it is found under the rubric &#x2018;duties of author&#x2019; (or similarly named paragraphs). Some journals also have a separate section on their webpage that links to their data policy. For the content analysis, these instructions are regarded as sampling units.</p>
<p>If available, not only the publisher&#x2019;s webpage of a journal but also the website of the editorial office was searched for this kind of information. In the past, the websites of editorial offices have offered better and more precise information than the publishers&#x2019; websites (cf. <xref ref-type="bibr" rid="r22">Vlaeminck, 2013</xref>).<xref ref-type="fn" rid="fn3">3</xref> Specifically, we looked for recommendations to submit specific files and research instruments that are crucial to reproduce the results of a paper.<xref ref-type="fn" rid="fn4">4</xref> These recommendations were treated as the coding unit for the content analysis.</p>
<p>All textual information with respect to handling data in the publication process found on journals&#x2019; websites has been collected in a single document. Policies with an identical wording (or only slight deviations) were grouped to facilitate and accelerate the evaluation. The document served as the base for the content analysis (cf. Appendix B).</p>
<p>The variables needed for the operationalisation in the context of the content analysis have been derived from the literature. <xref ref-type="bibr" rid="r15">McCullough (2007)</xref> and the <xref ref-type="bibr" rid="r1">American Economic Association (2005)</xref> outline these. <xref ref-type="bibr" rid="r10">Glandon (2011)</xref> has shown that the recommendations of the <xref ref-type="bibr" rid="r1">American Economic Association (2005)</xref> work in practice. To maximise comparability, the study only incorporates files and variables that are used by most methodological approaches (e.g. econometrics, simulations, and experiments) in economics.<xref ref-type="fn" rid="fn5">5</xref> Thus, they can be regarded as the &#x2018;lowest common denominator&#x2019; for the methods used in economics research.</p>
<p>The variables include:</p>
<list list-type="bullet">
<list-item><p>the <bold>dataset(s)</bold> used to reach the findings of a paper. Without the data, empirical findings cannot be checked. Therefore, availability of data is essential for reproductions.</p></list-item>
<list-item><p>the <bold>program code</bold> (e.g. of statistical analyses in econometric papers, experiments or simulations). Without having the program code of a statistical analysis, the results of a paper and their robustness cannot be assessed satisfactorily. Discussions about methodological choices or details of computations are not possible.</p></list-item>
<list-item><p><bold>descriptions of the data</bold> (e.g. data dictionary, codebooks, documentations) and/or of the entire <bold>research process</bold> (e.g. in readme-file and/or by including the instructions of experiments). Without descriptions of the data, it often is very difficult to make use of the data. In addition, without proper descriptions of the research process, it often remains unclear which steps have been taken to achieve the results of the paper.</p></list-item>
<list-item><p><bold>Intermediate datasets</bold> may help to understand the course and intermediate stages of the research process.</p></list-item>
</list>
<p>The more of this information a data policy requires, the more robust it is in terms of reproducibility.</p>
<p>Beyond the policies&#x2019; recommendations for data and other materials, the study also explored some other characteristics of the data policies, which serve as additional variables in the content analysis:</p>
<list list-type="bullet">
<list-item><p>the <bold>degree of obligation</bold> of a policy (mandatory or voluntary). As discussed, the literature suggests that a voluntary data deposit does not work well in practice. For this reason, the share of mandatory and voluntary data policies was of interest to this study.</p></list-item>
<list-item><p>the way journals suggest to <bold>provide the data</bold> for replication purposes and the public (e.g. data repository, website).</p></list-item>
<list-item><p>In economics (and specifically in business administration), the use of data from commercial providers is widespread. To increase the reproducibility of research even in cases where restricted (e.g. proprietary or confidential) data was used, journals need a <bold>procedure</bold> that regulates these cases. The goal is to permit reproductions in principle, even if the original data cannot be shared for legal reasons.<xref ref-type="fn" rid="fn6">6</xref> Such a procedure might include providing an identifier of the data used, a contact address from where to obtain the data, information on the availability/access conditions of data, and the program code of the analysis.</p></list-item>
<list-item><p>With a <bold>data statement</bold>, authors can be transparent about the data they used in their article and indicate its availability. Furthermore, they can provide a reason if the data is not accessible to others. For this purpose, many of the major publishers offer templates to choose from. By adding such a data statement, the access conditions of the data become transparent, and researchers interested in replications can quickly assess the availability of the data.</p></list-item>
</list>
<p>Each variable was dichotomously classified as &#x201C;mentioned/not mentioned&#x201D; in the data policy.<xref ref-type="fn" rid="fn7">7</xref> The subsequent coding was done manually by going through all the policies and marking all places in the text where information on the variables was found. The coding scheme underwent two rounds of adaptation after a short pre-test with a limited number of data policies. The final data collection took several weeks. To achieve inter-coder reliability to the greatest possible extent (despite there being only one coder), the coding process was performed twice: after a first run in March 2019, a second pass of the categorisation took place in June 2019. The findings differed slightly, resulting in a third pass in which only the ambiguous cases were double-checked. Afterwards, the evaluation started.</p>
<p>In order to compare the results of this study with the findings of a previously published survey (see section 5), a dataset compiled by <xref ref-type="bibr" rid="r24">Vlaeminck and Herrmann (2015b)</xref> was used. This dataset contains information on journal data policies and their specifications for a sample of 346 journals. 262 of these journals also had an impact factor, and almost all were listed in the Social Sciences Citation Index (SSCI).<xref ref-type="fn" rid="fn8">8</xref> These journals serve as a useful comparison group for the findings of this study.</p>
</sec>
<sec id="s4">
<title>4. Findings of the Study</title>
<p>Based on the approach described above, this work found that 327 out of 353 journals (92.6%) of the JCR ECON 2017 generally accept or at least sporadically publish empirical and other data-driven contributions. The percentage of these &#x2018;empirically oriented&#x2019; journals is probably higher than the average across all journals in economics. But as the most prestigious journals often prefer to publish new or &#x2018;innovative&#x2019; results, this high percentage comes as no surprise.</p>
<p>Of these 327 &#x2018;empirically oriented&#x2019; journals, 223 have a data policy (68.2%). These shares and numbers refer to all types of policies (DAP, ARP and a combination of the two policy types).</p>
<sec id="s4a">
<title>4.1. Types of Journal Data Policies</title>
<p>The first outcome identified by applying the content analysis was the types of data policies used by the journals. The main differentiation was made between data availability policies (DAP) and author responsibility policies (ARP). As mentioned in section two, the main difference between these two policy types is the accountability for providing data to would-be replicators. While DAPs ask or require authors to submit their replication files (most often prior to publication of an article) to a third party (e.g. the publishing house, the editorial office or a recognised/trusted data repository), ARPs leave the responsibility for providing data to other researchers to the author. In the sample, 185 policies have been classified as a DAP, while 29 policies have been categorised as an ARP (cf. <xref ref-type="fig" rid="fg001">Figure 1</xref>).<xref ref-type="fn" rid="fn9">9</xref></p>
<fig id="fg001">
<label>Fig. 1:</label>
<caption><p>Data Policies found in the sample of 327 journals that accept or publish empirical or other data-driven contributions.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/LIBER_2021_31_Vlaeminck_fig1.jpg"/></fig>
<p>Nine journals offer both a DAP and an ARP (which means that authors may choose between depositing their files in a data repository or providing them to would-be replicators upon request).</p>
</sec>
<sec id="s4b">
<title>4.2. Characteristics of Journal Data Policies</title>
<p>In order to ensure reproducibility as far as possible, the degree of obligation of a data policy plays an important role. As mentioned previously, many studies report issues with voluntary data deposit, while mandatory data policies perform much better.</p>
<p>The results of the content analysis show that 60 out of 223 journals (27%) have a mandatory data policy (cf. <xref ref-type="fig" rid="fg002">Figure 2</xref>). Subdivided into the different types of data policies, 50 out of 185 journals with a DAP (27%) hold a mandatory policy, while the corresponding number for journals with an ARP is 10 out of 29 (34%). For journals offering both an ARP and a DAP, the respective number is zero out of nine (0%).</p>
<fig id="fg002">
<label>Fig. 2:</label>
<caption><p>Degree of obligation of the different types of data policies.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/LIBER_2021_31_Vlaeminck_fig2.jpg"/></fig>
<p>Another important feature of journal data policies is whether they offer a procedure for research based on restricted data. The use of restricted data is widespread in economics. Such data may be purchased or proprietary, protected by privacy restrictions, or inaccessible for reasons of confidentiality. The analysis shows 38 journals (17%) in total with specific regulations for research based on restricted data. Thirty-seven of these are journals with a DAP; one is a journal with an ARP (cf. <xref ref-type="fig" rid="fg003">Figure 3</xref>).</p>
<fig id="fg003">
<label>Fig. 3:</label>
<caption><p>Journals with procedures for research based on restricted data.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/LIBER_2021_31_Vlaeminck_fig3.jpg"/></fig>
<p>As a further result of the content analysis, the demands of the data policies in relation to specific files and data could be determined. The table below details how often the different files and information have been requested by the data policies in the sample (cf. <xref ref-type="table" rid="tb001">Table 1</xref>).</p>
<table-wrap id="tb001" position="float" orientation="portrait">
<label>Table 1:</label>
<caption><p>Specifications and demands of journal data policies</p></caption>
<table>
<thead>
<tr>
<th align="left" valign="top">Specification</th>
<th align="left" valign="top">DAP (n &#x003D; 185)</th>
<th align="left" valign="top">ARP (n &#x003D; 29)</th>
<th align="left" valign="top">DAP &#x0026; ARP (n &#x003D; 9)</th>
<th align="left" valign="top">Total (n &#x003D; 223)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Data/sets</td>
<td align="left" valign="top">185 (100%)</td>
<td align="left" valign="top">29 (100%)</td>
<td align="left" valign="top">9 (100%)</td>
<td align="left" valign="top">223 (100%)</td>
</tr>
<tr>
<td align="left" valign="top">Program Code</td>
<td align="left" valign="top">122 (65.9%)</td>
<td align="left" valign="top">26 (89.7%)</td>
<td align="left" valign="top">2 (22.2%)</td>
<td align="left" valign="top">150 (67.3%)</td>
</tr>
<tr>
<td align="left" valign="top">Descriptions/documentations</td>
<td align="left" valign="top">34 (18.4%)</td>
<td align="left" valign="top">22 (75.9%)</td>
<td align="left" valign="top">1 (11.1%)</td>
<td align="left" valign="top">57 (25.6%)</td>
</tr>
<tr>
<td align="left" valign="top">Intermediate datasets</td>
<td align="left" valign="top">17 (9.2%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">17 (7.6%)</td>
</tr>
<tr>
<td align="left" valign="top">Data statement</td>
<td align="left" valign="top">129 (69.7%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">9 (100%)</td>
<td align="left" valign="top">138 (61.9%)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>All policies ask for data. This comes as no surprise, as the term &#x201C;data&#x201D; was the selection criterion for identifying a data policy. Other findings are more surprising: Looking only at the results for the DAPs, the first thing that stands out is the very small number of policies that demand descriptions or documentation. Without proper documentation, it is often difficult to make use of the data. In addition, a comparatively low share of the policies, about two thirds, asks for the program code. Given that the program code is essential for reproducing the published results, this proportion is not satisfactory. Authors of 17 journals are also invited to submit intermediate datasets. All of these journals have adopted the robust data policy of the American Economic Association or use a modified version of it.</p>
<p>Data statements represent a new and already widespread requirement of journal data policies. Although they create transparency with regard to the availability of the data used, strictly speaking they do not help to reproduce the results. Almost 70% of the journals with a DAP ask their authors to submit a data statement.</p>
<p>Comparing the specifications of the DAPs with those of the ARPs, several differences can be observed: While all policies ask for data, the demands for program code and descriptions are on average higher for ARPs than for DAPs. However, not a single journal with an ARP asks for data statements or intermediate datasets. The nine journals that offer both policy types form the group with the weakest policies in the sample: while all of them ask for data and data statements, the other information that is crucial to reproduce the results of a paper is rarely requested.</p>
<p>In summary, many journals lack strong or detailed data policies. Often, the data policies do not mention fundamental requirements for ensuring reproducibility (e.g., documentation and descriptions or, to a lesser degree, program code). Procedures that would help in reproducing the findings of papers based on restricted data are also not widespread. Submitting intermediate datasets is certainly a useful recommendation to ensure good scientific practice, but it is also a demanding requirement.</p>
<p>In order to make data available, most of the data policies suggest storing the data in a recognised repository (65%). All Elsevier journals offer to deposit the data in the publisher&#x2019;s in-house product Mendeley Data (28.7% of the sample). For 22% of the journals, putting the files online on a personal or institutional website is an acceptable solution (cf. <xref ref-type="table" rid="tb002">Table 2</xref>).</p>
<table-wrap id="tb002" position="float" orientation="portrait">
<label>Table 2:</label>
<caption><p>Recommendations for depositing and disclosure of research data and replication files made by journals (multiple answers possible)</p></caption>
<table>
<thead>
<tr>
<th align="left" valign="top">Specification</th>
<th align="left" valign="top">DAP (n &#x003D; 185)</th>
<th align="left" valign="top">ARP (n &#x003D; 29)</th>
<th align="left" valign="top">DAP &#x0026; ARP (n &#x003D; 9)</th>
<th align="left" valign="top">Total (n &#x003D; 223)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Recognised data repository</td>
<td align="left" valign="top">136 (73.5%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">9 (100%)</td>
<td align="left" valign="top">145 (65%)</td>
</tr>
<tr>
<td align="left" valign="top">Dataverse</td>
<td align="left" valign="top">4 (2.2%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">4 (1.8%)</td>
</tr>
<tr>
<td align="left" valign="top">Website </td>
<td align="left" valign="top">47 (25.4%)</td>
<td align="left" valign="top">1 (3.5%)</td>
<td align="left" valign="top">1 (11.1%)</td>
<td align="left" valign="top">49 (22%)</td>
</tr>
<tr>
<td align="left" valign="top">Mendeley data</td>
<td align="left" valign="top">63 (34.1%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">1 (11.1%)</td>
<td align="left" valign="top">64 (28.7%)</td>
</tr>
<tr>
<td align="left" valign="top">On request </td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">26 (89.7%)</td>
<td align="left" valign="top">9 (100%)</td>
<td align="left" valign="top">35 (15.7%)</td>
</tr>
<tr>
<td align="left" valign="top">Not stated</td>
<td align="left" valign="top">6 (3.2%)</td>
<td align="left" valign="top">1 (3.5%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">7 (3.1%)</td>
</tr>
<tr>
<td align="left" valign="top">Other</td>
<td align="left" valign="top">7 (3.8%)</td>
<td align="left" valign="top">1 (3.5%)</td>
<td align="left" valign="top">0 (0%)</td>
<td align="left" valign="top">8 (3.6%)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>With regard to data deposit, too, things have changed over recent years. In the past, journals typically attached the data to the article on the publisher&#x2019;s website as the default (cf. <xref ref-type="bibr" rid="r22">Vlaeminck, 2013</xref>). This was not a useful practice, as the data often ended up behind paywalls. A few journals offered their own data archives or recommended data repositories like Dataverse. Now some of the big publishing houses maintain lists of recommended data repositories on their webpages, most often broken down by scientific domain.</p>
</sec>
<sec id="s4c">
<title>4.3. Journal Data Policies and Publishers</title>
<p>A breakdown of the most important publishing houses of the 353 journals listed in the 2017 JCR ECON shows that Elsevier (70; 19.8%), Wiley (67; 19%), Springer (48; 13.6%), Taylor &#x0026; Francis (38; 10.8%) and Oxford University Press (24; 7.6%) are the publishers with the highest number of journals in the prestigious JCR ranking. Altogether, these five publishers account for more than 70% of all JCR ECON journals in the year 2017, which clearly shows their dominance and market power among the top journals in the field.</p>
<p><xref ref-type="fig" rid="fg004">Figure 4</xref> details the absolute number of the 327 journals that accept data-based contributions. The figure is grouped along the publishing houses and the data policies their journals employ.</p>
<fig id="fg004">
<label>Fig. 4:</label>
<caption><p>Journals with empirical contributions sorted by publishing houses and data policy.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/LIBER_2021_31_Vlaeminck_fig4.jpg"/></fig>
<p>At first glance, we observe very high rates of journals with data availability policies at Elsevier and Taylor &#x0026; Francis, while journals with an ARP are most often published by the SpringerNature group. SpringerNature also has the highest number of journals that offer both policy types.</p>
<p>As we worked through the data policies, we noticed that the guidelines of journals from the same publisher are often very alike: they are structured very similarly, and in many cases even their wording is identical.</p>
<p>Some publishers (e.g. SpringerNature) use almost the same wording for their voluntary and their mandatory data policy. Sometimes only a few words have been changed (for instance, authors are not <italic>encouraged</italic> but <italic>required</italic> to follow the data policy).</p>
<p>Most major publishers appear to have developed such a standard data policy. However, the policies differ quite substantially between publishing houses.</p>
<p>In the past (cf. <xref ref-type="bibr" rid="r22">Vlaeminck, 2013</xref>; <xref ref-type="bibr" rid="r23">Vlaeminck &#x0026; Herrmann, 2015a</xref>), most data sharing policies were individual. One exception was the American Economic Review&#x2019;s data policy, which had become a quasi-standard at that time (although only a handful of journals outside the American Economic Association used it).</p>
<p>Below, a more detailed summary for the five biggest publishers in the JCR ECON 2017 is given, describing some major differences in how the journals of these publishers structure their data policies (for details cf. appendix B and the output file of the statistical analysis).</p>
<sec id="s4c1">
<title>4.3.1. Elsevier</title>
<p>All but one of Elsevier&#x2019;s journals that accept data-based contributions have a research data policy (98.6%). Almost all of these can be characterised as a DAP (98.5%); one journal has an ARP. At the time of evaluation, more than 60 of the publisher&#x2019;s journals in the JCR ECON used more or less the same data policy, which appears to be the &#x2018;standard policy&#x2019; for most of its journals. It consists of five paragraphs in the guide for authors that relate to the handling and disclosure of research data. All of these paragraphs appear to be generic and not specific to the situation in economics. In most cases, the data policy is voluntary, but nine journals (13.2%) have a mandatory data policy. Nevertheless, the policy mentions the most important information for ensuring reproducibility of published research (software, code, models, algorithms, protocols, methods and &#x2018;other useful materials&#x2019;). The policies rarely mention data descriptions (2.9%). Data statements are requested by 67 data policies (97.1%); Elsevier provides templates for these, from which authors can choose the one that best fits their data and intentions. A defined procedure for cases where authors used restricted data for their scientific findings is available in only four cases (5.9%).</p>
<p>In terms of data deposit, all but two journals (97.1%) suggest storing the data in a data repository. All but four (94%) also recommend Mendeley Data, an in-house product. Simply attaching the data to the article on the journals&#x2019; websites is accepted by only seven journals (10.3%).</p>
</sec>
<sec id="s4c2">
<title>4.3.2. SpringerNature</title>
<p>By default, SpringerNature includes a standardised data policy (ARP) in all of its journals. It is located in the &#x2018;ethical responsibilities of authors&#x2019; section, which might not be read by all authors. The policy states: &#x201C;<italic>Upon request authors should be prepared to send relevant documentation or data in order to verify the validity of the results.</italic>&#x201D; Beyond this sentence, 26 out of 43 journals (60.5%) have an additional DAP. Eight of these 26 also offer an ARP.</p>
<p>Less than half of SpringerNature&#x2019;s data policies mention data documentation (44.2%) or disclosure of program code (48.8%). Most of the policies are voluntary: only three journals (7%) have a mandatory policy. Very few journals have an individual data policy. Among these journals, there is just one (2.3%) with a defined procedure in the event researchers use restricted data. Twenty-four journals (55.8%) recommend depositing research data in a recognised repository.</p>
</sec>
<sec id="s4c3">
<title>4.3.3. Wiley</title>
<p>Journals published by Wiley are more reluctant to implement research data policies. Thirty-six out of 63 journals (57.1%) possess such a policy, the lowest share among the big publishing houses in the sample. Among these, 34 (94.4%) have a DAP. In contrast to journals published by Elsevier or SpringerNature, many of Wiley&#x2019;s journals still have an individual data policy, and frequently these policies are very detailed. Fourteen of the policies (38.9%) are mandatory. The same number of journals has a procedure in the event authors use restricted data, the highest percentage among major publishing houses. 58.3% of the policies recommend depositing the data in a data repository, and two-thirds (66.7%) ask for data statements.</p>
<p>Wiley also has a joint data policy for some of its journals. It consists of a short paragraph of two sentences in which the publisher asks authors to deposit research data in a public repository and to provide a data accessibility statement.</p>
</sec>
<sec id="s4c4">
<title>4.3.4. Taylor &#x0026; Francis</title>
<p>Thirty-two journals published by Taylor &#x0026; Francis (88.9%) have a data policy. Almost all of these policies are identical: the standard policy consists of three paragraphs, the core of which is the &#x201C;Basic Data Sharing Policy&#x201D; (Taylor &#x0026; Francis also offers stricter data sharing policies that are used primarily by journals in the sciences). All of these policies are DAPs, but not a single policy is binding. In addition, there is not a single policy with a defined procedure for research based on restricted data. Taylor &#x0026; Francis&#x2019; standard policy is also weak in other respects: recommendations to submit program code (6.3%) or data documentation (3.1%) are below average (equivalent to two and one journals, respectively). In contrast, all data policies require data statements (100%) and almost all policies (96.9%) recommend depositing data in a repository.</p>
</sec>
<sec id="s4c5">
<title>4.3.5. Oxford University Press</title>
<p>The data policies of journals published by Oxford University Press remain highly individual. An overarching standard data policy of the publisher does not appear to exist to date.</p>
<p>Only 16 (66.7%) of its journals have a data policy; fourteen of these (87.5%) have a DAP. Twelve of the data policies are mandatory (75%), the highest share of all publishers in the sample. In addition, all data policies require the program code of calculations (100%) and six (37.5%) require data documentation. The same number of journals have a policy on the use of restricted data. Surprisingly, not a single journal recommends depositing data in a repository. Most guidelines (81.3%) refer to publishing the data on the publisher&#x2019;s website along with the article.</p>
</sec>
</sec>
</sec>
<sec id="s5">
<title>5. A Comparison with the Situation in 2014</title>
<p>In order to classify the findings of this study, a comparison with previously published results was made. For this purpose, this work reused a dataset compiled by <xref ref-type="bibr" rid="r24">Vlaeminck and Herrmann (2015b)</xref>. The dataset contains information on data policies of economics journals and their specifications for a sample of 346 journals. Of these journals, 262 also had an Impact Factor and almost all of them appeared in the Social Sciences Citation Index (SSCI). The Journal Citation Reports (JCR) for the specific disciplines can be seen as (non-exclusive) sub collections of the SSCI (e.g. JCR Economics, Business, Management, etc.). The 262 journals therefore serve as a useful comparison group for an in-depth analysis of the types of data policies in use. In addition, it helps to determine potential differences in the specifications of these policies.<xref ref-type="fn" rid="fn10">10</xref></p>
<p>The comparison suggests fundamental changes in the way journals deal with data (cf. <xref ref-type="fig" rid="fg005">Figure 5</xref>): While in 2014 only 47 journals (17.9% of 262) held a DAP, the corresponding number in 2019 was 185 (56.6% of 327). This growth suggests a paradigm shift in the academic publishing sector. For the other policy types, the changes have not been substantial, as the growth in journals with ARPs and with a combination of DAP and ARP is much smaller.</p>
<fig id="fg005">
<label>Fig. 5:</label>
<caption><p>Data policies of economics journals in 2014 and 2019 compared.</p></caption>
<graphic xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="figures/LIBER_2021_31_Vlaeminck_fig5.jpg"/></fig>
<p>However, the sheer number of data policies in economics journals does not necessarily say much about the policies&#x2019; quality. A useful data policy should request data, program code, and descriptions, as these elements are crucial for reproducing the results of an empirical article.</p>
<p>When examining the specifications and demands of the data availability policies, significant changes can be observed: While in 2014 83% of the journals asked authors to provide the program code of an analysis, this share diminished to 66% in the recent study. The request to post descriptions of the data and the research process was made by 74.5% of the data policies in 2014, compared to only 18.4% of the policies in 2019.<xref ref-type="fn" rid="fn11">11</xref></p>
<p>The policies&#x2019; degree of obligation also varies considerably: While the overall number of journals with mandatory data availability policies has grown, their share has dropped from 63.8% to only 27%. In 2014, 51.1% of the journals had a procedure that specifies which data and information authors have to provide in the event they used restricted data; this percentage plummeted to 20% in 2019 (cf. <xref ref-type="table" rid="tb003">Table 3</xref>).</p>
<table-wrap id="tb003" position="float" orientation="portrait">
<label>Table 3:</label>
<caption><p>Specifications and demands of data availability policies in 2014 and 2019 in comparison</p></caption>
<table>
<thead>
<tr>
<th align="left" valign="top">Specification</th>
<th align="left" valign="top">DAPs 2019 Sample: JCRECON2017 (n &#x003D; 185 of 223)</th>
<th align="left" valign="top">DAPs 2014 Sample: SSCI2013 (n &#x003D; 47 of 262) </th>
<th align="left" valign="top">Change (in percentage points) between 2014 and 2019</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Dataset(s)</td>
<td align="left" valign="top">185 (100%)</td>
<td align="left" valign="top">47 (100%)</td>
<td align="left" valign="top">0%</td>
</tr>
<tr>
<td align="left" valign="top">Program Code </td>
<td align="left" valign="top">122 (65.9%)</td>
<td align="left" valign="top">39 (83%)</td>
<td align="left" valign="top">&#x2212;17.1%</td>
</tr>
<tr>
<td align="left" valign="top">Description/documentation </td>
<td align="left" valign="top">34 (18.4%)</td>
<td align="left" valign="top">35 (74.5%)</td>
<td align="left" valign="top">&#x2212;56.1%</td>
</tr>
<tr>
<td align="left" valign="top">Intermediate datasets</td>
<td align="left" valign="top">17 (9.2%)</td>
<td align="left" valign="top">14 (29.8%)</td>
<td align="left" valign="top">&#x2212;20.6%</td>
</tr>
<tr>
<td align="left" valign="top">Mandatory data policies</td>
<td align="left" valign="top">50 (27%)</td>
<td align="left" valign="top">30 (63.8%)</td>
<td align="left" valign="top">&#x2212;36.8%</td>
</tr>
<tr>
<td align="left" valign="top">Procedure for restricted data</td>
<td align="left" valign="top">37 (20%)</td>
<td align="left" valign="top">24 (51.1%<sup>a</sup>)</td>
<td align="left" valign="top">&#x2212;31.1%</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><sup>a</sup>Three journals discouraged the submission of papers that rely on restricted data. For this reason, these journals have not been included in the count.</p>
</table-wrap-foot>
</table-wrap>
<p>In summary, these numbers suggest two trends: On the one hand, we observe a massive increase in journal data policies among the most prestigious periodicals in economics. The rate of increase for journals with DAPs is particularly striking: in 2019, the absolute number of journals with such a data policy was almost four times as high as five years earlier. The number of journals with an ARP or a combination of the two policy types has also grown, but to a much lesser degree.</p>
<p>On the other hand, the average quality of these policies has not improved. In absolute numbers, more journals ask for the program code, and more of these policies are obligatory and have a procedure for the use of restricted data. On average, however, the share of journals asking for program code, descriptions or intermediate datasets has diminished considerably over time, and the share of mandatory data policies and of policies that offer a specific procedure for research based on restricted data has fallen rapidly.</p>
</sec>
<sec id="s6">
<title>6. Summary and Discussion</title>
<p>The aim of this paper was to answer the question of how economics journals today deal with the inclusion of underlying data and analysis of an article in the peer review and publication process. To this end, we analysed how many of the journals listed in the 2017 edition of JCR ECON have specific rules (data policies) that address the use of data in the peer review or publication processes.</p>
<p>The study found that of the 353 journals listed in the 2017 edition of JCR ECON, 327 journals (92.6%) publish empirical or data-driven research articles at least sporadically. Of these 327 journals, 223 (68.2%) have a data policy.</p>
<p>The types of data policies found can be categorised into 185 journals (56.6%) with a data availability policy (DAP), 29 journals (8.9%) with an author responsibility policy (ARP) and nine journals (2.8%) that offer both a DAP and an ARP. The remaining 104 journals (31.8%) do not have a data policy.</p>
<p>In light of the findings of previous studies on data policies of journals in economics, the results indicate a clear trend towards the adoption of data policies. In recent years, there seems to have been a veritable paradigm shift among journals and publishers when it comes to data policies. Moreover, it is encouraging that the overwhelming majority of journals have implemented DAPs rather than ARPs. As described in section 2, ARPs do not work well in practice.</p>
<p>The paper also illustrated some characteristics of the data policies found in the sample. As one important characteristic, the study determined the share of mandatory and voluntary data policies. The content analysis found that only a minority of the journals have mandatory policies: 50 (27%) of the DAPs are mandatory and 10 (34.5%) of the ARPs, but not a single journal that offers both an ARP and a DAP. These numbers are comparatively low, especially considering how strictly journals enforce other parts of their editorial policies (e.g. the use of style sheets) that are less important, from a scientific point of view, than the reproducibility of an article&#x2019;s results.</p>
<p>Regarding whether the guidelines include a process for research that relies on proprietary or confidential data, the study found 37 journals with a DAP (20%) that have such specific requirements (the corresponding number for journals with an ARP is one (3.4%), and zero for journals that support both types of policies). Journals without specific rules for such articles often grant exemptions from their data policies. However, exempting these articles from basic scientific quality criteria does not seem to me to be a useful approach. Since economists often work with such proprietary data, the low percentage of policies with specific rules in this area is not satisfactory.</p>
<p>With respect to the files and information mentioned in the data policies, all 185 DAPs mention datasets (100%). Two-thirds (66%) also ask for program code (e.g., of statistical/econometric analyses). Documentation of the data or research process is mentioned by less than one-fifth (18.4%), and intermediate datasets in only 9.2% of the guidelines. Some journals also have specific requirements for experiments; in these cases, authors have to submit additional information (e.g. instructions or information about subject eligibility).</p>
<p>In comparison, journals with an ARP more frequently ask for program code (89.7%) and descriptions of the data and/or research process (75.9%). All journals offering both types of guidelines ask for datasets (100%), but only two (22.2%) mention program code, and only one (11.1%) asks for descriptions or documentation of the data and/or research process.</p>
<p>A relatively new development is that journals have started to request data statements. These statements are demanded by 69.7% of journals with a DAP, by none of the journals with an ARP, and by all journals offering both policy types. Strictly speaking, a data statement does not help in reproducing published results; however, it clarifies the accessibility of the data used by researchers.</p>
<p>With respect to data deposit, it is very positive that journals have begun to recommend trusted data repositories for storing replication data. Some years ago, journals often hosted the data next to the research article or on a separate webpage. This practice frequently resulted in issues with data accessibility (for instance after changes to the webpages and URLs) or in data being locked behind paywalls.</p>
<p>The study also contrasted the data policies of the journals of the five largest publishers within JCR ECON. In contrast to the situation a few years ago, most publishers now have a standard data policy that is used by most of their journals. These standard data policies are not similar to each other; frequently, they are not particularly detailed and often lack the precision necessary to ensure reproducibility. The only commonality among all publishers&#x2019; standard data policies is that they are generic, whereas most individual journal data policies are subject-specific and thus often more robust in terms of reproducibility. Elsevier journals offer the most detailed standard data policy: their guidelines name all files and materials that are crucial for reproducing the results of a paper.</p>
<p>The significance of this study becomes particularly evident when comparing the results with previous studies. With respect to the number of data policies in economics journals, the last few years have brought a fundamental change: This study reports a 14% higher share of journals with data policies compared to the figures mentioned by <xref ref-type="bibr" rid="r12">H&#x00F6;ffler (2017)</xref> for more or less the same set of journals. Compared to the situation in 2014, the increase is massive: While <xref ref-type="bibr" rid="r23">Vlaeminck and Herrmann (2015a)</xref> found a total of 71 journals with a data policy in a sample of 346 economics journals, this study found 223 journals in a sample of 327. The number of journals with DAPs in particular has almost quadrupled.</p>
<p>While the study shows a massive increase in newly established data policies, the numbers also indicate that not all policies can be considered robust. The code of computation, for instance, is crucial for understanding the research process, the data cleaning, and the assumptions and decisions made during the analysis; it is requested by too few journals. The lack of documentation is also a serious concern for the reproducibility of published findings.</p>
<p>One reason why many of these data policies are still relatively weak could be that publishers initially adopt policies that are easy to comply with. At a later stage, after authors have become accustomed to these policies, journals and publishers might tighten the requirements towards stricter and/or domain-specific rules.</p>
<p>The massive increase in journal data policies may also be rooted in the science policy debates of recent years. While good scientific practice and research integrity have always been crucial topics in academia, the debate has become more visible within the last few years. Discussions on open science, but also reports on fraudulent research practices by some researchers, have triggered an intensified debate in academia and the public. The publishing houses and journals seem to be responding to these debates by implementing data policies. It will be interesting to see how journal data policies evolve in the future. A subsequent study on this set of journals might therefore be useful in a few years to determine whether publishers and journals have tightened their data policies. In addition, it might be of interest to investigate to what degree authors comply with these newly introduced data policies and whether more data and other replication files become available.</p>
<p>Research libraries could benefit from the results of this study in several ways: If they have not yet begun to extend author advisory services to include data and data submission, the results of this study suggest that it is time to develop such services as soon as possible. Since the majority of prestigious journals in economics have data policies, research libraries should be prepared to respond to the potential increase in requests from researchers in the social sciences and beyond.</p>
<p>For those involved in advising researchers, the results of this study may also help to quickly assess and compare the requirements of different journals and publishers in this field. Going through the results and materials of this study, consultants quickly get an idea of the most important files and materials requested by economics journals and their publishers. This might be particularly helpful if the advisors are not from the field of social sciences or economics.</p>
<p>For those who provide guidance to researchers, we recommend using the requirements of the <xref ref-type="bibr" rid="r2">American Economic Association (2021)</xref> as a guide. The requirements listed in its data and code availability policy will almost certainly meet the expectations of any journal in the field and offer a good overview of what is needed to ensure reproducibility.</p>
<p>Furthermore, the results indicate that research libraries should consider offering (hands-on) workshops for researchers on how to make their research reproducible, at least in disciplines in which these skills are not taught in undergraduate education. Young researchers in particular are a useful target group for such seminars.</p>
<p>In addition, the results of the study suggest that already established services, such as supporting researchers in finding a trusted repository to deposit their data, might become more important.</p>
</sec>
</body>
<back>
<app-group>
<app id="app1">
<title>7. Appendices</title>
<p>Appendices A and B can be viewed and downloaded in PDF format from the journal article website.</p>
</app>
</app-group>
<ref-list>
<title>References</title>
<ref id="r1"><mixed-citation>American Economic Association. (2005). <italic>Previous data availability policy (2005 - July 10, 2019)</italic>. <ext-link ext-link-type="uri" xlink:href="https://www.aeaweb.org/journals/policies/data-code/archive/2005">https://www.aeaweb.org/journals/policies/data-code/archive/2005</ext-link></mixed-citation></ref>
<ref id="r2"><mixed-citation>American Economic Association. (2021). <italic>Data and code availability policy</italic>. <ext-link ext-link-type="uri" xlink:href="https://www.aeaweb.org/journals/data/data-code-policy">https://www.aeaweb.org/journals/data/data-code-policy</ext-link></mixed-citation></ref>
<ref id="r3"><mixed-citation>Bernanke, B.S. (2004). Editorial statement. <italic>The American Economic Review</italic>, <italic>94</italic>(1), 404. <ext-link ext-link-type="uri" xlink:href="http://www.jstor.org/stable/3592790">http://www.jstor.org/stable/3592790</ext-link></mixed-citation></ref>
<ref id="r4"><mixed-citation>Chang, A. C., &#x0026; Li, P. (2015). Is economics research replicable&#x003F; Sixty published papers from thirteen journals say &#x201C;Usually not&#x201D;. <italic>Finance and Economics Discussion Series</italic>, <italic>2015</italic>(83), 1&#x2013;26. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17016/FEDS.2015.083">https://doi.org/10.17016/FEDS.2015.083</ext-link></mixed-citation></ref>
<ref id="r5"><mixed-citation>Chin, M., &#x0026; Dong, D. (2019, May 6&#x2013;7). <italic>The quest for replicability: A review of research data policies in economics journals</italic> [Conference presentation]. INCONECSS, Berlin, Germany. <ext-link ext-link-type="uri" xlink:href="https://ink.library.smu.edu.sg/library_research/147">https://ink.library.smu.edu.sg/library_research/147</ext-link></mixed-citation></ref>
<ref id="r6"><mixed-citation>Clarivate Analytics. (2018). <italic>2017 Journal Citation Reports Economics</italic>. <ext-link ext-link-type="uri" xlink:href="https://jcr.clarivate.com/jcr/browse-journals">https://jcr.clarivate.com/jcr/browse-journals</ext-link></mixed-citation></ref>
<ref id="r7"><mixed-citation>Dewald, W. G., Thursby, J. G., &#x0026; Anderson, R. G. (1986). Replication in empirical economics: The Journal of Money, Credit and Banking project. <italic>The American Economic Review</italic>, <italic>76</italic>(4), 587&#x2013;603. <ext-link ext-link-type="uri" xlink:href="http://www.jstor.org/stable/1806061">http://www.jstor.org/stable/1806061</ext-link></mixed-citation></ref>
<ref id="r8"><mixed-citation>Duvendack, M., Palmer-Jones, R. W., &#x0026; Reed, W. R. (2015). Replications in economics: A progress report. <italic>Econ Journal Watch: Scholarly Comments on Academic Economics</italic>, <italic>12</italic>(2), 164&#x2013;191. <ext-link ext-link-type="uri" xlink:href="https://econjwatch.org/articles/replications-in-economics-a-progress-report">https://econjwatch.org/articles/replications-in-economics-a-progress-report</ext-link></mixed-citation></ref>
<ref id="r9"><mixed-citation>Feigenbaum, S., &#x0026; Levy, D.M. (1993). The market for (ir)reproducible econometrics. <italic>Accountability in Research</italic>, <italic>3</italic>(1), 25&#x2013;43. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1080/08989629308573828">https://doi.org/10.1080/08989629308573828</ext-link></mixed-citation></ref>
<ref id="r10"><mixed-citation>Glandon, P. (2011). <italic>Report on the American Economic Review data availability compliance project. Appendix to American Economic Review editors report 2011</italic>. Vanderbilt University. <ext-link ext-link-type="uri" xlink:href="https://digital.kenyon.edu/economics_publications/20/">https://digital.kenyon.edu/economics_publications/20/</ext-link></mixed-citation></ref>
<ref id="r11"><mixed-citation>Hern, A., &#x0026; Duncan, P. (2018, August 10). <italic>Predatory publishers: The journals that churn out fake science</italic>. The Guardian. <ext-link ext-link-type="uri" xlink:href="https://www.theguardian.com/technology/2018/aug/10/predatory-publishers-the-journals-who-churn-out-fake-science">https://www.theguardian.com/technology/2018/aug/10/predatory-publishers-the-journals-who-churn-out-fake-science</ext-link></mixed-citation></ref>
<ref id="r12"><mixed-citation>H&#x00F6;ffler, J. H. (2017). Replication and economics journal policies. <italic>American Economic Review: Papers &#x0026; Proceedings</italic>, <italic>107</italic>(5), 52&#x2013;55. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1257/aer.p20171032">https://doi.org/10.1257/aer.p20171032</ext-link></mixed-citation></ref>
<ref id="r13"><mixed-citation>Johnson, R., Watkinson, A., &#x0026; Mabe, M. (2018). <italic>The STM Report. An overview of scientific and scholarly publishing</italic>. International Association of Scientific, Technical and Medical Publishers. <ext-link ext-link-type="uri" xlink:href="https://www.stm-assoc.org/2018_10_04_STM_Report_2018.pdf">https://www.stm-assoc.org/2018_10_04_STM_Report_2018.pdf</ext-link></mixed-citation></ref>
<ref id="r14"><mixed-citation>Krawczyk, M., &#x0026; Reuben, E. (2012). (Un)Available upon request: Field experiment on researchers&#x2019; willingness to share supplementary materials, <italic>Accountability in Research: Policies and Quality Assurance</italic>, <italic>19</italic>(3), 175&#x2013;186. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1080/08989621.2012.678688">https://doi.org/10.1080/08989621.2012.678688</ext-link></mixed-citation></ref>
<ref id="r15"><mixed-citation>McCullough, B. D. (2007). Got replicability&#x003F; The Journal of Money, Credit and Banking archive. <italic>Econ Journal Watch: Scholarly Comments on Academic Economics</italic>, <italic>4</italic>(3), 326&#x2013;337. <ext-link ext-link-type="uri" xlink:href="https://econjwatch.org/articles/got-replicability-the-journal-of-money-credit-and-banking-archive">https://econjwatch.org/articles/got-replicability-the-journal-of-money-credit-and-banking-archive</ext-link></mixed-citation></ref>
<ref id="r16"><mixed-citation>McCullough, B. D. (2009). Open access economics journals and the market for reproducible economic research. <italic>Economic Analysis and Policy</italic>, <italic>39</italic>(1), 117&#x2013;126. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/S0313-5926(09)50047-1">https://doi.org/10.1016/S0313-5926(09)50047-1</ext-link></mixed-citation></ref>
<ref id="r17"><mixed-citation>McCullough, B. D., &#x0026; Vinod, H. D. (2003). Verifying the solution from a nonlinear solver: A case study. <italic>American Economic Review</italic>, <italic>93</italic>(3), 873&#x2013;892. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1257/000282803322157133">https://doi.org/10.1257/000282803322157133</ext-link></mixed-citation></ref>
<ref id="r18"><mixed-citation>McCullough, B. D., McGeary, K. A., &#x0026; Harrison, T. D. (2006). Lessons from the JMCB archive. <italic>Journal of Money, Credit and Banking</italic>, <italic>38</italic>(4), 1093&#x2013;1107. <ext-link ext-link-type="uri" xlink:href="http://www.jstor.org/stable/3838995">http://www.jstor.org/stable/3838995</ext-link></mixed-citation></ref>
<ref id="r19"><mixed-citation>Neuendorf, K. A. (2002). <italic>The content analysis guidebook</italic>. Sage Publications.</mixed-citation></ref>
<ref id="r20"><mixed-citation>Savage, C. J., &#x0026; Vickers, A. J. (2009). Empirical study of data sharing by authors publishing in PLoS Journals. <italic>PLoS One</italic>, <italic>4</italic>(9), Article e7078. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1371/journal.pone.0007078">https://doi.org/10.1371/journal.pone.0007078</ext-link></mixed-citation></ref>
<ref id="r21"><mixed-citation>Stodden, V., Seiler, J., &#x0026; Ma, Z. (2018). An empirical analysis of journal policy effectiveness for computational reproducibility. <italic>Proceedings of the National Academy of Sciences</italic>, <italic>115</italic>(11), 2584&#x2013;2589. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1073/pnas.1708290115">https://doi.org/10.1073/pnas.1708290115</ext-link></mixed-citation></ref>
<ref id="r22"><mixed-citation>Vlaeminck, S. (2013). Data management in scholarly journals and possible roles for libraries &#x2013; Some insights from EDaWaX. <italic>Liber Quarterly</italic>, <italic>23</italic>(1), 48&#x2013;79. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.18352/lq.8082">https://doi.org/10.18352/lq.8082</ext-link></mixed-citation></ref>
<ref id="r23"><mixed-citation>Vlaeminck, S., &#x0026; Herrmann, L.-K. (2015a). Data policies and data archives: A new paradigm for academic publishing in economic sciences&#x003F; In B. Schmidt &#x0026; M. Dobreva (Eds.), <italic>New avenues for electronic publishing in the age of infinite collections and citizen science: Scale, openness and trust</italic> (pp. 145&#x2013;155). IOS Press. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3233/978-1-61499-562-3-145">https://doi.org/10.3233/978-1-61499-562-3-145</ext-link></mixed-citation></ref>
<ref id="r24"><mixed-citation>Vlaeminck, S., &#x0026; Herrmann, L.-K. (2015b). Data policies and data archives: A new paradigm for academic publishing in economic sciences&#x003F; (Replication data; Version 1) [Data set]. ZBW Journal Data Archive. <ext-link ext-link-type="uri" xlink:href="http://journaldata.zbw.eu/dataset/data-policies-and-data-archives-a-new-paradigm-for-academic-publishing-in-economics">http://journaldata.zbw.eu/dataset/data-policies-and-data-archives-a-new-paradigm-for-academic-publishing-in-economics</ext-link></mixed-citation></ref>
<ref id="r25"><mixed-citation>Vlaeminck, S., &#x0026; Podkrajac, F. (2017). Journals in economic sciences: Paying lip service to reproducible research&#x003F; <italic>IASSIST Quarterly</italic>, <italic>41</italic>(1&#x2013;4), 16. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.29173/iq6">https://doi.org/10.29173/iq6</ext-link></mixed-citation></ref>
<ref id="r26"><mixed-citation>Wharton Research Data Services. (2021, August 2). <italic>Terms of use</italic>. University of Pennsylvania. <ext-link ext-link-type="uri" xlink:href="https://wrds-www.wharton.upenn.edu/users/tou/">https://wrds-www.wharton.upenn.edu/users/tou/</ext-link></mixed-citation></ref>
</ref-list>
<fn-group>
<title>Notes</title>
<fn id="fn1"><p>Estimations range up to 33,100 active peer-reviewed English-language journals in 2018, which have collectively published approximately 3 million articles (<xref ref-type="bibr" rid="r13">Johnson et al., 2018</xref>, p. 25).</p></fn>
<fn id="fn2"><p>Other JCR subsets are also of potential interest for the field of economics, e.g. Business (140 journals), Business and Finance (98 journals), Management (210 journals) and Operations Research and Management (84 journals). All journals included in these subsets have an Impact Factor and are part of the SSCI (Social Sciences Citation Index), which is, like the Impact Factor, a proprietary product of Clarivate Analytics.</p>
<p>When looking at how many journals from these rankings are also represented in JCR ECON 2017, the following figures emerge: 13 journals (9.2%) from JCR Business, 40 (39.3%) from JCR Business and Finance, 12 (5.7%) from JCR Management, and another 13 (15.5%) from JCR Operations Research. For this reason, the results of this article relate primarily to journals found in the JCR ECON 2017 and, to a lesser extent, to business and finance journals.</p></fn>
<fn id="fn3"><p>We did not examine the printed issues of the journals, although we are aware of one case where the data policy is not available online, but only in the printed issue. Therefore, the results of the analysis describe the lower limit of the data policies found in our sample. The actual number of journals with a data policy is likely to be higher.</p></fn>
<fn id="fn4"><p>It was not always easy to distinguish between editorial policies that aim to achieve reproducibility of results and editorial notes that only explain what types of files can generally be processed in the submission process. In cases where a journal only mentioned the general possibility of also submitting data sets, without giving further instructions regarding requirements for reproducibility, we did not consider such a note a data policy.</p></fn>
<fn id="fn5"><p>For instance, some experimental designs require other replication files than simulations do. To ensure comparability among the different methodologies in economics, we only looked for those files that are important for most methodological approaches.</p></fn>
<fn id="fn6"><p>Commercial data providers generally do not allow any sharing of research data, even for the purpose of reproducing published results. For example, Wharton Research Data Services (WRDS), a major aggregator providing access to numerous databases from commercial providers, states in its terms of use: &#x201C;Except as provided under the terms of the Subscription Agreement, you may not reproduce, distribute, modify, adapt, create derivative works of, display, transmit, broadcast, sell, license or in any way exploit the Proprietary Material, in whole or in part, without our advance written consent.&#x201D; (<xref ref-type="bibr" rid="r26">Wharton Research Data Services, 2021</xref>).</p></fn>
<fn id="fn7"><p>Details of the operationalisation are available in appendix A.</p></fn>
<fn id="fn8"><p>The SSCI is a set of journals of several Clarivate (formerly Thomson Reuters) Journal Citation Reports (JCR) for the social sciences. It includes more than 3,400 journals and is subdivided into 58 (non-exclusive) disciplinary rankings. The JCR ECON is one of these disciplinary rankings; other subsets of potential interest for economics are listed in note 2.</p></fn>
<fn id="fn9"><p>Please check appendix A for a detailed description of how a data policy has been assigned to one of the types. Some special cases are journals published by the SpringerNature group: As part of the &#x2018;ethical responsibilities of authors&#x2019;, these journals include a short paragraph on &#x2018;data disclosure upon request&#x2019;. In addition, a substantial proportion of these journals has a dedicated data policy. Journals with such an additional data policy have been categorised as journals with a DAP, while those without a dedicated data policy have been categorised as journals with an ARP. The stricter policy type was always decisive for the categorisation.</p></fn>
<fn id="fn10"><p>On average, the impact factor of the 262 journals was 1.98 (SD: 3.36). The 25th percentile was 0.86, the median 1.37, and the 75th percentile 2.4. For a detailed description of the sample, please see <xref ref-type="bibr" rid="r23">Vlaeminck and Herrmann (2015a)</xref>.</p></fn>
<fn id="fn11"><p>One hundred twenty-six of the 223 journals from the 2014 study were also listed in the JCR ECON. The average impact factor of these 126 journals was 1.49 (SD: 1.2). Thirty-three journals had a DAP (26.2%), and eight journals (6.4%) had an ARP. For this reduced subsample, too, the results remain comparable. The numbers for the 126 journals listed in the JCR ECON are even higher than those reported for the sample of all 262 IF-journals. For comparison: datasets (100%), program code (82.9%), descriptions (75.6%), intermediate datasets (34.2%), mandatory data policies (78.1%), procedure for restricted data (51.2%).</p></fn>
</fn-group>
</back>
</article>