So, do we have 2 or 74 percent Open Access availability?

by @jeroenbosman

N.B. This post contains updates between [ ]

Anyone following Open Access developments is confronted almost daily with new policies, mandates, goals, definitions, delimitations and the concomitant quantitative underpinnings. The past two months saw a rush hour frenzy with the announcement of new OA policy from the main Dutch research funder NWO, procedures from HEFCE in the UK with details of their deposit mandate, the release of six reports from Canadian Science Metrix prepared for the European Commission earlier this year, major researcher surveys by Nature and by Taylor&Francis, a business report on the OA market, calculations on the cost of OA by Wouter Gerritsma at the Amsterdam (UvA/HvA) OA symposium and more.

This avalanche is of course no coincidence, as the period around the Open Access week is ideal to get OA stakeholders’ attention. Being a bit of a number cruncher, my attention was caught by some amazing facts & figures that surfaced from the depths of these reports:

1) Open Access has a share of just 2.3% of the (STM) journals market, according to a blog post outlining a report by Simba (which can be had for $2500).

2) Worldwide over 50% of research papers published 2007-2012 are now Open Access available, according to the main Science Metrix report prepared for the EU (page 16).

3) An astounding 73.8% of Dutch papers from the years 2008-2013 are Open Access available, according to the Science Metrix report again (page 26).

4) OA papers are between 26% and 64% more cited than the average of all papers. OA citation advantage is strongest for papers available in green Open Access and even negative for Gold Open Access, again according to the Science Metrix report (page 18).

5) Of all STEM researchers, 62% claim to have published at least 1 OA article over the last 3 years; for SSH that is 38%; both according to the recent global Author Insights Survey carried out by Nature (page 5).

6) 70% of authors agree that there are fundamental benefits to OA, up from 60% in 2013. The large scale 2014 survey sent out by Taylor&Francis also found that few authors see citation advantage as an important reason to publish OA.

7) Last year Dutch authors paid on average €1087 (excl. VAT) in APCs for their Gold OA papers, according to research by Wouter Gerritsma presented at an Amsterdam symposium during the Open Access Week.

8) Just to add one more: Web of Science shows that of all 1.25M 2014 papers & reviews indexed so far 8.5 percent is open access (on 20141211).

Pretty amazing and confusing figures, huh?

Alas, as always, only the fine print reveals what is really going on. It makes the picture more blurry but also more faithful. So let’s dive in to see what this is all about.

1) The 2.3 percent market share pertains to the income derived from scholarly journals. As such it is not a very good indication of the success of OA because it is affected by the price of APCs (higher APCs will lead to a higher market share) and the price of regular subscriptions (lowering subscription prices will lead to a rising OA market share). Also the figure probably includes income derived from hybrid journals. So do not simply cheer if this market share figure rises. And of course green OA is not in the equation at all. In theory we can have 100% OA through green deposits with a 0% OA income market share. Nevertheless the fairly low figure of 2.3% may convince investors in big publisher stock that they need not be afraid that OA will become a threat.

[Update 20141213: Walt Crawford has done comparable research arriving at roughly the same total OA market size as Simba (around $245M), without giving a percentage of the total journal market. Figures are in his blog, where he gives an amazing amount of other details, such as average APC levels in Gold OA journals at $388/981/170 for STE/M/SSH respectively. Unfortunately Crawford does not share his data like Wouter Gerritsma did recently.]
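To see why an income-based share says so little about OA availability, here is a minimal sketch in Python. It assumes the roughly $245M OA market size and the 2.3% share quoted above, and derives the subscription income from those two numbers; the 1.5x and 0.8x price changes and the function name are purely illustrative, not taken from the Simba report.

```python
def oa_income_share(oa_income, subscription_income):
    """Fraction of total journal income that comes from OA (APCs)."""
    return oa_income / (oa_income + subscription_income)

# Figures quoted in this post: about $245M OA income, 2.3% market share.
OA_INCOME = 245.0                       # million USD
TOTAL_INCOME = OA_INCOME / 0.023        # implied total market (derived, not reported)
SUBSCRIPTION_INCOME = TOTAL_INCOME - OA_INCOME

baseline = oa_income_share(OA_INCOME, SUBSCRIPTION_INCOME)

# Illustrative scenario: APCs rise by 50%, same number of OA papers.
higher_apcs = oa_income_share(OA_INCOME * 1.5, SUBSCRIPTION_INCOME)

# Illustrative scenario: subscription prices drop by 20%, same number of OA papers.
cheaper_subscriptions = oa_income_share(OA_INCOME, SUBSCRIPTION_INCOME * 0.8)

# Thought experiment: everything is green OA, nobody pays an APC.
green_only = oa_income_share(0.0, SUBSCRIPTION_INCOME)

print(f"baseline share:        {baseline:.1%}")
print(f"higher APCs:           {higher_apcs:.1%}")
print(f"cheaper subscriptions: {cheaper_subscriptions:.1%}")
print(f"green OA only:         {green_only:.1%}")
```

In the first two scenarios the share goes up although not a single extra paper has become open, and in the last one it is zero although every paper could be.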

2-3) The high OA shares reported by Science Metrix are based on a very large Scopus sample, measuring OA access levels in April 2014 through Google Scholar (I suspect by checking for DOC/HTML/PDF full text links you see on the right of Scholar result pages) and a number of other sites and repositories. These figures are real, but omit the most recent period (half year/year), which is arguably the most important for getting access and also the period with the least OA availability because of publisher embargoes on green sharing. Also it includes all versions of papers and may even include papers shared against copyright regulations through e.g. ResearchGate or private researcher sites. For these reasons you could argue that Science Metrix is stretching it a bit. On the other hand it excludes all papers published in full OA journals not indexed by Scopus, and that is over 70% of all OA journals in the Directory of Open Access Journals. That is reason to say Science Metrix may even be under-reporting. The extremely high figure for the Netherlands holds across most disciplines, with only historical studies, chemistry, arts and ‘textual studies’ having figures below 50% (table XIII). The Netherlands shares this high overall level of OA availability with Croatia, Estonia and Portugal. One might hypothesize that the high standard of university repository management contributes to the figure, but really more research is needed here to be sure, because there are many more potential explanations.

4) The figures on citation advantage may be counterintuitive. This is a difficult area of research. I guess there are many confounding third variables at play. First of all: many of the full Gold OA journals in Scopus are new kids on the block, still building a reputation. Second, gold OA journals are still dominated by journals in SSH, which have fewer citations per paper than STEM and also have an overrepresentation of non-English papers, which also attract fewer citations. So it may not be their gold nature per se that gives them a citation disadvantage. But perhaps it is also just too early to call. The evidence is still not conclusive, despite interesting studies and data becoming available, such as data on the (then hybrid) journal Nature Communications and data on journals switching to OA by joining BioMedCentral.

5) The number of authors that already published a Gold (or hybrid) OA paper is surprisingly high, with two thirds in STEM and over a third in SSH. This means publishing OA is accepted and has become mainstream. It is also nice that by far the most mentioned reason to publish OA is that researchers think research results should be openly available, and much less so that their funder requires OA or that they expect more citations. The only caveat here is that subdisciplines within SSH may show varying results and that perhaps the population is a bit skewed, because it is mainly based on authors known to Nature because they previously published with Nature/Macmillan (including Palgrave and Frontiers).

6) So, a large majority (70%) sees fundamental benefits to OA publishing according to the T&F survey. Add to this another 20 percent neutral and you wonder why not far more papers are published OA. What is it that holds the majority of researchers back from reaping those benefits? The same study finds that OA publishing standards are on average considered a bit lower, so that may be part of the explanation. Also, especially for (some part of) SSH, objections to CC-BY sharing may play a role, as that license is accepted by only 19% (compared to 71% for CC-BY-NC). A last factor may be that many fields are still lacking good, modern, high standard OA publishing venues, especially in SSH. So it is great news that the new Open Library of Humanities megajournal opened for submissions a week ago, on 20141202.

7) The average APC paid by Dutch authors is much lower than the common view that APCs are around $3000. That latter figure probably got into people’s minds by researchers looking at OA APCs in the traditional journals that they always used to publish in, so in an otherwise non-OA hybrid journal. When people really start looking for OA publishing venues (en route accepting that impact factors are bogus) they find that many have much more modest APCs. Please note however that the average of €1087 excludes VAT and includes OA journals charging nothing. In Germany, more or less the same levels were found (€1210). BTW Green OA, or self deposit in an archive, is another very simple way of OA sharing that costs next to nothing.
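The note about journals charging nothing matters more than it may seem. Here is a small Python sketch of the effect; the six APC amounts are made-up numbers for illustration only, not Wouter Gerritsma's data.

```python
from statistics import mean

# Hypothetical APCs in euros (excl. VAT) for six Gold OA papers;
# two were published in journals that charge no APC at all.
apcs = [0, 0, 500, 1200, 1500, 2400]

# Average over all Gold OA papers, zero-fee journals included.
mean_all = mean(apcs)

# Average over only those papers for which an APC was actually paid.
mean_paying = mean([apc for apc in apcs if apc > 0])

print(f"average incl. no-fee journals: {mean_all:.0f} euro")
print(f"average of paid APCs only:     {mean_paying:.0f} euro")
```

So an average that includes no-fee journals tells you what Gold OA costs per paper overall, not what a typical invoice looks like.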

8) The 8.5% Open Access availability of current year papers indexed in WoS seems to be casting a shadow over the Science Metrix figures. However, you have to note that this WoS figure excludes papers available in OA through hybrid journals, as WoS assigns the OA label to papers on the journal level, so only papers in full OA journals are counted as OA when applying their filter. It also disregards the many good young OA journals, as the inclusion process for WoS can take years. And finally WoS does not look at OA availability through self archiving. So this figure seriously underreports OA availability.

In conclusion, at this point in time, we may say that on average we have reached and crossed tipping points for mass OA availability and acceptance. Improving the acceptance of licenses (either more open ones by authors or more stringent ones by OA publishers) and creating more and more diverse OA publishing venues can take OA further. All the reports and surveys mentioned are worth a closer look. And when someone asks me: well do we have 2 or 74 percent OA, I say, well…. about 38 😉

