FEATURE
Counting on COUNTER: The Current State of E-Resource Usage Data in Libraries
by Josh Welker
Various Common COUNTER Reports and What They Measure

JR1 (Journal Report 1): Full-Text Article Requests by Month and Journal
JR2 (Journal Report 2): Access Denied to Full-Text Articles by Month, Journal, and Category
JR3 (Journal Report 3): Number of Successful Item Requests by Month, Journal, and Page Type
JR4 (Journal Report 4): Total Searches Run by Month and Collection
JR5 (Journal Report 5): Number of Successful Full-Text Article Requests by Year of Publication and Journal
DB1 (Database Report 1): Total Searches, Result Clicks, and Record Views by Month and Database (updated in COUNTER Release 4; previously Searches and Sessions by Month and Database)
DB2 (Database Report 2): Access Denied by Month, Database, and Category
DB3/PR1 (Database Report 3/Platform Report 1): Total Searches, Result Clicks, and Record Views by Month and Platform
BR1 (Book Report 1): Number of Successful Book Title Requests by Month and Title
BR2 (Book Report 2): Number of Successful Book Section Requests by Month and Title
Any librarian who has managed electronic resources has experienced the joy, for want of a better word, of gathering and analyzing usage statistics. Such statistics are important for evaluating the effectiveness of resources and for making budgeting decisions. Unfortunately, the data are usually tedious to collect, inconsistently organized, of dubious accuracy, and anything but a joy to work with.
Once the internet became the ubiquitous way to access content, it did not take long for the library community to create standards to ease the process of collecting usage data. In 2002, librarians formed Project COUNTER (Counting Online Usage of Networked Electronic Resources). A year later, COUNTER issued Release 1 of its Code of Practice, which outlined standards for publishers and vendors to report usage statistics.
In 2007, the information science standards body NISO (National Information Standards Organization) created the Standardized Usage Statistics Harvesting Initiative protocol, known casually as SUSHI, which provides an automated way to download COUNTER reports via the web.
While COUNTER and SUSHI have gone a long way toward improving the adoption and availability of usage statistics from library vendors, I soon learned that libraries still must do a good amount of work to get the data they need to make critical collection development and database budgeting decisions. I learned my lesson the hard way, by first turning to SUSHI in the hope that it would fulfill my library's need for database usage data. Let me start at the start …
Building a SUSHI Client
In early 2012, Southwest Baptist University (SBU) Libraries began the daunting task of collecting electronic resource usage statistics. As the new electronic resources librarian, I got the job, which I immediately attempted to systematize and hopefully simplify. SBU did not have the funds to acquire a commercial electronic resource management (ERM) system, so after much difficulty, I built a functioning SUSHI harvester based on Python and MySQL. To make a long story short, details, instructions, and source code for this project can be found on my blog at http://josh-welker.blogspot.com.
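The full harvester is on the blog linked above; purely for orientation, here is a minimal sketch of what a single SUSHI request can look like in Python. The endpoint URL, requestor ID, and customer ID are placeholders, the requests library is assumed, and the element names follow the NISO SUSHI schema for COUNTER Release 4; each vendor's own SUSHI documentation is the authoritative reference.

```python
"""A minimal SUSHI JR1 request, sketched with the requests library.
The endpoint, requestor ID, and customer ID below are placeholders;
real values come from each vendor's SUSHI documentation."""
import datetime
import requests

SUSHI_ENDPOINT = "https://sushi.example-vendor.com/SushiService"  # placeholder
REQUESTOR_ID = "my-requestor-id"                                  # placeholder
CUSTOMER_ID = "my-customer-id"                                    # placeholder

# SUSHI (NISO Z39.93) is a SOAP service: the ReportRequest is wrapped in a
# SOAP envelope and POSTed to the vendor's endpoint.
envelope = f"""<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <ReportRequest xmlns="http://www.niso.org/schemas/sushi"
                   Created="{datetime.datetime.utcnow().isoformat()}Z"
                   ID="request-001">
      <Requestor><ID>{REQUESTOR_ID}</ID></Requestor>
      <CustomerReference><ID>{CUSTOMER_ID}</ID></CustomerReference>
      <ReportDefinition Name="JR1" Release="4">
        <Filters>
          <UsageDateRange>
            <Begin>2013-01-01</Begin>
            <End>2013-12-31</End>
          </UsageDateRange>
        </Filters>
      </ReportDefinition>
    </ReportRequest>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

response = requests.post(
    SUSHI_ENDPOINT,
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8"},
    timeout=60,
)
response.raise_for_status()
# The response body is a SOAP envelope containing a COUNTER JR1 report in XML,
# which can then be parsed (e.g., with xml.etree.ElementTree) and loaded into MySQL.
print(response.text[:500])
```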
The bottom line: building a working SUSHI client may not even have been necessary, and it certainly was not sufficient. To fully evaluate usage, I needed to conduct a standard cost-per-use analysis of SBU's electronic resources. Cost-per-use studies require two simple ingredients: cost and usage. Cost was easy enough to find, but usage proved impossible to measure with SUSHI for most vendors. The only COUNTER report available via SUSHI that measures full-text use is the JR1 report, which counts uses per journal title. But because the SUSHI standard does not allow librarians to specify a particular database, the JR1 report contains every journal the library can access on a vendor's platform. SBU might subscribe to two dozen databases from a single vendor, each with its own price tag. There is no way to calculate cost per use for those databases when the vendor lumps all the usage figures together into one report. So at the end of the day, and after all my work building a SUSHI client, I still ended up visiting dozens of vendor websites to manually collect, collate, format, and analyze all the data myself, using Microsoft Access and Excel.
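The arithmetic behind a cost-per-use study is trivial once the two ingredients are in hand; the hard part is the manual collection described above. The sketch below assumes two hand-built CSV files, one of subscription costs and one of full-text retrieval totals, keyed by database name. The file names and column headings are hypothetical.

```python
"""Cost per use from two hand-collected CSV files (hypothetical layout):
costs.csv:  database,annual_cost
usage.csv:  database,fulltext_retrievals"""
import csv

def load_column(path, key_field, value_field):
    # Build a dict mapping the key column to a numeric value column.
    with open(path, newline="") as f:
        return {row[key_field]: float(row[value_field]) for row in csv.DictReader(f)}

costs = load_column("costs.csv", "database", "annual_cost")
usage = load_column("usage.csv", "database", "fulltext_retrievals")

for database in sorted(costs):
    uses = usage.get(database, 0)
    if uses:
        print(f"{database}: ${costs[database] / uses:.2f} per use")
    else:
        print(f"{database}: no recorded use")
```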
A Study of Electronic Resource Statistics in Libraries
After becoming thoroughly disillusioned with SUSHI, I wanted to know if I was alone or if other libraries were bogged down in the same statistical quagmire. The rest of this article reports the results of a survey I conducted among my peers, not only to satisfy my own curiosity but also in the hope that such a study would reveal insights that the vendor community and COUNTER/NISO could use to improve the standards and protocols for collecting and reporting usage statistics.
My short survey was assembled using SurveyMonkey and distributed through the LITA, ERIL, and Code4Lib listservs. The survey contained four quantitative questions and two qualitative questions related to how libraries collect and use statistics. The survey was left open for 3 weeks.
The results showed that SBU is not unique in experiencing difficulty with usage statistics. A total of 131 librarians responded to the survey. Not every respondent answered every question. Anonymous raw data can be found on my blog. I will summarize the highlights here.
The first question asked what methods libraries use to collect usage statistics. Most librarians in the survey indicated that they manually collect usage data, just as I ended up doing in my project. SUSHI is seldom used, which is unsurprising since vendor implementation is poor and ERM software is expensive and unwieldy. Respondents did indicate that vendor websites provided COUNTER statistics more often than not, but a large number of vendors still provide no COUNTER statistics.
The second question asked how much time librarians spend collecting usage statistics each year. Responses were all across the board, indicating significant variance between institutions, most likely due to factors such as the number of electronic resources, available staff, and automation tools. As an electronic resources librarian, I view 1 week as an acceptable amount of time to spend collecting usage statistics. Only 23.9% of respondents indicated that they spend 1 week or less collecting statistics, whereas 40.7% of respondents indicated that the process takes them 4 weeks or longer.
The third question asked what kinds of data librarians find useful. Answers were mapped to a Likert scale with the following values: Not useful (0), Somewhat useful (1), and Very useful (2), and the average response value was calculated for each category. The highest scorers were Full-text Retrievals by Journal/Book (1.71), Full-text Retrievals by Database/Collection (1.69), and Searches by Database/Collection (1.50), ahead of the next-highest category by a margin of 0.33. This indicates that full-text retrievals are the main metric librarians care about. Respondents generally ranked session and turnaway data unfavorably. It is also notable that the librarians in this survey did not care about platform-level data, whether for full-text retrievals, searches, sessions, or turnaways.
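To make the scoring concrete, here is the calculation applied to each category, shown with invented responses; the 0/1/2 mapping is the one described above.

```python
# Average Likert score for one category: Not useful = 0, Somewhat useful = 1,
# Very useful = 2. The sample responses below are made up for illustration.
LIKERT = {"Not useful": 0, "Somewhat useful": 1, "Very useful": 2}

responses = ["Very useful", "Very useful", "Somewhat useful", "Very useful", "Not useful"]
score = sum(LIKERT[r] for r in responses) / len(responses)
print(f"Average score: {score:.2f}")  # 1.40 for this made-up sample
```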
The fourth question asked which COUNTER reports are useful to librarians. Answers were mapped to the same Likert scale as in the third question. The highest-ranking COUNTER reports (see the table at the beginning of this article) were JR1 (1.77), DB1 (1.55), and BR1 (1.47), ahead of the rest by a margin of 0.40. The popularity of JR1 and BR1 is not surprising, since these librarians indicated that they primarily care about the number of full-text retrievals. DB1 reports are very popular even though they do not include full-text statistics, most likely because they are the only COUNTER report that gives any indication of use at the database level. All other reports scored between 0.75 and 1.05, indicating that most libraries find them only somewhat useful at best.
The fifth question was qualitative and asked what major challenges librarians face when collecting usage statistics. Responses were grouped into several categories as common themes emerged. A large share of the complaints revolve around vendor implementation: 66 respondents complained about vendor compliance and data consistency. Many vendors offer few or no COUNTER reports. Data often comes in proprietary formats that cannot easily be compared across resources. There is also no guarantee that vendors are counting usage the same way. Here is how one respondent described it: “for a search box that covers 5 databases and is searched 1,000 times in a month, [some vendors] will report 5,000 searches (1,000 for each database).”
Librarians reported distrust of the vendors’ statistics. One said, “It strikes me that publishers and vendors don’t want us to see this data because some of it could cause cancellations.” Another mentioned having a “feeling that the stats cannot altogether be trusted.”
But the largest complaint, voiced by 71 respondents, was that collecting statistics is too tedious. In most cases, this is due to poor vendor interfaces. A common grievance was that smaller vendors have no web-based interface for collecting statistics and require that a representative be emailed to ask for statistics. These contacts are often lost as libraries and vendors shuffle personnel. When vendors do provide an online interface, there is still the issue of tracking down potentially hundreds of URLs and login credentials and learning every interface’s ins and outs. “There is no consistency,” one respondent wrote. “Each vendor’s site is different, and it takes a while to figure out how to go about harvesting the use stats.”
The extremely low implementation of SUSHI services exacerbates the problem. Twelve respondents voiced frustration that vendors frequently change reporting platforms without notifying the right library personnel where and how to use the new platform. Ten complained that it is too difficult to use the available metrics to create a cost-per-use analysis; nine were frustrated that most vendors only report based on calendar year, while most libraries need data related to their fiscal years.
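The calendar-year versus fiscal-year complaint, at least, is a straightforward data-wrangling problem once monthly figures are in hand. The sketch below re-totals a hypothetical monthly usage CSV into fiscal years beginning in July; the file layout and column names are assumptions, and the start month should be adjusted to local practice.

```python
"""Re-aggregate monthly COUNTER-style usage into fiscal-year totals.
Assumes a hypothetical CSV (usage_monthly.csv) with columns:
database,month,fulltext_retrievals   (month formatted as YYYY-MM)."""
import csv
from collections import defaultdict

FISCAL_YEAR_START_MONTH = 7  # July-June fiscal year; adjust as needed

def fiscal_year(month_str):
    year, month = (int(part) for part in month_str.split("-"))
    # Under a July-June fiscal year, a July 2012 use belongs to FY2013.
    return year + 1 if month >= FISCAL_YEAR_START_MONTH else year

totals = defaultdict(int)
with open("usage_monthly.csv", newline="") as f:
    for row in csv.DictReader(f):
        totals[(row["database"], fiscal_year(row["month"]))] += int(row["fulltext_retrievals"])

for (database, fy), count in sorted(totals.items()):
    print(f"{database} FY{fy}: {count} full-text retrievals")
```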
The sixth and final question asked what changes in the COUNTER standard would improve usage statistics. Fewer participants responded to this question (only 68 total), and responses were more varied. As in the previous question, responses were grouped by common themes.
The most common suggestion, mentioned by 29 respondents, was requiring more uniform vendor implementation. One of the biggest problems in this regard is the granularity of reporting. Some vendors will include all journals in a JR1 report, whether the library subscribes to the journal or not. Other vendors do not allow JR1 reports to target a specific database. Vendors also commonly conflate platforms with individual databases, which prevents libraries from getting usage data for the individual products they buy.
The second-most common suggestion was improvement of automation tools. SUSHI is a great standard, but it lacks wide vendor implementation and, even more critically, software tools that allow libraries to take advantage of it.
Conclusions and Recommendations
Responses to the survey paint a somewhat bleak picture of the current landscape of electronic resource usage statistics. Librarians are spending inordinate amounts of time collecting statistics and are often not even able to get the information they need. As one respondent succinctly put it, there is “very little ROI in terms of driving collection development decision-making given the amount of time that goes into gathering the data.”
There seems to be a categorical misunderstanding among vendors regarding why libraries want usage statistics. For the most part, librarians don’t care about sessions, searches, IP addresses, or page views. It’s all about cost per use. One respondent said, “[Cost per use] is the only bit of information that library administrators care about anyway.”
Vendors and COUNTER should do everything in their power to make it easier for libraries to calculate cost per use. This includes providing full-text retrieval data for whatever the actual unit is that libraries are purchasing. If a library purchases a database, vendors should give usage data at the database level, not at the journal level or the platform level. This problem is especially rampant in SUSHI.
NISO needs to include a parameter in the SUSHI specification that allows libraries to filter data by a specific database. Otherwise, the data is basically useless. The new COUNTER Release 4 updated the DB1 report to include result clicks and record views instead of just searches and sessions, which is a great step in the right direction. However, at the time of writing, I could not find any vendors that had implemented the new DB1 report.
How can we make this better? I’ve summarized my recommendations for vendors, standards bodies, and librarians in the accompanying sidebar. Let’s work together to make better usage data easier to get.