Analyzing Digital Collections Entrances: What Gets Used and Why It Matters

Paromita Biswas and Joel Marchesoni

Paromita Biswas (pbiswas@email.wcu.edu) is Metadata Librarian and Joel Marchesoni (jmarch@email.wcu.edu) is Technology Support Analyst, Hunter Library, Western Carolina University, Cullowhee, North Carolina.

ABSTRACT

This paper analyzes usage data from Hunter Library's digital collections, gathered with Google Analytics over a period of twenty-seven months, from October 2013 through December 2015. The authors consider this analysis important for identifying the collections that receive the largest number of visits, and they argue that such evaluation better informs decisions about building digital collections that will serve user needs. The authors also examine the benefits of harvesting to sites such as the Digital Public Library of America, and they believe this paper contributes to the literature on Google Analytics and its use by libraries.

INTRODUCTION

Hunter Library at Western Carolina University (WCU) has fourteen digital collections hosted in CONTENTdm, a digital collection management system from OCLC. Users can enter the collections in various ways: through the Library's CONTENTdm landing pages,1 search engines, or sites such as the Digital Public Library of America (DPLA), where all the collections are harvested.2 Since October 2013, the Library has collected usage data from its collections' websites and from DPLA referrals via Google Analytics. This paper analyzes that usage data, covering a period of approximately twenty-seven months from October 2013 through December 2015. The authors consider this analysis important for identifying the collections receiving the largest number of visits, including visits through harvesting sites such as the DPLA. The authors argue that such data evaluation is important because it can better inform decisions taken to build collections that will attract users and serve their needs. Additionally, the analysis of usage data generated from harvesting sites such as the DPLA demonstrates the usefulness of harvesting in increasing digital collections' usage. Lastly, this paper contributes to the broader literature on Google Analytics and its use by libraries in data analysis.

LITERATURE REVIEW

Using Google Analytics to study usage of electronic resources is common; a considerable amount of material exists describing the use of Google Analytics in marketing and business fields.3 However, the published literature offers little about the use of this software for studying usage of collections consisting of unique materials digitized and placed online by libraries and cultural heritage organizations. For example, Betty has written about using Google Analytics to track statistics for user interaction with librarian-created digital media such as quizzes and video tutorials.4 Fang discusses using Google Analytics to track the behavior of users who visited the Rutgers-Newark Law Library website.5 Fang looked at the number of visitors, what and how many pages they visited, how long they stayed on each page, where they were coming from, and which search engine or website had referred them to the library's website. Findings were evaluated and used to make improvements to the library's website.
For example, Fang mentions using Google Analytics data to track the percentage of new and returning visitors before and after the website redesign.

Among articles that discuss using web analytics to learn how users access digital collections, most have focused on comparing third-party platforms, online search engines, and the traditional library catalog to identify preferred modes of access and to determine whether the results call for a shift in how libraries share their digital collections. For example, in their article on the impact of social media platforms such as HistoryPin and Pinterest on the discovery and access of digital collections, Baggett and Gibbs use Google Analytics to track usage of digital objects on the library's website, as well as statistics collected from HistoryPin's and Pinterest's first-party analytics tools.6 The authors conclude that while neither HistoryPin nor Pinterest drives users back to the library's website, both help in the discovery of digital collections and can enhance user access to library collections. Schlosser and Stamper compare the effects on usage of a collection housed in an institutional repository and reposted on Flickr.7 They found that whether housing a collection on a third-party site had an adverse effect on attracting traffic to the library's website was less important than ensuring users accessed the collection somewhere. Likewise, O'English demonstrates how data from web analytics were used to compare access to archival materials via online search engines as opposed to library catalogs using MARC records for description.8 O'English argues that library practices should change accordingly to promote patron access and use. Ladd's article on the access and use of a digital postcard collection from Miami University uses statistics from Google Analytics, CONTENTdm, and Flickr over a period of one year.9 Ladd's findings reveal that few users came to the main digital collections website to search and browse; instead, most arrived via external sources such as search engines and social media sites. The resulting increase in views, Ladd asserts, makes regular updates in both CONTENTdm and Flickr important for promoting access and use of the postcards.

Articles on using Google Analytics to track digital collection usage have also explored the geographic base of users. For example, Herold uses Google Analytics to demonstrate usage of a digital archival collection by users at the institutional, national, and international levels.10 Herold looks at server transaction logs maintained in Google Analytics, on- and off-campus searching counts, user locations, and repeat visitors to the archival images representing cultural heritage materials related to the Orang Asli peoples and cultures of Malaysia. She uses these data to ascertain the number of users by geographic region and determines that, while most visitors came from the United States, Malaysia ranked second. According to Herold, the data show that this particular digital collection was able to reach another target audience: users from Malaysia. Herold's findings indicate that digitization of unique materials makes them available to a worldwide audience.

Whether harvesting has increased usage of digital collections available via the DPLA or its hubs has received limited exploration in the literature.
Most writings on harvesting digital collections have focused on the technical aspects of the process, such as the DPLA's ingestion method, the quality and scalability of metadata remediation and enhancement,11 and encoding practices in large-scale metadata.12 For example, Gregory and Williams write about the North Carolina Digital Heritage Center as one of the service hubs of the DPLA. Service hubs are centers that aggregate digital collection metadata provided by institutions for harvesting by the DPLA. The authors discuss metadata requirements, software review, and the establishment of a workflow for sending large metadata feeds to the DPLA.13 Boyd, Gilbert, and Vinson, in their article on the South Carolina Digital Library (SCDL), another service hub for the DPLA, describe the planning behind setting up the SCDL, its management, and the technology involved in metadata harvesting.14 Freeland and Moulaison discuss the Missouri hub as a model for "institutions with similar collective goals for exposing and enriching their data through the DPLA."15 According to them, by harvesting their metadata to the DPLA, institutions are able to share their digital collections with the broader public. Additionally, institutions that harvest metadata to the DPLA receive value-added services such as geocoding of location-based metadata and expression of contributed metadata as linked data.

Data Collection Parameters

Hunter Library digital collections usage data included information on item views16 and referrals17 for each of the collections, including DPLA referrals. The authors also considered keyword search terms18 across all referrals, and within CONTENTdm specifically, that brought users to the Library's collections. The authors took the most frequently occurring keywords to represent the subjects of the most heavily used collections. Repeat visitors to the Library's digital collections' website were also tracked. Finally, sessions19 were traced by the geographic area20 of the users.

Hunter Library's collections vary in size. The Library's largest and one of its oldest collections, Craft Revival, showcases documents, photographs, and craft objects housed in Hunter Library and smaller regional institutions. The collection's items represent the late nineteenth and early twentieth century (1890s–1940s) Craft Revival movement in Western North Carolina, which was characterized by a renewed interest in handmade objects, including Cherokee arts and crafts. The Craft Revival collection began in 2005 and includes 1,982 items. The second largest collection, Great Smoky Mountains, which highlights the efforts that went into the establishment of the park and includes photographs of the park's landscape, flora, and fauna, began in 2012 and consists of 1,829 items. Not all digital collections were harvested to the DPLA at the same time. While some older collections were harvested to the DPLA in 2013, smaller, institution-specific collections started later were also harvested later.
For example, three collections were harvested to the DPLA in 2015: WCU—Oral Histories, a collection of interviews conducted by students in one of WCU's history classes documenting the history and culture of Western North Carolina and the lives of WCU athletes and artists like Josephina Niggli, who taught drama at WCU; Highlights from WCU, a collection of unique items from WCU's Mountain Heritage Center and other departments on campus, including letters from the Library's Special Collections transcribed by students in WCU's English department; and WCU—Fine Art Museum, showcasing artwork from the university's Fine Art Museum. Because these smaller collections were started later, their total item views and referral counts would likely be lower than those of some of the Library's older collections; however, these newer collections were included because they might provide valuable data regarding harvesting referrals and returning visitors. Table 1 shows the year each collection was started, the number of items it includes, and the year it was harvested to the DPLA.

Collection Name | Start Year | Collection Size (Number of Items) | Harvested Since
Cherokee Traditions | 2011 | 332 | 2013
Civil War | 2011 | 68 | 2013
Craft Revival | 2005 | 1,982 | 2013
Great Smoky Mountains | 2013 | 1,829 | 2013
Highlights from WCU | 2015 | 39 | 2015
Horace Kephart | 2005 | 552 | 2013
Picturing Appalachia | 2012 | 972 | 2013
Stories of Mountain Folk | 2012 | 374 | 2013
Travel Western North Carolina | 2011 | 160 | 2013
WCU—Fine Art Museum | 2015 | 87 | 2015
WCU—Herbarium | 2013 | 91 | 2013
WCU—Making Memories | 2012 | 408 | 2013
WCU—Oral Histories | 2015 | 67 | 2015
Western North Carolina Regional Maps | 2015 | 37 | 2015

Table 1. Collections by year

Collecting Data Using Google Analytics

The Library has had Google Analytics set up on online exhibits (websites outside of CONTENTdm that provide additional insight into the collections) since 2008 and began using Google Analytics to track its CONTENTdm materials with the 6.1.2 release in October 2013. CONTENTdm version 6.4 introduced a configuration field that allowed the authors to enter a Google Analytics ID and automatically generate the tracking code in pages, simplifying the setup. Following that software update, OCLC made Google Analytics the default data logging mechanism.

The Library set up Google Analytics so that online exhibits are tracked together with their CONTENTdm collections. This is accomplished by using custom tracking on all webpages and a custom script in CONTENTdm, which allows the Library to link its CONTENTdm and wcu.edu domains within Google Analytics so that sessions can be viewed across all online digital collections.

Data were collected from Google Analytics using several tools. Google provides an online tool called Query Explorer (https://ga-dev-tools.appspot.com/query-explorer/) that can create and execute custom searches against Google Analytics. This application was used to craft the queries. Microsoft Excel was initially used to download data, using the custom plugin Rest to Excel Library (http://ramblings.mcpher.com/Home/excelquirks/json/rest) to parse information from Google Analytics into worksheets. The Excel add-on works well but requires knowledge of Microsoft Visual Basic for Applications (VBA) programming to use effectively. This limitation prompted the authors to look for a simpler way of retrieving data. The authors settled on OpenRefine (https://github.com/OpenRefine/OpenRefine) to collect, sort, and filter data, with Excel used for results analysis.
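As a rough illustration of this retrieval-and-summarization workflow, the following Python sketch queries the Google Analytics Core Reporting API (version 3), the same API that Query Explorer builds queries against, and expresses each collection's item views as a share of the total, much as was later done in Excel (compare table 2). This is not the authors' production script: the view ID, access token, date handling, and the assumed CONTENTdm item URL pattern are placeholders supplied for illustration only.

```python
"""
Sketch only: pull item pageviews from the Google Analytics Core Reporting API (v3)
and compute each collection's share of total views. The view ID, token, and
CONTENTdm URL pattern below are placeholder assumptions, not Hunter Library's
actual configuration.
"""
import csv
import re
from collections import Counter

import requests

API_ENDPOINT = "https://www.googleapis.com/analytics/v3/data/ga"
VIEW_ID = "ga:12345678"            # placeholder Google Analytics view (profile) ID
ACCESS_TOKEN = "ya29.placeholder"  # OAuth 2.0 access token obtained separately

# Assumed CONTENTdm item URL pattern: /cdm/ref/collection/<alias>/id/<number>
ITEM_PATH = re.compile(r"/cdm/ref/collection/([^/]+)/id/\d+")


def fetch_item_pageviews():
    """Return (pagePath, pageviews) rows for the study period."""
    params = {
        "ids": VIEW_ID,
        "start-date": "2013-10-01",
        "end-date": "2015-12-31",
        "metrics": "ga:pageviews",
        "dimensions": "ga:pagePath",
        "max-results": 10000,
    }
    resp = requests.get(
        API_ENDPOINT,
        params=params,
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    )
    resp.raise_for_status()
    return resp.json().get("rows", [])


def views_by_collection(rows):
    """Roll page-level counts up to the collection alias found in each URL."""
    totals = Counter()
    for path, views in rows:
        match = ITEM_PATH.search(path)
        if match:
            totals[match.group(1)] += int(views)
    return totals


if __name__ == "__main__":
    totals = views_by_collection(fetch_item_pageviews())
    grand_total = sum(totals.values()) or 1
    with open("item_views_by_collection.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["collection", "item_views", "percent_of_total"])
        for alias, views in totals.most_common():
            writer.writerow([alias, views, round(100 * views / grand_total, 2)])
```

The same query pattern could be repeated with an additional dimension such as ga:source to separate DPLA referrals from search-engine referrals, but the overall shape of the analysis does not change.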
Once in Excel, formulas were used to mine the data for specific targets.

RESULTS ANALYSIS

The data collected using Google Analytics spanned a period of approximately twenty-seven months, from October 2013 through December 2015. Table 2 and graph 1 show each collection's item views, item referrals, and size (number of items in the collection), with each figure calculated as a percentage of total item views, total item referrals, and total number of items across all collections. In table 2, the top five collections in terms of item views and referrals are highlighted. Graph 1, a graphical representation of table 2, displays more starkly the differences between collections in terms of views and referrals.

Collection Name | Item Views as Percentage of Total Views | Item Referrals as Percentage of Total Referrals | Number of Items as Percentage of Total Items for All Collections
Cherokee Traditions | 6.38 | 6.12 | 4.74
Civil War | 1.89 | 0.88 | 0.97
Craft Revival | 41.35 | 52.39 | 28.32
Great Smoky Mountains | 7.50 | 6.34 | 26.14
Highlights from WCU | 0.23 | 0.08 | 0.56
Horace Kephart | 11.67 | 7.62 | 7.89
Picturing Appalachia | 10.03 | 9.99 | 13.89
Stories of Mountain Folk | 3.51 | 2.45 | 5.34
Travel Western North Carolina | 7.87 | 9.57 | 2.29
WCU—Fine Art Museum | 0.19 | 0.08 | 1.24
WCU—Herbarium | 0.71 | 0.45 | 1.30
WCU—Making Memories | 7.13 | 2.64 | 5.83
WCU—Oral Histories | 0.80 | 1.08 | 0.96
Western North Carolina Regional Maps | 0.26 | 0.11 | 0.53
Total | 100.00 | 100.00 | 100.00

Table 2. Collections by percentage

Graph 1. Collections by percentage

As the preceding table and graph demonstrate, Craft Revival, one of the Library's oldest and largest collections, contributes more than 28 percent of all digital collections' items and garners close to 42 percent of all item views and 53 percent of all item referrals. Great Smoky Mountains, the second largest collection, contributes a little more than 26 percent of items but receives only about 8 percent of all item views and 7 percent of all referrals. The Horace Kephart collection, focusing on the life and works of Horace Kephart (author, librarian, and outdoorsman who made the mountains of Western North Carolina his home later in life), is the Library's fourth largest collection; it receives almost 12 percent of all item views and about 8 percent of all item referrals. Picturing Appalachia, the third largest collection, consisting of photographs showcasing the history, culture, and natural landscape of Southern Appalachia in the Western North Carolina region, makes up 14 percent of items and receives approximately 10 percent of all referrals and views. Travel Western North Carolina, visual journeys of Western North Carolina communities through three generations, contributes fewer than 3 percent of items but scores high on both item views and referrals. WCU—Making Memories, which highlights the people, buildings, and events from WCU's history, and Stories of Mountain Folk (SOMF), a collection of radio programs from the Western North Carolina nonprofit Catch the Spirit of Appalachia archived at Hunter Library, are similar in size, and each receives fewer than 3 percent of all item referrals. However, WCU—Making Memories receives more than 7 percent of all item views compared to SOMF's almost 4 percent.
These findings are not surprising, as the Making Memories collection documents Western Carolina University's history and may receive many views from within the institution. Overall, however, the Craft Revival collection can be considered the Library's most popular collection, with the Horace Kephart collection appearing to be the second most popular. And, not surprisingly, Cherokee Traditions, a collection of art objects, photographs, and recordings similar in content to Craft Revival in its focus on Cherokee culture and history, is quite popular: it receives more item referrals than both WCU—Making Memories and SOMF and more item views than SOMF (table 2).

An analysis of keyword searches within CONTENTdm and keyword searches across all referral sources reinforces these findings. As part of the analysis, data collected for this twenty-seven-month period for the top keyword searches within CONTENTdm and the top keyword searches counting all referrals were recorded in an Excel spreadsheet and then uploaded to OpenRefine. OpenRefine allows text and numeric data to be sorted by name (alphabetically) and by count (highest to lowest occurring). Once the Excel spreadsheet was uploaded to OpenRefine, keywords were sorted numerically and clustered. OpenRefine has a "cluster" function that brings together text that has the same meaning but differs by spelling or capitalization (for example, "CHEROKEE," "cherokee," "cheroke") or by word order (for example, "Jane Smith," "Smith, Jane"). The clustering function provides a count of the number of times a keyword was used regardless of exact spelling. After identifying keywords belonging to a cluster (for example, a cluster of the word "Cherokee" spelled differently), the differently spelled or ordered keywords in each cluster were merged in OpenRefine with their most accurate counterparts. (A minimal sketch of this normalization step, written outside OpenRefine, follows table 3.) Finally, it should be noted that keywords including "!" and "+" symbols were most likely generated either from using multiple search terms within CONTENTdm's advanced search or from curated search links maintained on some of our online exhibit websites; these links take users to commonly used result sets within the collection.

Tables 3 and 4 list the ten most frequently searched keywords within CONTENTdm and across all referrals, respectively, along with the names of the collections most relevant to these searches.

Keywords | Occurrence Count | Relevant Collection(s)
Cherokee | 187 | Craft Revival; Cherokee Traditions
Cherokee Language | 107 | Craft Revival; Cherokee Traditions
Southern Highland Craft Guild | 98 | Craft Revival
basket!object | 96 | Craft Revival; Cherokee Traditions
Indian masks—Appalachian Region, Southern | 83 | Craft Revival; Cherokee Traditions
Basket!photograph postcard | 82 | Craft Revival; Cherokee Traditions
W.M. Cline Company | 78 | Picturing Appalachia; Craft Revival
Cherokee +Indian! photograph | 72 | Craft Revival; Cherokee Traditions
Wood-carving—Appalachian Region, Southern | 70 | Craft Revival
Indian wood-carving—Appalachian Region, Southern | 69 | Craft Revival

Table 3. Top keyword searches within CONTENTdm
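For readers who wish to reproduce the clustering step described above outside OpenRefine, the short sketch below applies the same key-collision idea: each keyword is reduced to a normalized fingerprint (lowercased, punctuation removed, tokens deduplicated and sorted), keywords sharing a fingerprint are merged, and the merged count is reported under the most frequent spelling. This handles capitalization and word-order variants; true misspellings such as "cheroke" would still require OpenRefine's nearest-neighbor methods or manual review. The sample keywords and counts are invented for illustration.

```python
"""
Minimal approximation of OpenRefine's key-collision ("fingerprint") clustering,
used to merge variant spellings of the same search keyword before counting.
Illustrative only; the sample data are invented.
"""
import re
from collections import Counter, defaultdict


def fingerprint(term: str) -> str:
    """Lowercase, strip punctuation, then sort unique tokens (order-insensitive key)."""
    tokens = re.sub(r"[^\w\s]", " ", term.lower()).split()
    return " ".join(sorted(set(tokens)))


def cluster_keywords(raw_counts: Counter) -> Counter:
    """Merge keywords that share a fingerprint; report the most frequent spelling."""
    clusters = defaultdict(Counter)
    for term, count in raw_counts.items():
        clusters[fingerprint(term)][term] += count
    merged = Counter()
    for variants in clusters.values():
        label, _ = variants.most_common(1)[0]  # representative spelling
        merged[label] = sum(variants.values())
    return merged


if __name__ == "__main__":
    raw = Counter({"Cherokee": 120, "CHEROKEE": 40, "cherokee": 27,
                   "Smith, Jane": 6, "Jane Smith": 4})
    for term, count in cluster_keywords(raw).most_common():
        print(f"{term}\t{count}")
```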
Keywords | Number of Sessions | Relevant Collection(s)
cherokee traditions | 442 | Craft Revival; Cherokee Traditions
horace kephart | 185 | Horace Kephart; Great Smoky Mountains; Picturing Appalachia
cherokee pottery | 55 | Craft Revival; Cherokee Traditions
kephart knife | 50 | Horace Kephart
amanda swimmer | 37 | Craft Revival; Cherokee Traditions
appalachian people | 36 | Craft Revival; Cherokee Traditions; Great Smoky Mountains; WCU—Oral Histories
cherokee indian pottery | 36 | Craft Revival; Cherokee Traditions
cherokee baskets | 34 | Craft Revival; Cherokee Traditions
weaving patterns | 33 | Craft Revival; Cherokee Traditions
basket weaving | 26 | Craft Revival; Cherokee Traditions

Table 4. Top keyword searches across all referrals

Tables 3 and 4 show that the top searches relate to arts and crafts from the Western North Carolina region ("baskets," "Indian masks," "Indian wood carving," "Cherokee pottery"), artists ("amanda swimmer"), or topics relating to Cherokee culture ("cherokee," "cherokee language"). Searches relating to the Horace Kephart collection ("horace kephart," "kephart knife") are also popular, which explains why the Kephart collection, which accounts for fewer than 8 percent of the Library's digital collections' items, scores highly in terms of item views (second) and referrals (fourth).

The popularity of topics related to Western North Carolina is reiterated in the geographic base of the users. Graph 2 shows that North Carolina accounts for most of the searches, with cities in Western North Carolina (Asheville, Franklin, Cherokee, Waynesville) accounting for more than 40 percent of sessions.

Graph 2. Cities by session count

The majority of item referrals come from search engines such as Google, Bing, and Yahoo! Graph 3 shows the percentage of item referrals from these external searches.21 However, the DPLA also generates a fair amount of incoming traffic to the collections. For example, while all collections get referrals from the DPLA, harvesting to the DPLA is particularly useful for smaller collections such as Highlights from WCU, WCU—Fine Art Museum, and Civil War. Each of these collections gets 17 percent of its referrals from the DPLA, making the DPLA the largest referral source after search engines for the Highlights and Fine Art Museum collections. Graph 4 shows the referrals each collection receives via the DPLA as a percentage of its total referrals. This indicates the usefulness of harvesting to the DPLA. The data also suggest that monthly referrals from the DPLA increase the longer items have been in the DPLA (graph 5).

Graph 3. Percentage of search engine item referrals (Google, Bing, and Yahoo!)

Graph 4. Percentage of DPLA item referrals

Graph 5. Increase in DPLA referrals over time

Lastly, new and returning visitors to the collections were tracked as a marker of user interest in particular collections. Graph 6 shows data collected for new and returning visitors calculated as a proportion of the total number of visits for each collection.
Some smaller collections, such as Highlights from WCU, WNC Regional Maps, WCU—Fine Art Museum, and WCU—Oral Histories, score highly in terms of attracting return visitors (graph 6).

Graph 6. New and returning visitors

DISCUSSION

The aim behind gathering the data was to study usage of Hunter Library's digital collections and to examine the usefulness of harvesting in promoting use. Although the usage data logs were unable to shed much light on the actual usefulness of the collections to users, the logs provided information on volume of use, what materials were accessed, and where users were located. Analysis of the transaction logs indicates that while all collections likely benefitted from harvesting, Craft Revival, Cherokee Traditions, and Horace Kephart (collections focusing on the culture and history of Western North Carolina) were the most heavily used, and most visitors came from the state of North Carolina and from the region in particular. Search terms in the transaction logs also indicated a strong interest in items related to Cherokee culture and Horace Kephart. As Herold, who traced the second largest group of users of the Orang Asli digital image archive to Malaysia, notes, the geographic base of a collection's users can be indicative of the popularity of a subject area.22 Likewise, Matusiak asserts that users' comments can be indicative of the relevance of collections to users' needs and can provide direction for the future development of digital collections.23

As neither the Craft Revival, Cherokee Traditions, nor Horace Kephart collection includes items that relate specifically to the university's history (unlike the institution-specific collections mentioned earlier), it is possible that these collections' users are more representative of the larger public than of the university. These findings call into question the identification of an academic library's user base as mainly the students and faculty of the institution and raise the question of whether librarians should give greater consideration to the needs of a wider audience.24 Data supporting the existence of this user base, whose true import or preferences might not be captured in surveys and questionnaires, can serve as a valuable source of information for individuals responsible for building digital collections.

In an informal survey of Hunter Library faculty carried out by the Library's Digital Initiatives Unit in September 2014, respondents considered collections such as Craft Revival to be more useful to users external to the university. While the survey could allude to the nature of the user base of a collection like Craft Revival, it understandably could not capture the scale of the item views and referrals garnered by this collection as well as a usage data analysis could. On the other hand, the analysis of usage data, as demonstrated in this paper, indicated that certain collections (Highlights from WCU, WCU—Fine Art Museum, and WCU—Oral Histories) possibly served a niche audience. These smaller and more recently established collections consisting of university-created materials attracted more returning visitors (see graph 6).
These returning visitors were likely internal users whose visits indicated, as Fang points out, a loyalty to these collections.25

In "A Framework of Guidance for Building Good Digital Collections," the National Information Standards Organization Framework Advisory Group points out that while there are no absolute rules for creating quality digital collections, a good collection should include data pertaining to usage.26 The framework points to multiple assessment approaches, including a combination of observations, surveys, experiments, and transaction log analyses. As the WCU digital collections findings demonstrate, a careful analysis of the popularity of collections can indicate the need to balance quantitative data with more qualitative survey and interview data. These findings also indicate that usage data analysis can be very valuable in identifying the extent of collection usage by visitors who may not have significant survey representation. Results from the small (fewer than ten respondents) WCU survey indicate that some respondents question the institutional usefulness of collections such as Craft Revival. These results show the importance of taking multiple factors into account when assessing user needs and interests in digital collections.

CONCLUSION

The authors expect future projects to stem from this data analysis. For example, local subject fields based on the highest-recurring keywords mined from the transaction logs could be added to all of Hunter Library's digital collections. Usage statistics could then be evaluated at a later date to study whether the addition of user-generated keywords increased the use of any collection. As Matusiak points out in her article on the usefulness of user-centered indexing in digital image collections, social tagging, despite its lack of synonym control or misuse of the singular and plural, is a powerful form of indexing because of its "close connection with users and their language," as opposed to traditional indexing.27 The terms users assign to describe images are also the ones they are most likely to type while searching for digital images. Likewise, according to Walsh, a study conducted by the University of Alberta found that more than 40 percent of the collections reviewed used a locally developed classification for indexing and searching, and many of these schemes could work well for searches within a collection by users who are familiar with the culture of the collection.28

Usage data analysis can provide useful information that guides decisions for building digital collections that better serve user needs. It can identify a library's digital collections' users and what they want. These are important considerations to keep in mind if library services are to be about engaging and building relationships with users.29 Harvesting to a national portal such as the DPLA is beneficial for Hunter Library's collections. At the same time, the Library's institution-specific collections receive more return visits, likely because of sustained interest from the large user base of the university's students and employees, an assessment supported by the survey findings. Conversely, collections not so directly tied to the institution receive the most one-time item views and referrals.
Items that get used are a good indication of what users want, and, as this paper demonstrates, academic digital library collections should consider the needs of both the university audience and the general public.

REFERENCES

1. A landing page refers to the homepage of a collection.

2. The DPLA provides a single portal for accessing digital collections held by cultural heritage institutions across the United States. "History," Digital Public Library of America, accessed May 19, 2016, http://dp.la/info/about/history/.

3. Paul Betty, "Assessing Homegrown Library Collections: Using Google Analytics to Track Use of Screencasts and Flash-Based Learning Objects," Journal of Electronic Resources Librarianship 21, no. 1 (2009): 75–92, https://doi.org/10.1080/19411260902858631.

4. Ibid.

5. Wei Fang, "Using Google Analytics for Improving Library Website Content and Design: A Case Study," Library Philosophy and Practice (e-journal), June 2007, 1–17, http://digitalcommons.unl.edu/libphilprac/121.

6. Mark Baggett and Rabia Gibbs, "Historypin and Pinterest for Digital Collections: Measuring the Impact of Image-Based Social Tools on Discovery and Access," Journal of Library Administration 54, no. 1 (2014): 11–22, https://doi.org/10.1080/01930826.2014.893111.

7. Melanie Schlosser and Brian Stamper, "Learning to Share: Measuring Use of a Digitized Collection on Flickr and in the IR," Information Technology and Libraries 31, no. 3 (September 2012): 85–93, https://doi.org/10.6017/ital.v31i3.1926.

8. Mark R. O'English, "Applying Web Analytics to Online Finding Aids: Page Views, Pathways, and Learning about Users," Journal of Western Archives 2, no. 1 (2011): 1–12, http://digitalcommons.usu.edu/westernarchives/vol2/iss1/1.

9. Marcus Ladd, "Access and Use in the Digital Age: A Case Study of a Digital Postcard Collection," New Review of Academic Librarianship 21, no. 2 (2015): 225–31, https://doi.org/10.1080/13614533.2015.1031258.

10. Irene M. H. Herold, "Digital Archival Image Collections: Who Are the Users?" Behavioral & Social Sciences Librarian 29, no. 4 (2010): 267–82, https://doi.org/10.1080/01639269.2010.521024.

11. Mark A. Matienzo and Amy Rudersdorf, "The Digital Public Library of America Ingestion Ecosystem: Lessons Learned After One Year of Large-Scale Collaborative Metadata Aggregation," in 2014 Proceedings of the International Conference on Dublin Core and Metadata Applications (DCMI, 2014), 1–11, http://arxiv.org/abs/1408.1713.

12. Oksana L. Zavalina et al., "Extended Date/Time Format (EDTF) in the Digital Public Library of America's Metadata: Exploratory Analysis," Proceedings of the Association for Information Science and Technology 52, no. 1 (2015): 1–5, http://onlinelibrary.wiley.com/doi/10.1002/pra2.2015.145052010066/abstract.

13. Lisa Gregory and Stephanie Williams, "On Being a Hub: Some Details behind Providing Metadata for the Digital Public Library of America," D-Lib Magazine 20, no. 7/8 (July/August 2014): 1–10, https://doi.org/10.1045/july2014-gregory.

14. Kate Boyd, Heather Gilbert, and Chris Vinson, "The South Carolina Digital Library (SCDL): What Is It and Where Is It Going?" South Carolina Libraries 2, no. 1 (2016), http://scholarcommons.sc.edu/scl_journal/vol2/iss1/3.

15. Chris Freeland and Heather Moulaison, "Development of the Missouri Hub: Preparing for Linked Open Data by Contributing to the Digital Public Library of America," Proceedings of the Association for Information Science and Technology 52, no. 1 (2015): 1–4, http://onlinelibrary.wiley.com/doi/10.1002/pra2.2015.1450520100105/abstract.
16. A single view of an item in a digital collection.

17. Visits to the site that began from another site, with an item page being the first page viewed.

18. Keywords are words visitors used to find the Library's website when using a search engine. Google Analytics provides a list of these keywords.

19. A session is defined as a "group of interactions that take place on a website within a given time frame" and can include multiple kinds of interactions, such as page views, social interactions, and economic transactions. In Google Analytics, a session by default lasts thirty minutes, though one can adjust this length to last a few seconds or several hours. "How a Session Is Defined in Analytics," Google, Analytics Help, accessed May 20, 2016, https://support.google.com/analytics/answer/2731565?hl=en.

20. Locations were studied mostly in terms of cities and states.

21. The percentage is based on the total referral count a collection gets; for example, a 44 percent referral count for Cherokee Traditions would mean that search engines account for 44 percent of the total referrals this collection gets.

22. Herold, "Digital Archival Image Collections," 278.

23. Krystyna K. Matusiak, "Towards User-centered Indexing in Digital Image Collections," OCLC Systems & Services: International Digital Library Perspectives 22, no. 4 (2006): 283–98, https://doi.org/10.1108/10650750610706998.

24. Ladd, "Access and Use in the Digital Age," 230.

25. Fang points out that the improvements made to the Rutgers-Newark Law Library website could attract more return visitors and thus achieve loyalty. Fang, "Using Google Analytics for Improving Library Website," 11.

26. NISO Framework Advisory Group, A Framework of Guidance for Building Good Digital Collections, 2nd ed. (Bethesda, MD: National Information Standards Organization, 2004), https://chnm.gmu.edu/digitalhistory/links/cached/chapter3/link3.2a.NISO.html.

27. Matusiak, "Towards User-centered Indexing," 289.

28. John Walsh, "The Use of Library of Congress Subject Headings in Digital Collections," Library Review 60, no. 4 (2011), https://doi.org/10.1108/00242531111127875.

29. Lynn Silipigni Connaway, The Library in the Life of the User: Engaging with People Where They Live and Learn (Dublin: OCLC Research, 2015), http://www.oclc.org/research/publications/2015/oclcresearch-library-in-life-of-user.html.