Analytics and Privacy: Using Matomo in EBSCO’s Discovery Service

Denise FitzGerald Quintel and Robert Wilson

ARTICLES
INFORMATION TECHNOLOGY AND LIBRARIES | SEPTEMBER 2020
https://doi.org/10.6017/ital.v39i3.12219

Denise FitzGerald Quintel (denise.quintel@mtsu.edu) is Discovery Services Librarian and Assistant Professor, Middle Tennessee State University. Robert Wilson (robert.wilson@mtsu.edu) is Systems Librarian and Assistant Professor, Middle Tennessee State University. © 2020.

ABSTRACT

When selecting a web analytics tool, academic libraries have traditionally turned to Google Analytics for data collection to gain insights into the usage of their web properties. As the valuable field of data analytics continues to grow, concerns about user privacy rise as well, especially when discussing a technology giant like Google. In this article, the authors explore the feasibility of using Matomo, a free and open-source software application, for web analytics in their library’s discovery layer. Matomo is a web analytics platform designed around user-privacy assurances. This article details the installation process, makes comparisons between Matomo and Google Analytics, and describes how an open-source analytics platform works within a library-specific application, EBSCO’s Discovery Service.

INTRODUCTION

In their 2016 article from The Serials Librarian, Adam Chandler and Melissa Wallace summarized concerns with Google Analytics (GA) by reinforcing how “reader privacy is one of the core tenets of librarianship.”1 For that reason alone, Chandler and Wallace worked to implement and test Piwik (now known as Matomo) on the library sites at Cornell University.
Taking a cue from Chandler and Wallace, the authors of this paper sought out an analytics solution that was robust and private, that could easily work within their discovery interface, and that could provide the same data as their current analytics and discovery service implementation. This paper expands on some of the concerns from the 2016 Chandler and Wallace article, makes comparisons, and provides installation details for other libraries. Libraries typically use GA to support data-informed decisions or build discussions on how users interact with library websites. The goal of this pilot project was to determine the similarities between Google Analytics and Matomo, to assess how viable Matomo might be as a Google Analytics replacement, and to bring awareness to privacy concerns in the library. Matomo could easily be installed on multiple websites. However, this project looked into a specific instance of monitoring, that of the library’s discovery layer, EBSCO Discovery Service (EDS).

LITERATURE REVIEW

Google Analytics

The 2005 release of Google Analytics was a massive boon to libraries that had long searched for an easy-to-implement and budget-friendly tool for analytics. Shortly after its release, academic libraries were quick to adopt the platform and install its JavaScript code into their library web pages.2 In a little over a decade, nearly forty scholarly articles have been published that discuss the ways in which Google Analytics is used for libraries’ websites.
These articles not only introduced the service but also discuss the various ways libraries utilize the platform.3 In fact, in their 2018 survey of 279 libraries, O’Brien et al. found that 88 percent of libraries surveyed had implemented Google Analytics or Google Tag Manager.4 In contrast, during that same period, the authors found Matomo, or Piwik, as it was earlier known, discussed in a total of five scholarly articles, with only three libraries writing about using it as a web analytics tool.5

In addition to measuring website use, libraries found that Google Analytics allowed for several different assessments. In using Google Analytics, libraries could provide immediate feedback for projects, indicate website design change possibilities, create key performance indicators, and determine research paths and user behaviors.6 Convenience of implementation and use, minimal cost, and a user-friendly interface were all reasons cited for the widespread and fast adoption.7 Although the early literature covers a lot of ground about the reporting possibilities and the coverage of Google Analytics, there is rarely a mention of user privacy. Early articles that mention privacy provide a cursory discussion, reiterating that the data collected by Google is anonymous, which therefore protects the privacy of the user.

Recently, there has been a shift in the literature, with articles that now provide more in-depth discussions about user privacy and the concerns libraries have with third parties that collect and host user data. O’Brien et al. discussed the problematic ways that libraries adopted and implemented GA, by either overlooking front-facing policies or implementing it without the consent of their users.8 In their webometrics study, O’Brien et al. found that very few libraries (1 percent) had implemented HTTPS with the GA tracking code, only 14 percent had used IP anonymization, and not a single site utilized both features.9 The concern lies not solely in Google’s control of the data but also in Google’s involvement with third-party trackers. Third parties, as Pekala remarks, are rarely held accountable.10 It is important to remember that Google is an advertising company: its advertising revenue of $134 billion in 2019 represented 84 percent of its total revenue.11 Google's search engine monetization transformed it into one of the world's most recognizable brands. As the most visited site in the world, Google is firmly committed to security, especially when it comes to data theft. Google offers protection from unwanted access into user accounts, even providing ways for high-risk users, such as journalists or political campaigns, to purchase additional security keys for advanced protection.12 But while Google keeps data breaches and hackers at bay, the user data that Google collects and stores for advertising revenue tells a different story. Google stores user data for months on end; only after nine months is advertisement data anonymized by removing parts of IP addresses. Then, after 18 months, Google will finally delete stored cookie information.13

Recent surveys are reporting an increase in users who want to know how companies are collecting information to provide data-driven services. In a 2019 Pew Research Survey, 62 percent of respondents believed it is impossible to go through their daily lives untracked by companies.
Additionally, even with the ease that certain data-driven services bring, “81 [percent] of the public reported that the potential risks they face because of data collection by companies outweigh the benefits.”14 CISCO Technologies, in a 2019 personal data survey, found a segment of the population (32 percent) that not only cares about data privacy and wants control of their data, but has also taken steps to switch providers, or companies, based on their data policies.15 Additionally, in Pew Research Survey results published as recently as April 2020, Andrew Perrin reports that an even larger number of U.S. adults (52 percent) are now choosing to not use products or services out of concerns for their privacy and the personal information companies collect.16 With a growing population of users who make inquiries about who, or what, is in control of their data, a web analytics tool that can easily answer those questions might serve libraries, and their users, well.

COMPARISONS

Google Analytics had been the library’s only web analytics tool until the start of the pilot project. During the pilot period, the authors ran both analytics tools simultaneously. Once Matomo was installed, the authors found several similarities between the two products and discovered that nearly identical analyses could occur, given the quality and quantity of the data collected. The pilot study focused on only one analytics project: the library’s discovery layer, EBSCO’s Discovery Service. The authors worked with their dedicated EBSCO engineer to replicate the Google Analytics EDS widget and have it configured to send output to Matomo instead. One of the common statements in comparisons of GA and Matomo is that the numbers will never match exactly, with GA oftentimes presenting much higher counts than Matomo.
Several forums and blogs, and even Matomo itself, acknowledge several possible reasons for the noticeable difference between the two.17 Those involved in the discussion theorize that the difference is due to GA spam hits, bot hits, and the ability Matomo gives users to limit tracking. Beyond the counts, both products measure the same kinds of metrics for websites.18 For this project, the authors only wanted to look at specific metrics within EDS, those measurements that look more closely at the user, rather than the larger aggregate data. For the sake of the analysis, it is important to note that, although both products have many strong features, this project is a specific situation in which the researchers use only certain analytics features. The analytics we collect for EDS strive to answer specific questions:

• Are users conducting known-item or exploratory searches? How often?
• Are users utilizing the facets and limiters? How often?

Although you can use both products to count page views or set events for your website, when looking at meaningful metrics for our discovery system, we focus more on the user level. In Google Analytics, the best way to capture these is by going through the User Explorer tool, which breaks up a user journey into search terms, events, and actions that occur during sessions. In the same way, Matomo provides anonymized user profiles that include search terms, events, and actions in its Visits Log report. In GA, you can export this User Explorer data in JSON format, but only one user at a time, as seen in figure 1. This restriction also means you cannot see data from multiple users, with those details, on a single page. In contrast, in Matomo’s Visits Log, you can export the same data (search terms, events, actions) from multiple users in CSV, XML, PHP, TSV, JSON, or HTML formats.
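The multi-user export described above can also be requested outside the dashboard through Matomo's Reporting API. The following is a minimal sketch, not the authors' workflow: it only builds the request URL for the `Live.getLastVisitsDetails` method, which returns visit-level records of the kind shown in the Visits Log. The host name, site ID, and token are placeholders, and the list of accepted formats is simply copied from the formats named in this article; a given Matomo version may accept a different set.

```python
from urllib.parse import urlencode

# Formats named in the article; treat this set as an assumption,
# since the formats accepted can vary by Matomo version.
EXPORT_FORMATS = {"csv", "xml", "php", "tsv", "json", "html"}


def visits_log_export_url(base_url, site_id, token, fmt="csv",
                          period="day", date="yesterday", limit=100):
    """Build a Reporting API URL that exports visit-level details.

    base_url, site_id, and token are placeholders for a local instance.
    """
    fmt = fmt.lower()
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    params = {
        "module": "API",
        "method": "Live.getLastVisitsDetails",
        "idSite": site_id,
        "period": period,
        "date": date,
        "format": fmt,
        "filter_limit": limit,   # maximum number of visits returned
        "token_auth": token,     # keep this secret; it grants API access
    }
    return f"{base_url.rstrip('/')}/index.php?{urlencode(params)}"


# Hypothetical instance and token, for illustration only.
print(visits_log_export_url("https://matomo.example.edu", 2, "TOKEN", fmt="json"))
```

Because the token travels with the request, a script like this should only call the instance over HTTPS.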
As seen in figure 2, Matomo offers a snapshot of this data on an easy-to-read single page, whereas Google’s one-user-at-a-time option requires clicking through to see each user report.

Figure 1. Screenshot of the Google Analytics User Explorer Tool

Figure 2. Screenshot of the Matomo Visits Log Report

In summary, libraries using either of these analytics tools can measure usage and users with page views, visits, and unique visitors. Looking at how users navigate a site is possible with the available user paths, from the initial search, to events (as seen in figures 3 and 4), to an exit page URL. Goals can be set and maintained with conversion metrics tied to referrers, visits, user location, devices, or user attributes. Like Google Analytics, Matomo can run reports on engagement and performance, and share customizable, user-friendly graphs or other visual representations.

Figure 3. Peer Reviewed Limiter as Event Action in Google Analytics

Figure 4. Peer Reviewed Limiter Use as Event Name in Matomo

Comparisons on Privacy

Both Google Analytics and Matomo offer ways to protect the privacy of your users. Both offer IP anonymization and the option for data deletion after a certain time, and both provide a Do Not Track feature for users. It is important to note the way Google offers these adjustments to the user.
For Matomo, Do Not Track is a default behavior, meaning that the tracker automatically honors a browser’s Do Not Track setting. This is not always the case elsewhere, as respecting the Do Not Track browser setting is voluntary for websites, not mandatory.19 Google Analytics offers a comparable opt-out, as long as the user implements it through a browser extension.20 IP anonymization and data deletion are features that Matomo users can adjust easily from the dashboard, whereas Google Analytics users will need to make those adjustments programmatically.21

In Matomo, you can choose to automatically delete your old visitor logs from the database, although Matomo recommends keeping detailed Matomo logs for three to six months and then deleting the older log data.22 In contrast, a Google Analytics user makes a data deletion request to Google, which then creates a report for the user’s review before the request is submitted. Even after a request is submitted, Google still allows seven days to reverse that decision. In terms of data retention, Google Analytics gives you the option to retain user data anywhere from 14 months to 50 months, or to never have it expire. Fourteen months is the shortest period for which you can retain user data.23

IP anonymization is the default for Matomo analytics but is an opt-in feature for Google Analytics. Again, like data retention, any adjustments to IP anonymization in Matomo can occur in the dashboard, with options to have two or three bytes removed from the address. Google Analytics will adjust the last octet to zero.24 Both products are similar in several ways, but the standout feature of Matomo is that the data belongs only to your institution.
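To make the byte-masking options above concrete, the sketch below zeroes the trailing bytes of an IPv4 address. It illustrates the general technique only and is not Matomo's or Google's actual code; the function name and the sample address are hypothetical.

```python
import ipaddress


def mask_ipv4(address: str, bytes_to_mask: int) -> str:
    """Zero the last `bytes_to_mask` bytes of an IPv4 address.

    Masking 1 byte mirrors setting the last octet to zero; masking
    2 or 3 bytes mirrors the stronger dashboard options described above.
    """
    if not 0 <= bytes_to_mask <= 4:
        raise ValueError("bytes_to_mask must be between 0 and 4")
    ip = ipaddress.IPv4Address(address)
    # Build a 32-bit mask whose low-order bytes are zero.
    mask = (0xFFFFFFFF << (8 * bytes_to_mask)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ip) & mask))


# Hypothetical visitor address, for illustration only.
for n in (1, 2, 3):
    print(n, mask_ipv4("192.168.100.25", n))
```

The more bytes removed, the larger the pool of possible visitors behind each stored address, at the cost of coarser location reporting.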
In his interview with Katherine Schwab for Fast Company, Mathieu Aubry, Matomo’s founder, states it clearly:

“When [Google] released Google Analytics, [it] was obvious to me that a certain percent of the world would want the same technology, but decentralized, where it’s not provided by a centralized corporation and you’re not dependent on them… If you use it on your own server, it’s impossible for us to get any data from it.”25

IMPLEMENTATION AND INSTALLATION

Originally released as Piwik in 2007, Matomo was designed as a replacement for phpMyVisites.26 It is an open-source software application licensed under GNU GPL v3.27 It is designed as a PHP/MySQL application, allowing the server operating system (OS) and web service to best match a user’s needs or institutional preferences and expertise.28 To match the organization’s preferences and expertise, this Matomo instance was set up as a Linux-Apache-MySQL/PHP (LAMP) stack server (CentOS 7 in our case) with Apache 2.4.6 and MySQL-MariaDB 5.5.60. The configurations required to run Matomo are well documented on the Matomo documentation site as well as in the download and documentation area. Depending on the version of Matomo, the mileage a user gets with the documentation may vary. For example, on the recent upgrade to 3.11.0, the instance displayed a warning notification that PHP v7.0 had reached end of life and recommended updating to PHP v7.1 or greater to accommodate future Matomo versions. However, at the time of this writing, the minimum PHP version stated in Matomo’s documentation is 5.5.9 or greater.29

Like many PHP applications, once the prerequisite applications are installed (PHP, MySQL, and the selected web service, Apache in this case), the Matomo install is completed by browsing to the server’s URL or IP address on port 80. Browsing to the index.php path in a web browser will guide a user through the install process.
The installer will also review file directories on the server and inform a user of any permissions problems that will need to be addressed for correct install and use. Compared to other PHP application install experiences, installing Matomo was straightforward and easier to follow than many. Within a few minutes, the admin user was created and the first website was added.

The web-based administration area is also more robust and easier to use than many comparable applications. Many features that might typically require configuration file changes directly on the server, including Matomo upgrades, can be configured through the administration area. While the administration page has many options relating to paid-for premium features, there are several particularly helpful free configuration cards in the interface. Most notable is the “System Summary” card that displays the current version of Matomo, PHP, and MySQL as well as total users, segments, goals, tracking failures, total websites configured, and a few other metrics. There is a “Tracking Failures” card that notifies of issues with websites, and a “Need Help?” card that links to the Matomo Community forums. Finally, the “System Check” card displays any warnings or errors as well as a link to the full system check report. This is extremely helpful when Matomo has been installed but the instance still needs additional configuration changes or follow-up tasks on upgrades. If there are warnings or errors, the full system report will often have recommendations of changes to make either in the administration page or on the server in the configuration files.

These administration features make maintenance a straightforward process. Since setting up the server, two upgrades have been completed. In both cases, an email notification was received indicating a new stable release was available.
On login to Matomo, this information also appeared as a banner. Simply clicking on the download update option automatically updated the service without any need to access the server directly or via SSH. The updates ran smoothly, with one exception: during one of them, several files were created or overwritten with the root user as the owner. As a result, Matomo indicated an issue with the files and/or path not being found. In actuality, the files did exist, but Matomo no longer had permission to read them. Resolution of the problem required browsing to the directory path indicated in a warning on the server and changing ownership from the root user to the apache user to match other files. Despite this issue, the update process is much more user-friendly than that of similarly structured applications.

Standalone implementation and installation of Matomo is made simple by the installation documentation that is readily available on the Matomo.com website, especially if one is familiar with PHP/MySQL applications. Adding one or two websites whose architectures a new Matomo user is well acquainted with is a good way for new users to pilot and get introduced to Matomo’s overall functions without being so overwhelmed that the more granular functions are never learned. A system admin may find maintenance and updates to this service less problematic, with less interruption of the service, than with similarly structured applications, while users may find the overall functionality of Matomo easier to use and the finer points of reporting and analytics more transparent and easier to understand than in Google Analytics.

Once Matomo was installed, the authors tested it on a low-traffic library site. After tracking proved successful, EDS was entered as a new website in the Matomo dashboard and the JavaScript tracking tag was placed in the bottom branding of EDS. The process of adding EDS as a new site to Matomo was as easy as expected, and the data collection was almost immediate.
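Once a site is registered, each hit reaches Matomo as an HTTP request to its tracking endpoint, whether it is sent by the JavaScript tag or by another client. The sketch below is not the EBSCO widget; it only builds the request URLs that Matomo's Tracking HTTP API accepts for an event (such as the limiter use shown in figure 4) and for a site search. The host names, site ID, and the category and action labels are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def tracking_url(base_url, site_id, page_url, **extra):
    """Build a Tracking HTTP API request URL for one hit."""
    params = {"idsite": site_id, "rec": 1, "apiv": 1, "url": page_url}
    params.update(extra)
    return f"{base_url.rstrip('/')}/matomo.php?{urlencode(params)}"


def event_hit(base_url, site_id, page_url, category, action, name=None):
    """An event hit: e_c = category, e_a = action, e_n = optional name."""
    extra = {"e_c": category, "e_a": action}
    if name is not None:
        extra["e_n"] = name
    return tracking_url(base_url, site_id, page_url, **extra)


def search_hit(base_url, site_id, page_url, keyword, result_count=None):
    """A site-search hit: search = keyword, search_count = results shown."""
    extra = {"search": keyword}
    if result_count is not None:
        extra["search_count"] = result_count
    return tracking_url(base_url, site_id, page_url, **extra)


def decode(hit_url):
    """Return the query parameters of a hit as a plain dict, for inspection."""
    return {k: v[0] for k, v in parse_qs(urlsplit(hit_url).query).items()}


# Hypothetical instance, site ID, and labels, for illustration only.
hit = event_hit("https://matomo.example.edu", 2,
                "https://eds.example.edu/results",
                category="Limiter", action="Applied", name="Peer Reviewed")
print(decode(hit))
```

Keeping the event name consistent across hits is what allows a report like the one in figure 4 to count each limiter separately.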
To mirror the EDS and Google Analytics integration, the authors worked with their EBSCO Library Service Engineer to create a Matomo widget. Luckily, another engineer had previously worked on an integration when Matomo was still known as Piwik. Instead of being built from the ground up, the Piwik widget only needed its code cleaned up and updated to match the Google Analytics widget, which would allow for the tracking of events and site searches. Adding a user outside of the organization to Matomo was necessary for the EBSCO engineer to fine-tune the widget. Matomo admins can set up users with specific permissions within the system, with access to only a specific site. Each Matomo user has their own email address and password (not domain-specific) and their own settings, and users can even customize their dashboard. After testing proved successful, the new Matomo widget moved into the live profile of EDS, and data collection commenced.

SECURITY

Though the service is in a pilot stage with limited data collection, the authors wanted to ensure an SSL certificate was in place for login to Matomo. With EFF’s Certbot (https://certbot.eff.org/), the authors installed a Let’s Encrypt (https://letsencrypt.org/) SSL certificate. The SSL certificate is automatically renewed every three months via a cronjob on our server. Because of the power of the administration interface, caution should be used when assigning the “Super User” role to user accounts. It would also be wise to require two-factor authentication (2FA) on the service. Turning on 2FA is a very simple process, and Matomo works with multiple third-party authentication utilities including Authy, LastPass, and 1Password. While each user can choose to activate 2FA, an admin can require it for all users if desired.
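Because the certificate renewal described above runs unattended, a small check can confirm that the cronjob is keeping up. The following is a sketch of the date arithmetic only, under two assumptions that are not from the authors' setup: a 90-day certificate lifetime, as Let’s Encrypt has commonly issued, and a 30-day renewal threshold mirroring Certbot's long-standing default. It does not read a real certificate; the dates are made up.

```python
from datetime import datetime, timedelta, timezone


def days_remaining(not_after, now=None):
    """Whole days left before a certificate's expiry timestamp."""
    now = now or datetime.now(timezone.utc)
    return (not_after - now).days


def renewal_due(not_after, now=None, threshold_days=30):
    """True when fewer than `threshold_days` remain (assumed threshold)."""
    return days_remaining(not_after, now) < threshold_days


# Made-up issue date and an assumed 90-day lifetime, for illustration only.
issued = datetime(2020, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
print(renewal_due(expires, now=issued + timedelta(days=45)))
print(renewal_due(expires, now=issued + timedelta(days=70)))
```

A check like this is only a safety net; the renewal itself remains the job of the scheduled Certbot run.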
CONCLUSION

As the amount of research and rate of adoption testify, since 2005 GA has set the benchmark for assessment of library web asset success and has made possible a completely new understanding of the library user experience and overall assessment of library services. Matomo’s earliest iteration appeared shortly after, in 2007, and it is a viable alternative to proprietary web analytics applications, with a few notable advantages over GA. From a long-term perspective, the two biggest advantages of Matomo are that it is licensed under a copyleft GPL free and open source software (FOSS) license and that it is designed with user privacy at heart. For libraries, using FOSS applications whenever possible allows them to practice what they preach. FOSS does not mean cost-free. In fact, free in the FOSS sense is more akin to freedom (freedom to download, modify, distribute, and change the code) rather than free of charge. Budgeting for a hosted subscription, support, or the costs of a library running and maintaining the application itself or through an Infrastructure as a Service (IaaS) provider like Amazon Web Services (AWS) or Microsoft’s Azure is necessary, but the freedom Matomo provides by ensuring that the library is in control of its patron data, that the data is protected, and that it is not at risk of becoming a product in and of itself may well be worth the cost. Like other initiatives in the open-access movement or open-education resources, and as third-party data collection and privacy on the web become more mainstream concerns, opting to use Matomo to protect patron privacy principles allows libraries to be the leaders on issues relating to privacy and intellectual freedom. As noted earlier, there are other feature-based advantages Matomo provides that impact the day-to-day aspects of monitoring web asset use and assessment, like export options and viewing the full log of visits.
Lastly, by focusing on EDS in this pilot, the authors were able to demonstrate and verify that Matomo rises to the challenge not just of traditional web asset analytics requirements but also of library-specific applications like proprietary discovery layer services.

ENDNOTES

1 Adam Chandler and Melissa Wallace, “Using Piwik Instead of Google Analytics at the Cornell University Library,” Serials Librarian 71, no. 3 (October 2016): 174, https://doi.org/10.1080/0361526X.2016.1245645.

2 Tabatha Farney and Nina McHale, “Introducing Google Analytics for Libraries,” Library Technology Reports 49, no. 4 (May 2013): 5, https://journals.ala.org/ltr/article/download/4269/4881.

3 Paul Betty, “Assessing Homegrown Library Collections: Using Google Analytics to Track Use of Screencasts and Flash-Based Learning Objects,” Journal of Electronic Resources Librarianship 21, no. 1 (2009): 75–92, https://doi.org/10.1080/19411260902858631; Jason D. Cooper and Alan May, “Library 2.0 at a Small Campus Library,” Technical Services Quarterly 26, no. 2 (2009): 89–95, https://doi.org/10.1080/07317130802260735; Stephan Spitzer, “Better Control of User Web Access of Electronic Resources,” Journal of Electronic Resources in Medical Libraries 6, no. 2 (2009): 91–100, https://doi.org/10.1080/15424060902931997; Julie Arendt and Cassie Wagner, “Beyond Description: Converting Web Site Usage Statistics into Concrete Site Improvement Ideas,” Journal of Web Librarianship 4, no. 1 (2010): 37–54, https://doi.org/10.1080/19322900903547414; Steven J. Turner, “Website Statistics 2.0: Using Google Analytics to Measure Library Website Effectiveness,” Technical Services Quarterly 27, no. 3 (2010): 261–78, https://doi.org/10.1080/07317131003765910; Gail Herrera, “Measuring Link-Resolver Success: Comparing 360 Link with a Local Implementation of WebBridge,” Journal of Electronic Resources Librarianship 23, no. 4 (2011): 379–88, https://doi.org/10.1080/1941126X.2011.627809; Wayne Loftus, “Demonstrating Success: Web Analytics and Continuous Improvement,” Journal of Web Librarianship 6, no. 1 (2012): 45–50, https://doi.org/10.1080/19322909.2012.651416; Tabatha A. Farney, “Click Analytics: Visualizing Website Use Data,” Information Technology & Libraries 30, no. 3 (2011): 141–8, https://doi.org/10.6017/ital.v30i3.1771.

4 Patrick O’Brien et al., “Protecting Privacy on the Web: A Study of HTTPS and Google Analytics Implementation in Academic Library Websites,” Online Information Review 42, no. 6 (2018): 734–51, https://doi.org/10.1108/OIR-02-2018-0056.

5 Junior Tidal, “Using Web Analytics for Mobile Interface Development,” Journal of Web Librarianship 7, no. 4 (2013): 451–64, http://doi.org/10.1080/19322909.2013.835218; Ramiro Federico Uviña, “Bibliotecas Y Analítica Web: Una Cuestión De Privacidad = Libraries and Web Analytics: A Privacy Matter,” Información, Cultura Y Sociedad no. 33 (2015): 105–12, http://revistascientificas.filo.uba.ar/index.php/ICS/article/view/1906; Sukumar Mandal, “Site Metrics Study of Koha OPAC through Open Web Analytics and Piwik Tools,” Library Philosophy and Practice (2019), https://digitalcommons.unl.edu/libphilprac/2835; Mohammad Azim and Nabi Hasan, “Web Analytics Tools Usage among Indian Library Professionals,” 2018 5th International Symposium on Emerging Trends and Technologies in Libraries and Information Services (2018): 31–35, https://doi.org/10.1109/ETTLIS.2018.8485212.

6 Ian Barba et al., “Web Analytics Reveal User Behavior: TTU Libraries’ Experience with Google Analytics,” Journal of Web Librarianship 7, no. 4 (2013): 389–400, https://doi.org/10.1080/19322909.2013.828991.
7 Betty, “Assessing Homegrown Library Collections.”

8 O’Brien et al., “Protecting Privacy on the Web,” 734.

9 O’Brien et al., “Protecting Privacy on the Web,” 741.

10 Shayna Pekala, “Privacy and User Experience in 21st Century Library Discovery,” Information Technology & Libraries 36, no. 2 (2017): 50, https://doi.org/10.6017/ital.v36i2.9817.

11 J. Clement, “Advertising Revenue of Google from 2001 to 2019,” Statista, February 5, 2020, https://www.statista.com/statistics/266249/advertising-revenue-of-google; Lily Hay Newman, “The Privacy Battle to Save Google From Itself,” Wired, November 1, 2018, https://www.wired.com/story/google-privacy-data/; Ben Popken, “Google Sells the Future, Powered by Your Personal Data,” NBC News, May 10, 2018, https://www.nbcnews.com/tech/tech-news/google-sells-future-powered-your-personal-data-n870501; Richard Graham, “Google and Advertising: Digital Capitalism in the Context of Post-Fordism, the Reification of Language, and the Rise of Fake News,” Palgrave Communications 3, no. 45 (2017): 2–4, https://doi.org/10.1057/s41599-017-0021-4.

12 “Google Advanced Protection Program,” Google, https://landing.google.com/advancedprotection/.

13 “Google Privacy and Terms, Advertising,” Google, https://policies.google.com/technologies/ads?hl=en-US.

14 Brooke Auxier et al., “Americans and Privacy: Concerned, Confused and Feeling Lack of Control Over Their Personal Information,” November 15, 2019, Pew Research, https://www.pewresearch.org/internet/wp-content/uploads/sites/9/2019/11/Pew-Research-Center_PI_2019.11.15_Privacy_FINAL.pdf.

15 “Consumer Privacy Survey,” November 2019, CISCO, https://www.cisco.com/c/dam/en/us/products/collateral/security/cybersecurity-series-2019-cps.pdf.

16 Andrew Perrin, “Half of Americans Have Decided Not to Use a Product or Service Because of Privacy Concerns,” Pew Research, April 14, 2020, https://www.pewresearch.org/fact-tank/2020/04/14/half-of-americans-have-decided-not-to-use-a-product-or-service-because-of-privacy-concerns/.

17 “Matomo vs. Google Analytics 360,” Matomo.org, https://matomo.org/matomo-vs-google-analytics comparison; Lemon, “A Comparison of Data: Piwik vs. Google Analytics,” The FPlus (blog), November 30, 2016, https://thefpl.us/wrote/about-piwik; Himanshu Sharman, “Best Google Analytics Alternatives in 2020: Matomo & Piwik Pro,” OptimizeSmart (blog), March 30, 2020, https://www.optimizesmart.com/introduction-to-piwik-best-google-analytics-alternative.

18 “Matomo vs. Google Analytics 360,” Matomo.org.
19 Ryan Singel, “Google Holds Out Against ‘Do Not Track’ Flag,” Wired, April 15, 2011, https://www.wired.com/2011/04/chrome-do-not-track; Kieren McCarthy, “Do Not Track Is Back in the US Senate,” The Register, May 20, 2019, https://www.theregister.co.uk/2019/05/20/do_not_track; “How Do I Turn on the Do Not Track Features?,” Mozilla, https://support.mozilla.org/en-US/kb/how-do-i-turn-do-not-track-feature.

20 “Google Analytics Opt-Out Browser Add-On,” Google, https://support.google.com/analytics/answer/181881.

21 “IP Anonymization,” Google, https://developers.google.com/analytics/devguides/collection/analyticsjs/ip-anonymization.

22 “Managing Your Database’s Size,” Matomo.org, https://matomo.org/docs/managing-your-databases-size/ - deleting-old-unprocessed-data.

23 “Data Retention,” Google, https://support.google.com/analytics/answer/7667196?hl=en&ref_topic=2919631.

24 “IP Anonymization,” Google.

25 Katherine Schwab, “It’s Time to Ditch Google Analytics,” Fast Company, February 1, 2019, https://www.fastcompany.com/90300072/its-time-to-ditch-google-analytics.

26 “Matomo and phpMyVisites,” Matomo.org, https://matomo.org/faq/general/faq_437.

27 “Licenses,” Matomo.org, https://matomo.org/licences.

28 “Matomo (software),” Wikipedia, https://en.wikipedia.org/wiki/Matomo_(software).

29 “Matomo Requirements,” Matomo.org, https://matomo.org/docs/requirements.