Finding Science and Engineering Specific Data Set Usage or Funding Acknowledgements

Issues in Science and Technology Librarianship, Summer 2013
DOI:10.5062/F4CV4FP0

Ann Coppin
Information Science Specialist
Jet Propulsion Laboratory, California Institute of Technology
Pasadena, California
ann.s.coppin@jpl.nasa.gov

Copyright 2013, California Institute of Technology. Used with permission.

Many scholarly articles have a section called Acknowledgement(s) just before the list of references. The Acknowledgements section in scientific publications has become common and grown in importance since the 1960s. This section may be used to acknowledge the contributions of people who are not considered authors, the source(s) of funding for the research, or anything else the author(s) want to acknowledge about the research behind the publication. As interest has grown in various publication metrics, information in the Acknowledgements section may be useful for creating a particular metric or identifying publication patterns. But how can specific information in the Acknowledgements section be found and analyzed, particularly if funding sources or data sources are sought?

The purpose of this paper is to briefly review the pertinent publications about studying acknowledgements and then discuss how to find funding or source data information. It concludes with three examples of actual searches for acknowledgement metrics to support the value of a particular project.

Articles which discuss using acknowledgement information are not numerous. Most of the articles that exist focus upon how the acknowledgement section of a paper shows the social interactions or collaborations of people. Research writing is considered a social enterprise (Hyland and Salager-Meyer 2008). Blaise Cronin and colleagues are pioneers in looking for patterns in acknowledgements.
Their early discussion of classification schemes for acknowledgements included funding and access to data but focused upon the help given by individuals not considered authors, or as it is called, "peer interactive communication" (Cronin et al. 1992). Acknowledgements are seen by Cronin and others as expressions of peer interactive communication (Cronin and Weaver-Wozniak 1993; Davis and Cronin 1993). But the understanding of the social significance of acknowledgement is incomplete. An online Acknowledgements Index was proposed (Cronin and Weaver-Wozniak 1993). An exploration of university scholars' acknowledgement behavior and perceptions of acknowledgement practice focused upon acknowledgements as a social communication (Cronin and Overfelt 1994).

The focus of Cronin's work is on the social exchange in the conduct of scholarship, the peer interactive communication (PIC). He points out that the format for acknowledgements varies by field, that the use of an acknowledgements section in scholarly publications has grown, and that such a section is now a well-established practice. Cronin lists six acknowledgement categories including Paymaster (i.e., funding) and Moral Support, which includes access to data. The other four categories are Dogsbody (peer interactive communication), Technical (help of individuals), Prime Mover (source of inspiration, mentor, etc.) and Trusted Assessor (peer feedback, critical analysis). These last four are different aspects of interactions between people. Cronin made a summary reference to a 1991 study of astronomy acknowledgements which mentions that only about a quarter of the acknowledgements were for providing access "to data, theoretical models, and computer codes" (Cronin 1995). A 2004 study mentions funding agency support as a category of acknowledgements, but the focus of the study's results is upon the people/social aspects. In its data table the Financial (i.e.
funding) was mentioned in 46% of the acknowledgments (Cronin et al. 2004).

Another study, which focused upon automatic indexing of acknowledgements, noted that many sources of research funding expect researchers to acknowledge support (Giles and Councill 2004). The study looked for identifiable acknowledgement passages and noted that most acknowledgements in research papers are found in acknowledgement sections. However, the authors pointed out that "acknowledgment passages may also be found in unmarked sections, within a document header, or within footnotes." They found that approximately 56% of the papers sampled contained acknowledgements. They included funding agencies as one of their categories (Giles and Councill 2004).

A recent paper focuses upon the "Reward Triangle" of authorship, citedness, and acknowledgements. It considers financial support and peer interactive communication the most important categories from the bibliometric point of view (Costas and van Leeuwen 2012). The authors observe that publications with a funding acknowledgement show a higher impact than publications without one. The article looked at broad disciplines and found funding acknowledgements in 43% of the publications analyzed.

None of these publications touched upon finding funding or data acknowledgements for specific projects. They all looked at either defined sets of publications or broad categories of publications to see how acknowledgements were being used. While studying acknowledgements for bibliometrics in broad categories is useful for general purposes, what is the value for someone evaluating the use of specific funds, or for a principal investigator wanting to understand the use of data made publicly available? A data set provider or a funder of a data set will find it useful to know who is using their data or funds, and what publications resulted from that use.
The following are a few reasons that this form of "citation search" for data origin may be helpful:

- It validates that the creator of the data has received proper credit for the data.
- It promotes the value of the data set by showing who is using it and how.
- It is something data curators/originators can point to as a justification for the time and money spent in creating and maintaining the data.
- It helps map a network of research interests: others who have done research using the same data may be future partners.
- Citation analysis on these publications (who has cited papers that cite your data) may also be useful in determining the larger impact of the data.
- It gives the creator a broader understanding of the data set by making it possible to analyze its use and any problems with the data that have been documented in the public literature.

Identifying who has used a particular dataset or received funding is not easy to do, but it can be done. Comprehensive bibliographies are not possible due to limitations in the papers themselves. Article authors may not mention the dataset or the funder. If authors do make acknowledgements, there is no set format for them. Acknowledgement sections of articles mostly mention people. Funding sources may also appear in these sections, but the ways the sources are referenced vary widely and often do not match the "required" wording. Funding sources may be listed by the funding program name or acronym; sometimes only a grant number is given; at times the name or acronym of the parent organization is given instead of the specific funding group.

Also, the tools available for searching are limited. The Web of Science started adding funding information to records in 2008. However, because authors have no standard way of citing funding, retrieving articles using the provided information is very 'hit or miss'. There is also no standard for citing the data source.
With the growing requirements by research funders to have data management plans and make data accessible to more researchers, there is growing interest in how to cite data (Parsons et al. 2010; Raymond et al. 2012; McCallum et al. 2012; Mayernik et al. 2012; Ruf 2011). There has been no good equivalent to Science Citation Index for data. Thomson Reuters announced in October 2012 a new Data Citation Index. But currently full-text searching is still the best way to locate data sets or funding information. The drawback is that it cannot be used without access to the full text. So search strategies must be thought out in detail. It is important to let the customer know the limits of this type of search.

Some general observations about acknowledging scientific data are:

- There is no standardized format for acknowledgements or for the location of the acknowledgement.
- Publications most frequently have an Acknowledgements section just before References. However, what usually appears here are the names of people who are not considered authors and do not have a publication cited in the paper. Funding information may also appear in this section if it is given. Funding information may be simply a generic mention such as NASA or NSF, or just the contract number.
- Scientists do not always acknowledge the source of data.
- Scientists, when acknowledging the source of data or funding, frequently do not use the prescribed wording in the 'Acknowledgements' section. It is necessary to get the requestor to give the prescribed wording, any acronyms that might be used, and any contract numbers if that level of comprehensiveness is wanted.
- Source of data, when mentioned, is often found in the Data/Methodology section.
- Web of Knowledge has added funding information. For recent papers they seem to summarize the information in the Acknowledgements section.
For retrospective addition of this information to older records, they seem to have searched for contract numbers preceded by "Work performed by...," "Research supported by..." or similar phrases. In a 1965 science paper this information was found in a footnote tied to an article title. A 1981 Nature paper had a simple statement at the end of the discussion of results that was picked up by their process.

Searching Google Scholar or Google with the most frequently used form of the desired data/funding source is frequently the most productive way to find publications. If a source is not harvested by Google, then the best way to locate the desired information is to do a full-text search of the source's digital collection.

The following are synopses of three requests received in the past five years to identify papers that cite specific science data sets or funding. Brief descriptions are given of how the requested information was found. An analysis of the results of the most recent effort gives more detail on where data usage acknowledgement is found.

Request 1. Use of NASA Time on the Keck Telescope during a specified set of years.

Purpose: The NASA Keck Users' Bibliography of peer reviewed articles needed to be updated. This bibliography was posted on the Michelson Science Center web site and was also to be used to show the use of NASA allotted time on the telescope. The list of scientists and the period when time was assigned was provided by the requestor. Also provided was the specified wording that scientists were supposed to use to acknowledge NASA time when publishing papers based upon the observations.

Observations: Scientists frequently do not comply with acknowledgement requirements. Either there is no mention in the Acknowledgements section or the wording varies significantly from the prescribed wording. Astronomers have a fairly rigid format for publications with a required section to describe the data used.
Method to produce the requested bibliography: The ADS and Web of Science databases were searched for publications by the scientists since the last compilation of the bibliography. Astronomers usually have a labeled section to describe the observational data and its source. Reading this section to match the observation time listed against the assigned period was key to retrieving pertinent articles. There could be several years from time of observation to time of publication, so a comprehensive list of assigned observation times had to be reviewed. Also, sometimes it was not clear whether the time used was NASA time or time allotted to one of the other Keck partners.

Request 2. Papers by authors who are supported by the Cassini Data Analysis Program (CDAP)

Purpose: To show the use of this funding program. Authors are supposed to state that their work is funded by the Cassini Data Analysis Program. This kind of funding support is rarely mentioned outside of the Acknowledgements section of publications. The requestor has not supplied grant numbers.

Observation: Authors often do not use the prescribed wording. Grant numbers may be mentioned more frequently than program names or acronyms, or the authors simply mention NASA as the source of funding.

Method to produce list of these publications: Web of Science provides for searching the names of funding agencies. Its Funding Table gives just the name and grant number as given in the paper's Acknowledgements section. After the first search an alert was established for "Cassini Data Analysis." This alert was expanded to include the acronym CDAP. Where the name was given in full, synonyms for the word Program were found in use. This alert is not very productive. Only 28 of the 159 publications identified as mentioning CDAP have been found via the Web of Science.
The others have been found by alerts from Google Scholar, Google, and Science Direct, and by full-text searching of the American Geophysical Union (AGU) digital library. Alerts can be created in Google and Google Scholar once you are logged in to your Google account. Using Google Scholar for alerts is the most productive way to capture information from new publications. Google alerts, while picking up web sites with 'articles' mentioning CDAP, also pick up mentions of CDAP in other types of publications such as curricula vitae. Science Direct also allows setting up alerts using full-text searches. However, Elsevier/Science Direct is currently well covered by Google for this purpose of full-text searching. The AGU Digital Library is currently not harvested by Google, so it is periodically searched for the program name. It should be noted that harvesting of publishers' web sites changes. So it is useful to periodically compare the Google Scholar alerts against what may be found by searching the publisher's web site.

Request 3. Use of the Gravity Recovery and Climate Experiment (GRACE) Tellus Web Site by Scientists

Purpose: To show the use of the data from this specific web site. GRACE data is also available from other web sites. The request originally was for a given string of words or a URL that the requestor thought would be in the Acknowledgements section of publications. The words given by the requestor were the name of the web site providing the data. A specific grant or funding agency name was not given. The given URL was for the web site.

Observations: While scientists using data from this web site are supposed to state the source of data in the Acknowledgements section of publications, this is frequently not done. Scientists publishing papers using GRACE data come from a variety of backgrounds and publish in a variety of formats.
The purpose of publications also varies from news articles to reviews to discussions of the spacecraft to scientific analysis of GRACE data. The source of the data used, when given, was frequently in the publication's section on Data/Methods/Design. It was also found in the Analysis/Discussion sections of papers. The wording in these sections frequently varied from the prescribed wording. When the source of data was found in the Acknowledgements section, the wording was more standardized.

Method to produce bibliography: The most useful search in any database was for the root of the URL, grace.jpl.nasa.gov. But it was not useful in the Web of Science since that database only captures funder names and contract numbers. Searching in Web of Science for GRACE AND Tellus in the funder field is also not useful because there are other GRACE and Tellus funders. Google Scholar is the best source for finding results from a wide variety of sources. However, these sources do not necessarily include publications in the AGU Digital Library, IEEE Xplore, or the SPIE digital library. Another drawback to Google Scholar until 2012 was the inability to download good citation information. Now it is possible to download the basic author, title, and source information. But it may be preferable to use the link to the publisher and download publication information from the publisher. Publishers frequently provide downloads of abstracts, keywords, and the DOI as well as the basic citation. Science Direct was also searched but only added one or two publications to those found through Google Scholar. SpringerLink was fully covered by Google Scholar. The AGU Digital Library was searchable for the root URL but has proved not very useful for a term search. AGU gives the instruction to use quotes to eliminate stemming. But using quotes around Tellus did not eliminate the stemming and retrieved numerous 'satellite/s.'
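The root-URL search, combined with a note of which section carries the mention, can be sketched in a few lines of Python. This is only an illustration, assuming the full text of a paper is available and has already been split into sections; the function name sections_mentioning and the sample paper are hypothetical and are not part of the searches reported here. The match is literal and case-insensitive, so no stemming is involved.

```python
import re

# Root of the data site URL, the search term reported as most useful.
ROOT_URL = "grace.jpl.nasa.gov"


def sections_mentioning(sections, needle=ROOT_URL):
    """Return the titles of the sections whose text contains `needle`.

    `sections` maps a section title to that section's plain text.
    The match is literal (dots are escaped) and case-insensitive.
    """
    pattern = re.compile(re.escape(needle), re.IGNORECASE)
    return [title for title, text in sections.items() if pattern.search(text)]


# Hypothetical paper, invented for illustration only.
paper = {
    "Introduction": "Satellite gravimetry can track changes in water storage.",
    "Data and Methods": "Monthly grids were downloaded from http://GRACE.jpl.nasa.gov.",
    "Acknowledgements": "We thank two anonymous reviewers.",
}

print(sections_mentioning(paper))  # ['Data and Methods']
```

Recording the section title alongside each hit is what makes a later tally by section type possible.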
Analysis of the GRACE Web Site Effort

The GRACE web site request results were analyzed to get a better feel for how scientists are acknowledging use of data. Ninety-five publications have been found that mentioned the GRACE web site. Annotations are made to existing records in the GRACE Project Bibliography, or annotated records are added. The GRACE Project Bibliography is a comprehensive bibliography of articles about GRACE since 2002. The tracking of the use of the GRACE data web site was an additional request starting in 2011.

AGU publications are a frequent source of articles using GRACE data. To give a feel for the lack of acknowledgement of the source of data: of the 143 Journal of Geophysical Research articles in the GRACE Bibliography for 2007+, only 13 mention grace.jpl.nasa.gov, and of the 132 Geophysical Research Letters articles in the GRACE Bibliography for 2007+, only 12 mention grace.jpl.nasa.gov.

During this search, notes were kept of where the information was found and the specific wording of the mention. These notes were analyzed. In summary, 40% of the mentions of grace.jpl.nasa.gov occurred in the data/design/methods section, and 27% were found in the Acknowledgements section. This leaves 33% scattered in other sections.

| Section Title | Obvious Use of GRACE Tellus Web Site | Not Obvious Use of GRACE Tellus Web Site | Total Acknowledging Use |
|---|---|---|---|
| Introduction | | 3 | 3 |
| Data/Design/Methods | 32 | 6 | 38 |
| Acknowledgements | 26 | | 26 |
| Analysis/Discussion/Comparison | 9 | 2 | 11 |
| Other | 4 | 4 | 8 |
| Google Scholar Quote | 2 | 7 | 9 |

The "Other" section type includes sections labeled GRACE..., information in a figure caption, and information given in a listed Reference. Google Scholar search results always show a brief quote of the section containing the search terms. There were six publications found this way for which I did not have immediate access to the full text of the publication. Therefore, I do not know what section the information was in.
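The summary percentages can be rechecked against the Total column of the section table. The short Python sketch below does so; the variable and function names are illustrative only, and the counts are copied from the table.

```python
# Totals per section type, copied from the Total Acknowledging Use column.
totals = {
    "Introduction": 3,
    "Data/Design/Methods": 38,
    "Acknowledgements": 26,
    "Analysis/Discussion/Comparison": 11,
    "Other": 8,
    "Google Scholar Quote": 9,
}

grand_total = sum(totals.values())


def share(count, total=grand_total):
    """Percentage of `total`, rounded to the nearest whole number."""
    return round(100 * count / total)


data_share = share(totals["Data/Design/Methods"])
ack_share = share(totals["Acknowledgements"])
# Everything that is neither Data/Design/Methods nor Acknowledgements.
elsewhere = grand_total - totals["Data/Design/Methods"] - totals["Acknowledgements"]
other_share = share(elsewhere)

print(grand_total, data_share, ack_share, other_share)  # 95 40 27 33
```

The totals sum to the ninety-five publications reported, and the rounded shares reproduce the 40%, 27%, and 33% figures.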
The author acknowledgement of the use of the GRACE Tellus web site is shown by the following table.

| Publication Year | Total Publications Acknowledging Use | Acknowledgement in the Article's Acknowledgement Section |
|---|---|---|
| 2007 | 3 | |
| 2008 | 38 | 1 |
| 2009 | 26 | 3 |
| 2010 | 11 | 7 |
| 2011 | 8 | 8 |
| 2012 | 9 | 7 |

The use of the acknowledgement section of articles rather than just mentioning the data source in the data/methodology section is growing.

Acknowledgement

The research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. Copyright 2013 California Institute of Technology. Government sponsorship acknowledged.

References

Costas, R. and van Leeuwen, T.N. 2012. Approaching the "reward triangle": general analysis of the presence of funding acknowledgments and "Peer interactive communication" in scientific publications. Journal of the American Society for Information Science and Technology 63(8):1647-61.

Cronin, B. 1995. The Scholar's Courtesy: The Role of Acknowledgement in the Primary Communication Process. London, Los Angeles: Taylor Graham.

Cronin, B. and Overfelt, K. 1994. The scholar's courtesy: a survey of acknowledgement behavior. Journal of Documentation 50(3):165-96.

Cronin, B. and Weaver-Wozniak, S. 1993. Online access to acknowledgements. In: Williams, Martha E., editor. Proceedings of the National Online Meeting, New York, May 1993. Medford, N.J.: Learned Information. p. 93-98.

Cronin, B., McKenzie, G., and Stiffler, M. 1992. Patterns of acknowledgement. Journal of Documentation 48(2):107-22.

Cronin, B., Shaw, D., and La Barre, K. 2004. Visible, less visible, and invisible work: patterns of collaboration in 20th century chemistry. Journal of the American Society for Information Science and Technology 55(2):160-8. DOI: 10.1002/asi.10353

Davis, C.F. and Cronin, B. 1993. Acknowledgements and intellectual indebtedness: a bibliometric conjecture.
Journal of the American Society for Information Science 44(10):590-2.

Giles, C. and Councill, I. 2004. Who gets acknowledged: measuring scientific contributions through automatic acknowledgment indexing. Proceedings of the National Academy of Sciences of the United States of America 101(51):17599-604. DOI: 10.1073/pnas.0407743101

Hyland, K. and Salager-Meyer, F. 2008. Scientific writing. Annual Review of Information Science and Technology 42:297-338. DOI: 10.1002/aris.2008.1440420114

Mayernik, M., Kelly, K., Marlino, M., Wright, M., Abbasi, A., and Giesen, N. 2012. Bridging data lifecycles: tracking data use via data citations. Geophysical Research Abstracts 2012/4/1;14:EGU2012-11999.

McCallum, I., Plag, H.P., and Fritz, S. 2012. Data citation standard: a means to support data sharing, attribution, and traceability. Geophysical Research Abstracts 2012/4/1;14:EGU2012-13029.

Parsons, M.A., Duerr, R., and Minster, J. 2010. Data citation and peer review. Eos, Transactions of the American Geophysical Union 91(34):297-8. DOI: 10.1029/2010EO340001

Raymond, L.M., Chandler, C.L., Lowry, R.K., Urban, E.R., Moncoiffe, G., Pissierssens, P., and Norton, C. 2010. Citations for data in refereed journals. Abstract 2011/12/1:IN11A-1068 presented at 2010 Fall Meeting, American Geophysical Union, San Francisco, Calif., 13-17 December.

Ruf, C.S. 2011. Citations for data in refereed journals. Abstract 2011/12/1:IN52A-02 presented at 2011 Fall Meeting, American Geophysical Union, San Francisco, Calif., 5-9 December.