Bibliography

This is an automatically generated bibliography describing the content of this study carrel.

ardanuy-dataset-2022
- author: ardanuy
- title: ardanuy-dataset-2022
- date: 2022
- words: 3334
- flesch: 38
- summary: Our dataset differs from others in its emphasis on the geographical aspect of newspaper data. Establishing benchmark datasets like this provides a foundation for others to assess the performance of methods related to the identification and location of places in historical newspapers.
- keywords: ardanuy; dataset; doi; london; newspapers; toponym
- versions: original; plain text
aronson-oregon-2022
- author: aronson
- title: aronson-oregon-2022
- date: 2022
- words: 3189
- flesch: 48
- summary: However, we include several data columns that reference these files to create more contextual information. Students conduct original research in primary sources to compile data and to compose short narratives about Oregon movie theaters during the period of study (1894–1929).
- keywords: cinema; data; humanities; oregon; project; theater
- versions: original; plain text
bagga-hathi-2022
- author: bagga
- title: bagga-hathi-2022
- date: 2022
- words: 5327
- flesch: 51
- summary: The distribution of four features from our Enriched Feature set – average sentence length, Tuldava score, NRC positive score, and VADER positive score – across our dataset of fiction pages (red) and non-fiction pages (blue) sampled from 1800 to 1999. Studying long time scales necessarily requires large data collections as each time unit (year/decade) becomes sparser the less data one has.
- keywords: data; doi; fiction; historical; non; page; work
- versions: original; plain text
chen-china-2022
- author: chen
- title: chen-china-2022
- date: 2022
- words: 3200
- flesch: 37
- summary: A growing number of articles are published every year that use CBDB data to explore topics ranging from career trajectory, regional composition, and family connections of civil officials to intellectual and social networks of Neo-Confucian moral philosophers, antiquities collectors, and members of political factions. For a full list of publications that use CBDB data, see https://projects.iq.harvard.edu/cbdb/publications-use-cbdb-data.
- keywords: biographical; cbdb; china; data; database; university
- versions: original; plain text
erlin-transcomp-2022
- author: erlin
- title: erlin-transcomp-2022
- date: 2022
- words: 2763
- flesch: 46
- summary: Given that the set of original language works was larger than the set of translations, we also randomly downsampled each year of our original publications to match the number of translations. Following the precedent established by Toury’s (1980) and Baker’s (1993) pioneering work on translation universals, our aim has been to create two independent corpora that enable researchers to evaluate translated texts as they relate to target language texts in general, rather than to compile a corpus of translations and their corresponding source texts.
- keywords: data; doi; literary; translations
- versions: original; plain text
faghihi-teaching-2022
- author: faghihi
- title: faghihi-teaching-2022
- date: 2022
- words: 9801
- flesch: 45
- summary: Oversight is provided by a board whose remit includes advice and training on the creation of TEI descriptions. Training was delivered in a series of structured workshops where the creation of TEI descriptions, with a particular focus on use of the authority files (lists of standard forms for certain entities in the data such as names and works), was embedded in a complete workflow involving collaborative working with GitHub.
- keywords: context; data; encoding; humanities; learning; manuscript; teaching; tei; text; text encoding
- versions: original; plain text
fekete-accessing-2022
- author: fekete
- title: fekete-accessing-2022
- date: 2022
- words: 1895
- flesch: 41
- summary: Second, adult environmental education can profit from further analysis by examining the level of environmental awareness about wood and trees in adults. Specifically, new aspects of environmental pedagogy, environmental education, sustainable development, climate protection, sylviculture, environmental awareness of families, adult environmental education, and education policies can also be investigated from the perspective of environmental awareness.
- keywords: data; variables; wood
- versions: original; plain text
felbur-crosslingusitic-2022
- author: felbur
- title: felbur-crosslingusitic-2022
- date: 2022
- words: 8223
- flesch: 53
- summary: While much effort is currently being invested in attempts to develop tools that will segment Chinese texts into words (some of them specifically designed to segment Buddhist materials, e.g. Wang, 2020), these tools remain unusable to us, since the underlying models themselves are often not openly released, and the training data used to create them is often not available. We then define Tibetan texts parallel to the Chinese sūtras as the ‘target.’
- keywords: alignment; buddhist; chinese; doi; embeddings; results; similarity; text; tibetan; word
- versions: original; plain text
gaber-forming-2022
- author: gaber
- title: gaber-forming-2022
- date: 2022
- words: 2319
- flesch: 32
- summary: Goran Gaber, École des Hautes Études en Sciences Sociales (LIER-FYT), Paris, France; Maison Française d’Oxford, Oxford, UK (goran.gaber@ehess.fr). Keywords: critique; title pages; union catalogues; dataset; book history; history of concepts. A complementary and interconnected “data package” was deposited on Zenodo, comprising: (1) a classical text-based bibliography, supplemented by (2) a CSV dataset of information contained therein, (3) the images of title pages not readily available online, and (4) a comprehensive BibTeX dataset.
- keywords: critique; dataset; pages; title
- versions: original; plain text
gerardi-kahd-2022
- author: gerardi
- title: gerardi-kahd-2022
- date: 2022
- words: 5444
- flesch: 50
- summary: Such databases, beside elucidating the internal classification of language families, play a role in the understanding of displacement and linguistic contact, for example, through borrowing. Apart from its value for (computational) historical linguistics mentioned in the previous section, the KAHD database also serves as language documentation and preservation effort for Amazonian language families since, as shown in Section 1.1, the number of speakers for some of the languages is diminishing at a fast rate (see e.g. D’Ávila 2019).
- keywords: arawan; data; database; doi; language; list
- versions: original; plain text
hagedorn-bearing-2022
- author: hagedorn
- title: hagedorn-bearing-2022
- date: 2022
- words: 5578
- flesch: 50
- summary: Corpus statistics: number of tales, 1518; number of tale types, 182; mean tokens per tale, 979.1; median tokens per tale, 642; minimum tokens per tale, 10; maximum tokens per tale, 12,406; mean sentences per tale, 45.7; median sentences per tale, 31. The tales compiled in the aft data are annotated by ATU tale type, and represent 182 distinct types.
- keywords: darányi; data; dataset; doi; journal; open; research; tale
- versions: original; plain text
han-reddit-2022
- author: han
- title: han-reddit-2022
- date: 2022
- words: 2184
- flesch: 47
- summary: Reddit’s data structure and limited restrictions on posting content provide opportunities to study online language use, communication processes, public opinions, online culture, online communities, and online social movements. Thus, this dataset will help study online social movements and their relationship with online culture.
- keywords: dataset; reddit; sentiment; stock
- versions: original; plain text
jauhiainen-social-2022
- author: jauhiainen
- title: jauhiainen-social-2022
- date: 2022
- words: 3687
- flesch: 52
- summary: Entries for document names could not be identified in the structure of the PDF file, and the identification and extraction of documents is thus based on concordance lists and document names attested in other PNA volumes. The earlier PNA volumes (1/I–3/I) were available to us as plain text files that were used to typeset the printed publications.
- keywords: assyrian; data; neo; network; pna
- versions: original; plain text
kelleher-place-2022
- author: kelleher
- title: kelleher-place-2022
- date: 2022
- words: 6508
- flesch: 45
- summary: The Nakala data set includes full data management documentation, full ethics documentation in English and French, concept notes in English and French and participant data files that include .csv metadata sheets, .wav audio recordings of interviews, .jpeg photographs of the place of the interview and open ELAN (MPI, 2021). Places data is opened on the Nakala data repository that is overseen by the Digital Humanities Very Large Research Infrastructure (Sciences Humaines Numériques Très Grande Infrastructure de Recherche – TGIR Huma-Num) (CNRS, 2022).
- keywords: data; doi; march; nakala; open; places; project; research; science
- versions: original; plain text
kuys-representing-2022
- author: kuys
- title: kuys-representing-2022
- date: 2022
- words: 6880
- flesch: 52
- summary: The principal source in this project, A.J. van der Aa’s Geographical Dictionary, has plenty of event descriptions. Data underpinning any private interpretations by van der Aa (or by others) should be confined to an RDF graph or namespace of their own.
- keywords: data; der; events; model; time; van
- versions: original; plain text
maignant-drama-2022
- author: maignant
- title: maignant-drama-2022
- date: 2022
- words: 6689
- flesch: 51
- summary: It also enables us to contribute to the field of English literature by proposing the first reusable dataset to offer numerous theatre reviews on journalistic and digital criticism. Creating this corpus based on digital reviews was less time-consuming than the first one because the reviews were already in a textual format.
- keywords: blog; corpus; data; digital; humanities; july; open; reviews; theatre
- versions: original; plain text
marongiu-static-2022
- author: marongiu
- title: marongiu-static-2022
- date: 2022
- words: 7624
- flesch: 51
- summary: We focus on the case of modal meanings in the Latin language and we showcase how we transposed the gathered data from a discursive to a visual form. Our set of modal maps features some impersonal verbs or constructions, e.g., respectively decet, licet, oportet and aequus est, necesse est, meum est among others.
- keywords: dell’oro; diachronic; latin; maps; meanings; modal; modality; semantic
- versions: original; plain text
melanie-oupoco-2022
- author: melanie
- title: melanie-oupoco-2022
- date: 2022
- words: 1678
- flesch: 52
- summary: Its contribution is marginal as only seven sonnets come from this database (that covers other kinds of French poems, most of them not being sonnets). The sonnets come from different sources from the Internet, or not: we especially want to thank the Bibliothèque nationale de France (BnF) (French National Library) that gave us access to a large corpus, from which we were able to extract an invaluable number of French poems.
- keywords: data; french; sonnets
- versions: original; plain text
nurmikko-teaching-2022
- author: nurmikko
- title: nurmikko-teaching-2022
- date: 2022
- words: 6965
- flesch: 44
- summary: Keywords: Linked Open Data; bibliographic metadata; pedagogy; participant evaluations. In recognition of the role of collaboration and co-authoring in digital humanities (DH) research (Needham & Haas, 2019), workshop participants are encouraged to work together and communicate openly as a group.
- keywords: data; digital; fuller; humanities; information; ld4dh; open; participants; workshop
- versions: original; plain text
oneill-text-2022
- author: oneill
- title: oneill-text-2022
- date: 2022
- words: 2212
- flesch: 46
- summary: This paper introduces the state of the field in Newar literature, Newar manuscripts, and HTR engines. Deep learning neural networks have made it possible to build HTR models based on images of handwritten text linked with corresponding transcriptions (called “ground truth”).
- keywords: data; manuscripts; model; newar
- versions: original; plain text
pala-tracing-2022
- author: pala
- title: pala-tracing-2022
- date: 2022
- words: 6331
- flesch: 54
- summary: Benefits of this approach lie in its ability to quantify change, to study complex 3D material, and to analyse large datasets of objects, opening the possibility of constructing new large-scale studies of object shape across time and geographical regions. The method can be scaled to large datasets of 3D object scans where changes can be computed automatically, without the need for human intervention.
- keywords: approach; distance; objects; points; shape; study; vessel
- versions: original; plain text
pan-networking-2022
- author: pan
- title: pan-networking-2022
- date: 2022
- words: 7225
- flesch: 41
- summary: In relational databases, edges usually only convey directions and at most labels (categories), but they can carry easily expandable and modifiable properties in graph databases. This means that for long-term projects such as this one (which, because of the current incompleteness of the source data, calls for continuing addition of data), a graph database allows for more possibilities in terms of efficient and versatile querying and expansion.
- keywords: database; graph; graph database; japanese; lawsuits; movement; network; reparation
- versions: original; plain text
piper-conlit-2022
- author: piper
- title: piper-conlit-2022
- date: 2022
- words: 2462
- flesch: 44
- summary: As we show with the overview of our data (Table 1), our institutional frameworks can include bestseller lists, prize committee shortlists, book review lists, user-generated “choice awards”, or corporate forms of categorization. We define “popular” through multiple criteria that include user-generated awards or lists, elite prize committee lists or book reviews, or bestseller tags on platforms like Amazon or the New York Times.
- keywords: books; data; fiction; genre
- versions: original; plain text
pitts-corpus-2022
- author: pitts
- title: pitts-corpus-2022
- date: 2022
- words: 1822
- flesch: 46
- summary: These advantages hold true in fragmentary languages such as Venetic or Messapic as much as in large corpus languages such as Classical Latin or Greek. This database was created in the context of a PhD project on language contact in Ancient Italy, entitled The interplay between language contact and language change in a fragmentary linguistic area: the Italic peninsula in the first millennium BCE.
- keywords: corpus; data; languages; linguistic
- versions: original; plain text
turenne-mining-2022
- author: turenne
- title: turenne-mining-2022
- date: 2022
- words: 6903
- flesch: 43
- summary: The choice of the pair Chinese–English has several motivations: firstly, the data is more easily available; secondly, there is a demand for English and Chinese tools and datasets, as English is already the lingua franca in many areas (political, economical, cultural, and scientific), and we also see an increasing interest in Chinese, which is now being taught at schools in western countries. This paper is divided into the following sections: we discuss the dataset and its sub-datasets, describe the state-of-the-art research based on bilingual corpora, machine learning, and natural language processing, and then present the results of our experiments.
- keywords: chinese; corpus; dataset; doi; domain; english; finance; language; parallel; proceedings
- versions: original; plain text
vauth-event-2022
- author: vauth
- title: vauth-event-2022
- date: 2022
- words: 1962
- flesch: 49
- summary: These annotations were used for the automation of narratological event annotations (Vauth, Hatzel, Gius, & Biemann, 2021), a reflection of inter annotator agreements in literary studies (Gius & Vauth, 2022) and the development of an event based plot model (Gius & Vauth, accepted). Inter Annotator Agreement (Krippendorff’s α) for event types.
- keywords: event; gius
- versions: original; plain text
verbruggen-social-2022
- author: verbruggen
- title: verbruggen-social-2022
- date: 2022
- words: 7049
- flesch: 36
- summary: By collecting and enriching a dataset of international organizations and congresses associated with social reform, TIC sought to map cooperation across national lines and across thematic categories. Social Reform International Congresses and Organizations (1846–1914): From Sources to Data. Research paper. Corresponding author: Christophe Verbruggen, Department of History – GhentCDH, Ghent University, Ghent, BE (christophe.verbruggen@ugent.be). Keywords: social reform; transnational history; network analysis; social internationalism; collective action.
- keywords: congresses; data; doi; ghent; international; open; organizations; reform; social; university; van
- versions: original; plain text
yi-accessibility-2022
- author: yi
- title: yi-accessibility-2022
- date: 2022
- words: 10694
- flesch: 42
- summary: Accessibility, Discoverability, and Functionality: An Audit of and Recommendations for Digital Language Archives. Research paper. Corresponding author: Irene Yi, Linguistics Department, Yale University, New Haven, CT, US (irene.yi@yale.edu). Keywords: language archives; documentation; accessibility; discoverability; functionality; linguistics; endangered languages; metadata. Language archives utilize a number of different content management systems and do not provide uniform functionality (Aznar & Seifart 2020).
- keywords: access; archives; collections; data; digital; doi; files; information; language; language archives; materials; users
- versions: original; plain text