Index of lead-pipe

Basic characteristics

Creator: eric
Date created: 2022-05-19
Number of items: 293
Number of words: 1624174
Average readability score: 55
Bibliographics: plain text; HTML; JSON
Other files: stopwords; entire corpus

Sizes

Readability

Clusters

Ngrams

unigrams

bigrams

Parts-of-speech

nouns

proper nouns

pronouns

verbs

adjectives

adverbs

Entities

any entity

persons

geo-political entities

organizations

Keywords

Next steps

The next step is to pose a question of your own and apply it to this data set. There are many ways to do so.

The Distant Reader and the Distant Reader Toolbox take an almost arbitrary amount of text as input and output data sets, affectionately known as "study carrels". The contents of this page were created from a study carrel.

Each study carrel is made up of the same set of folders and files. These folders and files contain "features" of the original documents, such as parts-of-speech, named entities, and statistically significant keywords. For example, all of the original documents have been saved in the cache folder, and all of the plain-text versions of the original documents have been saved in the txt directory. Since almost all of the files in a study carrel are either plain-text files or tab-delimited files, Distant Reader study carrels can be opened and used with almost any text editor, word processor, spreadsheet, database, or analysis application. The following folders contain information of particular interest:

There are a few files of note:

There are a number of graphical user interface (GUI) applications you can apply to a carrel's content:

Finally, if you have Python installed, then you can install the Reader Toolbox (pip install reader-toolbox), and use the rdr command from the command line to do many of the things the GUI applications do and more. There is also a set of Jupyter Notebooks demonstrating how the Toolbox can be extended and used in conjunction with other Python modules (like Pandas, SQLite, WordNet, etc.).
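
Because almost all of the files in a carrel are plain-text or tab-delimited, they can also be read with nothing more than Python's standard library. The following is a minimal sketch, not the Toolbox's own method: the function name count_column, the sample rows, and the column names (id, token, pos) are hypothetical and chosen only for illustration, so check the header row of the actual file before relying on them.

```python
import csv
import io
from collections import Counter


def count_column(tsv_text, column):
    """Count how often each value occurs in one column of tab-delimited text."""
    # QUOTE_NONE keeps stray quotation marks in the data from being
    # interpreted as field delimiters.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return Counter(row[column] for row in reader)


# Hypothetical sample standing in for a tab-delimited carrel file;
# the column names here are assumptions, not the carrel's real layout.
SAMPLE = (
    "id\ttoken\tpos\n"
    "doc-1\tlibrary\tNOUN\n"
    "doc-1\treads\tVERB\n"
    "doc-2\tarchive\tNOUN\n"
)

counts = count_column(SAMPLE, "pos")
for value, frequency in counts.most_common():
    print(value, frequency, sep="\t")
```

To run the same count against a real file, read its contents with open(path, encoding="utf-8").read() and pass the resulting text, along with a column name taken from that file's header, to count_column.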

For more information, please see the complete manual.

Happy reading!


Eric Lease Morgan
Navari Family Center for Digital Scholarship
University of Notre Dame