“Collection” — A specific set of texts (and data about them) that a user is working on inside a project. A collection can be an entire corpus, but it can also be a particular subset of a corpus: e.g., only newspaper articles containing both the words “humanities” and “science”.
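As a rough illustration of a collection as a subset of a corpus, the sketch below filters a tiny in-memory corpus down to the documents that contain both words. The list-of-dictionaries layout and the `contains_all` helper are assumptions made for this example; they are not the Workspace's actual storage format or API.

```python
import re

# Hypothetical corpus: a list of documents with a title and plain text.
corpus = [
    {"title": "A", "text": "The humanities and science share methods."},
    {"title": "B", "text": "Science funding rose this year."},
    {"title": "C", "text": "Why the humanities matter."},
]

def contains_all(text, words):
    """Return True if every word in `words` occurs as a token in `text`."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return all(word in tokens for word in words)

# The collection is the subset of the corpus matching the criterion.
collection = [
    doc for doc in corpus
    if contains_all(doc["text"], ["humanities", "science"])
]
```

Here only document "A" contains both words, so the collection holds a single document while the corpus still holds all three.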
“Computer environment” — The whole computing platform containing (in replicable containers) the software and tools offered by WE1S. Users can download and install the environment on their own computers.
“Corpus / Corpora” — The total set of texts (and data about them) that a user has collected or is working on. (Compare Collection.)
“Data” — Representations of a text collection in derived forms that are not readable as plain text. For example, data that the WE1S Workspace generates from text collections include: bags-of-words or term frequencies, ngram counts, etc.
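To make the idea of derived data concrete, here is a minimal sketch of computing term frequencies (a bag of words) and n-gram counts from a short text. The function names and the simple lowercase tokenizer are assumptions for illustration only; the Workspace's own modules may tokenize and count differently.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase alphabetic tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+", text.lower())

def term_frequencies(text):
    """Bag of words: map each token to the number of times it occurs."""
    return Counter(tokenize(text))

def ngram_counts(text, n=2):
    """Count each run of n consecutive tokens."""
    tokens = tokenize(text)
    return Counter(zip(*(tokens[i:] for i in range(n))))

text = "the humanities and the sciences and the arts"
tf = term_frequencies(text)
bigrams = ngram_counts(text, 2)
```

Neither `tf` nor `bigrams` can be read back as the original sentence, since word order is lost or only partly kept. That is the sense in which such data are not readable as plain text.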
“Module” — A specific bundle of one or more Jupyter notebooks (and supporting scripts and files) available inside a project folder. Each module focuses on a particular task: e.g., creating a topic model or visualizing it. Each module contains a README.md file with a user guide for that module.
“Project” — The folder location and file structure created by running the create_new_project.ipynb notebook. A project is where users work on collections of texts and data using the Workspace’s modules.
“Metadata” — Secondary data about the data being worked on in the WE1S Workspace. Metadata includes citation information about collections of texts, such as author, publication, and date. It can also include other kinds of labels or tags created by a user to help address research questions. For example, in some of the collections it topic modeled, the WE1S Project labeled publication sources by geographical region, kind of publication, self-identified association with particular social groups, etc.
“Template” — When a new project is created, a folder for each module, containing notebooks and supporting scripts or other resources, is copied from a central location to the project folder. This set of folders is known as the project “template”, and the individual files are called “template” files. Templates have version numbers, so it is clear which version of the template was used to produce a project even if the template files are updated after the project was created.
“Workspace” — The Jupyter notebook system within the WE1S computing environment for collecting, managing, analyzing, topic modeling, and visualizing texts, and for performing other operations on them. When initially downloaded as part of the computing environment, the Workspace includes a Jupyter notebook for initiating a project and installing the modules needed to manage project workflow.