Methods

Status

Tested Methods

  • Project.clean()
  • Project.copy()
  • Project.copy_templates()
  • Project.delete()
  • Project.exists()
  • Project.export()
  • Project.get_latest_version_number()
  • Project.get_latest_version()
  • Project.get_version()
  • Project.launch()
  • Project.make_new_project_dir()
  • Project.parse_version()
  • Project.print_manifest()
  • Project.save()
  • Project.save_as()
  • Project.unzip()
  • Project.zip()

Last Update: Jun 6, 2019.

Note

Currently, all methods are public.


API

Project.clean(self)

Returns a reduced version of the manifest with empty values removed.
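As a rough illustration of the idea, the sketch below drops empty values from a plain dict. The function name clean_manifest, the definition of "empty" (None, empty string, empty list, empty dict), and the recursion into nested dicts are all assumptions made for this example, not the method's confirmed behaviour.

```python
def clean_manifest(manifest):
    """Return a copy of `manifest` without empty values (hypothetical helper)."""
    # Assumption: these four values count as "empty"; 0 and False are kept.
    empty = (None, "", [], {})
    cleaned = {}
    for key, value in manifest.items():
        if isinstance(value, dict):
            # Assumption: nested dicts are cleaned recursively.
            value = clean_manifest(value)
        if value in empty:
            continue
        cleaned[key] = value
    return cleaned
```

Called on {"name": "demo", "description": "", "tags": []}, this sketch keeps only the name key.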

Project.clean_nb(self, nbfile, clean_outputs=True, clean_notebook_metadata_fields=None, clean_cell_metadata_fields=None, clean_tags_=None, clean_empty_cells=False, save=False)

Cleans metadata fields and outputs from a Jupyter notebook. By default, only the outputs are cleaned, although the method will typically be called with clean_empty_cells=True. The method returns a JSON string with the cleaned notebook. If save is set to True, the source notebook will be overwritten with the cleaned JSON string.

Based on nbtoolbelt.
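The default behaviour (clearing outputs, optionally dropping empty cells) can be sketched with the standard json module alone. This is not the nbtoolbelt code or the method's actual implementation; the function name clean_outputs is hypothetical, and the sketch assumes a notebook in nbformat 4 layout with a top-level cells list.

```python
import json


def clean_outputs(notebook_json, clean_empty_cells=False):
    """Return a JSON string of the notebook with code-cell outputs cleared."""
    nb = json.loads(notebook_json)
    kept = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            # Clear outputs and reset the execution counter.
            cell["outputs"] = []
            cell["execution_count"] = None
        source = cell.get("source", "")
        if isinstance(source, list):
            # nbformat allows the source to be a list of lines.
            source = "".join(source)
        if clean_empty_cells and not source.strip():
            continue  # drop cells that contain only whitespace
        kept.append(cell)
    nb["cells"] = kept
    return json.dumps(nb, indent=1)
```

Overwriting the source notebook, as save=True does, would then be a matter of writing the returned string back to nbfile.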

Project.compare_files(self, existing_file, new_file)

Hashes and compares two files. Returns True if they are equivalent.
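A minimal sketch of hash-based comparison is shown below. The choice of SHA-256, the chunked reading, and the helper name file_digest are assumptions; the documentation does not say which hash the method uses.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large zip archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_files(existing_file, new_file):
    """Return True if both files have identical contents."""
    return file_digest(existing_file) == file_digest(new_file)
```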

Project.copy(self, name, version=None)

Inserts a copy of the current project into the database using a new name and _id. If no version is supplied, the latest version is used. Regardless, the new project is reset to version 1.

If the user wishes to work with a copy of a project without saving it to the database, they should simply launch it. The unsaved project can always be saved as a new project from within the Workspace.

Tip

This method could be extended to run in the Workspace, where the project folder would first be zipped up and added to the version dict in the manifest.

Project.copy_templates(self, templates, project_dir)

Copies the templates from the templates folder to a project folder.

Tip

There needs to be a standardised way of locating the workflow. The WMS has a config value for the templates folder, and the Workspace has a path to this template. That means that the templates folder must be flat and template folder names should be of the form topic-modeling.

Project.count_source(self, source)

This method is a helper for Project.clean_nb(). It counts and returns the number of non-blank lines, words, and non-whitespace characters in a Jupyter notebook cell.
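The three counts can be sketched as follows. This is a standalone illustration rather than the method itself; it assumes the cell source is either a string or a list of strings (both are allowed in notebook JSON) and returns the counts as a tuple, which is also an assumption.

```python
def count_source(source):
    """Return (non-blank lines, words, non-whitespace characters) for a cell source."""
    if isinstance(source, list):
        # Notebook JSON may store the source as a list of lines.
        source = "".join(source)
    lines = [line for line in source.splitlines() if line.strip()]
    words = source.split()
    # Splitting on whitespace removes it, so word lengths sum to the
    # number of non-whitespace characters.
    chars = sum(len(word) for word in words)
    return len(lines), len(words), chars
```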

Project.create_version_dict(self, path=None, version=None)

Creates and returns a new dictionary containing version metadata to be stored in the manifest's content field. If a project path is given, the project folder is zipped and compared to the latest existing zip archive. If it differs, a new dict is created with a higher version number.

Project.delete(self, version=None)

Deletes a project from the database using the manifest _id. If a version number (an integer) is supplied, that version's dict is removed from the manifest content, and the record is updated in the database.

Project.exists(self)

Tests whether the _id listed in the project's manifest exists in the database.

Project.export(self, version=None)

Downloads the latest version of a project. If a version number (an integer) is supplied, that version is downloaded instead.

Project.get_latest_version(self)

Gets the dict for the latest version of a project by combining Project.get_latest_version_number() with Project.get_version().

Project.get_latest_version_number(self)

Returns the highest version number in the project's manifest.

Project.get_version(self, value, key='number')

Returns the manifest dict for a specific version of the project. The versions are searched for the one whose key matches the specified value. Valid keys are date, name, and number.
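The lookup logic of the three version getters can be sketched on a plain list of version dicts. The data layout (a list of dicts, each with date, name, and number keys), the exact-match comparison, and the return of None or 0 when nothing is found are assumptions for this example; the real methods read the versions from the manifest.

```python
VALID_KEYS = ("date", "name", "number")


def get_version(versions, value, key="number"):
    """Return the first version dict whose `key` equals `value`, else None."""
    if key not in VALID_KEYS:
        raise ValueError("key must be one of: " + ", ".join(VALID_KEYS))
    for version in versions:
        if version.get(key) == value:
            return version
    return None


def get_latest_version_number(versions):
    """Return the highest version number, or 0 if there are no versions."""
    return max((v["number"] for v in versions), default=0)


def get_latest_version(versions):
    """Combine the two helpers above to fetch the latest version dict."""
    return get_version(versions, get_latest_version_number(versions))
```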

Project.launch(self, manifest, workflow, version=None, new=True)

Launches the latest version of a project in the Workspace. If the user does not have any datapackages stored in the database, a new v1 project_dir is created. Otherwise, if the user clicks the main rocket icon, a new project_dir is created based on the latest version. If the user clicks on a specific version's rocket icon, a project_dir based on that version's datapackage is created. Where possible, a datapackage is unzipped to the Workspace. Otherwise, the data is written to the project_dir from the database.

Project.make_new_project_dir(self, project_dir, templates)

A helper function for Project.launch(). Checks whether the project_dir is currently live in the Workspace. If not, a new project_dir is created. Templates are copied to it, and data is written to the caches/json folder from the database.

Project.parse_version(self, s, output=None)

Splits the folder name for a project version into its component parts: date, version number, and project name. By default, the method returns all three as separate values, so it must be called like this:

date, name, number = parse_version('2019010101121212_v1_myproject')

The output parameter can be set to 'date', 'name', or 'number' to return a single value:

date = parse_version('2019010101121212_v1_myproject', output='date')
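A standalone sketch of the parsing step is given below, assuming folder names always follow the date_vN_name pattern shown in the examples. Returning the number as an integer and allowing underscores in the project name are assumptions, not documented behaviour.

```python
def parse_version(s, output=None):
    """Split 'date_vN_name' into its parts (illustrative sketch)."""
    # Split at the first two underscores only, so the project name
    # may itself contain underscores.
    date, number, name = s.split("_", 2)
    number = int(number.lstrip("v"))  # 'v1' -> 1 (assumed integer)
    if output is None:
        return date, name, number
    return {"date": date, "name": name, "number": number}[output]
```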

Project.save(self, path=None)

Updates the manifest record in the database or inserts a new one with version 1 and an empty workflow in the manifest.

Project.save_as(self, path=None, new_name=None)

Creates a duplicate of a project with a new name. If a path to an open project in the Workspace is supplied, that folder's contents are copied to a new project folder, notebook cells are reset, and a new manifest is written for the project. Finally, a new database record is created with the zipped folder as version 1. If no path is supplied, a new record is created in the database based on the old one. Note that when Project.save_as() is used from the WMS, the complete version history of the source project is saved and notebook outputs are not cleared, because clearing them would require unzipping the archive(s) stored in the manifest. This functionality could be added in the future.

Project.save_record(self, action='insert')

Inserts a new record in the database or updates an existing one, depending on the action parameter. This is a helper function to reduce code repetition.

Project.unzip(self, source=None, output_path=None, binary=False)

Unzips the specified source file to a specified folder. By default, the source variable is assumed to be a file path; however, setting binary=True allows the method to accept the version_zipfile value (a binary file) from the manifest. Project.unzip() is called with binary=True in Project.launch().
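The difference between the two modes can be sketched with the standard zipfile module. The function name unzip_archive is hypothetical, and the sketch assumes that the binary value is a bytes object holding a complete zip archive.

```python
import io
import zipfile


def unzip_archive(source, output_path, binary=False):
    """Extract a zip archive given as a file path or as raw bytes."""
    # With binary=True, wrap the bytes in a file-like object so that
    # zipfile can read them without a temporary file on disk.
    archive = io.BytesIO(source) if binary else source
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(output_path)
```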

Project.zip(self, filename, source_dir, destination_dir)

Creates a zip archive of the source_dir and writes it to the destination_dir. The method is generally called by Project.export() with the project_dir and exports_dir as parameters.
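One way to produce such an archive is shutil.make_archive from the standard library, sketched below. The function name zip_dir and the handling of a trailing .zip in filename are assumptions; the actual method may build the archive differently.

```python
import os
import shutil


def zip_dir(filename, source_dir, destination_dir):
    """Zip the contents of source_dir into destination_dir/filename."""
    # make_archive appends '.zip' itself, so strip it if already present.
    base = filename[:-4] if filename.lower().endswith(".zip") else filename
    base_path = os.path.join(destination_dir, base)
    # Returns the full path of the archive that was written.
    return shutil.make_archive(base_path, "zip", root_dir=source_dir)
```

In this sketch the destination_dir should lie outside the source_dir, so that the archive being written is not picked up as one of its own inputs.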