Software Packaging


  • Reproducibility is an integral concept in the FAIR4RS principles. Appropriate software packaging is one way to make research software reproducible: it involves collecting and configuring software components into a format that can be deployed across different computer systems.

  • Software packaging is akin to packaging a box for shipment. Components such as the source code, installation instructions, user documentation, and test scripts all help to ensure reproducibility.

  • The purpose of a software package is to allow source code to be installed and executed on various systems, with considerations including target users, dependencies, testability, and scalability.

Package File History


  • Python packages make code easier to install, reuse and maintain.
  • A single pyproject.toml configuration file, alongside your code, is all that is required to package your Python project.
  • Several standards for Python packaging exist, but pyproject.toml is the currently recommended approach.

Accessing Packages


  • pip is the most common tool used to download and install Python packages from PyPI.
  • PyPI is an online package repository to which users can upload their packages for others to use.
  • pip can also be used to install packages from source code on your local system.
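
As a small illustration of what pip leaves behind after an install, the sketch below asks the standard library's importlib.metadata module which version of a distribution is present in the current environment. The helper name installed_version and the package names queried are chosen purely for illustration; the last name is deliberately one that should not exist.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        # Raised when no installed distribution has this name.
        return None


# The names below are illustrative; results depend on your environment.
for name in ("pip", "example-package-that-is-not-installed"):
    version = installed_version(name)
    status = version if version is not None else "not installed"
    print(f"{name}: {status}")
```

A package installed from PyPI and a package installed from a local source directory are both visible in this way, since pip records metadata for each.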

Creating Packages


  • A package can be built with as few as two files: a Python script and a configuration file.
  • pyproject.toml files have two key tables: [build-system] and [project].
  • Editable installs allow for quick and easy package development, because changes to the source code take effect without reinstalling the package.

Versioning


  • Versioning is crucial for tracking the development, improvements, and bug fixes of a software package over time. It ensures that changes are documented and managed systematically, aiding in reproducibility and reliability of the software.

  • Tools like setuptools_scm help automate versioning by deriving the version number from version control metadata such as Git tags, reducing manual errors and keeping the version number consistent across the project.

  • Versioning enables users to track code changes and dependencies, allowing reliable recreation of specific software versions, and further aiding the reproducibility of your software.
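
To show why version numbers need to be handled systematically, here is a minimal sketch that assumes a simple MAJOR.MINOR.PATCH scheme. The helpers parse_version and bump are hypothetical and written only for this example; this is not how setuptools_scm works internally, and real projects would typically rely on the packaging library, which handles the full range of Python version formats.

```python
def parse_version(text):
    """Split a simple 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def bump(text, part):
    """Return the next version after incrementing the chosen part."""
    major, minor, patch = parse_version(text)
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


# Comparing versions as plain strings gives the wrong order,
# because "1" sorts before "9" character by character.
print("1.10.0" < "1.9.0")

# Comparing the parsed tuples gives the intended order.
print(parse_version("1.10.0") > parse_version("1.9.0"))

print(bump("1.9.3", "minor"))
```

Treating the version as structured data rather than free text is what lets tools compare releases and update the number without manual edits.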

Releasing Python Packages


  • Git tags, together with GitHub releases, provide a way to manage specific software versions, enabling developers to easily reference and distribute stable versions of their software to their users.

  • Releases allow your software to be quickly and easily installed across different systems.

Publishing Packages


  • You can easily publish your package on PyPI for the wider Python community, allowing your users to simply install your software using pip install.

  • The University of Sheffield’s ORDA repository is another valuable platform to upload your software, further enabling software reproducibility, transparency, and research impact for all project collaborators involved.