30 changes: 30 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,30 @@
name: Upload Python Package to PyPI when a Release is Created

on:
  release:
    types: [created]

jobs:
  pypi-publish:
    name: Publish release to PyPI
    runs-on: bdg-runners
    environment:
      name: pypi
      url: https://pypi.org/p/pysequila
    permissions:
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
Contributor:
  1. I can't see any automatic tagging and semantic versioning steps?
  2. It would also be nice to have a PR workflow that runs some basic checks like black, isort, and unit tests
  3. pre-commit hook setup
  4. It's not set up to run on our self-hosted runners
  5. I would consider using Poetry instead (we try to use it in other projects as well) for the sake of the lock file, a cleaner way of managing dependencies, and unification of development.
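
A minimal PR-check workflow along the lines of point 2 might look like the sketch below. This is not part of the PR; the file name, runner label, and tool list are assumptions:

```yaml
# .github/workflows/pr-checks.yml (hypothetical sketch)
name: PR checks

on:
  pull_request:

jobs:
  checks:
    runs-on: ubuntu-latest  # or a self-hosted label such as bdg-runners
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: "3.9"
      - name: Install dev dependencies
        run: pip install black isort pytest
      - name: Formatting checks
        run: |
          black --check .
          isort --check-only .
      - name: Unit tests
        run: pytest
```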

Author (@psuszyns, Nov 15, 2024):
I like to divide work into small deliverable parts. I wanted to first set up a basic publishing scheme, make sure the setup on the PyPI side is correct, and then add modifications one by one.

Regarding your individual points:

  1. This configuration will trigger a publish to PyPI when someone creates a new 'Release': https://github.com/biodatageeks/pysequila/releases/new - which also creates a tag in the repo. Other options would be to set this up so that:
  • it automatically publishes when someone pushes a new tag with a name starting with "v" on the master branch
  • it automatically creates a new tag with the 'next' version and publishes automatically
    Of those options I think the last one is the worst, since we want control over which version number we are bumping. I think the first one (manually creating a release in GitHub) is the best one. Releasing a new version should be a conscious decision, preceded by manual testing.
  2. Sure, in the next pull request.
  3. Yes, it would be nice to have some checks - but what exactly would we want? Also, that's definitely a separate task.
  4. Fixed this.
  5. I have a Poetry setup ready locally and wanted to push it after this PR.
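
For comparison, the tag-push variant described in the first bullet would only change the workflow trigger, roughly like this (a sketch, not part of this PR):

```yaml
on:
  push:
    tags:
      - "v*"  # publish when a tag such as v1.3.6 is pushed
```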

Contributor:
OK, so please create follow-up issues so that we won't forget about these points.

        with:
          python-version: "3.9"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install setuptools wheel
      - name: Build package
        run: |
          python setup.py sdist bdist_wheel # Could also be python -m build
      - name: Publish package distributions to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
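
The inline comment on the build step mentions `python -m build`; a drop-in alternative build step could be sketched as follows (an assumption, not what this PR ships):

```yaml
      - name: Build package
        run: |
          python -m pip install build
          python -m build  # writes sdist and wheel into dist/
```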
4 changes: 3 additions & 1 deletion .gitignore
@@ -1,6 +1,7 @@
*.egg
*.egg/
*.egg-info/
.eggs/
*.pyc
.tox/
_build/
@@ -15,4 +16,5 @@ page/config.toml
page/docs/*
docs/source_v/*
page/resources
.DS_Store
.DS_Store
.venv/
285 changes: 0 additions & 285 deletions .gitlab-ci.yml

This file was deleted.

27 changes: 19 additions & 8 deletions README.rst
@@ -51,7 +51,17 @@ Features
 * other utility functions
 * support for both SQL and Dataframe/Dataset API

-Setup
+Building
 =====

+::
+
+    $ python3.9 -m venv .venv
+    $ source .venv/bin/activate
+    $ pip install -r requirements.txt
+    $ python setup.py sdist bdist_wheel
+
+Install
+=====

 ::
@@ -66,19 +76,20 @@ Usage
 ::

 $ python
+>>> import os
 >>> from pysequila import SequilaSession
+>>> curr_dir = os.getcwd()
 >>> ss = SequilaSession \
         .builder \
-        .config("spark.jars.packages", "org.biodatageeks:sequila_2.12:1.1.0") \
+        .config("spark.jars.packages", "org.biodatageeks:sequila_2.12:1.3.6") \
         .config("spark.driver.memory", "2g") \
         .getOrCreate()
->>> ss.sql(
-f"""
+>>> ss.sql(f"""
 CREATE TABLE IF NOT EXISTS reads
 USING org.biodatageeks.sequila.datasources.BAM.BAMDataSource
-OPTIONS(path "/features/data/NA12878.multichrom.md.bam")
-"""
->>> ss.sql ("SELECT * FROM coverage('reads', 'NA12878','/features/data/Homo_sapiens_assembly18_chr1_chrM.small.fasta")
+OPTIONS(path "{curr_dir}/features/data/NA12878.multichrom.md.bam")
+""")
+>>> df = ss.sql(f"SELECT * FROM coverage('reads', 'NA12878','{curr_dir}/features/data/Homo_sapiens_assembly18_chr1_chrM.small.fasta')")
 >>> # or using DataFrame/DataSet API
->>> ss.coverage("/features/data/NA12878.multichrom.md.bam", "/features/data/Homo_sapiens_assembly18_chr1_chrM.small.fasta")
+>>> df = ss.coverage(f"{curr_dir}/features/data/NA12878.multichrom.md.bam", f"{curr_dir}/features/data/Homo_sapiens_assembly18_chr1_chrM.small.fasta")
