Soorgeon

Tip

Deploy AI apps for free on Ploomber Cloud!

Convert monolithic Jupyter notebooks into Ploomber pipelines.

soorgeon.mp4

3-minute video tutorial.

Note: Soorgeon is in alpha, help us make it better.

Install

Compatible with Python 3.7 and higher.

pip install soorgeon

Usage

[Optional] Testing if the notebook runs

Before refactoring, you can optionally test if the original notebook or script runs without exceptions:

# works with ipynb files
soorgeon test path/to/notebook.ipynb
# and notebooks in percent format
soorgeon test path/to/notebook.py

Optionally, set the path to the output notebook:

soorgeon test path/to/notebook.ipynb path/to/output.ipynb
soorgeon test path/to/notebook.py path/to/output.ipynb

Refactoring

To refactor your notebook:

# refactor notebook
soorgeon refactor nb.ipynb
# all variables with the df prefix are stored in csv files
soorgeon refactor nb.ipynb --df-format csv
# all variables with the df prefix are stored in parquet files
soorgeon refactor nb.ipynb --df-format parquet
# store task output in 'some-directory' (if missing, this defaults to 'output')
soorgeon refactor nb.ipynb --product-prefix some-directory
# generate tasks in .py format
soorgeon refactor nb.ipynb --file-format py
# use alternative serializer (cloudpickle or dill) if notebook 
# contains variables that cannot be serialized using pickle 
soorgeon refactor nb.ipynb --serializer cloudpickle
soorgeon refactor nb.ipynb --serializer dill

To learn more, check out our guide.

Cleaning

Soorgeon has a clean command that applies black for .ipynb and .py files:

soorgeon clean path/to/notebook.ipynb

or

soorgeon clean path/to/script.py

Linting

Soorgeon has a lint command that can apply [flake8]:

soorgeon lint path/to/notebook.ipynb

or

soorgeon lint path/to/script.py

Examples

git clone https://github.com/ploomber/soorgeon

Exploratory data analysis notebook:

cd soorgeon/examples/exploratory
soorgeon refactor nb.ipynb
# to run the pipeline
pip install -r requirements.txt
ploomber build

Machine learning notebook:

cd soorgeon/examples/machine-learning
soorgeon refactor nb.ipynb
# to run the pipeline
pip install -r requirements.txt
ploomber build

To learn more, check out our guide.

Community

About Ploomber

Ploomber is a big community of data enthusiasts pushing the boundaries of Data Science and Machine Learning tooling.

Whatever your skillset is, you can contribute to our mission. So whether you're a beginner or an experienced professional, you're welcome to join us on this journey!

Click here to know how you can contribute to Ploomber.

Name		Name	Last commit message	Last commit date
Latest commit History 295 Commits
.githooks		.githooks
.github		.github
_kaggle		_kaggle
_static		_static
doc		doc
examples		examples
src/soorgeon		src/soorgeon
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
pyproject.toml		pyproject.toml
setup.cfg		setup.cfg
setup.py		setup.py
tasks.py		tasks.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Soorgeon

Install

Usage

[Optional] Testing if the notebook runs

Refactoring

Cleaning

Linting

Examples

Community

About Ploomber

About

Uh oh!

Releases

Uh oh!

Contributors 12

Uh oh!

Languages

License

ploomber/soorgeon

Folders and files

Latest commit

History

Repository files navigation

Soorgeon

Install

Usage

[Optional] Testing if the notebook runs

Refactoring

Cleaning

Linting

Examples

Community

About Ploomber

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Uh oh!

Contributors 12

Uh oh!

Languages