Functional Interface
pytask offers a functional interface for users who want more flexibility than the command-line interface provides. It even lets you run pytask from a Python interpreter or a Jupyter notebook, such as the one this article is written in.
Let’s see how it works!
from pathlib import Path
from typing import Annotated
import pytask
from pytask import task
Here is a small workflow where two tasks create two text files and the third task merges both of them into one file.
One important bit to note here is that the second task is created from a lambda function. So, you can use dynamically defined functions to create tasks.
It also shows how easy it is to wrap a third-party function whose signature you do not control and turn it into a pytask task.
def task_create_first_file() -> Annotated[str, Path("first.txt")]:
    return "Hello, "


task_create_second_file = task(
    name="task_create_second_file", produces=Path("second.txt")
)(lambda *x: "World!")


def task_merge_files(
    first: Path = Path("first.txt"), second: Path = Path("second.txt")
) -> Annotated[str, Path("hello_world.txt")]:
    return first.read_text() + second.read_text()
Now, let us execute this little workflow.
session = pytask.build(
    tasks=[task_create_first_file, task_merge_files, task_create_second_file]
)
────────────────────────────────────────────── Start pytask session ───────────────────────────────────────────────
Platform: linux -- Python 3.13.3, pytask 0.5.9.dev101+gd0f4cb046, pluggy 1.6.0
Root: /home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743
Collected 3 tasks.
╭─────────────────────────┬─────────╮
│ Task                    │ Outcome │
├─────────────────────────┼─────────┤
│ task_create_first_file  │ .       │
│ task_create_second_file │ .       │
│ task_merge_files        │ .       │
╰─────────────────────────┴─────────╯
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
╭─────────── Summary ────────────╮
│ 3 Collected tasks              │
│ 3 Succeeded (100.0%)           │
╰────────────────────────────────╯
──────────────────────────────────────────── Succeeded in 0.16 seconds ────────────────────────────────────────────
The information on the executed workflow can be found in the session.
session
Session(config={'pm': <pluggy._manager.PluginManager object at 0x715d048d1090>, 'markers': {'filterwarnings': 'Add a filter for a warning to a task.', 'persist': 'Prevent execution of a task if all products exist and even if something has changed (dependencies, source file, products). This decorator might be useful for expensive tasks where only the formatting of the file has changed. The state of the files which have changed will also be remembered and another run will skip the task with success.', 'skip': 'Skip a task and all its dependent tasks.', 'skip_ancestor_failed': 'Internal decorator applied to tasks if any of its preceding tasks failed.', 'skip_unchanged': 'Internal decorator applied to tasks which have already been executed and have not been changed.', 'skipif': 'Skip a task and all its dependent tasks if a condition is met.', 'task': 'Mark a function as a task regardless of its name. Or mark tasks which are repeated in a loop. See this tutorial for more information: [link https://bit.ly/3DWrXS3]https://bit.ly/3DWrXS3[/].', 'try_first': 'Try to execute a task a early as possible.', 'try_last': 'Try to execute a task a late as possible.'}, 'config': None, 'database_url': sqlite:////home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/.pytask/pytask.sqlite3, 'editor_url_scheme': 'file', 'export': <_ExportFormats.NO: 'no'>, 'hook_module': Sentinel.UNSET, 'ignore': ['.codecov.yml', '.gitignore', '.pre-commit-config.yaml', '.readthedocs.yml', '.readthedocs.yaml', 'readthedocs.yml', 'readthedocs.yaml', 'environment.yml', 'pyproject.toml', 'setup.cfg', 'tox.ini', '.git/*', '.venv/*', '.pixi/*', '*.egg-info/*', '.ipynb_checkpoints/*', '.mypy_cache/*', '.nox/*', '.tox/*', '_build/*', '__pycache__/*', 'build/*', 'dist/*', 'pytest_cache/*'], 'paths': [], 'layout': 'dot', 'output_path': 'dag.pdf', 'rank_direction': <_RankDirection.TB: 'TB'>, 'expression': '', 'marker_expression': '', 'nodes': False, 'strict_markers': False, 'directories': False, 
'exclude': [Sentinel.UNSET, '.git/*', '/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/.pytask/*'], 'mode': <_CleanMode.DRY_RUN: 'dry-run'>, 'quiet': False, 'capture': <CaptureMethod.FD: 'fd'>, 'clean_lockfile': False, 'debug_pytask': False, 'disable_warnings': False, 'dry_run': False, 'explain': False, 'force': False, 'max_failures': inf, 'n_entries_in_table': 15, 'pdb': False, 'pdbcls': None, 's': False, 'show_capture': <ShowCapture.ALL: 'all'>, 'show_errors_immediately': False, 'show_locals': False, 'show_traceback': True, 'sort_table': True, 'trace': False, 'verbose': 1, 'stop_after_first_failure': False, 'check_casing_of_paths': True, 'pdb_cls': '', 'tasks': [<function task_create_first_file at 0x715d048c8f40>, <function task_merge_files at 0x715d048c9260>, <function <lambda> at 0x715d048c9580>], 'task_files': ('task_*.py',), 'command': 'build', 'root': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743'), 'filterwarnings': [], 'lockfile_path': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/pytask.lock'), 'lockfile_state': LockfileState(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/pytask.lock'), root=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743'), use_lockfile_for_skip=False, lockfile=_Lockfile(lock_version='1', task=[_TaskEntry(id='task_create_first_file', state='0ead6793304cfae29a4d38cc3bbb8c8fdf573e9ae138d9e51c3d495926d2918d', depends_on={}, produces={'docs/source/how_to_guides/first.txt': '23429bd9ba98dd5140309bb9b0094b3aad642430fff6fb3ca61f008ce644f34a'}), _TaskEntry(id='task_create_second_file', state='c69ede3aac47ebfb3fade7fd2bc835270757c7f568e668a5cc12446d51db413c', depends_on={}, produces={'docs/source/how_to_guides/second.txt': '514b6bb7c846ecfb8d2d29ef0b5c79b63e6ae838f123da936fe827fda654276c'}), _TaskEntry(id='task_merge_files', 
state='72df9444f53beb87ff9c513bce009ddcdb26e41c3d324a0f1cde13bc52181263', depends_on={'docs/source/how_to_guides/second.txt': '514b6bb7c846ecfb8d2d29ef0b5c79b63e6ae838f123da936fe827fda654276c', 'docs/source/how_to_guides/first.txt': '23429bd9ba98dd5140309bb9b0094b3aad642430fff6fb3ca61f008ce644f34a'}, produces={'docs/source/how_to_guides/hello_world.txt': 'dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f'})]), _task_index={'task_create_first_file': _TaskEntry(id='task_create_first_file', state='0ead6793304cfae29a4d38cc3bbb8c8fdf573e9ae138d9e51c3d495926d2918d', depends_on={}, produces={'docs/source/how_to_guides/first.txt': '23429bd9ba98dd5140309bb9b0094b3aad642430fff6fb3ca61f008ce644f34a'}), 'task_create_second_file': _TaskEntry(id='task_create_second_file', state='c69ede3aac47ebfb3fade7fd2bc835270757c7f568e668a5cc12446d51db413c', depends_on={}, produces={'docs/source/how_to_guides/second.txt': '514b6bb7c846ecfb8d2d29ef0b5c79b63e6ae838f123da936fe827fda654276c'}), 'task_merge_files': _TaskEntry(id='task_merge_files', state='72df9444f53beb87ff9c513bce009ddcdb26e41c3d324a0f1cde13bc52181263', depends_on={'docs/source/how_to_guides/second.txt': '514b6bb7c846ecfb8d2d29ef0b5c79b63e6ae838f123da936fe827fda654276c', 'docs/source/how_to_guides/first.txt': '23429bd9ba98dd5140309bb9b0094b3aad642430fff6fb3ca61f008ce644f34a'}, produces={'docs/source/how_to_guides/hello_world.txt': 'dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f'})}, _node_index={'task_create_first_file': {'docs/source/how_to_guides/first.txt': '23429bd9ba98dd5140309bb9b0094b3aad642430fff6fb3ca61f008ce644f34a'}, 'task_create_second_file': {'docs/source/how_to_guides/second.txt': '514b6bb7c846ecfb8d2d29ef0b5c79b63e6ae838f123da936fe827fda654276c'}, 'task_merge_files': {'docs/source/how_to_guides/second.txt': '514b6bb7c846ecfb8d2d29ef0b5c79b63e6ae838f123da936fe827fda654276c', 'docs/source/how_to_guides/first.txt': '23429bd9ba98dd5140309bb9b0094b3aad642430fff6fb3ca61f008ce644f34a', 
'docs/source/how_to_guides/hello_world.txt': 'dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f'}}, _dirty=False)}, collection_reports=[CollectionReport(outcome=<CollectionOutcome.SUCCESS: 1>, node=TaskWithoutPath(name='task_create_first_file', function=<function task_create_first_file at 0x715d048c8f40>, depends_on={}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/first.txt'), name='743/docs/source/how_to_guides/first.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'collection_id': UUID('a164b81e-6555-4924-bbb5-72a69300aa94'), 'after': [], 'is_generator': False, 'duration': (1774103418.967684, 1774103418.9678571)}), exc_info=None), CollectionReport(outcome=<CollectionOutcome.SUCCESS: 1>, node=TaskWithoutPath(name='task_create_second_file', function=<function <lambda> at 0x715d048c9580>, depends_on={}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/second.txt'), name='743/docs/source/how_to_guides/second.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'collection_id': UUID('af932327-0c1f-478a-8a27-2b11fc9c8dcd'), 'after': [], 'is_generator': False, 'duration': (1774103418.9900103, 1774103418.9901717)}), exc_info=None), CollectionReport(outcome=<CollectionOutcome.SUCCESS: 1>, node=TaskWithoutPath(name='task_merge_files', function=<function task_merge_files at 0x715d048c9260>, depends_on={'first': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/first.txt'), name='743/docs/source/how_to_guides/first.txt', attributes={}), 'second': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/second.txt'), 
name='743/docs/source/how_to_guides/second.txt', attributes={})}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/hello_world.txt'), name='743/docs/source/how_to_guides/hello_world.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'collection_id': UUID('3e77d036-e0fc-43e9-9b9f-895a5aeca0d4'), 'after': [], 'is_generator': False, 'duration': (1774103419.0106378, 1774103419.010855)}), exc_info=None)], dag=<networkx.classes.digraph.DiGraph object at 0x715d048d3c50>, hook=<pluggy._hooks.HookRelay object at 0x715d048d11d0>, tasks=[TaskWithoutPath(name='task_create_first_file', function=<function task_create_first_file at 0x715d048c8f40>, depends_on={}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/first.txt'), name='743/docs/source/how_to_guides/first.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'collection_id': UUID('a164b81e-6555-4924-bbb5-72a69300aa94'), 'after': [], 'is_generator': False, 'duration': (1774103418.967684, 1774103418.9678571)}), TaskWithoutPath(name='task_create_second_file', function=<function <lambda> at 0x715d048c9580>, depends_on={}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/second.txt'), name='743/docs/source/how_to_guides/second.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'collection_id': UUID('af932327-0c1f-478a-8a27-2b11fc9c8dcd'), 'after': [], 'is_generator': False, 'duration': (1774103418.9900103, 1774103418.9901717)}), TaskWithoutPath(name='task_merge_files', function=<function task_merge_files at 0x715d048c9260>, depends_on={'first': 
PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/first.txt'), name='743/docs/source/how_to_guides/first.txt', attributes={}), 'second': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/second.txt'), name='743/docs/source/how_to_guides/second.txt', attributes={})}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/hello_world.txt'), name='743/docs/source/how_to_guides/hello_world.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'collection_id': UUID('3e77d036-e0fc-43e9-9b9f-895a5aeca0d4'), 'after': [], 'is_generator': False, 'duration': (1774103419.0106378, 1774103419.010855)})], dag_report=None, execution_reports=[ExecutionReport(task=TaskWithoutPath(name='task_create_first_file', function=<function task_create_first_file at 0x715d048c8f40>, depends_on={}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/first.txt'), name='743/docs/source/how_to_guides/first.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'collection_id': UUID('a164b81e-6555-4924-bbb5-72a69300aa94'), 'after': [], 'is_generator': False, 'duration': (1774103418.967684, 1774103418.9678571)}), outcome=<TaskOutcome.SUCCESS: 1>, exc_info=None, sections=[]), ExecutionReport(task=TaskWithoutPath(name='task_create_second_file', function=<function <lambda> at 0x715d048c9580>, depends_on={}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/second.txt'), name='743/docs/source/how_to_guides/second.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], 
report_sections=[], attributes={'collection_id': UUID('af932327-0c1f-478a-8a27-2b11fc9c8dcd'), 'after': [], 'is_generator': False, 'duration': (1774103418.9900103, 1774103418.9901717)}), outcome=<TaskOutcome.SUCCESS: 1>, exc_info=None, sections=[]), ExecutionReport(task=TaskWithoutPath(name='task_merge_files', function=<function task_merge_files at 0x715d048c9260>, depends_on={'first': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/first.txt'), name='743/docs/source/how_to_guides/first.txt', attributes={}), 'second': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/second.txt'), name='743/docs/source/how_to_guides/second.txt', attributes={})}, produces={'return': PathNode(path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743/docs/source/how_to_guides/hello_world.txt'), name='743/docs/source/how_to_guides/hello_world.txt', attributes={})}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'collection_id': UUID('3e77d036-e0fc-43e9-9b9f-895a5aeca0d4'), 'after': [], 'is_generator': False, 'duration': (1774103419.0106378, 1774103419.010855)}), outcome=<TaskOutcome.SUCCESS: 1>, exc_info=None, sections=[])], exit_code=<ExitCode.OK: 0>, collection_start=1774103418.8794808, collection_end=1774103418.8821704, execution_start=1774103418.8835135, execution_end=1774103419.042351, n_tasks_failed=0, scheduler=TopologicalSorter(dag=<networkx.classes.digraph.DiGraph object at 0x715d04a13bb0>, priorities={'82d6a7ce01a2a50d5d4bd5081d662df92b8c500fbc172f94fb026c9d1d4ebc4a': 0, '2a06f358fc8e621754c133af76f5ac1b3e8ad5172b5803823cb264b30ea5d829': 0, '45a637ca3cc7aa973d4b315cc1bef02217b79918357fd35c6fa61f4e2d2f9948': 0}, _nodes_processing=set(), _nodes_done={'2a06f358fc8e621754c133af76f5ac1b3e8ad5172b5803823cb264b30ea5d829', 
'82d6a7ce01a2a50d5d4bd5081d662df92b8c500fbc172f94fb026c9d1d4ebc4a', '45a637ca3cc7aa973d4b315cc1bef02217b79918357fd35c6fa61f4e2d2f9948'}), should_stop=False, warnings=[])
Repeated Tasks
You can also create multiple tasks with the same function by repeating the task in a loop. (Because we are collecting the tasks ourselves in a list, we don’t necessarily need the @task decorator, but you can still use it.) This is useful when you want to run the same operation with different parameters.
from pytask import Product
tasks = []
for i in range(3):

    def create_file(
        value: int = i * 100, path: Annotated[Path, Product] = Path(f"output_{i}.txt")
    ) -> None:
        path.write_text(f"Result: {value}")

    tasks.append(create_file)
session = pytask.build(tasks=tasks)
────────────────────────────────────────────── Start pytask session ───────────────────────────────────────────────
Platform: linux -- Python 3.13.3, pytask 0.5.9.dev101+gd0f4cb046, pluggy 1.6.0
Root: /home/docs/checkouts/readthedocs.org/user_builds/pytask-dev/checkouts/743
Collected 3 tasks.
╭────────────────────────┬─────────╮
│ Task                   │ Outcome │
├────────────────────────┼─────────┤
│ create_file[0-path0]   │ .       │
│ create_file[100-path1] │ .       │
│ create_file[200-path2] │ .       │
╰────────────────────────┴─────────╯
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
╭─────────── Summary ────────────╮
│ 3 Collected tasks              │
│ 3 Succeeded (100.0%)           │
╰────────────────────────────────╯
──────────────────────────────────────────── Succeeded in 0.09 seconds ────────────────────────────────────────────
# Cleanup
for i in range(3):
    Path(f"output_{i}.txt").unlink()
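The loop in this section passes `i` to each function through default arguments instead of reading it inside the function body. The text does not spell out the reason, but standard Python semantics offer a likely one: default values are evaluated once, when `def` runs, whereas a name used in the body is looked up only when the function is called. The following pure-Python sketch (no pytask involved, all names hypothetical) shows the difference.

```python
# Functions that read the loop variable in their body see its final value.
late = []
for i in range(3):

    def read_in_body():
        return i * 100

    late.append(read_in_body)

# Default arguments are evaluated at definition time, once per iteration.
early = []
for i in range(3):

    def read_from_default(value: int = i * 100):
        return value

    early.append(read_from_default)

late_results = [function() for function in late]
early_results = [function() for function in early]

print(late_results)   # [200, 200, 200]
print(early_results)  # [0, 100, 200]
```

Binding the value in the signature therefore gives every repeated task its own parameters, which matches the distinct task IDs in the output above.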
Configuring the build
pytask.build() accepts many more options; they mirror the ones available on the command line.
pytask.build?
# Cleanup
for name in ("first.txt", "second.txt", "hello_world.txt"):
    Path(name).unlink()