With the realease of kedro==0.17.2
came a new module in the project template pipeline_registry.py
. Here are some notes that I learned while playing with
this new module.
migrating to pipeline_registry.py
- create a
src/<package-name>/pipeline_registry.py
file create a -
register_pipelines
function inpipeline_registry.py
that mirrors the - register_pipelines method from your
hooks.py
module do not bring the -
hook_impl
decorator remove register_pipelines method on yourProjectHooks
- class
You should now have something that looks like this in your
src/<package-name>/pipeline_registry.py
.
"""Project pipelines."""
from typing import Dict
from kedro.pipeline import Pipeline
def register_pipelines() -> Dict[str, Pipeline]:
"""Register the project's pipelines.
Returns: A mapping from a pipeline name to a ``Pipeline`` object.
"""
return {"__default__": Pipeline([])}
pipeline_registry only works in
kedro>=0.17.2
Conflict Resolution
What happens If I register pipelines in both places
I was not able to find any official documentation on how conflict resolution worked so I stepped into a project and added to both my hooks.py
and pipeline_registry.py
file. I noticed that it would pick up pipelines from both modules, but pipelines from hooks.py
always take precedence. The entire duplicate pipeline will be over written by the one from hooks.py
.
kedro automatically merges pipelines from both hooks.py takes precedence
Ready to update
In my experience there were no issues upgrading from 0.17.1
to 0.17.2
. I would reccomend only having one register_pipelines
so decide to migrate to the new pipeline_registry.py
or keep it in your hooks.py
, but both is only going to lead to confusion.