Data node management
In the following, it is assumed that the my_config.py module contains an already implemented Taipy configuration.
Data nodes get created when scenarios or pipelines are created. Please refer to the Entities' creation section for more details.
Data node attributes
A DataNode entity is identified by a unique identifier id that Taipy generates.
A data node also holds various properties and attributes accessible through the entity:
- config_id: The id of the data node config.
- scope: The scope of this data node (scenario, pipeline, etc.).
- id: The unique identifier of this data node.
- name: The user-readable name of the data node.
- parent_id: The identifier of the parent (pipeline_id, scenario_id, cycle_id) or None.
- last_edit_date: The date and time of the last modification.
- job_ids: The ordered list of jobs that have written on this data node.
- cacheable: The Boolean value that indicates if a data node is cacheable.
- validity_period: The validity period of a cacheable data node. If validity_period is set to None, the data node is always up-to-date.
- edit_in_progress: The flag that signals if a task is currently computing this data node.
- properties: The dictionary of additional arguments.
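The interplay between cacheable, last_edit_date and validity_period can be sketched in plain Python. This is only an illustration under stated assumptions, not Taipy's implementation: the function name is_up_to_date is hypothetical, and the rules that data expires validity_period after the last edit and that a never-written node is not up-to-date are assumptions. Only the None case is taken from the list above.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_up_to_date(
    last_edit_date: Optional[datetime],
    validity_period: Optional[timedelta],
    now: datetime,
) -> bool:
    """Toy freshness check for a cacheable data node (hypothetical helper)."""
    if last_edit_date is None:
        # Assumption: a data node that was never written has nothing to reuse.
        return False
    if validity_period is None:
        # From the attribute list: no validity period means always up-to-date.
        return True
    # Assumption: the data expires validity_period after the last edit.
    return now < last_edit_date + validity_period


last_edit = datetime(2022, 1, 1, 12, 0)
one_day = timedelta(days=1)

fresh = is_up_to_date(last_edit, one_day, datetime(2022, 1, 1, 18, 0))
stale = is_up_to_date(last_edit, one_day, datetime(2022, 1, 3, 12, 0))
no_expiry = is_up_to_date(last_edit, None, datetime(2030, 1, 1))
```

Passing the current time explicitly keeps the check deterministic, which makes it easier to reason about when a cached value would be reused.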
Get data node
The first way to get a data node is from its id, using the get() method.
The data nodes that are part of a scenario, pipeline or task can be directly accessed as attributes.
Get all data nodes
All data nodes that are part of a scenario or a pipeline can be directly accessed as attributes:
Example
```python
import taipy as tp
import my_config

# Create a scenario from a config
scenario = tp.create_scenario(my_config.monthly_scenario_cfg)

# Access all the data nodes of the scenario
scenario.data_nodes

# Access the 'sales' pipeline from the scenario, then all the data nodes of that pipeline
pipeline = scenario.sales
pipeline.data_nodes
```
All data nodes can be retrieved with the get_data_nodes() method, which returns a list of all existing data nodes.
Read data node
To read the content of a data node, use the DataNode.read() method. It returns the data stored on the data node, in a form that depends on the type of data node.
It is also possible to partially read the contents of a data node, which comes in handy when dealing with large amounts of data.
To do so, pass an operator, that is, a tuple of (field_name, value, comparison_operator), or a list of operators to the DataNode.filter() method.
If a list of operators is provided, it is necessary to provide a join operator that will be used to combine the filtered results from the operators.
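The operator tuples and the join operator described above can be modelled in plain Python. The sketch below is a stand-in that runs without Taipy: the Operator and JoinOperator enums, the filter_rows function, the default join operator and the sample rows are all defined here for illustration and should not be read as Taipy's own definitions or behavior.

```python
import operator
from enum import Enum
from typing import Any, Dict, List, Tuple, Union


class Operator(Enum):
    """Stand-in comparison operators (illustrative, not Taipy's enum)."""
    EQUAL = "=="
    NOT_EQUAL = "!="
    LESS_THAN = "<"
    GREATER_THAN = ">"


class JoinOperator(Enum):
    """Stand-in join operators (illustrative, not Taipy's enum)."""
    AND = "and"
    OR = "or"


_COMPARE = {
    Operator.EQUAL: operator.eq,
    Operator.NOT_EQUAL: operator.ne,
    Operator.LESS_THAN: operator.lt,
    Operator.GREATER_THAN: operator.gt,
}

Condition = Tuple[str, Any, Operator]
Row = Dict[str, Any]


def filter_rows(
    rows: List[Row],
    conditions: Union[Condition, List[Condition]],
    join: JoinOperator = JoinOperator.AND,  # default chosen for this sketch
) -> List[Row]:
    """Keep the rows matching one condition, or several joined by AND / OR."""
    if isinstance(conditions, tuple):
        conditions = [conditions]  # a single (field, value, operator) tuple
    combine = all if join is JoinOperator.AND else any
    return [
        row
        for row in rows
        if combine(_COMPARE[op](row[field], value) for field, value, op in conditions)
    ]


rows = [
    {"month": 1, "sales": 14},
    {"month": 2, "sales": 10},
    {"month": 3, "sales": 25},
]

single = filter_rows(rows, ("sales", 14, Operator.EQUAL))
either = filter_rows(
    rows,
    [("sales", 14, Operator.EQUAL), ("sales", 10, Operator.EQUAL)],
    JoinOperator.OR,
)
both = filter_rows(
    rows,
    [("sales", 10, Operator.GREATER_THAN), ("month", 3, Operator.LESS_THAN)],
    JoinOperator.AND,
)
```

With OR, a row is kept if any condition matches (the union of the individual results). With AND, it is kept only if every condition matches (their intersection).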
It is also possible to use pandas-style filtering.
Write data node
To write some data on a data node, such as the output of a task, use the DataNode.write() method. The method takes a data object (string, dictionary, list, numpy array, pandas dataframe, etc.) as a parameter and writes it on the data node.
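The write-then-read cycle can be illustrated with a toy in-memory model that runs without Taipy. The InMemoryDataNode class, the job_id parameter, the "sales_history" config id and the job identifiers are hypothetical and exist only for this sketch. It shows how last_edit_date and job_ids, as described in the attribute list, would follow each write.

```python
from datetime import datetime
from typing import Any, List, Optional


class InMemoryDataNode:
    """Toy stand-in for a data node; not Taipy's DataNode class."""

    def __init__(self, config_id: str) -> None:
        self.config_id = config_id
        self.last_edit_date: Optional[datetime] = None
        self.job_ids: List[str] = []
        self._data: Any = None

    def write(self, data: Any, job_id: Optional[str] = None) -> None:
        """Store the data and record when (and by which job) it was written."""
        self._data = data
        self.last_edit_date = datetime.now()
        if job_id is not None:
            self.job_ids.append(job_id)  # ordered list of writer jobs

    def read(self) -> Any:
        """Return the most recently written data."""
        return self._data


node = InMemoryDataNode("sales_history")
node.write([{"month": 1, "sales": 14}], job_id="JOB_1")
node.write([{"month": 1, "sales": 15}], job_id="JOB_2")

latest = node.read()
```

Each write replaces the previous content, so read() returns only the latest data, while job_ids keeps the order in which the jobs wrote.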