A data node is one of the most important concepts in Taipy Core. It does not contain the data itself but holds all the necessary information to read and write the actual data. It can be seen as a dataset descriptor or data reference.
A data node can reference any data type:
- a text,
- a numeric value,
- a list of parameters,
- a custom Python object,
- the content of a JSON file, a CSV file, a Pickle file, etc.
- the content of one or multiple database table(s),
- any other data.
It is designed to model any type of data: input, intermediate, or output data, internal or external data, local or remote data, historical data, a set of parameters, a trained model, etc.
The data node information depends on the data itself, its exposed format, and its storage type.
First example: If the data is stored in an SQL database, the corresponding data node should contain the username, password, host, port, the queries to read and write the data, as well as the Python class used to represent a database line.
Second example: If the data is stored in a CSV file, the corresponding data node should contain, for instance, the path to the file and the Python class used to represent a CSV line.
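The CSV case can be sketched in plain Python. The snippet below is not Taipy's implementation: `CsvDescriptorSketch` and `SaleRecord` are hypothetical names, and the temporary file path is made up for the illustration. It only shows the idea of a descriptor that stores the file path and the row class, and touches the file only when `read()` or `write()` is called.

```python
import csv
import os
import tempfile
from dataclasses import asdict, dataclass, fields


@dataclass
class SaleRecord:
    """Hypothetical class representing one CSV line."""
    month: str
    units: int


@dataclass
class CsvDescriptorSketch:
    """Hypothetical descriptor: holds how to reach the data, not the data."""
    path: str
    row_class: type

    def write(self, rows):
        # Column names are taken from the fields of the row class.
        names = [f.name for f in fields(self.row_class)]
        with open(self.path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=names)
            writer.writeheader()
            writer.writerows(asdict(row) for row in rows)

    def read(self):
        # Each CSV value is converted to the type declared on the row class.
        with open(self.path, newline="") as handle:
            return [
                self.row_class(
                    **{f.name: f.type(line[f.name]) for f in fields(self.row_class)}
                )
                for line in csv.DictReader(handle)
            ]


# Usage: the descriptor itself stays small; the data lives in the file.
path = os.path.join(tempfile.mkdtemp(), "sales_history.csv")
node = CsvDescriptorSketch(path=path, row_class=SaleRecord)
node.write([SaleRecord("2023-01", 120), SaleRecord("2023-02", 95)])
records = node.read()
```

The design choice to note is that the object only carries the path and the row class, so it can be passed around cheaply while the actual data is loaded on demand.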
Let's take a realistic example.
Assume we want to build an application that predicts the monthly sales demand in order to adjust production planning, subject to a capacity constraint.
The flowchart below represents the various data nodes we want the tasks (in orange) to process.
We have six data nodes modeling the data (the dark blue boxes): one each for the sales history, the trained model, the current month, the sales predictions, the production capacity, and the production orders.
Taipy provides various predefined data nodes corresponding to the most popular storage types. More details are available on the Data node configuration page.
In our example, the sales history comes from the company record system, so we do not control its storage type: we receive the data as a CSV file. We therefore use the predefined CSV data node to model the sales history.
As for the production orders data node, we want to write the data into a database shared by other systems. We can use the SQL data node to model the production orders.
We have no particular specification for the other data nodes. We use the default storage type: Pickle.
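The storage choices for the six data nodes can be summarized as a small lookup. This is a plain-Python sketch rather than Taipy configuration code: the identifiers such as `sales_history` and the `storage_type_of` helper are assumed names, and the storage labels simply mirror the choices described above.

```python
# Storage type used when a data node has no particular requirement.
DEFAULT_STORAGE = "pickle"

# Only the data nodes with an explicit requirement are listed here.
EXPLICIT_STORAGE = {
    "sales_history": "csv",      # received as a CSV file from the record system
    "production_orders": "sql",  # written to a database shared by other systems
}

# Hypothetical identifiers for the six data nodes of the example.
DATA_NODE_IDS = [
    "sales_history",
    "trained_model",
    "current_month",
    "sales_predictions",
    "production_capacity",
    "production_orders",
]


def storage_type_of(node_id: str) -> str:
    """Return the explicit storage type, falling back to the default."""
    return EXPLICIT_STORAGE.get(node_id, DEFAULT_STORAGE)


storage_by_node = {node_id: storage_type_of(node_id) for node_id in DATA_NODE_IDS}
```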
A data node's attributes are populated based on its data node configuration, DataNodeConfig, which must be provided when instantiating a new data node. (Please refer to the configuration details documentation for more details on configuration.)
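The relationship between a configuration and a data node can be pictured with the following minimal sketch. It does not use Taipy's classes: `NodeConfigSketch` and `NodeSketch` are hypothetical stand-ins, and the `path` property and its value are made up for the illustration. The only point shown is that the configuration is a required argument and that the node's attributes are copied from it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NodeConfigSketch:
    """Hypothetical stand-in for a data node configuration."""
    id: str
    storage_type: str = "pickle"
    properties: dict = field(default_factory=dict)


class NodeSketch:
    """Hypothetical data node whose attributes come from its configuration."""

    def __init__(self, config: NodeConfigSketch):
        # The configuration is mandatory: the parameter has no default value.
        self.config_id = config.id
        self.storage_type = config.storage_type
        # Copy the properties so later changes to the node leave the config untouched.
        self.properties = dict(config.properties)


sales_history_cfg = NodeConfigSketch(
    id="sales_history",
    storage_type="csv",
    properties={"path": "sales_history.csv"},  # illustrative value only
)
sales_history = NodeSketch(sales_history_cfg)
```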