Data integration
In this section, we explore how to integrate data into your Taipy application using data nodes.
A DataNode is the cornerstone of Taipy's data management capabilities, providing a flexible and consistent way to handle data from various sources. Whether your data resides in files, databases, or custom data stores, locally or remotely, data nodes simplify the process of accessing, processing, and managing your data.
What is a Data Node?
A data node in Taipy is an abstraction that represents some data. It provides a uniform interface for reading and writing data, regardless of the underlying storage mechanism. This abstraction allows you to focus on your application's logic without worrying about the intricacies of data management.
A data node does not contain the data itself but holds all the information needed to read and write the actual data. It can be seen as a dataset descriptor or data reference. It is designed to model data:
- For any format: a built-in Python object (e.g. an integer, a string, a dictionary or list of parameters, etc.) or a more complex object (e.g. a file, a machine learning model, a list of custom objects, the result of a database query, etc.).
- For any type: internal or external data, local or remote data, historical data, a parameter or a parameter set, a trained model, etc.
- For any usage: independent data or data related to others through data processing pipelines or scenarios.
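The "descriptor, not container" idea can be illustrated with a small standalone sketch. This is not Taipy's implementation: the CsvReference class, its methods, and the sales.csv file name are hypothetical and only show how an object can hold the location of the data while loading and storing it on demand.

```python
import csv
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CsvReference:
    """Hypothetical descriptor: stores where the data lives, not the data."""

    path: Path

    def write(self, rows: list[dict[str, str]]) -> None:
        # Persist the rows to the file this descriptor points at.
        with self.path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)

    def read(self) -> list[dict[str, str]]:
        # Load the rows from storage each time they are requested.
        with self.path.open(newline="") as f:
            return list(csv.DictReader(f))


# The descriptor itself only carries a path; the data stays on disk.
ref = CsvReference(Path(tempfile.mkdtemp()) / "sales.csv")
ref.write([{"month": "Jan", "units": "12"}])
rows = ref.read()
```

A Taipy data node plays a comparable role, with the storage details supplied by its configuration rather than hard-coded in a class.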
To create a data node, you first need to define a data node configuration using a DataNodeConfig object. This configuration is used to instantiate one or more data nodes with the desired properties.
Why use Data Nodes?
The main advantages of using data nodes in a Taipy project are:
- Easy to configure: Thanks to the various predefined data nodes, many types of data can be easily integrated. For more details, see the data node configuration page.
- Easy to use: Taipy already implements the necessary utility methods to create, get, read, write, filter, or append data nodes. For more details, see the data node usage page.
- Taipy visual elements: Benefit from smart visual elements to empower end users in just one line of code. Manage, display, and edit data nodes in a user-friendly graphical interface. For more details, see the data node selector or the data node viewer pages.
- Data history and validity period: Keep track of the data editing history, and monitor the data validity. For more information, see the data node history page.
- Seamless integration with task orchestration and scenario management: Data pipelines in Taipy are modeled as execution graphs within scenarios, connecting data nodes through tasks. Task orchestration and scenario management are key features of Taipy. For more information, see the task orchestration or scenario and data management pages.
- Support for multiple alternative datasets for what-if analysis: Easily manage alternative data nodes as different versions or variations of your dataset within the same application. This is particularly useful for what-if analysis. For more information, see the what-if analysis page.
How to use Data Nodes?
A DataNode is instantiated from a DataNodeConfig object, which encapsulates the information needed to create the data node (e.g. the data source, the data format, the data type, and the way to read and write the data).
To integrate a data node into your Taipy application, follow these steps:
- Define a DataNodeConfig: Create a global DataNodeConfig object using one of the predefined methods available in Taipy, such as Config.configure_data_node(), Config.configure_csv_data_node(), Config.configure_json_data_node(), etc. For more details, see the data node configuration page.
- Instantiate a DataNode: Once you have defined the data node configuration, you can instantiate your DataNode using the tp.create_global_data_node() method. For more details, see the data node usage page.
- Access or visualize your data: You can now retrieve your DataNode and read, write, filter, or append data as needed. For more details, see the data node usage page. You can also use the Taipy visual elements to manage, display, and edit your data nodes. For more details, see the data node visual elements page.
Examples
Here is an example of how to integrate some data and use a global data node:
The previous code snippet shows how to configure a data node, instantiate it,
retrieve it, write some data, and read it back.
Here is the complete Python code corresponding to the example.
Here is another example of how to integrate some data and visualize it using the data node visual elements:
In the previous code snippet, three data node configurations are created. Some default data is passed to each of them. Then, the data nodes are instantiated. Finally, a GUI service is started with two visual elements to visualize and edit the data nodes through a user-friendly interface.
Here is the complete Python code corresponding to the example.