
DataNode class

Bases: _Entity, _Labeled

Reference to a dataset.

A Data Node is an abstract class that holds metadata related to the data it refers to. In particular, a data node holds the name, the scope, the owner identifier, the last edit date, and some additional properties of the data.
A Data Node also contains information and methods needed to access the dataset. This information depends on the type of storage, and it is held by subclasses (such as SQL Data Node, CSV Data Node, ...).

Note

It is not recommended to instantiate subclasses of DataNode directly. Instead, use one of these two approaches:

  1. Create a Scenario using the create_scenario() function. Related data nodes will be created automatically. Please refer to the Scenario class for more information.
  2. Configure a DataNodeConfig with the various configuration methods from Config and use the create_global_data_node() function as illustrated in the following example.

A data node's attributes are populated based on its configuration DataNodeConfig.

Example

import taipy as tp
from taipy import Config

if __name__ == "__main__":
    # Configure a global data node
    dataset_cfg = Config.configure_data_node("my_dataset", scope=tp.Scope.GLOBAL)

    # Instantiate a global data node
    dataset = tp.create_global_data_node(dataset_cfg)

    # Retrieve the list of all data nodes
    all_data_nodes = tp.get_data_nodes()

    # Write the data
    dataset.write("Hello, World!")

    # Read the data
    print(dataset.read())

Attributes

config_id property

config_id: str

Identifier of the data node configuration. It must be a valid Python identifier.

edit_in_progress property writable

edit_in_progress: bool

True if the data node is locked for modification. False otherwise.

editor_expiration_date property writable

editor_expiration_date: Optional[datetime]

The expiration date of the editor lock.

editor_id property writable

editor_id: Optional[str]

The identifier of the user who is currently editing the data node.

edits property

edits: List[Edit]

The list of Edits.

Each Edit (an alias for dict) contains metadata about one edit of the data, including but not limited to:

  • timestamp: The time at which the data was written
  • comments: Free text explaining or commenting on the data change
  • job_id: The identifier of the job, populated only when the data node is written by a task execution

Additional metadata related to the edit made to the data node can also be provided in Edits.
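
As a rough illustration of how the edits list might be consumed, the sketch below formats each entry using only the keys listed above. summarize_edits is a hypothetical helper, not part of Taipy, and sample_edits is made-up data shaped like the entries described here rather than real Taipy output.

```python
from datetime import datetime
from typing import Any, Dict, List


def summarize_edits(edits: List[Dict[str, Any]]) -> List[str]:
    """Return one text line per edit, in list order (hypothetical helper)."""
    lines = []
    for edit in edits:
        # Each Edit is a plain dict, so optional keys are read defensively.
        when = edit.get("timestamp")
        when_text = when.isoformat() if when else "unknown time"
        source = f"job {edit['job_id']}" if "job_id" in edit else "manual edit"
        comment = edit.get("comments", "")
        lines.append(f"{when_text} | {source} | {comment}")
    return lines


# Stand-in data (assumption): two edits, one from a job and one manual.
sample_edits = [
    {"timestamp": datetime(2024, 1, 1, 9, 0), "job_id": "JOB_clean_1"},
    {"timestamp": datetime(2024, 1, 2, 14, 30), "comments": "Fixed outliers"},
]

for line in summarize_edits(sample_edits):
    print(line)
```

With a real data node, the same helper would presumably be called as summarize_edits(dataset.edits).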

expiration_date property

expiration_date: datetime

Datetime instant of the expiration date of this data node.

id instance-attribute

id: DataNodeId = id or _new_id(_config_id)

The unique identifier of the data node.

is_ready_for_reading property

is_ready_for_reading: bool

Indicate if this data node is ready for reading.

False if the data is locked for modification or if the data has never been written. True otherwise.

is_up_to_date property

is_up_to_date: bool

Indicate if this data node is up-to-date.

False if a preceding data node has been updated more recently than this data node, or if this data node is invalid.
True otherwise.

is_valid property

is_valid: bool

Indicate if this data node is valid.

False if the data has never been written or the expiration date has passed.
True otherwise.

job_ids property

job_ids: List[JobId]

List of the jobs having edited this data node.

last_edit_date property writable

last_edit_date: Optional[datetime]

The date and time of the last modification.

name property writable

name: Optional[str]

A human-readable name of the data node.

owner_id property

owner_id: Optional[str]

The identifier of the owner (sequence_id, scenario_id, cycle_id) or None.

parent_ids property

parent_ids: Set[str]

The set of identifiers of the parent tasks.

properties property

properties

Dictionary of custom properties.

scope property writable

scope: Scope

The data node scope.

validity_period property writable

validity_period: Optional[timedelta]

The duration since the last edit date for which the data node is considered up-to-date.

The duration is expressed as a timedelta counted from the last edit date. Once the validity period has passed, the data node is considered stale and relevant tasks will run even if they are skippable (see the Task orchestration page of the user manual for more details).

If validity_period is set to None, the data node is always up-to-date.
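
The rule can be modeled with plain datetime arithmetic. The sketch below is a standalone reading of the description above, not Taipy's implementation: within_validity_period is a hypothetical function, and treating a node that has never been written as not valid follows the is_valid description.

```python
from datetime import datetime, timedelta
from typing import Optional


def within_validity_period(
    last_edit_date: Optional[datetime],
    validity_period: Optional[timedelta],
    now: datetime,
) -> bool:
    """Model of the documented validity rule (illustrative only)."""
    if last_edit_date is None:
        return False  # never written
    if validity_period is None:
        return True  # no validity period: never expires
    # The data expires once `validity_period` has elapsed since the last edit.
    return now <= last_edit_date + validity_period


last_edit = datetime(2024, 1, 1, 12, 0)
one_day = timedelta(days=1)

print(within_validity_period(last_edit, one_day, datetime(2024, 1, 2, 11, 0)))
print(within_validity_period(last_edit, one_day, datetime(2024, 1, 3, 0, 0)))
print(within_validity_period(last_edit, None, datetime(2030, 1, 1)))
```

The three calls show a node still inside its one-day period, the same node after the period has elapsed, and a node with no validity period.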

version property

version: str

The application version of the data node to instantiate, as a string.

If not provided, the current version is used.

Methods

append()

append(
    data,
    job_id: Optional[JobId] = None,
    **kwargs: Dict[str, Any]
)

Append some data to this data node.

Parameters:

Name Type Description Default
data Any

The data to write to this data node.

required
job_id JobId

An optional identifier of the writer.

None
**kwargs Dict[str, Any]

Extra information to attach to the edit document corresponding to this write.

{}

filter()

filter(
    operators: Union[List, Tuple],
    join_operator=JoinOperator.AND,
) -> Any

Read and filter the data referenced by this data node.

The data is filtered by the provided list of 3-tuples (key, value, Operator).

If multiple filter operators are provided, filtered data will be joined based on the join operator (AND or OR).

Parameters:

Name Type Description Default
operators Union[List[Tuple], Tuple]

A 3-element tuple or a list of 3-element tuples, each of the form (key, value, Operator).

required
join_operator JoinOperator

The operator used to join the multiple filter 3-tuples.

AND

Returns:

Type Description
Any

The filtered data.

Raises:

Type Description
NotImplementedError

If the data type is not supported.
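
To make the 3-tuple and join-operator semantics concrete, here is a small standalone model that filters a list of dictionaries. It does not call Taipy: filter_rows is a hypothetical function, the comparison functions from Python's operator module stand in for Taipy's Operator values, and the strings "AND" and "OR" stand in for JoinOperator.

```python
import operator
from typing import Any, Callable, Dict, List, Tuple

# (key, value, comparison), mirroring the (key, value, Operator) form above.
Condition = Tuple[str, Any, Callable[[Any, Any], bool]]


def filter_rows(
    rows: List[Dict[str, Any]],
    conditions: List[Condition],
    join: str = "AND",
) -> List[Dict[str, Any]]:
    """Keep the rows matching all ("AND") or any ("OR") of the conditions."""
    if join not in ("AND", "OR"):
        raise ValueError(f"Unsupported join operator: {join}")
    combine = all if join == "AND" else any
    return [
        row
        for row in rows
        if combine(compare(row[key], value) for key, value, compare in conditions)
    ]


rows = [
    {"city": "Paris", "temp": 21},
    {"city": "Oslo", "temp": 9},
    {"city": "Rome", "temp": 27},
]
conditions = [("temp", 20, operator.gt), ("city", "Paris", operator.eq)]

print([r["city"] for r in filter_rows(rows, conditions, "AND")])
print([r["city"] for r in filter_rows(rows, conditions, "OR")])
```

With AND, only rows satisfying every tuple survive; with OR, a row is kept as soon as one tuple matches.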

get_label()

get_label() -> str

Returns the data node simple label prefixed by its owner label.

Returns:

Type Description
str

The label of the data node as a string.

get_last_edit()

get_last_edit() -> Optional[Edit]

Get the last Edit of this data node.

Returns:

Type Description
Optional[Edit]

The last Edit of this data node, or None if there has been no Edit on this data node.

get_parents()

get_parents() -> Dict[str, Set[_Entity]]

Get all parents of this data node.

Returns:

Type Description
Dict[str, Set[_Entity]]

The dictionary of all parent entities. They are grouped by their type (Scenario, Sequence, or Task), so each key corresponds to a level of the parents and the value is a set of the parent entities. An empty dictionary is returned if the entity does not have parents.

get_simple_label()

get_simple_label() -> str

Returns the data node simple label.

Returns:

Type Description
str

The simple label of the data node as a string.

lock_edit()

lock_edit(editor_id: Optional[str] = None)

Lock the data node for modification.

Note

The data node can be unlocked with the method unlock_edit().

Parameters:

Name Type Description Default
editor_id Optional[str]

The editor's identifier.

None

read()

read() -> Any

Read the data referenced by this data node.

Returns:

Type Description
Any

The data referenced by this data node. None if the data has not been written yet.

read_or_raise()

read_or_raise() -> Any

Read the data referenced by this data node.

Returns:

Type Description
Any

The data referenced by this data node.

Raises:

Type Description
NoData

If the data has not been written yet.

storage_type() abstractmethod classmethod

storage_type() -> str

The storage type of the data node.

Each subclass must implement this method exposing the data node storage type.

track_edit()

track_edit(**options)

Creates and adds a new entry in the edits attribute without writing the data.

Parameters:

Name Type Description Default
options Dict[str, Any]

The metadata to record for the edit. The keys timestamp, comments, and job_id are tracked; any other key is user-defined, so options can be used to attach arbitrary information to an external edit of a data node.

{}
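
A possible use is logging a change made outside Taipy, for instance a file refreshed by another process. In the sketch below, record_external_edit is a hypothetical helper, the source key is an arbitrary user-defined option, and _FakeTrackedNode is a stand-in that only stores what it receives so the snippet runs without Taipy.

```python
from datetime import datetime
from typing import Any, Dict, List


def record_external_edit(data_node: Any, comments: str, **extra: Any) -> None:
    """Log an out-of-band change through track_edit() (hypothetical helper)."""
    data_node.track_edit(timestamp=datetime.now(), comments=comments, **extra)


class _FakeTrackedNode:
    """Stand-in (assumption): keeps every set of options as one edit entry."""

    def __init__(self) -> None:
        self.edits: List[Dict[str, Any]] = []

    def track_edit(self, **options: Any) -> None:
        self.edits.append(dict(options))


node = _FakeTrackedNode()
record_external_edit(node, "Nightly file drop", source="sftp")
print(node.edits[-1]["comments"], node.edits[-1]["source"])
```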

unlock_edit()

unlock_edit(editor_id: Optional[str] = None)

Unlock the data node for modification.

Note

The data node can be locked with the method lock_edit().

Parameters:

Name Type Description Default
editor_id Optional[str]

The editor's identifier.

None

write()

write(
    data,
    job_id: Optional[JobId] = None,
    **kwargs: Dict[str, Any]
)

Write some data to this data node.

Parameters:

Name Type Description Default
data Any

The data to write to this data node.

required
job_id JobId

An optional identifier of the writer.

None
**kwargs Dict[str, Any]

Extra information to attach to the edit document corresponding to this write.

{}