triplets package
Subpackages
- triplets.export_schema package
- triplets.rdfs_tools package
- Submodules
- triplets.rdfs_tools.RDFS_to_AzureDTDL_V2 module
- triplets.rdfs_tools.cim_rdfs_to_html module
- triplets.rdfs_tools.cim_rdfs_to_json module
- triplets.rdfs_tools.cim_rdfs_to_json_deprecated module
- triplets.rdfs_tools.rdfs_tools module
- concrete_classes_list()
- dangling_references()
- get_all_class_parameters()
- get_class_parameters()
- get_namespace_and_name()
- get_owl_metadata()
- get_profile_metadata()
- get_used_relations()
- list_of_files()
- multiplicity_to_XSD_format()
- parameters_tableview()
- parameters_tableview_all()
- parse_multiplicity()
- validation_view()
- Module contents
Submodules
triplets.cgmes_tools module
- triplets.cgmes_tools.darw_relations_graph(reference_data, ID_COLUMN='ID', notebook=False)[source]
Create a temporary HTML file to visualize relations in a CGMES dataset.
- Parameters:
reference_data (pandas.DataFrame) – Triplet dataset containing reference data for visualization.
ID_COLUMN (str) – Column name containing IDs (e.g., ‘ID’).
notebook (bool, optional) – If True, render the graph for Jupyter notebook (default is False).
- Returns:
File path to the generated HTML file (if notebook=False) or the Network object (if notebook=True).
- Return type:
str or pyvis.network.Network
Notes
Uses pyvis for visualization with a hierarchical layout.
Nodes include object data in hover tooltips.
Examples
>>> file_path = darw_relations_graph(data, 'ID')
- triplets.cgmes_tools.draw_relations(data, UUID, notebook=False, levels=2)[source]
Visualize all relations (incoming and outgoing) for a specific UUID in a CGMES dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
UUID (str) – UUID of the object to visualize relations for.
notebook (bool, optional) – If True, render the graph for Jupyter notebook (default is False).
levels (int, optional) – Number of levels to traverse for relations (default is 2).
- Returns:
File path to the generated HTML file (if notebook=False) or the Network object (if notebook=True).
- Return type:
str or pyvis.network.Network
Examples
>>> file_path = draw_relations(data, 'uuid1', levels=3)
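A short sketch of notebook usage (the UUID and output file name are placeholders); the returned pyvis Network object can typically be written to HTML with its show() method:
>>> net = draw_relations(data, 'uuid1', notebook=True, levels=2)
>>> net.show('relations.html')  # writes the HTML rendering to disk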
- triplets.cgmes_tools.draw_relations_from(data, UUID, notebook=False)[source]
Visualize relations originating from a specific UUID in a CGMES dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
UUID (str) – UUID of the object to visualize outgoing relations for.
notebook (bool, optional) – If True, render the graph for Jupyter notebook (default is False).
- Returns:
File path to the generated HTML file (if notebook=False) or the Network object (if notebook=True).
- Return type:
str or pyvis.network.Network
Examples
>>> file_path = draw_relations_from(data, 'uuid1')
- triplets.cgmes_tools.draw_relations_to(data, UUID, notebook=False)[source]
Visualize relations pointing to a specific UUID in a CGMES dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
UUID (str) – UUID of the object to visualize incoming relations for.
notebook (bool, optional) – If True, render the graph for Jupyter notebook (default is False).
- Returns:
File path to the generated HTML file (if notebook=False) or the Network object (if notebook=True).
- Return type:
str or pyvis.network.Network
Examples
>>> file_path = draw_relations_to(data, 'uuid1')
- triplets.cgmes_tools.generate_instances_ID(dependencies={'EQ': ['EQBD'], 'EQBD': [], 'SSH': ['EQ'], 'SV': ['TPBD', 'TP', 'SSH'], 'TP': ['EQ'], 'TPBD': ['EQBD']})[source]
Generate UUIDs for each profile defined in the dependencies dictionary.
- Parameters:
dependencies (dict, optional) – Dictionary mapping profile names to lists of dependent profile names. Defaults to a predefined CGMES profile dependencies dictionary.
- Returns:
Dictionary with profile names as keys and generated UUIDs as values.
- Return type:
dict
Examples
>>> generate_instances_ID()
{'EQ': '123e4567-e89b-12d3-a456-426614174000', ...}
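A reduced dependency dictionary can also be passed to generate IDs for only the profiles of interest (the returned UUIDs are random; the keys mirror the dictionary):
>>> ids = generate_instances_ID({'EQ': [], 'SSH': ['EQ']})
>>> sorted(ids.keys())
['EQ', 'SSH']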
- triplets.cgmes_tools.get_EIC_to_mRID_map(data, type)[source]
Map Energy Identification Codes (EIC) to mRIDs for a specific object type.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
type (str) – Object type to filter (e.g., ‘PowerTransformer’).
- Returns:
DataFrame with columns [‘mRID’, ‘EIC’] mapping EICs to mRIDs.
- Return type:
pandas.DataFrame
Notes
Filters data for objects of the specified type with ‘IdentifiedObject.energyIdentCodeEic’ key.
TODO: Add support for type=None to return all types and include type in result.
Examples
>>> eic_map = get_EIC_to_mRID_map(data, 'PowerTransformer')
>>> print(eic_map)
    mRID               EIC
0  uuid1  10X1001A1001A021
- triplets.cgmes_tools.get_GeneratingUnits(data)[source]
Retrieve a table of GeneratingUnits from a CGMES dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
- Returns:
DataFrame containing GeneratingUnit data, filtered by ‘GeneratingUnit.maxOperatingP’.
- Return type:
pandas.DataFrame
Examples
>>> units = get_GeneratingUnits(data)
>>> print(units)
      ID  GeneratingUnit.maxOperatingP  ...
- triplets.cgmes_tools.get_dangling_references(data, detailed=False)[source]
Identify dangling references in a CGMES dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
detailed (bool, optional) – If True, return detailed DataFrame of dangling references; otherwise, return counts of dangling reference types (default is False).
- Returns:
If detailed=True, a DataFrame with dangling references; otherwise, a Series with counts of dangling reference keys.
- Return type:
pandas.DataFrame or pandas.Series
Notes
Identifies references using the CGMES convention (e.g., keys with ‘.<CapitalLetter>’).
A dangling reference is one where the referenced ID does not exist in the dataset.
Examples
>>> dangling = get_dangling_references(data, detailed=True)
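A short sketch of how the two output modes might be combined in a validation check (assumes data was loaded with load_all_to_dataframe; the triplet-shaped column layout of the detailed output is assumed):
>>> counts = get_dangling_references(data)             # Series: counts per reference key
>>> if counts.sum() > 0:
...     detailed = get_dangling_references(data, detailed=True)
...     print(detailed.head())                         # inspect the first dangling references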
- triplets.cgmes_tools.get_filename_from_metadata(meta_data, file_type='xml', filename_mask='{scenarioTime:%Y%m%dT%H%MZ}_{processType}_{modelingEntity}_{messageType}_{version:03d}')[source]
Generate a CGMES filename from metadata using a specified filename mask.
- Parameters:
meta_data (dict) – Dictionary containing metadata keys (e.g., ‘scenarioTime’, ‘processType’) and values.
file_type (str, optional) – File extension for the generated filename (default is ‘xml’).
filename_mask (str, optional) – Format string defining the filename structure (default follows CGMES convention).
- Returns:
Generated filename adhering to the CGMES naming convention.
- Return type:
str
Notes
Removes ‘Model.’ prefix from metadata keys for compatibility with string formatting.
Converts ‘scenarioTime’ to datetime and ‘version’ to integer before formatting.
Uses the provided filename_mask to construct the filename.
Examples
>>> meta = {'Model.scenarioTime': '20230101T0000Z', 'Model.processType': 'A01', ...}
>>> get_filename_from_metadata(meta)
'20230101T0000Z_A01_ENTITY_EQ_001.xml'
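This function is the counterpart of get_metadata_from_filename(), so a filename can be parsed, its metadata adjusted, and a new name generated. A sketch of such a round trip (the 'Model.version' key and the version bump are illustrative):
>>> meta = get_metadata_from_filename('20230101T0000Z_A01_ENTITY_EQ_001.xml')
>>> meta['Model.version'] = '002'                      # illustrative version bump
>>> get_filename_from_metadata(meta)
'20230101T0000Z_A01_ENTITY_EQ_002.xml'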
- triplets.cgmes_tools.get_limits(data)[source]
Retrieve operational limits from a CGMES dataset, including equipment types.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
- Returns:
DataFrame containing operational limits with associated equipment types.
- Return type:
pandas.DataFrame
Notes
Combines OperationalLimitSet, OperationalLimit, OperationalLimitType, and Terminal data.
Links equipment via Terminal.ConductingEquipment or OperationalLimitSet.Equipment.
Examples
>>> limits = get_limits(data)
- triplets.cgmes_tools.get_loaded_model_parts(data)[source]
Retrieve a DataFrame of loaded CGMES model parts with their FullModel metadata.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
- Returns:
DataFrame containing FullModel data for loaded model parts.
- Return type:
pandas.DataFrame
Notes
Does not correctly resolve ‘Model.DependentOn’ relationships.
Examples
>>> model_parts = get_loaded_model_parts(data)
- triplets.cgmes_tools.get_loaded_models(data)[source]
Retrieve a dictionary of loaded CGMES model parts and their UUIDs.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data with ‘Model.profile’ and ‘Model.DependentOn’ keys.
- Returns:
Dictionary where keys are StateVariables (SV) UUIDs and values are DataFrames containing model parts (ID, PROFILE, INSTANCE_ID) and their dependencies.
- Return type:
dict
Examples
>>> models = get_loaded_models(data)
>>> print(models)
{'SV_UUID': DataFrame(...), ...}
- triplets.cgmes_tools.get_metadata_from_FullModel(data)[source]
Extract metadata from the FullModel entries in a CGMES triplet dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data with ‘KEY’, ‘VALUE’, and ‘ID’ columns.
- Returns:
Dictionary of metadata key-value pairs for the FullModel instance.
- Return type:
dict
Notes
Assumes the dataset contains a ‘Type’ key with value ‘FullModel’.
Removes the ‘Type’ key from the resulting metadata dictionary.
Examples
>>> meta = get_metadata_from_FullModel(data)
>>> print(meta)
{'Model.scenarioTime': '20230101T0000Z', 'Model.processType': 'A01', ...}
- triplets.cgmes_tools.get_metadata_from_filename(file_name)[source]
Extract metadata from a CGMES filename following the CGMES naming convention.
- Parameters:
file_name (str) – Name of the CGMES file (e.g., ‘20230101T0000Z_A01_ENTITY_EQ_001.xml’).
- Returns:
Dictionary containing metadata keys (e.g., ‘Model.scenarioTime’, ‘Model.processType’) and their corresponding values extracted from the filename.
- Return type:
dict
Notes
Expects filenames to follow CGMES conventions with underscores separating metadata fields.
Handles cases with 4 or 5 metadata elements, setting ‘Model.processType’ to empty string for older formats (pre-QoDC 2.1).
Splits ‘Model.modelingEntity’ into ‘Model.mergingEntity’, ‘Model.domain’, and ‘Model.forEntity’ if applicable.
Examples
>>> get_metadata_from_filename('20230101T0000Z_A01_ENTITY_EQ_001.xml')
{'Model.scenarioTime': '20230101T0000Z', 'Model.processType': 'A01', ...}
- triplets.cgmes_tools.get_metadata_from_xml(filepath_or_fileobject)[source]
Extract metadata from the FullModel element of a CGMES XML file.
- Parameters:
filepath_or_fileobject (str or file-like object) – Path to the XML file or a file-like object containing CGMES XML data.
- Returns:
DataFrame with columns [‘tag’, ‘text’, ‘attrib’] containing metadata from the FullModel element.
- Return type:
pandas.DataFrame
Examples
>>> df = get_metadata_from_xml('path/to/file.xml')
>>> print(df)
                  tag            text attrib
0  Model.scenarioTime  20230101T0000Z     {}
...
- triplets.cgmes_tools.get_model_data(data, model_instances_dataframe)[source]
Extract data for specific CGMES model instances.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
model_instances_dataframe (pandas.DataFrame) – DataFrame containing ‘INSTANCE_ID’ column with model instance identifiers.
- Returns:
Filtered dataset containing only data for the specified model instances.
- Return type:
pandas.DataFrame
Examples
>>> model_data = get_model_data(data, models['SV_UUID'])
- triplets.cgmes_tools.scale_load(data, load_setpoint, cos_f=None)[source]
Scale active and reactive power loads in a CGMES SSH instance.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing SSH load information.
load_setpoint (float) – Target total active power (P) setpoint for scaling.
cos_f (float, optional) – Cosine of the power factor angle (cos(φ)). If None, calculated from the ratio of total Q to P.
- Returns:
Updated dataset with scaled P and Q values for ConformLoad instances.
- Return type:
pandas.DataFrame
Notes
Scales only ConformLoad instances, preserving NonConformLoad values.
Maintains or computes the power factor using cos_f.
Examples
>>> updated_data = scale_load(data, load_setpoint=1000.0, cos_f=0.9)
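A sketch of scaling loads and then checking the resulting total with key_tableview() from triplets.rdf_parser (the target value and power factor are arbitrary, and the 'EnergyConsumer.p' key name is an assumption about the SSH schema in use):
>>> scaled = scale_load(data, load_setpoint=1000.0, cos_f=0.9)
>>> p = key_tableview(scaled, 'EnergyConsumer.p')      # key name assumed
>>> p['EnergyConsumer.p'].sum()                        # should be close to the setpoint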
- triplets.cgmes_tools.statistics_GeneratingUnit_types(data)[source]
Calculate statistics for GeneratingUnit types in a CGMES dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
- Returns:
DataFrame with counts, total, and percentage of each GeneratingUnit type.
- Return type:
pandas.DataFrame
Examples
>>> stats = statistics_GeneratingUnit_types(data)
>>> print(stats)
    Type  count  TOTAL     %
0  Hydro     10     20  50.0
...
- triplets.cgmes_tools.switch_equipment_terminals(data, equipment_id, connected: str = 'false')[source]
Update connection statuses of terminals for specified equipment in a CGMES dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing EQ and SSH information.
equipment_id (str or list) – Identifier(s) (mRID) of the equipment whose terminals’ statuses are to be updated.
connected (str, optional) – New connection status (‘true’ or ‘false’, default is ‘false’).
- Returns:
Updated dataset with modified terminal connection statuses.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If connected is not ‘true’ or ‘false’.
Examples
>>> updated_data = switch_equipment_terminals(data, ['uuid1', 'uuid2'], connected='true')
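For instance, all terminals of a breaker could be opened before re-exporting the SSH instance (the UUID is a placeholder, and the 'ACDCTerminal.connected' key name is an assumption about the SSH schema in use):
>>> updated = switch_equipment_terminals(data, 'breaker-uuid', connected='false')
>>> key_tableview(updated, 'ACDCTerminal.connected')   # key name assumed; verify the change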
- triplets.cgmes_tools.update_FullModel_from_dict(data, metadata, update=True, add=False)[source]
Update or add metadata to FullModel entries in a CGMES triplet dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
metadata (dict) – Dictionary of metadata key-value pairs to update or add.
update (bool, optional) – If True, update existing metadata keys (default is True).
add (bool, optional) – If True, add new metadata keys (default is False).
- Returns:
Updated triplet dataset with modified FullModel metadata.
- Return type:
pandas.DataFrame
Examples
>>> meta = {'Model.scenarioTime': '20230102T0000Z'}
>>> updated_data = update_FullModel_from_dict(data, meta)
- triplets.cgmes_tools.update_FullModel_from_filename(data, parser=<function get_metadata_from_filename>, update=False, add=True)[source]
Update FullModel metadata in a triplet dataset using metadata parsed from filenames.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data with ‘label’ keys for filenames.
parser (callable, optional) – Function to parse metadata from filenames, returning a dictionary (default is get_metadata_from_filename).
update (bool, optional) – If True, update existing metadata keys (default is False).
add (bool, optional) – If True, add new metadata keys (default is True).
- Returns:
Updated triplet dataset with FullModel metadata derived from filenames.
- Return type:
pandas.DataFrame
Examples
>>> updated_data = update_FullModel_from_filename(data)
- triplets.cgmes_tools.update_filename_from_FullModel(data, filename_mask='{scenarioTime:%Y%m%dT%H%MZ}_{processType}_{modelingEntity}_{messageType}_{version:03d}', filename_key='label')[source]
Update filenames in a CGMES triplet dataset based on FullModel metadata.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data with FullModel metadata.
filename_mask (str, optional) – Format string defining the filename structure (default follows CGMES convention).
filename_key (str, optional) – Key in the dataset where filenames are stored (default is ‘label’).
- Returns:
Updated triplet dataset with filenames modified based on FullModel metadata.
- Return type:
pandas.DataFrame
Examples
>>> updated_data = update_filename_from_FullModel(data)
triplets.rdf_parser module
- class triplets.rdf_parser.ExportType(value)[source]
Bases: StrEnum
- XML_PER_INSTANCE = 'xml_per_instance'
- XML_PER_INSTANCE_ZIP_PER_ALL = 'xml_per_instance_zip_per_all'
- XML_PER_INSTANCE_ZIP_PER_XML = 'xml_per_instance_zip_per_xml'
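Because the enum derives from StrEnum, members compare equal to their string values, so either the member or its string can be passed as export_type to export_to_cimxml(). A minimal illustration:
>>> ExportType.XML_PER_INSTANCE == 'xml_per_instance'
True
>>> ExportType('xml_per_instance_zip_per_all') is ExportType.XML_PER_INSTANCE_ZIP_PER_ALL
True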
- triplets.rdf_parser.clean_ID(ID)[source]
Remove common CIM ID prefixes from a string.
- Parameters:
ID (str) – The input ID string to clean.
- Returns:
The ID with prefixes (’urn:uuid:’, ‘#_’, ‘_’) removed from the start.
- Return type:
str
Notes
Sequentially removes ‘urn:uuid:’, ‘#_’, and ‘_’ prefixes using removeprefix.
TODO: Verify if these characters are absent in UUIDs to ensure safe removal.
Examples
>>> clean_ID("urn:uuid:1234")
'1234'
>>> clean_ID("#_abc")
'abc'
- triplets.rdf_parser.diff_between_INSTANCE(data, INSTANCE_ID_1, INSTANCE_ID_2)[source]
Identify differences between two loaded instances, identified by their INSTANCE_ID, within the same triplet DataFrame.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing two or more INSTANCE.
INSTANCE_ID_1 (str) – UUID of the first INSTANCE.
INSTANCE_ID_2 (str) – UUID of the second INSTANCE.
- Returns:
DataFrame containing triplets that differ between the two model parts.
- Return type:
pandas.DataFrame
Examples
>>> diff = diff_between_INSTANCE(data, 'uuid1', 'uuid2')
- triplets.rdf_parser.diff_between_triplet(old_data, new_data)[source]
Compute the difference between two Triplet DataFrames.
- Parameters:
old_data (pandas.DataFrame) – Original triplet dataset.
new_data (pandas.DataFrame) – New triplet dataset to compare against.
- Returns:
DataFrame containing triplets unique to old_data or new_data, with an ‘_merge’ column indicating ‘left_only’ (in old_data) or ‘right_only’ (in new_data).
- Return type:
pandas.DataFrame
Examples
>>> diff = diff_between_triplet(old_data, new_data)
- triplets.rdf_parser.export_to_cimxml(data, rdf_map=None, namespace_map=None, class_KEY='Type', export_undefined=True, export_type=ExportType.XML_PER_INSTANCE_ZIP_PER_XML, global_zip_filename='Export.zip', debug=False, export_to_memory=False, export_base_path='', comment=None, max_workers=None)[source]
Export a full triplet dataset to CIM RDF XML files or ZIP archives.
Processes all instances (grouped by INSTANCE_ID) and exports them according to the specified export_type. Supports parallel processing and in-memory or disk output.
- Parameters:
data (pandas.DataFrame) – Full triplet dataset with columns [‘INSTANCE_ID’, ‘ID’, ‘KEY’, ‘VALUE’].
rdf_map (dict or str, optional) – RDF mapping configuration (see generate_xml()).
namespace_map (dict, optional) – Namespace prefix-to-URI mapping (see generate_xml()).
class_KEY (str, default "Type") – Key identifying object types in triplet data.
export_undefined (bool, default True) – Export unmapped classes/attributes with default RDF syntax.
export_type (ExportType or str, default ExportType.XML_PER_INSTANCE_ZIP_PER_XML) – Export format: XML_PER_INSTANCE: one XML file per instance; XML_PER_INSTANCE_ZIP_PER_ALL: all XMLs in a single ZIP; XML_PER_INSTANCE_ZIP_PER_XML: each XML in its own ZIP.
global_zip_filename (str, default "Export.zip") – Filename for the global ZIP archive (used with ZIP_PER_ALL).
debug (bool, default False) – Enable detailed timing and debug logging.
export_to_memory (bool, default False) – If True, return file-like objects (BytesIO); if False, save to disk.
export_base_path (str, default "") – Directory to save files when export_to_memory=False. Uses the current directory if empty.
comment (str, optional) – Optional XML comment added to each generated file.
max_workers (int, optional) – Number of parallel workers for XML generation. If None, runs sequentially.
- Returns:
If export_to_memory=True: list of BytesIO objects with a .name attribute.
If export_to_memory=False: list of saved filenames (relative to export_base_path).
- Return type:
list
Examples
>>> files = export_to_cimxml(
...     data,
...     rdf_map="config/cim_map.json",
...     export_type=ExportType.XML_PER_INSTANCE_ZIP_PER_XML,
...     export_to_memory=True,
...     max_workers=4
... )
>>> for f in files:
...     print("name:", f.name)
Notes
Uses concurrent.futures.ProcessPoolExecutor for parallel XML generation.
All XML files are UTF-8 encoded with XML declaration and pretty-printing.
ZIP files use DEFLATED compression.
Filenames are derived from the instance label or UUID.
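A sketch of a plain disk export, assuming data was loaded with load_all_to_dataframe(); the output directory and the resulting file names are placeholders:
>>> filenames = export_to_cimxml(
...     data,
...     export_type=ExportType.XML_PER_INSTANCE,
...     export_base_path='exported_models',
...     comment='Exported with triplets',
... )
>>> filenames
['20230101T0000Z_A01_ENTITY_EQ_001.xml', ...]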
- triplets.rdf_parser.export_to_excel(data, path=None)[source]
Export triplet data to an Excel file, with each type on a separate sheet.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
path (str, optional) – Directory path to save the Excel file (default is current working directory).
Notes
Uses ‘label’ key to determine the filename for each INSTANCE_ID.
Each object type is exported to a separate sheet.
TODO: Add support for XlsxWriter properties for better formatting.
Examples
>>> data.export_to_excel("output_dir")
- triplets.rdf_parser.export_to_networkx(data)[source]
Convert a triplet dataset to a NetworkX graph.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
- Returns:
A NetworkX graph with nodes (IDs with Type attributes) and edges (references).
- Return type:
networkx.Graph
Notes
TODO: Add all node data and support additional graph export formats.
Examples
>>> graph = export_to_networkx(data)
- triplets.rdf_parser.filter_by_triplet(data, filter_triplet)[source]
Filter a triplet DataFrame using IDs from another DataFrame.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
filter_triplet (pandas.DataFrame) – DataFrame containing at least an ‘ID’ column to filter by.
- Returns:
Filtered DataFrame with columns [‘ID’, ‘KEY’, ‘VALUE’, ‘INSTANCE_ID’].
- Return type:
pandas.DataFrame
Examples
>>> filtered = filter_by_triplet(data, filter_triplet)
- triplets.rdf_parser.filter_by_type(data, type_name, type_key='Type')[source]
Filter triplet dataset by objects of a specific type.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
type_name (str) – Object type to filter by (e.g., ‘ACLineSegment’).
type_key (str) – Key used in triplet to indicate type, by default “Type”
- Returns:
Filtered triplet dataset containing only objects of the specified type.
- Return type:
pandas.DataFrame
Examples
>>> filtered = filter_by_type(data, "ACLineSegment")
- triplets.rdf_parser.find_all_xml(list_of_paths_to_zip_globalzip_xml, debug=False)[source]
Extract XML files from a list of paths or ZIP archives.
- Parameters:
list_of_paths_to_zip_globalzip_xml (list) – List of paths to XML files, ZIP archives, or file-like objects.
debug (bool, optional) – If True, log file processing details for debugging (default is False).
- Returns:
List of file-like objects for XML files found in the input paths or ZIPs.
- Return type:
list
Notes
Supports XML, RDF, and ZIP files; other file types are logged as unsupported.
TODO: Add support for random folders.
Examples
>>> xml_files = find_all_xml(["data.zip", "file.xml"])
- triplets.rdf_parser.generate_xml(instance_data, rdf_map=None, namespace_map=None, class_KEY='Type', export_undefined=True, comment=None, debug=False)[source]
Generate an RDF XML file from a triplet dataset instance.
This function processes a single instance (grouped by INSTANCE_ID) from a triplet dataset and exports it as an RDF/XML document using provided or inferred mapping rules.
- Parameters:
instance_data (pandas.DataFrame) – Triplet dataset for a single instance, with columns ['ID', 'KEY', 'VALUE', 'INSTANCE_ID']. Must contain at least one row with KEY == class_KEY to define object types.
rdf_map (dict or str, optional) – Dictionary mapping CIM classes and attributes to RDF namespaces and export rules. If a string is provided, it is treated as a file path to a JSON configuration. If None, attempts to infer from instance data (e.g., profile-based mapping).
namespace_map (dict, optional) – Mapping of namespace prefixes to URIs (e.g., {"cim": "http://iec.ch/TC57/2013/CIM-schema-cim16#"}). Must include the "rdf" namespace. If None, inferred from rdf_map or the instance.
class_KEY (str, default "Type") – Column key used to identify object class/type in the triplet data.
export_undefined (bool, default True) – If True, export classes and attributes without explicit mapping using default RDF settings. If False, skip unmapped elements with a warning.
comment (str, optional) – Optional comment to insert at the top of the XML output (as XML comment).
debug (bool, default False) – If True, log detailed timing and debug information during processing.
- Returns:
Dictionary containing:
- 'filename' (str): Generated filename (from label or UUID).
- 'file' (bytes): UTF-8 encoded XML content.
- Return type:
dict
- Raises:
KeyError – If required columns are missing in instance_data.
ValueError – If an invalid export configuration or mapping is detected.
Examples
>>> instance = data[data["INSTANCE_ID"] == 1]
>>> result = generate_xml(
...     instance,
...     rdf_map="config/eq_profile.json",
...     comment="Exported on 2025-11-11",
...     debug=True
... )
>>> with open(result["filename"], "wb") as f:
...     f.write(result["file"])
Notes
Supports profile-based mapping (e.g., "EQ", "SSH") via Model.profile or Model.messageType.
Uses lxml.etree with ElementMaker for XML construction.
Undefined classes are exported with rdf:about="urn:uuid:<ID>" when export_undefined=True.
- triplets.rdf_parser.get_namespace_map(data: DataFrame)[source]
Extract namespace prefix-to-URI mapping and optional xml:base from a triplet dataset.
This function searches for a NamespaceMap object (identified by KEY='Type' and VALUE='NamespaceMap') within the dataset. It then collects all key-value pairs under that instance where:
- KEY is the namespace prefix (e.g., "cim", "rdf")
- VALUE is the full URI (e.g., "http://iec.ch/TC57/2013/CIM-schema-cim16#")
Special keys:
- xml_base: Extracted separately if present (used as base URI in RDF).
- Type: Automatically excluded.
- Parameters:
data (pandas.DataFrame) – Triplet dataset with columns [‘INSTANCE_ID’, ‘ID’, ‘KEY’, ‘VALUE’]. Must contain a NamespaceMap instance for successful extraction.
- Returns:
namespace_map (dict) – Mapping of namespace prefixes to URIs (e.g., {"cim": "...", "rdf": "..."}). Empty dict if no NamespaceMap is found.
xml_base (str) – Value of xml_base if defined within the NamespaceMap; otherwise an empty string.
Examples
>>> ns_map, base = get_namespace_map(triplet_data)
>>> print(ns_map)
{'cim': 'http://iec.ch/TC57/2013/CIM-schema-cim16#', 'rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'}
>>> print(base)
'http://example.com/base/'
>>> ns_map, base = get_namespace_map(empty_data)
>>> print(ns_map, base)
{} ""
Notes
The function is idempotent and safe to call on any dataset.
Uses an inner merge on ID to scope entries to the correct NamespaceMap instance.
Always returns a tuple of length 2: (dict, str).
- triplets.rdf_parser.get_object_data(data, object_UUID)[source]
Retrieve data for a specific object by its UUID.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
object_UUID (str) – UUID of the object to retrieve.
- Returns:
Series with keys as index and values for the specified object.
- Return type:
pandas.Series
Examples
>>> obj_data = data.get_object_data("uuid1")
- triplets.rdf_parser.id_tableview(data, id, string_to_number=True)[source]
Create a tabular view of a CGMES triplet dataset filtered by one or more IDs.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing CGMES data.
id (str or list or pandas.DataFrame) – A single ID, a list of IDs, or a DataFrame with an ‘ID’ column to filter by.
string_to_number (bool, optional) – If True, convert columns containing numbers to numeric types (default is True).
- Returns:
Pivoted DataFrame with IDs as index and KEYs as columns.
- Return type:
pandas.DataFrame
Examples
>>> table = id_tableview(data, 'UUID')
>>> table = id_tableview(data, ['UUID_1', 'UUID_2'])
>>> table = id_tableview(data, pandas.DataFrame({"ID": ['UUID_1', 'UUID_2']}))
- triplets.rdf_parser.key_tableview(data, key, string_to_number=True)[source]
Create a table view of all objects with a specified key.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
key (str) – The key to filter objects by (e.g., ‘GeneratingUnit.maxOperatingP’).
string_to_number (bool, optional) – If True, convert columns containing numbers to numeric types (default is True).
- Returns:
Pivoted DataFrame with IDs as index and keys as columns, or None if no data is found.
- Return type:
pandas.DataFrame or None
Examples
>>> table = data.key_tableview("GeneratingUnit.maxOperatingP")
- triplets.rdf_parser.load_RDF_objects_from_XML(path_or_fileobject, debug=False)[source]
Parse an XML file and return an iterator of RDF objects with instance ID and namespace map.
- Parameters:
path_or_fileobject (str or file-like object) – Path to the XML file or a file-like object containing RDF XML data.
debug (bool, optional) – If True, log timing information for debugging (default is False).
- Returns:
A tuple containing:
- RDF_objects (iterator): Iterator over RDF objects in the XML.
- instance_id (str): Unique UUID for the loaded instance.
- namespace_map (dict): Dictionary of namespace prefixes and URIs.
- Return type:
tuple
Examples
>>> rdf_objects, instance_id, ns_map = load_RDF_objects_from_XML("file.xml")
- triplets.rdf_parser.load_RDF_to_dataframe(path_or_fileobject, debug=False, data_type='string')[source]
Parse a single RDF XML file into a Pandas DataFrame.
- Parameters:
path_or_fileobject (str or file-like object) – Path to the XML file or a file-like object containing RDF XML data.
debug (bool, optional) – If True, log timing information for debugging (default is False).
data_type (str, optional) – Data type for DataFrame columns (default is ‘string’).
- Returns:
DataFrame with columns [‘ID’, ‘KEY’, ‘VALUE’, ‘INSTANCE_ID’] representing the triplestore.
- Return type:
pandas.DataFrame
Examples
>>> df = load_RDF_to_dataframe("file.xml")
- triplets.rdf_parser.load_RDF_to_list(path_or_fileobject, debug=False, keep_ns=False)[source]
Parse a single RDF XML file into a triplestore list.
- Parameters:
path_or_fileobject (str or file-like object) – Path to the XML file or a file-like object containing RDF XML data.
debug (bool, optional) – If True, log timing information for debugging (default is False).
keep_ns (bool, optional) – If True, retain namespace information in the output (default is False; currently unused).
- Returns:
List of tuples in the format (ID, KEY, VALUE, INSTANCE_ID) representing the triplestore.
- Return type:
list
Examples
>>> triples = load_RDF_to_list("file.xml")
- triplets.rdf_parser.load_all_to_dataframe(list_of_paths_to_zip_globalzip_xml, debug=False, data_type='string', max_workers=None)[source]
Parse multiple RDF XML files or ZIP archives into a single Pandas DataFrame.
- Parameters:
list_of_paths_to_zip_globalzip_xml (list or str) – List of paths to XML files, ZIP archives, or a single path.
debug (bool, optional) – If True, log timing information for debugging (default is False).
data_type (str, optional) – Data type for DataFrame columns (default is ‘string’).
max_workers (int, optional) – Number of worker threads for parallel processing (default is None).
- Returns:
DataFrame with columns [‘ID’, ‘KEY’, ‘VALUE’, ‘INSTANCE_ID’] containing all parsed data.
- Return type:
pandas.DataFrame
Examples
>>> df = load_all_to_dataframe(["data.zip", "file.xml"], max_workers=4)
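A sketch of a typical loading session (the file name is a placeholder), combining this function with helpers documented elsewhere in this package:
>>> from triplets import rdf_parser, cgmes_tools
>>> data = rdf_parser.load_all_to_dataframe(['grid_model.zip'])
>>> data.types_dict()                              # object counts per Type
{'ACLineSegment': 10, ...}
>>> cgmes_tools.get_loaded_model_parts(data)       # FullModel metadata per instance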
- triplets.rdf_parser.print_triplet_diff(old_data, new_data, file_id_object='Distribution', file_id_key='label', exclude_objects=None)[source]
Print a human-readable diff of two triplet datasets.
- Parameters:
old_data (pandas.DataFrame) – Original triplet dataset.
new_data (pandas.DataFrame) – New triplet dataset to compare against.
file_id_object (str, optional) – Object type containing file identifiers (default is ‘Distribution’).
file_id_key (str, optional) – Key containing file identifiers (default is ‘label’).
exclude_objects (list, optional) – List of object types to exclude from the diff (default is None).
Notes
Outputs a diff format showing removed, added, and changed objects.
A convenient external diff viewer: https://diffy.org/
TODO: Add name field for better reporting with Type.
Examples
>>> print_triplet_diff(old_data, new_data, exclude_objects=["NamespaceMap"])
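For example, two versions of the same model could be compared like this (file names are placeholders; excluding FullModel in addition to NamespaceMap is optional):
>>> old_data = load_all_to_dataframe(['model_v1.zip'])
>>> new_data = load_all_to_dataframe(['model_v2.zip'])
>>> print_triplet_diff(old_data, new_data, exclude_objects=['NamespaceMap', 'FullModel'])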
- triplets.rdf_parser.references(data, ID, levels=1)[source]
Retrieve all references (to and from) a specified object.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
ID (str) – ID of the object to find references for.
levels (int, optional) – Number of reference levels to traverse (default is 1).
- Returns:
DataFrame containing triplets of all references to and from the object.
- Return type:
pandas.DataFrame
Examples
>>> refs = data.references("99722373_VL_TN1", levels=2)
- triplets.rdf_parser.references_all(data)[source]
Find all unique references (links) in the dataset.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
- Returns:
DataFrame with columns [‘ID_FROM’, ‘KEY’, ‘ID_TO’] representing all references.
- Return type:
pandas.DataFrame
Notes
Does not consider INSTANCE_ID in reference matching.
Examples
>>> refs = data.references_all()
- triplets.rdf_parser.references_from(data, reference, levels=1)[source]
Retrieve all objects a specified object points to.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
reference (str) – ID of the reference object.
levels (int, optional) – Number of reference levels to traverse (default is 1).
- Returns:
DataFrame containing triplets of objects referenced by the input, with a ‘level’ column.
- Return type:
pandas.DataFrame
Notes
TODO: Add the key on which the connection was made.
Examples
>>> refs = data.references_from("99722373_VL_TN1", levels=2)
- triplets.rdf_parser.references_from_simple(data, reference, columns=['Type'])[source]
Create a simplified table view of objects a specified object refers to.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
reference (str) – ID of the object to find references from.
columns (list, optional) – Columns to include in the output table (default is [‘Type’]).
- Returns:
Pivoted DataFrame with IDs of referenced objects and specified columns.
- Return type:
pandas.DataFrame
Examples
>>> table = data.references_from_simple("99722373_VL_TN1")
- triplets.rdf_parser.references_simple(data, reference, columns=None, levels=1)[source]
Create a simplified table view of all references to and from a specified object.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
reference (str) – ID of the object to find references for.
columns (list, optional) – Columns to include in the output table (default is [‘Type’, ‘IdentifiedObject.name’] if available).
levels (int, optional) – Number of reference levels to traverse (default is 1).
- Returns:
Pivoted DataFrame with IDs, specified columns, and reference levels.
- Return type:
pandas.DataFrame
Examples
>>> table = data.references_simple("99722373_VL_TN1", columns=["Type"])
- triplets.rdf_parser.references_to(data, reference, levels=1)[source]
Retrieve all objects pointing to a specified reference object.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
reference (str) – ID of the reference object.
levels (int, optional) – Number of reference levels to traverse (default is 1).
- Returns:
DataFrame containing triplets of objects pointing to the reference, with a ‘level’ column.
- Return type:
pandas.DataFrame
Notes
TODO: Add the key on which the connection was made.
Examples
>>> refs = data.references_to("99722373_VL_TN1", levels=2)
- triplets.rdf_parser.references_to_simple(data, reference, columns=['Type'])[source]
Create a simplified table view of objects referencing a specified object.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
reference (str) – ID of the object to find references to.
columns (list, optional) – Columns to include in the output table (default is [‘Type’]).
- Returns:
Pivoted DataFrame with IDs of referencing objects and specified columns.
- Return type:
pandas.DataFrame
Examples
>>> table = data.references_to_simple("99722373_VL_TN1")
- triplets.rdf_parser.remove_triplet_from_triplet(from_triplet, what_triplet, columns=['ID', 'KEY', 'VALUE'])[source]
Remove triplets from one dataset that match another.
- Parameters:
from_triplet (pandas.DataFrame) – Original triplet dataset.
what_triplet (pandas.DataFrame) – Triplet dataset to remove from the original.
columns (list, optional) – Columns to match for removal (default is [‘ID’, ‘KEY’, ‘VALUE’]).
- Returns:
Dataset with matching triplets removed.
- Return type:
pandas.DataFrame
Examples
>>> result = remove_triplet_from_triplet(data, to_remove)
- triplets.rdf_parser.set_VALUE_at_KEY(data, key, value)[source]
Set the value for all instances of a specified key.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
key (str) – The key to update.
value (str) – The new value to set for the specified key.
Notes
TODO: Add debug logging for key, initial value, and new value.
TODO: Store changes in a changes DataFrame.
Examples
>>> data.set_VALUE_at_KEY("label", "new_label")
- triplets.rdf_parser.set_VALUE_at_KEY_and_ID(data, key, value, id)[source]
Set the value for a specific key and ID.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
key (str) – The key to update.
value (str) – The new value to set.
id (str) – The ID of the object to update.
Examples
>>> data.set_VALUE_at_KEY_and_ID("label", "new_label", "uuid1")
- triplets.rdf_parser.tableview_to_triplet(data)[source]
Convert a table view back to a triplet format.
- Parameters:
data (pandas.DataFrame) – Pivoted DataFrame (table view) to convert.
- Returns:
Triplet DataFrame with columns [‘ID’, ‘KEY’, ‘VALUE’].
- Return type:
pandas.DataFrame
Notes
TODO: Ensure this is only used on valid table views.
Examples
>>> triplet = tableview_to_triplet(table_view)
- triplets.rdf_parser.type_tableview(data, type_name, string_to_number=True, type_key='Type')[source]
Create a table view of all objects of a specified type.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
type_name (str) – The type of objects to filter (e.g., ‘ACLineSegment’).
string_to_number (bool, optional) – If True, convert columns containing numbers to numeric types (default is True).
type_key (str, optional) – Key used to identify object types in the dataset (default is ‘Type’).
- Returns:
Pivoted DataFrame with IDs as index and keys as columns, or None if no data is found.
- Return type:
pandas.DataFrame or None
Examples
>>> table = data.type_tableview("ACLineSegment")
- triplets.rdf_parser.types_dict(data)[source]
Return a dictionary of object types and their occurrence counts.
- Parameters:
data (pandas.DataFrame) – Triplet dataset containing RDF data.
- Returns:
Dictionary with object types as keys and their counts as values.
- Return type:
dict
Examples
>>> types = data.types_dict()
>>> print(types)
{'ACLineSegment': 10, 'PowerTransformer': 5, ...}
- triplets.rdf_parser.update_triplet_from_tableview(data, tableview, update=True, add=True, instance_id=None)[source]
Update or add triplets from a table view.
- Parameters:
data (pandas.DataFrame) – Original triplet dataset to update.
tableview (pandas.DataFrame) – Table view containing updates or new data.
update (bool, optional) – If True, update existing ID-KEY pairs (default is True).
add (bool, optional) – If True, add new ID-KEY pairs (default is True).
instance_id (str, optional) – Instance ID to assign to new triplets (default is None).
- Returns:
Updated triplet dataset.
- Return type:
pandas.DataFrame
Examples
>>> updated_data = data.update_triplet_from_tableview(table_view, instance_id="uuid1")
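A sketch of the usual round trip: pivot with type_tableview(), edit values with ordinary pandas operations, and merge the changes back (the type and column names assume an EQ-style dataset; values are kept as strings in triplet form):
>>> lines = data.type_tableview('ACLineSegment')
>>> lines['ACLineSegment.r'] = '0.05'              # illustrative in-place edit
>>> updated = data.update_triplet_from_tableview(lines, update=True, add=False)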
- triplets.rdf_parser.update_triplet_from_triplet(data, update_data, update=True, add=True)[source]
Update or add triplets from another triplet dataset.
- Parameters:
data (pandas.DataFrame) – Original triplet dataset to update.
update_data (pandas.DataFrame) – Triplet dataset containing updates or new data.
update (bool, optional) – If True, update existing ID-KEY pairs (default is True).
add (bool, optional) – If True, add new ID-KEY pairs (default is True).
- Returns:
Updated triplet dataset.
- Return type:
pandas.DataFrame
Notes
TODO: Add a changes DataFrame to track modifications.
TODO: Support updating ID and KEY fields.
Examples
>>> updated_data = data.update_triplet_from_triplet(update_data)