Sidra Data & AI Platform Glossary

This glossary provides definitions and links to key concepts, terms, and services used across the Sidra Data & AI Platform documentation, user interface, and technical training sessions.


Data Ingestion Concepts

Data Intake Process (DIP)

A Data Intake Process defines the configurations and infrastructure Sidra uses to ingest data from a specific source (e.g., SQL Server, SharePoint). It includes metadata, triggers, and pipelines for extraction.

Connector

A Connector in Sidra is a component that facilitates integration with external data sources by executing custom code to extract and standardize data.


Metadata and Structure

Provider

A logical grouping of Entities, similar to a schema or database in traditional systems.

Entity

An Entity is a structured or semi-structured data concept, similar to a table or document collection. Sidra supports multi-modal data types within Entities.

Asset

An Asset represents a single data item ingested into the platform, such as a CSV file, PDF document, or database extract.
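The relationship between these three terms can be sketched as a small data model. This is an illustrative sketch only: the class and field names are hypothetical and are not Sidra's actual metadata schema, and attaching Assets directly to an Entity is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    """A single ingested data item, e.g. a CSV file or a PDF document."""
    name: str
    media_type: str  # hypothetical field, used only for illustration


@dataclass
class Entity:
    """A table-like or collection-like data concept."""
    name: str
    # Assumption: each Asset is attached to exactly one Entity.
    assets: List[Asset] = field(default_factory=list)


@dataclass
class Provider:
    """A logical grouping of Entities, similar to a schema or database."""
    name: str
    entities: List[Entity] = field(default_factory=list)

    def asset_count(self) -> int:
        # Total number of Assets across all Entities in this Provider.
        return sum(len(entity.assets) for entity in self.entities)


sales = Provider(
    name="sales",
    entities=[
        Entity("orders", [
            Asset("orders_2024_01.csv", "text/csv"),
            Asset("orders_2024_02.csv", "text/csv"),
        ]),
        Entity("contracts", [Asset("contract_001.pdf", "application/pdf")]),
    ],
)
```

Here `sales.asset_count()` walks every Entity in the Provider and counts the Assets attached to it.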


Data Storage Unit (DSU)

Data Storage Units (DSUs) are isolated storage and compute environments in Sidra. They enforce data residency, enable region-based deployments, and serve as the raw and optimized data lake layers.

Note: Data in a DSU is transformed technically (standardized naming, format, and schema), but no business logic is applied to it. This raw-but-usable layer lets downstream domains apply their own business transformations independently.
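As a rough illustration of what a technical (as opposed to business) transformation might look like, the sketch below standardizes column names and casts values to a declared type. The function names, the sample record, and the schema are all hypothetical; this is not how Sidra implements the step.

```python
import re
from typing import Any, Callable, Dict


def to_snake_case(name: str) -> str:
    """Standardize a column name: 'Order ID' or 'unitPrice' -> snake_case."""
    name = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return name.strip("_").lower()


def technical_transform(
    record: Dict[str, str], schema: Dict[str, Callable[[str], Any]]
) -> Dict[str, Any]:
    """Rename columns and cast values; columns not in the schema stay strings.

    No business rules (filters, aggregations, derived metrics) are applied.
    """
    out: Dict[str, Any] = {}
    for key, value in record.items():
        column = to_snake_case(key)
        caster = schema.get(column, str)
        out[column] = caster(value)
    return out


raw = {"Order ID": "1001", "unitPrice": "19.90", "Customer Name": "Ada"}
schema = {"order_id": int, "unit_price": float}
clean = technical_transform(raw, schema)
```

The values themselves are unchanged in meaning; only their names and types are normalized, which is what keeps the layer usable without committing to any one domain's business rules.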


Sidra Services

Supervisor

Supervisor is Sidra's deployment and lifecycle management tool. It provides a user interface and an API for installing, updating, and scaling Sidra components from a single, central point.

Authentication Service

The Authentication Service manages user identity and access within Sidra. Built on Keycloak, it supports industry-standard protocols such as OpenID Connect and OAuth 2.0, enabling secure single sign-on (SSO) and integration with enterprise identity providers.

Authorization Service

Sidra's Authorization Service governs access control across the platform. Built on the Balea framework, it allows for granular role and permission management, ensuring that users and services have appropriate access to data and functionalities.

Data Quality Service

The Data Quality Service ensures the integrity of ingested data by applying validation rules and detecting anomalies. Leveraging the Great Expectations framework, it automates data quality checks and generates comprehensive validation reports.
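The idea of applying validation rules and producing a report can be sketched in plain Python. This deliberately does not use the Great Expectations API; the rule names, sample rows, and report shape are hypothetical and only illustrate the concept.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Rule = Callable[[Row], bool]

# Hypothetical validation rules: each returns True when the row passes.
RULES: Dict[str, Rule] = {
    "order_id_not_null": lambda r: r.get("order_id") is not None,
    "quantity_positive": lambda r: isinstance(r.get("quantity"), int)
    and r["quantity"] > 0,
}


def validate(rows: List[Row]) -> Dict[str, List[int]]:
    """Return, for each rule, the indexes of the rows that fail it."""
    report: Dict[str, List[int]] = {name: [] for name in RULES}
    for index, row in enumerate(rows):
        for name, rule in RULES.items():
            if not rule(row):
                report[name].append(index)
    return report


rows = [
    {"order_id": 1, "quantity": 2},
    {"order_id": None, "quantity": 5},
    {"order_id": 3, "quantity": 0},
]
report = validate(rows)
```

An empty list for a rule means every row passed it; a non-empty list points at the rows that need attention.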

API Builder

API Builder automates the creation of REST and GraphQL APIs for Sidra's Data Products. It analyzes metadata to generate endpoints and deploys the infrastructure needed to serve them.

Data Catalog Service

Sidra's Data Catalog Service provides an AI-driven cataloging system that automatically generates metadata descriptions for datasets. This enhances data discoverability and supports governance by maintaining up-to-date information about data assets.

Data Products

A Data Product is a unit of data and logic built to serve a specific business need. Each product is deployed in its own Resource Group and consumes data from one or more DSUs via APIs. Data Products can include:

  • BI dashboards
  • Machine learning pipelines
  • Operational APIs
  • Web or mobile applications

Sidra provides starter templates for common scenarios, and customers can extend the framework to build custom solutions.


Integration & Extensibility

Unlimited Connector Toolkit

A framework to build custom connectors for ingesting data from niche or legacy systems.

API Connector Toolkit

Helps build integrations with third-party APIs by abstracting ingestion and transformation steps.


Features

PII Detection

Identifies personally identifiable information during ingestion and helps enforce governance rules.
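A very simplified view of PII detection is pattern matching over incoming values. The sketch below uses two basic regular expressions; the patterns, function name, and sample record are hypothetical, and real detection (in Sidra or elsewhere) is likely to be considerably more sophisticated.

```python
import re
from typing import Dict, List

# Simplified, illustrative patterns only; they will miss many real-world cases.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def detect_pii(record: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each column name to the kinds of PII found in its value."""
    findings: Dict[str, List[str]] = {}
    for column, value in record.items():
        kinds = [kind for kind, pattern in PATTERNS.items() if pattern.search(value)]
        if kinds:
            findings[column] = kinds
    return findings


record = {
    "contact": "ada@example.com",
    "mobile": "+34 600 123 456",
    "note": "deliver before noon",
}
findings = detect_pii(record)
```

Columns flagged this way could then be tagged in metadata so that governance rules such as masking or restricted access can be applied to them.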

Integration Hub

Based on Azure Service Bus, this messaging layer enables asynchronous communication between Sidra Services and Data Products.
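The asynchronous pattern described here can be illustrated with an in-process queue. This sketch uses Python's `asyncio.Queue` purely as a stand-in for a message broker; it does not call Azure Service Bus, and the event names are hypothetical.

```python
import asyncio
from typing import List, Optional


async def publisher(queue: "asyncio.Queue[Optional[str]]", events: List[str]) -> None:
    """Publish events without waiting for the subscriber to process them."""
    for event in events:
        await queue.put(event)
    await queue.put(None)  # sentinel: no more events


async def subscriber(queue: "asyncio.Queue[Optional[str]]") -> List[str]:
    """Consume events at its own pace until the sentinel arrives."""
    received: List[str] = []
    while True:
        event = await queue.get()
        if event is None:
            break
        received.append(event)
    return received


async def main() -> List[str]:
    queue: "asyncio.Queue[Optional[str]]" = asyncio.Queue()
    _, received = await asyncio.gather(
        publisher(queue, ["asset.ingested", "asset.validated"]),
        subscriber(queue),
    )
    return received


received = asyncio.run(main())
```

The publisher and subscriber never call each other directly; the queue decouples them, which is the property a messaging layer provides between services and Data Products.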


Infrastructure Concepts

Region

An Azure location where DSUs and Data Products can be deployed to meet data residency and latency requirements.

Resource Group

An Azure container holding all resources related to a specific Data Product or DSU.

Azure Data Lake

Sidra uses Azure Data Lake Storage Gen2 (ADLS Gen2) and the Delta Lake format for scalable and efficient storage of ingested data across DSUs.


APIs

Sidra provides APIs for interacting with platform services: