Sidra Data & AI Platform Glossary¶
This glossary provides definitions and links to key concepts, terms, and services used across the Sidra Data & AI Platform documentation, user interface, and technical training sessions.
Data Ingestion Concepts¶
Data Intake Process (DIP)¶
A Data Intake Process defines the configuration and infrastructure Sidra uses to ingest data from a specific source (e.g., SQL Server, SharePoint). It includes the metadata, triggers, and pipelines used for extraction.
Connector¶
A Connector in Sidra is a component that facilitates integration with external data sources by executing custom code to extract and standardize data.
Metadata and Structure¶
Provider¶
A logical grouping of Entities, similar to a schema or database in traditional systems.
Entity¶
An Entity is a structured or semi-structured data concept, similar to a table or document collection. Sidra supports multi-modal data types within Entities.
Asset¶
An Asset represents a single data item ingested into the platform, such as a CSV file, PDF document, or database extract.
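The relationship between Provider, Entity, and Asset can be sketched as a small object model. This is an illustrative sketch only: the class and field names are hypothetical, not Sidra's actual metadata schema, and it assumes each Asset belongs to exactly one Entity, which the definitions above imply but do not state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Asset:
    """One ingested data item, e.g. a single CSV file."""
    name: str


@dataclass
class Entity:
    """Table-like concept that collects the Assets ingested for it."""
    name: str
    assets: List[Asset] = field(default_factory=list)


@dataclass
class Provider:
    """Logical grouping of Entities, comparable to a schema."""
    name: str
    entities: List[Entity] = field(default_factory=list)


# Hypothetical example data, not taken from a real Sidra deployment.
orders = Entity("Orders", [Asset("orders_2024-01-01.csv"), Asset("orders_2024-01-02.csv")])
sales = Provider("Sales", [orders])

# Count every Asset reachable from the Provider.
asset_count = sum(len(entity.assets) for entity in sales.entities)
```

In this model, each daily extract becomes a new Asset under the same Entity, while the Provider stays fixed.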
Data Storage Unit (DSU)¶
Data Storage Units (DSUs) are isolated storage and compute environments in Sidra. They enforce data residency, enable region-based deployments, and serve as the raw and optimized data lake layers.
Note: Data in a DSU is technically transformed (naming, format, schema) but not business-transformed. This raw-but-usable layer enables domain autonomy downstream.
Sidra Services¶
Supervisor¶
Supervisor is Sidra's deployment and lifecycle management tool. It provides a user interface and an API for installing, updating, and scaling Sidra components from one central place.
Authentication Service¶
The Authentication Service manages user identity and access within Sidra. Built on Keycloak, it supports industry-standard protocols such as OpenID Connect and OAuth 2.0, enabling single sign-on (SSO) and integration with enterprise identity providers.
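As an illustration of the OAuth 2.0 side, the sketch below builds (but does not send) a client-credentials token request in the shape Keycloak expects. The host, realm name, and client credentials are hypothetical placeholders. The path follows Keycloak's standard OpenID Connect token endpoint; older Keycloak distributions prefix it with `/auth`. How Sidra exposes its Keycloak instance is an assumption here.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(base_url: str, realm: str, client_id: str, client_secret: str) -> Request:
    """Build an OAuth 2.0 client-credentials request for a Keycloak realm."""
    token_url = f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    # The token endpoint expects a form-encoded POST body.
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return Request(
        token_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical values; replace with those of an actual deployment.
req = build_token_request("https://auth.example.com", "sidra", "my-client", "my-secret")
```

Passing `req` to `urllib.request.urlopen` would send it; a successful response is a JSON document containing an `access_token` field.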
Authorization Service¶
Sidra's Authorization Service governs access control across the platform. Built on the Balea framework, it allows for granular role and permission management, ensuring that users and services have appropriate access to data and functionalities.
Data Quality Service¶
The Data Quality Service checks the integrity of ingested data by applying validation rules and detecting anomalies. Using the Great Expectations framework, it automates data quality checks and generates validation reports.
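The idea of rule-based validation can be sketched in plain Python. This deliberately does not use the Great Expectations API. The rule names, column names, and report shape are hypothetical and only illustrate the concept of applying named rules to rows and summarizing failures.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Rule = Callable[[Row], bool]

# Each rule returns True when the row passes. Names and columns are hypothetical.
rules: Dict[str, Rule] = {
    "order_id is not null": lambda row: row.get("order_id") is not None,
    "amount is non-negative": lambda row: (
        isinstance(row.get("amount"), (int, float)) and row["amount"] >= 0
    ),
}


def validate(rows: List[Row]) -> Dict[str, int]:
    """Return the number of failing rows for each rule."""
    return {
        name: sum(1 for row in rows if not rule(row))
        for name, rule in rules.items()
    }


rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": None, "amount": 5.0},   # fails the not-null rule
    {"order_id": 3, "amount": -2.0},     # fails the non-negative rule
]
report = validate(rows)
```

Here `report` maps each rule name to its failure count, which is the kind of summary a validation report would be built from.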
API Builder¶
API Builder automates the creation of REST and GraphQL APIs for Sidra's Data Products. By analyzing metadata, it generates the endpoints and deploys the infrastructure needed to serve them.
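To make "generating endpoints from metadata" concrete, here is a minimal sketch that derives REST routes from a mapping of entity names to key fields. The route naming scheme and the metadata shape are assumptions made for illustration; they are not API Builder's actual output.

```python
from typing import Dict, List


def rest_routes(entity_keys: Dict[str, str]) -> List[str]:
    """Derive a collection route and an item route for each entity."""
    routes = []
    for entity in sorted(entity_keys):
        collection = entity.lower()
        key = entity_keys[entity]
        routes.append(f"GET /{collection}")
        routes.append(f"GET /{collection}/{{{key}}}")  # path parameter for the key
    return routes


# Hypothetical metadata: entity name -> key field.
metadata = {"Orders": "order_id", "Customers": "customer_id"}
routes = rest_routes(metadata)
```

Adding an entity to `metadata` adds its routes with no further code, which is the benefit of deriving endpoints from metadata instead of writing them by hand.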
Data Catalog Service¶
Sidra's Data Catalog Service provides an AI-driven cataloging system that automatically generates metadata descriptions for datasets. This enhances data discoverability and supports governance by maintaining up-to-date information about data assets.
Data Products¶
A Data Product is a unit of data and logic, built to serve specific business needs. Each product is deployed as a Resource Group and consumes data from one or more DSUs via APIs. Data Products can include:
- BI dashboards
- Machine learning pipelines
- Operational APIs
- Web or mobile applications
Sidra provides starter templates for common scenarios, and customers can extend the framework to build custom solutions.
Integration & Extensibility¶
Unlimited Connector Toolkit¶
A framework to build custom connectors for ingesting data from niche or legacy systems.
API Connector Toolkit¶
Helps build integrations with third-party APIs by abstracting ingestion and transformation steps.
Features¶
PII Detection¶
Identifies personally identifiable information during ingestion and helps enforce governance rules.
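A pattern-based detector is one simple way to flag PII in incoming records. The sketch below is a simplified illustration, not Sidra's detection method: the two regular expressions (email addresses and US-style social security numbers) and the record fields are assumptions, and real detectors typically cover many more categories.

```python
import re
from typing import Dict, List

# Simplified, hypothetical patterns; real-world PII detection needs broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def detect_pii(record: Dict[str, object]) -> Dict[str, List[str]]:
    """Map each field that contains PII to the kinds of PII found in it."""
    findings = {}
    for field_name, value in record.items():
        kinds = sorted(
            kind for kind, pattern in PII_PATTERNS.items()
            if pattern.search(str(value))
        )
        if kinds:
            findings[field_name] = kinds
    return findings


record = {"name": "Ana", "contact": "ana@example.com", "note": "SSN 123-45-6789"}
findings = detect_pii(record)
```

The resulting field-to-kind mapping could then drive governance actions such as masking or restricting the flagged fields.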
Integration Hub¶
Based on Azure Service Bus, this messaging layer enables asynchronous communication between Sidra Services and Data Products.
Infrastructure Concepts¶
Region¶
An Azure location where DSUs and Data Products can be deployed to meet data residency and latency requirements.
Resource Group¶
An Azure container holding all resources related to a specific Data Product or DSU.
Azure Data Lake¶
Sidra uses Azure Data Lake Storage Gen2 (ADLS Gen2) and the Delta Lake format for scalable and efficient storage of ingested data across DSUs.
APIs¶
Sidra provides APIs for interacting with platform services: