What is Sidra Data Platform

Built on Azure PaaS

Sidra Data Platform is an enterprise data lake solution built on the Azure ecosystem, focused on efficient deployment, scalability and straightforward maintenance.

Customizable

Sidra is an adaptable and flexible platform able to process large amounts of data from different sources. It offers multi-region data storage, lineage control, auditing capabilities and extensible APIs, among other features.

Data Governance

Sidra provides a common foundation, shared services and data governance on which organizations build their use cases, from analytical applications based on SQL Server to exploratory analysis scenarios.

Competitive advantages

Complete deployment in a matter of days
Automated data source configuration, going from zero to data lake in hours
Modular and adaptable to each scenario with Sidra Data Products
Data Catalog and governance capabilities to address data protection challenges
A tool that continuously evolves with best practices, allowing you to focus on innovation and on building value from data

Key Features

Knowledge Store

Multimodal storage supporting all types of data sources: from databases and APIs to documents and media files.

ML Model Serving Platform

Enable your Data Science teams to build, test and deploy secure models, while keeping track of both code and training data for audit and understanding purposes.

Security and Identity

Identity management via Identity Server, giving users secure access to the platform through different authentication providers (Microsoft Entra ID, Google Accounts…).

Data Intake ML Models

Pre-packaged models that tackle the most common challenges during the data load process, such as corruption or anomalies in the data set, and that automatically detect sensitive data such as PII.

Integration and Extensibility

APIs for the integration of third-party tools in areas such as Data Catalog or Data Retrieval, as well as a Python SDK for data scientists.

Data Load Automation

Automation of ETL/ELT processes through the automatic generation of pipelines for data extraction, movement and processing.

Data Governance

Complete Data Catalog with web UI and API access, as well as data lineage audit and traceability.

Batch and Real-time

Support for both batch and real-time data loads, enabling operational data lake scenarios.

Sidra shared services

Sidra shared services offer the following properties:

Automated generation of data pipelines for fast and scalable data ingestion
The number of data pipelines grows with platform needs, and the number of ingested data sources makes no difference to time to market. For instance, consuming 10 or 1000 SQL Server tables requires the same deployment effort thanks to Sidra's template-based generation system.

Comprehensive audit of all the system operations
Advanced lineage tracking of data and transformations, enabling independent examination of products and processes.

Low latency
Sidra Data Platform offers low latency for exploratory purposes and when building data products that need to update close to real-time.

Web-based management UI
Sidra Web provides a visual widget for ingesting new data, an operational tracking dashboard and centralized log management.

Data Catalog
Provides a high-level view of loaded entities across regions, using services focused on data management and global search.

Monitoring
Data intake processes are monitored with Power BI dashboards, which helps reduce network and infrastructure costs.

Anomaly detection models for data movement activities
These models help identify unusual events that can impact processes. Outlier detection happens before data is ingested, so integrity is preserved.
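
To illustrate why template-based generation keeps the deployment effort flat as the number of tables grows, here is a minimal Python sketch. It is not Sidra's actual implementation or API: the template fields, the `generate_pipelines` function and the `datalake/...` sink path are all hypothetical names chosen for the example.

```python
from string import Template

# Hypothetical pipeline template. The field names are illustrative only and
# do not reflect Sidra's real template schema.
PIPELINE_TEMPLATE = Template(
    "pipeline: extract_${schema}_${table}\n"
    "source: ${schema}.${table}\n"
    "sink: datalake/${schema}/${table}"
)


def generate_pipelines(tables):
    """Render one pipeline definition per (schema, table) pair."""
    return [
        PIPELINE_TEMPLATE.substitute(schema=schema, table=table)
        for schema, table in tables
    ]


# The same call covers 10 tables or 1000: only the input list changes.
small = generate_pipelines([("dbo", f"table_{i}") for i in range(10)])
large = generate_pipelines([("dbo", f"table_{i}") for i in range(1000)])
print(len(small), len(large))
print(small[0])
```

The point of the sketch is that the template is written once, so adding sources means adding metadata rows rather than hand-writing new pipelines.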

For more information about Sidra Service, you can check its documentation page.

Data Product

Any actor accessing data in a Data Storage Unit (DSU) for a specific business need is a Data Product.

Data Products use their own set of tools and can retrieve data from one or many DSUs, applying business logic when needed.

Granular access control is applied to each Data Product to preserve data security. For instance, we can configure a Data Product to use just a particular set of entities and prevent access to other data.
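
As a rough illustration of per-product entity access, the following Python sketch models an allow-list check. It is an assumption-laden example, not Sidra's access control mechanism: `DataProductPolicy`, `read_entity`, the entity names and the in-memory `store` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProductPolicy:
    """Hypothetical allow-list of entities a Data Product may read."""
    name: str
    allowed_entities: frozenset = field(default_factory=frozenset)

    def can_read(self, entity: str) -> bool:
        return entity in self.allowed_entities


def read_entity(policy, entity, store):
    """Return the entity's data, or raise if the policy does not allow it."""
    if not policy.can_read(entity):
        raise PermissionError(f"{policy.name} may not access {entity}")
    return store[entity]


# Illustrative in-memory stand-in for data held in a DSU.
store = {"sales.orders": [1, 2], "hr.salaries": [3, 4]}
policy = DataProductPolicy("reporting", frozenset({"sales.orders"}))

print(read_entity(policy, "sales.orders", store))
try:
    read_entity(policy, "hr.salaries", store)
except PermissionError as err:
    print(err)
```

Here the `reporting` product can read `sales.orders` but is refused `hr.salaries`, which mirrors the idea of restricting a Data Product to a particular set of entities.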

Data Products can be deployed in multiple instances and different geographical zones.

For more information about Data Products, you can check their documentation page.