Sidra Data Lake Approach to Data Intake Processes

The Sidra Data Platform Overview describes Sidra as an end-to-end data platform that uses a data lake approach to integrate with source systems and bring their data into the platform domain. Bringing data into the platform domain is required to access data that usually sits in silos in operational systems, most of the time on-premises, and that needs to be made available for analytics consumption.

The data lake is one of the first steps in the overall architecture, which enables the fast setup of data products and applications based on analytical data. The data lake is used to standardize the Data Intake Process for different data sources and their mapping to the Sidra Metadata system. This standardization makes it possible to carry out data governance use cases, such as security and granular access control, and to define and enforce data integration standards.

You can find more information about the Sidra metadata model related to data ingestion on this page.

Azure Data Lake Storage Gen2 (ADLS Gen2) is the service where all the data for every data Provider is added to the system to make it available for downstream consumption.

In contrast to traditional Data Warehouses, data lakes store information in the purest and rawest format possible (the concept of the immutable data lake), whether the data is structured or unstructured. This simplifies the data ingestion logic by shifting the paradigm from ETL (extract-transform-load) to ELT (extract-load-transform), and puts the focus on how each Data Product uses this data.
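
The ELT idea above can be illustrated with a minimal, in-memory sketch. It is not Sidra's implementation and does not use any Sidra or Azure API: `RawZone`, `extract`, `sales_total`, `row_count` and the path `provider_a/sales/2023-06-22.json` are hypothetical names invented for the example. The sketch only shows the ordering: data is loaded untouched into a write-once raw store, and each consumer applies its own transformation afterwards.

```python
import json


class RawZone:
    """Hypothetical write-once store standing in for the raw layer of a data lake."""

    def __init__(self):
        self._objects = {}

    def load(self, path: str, payload: bytes) -> None:
        # Raw data is immutable: an existing object is never overwritten.
        if path in self._objects:
            raise ValueError(f"{path} already exists; raw data is never overwritten")
        self._objects[path] = bytes(payload)

    def read(self, path: str) -> bytes:
        return self._objects[path]


def extract() -> bytes:
    # Stand-in for reading from an operational source system (assumed sample data).
    return b'[{"id": 1, "amount": "10.50"}, {"id": 2, "amount": "4.25"}]'


# Transformations run after loading, and each consumer defines its own.
def sales_total(raw: bytes) -> float:
    return sum(float(row["amount"]) for row in json.loads(raw))


def row_count(raw: bytes) -> int:
    return len(json.loads(raw))


path = "provider_a/sales/2023-06-22.json"  # hypothetical layout
raw_zone = RawZone()
raw_zone.load(path, extract())             # E and L: no transformation on the way in

total = sales_total(raw_zone.read(path))   # T: applied by one consumer
count = row_count(raw_zone.read(path))     # T: a different consumer, same raw data
```

Because the raw object is stored as-is, a new consumer can be added later with a different transformation without re-extracting anything from the source system.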

The next key piece of the end-to-end platform is the concept of Sidra Data Products. Data Products are the pieces of the Sidra Data Platform that enable business cases. They encompass the specific business transformation logic (application-specific data transformations and validations) and provide a data interface that serves final use cases (e.g., reporting) or exposes the data to other consuming applications in the enterprise.

You can find more information about Sidra Data Products key concepts here.

Last update: 2023-06-22