Sidra Infrastructure

Sidra deploys all required infrastructure using ARM templates and organizes resources into logical Resource Groups for efficient management and compliance.

Concepts

  • Azure Resource Manager (ARM) is the deployment and management service for Azure. It provides a consistent management layer that enables you to create, update, and delete resources in your Azure subscription.
  • A Resource Group is a container in Azure Resource Manager that holds related resources for an application.

Each Sidra Resource Group contains applications required by Sidra Core Service and related roles for proper platform operation.

ARM Templates declare the Azure resources used by the platform with all their properties in a JSON file, ensuring consistent and repeatable deployments.
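To illustrate the declarative JSON format, the following Python sketch builds a minimal ARM template and prints it. The storage account resource, the `storageName` parameter, and the API version are illustrative assumptions; they are not taken from Sidra's actual templates.

```python
import json


def build_minimal_template() -> dict:
    """Return a minimal ARM template that declares a single storage account.

    The resource, parameter name, and apiVersion are illustrative assumptions,
    not Sidra's real template content.
    """
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            # Supplied at deployment time.
            "storageName": {"type": "string"},
        },
        "resources": [
            {
                "type": "Microsoft.Storage/storageAccounts",
                "apiVersion": "2022-09-01",
                "name": "[parameters('storageName')]",
                # Deploy into the same region as the target Resource Group.
                "location": "[resourceGroup().location]",
                "sku": {"name": "Standard_LRS"},
                "kind": "StorageV2",
            }
        ],
    }


template = build_minimal_template()
print(json.dumps(template, indent=2))
```

Because the template only describes the desired resources and their properties, deploying the same file again should yield the same set of resources, which is what makes the deployments repeatable.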

For more information about Sidra deployment through ARM templates, check the documentation guide.

Resource Groups in Sidra

Sidra uses several types of Azure Resource Groups to organize and isolate different platform components:

  • Sidra Core Service Resource Group
  • DSU Resource Group (minimum 1 DSU Resource Group, with the ability to add additional DSUs)
  • Databricks DSU and Management App managed Resource Groups
  • Data Product Resource Groups (each Data Product deploys as an independent Resource Group)

The deployment scripts in Sidra ensure that these resources follow the defined naming conventions for resources and resource tags.

Each Sidra Data Product deploys its resources into a separate, dedicated Resource Group. This approach provides logical isolation, independent lifecycle management, and clear resource ownership for each Data Product.

Sidra Core Service Resource Group

The Sidra Core Service Resource Group contains the Azure resources required for cross-cutting, common services across Sidra, including:

  • App Service plan and App Services for Sidra Core Service API, Sidra Web, Authentication and Authorization services
  • Common services: Application Insights, SignalR, Log Analytics
  • Key Vault for secret management
  • Persistence layer through SQL Elastic Pool and SQL databases for normal Sidra operation: Sidra Services, Log, Authentication

Below is a logical high-level diagram depicting the key Azure resources inside the Sidra Core Service Resource Group:

[Diagram: core-resource-group]

The xxxx prefix on the service names will be replaced by the project name prefix (4 characters) chosen at Sidra installation time.
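The substitution can be sketched as a small helper. The function name `apply_prefix`, the example service name, and the restriction to alphanumeric characters are assumptions for illustration; the source only states that the prefix has 4 characters.

```python
import re

PLACEHOLDER = "xxxx"


def apply_prefix(placeholder_name: str, prefix: str) -> str:
    """Replace the leading 'xxxx' placeholder with a 4-character project prefix.

    Hypothetical helper: the alphanumeric restriction is an assumption.
    """
    if not re.fullmatch(r"[A-Za-z0-9]{4}", prefix):
        raise ValueError("prefix must be exactly 4 alphanumeric characters")
    if not placeholder_name.startswith(PLACEHOLDER):
        raise ValueError(f"name does not start with {PLACEHOLDER!r}")
    return prefix + placeholder_name[len(PLACEHOLDER):]


# "xxxx-example" is a made-up service name, not one from the diagram.
print(apply_prefix("xxxx-example", "demo"))  # demo-example
```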

The databases mentioned above are inside an Elastic Pool. During initial loads, we recommend scaling up these resources to at least a Standard tier with 400 eDTUs, although the size may be larger depending on the initial installation size and data intake needs.

The connection strings for these databases are stored in the Key Vault under the following default secret names:

  • ConnectionStrings--CoreContext
  • ConnectionStrings--DwContext
  • ConnectionStrings--IdentityServer
  • ConnectionStrings--LogContext
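The double hyphen in these names matches the convention of the ASP.NET Core Key Vault configuration provider, where `--` in a secret name stands for the `:` separator of a configuration section. The sketch below shows that mapping. The names are written without spaces around the double hyphen, since Key Vault secret names allow only alphanumeric characters and hyphens. Whether Sidra reads its secrets through exactly this mechanism is an assumption, and `secret_name_to_config_key` is a hypothetical helper.

```python
DEFAULT_SECRET_NAMES = [
    "ConnectionStrings--CoreContext",
    "ConnectionStrings--DwContext",
    "ConnectionStrings--IdentityServer",
    "ConnectionStrings--LogContext",
]


def secret_name_to_config_key(secret_name: str) -> str:
    """Map a Key Vault secret name to a hierarchical configuration key.

    'Section--Key' becomes 'Section:Key'. Assumes the ASP.NET Core naming
    convention applies; this is not confirmed by the Sidra documentation.
    """
    return secret_name.replace("--", ":")


for name in DEFAULT_SECRET_NAMES:
    print(f"{name} -> {secret_name_to_config_key(name)}")
```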

SignalR delivers notifications in real time, and Sidra Web displays them as they are generated.

Alerts and monitoring are managed through the Application Insights instance deployed in the Sidra Core Service Resource Group. The metrics can be analyzed in more detail through custom dashboards in Application Insights.

Details about the databases in Sidra Core Service

The main database, called Sidra, is composed of different schemas with different purposes. These are described in the Metadata section.

The DW database is a relational database with a dimensional model that stores operational and process data about the data intake (durations, stored volumes, etc.) of all Sidra Providers. This database is used by the Power BI operational reports as well as by the API and Sidra Web.

The Log database contains tables and views with logs of the platform.

The Authentication database is an IdentityServer4 database, responsible for keeping all the information related to authorizations required by the different agents that consume the services of the platform.

Details about the App Services in Sidra Core Service

The API App Service is a comprehensive service that provides management capabilities related to data ingestion, as well as general administration and configuration of the platform.

The Web App Service is used for the web admin portal. From this portal, users can visualize and manage data related to data intake, Data Products, logs, authorizations, and more.

Additionally, Sidra uses other services, such as the Authentication and Authorization Server, to manage user authorization and access.

DSU Resource Group

The Sidra DSU Resource Group contains the Azure resources required for the storage, orchestration, computing, cognitive, and machine learning services of a deployable Data Storage Unit (DSU).

Data Storage Units (DSUs) provide logical and physical isolation of data to help meet data compliance and regulatory requirements. Each DSU isolates not just the data storage, but also the compute, orchestration, intake, and ML models, so they can be co-located in specific geographical regions.

Even though a Sidra implementation can have multiple DSUs, they all form part of the same Data Lake, sharing the resources in Sidra Core Service for modules and functionalities like the Data Catalog, Metadata Management, Security model, and more.

A Sidra DSU Resource Group typically contains the following types of resources:

  • Orchestration (mainly through Azure Data Factory)
  • Storage accounts (for the Landing Zone, raw storage, and optimized storage)
  • Compute services (Databricks)
  • Model deployment, knowledge store and model management services (e.g., AI Services, AI Search Service, Container Registry Service, Azure ML Service)
  • Key Vault for secret management for DSU operations. This Key Vault is different from the one in Sidra Core Service to fulfill compliance requirements that may require physically separating data and keys

Azure Data Factory (ADF) is the main Microsoft service used for data integration and orchestration. ADF pipelines move data between the data sources and the Data Lake. The data intake processes run periodically and automatically through configured triggers, so that the Data Lake stays up to date and in sync with the data sources.

The computing operations on the data are executed through Databricks. Standard data transformations and optimizations are executed through this platform.

Below is a logical high-level diagram depicting the key Azure resources inside the DSU Resource Group:

[Diagram: dsu-resource-group]

The xxxx prefix on the service names will be replaced by the project name prefix (4 characters) chosen at Sidra installation time.

Data Product Resource Groups

Each Sidra Data Product deploys into its own dedicated Resource Group, providing complete isolation and independent lifecycle management. This approach offers several benefits:

  • Resource Isolation: Each Data Product has its own set of Azure resources
  • Independent Scaling: Data Products can scale independently based on their specific requirements
  • Lifecycle Management: Data Products can be deployed, updated, or decommissioned independently
  • Cost Management: Clear cost attribution and management per Data Product
  • Security Boundaries: Separate security and access controls for each Data Product

The resources in a Data Product Resource Group vary widely, because a Data Product in Sidra can take many different forms. See the section on Sidra Data Products for more details about what constitutes a Data Product in Sidra and its underlying resources.

As a general rule, most Data Products contain a set of common types of resources, such as:

  • Storage and persistence services
  • Compute services
  • Orchestration services
  • Common services (App Services for API, Web, Key Vault, etc.)

Each Data Product template has its own ARM template diagram, reflecting its specific requirements and architecture.

Networking

Sidra provides flexible networking options to accommodate different organizational requirements and security policies. The platform can deploy networking infrastructure in two primary configurations:

Sidra-Managed VNet Deployment

By default, Sidra creates and manages its own Virtual Network (VNet) during deployment. In this configuration:

  • Sidra creates a dedicated VNet with appropriate subnets for different service tiers
  • All Sidra resources (Core Service, DSUs, and Data Products) are deployed within this managed VNet
  • Network security groups and routing are configured automatically
  • This approach provides the simplest deployment experience with minimal networking prerequisites

Customer-Managed VNet Injection

For organizations with existing networking infrastructure or specific compliance requirements, Sidra can inject all its resources into an existing customer-managed VNet:

  • Sidra resources are deployed into subnets within the customer's existing VNet
  • The customer maintains control over the VNet configuration, routing, and security policies
  • This approach enables integration with existing network architectures and security frameworks
  • Requires pre-existing VNet infrastructure with appropriately configured subnets

VNet Supportability Note

If you choose to inject Sidra into a VNet that you manage, the VNet resources must not be created inside any of the Sidra Resource Groups. The VNet and its associated networking components (subnets, network security groups, route tables, etc.) must be managed separately from Sidra's Resource Groups to maintain proper separation of concerns and avoid deployment conflicts.
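This rule can be checked before deployment by reading the Resource Group out of the VNet's Azure resource ID. The sketch below is a minimal illustration, not part of Sidra's tooling: the function names, the subscription ID, and all Resource Group and VNet names are made-up examples.

```python
import re
from typing import Iterable

# Azure resource IDs contain a ".../resourceGroups/<name>/..." segment.
_RG_PATTERN = re.compile(r"/resourceGroups/([^/]+)", re.IGNORECASE)


def resource_group_of(resource_id: str) -> str:
    """Extract the Resource Group name from an Azure resource ID."""
    match = _RG_PATTERN.search(resource_id)
    if match is None:
        raise ValueError(f"no resource group segment in {resource_id!r}")
    return match.group(1)


def vnet_is_outside_sidra(vnet_id: str, sidra_resource_groups: Iterable[str]) -> bool:
    """Return True if the VNet lives outside every Sidra Resource Group."""
    # Resource Group names are case-insensitive, so compare in lowercase.
    sidra = {rg.lower() for rg in sidra_resource_groups}
    return resource_group_of(vnet_id).lower() not in sidra


# Hypothetical names for illustration only.
sidra_groups = ["rg-sidra-core", "rg-sidra-dsu"]
vnet_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/rg-network"
    "/providers/Microsoft.Network/virtualNetworks/vnet-hub"
)
print(vnet_is_outside_sidra(vnet_id, sidra_groups))  # True
```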

Network Security Considerations

Regardless of the networking approach chosen:

  • All inter-service communication uses secure protocols and Azure-native authentication
  • Network access can be restricted using Azure Private Endpoints where supported
  • Integration with Azure Active Directory (now Microsoft Entra ID) provides identity-based access control
  • Network traffic can be monitored and logged through Azure Monitor and Network Watcher