Sidra Data Platform deployment

Each installation of Sidra Data Platform deploys its infrastructure in Azure. For that purpose, it uses a deployment project containing the code and the artifacts required for the deployment:

  • PowerShell scripts
  • Azure Resource Manager (ARM) templates
  • Databricks Notebooks

Among the deployment scripts, two stand out as the main entry points for the deployment:

  • Initialization script: creates all the Azure Active Directory applications needed by Core, and the resource groups with their related roles. It must be executed once before the first deployment.

  • Orchestration script: defines the parametrization and the deployment order of all the resources. It reads its configuration parameters from an environment configuration file, whose values can be modified to tune the deployment. Depending on the size (S, M, L, XL), it also uses a different set of size parameters, which are stored in a size configuration file.

The orchestration script receives several parameters, including:

Environment parameter

The environment parameter selects the deployment environment, for example Dev, Test or Prod. It is used for:

  • Naming resources in Azure. Some of the resources include the environment in their names.
  • Selecting the configuration file. Since that file is specific to each environment, it is called the environment configuration file.

The environment configuration file contains a set of parameters with the configuration for that specific environment. The filename must be {environment}Data.psd1, e.g. DevData.psd1, TestData.psd1 or ProdData.psd1. The parameters included in the environment configuration file can be modified to tune the deployment.

Installation size parameter

The installation size parameter selects the size of the deployment. The supported values are S, M, L and XL. Depending on the size, the orchestration script uses a different size configuration file:

  • SCore.psd1, MCore.psd1, LCore.psd1, XLCore.psd1 are the filenames for the size configuration files used in the deployment of Sidra Core.
  • SClient.psd1, MClient.psd1, LClient.psd1, XLClient.psd1 are the filenames for the size configuration files used in the deployment of a Client app.

Each size configuration file contains a set of predefined size parameters. The parameters included in the size configuration file must not be modified since those files are included in a NuGet package and any change will be overwritten each time the package is updated.
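
The naming conventions above can be illustrated with a short Python sketch. It is not part of the Sidra deployment project (which is written in PowerShell); the function names are hypothetical and only reproduce the filename patterns described in this section.

```python
# Hypothetical helpers; they only encode the filename conventions described above.
SUPPORTED_SIZES = ("S", "M", "L", "XL")
SUPPORTED_TARGETS = ("Core", "Client")


def environment_config_filename(environment: str) -> str:
    """Return the environment configuration file name, e.g. 'DevData.psd1'."""
    return f"{environment}Data.psd1"


def size_config_filename(size: str, target: str = "Core") -> str:
    """Return the size configuration file name, e.g. 'SCore.psd1'."""
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"Unsupported size: {size!r}")
    if target not in SUPPORTED_TARGETS:
        raise ValueError(f"Unsupported target: {target!r}")
    return f"{size}{target}.psd1"


print(environment_config_filename("Test"))   # TestData.psd1
print(size_config_filename("XL", "Client"))  # XLClient.psd1
```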

How to override the parameters of a size configuration file

In order to override the value of a size parameter, it is necessary to include a parameter with the same name in the appropriate environment configuration file.

For example, suppose the S size is used in the deployment of Core. These are the size parameters for Databricks in SCore.psd1:

# Databricks 
# ...
DatabricksClusterMinWorkers = 1
DatabricksClusterMaxWorkers = 4
#...

In the Test environment, however, more cluster workers are required, so the following parameters are added to TestData.psd1:

DatabricksClusterMinWorkers = 2
DatabricksClusterMaxWorkers = 8
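
The effect of the override can be sketched in Python as a simple dictionary merge. This is only an illustration of the rule described above, assuming that a parameter in the environment configuration file replaces the parameter with the same name in the size configuration file; the function name is hypothetical and the actual orchestration script may implement the merge differently.

```python
def effective_parameters(size_parameters: dict, environment_parameters: dict) -> dict:
    """Combine both parameter sets; environment values win on a name clash."""
    merged = dict(size_parameters)         # start from the size defaults
    merged.update(environment_parameters)  # apply the environment overrides
    return merged


# Values taken from the SCore.psd1 and TestData.psd1 examples above.
s_core = {"DatabricksClusterMinWorkers": 1, "DatabricksClusterMaxWorkers": 4}
test_data = {"DatabricksClusterMinWorkers": 2, "DatabricksClusterMaxWorkers": 8}

effective = effective_parameters(s_core, test_data)
print(effective)
# {'DatabricksClusterMinWorkers': 2, 'DatabricksClusterMaxWorkers': 8}
```

Size parameters that are not overridden keep the value from the size configuration file.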

Size parameters for Sidra Core

This is the list of all the size parameters organized by Azure resource and including the value configured for each size.

App Service Plan

Parameter S M L XL
AppServicePlanWindowsSkuName B1 S1 P1V2 P2V2
AppServicePlanWindowsSkuTier Basic Standard Premium Premium
AppServicePlanWindowsSkuSize B1 S1 P1V2 P2V2
AppServicePlanWindowsSkuFamily B S P P
AppServicePlanWindowsSkuCapacity 1 1 2 2

For more info check the official documentation.

Note: The ARM template creates App Service plans using API version 2015-08-01, but no documentation for that specific API version has been found.

Azure Container Registry

Parameter S M L XL
AcrSku Basic Basic Standard Premium

For more info check the official documentation.

Azure Kubernetes Service

Parameter S M L XL
AksOsDiskSizeGB 0 0 0 0
AksAgentCount 3 3 3 6
AksAgentVMSize Standard_D2_v2 Standard_D2_v2 Standard_D3_v2 Standard_D3_v2

For more info check the official documentation.

Analysis Services

Parameter S M L XL
AnalysisServicesSku Developer B1 B2 S2

For more info check the official documentation.

Cognitive Services

Parameter S M L XL
CognitiveServicesSku S0 S0 S0 S0

For more info check the official documentation.

Databases

The Core deployment creates four databases:

  • Core is the metadata database
  • Log contains the logs
  • DW contains an internal Data Warehouse
  • IdentityServer is used by the Identity Server for user authentication

The name of the database is used as a prefix in the name of the size parameters. For example, LCore.psd1 contains the following parameters:

# Main (Core) Database
CoreSizeInGB = 0 # As this is totally managed by ElasticPool
CoreSkuName = "ElasticPool"

# Log Database
LogSizeInGB = 0 # As this is totally managed by ElasticPool
LogSkuName = "ElasticPool"

# DataWarehouse Database
DWSizeInGB = 0 # As this is totally managed by ElasticPool
DWSkuName = "ElasticPool"

# IdentityServer Database
IdentityServerSizeInGB = 0 # As this is totally managed by ElasticPool
IdentityServerSkuName = "ElasticPool"

There are two ways to configure the size parameters for the databases:

Using an ElasticPool

This is the default configuration for all the sizes. The databases configured this way are included in the same Elastic pool. More information about Elastic pools can be found in Microsoft official docs. As the previous example shows, only the following parameters need to be set:

Parameter Value for all the sizes
[DatabaseName]SizeInGB 0. A value is required, but it is ignored when the database is deployed
[DatabaseName]SkuName ElasticPool

The deployment takes care of moving the database into the pool and of setting up the right Sku configuration for the database. An elastic pool resource is created only if at least one of the databases has a SkuName value equal to ElasticPool. All databases using the elastic pool share the same elastic pool instance.
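
The rule for creating the elastic pool can be sketched in Python as follows. This is a minimal illustration, not the deployment code: the function names are hypothetical, and the parameters are assumed to be available as a plain dictionary keyed by parameter name.

```python
# Database names created by the Core deployment, as listed above.
CORE_DATABASES = ("Core", "Log", "DW", "IdentityServer")


def databases_in_elastic_pool(parameters: dict, databases=CORE_DATABASES) -> list:
    """Return the databases whose [DatabaseName]SkuName is 'ElasticPool'."""
    return [
        name for name in databases
        if parameters.get(f"{name}SkuName") == "ElasticPool"
    ]


def elastic_pool_required(parameters: dict, databases=CORE_DATABASES) -> bool:
    """The pool is needed as soon as one database is configured to use it."""
    return len(databases_in_elastic_pool(parameters, databases)) > 0


# Example: Core uses a Sku model, the other databases stay in the pool.
parameters = {
    "CoreSkuName": "Standard",
    "LogSkuName": "ElasticPool",
    "DWSkuName": "ElasticPool",
    "IdentityServerSkuName": "ElasticPool",
}
print(databases_in_elastic_pool(parameters))  # ['Log', 'DW', 'IdentityServer']
print(elastic_pool_required(parameters))      # True
```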

Using a Sku model

Since there are no default values for the size parameters used in this configuration, they must always be added to the environment configuration file. The values supported by each parameter can be retrieved from the Database Editions List shown below. These are the size parameters for the Sku model:

Parameter Supported values
[DatabaseName]SizeInGB It is the max size of the database in GB, e.g. 50, 100, 200, etc. It should be compatible with the ServiceObjective selected. Limits can be checked in Microsoft official docs
[DatabaseName]SkuName See column Sku in the Database Editions List below
[DatabaseName]SkuTier See column Edition in the Database Editions List below
[DatabaseName]SkuCapacity See column Capacity in the Database Editions List below
[DatabaseName]SkuSize Ignored at the moment
[DatabaseName]SkuFamily Family (leave null or empty if it is N/A)

The following parameter configuration will create or scale the Core database to the Standard tier with 50 DTU (commonly known as the "S2" tier) and a max size of 250 GB:

CoreSizeInGB = 250
CoreSkuName = "Standard"
CoreSkuTier = "Standard"
CoreSkuCapacity = "50"
CoreSkuSize = ""
CoreSkuFamily = ""
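
To check which ServiceObjective a Sku model configuration corresponds to, the parameters can be matched against rows of the Database Editions List. The Python sketch below is only an illustration: it uses a small excerpt of rows copied from that list, the function name is hypothetical, and it assumes that an empty SkuFamily corresponds to the N/A value in the Family column.

```python
# (ServiceObjective, Sku, Edition, Family, Capacity) rows copied from the
# Database Editions List; only a small excerpt is included here.
EDITIONS_EXCERPT = [
    ("S0", "Standard", "Standard", "N/A", 10),
    ("S1", "Standard", "Standard", "N/A", 20),
    ("S2", "Standard", "Standard", "N/A", 50),
    ("S3", "Standard", "Standard", "N/A", 100),
    ("P1", "Premium", "Premium", "N/A", 125),
    ("GP_Gen5_2", "GP_Gen5", "GeneralPurpose", "Gen5", 2),
]


def find_service_objective(sku_name, sku_tier, sku_capacity, sku_family=""):
    """Return the matching ServiceObjective, or None if no row matches."""
    family = sku_family or "N/A"  # assumption: empty family means N/A
    wanted = (sku_name, sku_tier, family, int(sku_capacity))
    for objective, sku, edition, row_family, capacity in EDITIONS_EXCERPT:
        if (sku, edition, row_family, capacity) == wanted:
            return objective
    return None


# Values from the Core example above.
print(find_service_objective("Standard", "Standard", "50", ""))  # S2
```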

Database Editions List

The following table shows the list of editions for databases. It was generated from the information for the North Europe region and can be retrieved using this Azure CLI command:

az sql db list-editions -l northeurope -o table

ServiceObjective Sku Edition Family Capacity Unit Available
System System System N/A 0 DTU False
System0 System System N/A 0 DTU False
System1 System System N/A 0 DTU False
System2 System System N/A 0 DTU False
System3 System System N/A 0 DTU False
System4 System System N/A 0 DTU False
System2L System System N/A 0 DTU False
System3L System System N/A 0 DTU False
System4L System System N/A 0 DTU False
Free Free Free N/A 5 DTU True
Basic Basic Basic N/A 5 DTU True
S0 Standard Standard N/A 10 DTU True
S1 Standard Standard N/A 20 DTU True
S2 Standard Standard N/A 50 DTU True
S3 Standard Standard N/A 100 DTU True
S4 Standard Standard N/A 200 DTU True
S6 Standard Standard N/A 400 DTU True
S7 Standard Standard N/A 800 DTU True
S9 Standard Standard N/A 1600 DTU True
S12 Standard Standard N/A 3000 DTU True
P1 Premium Premium N/A 125 DTU True
P2 Premium Premium N/A 250 DTU True
P4 Premium Premium N/A 500 DTU True
P6 Premium Premium N/A 1000 DTU True
P11 Premium Premium N/A 1750 DTU True
P15 Premium Premium N/A 4000 DTU True
DW100c DataWarehouse DataWarehouse N/A 900 DTU True
DW200c DataWarehouse DataWarehouse N/A 1800 DTU True
DW300c DataWarehouse DataWarehouse N/A 2700 DTU True
DW400c DataWarehouse DataWarehouse N/A 3600 DTU True
DW500c DataWarehouse DataWarehouse N/A 4500 DTU True
DW1000c DataWarehouse DataWarehouse N/A 9000 DTU True
DW1500c DataWarehouse DataWarehouse N/A 13500 DTU True
DW2000c DataWarehouse DataWarehouse N/A 18000 DTU True
DW2500c DataWarehouse DataWarehouse N/A 22500 DTU True
DW3000c DataWarehouse DataWarehouse N/A 27000 DTU True
DW5000c DataWarehouse DataWarehouse N/A 45000 DTU True
DW6000c DataWarehouse DataWarehouse N/A 54000 DTU True
DW7500c DataWarehouse DataWarehouse N/A 67500 DTU False
DW10000c DataWarehouse DataWarehouse N/A 90000 DTU False
DW15000c DataWarehouse DataWarehouse N/A 135000 DTU False
DW30000c DataWarehouse DataWarehouse N/A 270000 DTU False
DS100 Stretch Stretch N/A 750 DTU True
DS200 Stretch Stretch N/A 1500 DTU True
DS300 Stretch Stretch N/A 2250 DTU True
DS400 Stretch Stretch N/A 3000 DTU True
DS500 Stretch Stretch N/A 3750 DTU True
DS600 Stretch Stretch N/A 4500 DTU True
DS1000 Stretch Stretch N/A 7500 DTU True
DS1200 Stretch Stretch N/A 9000 DTU True
DS1500 Stretch Stretch N/A 11250 DTU True
DS2000 Stretch Stretch N/A 15000 DTU True
GP_Gen4_1 GP_Gen4 GeneralPurpose Gen4 1 VCores True
GP_S_Gen5_1 GP_S_Gen5 GeneralPurpose Gen5 1 VCores True
GP_Gen4_2 GP_Gen4 GeneralPurpose Gen4 2 VCores True
GP_Gen5_2 GP_Gen5 GeneralPurpose Gen5 2 VCores True
GP_S_Gen5_2 GP_S_Gen5 GeneralPurpose Gen5 2 VCores True
GP_Gen4_3 GP_Gen4 GeneralPurpose Gen4 3 VCores True
GP_Gen4_4 GP_Gen4 GeneralPurpose Gen4 4 VCores True
GP_Gen5_4 GP_Gen5 GeneralPurpose Gen5 4 VCores True
GP_S_Gen5_4 GP_S_Gen5 GeneralPurpose Gen5 4 VCores True
GP_Gen4_5 GP_Gen4 GeneralPurpose Gen4 5 VCores True
GP_Gen4_6 GP_Gen4 GeneralPurpose Gen4 6 VCores True
GP_Gen5_6 GP_Gen5 GeneralPurpose Gen5 6 VCores True
GP_S_Gen5_6 GP_S_Gen5 GeneralPurpose Gen5 6 VCores True
GP_Gen4_7 GP_Gen4 GeneralPurpose Gen4 7 VCores True
GP_Gen4_8 GP_Gen4 GeneralPurpose Gen4 8 VCores True
GP_Gen5_8 GP_Gen5 GeneralPurpose Gen5 8 VCores True
GP_S_Gen5_8 GP_S_Gen5 GeneralPurpose Gen5 8 VCores True
GP_Gen4_9 GP_Gen4 GeneralPurpose Gen4 9 VCores True
GP_Gen4_10 GP_Gen4 GeneralPurpose Gen4 10 VCores True
GP_Gen5_10 GP_Gen5 GeneralPurpose Gen5 10 VCores True
GP_S_Gen5_10 GP_S_Gen5 GeneralPurpose Gen5 10 VCores True
GP_Gen5_12 GP_Gen5 GeneralPurpose Gen5 12 VCores True
GP_S_Gen5_12 GP_S_Gen5 GeneralPurpose Gen5 12 VCores True
GP_Gen5_14 GP_Gen5 GeneralPurpose Gen5 14 VCores True
GP_S_Gen5_14 GP_S_Gen5 GeneralPurpose Gen5 14 VCores True
GP_Gen4_16 GP_Gen4 GeneralPurpose Gen4 16 VCores True
GP_Gen5_16 GP_Gen5 GeneralPurpose Gen5 16 VCores True
GP_S_Gen5_16 GP_S_Gen5 GeneralPurpose Gen5 16 VCores True
GP_Gen5_18 GP_Gen5 GeneralPurpose Gen5 18 VCores True
GP_Gen5_20 GP_Gen5 GeneralPurpose Gen5 20 VCores True
GP_Gen4_24 GP_Gen4 GeneralPurpose Gen4 24 VCores True
GP_Gen5_24 GP_Gen5 GeneralPurpose Gen5 24 VCores True
GP_Gen5_32 GP_Gen5 GeneralPurpose Gen5 32 VCores True
GP_Gen5_40 GP_Gen5 GeneralPurpose Gen5 40 VCores True
GP_Fsv2_72 GP_Fsv2 GeneralPurpose Fsv2 72 VCores True
GP_Gen5_80 GP_Gen5 GeneralPurpose Gen5 80 VCores True
BC_Gen4_1 BC_Gen4 BusinessCritical Gen4 1 VCores True
BC_Gen4_2 BC_Gen4 BusinessCritical Gen4 2 VCores True
BC_Gen5_2 BC_Gen5 BusinessCritical Gen5 2 VCores True
BC_Gen4_3 BC_Gen4 BusinessCritical Gen4 3 VCores True
BC_Gen4_4 BC_Gen4 BusinessCritical Gen4 4 VCores True
BC_Gen5_4 BC_Gen5 BusinessCritical Gen5 4 VCores True
BC_Gen4_5 BC_Gen4 BusinessCritical Gen4 5 VCores True
BC_Gen4_6 BC_Gen4 BusinessCritical Gen4 6 VCores True
BC_Gen5_6 BC_Gen5 BusinessCritical Gen5 6 VCores True
BC_Gen4_7 BC_Gen4 BusinessCritical Gen4 7 VCores True
BC_Gen4_8 BC_Gen4 BusinessCritical Gen4 8 VCores True
BC_Gen5_8 BC_Gen5 BusinessCritical Gen5 8 VCores True
BC_M_8 BC_M BusinessCritical M 8 VCores True
BC_Gen4_9 BC_Gen4 BusinessCritical Gen4 9 VCores True
BC_Gen4_10 BC_Gen4 BusinessCritical Gen4 10 VCores True
BC_Gen5_10 BC_Gen5 BusinessCritical Gen5 10 VCores True
BC_M_10 BC_M BusinessCritical M 10 VCores True
BC_Gen5_12 BC_Gen5 BusinessCritical Gen5 12 VCores True
BC_M_12 BC_M BusinessCritical M 12 VCores True
BC_Gen5_14 BC_Gen5 BusinessCritical Gen5 14 VCores True
BC_M_14 BC_M BusinessCritical M 14 VCores True
BC_Gen4_16 BC_Gen4 BusinessCritical Gen4 16 VCores True
BC_Gen5_16 BC_Gen5 BusinessCritical Gen5 16 VCores True
BC_M_16 BC_M BusinessCritical M 16 VCores True
BC_Gen5_18 BC_Gen5 BusinessCritical Gen5 18 VCores True
BC_M_18 BC_M BusinessCritical M 18 VCores True
BC_Gen5_20 BC_Gen5 BusinessCritical Gen5 20 VCores True
BC_M_20 BC_M BusinessCritical M 20 VCores True
BC_Gen4_24 BC_Gen4 BusinessCritical Gen4 24 VCores True
BC_Gen5_24 BC_Gen5 BusinessCritical Gen5 24 VCores True
BC_M_24 BC_M BusinessCritical M 24 VCores True
BC_Gen5_32 BC_Gen5 BusinessCritical Gen5 32 VCores True
BC_M_32 BC_M BusinessCritical M 32 VCores True
BC_Gen5_40 BC_Gen5 BusinessCritical Gen5 40 VCores True
BC_M_64 BC_M BusinessCritical M 64 VCores True
BC_Gen5_80 BC_Gen5 BusinessCritical Gen5 80 VCores True
BC_M_128 BC_M BusinessCritical M 128 VCores True
HS_Gen4_1 HS_Gen4 Hyperscale Gen4 1 VCores True
HS_Gen4_2 HS_Gen4 Hyperscale Gen4 2 VCores True
HS_Gen5_2 HS_Gen5 Hyperscale Gen5 2 VCores True
HS_Gen4_3 HS_Gen4 Hyperscale Gen4 3 VCores True
HS_Gen4_4 HS_Gen4 Hyperscale Gen4 4 VCores True
HS_Gen5_4 HS_Gen5 Hyperscale Gen5 4 VCores True
HS_Gen4_5 HS_Gen4 Hyperscale Gen4 5 VCores True
HS_Gen4_6 HS_Gen4 Hyperscale Gen4 6 VCores True
HS_Gen5_6 HS_Gen5 Hyperscale Gen5 6 VCores True
HS_Gen4_7 HS_Gen4 Hyperscale Gen4 7 VCores True
HS_Gen4_8 HS_Gen4 Hyperscale Gen4 8 VCores True
HS_Gen5_8 HS_Gen5 Hyperscale Gen5 8 VCores True
HS_Gen4_9 HS_Gen4 Hyperscale Gen4 9 VCores True
HS_Gen4_10 HS_Gen4 Hyperscale Gen4 10 VCores True
HS_Gen5_10 HS_Gen5 Hyperscale Gen5 10 VCores True
HS_Gen5_12 HS_Gen5 Hyperscale Gen5 12 VCores True
HS_Gen5_14 HS_Gen5 Hyperscale Gen5 14 VCores True
HS_Gen4_16 HS_Gen4 Hyperscale Gen4 16 VCores True
HS_Gen5_16 HS_Gen5 Hyperscale Gen5 16 VCores True
HS_Gen5_18 HS_Gen5 Hyperscale Gen5 18 VCores True
HS_Gen5_20 HS_Gen5 Hyperscale Gen5 20 VCores True
HS_Gen4_24 HS_Gen4 Hyperscale Gen4 24 VCores True
HS_Gen5_24 HS_Gen5 Hyperscale Gen5 24 VCores True
HS_Gen5_32 HS_Gen5 Hyperscale Gen5 32 VCores True
HS_Gen5_40 HS_Gen5 Hyperscale Gen5 40 VCores True
HS_Gen5_80 HS_Gen5 Hyperscale Gen5 80 VCores True

For more info check the official documentation.
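
When the list needs to be filtered, for instance to keep only the editions that are available, the table text can be parsed with a few lines of Python. This is a minimal sketch that assumes whitespace-separated columns in the layout shown above; the function name is hypothetical, and the output of the Azure CLI may differ between versions (rows consisting only of dashes are skipped for that reason).

```python
def parse_editions_table(text: str) -> list:
    """Parse whitespace-separated table text into a list of dictionaries."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if all(set(field) == {"-"} for field in fields):
            continue  # skip a separator row made of dashes
        if len(fields) != len(header):
            continue  # skip rows that do not match the header layout
        row = dict(zip(header, fields))
        row["Capacity"] = int(row["Capacity"])
        row["Available"] = row["Available"] == "True"
        rows.append(row)
    return rows


# A few rows copied from the list above.
SAMPLE = """\
ServiceObjective Sku Edition Family Capacity Unit Available
System System System N/A 0 DTU False
S2 Standard Standard N/A 50 DTU True
GP_Gen5_2 GP_Gen5 GeneralPurpose Gen5 2 VCores True
DW7500c DataWarehouse DataWarehouse N/A 67500 DTU False
"""

available = [
    row["ServiceObjective"]
    for row in parse_editions_table(SAMPLE)
    if row["Available"]
]
print(available)  # ['S2', 'GP_Gen5_2']
```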

Databricks

Parameter S M L XL
DatabricksPricingTier premium premium premium premium
DatabricksClusterNodeTypeId Standard_DS3_v2 Standard_DS3_v2 Standard_D8s_v3 Standard_D16s_v3
DatabricksClusterDriverNodeTypeId Standard_DS3_v2 Standard_DS3_v2 Standard_D8s_v3 Standard_D16s_v3

For more info check the official Microsoft documentation and official Databricks documentation.

Elastic pool

The Elastic pool is created only if at least one database has the value 'ElasticPool' configured in the 'requestedServiceObjectiveName' parameter.

Parameter S M L XL
ElasticPoolDtu 50 100 200 400
ElasticPoolDatabaseDtuMin 0 0 0 0
ElasticPoolDatabaseDtuMax 50 100 200 400
ElasticPoolEdition Standard Standard Standard Standard

For more info check the official documentation.

Identity Server

Parameter S M L XL
IdentityServerSendgridPlan free free free free

KeyVault

Parameter S M L XL
KeyVaultSkuName Standard Standard Standard Standard
KeyVaultSkuFamily A A A A

For more info check the official documentation.

Operations Management Suite

Parameter S M L XL
OMSTier Free Free Free Free
OMSRetentionDays 7 30 30 30

For more info check the official documentation.

Search services

Parameter S M L XL
SearchServicesSku free standard standard standard2
SearchServicesReplicaCount 1 1 1 1
SearchServicesPartitionCount 1 1 1 1
SearchServicesHostingMode default default default default

For more info check the official documentation.

Storage Accounts

Parameter S M L XL
StorageAccessTier Hot Hot Hot Hot
StorageKind StorageV2 StorageV2 StorageV2 StorageV2
StorageSkuName Standard_LRS Standard_LRS Standard_LRS Standard_LRS
StorageSkuTier Standard Standard Standard Standard

For more info check the official documentation.

Virtual Machines

Parameter S M L XL
VMsCount 2 2 2 2
VirtualMachineSize Standard_A1_v2 Standard_A1_v2 Standard_A2_v2 Standard_A4_v2
VMOSDiskType Standard_LRS Standard_LRS Standard_LRS

For more info check the official documentation.

SignalR Service

Parameter S M L XL
VMsCount Free_F1 Free_F1 Standard_S1 Standard_S1

For more info check the official documentation.

Size parameters for Sidra Client app

App Service Plan

Parameter S M L XL
AppServicePlanWindowsSkuName B1 S1 P1V2 P2V2
AppServicePlanWindowsSkuTier Basic Standard Premium Premium
AppServicePlanWindowsSkuSize B1 S1 P1V2 P2V2
AppServicePlanWindowsSkuFamily B S P P
AppServicePlanWindowsSkuCapacity 1 1 2 2

For more info check the official documentation.

Note: The ARM template creates App Service plans using API version 2015-08-01, but no documentation for that specific API version has been found.

Analysis Services

Parameter S M L XL
AnalysisServicesSku Developer B1 B2 S2

For more info check the official documentation.

Database

A single database is created for Client apps: Client. The name of the database is used as a prefix in the name of the size parameters. For example, LClient.psd1 contains the following parameters:

# Client Database
ClientSizeInGB = 250 
ClientSkuName = "Standard"
ClientSkuTier = "Standard"
ClientSkuCapacity = "50"
ClientSkuSize = ""
ClientSkuFamily = ""

There is only one way to configure the database: using a Sku model. The values supported by each parameter can be retrieved from the Database Editions List shown above. These are the size parameters for the Sku model:

Parameter Supported values
ClientSizeInGB It is the max size of the database in GB, e.g. 50, 100, 200, etc. It should be compatible with the ServiceObjective selected. Limits can be checked in Microsoft official docs
ClientSkuName See column Sku in the Database Editions List above
ClientSkuTier See column Edition in the Database Editions List above
ClientSkuCapacity See column Capacity in the Database Editions List above
ClientSkuSize Ignored at the moment
ClientSkuFamily Family (leave null or empty if it is N/A)

For more info check the official documentation.

KeyVault

Parameter S M L XL
KeyVaultSkuName Standard Standard Standard
KeyVaultSkuFamily A A A A

For more info check the official documentation.

Storage Accounts

Parameter S M L XL
StorageAccessTier Hot Hot Hot Hot
StorageKind StorageV2 StorageV2 StorageV2 StorageV2
StorageSkuName Standard_LRS Standard_LRS Standard_LRS Standard_LRS
StorageSkuTier Standard Standard Standard Standard

For more info check the official documentation.