Data Storage Units tables

Sidra allows the configuration of multiple Data Storage Units (DSUs), which together make up the Data Lake. The following set of tables contains information about the DSUs and the Azure resources they use.

DataStorageUnit table

This table contains information for each of the DSUs of the system.

Column Description
Id DSU identifier
Name Name of the DSU. It is also the name of the Azure resource that stores the information of the DSU
ResourceGroupName Name of the Azure Resource Group that includes all the DSU-related resources
ClusterName Name of the cluster
IdClusterType ClusterType identifier
SecurityPath Path of identifiers used for authorization
IdLocation Location identifier that references the [Management].[AzureRegion] table

ClusterType table

The ClusterType table contains data about the different types of cluster (e.g. Databricks, HDInsight Hadoop). It provides the Sidra Manager UI with additional information about the cluster configured in the system for each DSU.

Column Description
Id ClusterType identifier
ClusterTypeName Name of the cluster type, e.g. Databricks
ClusterTypeDescription Description of the cluster type
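
The relationship between DataStorageUnit.IdClusterType and ClusterType.Id can be illustrated with a small in-memory sketch. It uses SQLite from Python as a stand-in for the real metadata database; the column types, the sample rows, and the helper name cluster_types_by_dsu are assumptions made for illustration and are not part of Sidra.

```python
import sqlite3

# In-memory stand-in for the metadata database; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ClusterType (
    Id INTEGER PRIMARY KEY,
    ClusterTypeName TEXT,
    ClusterTypeDescription TEXT
);
CREATE TABLE DataStorageUnit (
    Id INTEGER PRIMARY KEY,
    Name TEXT,
    ResourceGroupName TEXT,
    ClusterName TEXT,
    IdClusterType INTEGER REFERENCES ClusterType(Id),
    SecurityPath TEXT,
    IdLocation INTEGER
);
-- Sample rows are invented for illustration only.
INSERT INTO ClusterType VALUES (1, 'Databricks', 'Sample description');
INSERT INTO DataStorageUnit
    VALUES (1, 'dsu-sample', 'rg-sample', 'cluster-sample', 1, NULL, NULL);
""")


def cluster_types_by_dsu(conn):
    """Return (DSU name, cluster name, cluster type name) for every DSU."""
    query = """
        SELECT d.Name, d.ClusterName, c.ClusterTypeName
        FROM DataStorageUnit AS d
        JOIN ClusterType AS c ON c.Id = d.IdClusterType
        ORDER BY d.Id
    """
    return conn.execute(query).fetchall()


print(cluster_types_by_dsu(conn))
```

A join of this kind is roughly what a UI would need in order to show the cluster type next to each DSU; how Sidra Manager actually retrieves it is not described here.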

Storage table

This table lists the Azure Storage accounts, the DSU each account is related to, and the role the account plays in that relationship.

Column Description
Id Storage identifier
Name Name of the Azure resource that stores the information
IdDataStorageUnit The identifier of the DSU associated with this Storage
IdStorageRole The identifier of the StorageRole associated with this Storage

StorageRole table

This table contains the list of roles that a Storage account can play in relation to a DSU.

Column Description
Id StorageRole identifier
StorageRoleName Name of a role that a storage account can play
StorageRoleDescription Description of the role

This table contains static data with the following rows:

Id StorageRoleName StorageRoleDescription
0 Principal The storage account is used as principal storage for data that is being used.
1 Staging The storage account is used as a temporary storage (staging) during data movement.
2 Backup The storage account is used for data backup.
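
Because these rows are static, client code can mirror them as constants. The following is a minimal sketch, assuming the Storage rows have already been loaded as dictionaries keyed by column name; the enum, the helper storages_with_role, and the sample rows are hypothetical and only the three Id values come from the table above.

```python
from enum import IntEnum


class StorageRole(IntEnum):
    """Mirror of the static StorageRole rows (Id -> StorageRoleName)."""
    PRINCIPAL = 0
    STAGING = 1
    BACKUP = 2


def storages_with_role(storages, role):
    """Return the names of Storage rows whose IdStorageRole matches role."""
    return [row["Name"] for row in storages if row["IdStorageRole"] == role]


# Hypothetical Storage rows; the names and ids are invented for illustration.
sample_storages = [
    {"Id": 10, "Name": "stsampleprincipal", "IdDataStorageUnit": 1, "IdStorageRole": 0},
    {"Id": 11, "Name": "stsamplestaging", "IdDataStorageUnit": 1, "IdStorageRole": 1},
]

print(storages_with_role(sample_storages, StorageRole.STAGING))
```

An IntEnum is used so that members compare equal to the integer stored in IdStorageRole without an explicit conversion.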

LandingZone table

This table contains information about the landing zones of the DSUs. A landing zone is a location in a Storage account configured so that any file copied there is ingested into the DSU.

Column Description
Id LandingZone identifier
IdStorage Identifier of the Storage associated with this LandingZone
BasePath Path to the landing zone in the Storage
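
The chain LandingZone.IdStorage -> Storage.Id and Storage.IdDataStorageUnit -> DataStorageUnit.Id is what links a landing zone to its DSU. The sketch below walks that chain using SQLite from Python as a stand-in for the real database. Column types, sample rows, and the helper landing_zones_for_dsu are assumptions, the DataStorageUnit table is reduced to two columns for brevity, and which storage role hosts a landing zone in a real deployment is not stated in this section.

```python
import sqlite3

# In-memory stand-in for the metadata database; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DataStorageUnit (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Storage (
    Id INTEGER PRIMARY KEY,
    Name TEXT,
    IdDataStorageUnit INTEGER REFERENCES DataStorageUnit(Id),
    IdStorageRole INTEGER
);
CREATE TABLE LandingZone (
    Id INTEGER PRIMARY KEY,
    IdStorage INTEGER REFERENCES Storage(Id),
    BasePath TEXT
);
-- Sample rows are invented for illustration only.
INSERT INTO DataStorageUnit VALUES (1, 'dsu-sample');
INSERT INTO Storage VALUES (10, 'stsampleprincipal', 1, 0);
INSERT INTO Storage VALUES (11, 'stsamplestaging', 1, 1);
INSERT INTO LandingZone VALUES (100, 11, 'landing');
""")


def landing_zones_for_dsu(conn, dsu_id):
    """Return (storage account name, base path) for each landing zone of a DSU."""
    query = """
        SELECT s.Name, lz.BasePath
        FROM LandingZone AS lz
        JOIN Storage AS s ON s.Id = lz.IdStorage
        WHERE s.IdDataStorageUnit = ?
        ORDER BY lz.Id
    """
    return conn.execute(query, (dsu_id,)).fetchall()


print(landing_zones_for_dsu(conn, 1))
```

The DSU identifier is passed as a bound parameter rather than interpolated into the SQL string, which keeps the query safe if the value comes from user input.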