From the Data Catalog view, users can navigate or search through all the Providers and Entities which are visible given the permissions in place. Sidra Data Catalog provides data management capabilities from a user-friendly interface.
Sidra Data Catalog is based on the Sidra metadata hierarchy model. Please refer to this page for details about the different objects of the Sidra metadata model.
Sidra Data Catalog provides the following main functionalities:
- Visualize which data Assets (Providers, Entities and Attributes) are available in the platform.
- Visualize key indicators and figures about these data Assets, like total volumes or Attribute popularity. Some of these indicators at an aggregate level are also available in Sidra Web Dashboard.
- Visualize and search among these Assets by using different Attribute filters (name, owner, ...).
- Preview data in a secure and compliant way.
- Document the different Assets by means of enriched metadata, like adding enriched descriptions or assigning tags to the Data Catalog items.
It is important to note that this Data Catalog and all its underlying metadata are accessible through a secure API. This makes it easy to integrate with external data governance and data management tools outside of Sidra. Sidra has been integrated in the past with other tools (Alteryx Connect) by using this mechanism.
Below are the main sections in the Data Catalog web interface:
Main sections in Data Catalog
Providers view page
The Providers view page includes an overview of all the Providers that have been configured in Sidra for data ingestion.
This page includes a bar chart visualization of the Providers, sorted by the size of the data ingested into the platform.
The Providers view page also includes a list of all the configured Providers, in either a card view or a list view navigation style.
The card view allows an icon-based navigation experience. Each Provider card includes the key details of a Provider, and allows drilling through that Provider to see either the Provider detail page or the list of Entities that are associated with that Provider.
The Providers view page also allows changing the layout of the Providers to a list view. Each element in the Providers list view also supports the action menu to drill through to the Provider details and the Entities.
To switch between card view and list view, click on the Cards and List icons in the top right corner of the page.
Entities view page
This page is similar to the Providers view page, but displays the list of all the configured Entities for a specific Provider.
Provider detail page
The Provider detail page displays the main metadata elements corresponding to a Provider structure inside Sidra metadata, namely:
- Provider image
- General info
- Total size
- Creation date
The Provider detail page also displays a list of all the Entities that are associated with the Provider.
The Provider editor allows the user to document the Providers in depth.
Most of the interface real estate is dedicated to the documentation editor, whose General Info field is implemented with an embedded Markdown editor that allows the user to generate rich format documents with links to external documentation, etc.
In addition to the documentation editor with real time preview and optional full screen mode, there is also support for editing the Provider image, tags, etc.
Entity detail page
The Entity detail page displays the main metadata elements corresponding to an Entity structure inside Sidra metadata, in a similar way to how Provider details are shown in the Provider detail page.
Entity metadata can also be populated from the UI, including support for Entity documentation, tags, and a short description.
In addition to that, the Entity editor shows the Attributes along with the Attribute popularity, which is a measure of how often the specific Attribute is retrieved by the Client Applications in relation to the rest of the Entity's Attributes.
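The exact formula behind Attribute popularity is not given here. As a minimal sketch, assuming popularity is simply each Attribute's share of all retrievals recorded for the Entity, the idea could look like the following. The function name and the retrieval counts are hypothetical and only serve as illustration.

```python
from typing import Dict


def attribute_popularity(retrievals: Dict[str, int]) -> Dict[str, float]:
    """Return each Attribute's share of all retrievals for one Entity.

    Assumption: popularity is a plain ratio of retrieval counts. The real
    calculation used by the platform may differ.
    """
    total = sum(retrievals.values())
    if total == 0:
        # No retrievals recorded yet: every Attribute gets zero popularity.
        return {name: 0.0 for name in retrievals}
    return {name: count / total for name, count in retrievals.items()}


# Hypothetical retrieval counts for the Attributes of one Entity.
counts = {"CustomerId": 60, "Email": 30, "Notes": 10}
popularity = attribute_popularity(counts)
```

With this definition the values of one Entity always add up to 1, which makes Attributes comparable only within the same Entity.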
An Attributes detail page is being implemented as part of the Roadmap 2021 for Sidra Web.
The data preview section inside an Entity detail page displays a sample of the data ingested into the platform. By default, all Entities have data preview enabled.
Since some Entities might contain sensitive data that cannot be exposed to all Data Catalog users, specific Attributes can be masked, so that users without the required permissions only see the masked version. Admin users can see the full data without masking, while other user roles with no access to masked data see the sensitive Attributes masked according to the data masking defined in the Dynamic Data Masking configuration in Azure portal.
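The visible effect of this behaviour can be illustrated with a small conceptual sketch. In the platform the masking is performed by the database, not by application code, so the function below only models what a user ends up seeing. The role set, the `preview_row` helper and the sample mask are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable

# Assumption: roles that are allowed to see unmasked values.
UNMASKED_ROLES = {"Admin", "MaskedDataReader"}


def preview_row(
    row: Dict[str, str],
    masks: Dict[str, Callable[[str], str]],
    user_roles: Iterable[str],
) -> Dict[str, str]:
    """Return the row as a given user would see it in the data preview."""
    if UNMASKED_ROLES & set(user_roles):
        # Authorized users see the data in clear.
        return dict(row)
    # Everyone else sees masked values for the Attributes that define a mask.
    return {
        name: masks[name](value) if name in masks else value
        for name, value in row.items()
    }


# Hypothetical preview row and mask definition.
row = {"Name": "Alice", "Country": "ES"}
masks = {"Name": lambda value: value[:1] + "XXXXXXX"}

masked_view = preview_row(row, masks, ["Reader"])
clear_view = preview_row(row, masks, ["Admin"])
```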
By default, and similarly to the Attribute pane described above, the system only shows data for the business Attributes, hiding all internal Sidra ones. If the user wants to check a data preview for system columns, the "Show system attributes" option at the top of the page can be used to enable them.
Searching, sorting and filtering items
The Providers view page contains a Search interface for filtering the list of Providers by a specific search term on either the Name or the Description of the Provider.
This Search capability retrieves not only Providers, but also Entities based on the search string.
The Data Catalog also includes a feature for filtering the elements in the UI (Providers and Entities) on a number of Attributes.
The filter options open after clicking on the Filter icon in the top right corner of the page:
The Filter action allows filtering by different and composed criteria:
Next to the Filter icon there is also a Sorting icon in order to sort the items in the list of Providers and Entities by different criteria.
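The search and sort behaviour described in this section can be pictured with a short sketch. It assumes a case-insensitive substring match on name or description, which is a guess at the matching rule rather than documented behaviour. The item structure and helper names are hypothetical.

```python
from typing import Dict, List


def search_items(items: List[Dict[str, str]], term: str) -> List[Dict[str, str]]:
    """Keep the items whose name or description contains the search term."""
    needle = term.casefold()
    return [
        item
        for item in items
        if needle in item["name"].casefold()
        or needle in item.get("description", "").casefold()
    ]


def sort_items(
    items: List[Dict[str, str]], key: str, descending: bool = False
) -> List[Dict[str, str]]:
    """Sort the items by one of their fields, for example the name."""
    return sorted(items, key=lambda item: item[key], reverse=descending)


# Hypothetical catalog items: the search covers Providers and Entities alike.
items = [
    {"kind": "Provider", "name": "Sales", "description": "Orders from the online shop"},
    {"kind": "Entity", "name": "Customer", "description": "Sales contacts"},
    {"kind": "Entity", "name": "Invoice", "description": "Billing documents"},
]

matches = sort_items(search_items(items, "sales"), key="name")
```

Here the search term matches the Provider by name and one Entity by description, so both kinds of item appear in the result.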
Data Masking configuration
Sidra Web Data Catalog provides a data masking functionality in order to control who can see unmasked data in the Data Preview module inside the Entity detail view page.
By means of data masking, we ensure that only authorized users (who have been assigned the MaskedDataReader role) will be able to see the data in clear from the Data Preview tables.
The data masking is provided by the native Dynamic Data Masking functionality of SQL Server.
This feature incorporates several masking functions depending on the type of source data (e.g. email, credit card, custom).
The masking is applied on the tables under the schema DataPreview, which are created in the SQL Sidra Core database during the execution of the data ingestion processes (transfer query script).
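For reference, SQL Server attaches a dynamic data mask to a column with an `ALTER TABLE ... ALTER COLUMN ... ADD MASKED WITH (FUNCTION = '...')` statement. The sketch below only builds such a statement as text. It is not the script that the platform runs, and the table and column names are hypothetical.

```python
def quote_identifier(name: str) -> str:
    """Wrap a SQL Server identifier in brackets, escaping closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def add_mask_statement(
    table: str, column: str, mask_function: str, schema: str = "DataPreview"
) -> str:
    """Build the T-SQL statement that adds a dynamic data mask to a column."""
    # Single quotes inside the mask definition must be doubled in T-SQL.
    escaped_function = mask_function.replace("'", "''")
    return (
        f"ALTER TABLE {quote_identifier(schema)}.{quote_identifier(table)} "
        f"ALTER COLUMN {quote_identifier(column)} "
        f"ADD MASKED WITH (FUNCTION = '{escaped_function}');"
    )


# Hypothetical table and column, using the built-in email() masking function.
statement = add_mask_statement("Customer", "Email", "email()")
print(statement)
```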
Sidra Core implements in its metadata system support for defining, at Attribute level, which data masking function to apply.
The configuration is stored in the DataMask field of the Attribute.
For example, when defining a data masking rule with the custom function over an Attribute, the DataMask field of that Attribute will need to be updated with this information:
According to the Microsoft SQL Server implementation, we would apply a masking for all the characters except the first character of this String type field.
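To make the effect of such a rule concrete, the following sketch emulates in plain Python the SQL Server custom masking function, `partial(prefix, padding, suffix)`, which exposes the first and last characters of a value and puts a fixed padding string in between. It is an approximation for illustration only: the handling of values that are too short is simplified, and the sample value and padding string are hypothetical.

```python
def partial_mask(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Emulate a partial mask: keep both ends and replace the middle with padding."""
    if prefix < 0 or suffix < 0:
        raise ValueError("prefix and suffix must not be negative")
    if len(value) <= prefix + suffix:
        # Simplification: the value is too short to expose both ends,
        # so only the padding is shown.
        return padding
    head = value[:prefix]
    tail = value[len(value) - suffix:] if suffix else ""
    return head + padding + tail


# Expose only the first character, as in the rule described above.
masked = partial_mask("Alice", 1, "XXXXXXX", 0)
print(masked)  # AXXXXXXX
```

Note that the padding is inserted as a fixed string, so the masked value does not reveal the length of the original one.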