Sync UI for Data Products¶
Sidra's wizard now lets users check and interact with the different artifacts of a Sidra Data Product without making changes directly in the database, Azure App Service or ADF. This is done through the Settings button added inside each Data Product (image below). If the Settings button is not available, the Data Product needs to be updated.
Clicking on it gives access to the following actions in the Sidra UI.
The options enabled for the Sync are:
- Status: Running or stopped.
- Polling Interval (in minutes): field editable through the Edit button at the top right; it is set to 2 minutes by default. It defines how often the Sync checks whether there are new Assets. In Azure, this parameter can be checked in the configuration section of the Web App.
- Sync metadata when new Assets are available: a configurable field that, when enabled, syncs the metadata whenever a new Asset is detected, according to the polling interval defined.
- The user can also stop the Sync with the button at the top right of the page, which replaces the manual method of stopping or running it in the WebJobs section of the Data Product's App Service in Azure.
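The interaction between these options can be sketched as follows. This is only an illustrative model, not Sidra's implementation: the function names, the action labels `check_new_assets` and `sync_metadata`, and the way the status is passed in are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import List

DEFAULT_POLLING_INTERVAL_MINUTES = 2  # default value stated for the Sync


def is_poll_due(last_poll: datetime, now: datetime,
                interval_minutes: int = DEFAULT_POLLING_INTERVAL_MINUTES) -> bool:
    """Return True when at least one polling interval has elapsed."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return now - last_poll >= timedelta(minutes=interval_minutes)


def sync_tick(status: str, last_poll: datetime, now: datetime,
              sync_metadata_on_new_assets: bool, new_assets: List[str],
              interval_minutes: int = DEFAULT_POLLING_INTERVAL_MINUTES) -> List[str]:
    """Return the actions one tick would take (hypothetical action names)."""
    # A stopped Sync, or a Sync polled too recently, does nothing.
    if status != "Running" or not is_poll_due(last_poll, now, interval_minutes):
        return []
    actions = ["check_new_assets"]
    # Metadata is synced only when the option is enabled and Assets were found.
    if new_assets and sync_metadata_on_new_assets:
        actions.append("sync_metadata")
    return actions
```

With the default interval, a tick one minute after the last poll does nothing, while a tick two minutes after it checks for new Assets and, if the option is enabled and Assets were found, also syncs the metadata.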
This section shows the Data Product intake pipelines. New pipelines can now be created and existing pipelines can be managed by changing:
- The template used
- The behaviour to synchronize Assets
- The Entities to be consolidated
- The consolidation mode used
- The definition of the Staging tables
The different editing options are described below.
2.1 New pipeline¶
Using the New button at the top right, the user can add a new pipeline with the desired template and its parameters, for example the name of the stored procedure that loads the data, that is, the orchestrator. The sync behaviour can also be configured here with the values:
Load All New Dates,
Load First New Date,
Load Last New Date,
Load Pending Up To Valid Date...
2.2 Configuration pipeline¶
After clicking on the gear button, a configuration details page opens, composed of several sections:
Details: informative box about the chosen pipeline.
In this part, the Entities associated with this particular pipeline can be checked. This is the relationship stored in the Sidra.EntityPipeline table in the database of the Data Product. This means that the selected Entities will be the ones extracted from the DSU to the Staging tables.
The Edit button allows establishing this Entity-pipeline relationship directly, avoiding the manual change in the database for associating Entities as described here.
In this screen, DSU, Provider and Entity information is shown. The Consolidation Mode can be modified here, Entities can be included in the pipeline, and the Mandatory field can be marked. When an Entity is included here, the Staging Configuration section is updated with the new Entity data.
This section exposes the Sidra.StagingConfiguration table of the Data Product, which holds additional configuration for the Entities associated with the pipelines: schema name, table name, a query for Databricks, etc.
This part is automatically populated with default values when Entity-pipeline relationships are added in section 2 (
Status of Load Processes
This section shows the data load information (ADF pipelines status) from the Sidra Data Product database and ADF. When an error occurs in the data load, an info icon is shown with the ADF message on it.
Duration fields depend on ADF records so, over time, this information will no longer be shown.
2.3 Reload Entities¶
A button is included to reload Entities after an error or any other situation that requires it. This button replaces the manual change of status in the database for these Entities, which saves considerable time. Regarding the Start Date window:
- Start Date must be specified to reload the Entities for a specific date range (start date to current date).
- If it is not specified, the entire Entity will be reloaded.
- A warning will appear informing about the scale-up of resources.
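The Start Date rule above can be illustrated with a small helper. This is a sketch under assumptions: the function name `reload_range` is hypothetical, and rejecting a Start Date later than the current date is an assumption of this example, not documented behaviour.

```python
from datetime import date
from typing import Optional, Tuple


def reload_range(start_date: Optional[date], today: date) -> Optional[Tuple[date, date]]:
    """Return the (start, end) date range to reload, or None for a full reload."""
    if start_date is None:
        # No Start Date given: the entire Entity is reloaded.
        return None
    if start_date > today:
        # Assumption of this sketch: a future Start Date is treated as invalid.
        raise ValueError("Start Date cannot be later than the current date")
    # The range always runs from the Start Date to the current date.
    return (start_date, today)
```

For example, `reload_range(date(2024, 1, 10), date(2024, 1, 31))` yields the range from 10 to 31 January, whereas `reload_range(None, date(2024, 1, 31))` returns `None` to signal a full reload.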
2.4 Edit pipeline¶
The same parameters configured when creating a new pipeline can be changed in this section using the edit (pencil) button.
2.5 Delete pipeline¶
When a pipeline is deleted, it is marked as removed in the Data Product database (IsRemoved column set to 1) and the current screen in the UI no longer shows it. In the backend, the API triggers the DataFactoryManager webjob, which deploys the changes to ADF and removes the pipeline there.
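The database side of this behaviour is a soft delete: the row stays in place and is only flagged. The following sketch reproduces the idea with an in-memory SQLite database. The table name `Pipeline`, its columns other than `IsRemoved`, and the sample pipeline names are hypothetical, and the ADF deployment step is not modelled.

```python
import sqlite3

# Hypothetical, simplified table; only the IsRemoved column comes from the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Pipeline ("
    "Id INTEGER PRIMARY KEY, Name TEXT NOT NULL, "
    "IsRemoved INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO Pipeline (Name) VALUES (?)",
    [("LoadSales",), ("LoadCustomers",)],
)


def delete_pipeline(connection: sqlite3.Connection, pipeline_id: int) -> None:
    """Soft delete: flag the row instead of removing it."""
    connection.execute(
        "UPDATE Pipeline SET IsRemoved = 1 WHERE Id = ?", (pipeline_id,)
    )
    connection.commit()


def visible_pipelines(connection: sqlite3.Connection) -> list:
    """Return the names a UI listing would show: rows not flagged as removed."""
    rows = connection.execute(
        "SELECT Name FROM Pipeline WHERE IsRemoved = 0 ORDER BY Id"
    ).fetchall()
    return [name for (name,) in rows]


delete_pipeline(conn, 1)
```

After the call, the flagged row is still present in the table, but a listing filtered on `IsRemoved = 0` no longer returns it.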
This part is informative and shows the Assets of the Data Product sorted by date.
- When clicking on the Edit button or when creating a new Data Product, the Description field must be filled.
- The Description field has a maximum of 256 characters.
- The Title field has a maximum of 250 characters.
- When the whole title is removed in the Edit option, unexpected behaviour can occur and the original value may be restored.
- When editing a Data Product to add a new image, the image selector will not open.
- When the user name for a Data Product is changed, the image must be loaded again.
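The field constraints listed above (mandatory Description, 256-character Description, 250-character Title) could be checked before submitting the form with a helper such as the one below. The function name and the error messages are hypothetical; only the limits come from the list.

```python
from typing import List

TITLE_MAX_LENGTH = 250        # limit stated for the Title field
DESCRIPTION_MAX_LENGTH = 256  # limit stated for the Description field


def validate_data_product(title: str, description: str) -> List[str]:
    """Return a list of validation errors; an empty list means the input is valid."""
    errors = []
    if len(title) > TITLE_MAX_LENGTH:
        errors.append(f"Title exceeds {TITLE_MAX_LENGTH} characters")
    if not description.strip():
        # Description must be filled when editing or creating a Data Product.
        errors.append("Description is required")
    elif len(description) > DESCRIPTION_MAX_LENGTH:
        errors.append(f"Description exceeds {DESCRIPTION_MAX_LENGTH} characters")
    return errors
```

An empty list means both fields satisfy the documented limits.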