MySQL data

Types and transformations

Some native MySQL data types are not supported when Azure Data Factory converts data to Parquet (the storage format used in the Sidra DSU).

For these unsupported data types, Sidra connector plugins include type translation mechanisms, which are applied during both the metadata extraction and the data extraction phases.

If a certain data type is not supported, that data type is automatically changed to the closest supported type as defined in a Type Translations table.
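The lookup described above can be sketched as a simple dictionary-based translation. This is only an illustration: the mapping entries and the names `TYPE_TRANSLATIONS`, `translate_type` and `translate_columns` are hypothetical and do not reflect the actual contents of the Sidra Type Translations table.

```python
# Hypothetical translation rules, for illustration only.
# The real rules live in Sidra's Type Translations table.
TYPE_TRANSLATIONS = {
    "year": "int",
    "geometry": "varchar",
    "bit": "boolean",
}


def translate_type(source_type):
    """Return the translated type, or the normalized original type if no rule applies."""
    key = source_type.strip().lower()
    return TYPE_TRANSLATIONS.get(key, key)


def translate_columns(columns):
    """Apply translate_type to a {column_name: source_type} mapping."""
    return {name: translate_type(sql_type) for name, sql_type in columns.items()}
```

Because the same function would be called at both metadata extraction and data extraction time, the column types registered for an Entity and the types of the extracted data stay consistent.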

You can find more information about the general process for type translations for plugins on this page.

Data Extraction pipeline

Once all the information in the Configuration steps section has been provided, Sidra Core creates and deploys the actual data extraction pipeline. This pipeline is where the movement and transformation of data happen:

  • On the one hand, the ADF copy data activities are executed; these move the data from the source (MySQL Database) to the destination (Azure Data Lake Gen2).
  • On the other hand, the transfer query scripts are executed for each Entity to perform data optimization, data validation, and similar tasks, and to load the data into the data lake in its optimized format.

The execution time of this pipeline varies with the volume of data and the environment.

Initial full data synchronization

Once Sidra is connected to the source database, the Sidra MySQL Database connector plugin first copies all rows from every table, in every schema, that has not been explicitly excluded.

For each table (Entity), rows are copied by performing a SELECT statement. Copy Activity in Azure Data Factory parallelizes reads and writes according to the source and destination.
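As a rough sketch of the full-load step, the following helper builds one SELECT statement per table and skips the excluded ones. The function name `build_full_load_queries`, the `(schema, table)` tuples and the use of `SELECT *` are assumptions made for illustration; the exact query generated by the connector is not shown here.

```python
def build_full_load_queries(tables, excluded):
    """Build one full-load SELECT per (schema, table) pair that is not excluded.

    tables:   iterable of (schema, table) tuples found in the source database
    excluded: set of (schema, table) tuples explicitly marked as excluded
    """
    queries = {}
    for schema, table in tables:
        if (schema, table) in excluded:
            continue  # explicitly excluded tables are never copied
        # Backticks are MySQL's identifier quoting.
        queries[(schema, table)] = f"SELECT * FROM `{schema}`.`{table}`"
    return queries
```

For example, calling it with `[("sales", "orders"), ("sales", "audit_log")]` and an exclusion set of `{("sales", "audit_log")}` yields a single query for `sales.orders`.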

Loading incremental data mechanisms

Once the initial synchronization is complete, Sidra performs incremental synchronizations of the new and modified data in the source system.

The Sidra connector plugin for MySQL Database uses the following mechanisms for incremental updates: the Non-Change Tracking mechanism enabled by Sidra and the Non-Change Tracking custom mechanism enabled by Sidra.
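This page does not detail how these mechanisms work internally. One common way to load increments without database change tracking is a watermark column (for example, a last-modified timestamp or an increasing key), and the sketch below assumes that approach. The function `fetch_increment`, the table and the column names are hypothetical, and Python's built-in `sqlite3` module stands in for a MySQL connection so that the example is self-contained.

```python
import sqlite3


def fetch_increment(conn, table, watermark_column, last_value):
    """Return rows whose watermark exceeds last_value, plus the new high-water mark."""
    # Identifiers cannot be bound as query parameters, so they are interpolated;
    # they must come from trusted metadata, never from user input.
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_column} > ? "
        f"ORDER BY {watermark_column}"
    )
    cursor = conn.execute(sql, (last_value,))
    column_names = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    position = column_names.index(watermark_column)
    # If nothing changed, keep the previous watermark for the next run.
    new_value = rows[-1][position] if rows else last_value
    return rows, new_value
```

Each run stores the returned watermark and passes it to the next run, so only rows added or modified since the previous synchronization are read. Note that a watermark of this kind cannot detect rows deleted in the source.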

Last update: 2023-12-13