What is an ETL and why should PLM care?
Don't start a PLM project without knowing what an ETL is
Your PLM project should not install a new isolated island. If it does, you haven’t understood the whole digitalisation process and the digital thread concept, which apply not only to PLM but to your whole organisation. Therefore, you need to understand up front how your system will communicate with the rest of the company’s tools. You will also have to define how it interacts with the outside world. The ETL is part of an ecosystem of tools that helps you with this.
This is, I believe, the simplest TLA (three-letter acronym) I have seen so far. It says exactly what it does: Extract, Transform and Load data.
The main strength of an ETL in the Extract phase is that it lets you retrieve data from as many sources as possible. The sources can be diverse, from applications that expose an API to a simple text file stored in a folder.
Types of systems you may query:
The goal of an ETL is to offer as many connectors as possible. Talend’s open source model has let the community build a lot of integrations.
Transform is where it becomes much trickier. Transformation requires a lot of different capabilities, like mapping fields, converting flows into arrays of data or into objects, filtering data, joining tables, aggregating data, etc. When you have exhausted the tools your ETL provides, most ETLs let you add custom code so that you are not limited.
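To make these transformation capabilities concrete, here is a minimal sketch in plain Python rather than in an ETL tool. The part and supplier rows, the field names and the `FIELD_MAP` are all hypothetical and only illustrate mapping, filtering, joining and aggregating; a real ETL would expose these as graphical components.

```python
from collections import defaultdict

# Hypothetical rows, as they might come out of an Extract step.
parts = [
    {"PART_NO": "P-001", "DESC": " Bracket ", "STATUS": "Released", "SUPPLIER_ID": "S1", "COST": "12.50"},
    {"PART_NO": "P-002", "DESC": "Bolt M6", "STATUS": "Obsolete", "SUPPLIER_ID": "S2", "COST": "0.10"},
    {"PART_NO": "P-003", "DESC": "Housing", "STATUS": "Released", "SUPPLIER_ID": "S1", "COST": "40.00"},
]
suppliers = [
    {"SUPPLIER_ID": "S1", "NAME": "Acme"},
    {"SUPPLIER_ID": "S2", "NAME": "Globex"},
]

# Mapping: legacy field name -> target field name (assumed target schema).
FIELD_MAP = {
    "PART_NO": "item_number",
    "DESC": "name",
    "STATUS": "state",
    "SUPPLIER_ID": "supplier_id",
    "COST": "cost",
}


def map_fields(row):
    """Rename fields, trim text values and parse the cost as a number."""
    out = {FIELD_MAP[key]: value.strip() for key, value in row.items() if key in FIELD_MAP}
    out["cost"] = float(out["cost"])
    return out


def transform_parts(part_rows, supplier_rows):
    """Map, filter, join and aggregate the extracted rows."""
    supplier_names = {s["SUPPLIER_ID"]: s["NAME"] for s in supplier_rows}
    mapped = [map_fields(row) for row in part_rows]
    # Filter: keep only released parts.
    released = [row for row in mapped if row["state"] == "Released"]
    # Join: attach the supplier name to each part.
    for row in released:
        row["supplier_name"] = supplier_names.get(row["supplier_id"], "UNKNOWN")
    # Aggregate: total cost per supplier.
    totals = defaultdict(float)
    for row in released:
        totals[row["supplier_name"]] += row["cost"]
    return released, dict(totals)


rows, cost_per_supplier = transform_parts(parts, suppliers)
```

The `map_fields` function plays the role of the custom code mentioned above: the place where you handle whatever the built-in components do not cover.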
Finally, the Load process has the same technical goal as the Extract process: load the prepared data into as many target systems as possible.
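The three phases can be chained in a few lines. The sketch below is not how Talend or any other ETL works internally; it only shows the shape of a flow, assuming a semicolon-separated legacy export as the source and an in-memory SQLite table as the target. The CSV content, the `item` table and the function names are made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical export from a legacy system. In practice this would be
# a file on disk or the response of an API call.
LEGACY_CSV = """PART_NO;DESC;REV
P-001;Bracket;A
P-002;Bolt M6;B
"""


def extract_rows(text):
    """Extract: read semicolon-separated rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def to_item_records(rows):
    """Transform: reorder fields to match the (assumed) target table."""
    return [(row["PART_NO"], row["DESC"], row["REV"]) for row in rows]


def load_items(conn, records):
    """Load: insert the prepared records and return the table row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS item ("
        "item_number TEXT PRIMARY KEY, name TEXT, revision TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO item VALUES (?, ?, ?)", records)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]


conn = sqlite3.connect(":memory:")
loaded = load_items(conn, to_item_records(extract_rows(LEGACY_CSV)))
```

Swapping the source or the target only means replacing `extract_rows` or `load_items`, which is the role connectors play in an ETL tool.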
The #1 scenario for ETL is migration. I have managed a few migrations successfully with an ETL. Usually the graphical UI and the versioning of your ETL setup make it possible to explain how the migration flow works without getting too technical.
I have used ETLs several times to make sure legacy systems could still be integrated with the new solution. This is usually where we work the most with extracting and inserting data in databases, or even with files. ETLs often have a cool feature that watches a folder for changes, so whenever a new file appears it can trigger an ETL flow.
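The folder-watching trigger can be approximated with simple polling. This is only a sketch of the idea, not the mechanism any particular ETL uses; the `poll_new_files` and `run_flow` functions and the `bom_export.csv` file name are hypothetical.

```python
import tempfile
from pathlib import Path


def poll_new_files(folder, seen):
    """Return the names of files that appeared since the last poll."""
    current = {p.name for p in Path(folder).iterdir() if p.is_file()}
    new = sorted(current - seen)
    seen |= current  # remember them so they do not trigger twice
    return new


def run_flow(path):
    """Placeholder for the ETL flow that a new file would trigger."""
    return f"processed {path.name}"


with tempfile.TemporaryDirectory() as inbox:
    seen = set()
    first = poll_new_files(inbox, seen)  # nothing in the folder yet
    # A legacy system drops an export into the watched folder.
    (Path(inbox) / "bom_export.csv").write_text("PART_NO;QTY\nP-001;4\n")
    second = poll_new_files(inbox, seen)  # the new file is detected
    results = [run_flow(Path(inbox) / name) for name in second]
    third = poll_new_files(inbox, seen)  # already seen, no re-trigger
```

In a long-running setup, `poll_new_files` would be called on a timer, and a production version would also need to handle files that are still being written when the poll happens.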
The long-term use case for an ETL is the connection with a larger enterprise system like an ESB (Enterprise Service Bus), which I will describe in a future blog post. The goal is to have a central system that manages the different data sources and connects triggers and data on a single bus. The connection between this bus and any other system would be handled by an ETL, which makes it possible to standardize the data on the bus as much as possible.
The one risk with an ETL is that you start creating too many one-to-one connections, which become complicated to maintain at some point. Depending on the context, this might suit you very well because you need to keep these integrations independent in their evolution. But the bigger the system becomes, the more you will need to look for a better-organized setup using an ESB.
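A quick count shows why one-to-one connections get out of hand. The sketch below assumes the worst case, where every system exchanges data with every other one and each pair is counted once; with a bus, each system only needs one connection to the bus. The function names are made up for the illustration.

```python
def point_to_point_links(n):
    """Worst case: one integration per pair of systems."""
    return n * (n - 1) // 2


def bus_links(n):
    """With a bus, each system needs a single connection to the bus."""
    return n


# Number of integrations to maintain for 3, 5 and 10 systems,
# as (point-to-point, bus) pairs.
comparison = {n: (point_to_point_links(n), bus_links(n)) for n in (3, 5, 10)}
# comparison == {3: (3, 3), 5: (10, 5), 10: (45, 10)}
```

With three systems both approaches cost the same, which is why one-to-one integrations are perfectly reasonable at small scale.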
I have found everything I wanted using Talend. I haven’t tried others, except CloverETL a few years back.
Here is a great video introducing ETL