Talend Metadata Manager – A brief introduction

In this series on our news blog, we give you a condensed introduction to Talend Metadata Manager (TMM). One of our data management experts recently joined a deep-dive session on TMM in Paris, together with other experts from all over Europe. But let’s start from the beginning.

What is Metadata Management?

Metadata management is an end-to-end process and governance framework for creating, controlling, enhancing, defining, and managing a metadata model. It involves establishing policies and processes that ensure information can be analyzed and maintained to best effect across the organization. General benefits of introducing metadata management include:

  • Consistent metadata definitions, so that terminology variations don’t cause data retrieval problems.
  • Less redundant effort and greater consistency across multiple instances of data, because data can be reused appropriately.
  • Data lineage that is available across data sources and operational systems and that can, in combination with change logs, be used as an audit trail.
  • Comprehensive impact analysis of changes to the data integration environment.
  • Maintenance of information across the organization that is not dependent on a particular expert’s knowledge.
  • Greater efficiency, leading to faster product and project delivery.

Talend Metadata Manager (TMM)

Talend Metadata Manager connects data from platforms, databases, and analytics tools to generate a holistic view of the information supply chain in a language that everyone can understand.

Talend Metadata Manager provides a comprehensive set of capabilities for all facets of metadata management. At the heart of Talend Metadata Manager is a repository that contains repository objects, such as models and mappings, organized into folders. Models can be harvested from Talend Data Integration models, data modeling tools, data warehouses, external metadata repositories for relational databases (RDBMS), and data integration and business intelligence tools. A particular type of repository object, called a Configuration, can connect models and mappings together (“metadata stitching”) to represent an enterprise architecture, including full support for data flow lineage and impact analysis, as well as semantic lineage definitions.

Talend Metadata Manager includes views and functionality for different user types and company-wide collaboration:

Business users are more likely to use the data lineage feature:

  • “Given an item on a report, what data entry system fields impact these results?”
  • “Why are the numbers on this report the way they are?”
  • “How do I change the system data to correct the results of this report?”

Data stewards and data analysts are more likely to use the impact analysis feature:

  • “If I require a change to this field, what reports will be impacted?”
  • “How is this identity information merged with the information system?”

IT architects and developers need both impact analysis and data lineage features:

  • “How many systems are required to determine the dimensions for this portion of the OLAP model?”
  • “A business report use case asks for the lineage of particular values in a report, so where does the data come from and how is it manipulated?”
  • “If I must change these elements (data type, code sets, etc.) in my operational data store, what is the downstream impact?”
  • “This new ETL process is populating my staging warehouse in new ways; how does this impact my reporting system?”
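The questions above come down to two traversals of the same data-flow graph: impact analysis follows the flows downstream from an element, while data lineage follows them upstream. The following Python sketch only illustrates that idea. It is not TMM's API, and the element names as well as the helpers impact and lineage are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical data-flow edges (source -> target); all names are illustrative.
EDGES = [
    ("crm.customer.email", "staging.customer.email"),
    ("staging.customer.email", "dwh.dim_customer.email"),
    ("dwh.dim_customer.email", "report.customer_overview.email"),
    ("erp.orders.amount", "dwh.fact_sales.amount"),
    ("dwh.fact_sales.amount", "report.revenue.total"),
]


def build_graph(edges, reverse=False):
    """Build an adjacency map; reverse=True flips every edge."""
    graph = defaultdict(set)
    for src, dst in edges:
        if reverse:
            src, dst = dst, src
        graph[src].add(dst)
    return graph


def reachable(graph, start):
    """Breadth-first search: every node reachable from start (start excluded)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impact(edges, element):
    """Downstream elements affected by a change to element."""
    return reachable(build_graph(edges), element)


def lineage(edges, element):
    """Upstream elements that feed into element."""
    return reachable(build_graph(edges, reverse=True), element)


if __name__ == "__main__":
    print(sorted(impact(EDGES, "crm.customer.email")))
    print(sorted(lineage(EDGES, "report.revenue.total")))
```

A data steward asking "what reports will be impacted?" corresponds to impact, and a business user asking "where does this number come from?" corresponds to lineage.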

Harvesting & Stitching Metadata

In a TMM context, metadata harvesting means collecting metadata from several data sources. Metadata can be harvested from almost any platform in a company’s information system by using so-called bridges, which are platform-specific connectors that use dedicated drivers to connect to a data source or system. Examples are databases, existing data modeling tools, ETL tools, Big Data platforms, and business intelligence software (for example, Tableau, Cognos, Qlik).

 

Figure 1 – Data Lineage & Traceability

Harvested metadata is combined to build an up-to-date and realistic overview of the company’s information system model. Automated harvesting mechanisms also track changes in the metadata. In the TMM repository, several metadata models are created, each of which represents a single, isolated building block of the conceptual company information model.

Figure 2 – Example Architecture in TMM after harvesting & stitching
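To make the step from isolated models to a connected architecture more concrete, here is a minimal Python sketch of stitching. It assumes that two harvested models are linked whenever one writes an element that the other reads, matched by name. This matching rule, the Model class, and the stitch function are simplifications invented for illustration; they do not describe how TMM stitches models internally.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """A harvested model, reduced to the elements it reads and writes."""
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def stitch(models):
    """Link every model that writes an element to every model that reads it."""
    links = set()
    for producer in models:
        for consumer in models:
            if producer is consumer:
                continue
            for element in producer.writes & consumer.reads:
                links.add((producer.name, consumer.name, element))
    return sorted(links)


# Three hypothetical models, each harvested separately through its own bridge.
MODELS = [
    Model("crm_db", writes={"crm.customer"}),
    Model("load_customers", reads={"crm.customer"}, writes={"dwh.dim_customer"}),
    Model("sales_dashboard", reads={"dwh.dim_customer"}),
]

if __name__ == "__main__":
    for producer, consumer, element in stitch(MODELS):
        print(f"{producer} -> {consumer} via {element}")
```

The resulting links are what turn the separate models into one graph on which lineage and impact analysis can run.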

Business Glossary

The business glossary standardizes and fosters a common understanding of the terminology used across an organization, following the ISO 11179 standard – a result of two principles of semantic theory, combined with basic principles of data modelling. Using the glossary, the metadata model can be enriched to a common enterprise standard with business terms that are usually not available in technical source descriptions.
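As a rough illustration of this enrichment, a glossary can be thought of as a set of business terms, each with a definition and links to the technical elements it describes. The Python sketch below is a hypothetical data structure, not TMM's glossary model; the terms, definitions, and element names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A business term linked to the technical elements it describes."""
    name: str
    definition: str
    elements: set = field(default_factory=set)


def terms_for(glossary, element):
    """Return the names of all business terms linked to a technical element."""
    return sorted(term.name for term in glossary if element in term.elements)


# Hypothetical glossary entries; the definitions are invented for illustration.
GLOSSARY = [
    Term("Customer", "A party that has placed at least one order.",
         {"crm.customer", "dwh.dim_customer"}),
    Term("Active Customer", "A customer with an order in the last 12 months.",
         {"dwh.dim_customer"}),
    Term("Net Revenue", "Invoiced amount minus discounts and returns.",
         {"dwh.fact_sales.amount"}),
]

if __name__ == "__main__":
    print(terms_for(GLOSSARY, "dwh.dim_customer"))
```

Looking up a column in this way gives a business user the agreed definition behind a technical name.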

Published metadata architecture diagrams can be browsed using the metadata explorer, which is a simplified user interface for business users. All components specific to the configuration can be viewed in the explorer.

Talend Metadata Manager integration

The Metadata Manager seamlessly integrates with Talend data integration and data quality components to enable consolidated operational data governance processes based on shared repositories and dictionaries. Impact analysis, data lineage and linking technical metadata with business definitions are key features to support data management, project scoping and collaboration.

If you would like to learn more about the capabilities of Talend Metadata Manager, do not hesitate to contact us for more details and a live demo. We are looking forward to demonstrating TMM in action.