Data is a valuable asset for every business, yet it is not always used to its full potential. Worse, misuse of data, usually unintentional, can lead to inefficiency and poor decision making, which in turn result in missed opportunities and lower profits.
cimt supports organizations with their digital transformations. One aspect of this process is applying data governance, which DAMA-DMBOK defines as ‘The exercise of authority, control and shared decision making (planning, monitoring and enforcement) over the management of data assets.’
Data governance defines the people, processes, framework and organization necessary to ensure that an organization’s information assets (data and metadata) are securely, formally, properly, proactively and efficiently managed throughout the enterprise to secure its trust, privacy, security, meaning and accuracy (EW Solutions).

Inefficiency and poor decision making often arise when companies struggle with their data, for reasons that include:
• Data chaos
• Unreliable data
• Different industry or territory regulations

A tool that helps solve and prevent these issues through the practices and processes of data governance is Talend Data Catalog, part of Talend Data Fabric. Talend Data Catalog creates a single source of trusted data for everyone in the company, addressing each of the data challenges listed above.


  • Data chaos

Company growth brings more data generation and consumption, whether from internal tools or from external sources (marketing, website, etc.). This inevitably adds complexity to the data model: workflows adapt, new structures emerge, and privacy guidelines change (also depending on geographic location), all of which make scalability harder to achieve. The result is known as data chaos: a company loses track of data that piles up every day in an unstructured format, which in turn leads to lost revenue and missed business opportunities.

According to Forbes, data volumes are exploding: more data has been created in the past two years than in the entire previous history of the human race (Forbes, 2015), yet less than 0.5% of all data is ever analyzed and used. The potential of structured, organized data is enormous, and that is where Data Catalog comes in, helping to resolve existing data chaos and to prevent it from recurring.

Data Catalog gives you a single overall picture of your data model through its lineage feature, so data stays linked and structured even as you scale up. The overview also shows which data is unstructured or not linked to the rest of the company structure, keeping managers up to date on issues within the company model.
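
The lineage idea described above can be pictured as a directed graph in which each dataset points to the datasets derived from it. The following is a minimal sketch in Python under that assumption; it does not use Talend Data Catalog or its API. The dataset names, the `LINEAGE` and `ALL_DATASETS` structures, and the `downstream` and `unlinked` helpers are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each dataset maps to the datasets derived from it.
LINEAGE = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["warehouse.dim_customer"],
    "web.clickstream": ["warehouse.fact_visits"],
    "warehouse.dim_customer": ["reports.churn"],
    "warehouse.fact_visits": ["reports.churn"],
}

# Hypothetical inventory of every known dataset, linked or not.
ALL_DATASETS = {
    "crm.customers", "staging.customers", "web.clickstream",
    "warehouse.dim_customer", "warehouse.fact_visits", "reports.churn",
    "legacy.export_2014",
}


def downstream(source):
    """Return every dataset reachable from `source` by following lineage edges."""
    seen = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for target in LINEAGE.get(current, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def unlinked():
    """Return datasets that appear in no lineage edge at all."""
    linked = set(LINEAGE)
    for targets in LINEAGE.values():
        linked.update(targets)
    return ALL_DATASETS - linked


if __name__ == "__main__":
    # Which datasets could carry customer data that originated in the CRM?
    print(sorted(downstream("crm.customers")))
    # Which datasets are not connected to anything else?
    print(sorted(unlinked()))
```

In this toy model, tracing downstream from a source that holds personal data shows where that data may end up, and the unlinked set corresponds to data that sits outside the company structure.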

As an example, the figure below shows the semantic lineage of data including Personally Identifiable Information (PII).


  • Unreliable data

When multiple developers work on complex data models at the same time, the risk grows of producing unreliable results or of losing data that was previously present in those models. If a mistake is made and data becomes unreliable, you may want to turn back the clock and compare earlier versions of the data to fix the mistake while minimizing disruption to the rest of the team.

Data Catalog uses its repository to maintain a version history for each harvested or uploaded data model, so that you can keep track of every modification. This makes it easy to access previous data models and compare them with the faulty one, which reduces the time spent identifying the cause of an issue. The version control system (VCS) also makes it possible to revert changes and restore older versions of data models.
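
As a rough illustration of what comparing and restoring versions of a data model involves, here is a minimal Python sketch. It is not how Talend Data Catalog stores or compares versions; the `HISTORY` list, the column names, and the `compare` and `revert` helpers are hypothetical, and a data model is simplified to a mapping of column names to types.

```python
# Hypothetical version history: each entry is one saved version of a data model,
# represented as a mapping of column name to column type.
HISTORY = [
    {"id": "int", "email": "string", "signup_date": "date"},
    {"id": "int", "email": "string", "signup_date": "date", "country": "string"},
    {"id": "int", "signup_date": "string", "country": "string"},
]


def compare(old, new):
    """Describe what changed between two versions of a data model."""
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "retyped": sorted(col for col in shared if old[col] != new[col]),
    }


def revert(history, version):
    """Restore an older version by appending a copy of it as the newest version."""
    history.append(dict(history[version]))
    return history[-1]


if __name__ == "__main__":
    # Compare the last known good version with the faulty one.
    print(compare(HISTORY[1], HISTORY[2]))
    # Restore the last known good version without discarding the history.
    print(revert(HISTORY, 1))
```

Appending the restored version rather than deleting newer ones is a design choice in this sketch: the faulty version stays in the history, so it can still be inspected later.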


  • Different industry or territory regulations

Data Catalog also offers a single point of control for all data. Having everyone rely on one source of truth for how data should comply with company processes and legal requirements (privacy, security, etc.) reduces the risk of inaccuracy and of accessibility issues. This is particularly relevant for companies that operate in multiple countries or industries and must therefore comply with different regulations.


In conclusion, Talend Data Catalog helps organizations achieve data governance by ensuring that data is usable, accessible, protected and trusted. Effective data governance leads to a better overview of and insight into data, which in turn leads to better decision making.