Master Data Management: a 2-for-1 MDM book review
Published on LinkedIn, Friday, 30th April 2021 by David Finlay
First of all, a little about me, and what drove me to write a 5203-word article on MDM …
I see Master Data Management as a critical business enabler, essential to operating and sustaining business processes within and across ERP, CRM, and BI system landscapes. I owe my passion for all things Master Data to experiences earlier in my career in various SCM roles, where I learned how SAP systems integrate business functions and financial flows. I also witnessed the woes that go with poor MDM practices and have literally been that guy pacing a warehouse in Germany at 9 pm on a Friday night, calling the US HQ to fix missing master data so that we could receive the latest urgent product launch and get it to a hospital by Saturday morning. If you can think of a master data disaster, I have probably seen it at least once in my past roles in Demand Planning, warehouse consolidation projects, distribution center operations, and SAP implementations for manufacturing sites. Later I got to do something about it in my role as Material Master (Product) Data Governor. I like to think I left the place better than I found it, and I worked with some great people along the way. More recently, I was deeply involved in Data Quality cleansing and process redesign with OTC and Service teams preparing for a global SAP rollout. Currently, I’m project managing GTIN reassignment activities in a Product Serialization CoE for a pharma company in spinoff mode. If you ever get a chance to work in the departments most impacted by master data, you should take it. That kind of experience can really stand to you if you want to make a career in MDM, whether in the business, in IT, or increasingly somewhere in between.
Aside from experience, unless you have all the answers - you should also read – a lot. You don’t have time to make every mistake yourself – so learn from the best and take heed of the worst others have experienced. I was going through my business book library (shelf) in early March and noticed that I had several tomes on Data Governance, Data Stewardship, Data Quality, ERP Migration Projects, as well as the DAMA DMBoK and even a volume or two on Data Storytelling. But, aside from some course materials and personal notes, I owned not a single book dedicated to MDM! Given Master Data has become an important part of my niche – it was about time I remedied this. So, eight weeks later, here I am with this article which aims to review and draw out some nuggets of wisdom from two of the most ‘popular’ books about MDM.
If you’ve been wondering where to dive into MDM – I hope you get something out of my effort. I realize this isn’t ‘light bedtime reading,’ but it’s a 12-page summary vs. the 957 pages I needed to read first to get this in front of you 😉
The two books covered in this article are:
- Master Data Management, by David Loshin. Published by Morgan Kaufmann / Elsevier (2009).
- Enterprise Master Data Management: An SOA Approach to Managing Core Information, by A. Dreibelbis, E. Hechler, I. Milman, M. Oberhofer, P. Van Run. Published by IBM Press (2008).
Although Data Stewardship and Data Governance get some coverage, they are not the primary emphasis of these books. More comprehensive and recent books are available serving those purposes, so I’ve embedded links for them. I’m also keen to hear from you in the comments section. In addition, please recommend any other MDM books you have found valuable or post links to relevant content. Lastly, please support the authors and consider buying their books.
For full transparency:
I am not affiliated with the authors, nor have I received a fee or even a free book copy for compiling this review. I will refer to these books as ‘IBM’ and ‘Loshin’ to avoid retyping the full book titles throughout this text. Where very similar topics are addressed in both books, I endeavor not to repeat them in the takeaway sections. I do this merely for brevity. It in no way suggests the other book does not cover a particular topic in part or at all. In some cases where I mention ‘IBM on…’ or ‘Loshin on …’, I am directly quoting or sometimes paraphrasing the gist of their intent to be a little more concise given the constraints of the article format.
Where possible, I also like to add a sentence prefaced with ‘My perspective’ to expand on how the topic hit home with me. If a particular paragraph resonates with you, this may well encourage you to buy the book(s) and read all about it. Once again, please mention in the comments section which topics are relevant to your experiences and challenges.
Introduction
One of the first things I am sure you will notice is that both books were published more than twelve years ago. So you will surely ask whether they remain relevant. Surprisingly – or perhaps not surprisingly at all to those long-suffering at the butt-end of bad data – the answer is that the foundational concepts of MDM haven’t changed all that dramatically. Technology has improved, and evolving regulations have brought a slew of additional requirements (think GDPR, CCPA, etc.) – but most organizations still struggle with getting the basics of MDM right. Scott Taylor put it well in a recent DM Radio interview: ‘MDM is macro-trend agnostic.’
I like to think of Master Data as ‘that thing that has always been needed’ ever since businesses started using computers to execute business transactions and maintain bookkeeping records. Of course, it’s not always the most exciting subject to all people, but it’s not likely to go out of fashion any time soon.
Many companies have recently found an urgent need to improve MDM to unlock value from enterprise data analytics and business process mining initiatives. But they have also learned (often the hard way through expensive and/or failed initiatives) that to make any real headway in the ‘sexier’ topics of Robotic Process Automation (RPA), Machine Learning (ML), and Predictive Analytics – you had better have the ‘little data’ (master data) in decent shape. And you’d better have a roadmap to get there and requisite resources focused on it. Master Data will not master itself!
Top takeaways from the Loshin book
Loshin on … Doing the right thing because it needs doing, but don’t expect it to be easy
Most individuals are driven by incentives that are awarded based on heroism instead of ensuring predictability. Managing siloed data sets is simpler than integrating data across the enterprise. It is always easier to treat the symptoms. But organizations that want to be competitive in the information age - genuinely need to understand the value of high-quality information. Anticipate that a rapid transition from a loosely coupled confederation of vertical silos to a more tightly coupled collaborative framework will ruffle some feathers.
My perspective: you know you are doing MDM right when heroics are rarely required, and you are spending more time on prevention rather than on cure. The only problem with doing it well is that you will then be asked to do it cheaper! So to avoid the race to the bottom – regularly spend time documenting value and stay relevant by evolving with business requirements. To help with this, I recommend training in Business Relationship Management from the BRM Institute.
Loshin on … The human connections that help run our businesses
It is not just systems that work better together (because of improved master data). The people managing and using those systems also forge better working relationships, leading to more effective business management.
My perspective: while I like data, I love studying human behavior. I’m intrigued about why we do what we do and the actors' motivations involved in business activities. MDM is a great subject to get into if you enjoy working with lots of different business functions and are interested in continuous improvement. If it’s boring – you are doing it wrong.
Loshin on … Tying MDM solutions back to business challenges
MDM is not the end objective; rather, it is the means by which other strategic and operational objectives are accomplished, for example, improved decision making. Inconsistency across business intelligence activities often occurs because of duplication in the underlying data (both transactional and master). Questions regarding the consistency of reports can stymie management decision-making, leading to missed business opportunities and further silo-building. The information consistency provided by MDM across applications reduces data variability, which in turn minimizes mistrust of data and allows for clearer, faster business decisions.
Loshin on … How to support business needs through focused MDM initiatives
- Assess the use of commonly used information objects, collections of valid data values, and business rules
- Identify core information objects that could benefit from centralization
- Instantiate a standardized model for integrating and managing key information objects
- Manage collected and discovered metadata as a browsable resource
- Collect data from candidate data sources, evaluate how different data instances refer to the same real-world entities, and create a unique, consolidated view of each one
- Provide methods for transparent access to the unified view of real-world data
- Institute appropriate data stewardship and management policies and procedures at corporate and LOB levels to ensure a high-quality master data asset
Loshin on … The need to be thorough with stakeholder analysis
Understanding stakeholder requirements is a critical step involving collecting LOB data requirements and collating and then synthesizing those requirements into an enterprise view. It involves conducting interviews with key stakeholders, including executive sponsor(s), primary information consumers, and representatives from impacted groups.
My perspective: While, in theory, this advice could be used during any consulting engagement for almost any business topic, I have found that it is imperative to consider the supply and demand aspect of business data. Too often, those creating master data don’t have an appreciation of downstream processes where it will be used. Too often, the people feeling the pain of poor data quality struggle to locate the resources they should collaborate with on continuous improvement initiatives.
Loshin on … Developing an Implementation Road Map tailored to the desired state of maturity
On the one hand, one might expect all organizations to strive for execution at the strategic performance level. On the other hand, however, achieving the capabilities at this level requires a significant investment in time and resources—an investment for which interim value is expected. Therefore, it is more reasonable to chart a roadmap through the different maturity levels, detailing the business value expected as each level is attained.
My perspective: For additional reading on this topic, I recommend the maturity model course by George Firican or take the Certified Enterprise Data Management Associate (EDMA) course and related exam from CMMI. I have found that not everyone agrees on the value of maturity models. Still, I will say that in the proper context and adequately explained, they can be an excellent visual aid to help with as-is assessment and progressive future-state road mapping.
Loshin on … The need to tread carefully on the topic of Data ‘Ownership’ and especially centralization
One major drawback to centralizing ownership is politics. This is because the reassignment of ownership, by definition, removes responsibilities from individuals, some of whom will feel threatened by the transition.
My perspective: This is a double-edged sword. When you decentralize data ownership, you also get a different set of problems, like dilution of responsibilities and finding it harder to get disparate groups to care holistically. There is no panacea among these choices, since they all require culture-dependent solutions.
Loshin on … Tips to identify CDEs (Critical Data Elements)
CDEs are those that are determined to be vital to the successful operation of the organization. Some examples are:
- required for operational decision processing
- contributing to key performance indicators within an organizational performance scorecard (or KPIs etc.)
- supporting the organization’s regulatory compliance initiatives
- contributing to the presentation of values published in external reports (including financial statements)
- containing personal information protected under a defined privacy or confidentiality policy
- containing critical information about an employee
- containing critical information about a supplier
- containing detailed information about a product
- supporting part of a published business policy
My perspective: MDM is a vast topic, and there are 100s of fields in any given Master Data domain. Not all are equally important. One tried and tested method is to prioritize based on the ranking of CDEs for the business under review.
Loshin on … Don’t be surprised when you ultimately get the behavior that you incentivize
One of the most significant historical problems with data governance is the absence of follow-through. Although some organizations may have well-defined governance policies, they may not have established the underlying organizational structure to make it actionable. This requires two things: (1) the definition of the management structure to oversee the execution of the governance framework and (2) a compensation model that rewards that execution.
My perspective: While Loshin mentions this primarily in the sense of governance, I think it equally applies to MDM when thinking about the ‘who does what.’ From hard-won experience, I can only forewarn you that ‘what gets measured gets done.’ Appealing to the better nature of individuals will only get you so far, but never forget how busy people are and that there are typically bonuses at stake for getting the ‘day job’ done vs. ‘helping out’ by being good data citizens.
Loshin on … Data controls and preventing adverse downstream effects
Contrary to the intuitive data quality ideas around defect prevention, the desire is that the control process discovers many issues. The goal is assurance that if any issues cause problems downstream, they can be captured very early upstream. Loshin says that although we can implement automated processes for validating that values conform to format specifications, belong to defined data domains, or are consistent across columns within a single record, there is no way to automatically determine if a value is accurate. As a result, there are always going to be data issues that require attention and remediation.
My perspective: Arguably, of all the MDM topics, this may be the one that has undergone the most dramatic change in the twelve years since this book was published. There are by now some very real AI and Machine Learning use cases that can augment the human element though rarely remove it altogether. I suspect there are another five years to go before the promise catches up with the hype. We must remember that someone has to teach the robots, and if we try to automate a flawed process, we will just have flawed processes happening faster.
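A minimal sketch of Loshin’s point – format, domain, and cross-field checks can be automated, but accuracy cannot – might look like this. The field names, rules, and sample record are my own illustrative assumptions, not from the book:

```python
import re

# Hypothetical master data validation sketch: automated checks can verify
# format, domain membership, and cross-field consistency - but not accuracy.
VALID_UOMS = {"EA", "KG", "L", "M"}  # assumed reference data: allowed units of measure

def validate_material(record: dict) -> list[str]:
    """Return a list of rule violations for one material master record."""
    issues = []
    # 1. Format conformance: material number must be 8 digits (assumed rule)
    if not re.fullmatch(r"\d{8}", record.get("material_number", "")):
        issues.append("material_number: does not match 8-digit format")
    # 2. Domain membership: base unit must come from the reference list
    if record.get("base_uom") not in VALID_UOMS:
        issues.append("base_uom: value not in allowed unit-of-measure domain")
    # 3. Cross-field consistency: net weight cannot exceed gross weight
    if record.get("net_weight", 0) > record.get("gross_weight", 0):
        issues.append("net/gross weight: net exceeds gross")
    # Note: none of these checks can tell us whether the values are *accurate* -
    # that still needs a human, or a downstream process failing loudly.
    return issues

record = {"material_number": "1234567", "base_uom": "XX",
          "net_weight": 12.5, "gross_weight": 10.0}
print(validate_material(record))  # all three rules fire for this record
```

The point of the sketch is the comment at the bottom: a record can pass every automated rule and still be wrong, which is exactly why controls and remediation workflows never fully disappear.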
Loshin on … Attributes for overseeing Data Quality in the form of SLAs
- Business impacts associated with potential flaws in the data elements
- Data quality dimensions associated with each data element
- Assertions regarding the expectations for quality for each data element for each identified dimension
- Methods for measuring conformance to those expectations (automated or manual)
- The acceptability threshold for each measurement
- The individual to be notified in case the acceptability threshold is not met
- An explanation of how often monitoring is taking place
- A clarification of how results and issues will be reported
- A description of to whom and how often issues are reported
- The times for expected resolution or remediation of the issue
- A description of the escalation strategy that will be enforced when the resolution times are not met
- A process for logging issues, tracking progress in resolution and measuring performance
My perspective: Loshin is particularly strong on DQ. Although I haven’t (yet) read it – it is not surprising that he wrote a follow-up book in 2010 called The Practitioner's Guide to Data Quality Improvement.
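To make the attribute list above concrete, here is a minimal sketch (my own illustration, not Loshin’s) of how a single data quality SLA entry – a completeness check with an acceptability threshold and a named contact – might be encoded and evaluated. All field names, thresholds, and contacts are hypothetical:

```python
# Hypothetical encoding of one data quality SLA entry from the list above.
sla = {
    "data_element": "customer_email",
    "dimension": "completeness",              # DQ dimension being measured
    "threshold": 0.98,                        # acceptability threshold (98% populated)
    "notify": "customer.data.steward@example.com",  # who hears about a breach
    "resolution_days": 5,                     # expected time to remediate
}

def evaluate_sla(records: list[dict], sla: dict) -> dict:
    """Measure completeness of the SLA's data element and compare to its threshold."""
    populated = sum(1 for r in records if r.get(sla["data_element"]))
    score = populated / len(records) if records else 0.0
    breached = score < sla["threshold"]
    return {
        "score": round(score, 3),
        "breached": breached,
        # On breach, the SLA itself tells us who to notify and how long they have
        "notify": sla["notify"] if breached else None,
    }

customers = [{"customer_email": "a@x.com"}, {"customer_email": ""},
             {"customer_email": "c@x.com"}]
print(evaluate_sla(customers, sla))  # 2 of 3 populated -> below threshold, breached
```

Encoding the SLA as data rather than burying it in code is what makes the monitoring, notification, and escalation attributes in Loshin’s list auditable.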
Loshin on … Feedback loops between MDM systems and analytical applications
Per Loshin, analytical applications are more likely to use rather than create master data. Still, these applications may also contribute information to the master repository that can be derived from analytical models. For example, customer data may be managed within a master model, but a customer profiling application may analyze customer transaction data to classify customers into threat profile categories. These profile categories, in turn, can be captured as master data objects for use by other analytical applications (fraud protection, bank secrecy act compliance, etc.) in an embedded manner. In other words, integrating the results of analytics within the master data environment supplements analytical applications with real-time characteristics.
My perspective: this was quite the eye-opener for me as I had not thought about it this way before. In hindsight (isn’t hindsight great!), this kind of mechanism would have helped me conceptualize a better product backorder allocation system using a level of automation regarding customer ABC classification and subsequent prioritization.
Loshin on … why doctors know uniqueness
The concept of uniqueness itself is not necessarily cast in stone. The number of attributes that essentially qualify unique identification may be greater or fewer depending on business requirements. To a catalog company, two records may represent the same person if they share a name and address, but a surgeon may require a lot more data to ensure that the person on the operating table is the one needing surgery.
My perspective: It never ceases to amaze how creative people can get when coming up with reasons to create separate product numbers for things that are actually the same! I recommend spending time creating detailed guidelines and use cases to show where this kind of thing is considered acceptable and where it is not. Then, offer a ‘hotline’ in case of doubt.
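Loshin’s point about context-dependent uniqueness can be shown with a toy matching function where the set of attributes that must agree is itself a parameter – the catalog company passes two attributes, the surgeon far more. All record fields below are hypothetical:

```python
def same_entity(a: dict, b: dict, match_on: list[str]) -> bool:
    """Two records represent the same entity if they agree on every attribute in match_on."""
    return all(a.get(field) == b.get(field) for field in match_on)

rec1 = {"name": "Pat Smith", "address": "12 Main St", "dob": "1970-01-01", "patient_id": "P-100"}
rec2 = {"name": "Pat Smith", "address": "12 Main St", "dob": "1982-06-30", "patient_id": "P-245"}

# A catalog mailer is happy with name + address -> these look like one person
print(same_entity(rec1, rec2, ["name", "address"]))                       # True
# A surgeon demands more qualifying attributes -> clearly two different people
print(same_entity(rec1, rec2, ["name", "address", "dob", "patient_id"]))  # False
```

Real matching engines use fuzzy comparison and weighted scores rather than exact equality, but the business decision is the same: which attributes, and how many, qualify as ‘unique enough’ for your use case.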
Top takeaways from the IBM book
IBM on … ‘Know thyself’ is good advice that never gets old
Knowing who your customers are, what products & services you offer, and what accounts you have with suppliers is fundamental. Master data is some of the most valuable information that a business owns. It represents vital information needed across different business processes, across organizational units, and between operational systems and decision support systems. In essence, master data defines the enterprise.
My perspective: consider the question: how many active finished good SKUs (material numbers) do you have? How long did it take you to come up with an answer? If you ask three other people in your company who ‘should know’ – will you get the same answer or even an answer close to yours? Surprised by the answers? Well, at least you are not alone….
IBM on … Lost in translation: what is a customer?
Master data captures the key things that all departments must (will have to) agree on, both in meaning and usage, e.g., it is important to understand what defines a customer, which customers exist, where customers are located, and what products they have purchased. Many operational business processes touch master data, e.g., introducing a new product, onboarding a new supplier, or adding a new phone number to a customer account. Master Data must be accurate and consistent. Trustworthy (master) data is a fundamental ingredient of meaningful analytics.
My perspective: although master data used in CRM can often be different from ERP, they also have commonalities and quite some potential duplication. You will save a lot of time long-term by agreeing on which systems are considered ‘book of record’ for which fields and then making sure that anybody with a need to know – gets to know about it.
IBM on … Data interoperability and Enterprise Architecture
It’s challenging to achieve benefits from master data spread across multiple systems if those systems lack controls and integration. Without an authoritative source of master data, business processes become more complex to develop and implement. Another consequence is ‘architectural brittleness’ —making a slight change in one system can significantly impact many other systems. This affects the organization’s ability to evolve and change according to market pressures. When master data is spread across systems in an unmanaged way … it is difficult to optimize relationships with customers and suppliers across product lines or relate sales performance to product categories.
My perspective: Having multiple systems is a fact of life, but they are often run in silos, both in terms of business users and IT support. Encourage and incentivize staff to take the cross-functional and cross-system view. Fantastic things can happen when they do.
IBM on … The ‘local’ vs. broader perspective
Typically, Lines of Business (LOBs) capture and maintain unique representations of core business information such as customer and product, each with their own unique slant on the usage and representation of that information. Often, however, it’s about control. A LOB sees ‘their’ data as critical to its operations and may not see value in sharing it with the broader enterprise. These factors encourage LOBs to seek to control their own master data, sometimes acting as barriers to sharing this business information for the benefit of the parent company.
My perspective: I notice that many authors use the term LOB. It took me a while to get used to it because I have worked in companies that use other terms that mean the same thing. But, in case it helps, if you see any of the following terms, they usually mean the same as LOB: Product Lines, Franchises, Business Units, Segments, Divisions.
IBM on … The technological ‘rat’s nest’
A common reason for master data redundancy is the introduction of packaged applications such as ERP and CRM. Typically, these are designed to manage their own master data. When multiple packaged applications are deployed in an environment, an interesting puzzle arises—each will likely only store the information it needs for its own operations. When you have two or more of them (a very likely scenario), there is no common definition of the master data elements. Both ERP and CRM need information about customers —but because they each maintain unique customer attributes, neither represents a complete view of the customer. A common solution is to implement an MDM System to support the complete representation of customer information.
My perspective: If you don’t or can’t do this, you will need to invest quite some time into documenting translations and derivations and performing consistency reviews. For some reason, this kind of knowledge usually resides only in the head of an expert who is retiring next month, and nobody has considered how to live without it.
IBM on … The so-called ‘Domains’ of MDM
IBM likes to describe Master Data in terms of the ‘Who, What, Where and How’ – that is, the domains of Party (customers, suppliers), Product, Location, and Account.
IBM on … The (very) collaborative aspect of MDM and, in particular - the finished good product (material) master
IBM provides a simplified but helpful example where information about new products (or items) is received and then incrementally extended, augmented, validated, and approved by different users with different roles and responsibilities.
My perspective: Most of my own experience has been in the product data domain. I am not used to seeing such a simple diagram for what is often a very complex process – but this does the job nicely to give one an overarching appreciation that product master is not particularly linear and evolves – sometimes over years in industries with long R&D timeframes.
IBM on … MDM Implementation Styles
There is a pretty good overview of the most common technical implementation styles (consolidation, registry, coexistence, and transactional hub).
My perspective: Although the so-called ‘transactional hub’ is today by far the most common form – you are likely to encounter the other methods in mature and/or large companies. It’s thus worth studying the differences carefully.
IBM on … Reference Data vs. Metadata
Where Metadata often describes the structure, origin, and meaning of things, Reference Data is focused on defining and distributing collections of common values. Reference data enables accurate and efficient processing of operational and analytical activities by enabling processes to use the same defined values of information for common abbreviations, codes, and validation. Note: Loshin refers to metadata as ‘the data about the data’.
My perspective: Many people (even self-styled experts) confuse these terms. Show you are a cut above the rest by knowing the difference.
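Here is a tiny illustration of the distinction (the field names and values are my own assumptions): reference data is a shared collection of valid values that records point into, while metadata describes what the field itself means.

```python
# Reference data: a distributed collection of common code values that many
# systems and records share.
country_codes = {"DE": "Germany", "IE": "Ireland", "US": "United States"}

# Metadata: 'data about the data' - it describes the field, not the values in it.
country_field_metadata = {
    "name": "country_code",
    "meaning": "ISO 3166-1 alpha-2 code of the customer's legal address",
    "source": "CRM onboarding form",                      # assumed lineage
    "valid_values_from": "country_codes reference table",  # ties the two together
}

record = {"customer": "ACME GmbH", "country_code": "DE"}

# Reference data validates and enriches the record's *value* ...
assert record["country_code"] in country_codes
print(country_codes[record["country_code"]])   # Germany

# ... while metadata tells a human (or a tool) what the field *is*.
print(country_field_metadata["meaning"])
```

Note how the metadata entry points at the reference data collection: the two are related, but they answer different questions.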
IBM on … Data Quality
IBM explains some of the dimensions of data quality, focusing on Accuracy, Completeness, Consistency, and Timeliness.
My perspective: While there are some anecdotes here, there are plenty of books explicitly dedicated to DQ – far more in fact than on MDM. Those wishing to delve deeper would do well to seek out books by Danette McGilvray and Laura Sebastian-Coleman or internet sites from Dan Meyers.
IBM on … Articulating the benefits of MDM
This is one area where the IBM book excels. It starts with a reasonably familiar (and by now widely adopted and adapted) variation of the data defense vs. offense playbook.
But the real value is in the truly memorable examples they provide as food for thought. Most of these examples center around the benefits of connecting data points to serve customers better, reduce time-to-market, or even detect wrongdoing. Here are two of them:
- Without MDM, a customer who just bought a high-end product on the internet and is calling into the call center for support might not appear as a very valuable client to the customer service representative. Without managed master data, it is difficult to get a complete (and timely) view of such a customer and determine its value to the enterprise. Consequently, resources can’t be optimally applied, and it’s difficult to provide higher levels of service as would otherwise be possible by marrying segmentation strategies with enabling master data.
- We can also use MDM to uncover fraud proactively and to create alerts or take appropriate actions. For example, through partner management features, we could detect that your purchasing manager is married to the VP of Sales at one of your largest vendors!
IBM on … Reference Architecture for MDM
Chapter 3 will appeal to Enterprise Architects or those scoping out MDM tool functionalities. Here, IBM starts by stating that a good architecture principle will not be outdated by advancing technology and has objective reasons for advancing it instead of alternatives. The authors posit the following points (among others) as core principles to guide architecture decisions; more specifically that the MDM solution should:
- provide the ability to decouple information from enterprise applications and processes to make it available as a strategic asset.
- provide the enterprise with an authoritative source for master data that manages information integrity and controls the distribution of master data across the enterprise in a standardized way that enables reuse.
- provide flexibility to accommodate changes to master data schema, business requirements, and regulations and support the addition of new master data.
- provide the ability to incrementally implement a Master Data Management Solution so that a Master Data Management Solution can demonstrate “immediate value.”
- leverage existing technologies within the enterprise where prudent to do so.
IBM on … Security topics relevant to MDM
After reading chapter 4, I now think differently about security controls. MDM increases the value of information—but for that very reason, it also increases the potential damage to the organization if master data is improperly modified or disseminated. Core business information is increasingly a target of opportunity for corrupt insiders and opportunistic outsiders, e.g., hackers using technical or social engineering tactics. It is impossible to completely eliminate risk, but you can balance (via mitigation) how you address those risks against the ability to do business.
IBM points out that before MDM, master data was scattered, and each application would have had its own security, UI, interfaces, and repositories. For an attacker to gain access to all of the master data, they would have needed to successfully penetrate all of the different systems and determine the relationships between them. In other words, the bad guy had to actually do the job of integrating the master data across the silos! Thus, the barriers that make it hard for attackers to reconstruct an enterprise’s master data also make it difficult for legitimate employees to make use of it.
My perspective: I think these points make the case to ensure a good working relationship between Master Data Managers and IT Access Controls staff. If we cripple the business with controls that are too tight, we won’t get value from our investments in systems. On the other hand, the business should be prepared to take the time and map out user roles and use cases.
IBM on … Some Product Master Data Blueprints
Another nice feature of the IBM book is that chapter 6 takes real-life business challenges and attempts to work through them almost in case-study fashion. Two of these are:
- New Product Introduction (NPI) Solution Blueprint for Consumer Electronics Industry
- Global Data Synchronization Solution Blueprint for Retail
My perspective: Although my own background is in Life Sciences, primarily Medical Devices, and recently also Pharma, I found I could relate well to the CE, and Retail challenges posed and pick up some new ideas to try out.
IBM on … Roles & Responsibilities involved in MDM
An extensive appendix provides good ideas to consider when writing job descriptions for both Business and IT roles involved along the MDM continuum.
My perspective: Over the past decade, these roles have become much more common. I recommend you search on LinkedIn to compare and contrast. Chapter 9 also introduces Data Stewardship and Governance, though I would say that you should read both David Plotkin and John Ladley to get a better grounding in those respective topics.
And finally, some tips and considerations before buying the books:
- All things considered, I believe that both books belong in your reference library; if you are serious about MDM and want to go a level (or three) deeper than you will typically get just by consuming an article or attending a webinar.
- While it's obviously just my opinion, I would describe the IBM book (weighing in at 657 pages) as a ‘subject matter primer’ rather than a ‘cookbook.’ In contrast, the comparatively slimline Loshin book (at 300 pages) is more consultant-style and attempts to guide you through the phases of creating the ‘business case’ and building associated roadmaps and implementation plans.
- Don’t be afraid that the reading level will be too technical. I’m a business guy (see my profile on LinkedIn) and had no trouble with 85% of it. The other 15% I will dip into again if the situation requires it.
- Since you will be very likely highlighting and taking copious notes, do yourself a favor and buy hardback copies if you can. At least in the US, clean second-hand copies are easy to come by via Amazon and others.
- If you like to mix and match credited content in presentations or just like to have it on your laptop or iPad for the convenience of lookup and reference - I recommend you purchase the eBook versions in addition to the paper version. Both are available in PDF format per the links near the beginning of this article.
If you have read this far – I thank you for your time and wish you luck in your MDM journey. Always be learning - and stay data-lit(erate)!
David Finlay
Master Data Aficionado at SAP Retail