By Dr. David Marco
To run a successful Enterprise Information Management (EIM) and data management program, companies need to ensure that they are properly executing the fundamental building blocks of these initiatives. The basic building blocks of EIM are Meta Data Management, Data Governance and the EIM Organization.
Figure 1: EIM Building Blocks
Understanding Data Governance
Data Governance: defines the people, processes, framework and organization necessary to ensure that an organization’s information assets (data and metadata) are formally, properly, proactively and efficiently managed throughout the enterprise to secure its trust, accountability, meaning and accuracy.
In general, governance means establishing and enforcing the processes for how a group agrees to work together. Specifically, data governance services are the establishment of:
- Chains of responsibility to empower people around an organization’s data
- Measurement to gauge effectiveness of the activities
- Policies to guide the organization to meet its goals
- Control mechanisms to ensure compliance with regulations and law
- Communication to keep all required parties informed
At the highest level, data governance is concerned with the management of data – its availability, currency, usefulness, accuracy and relationships with other enterprise data.
Data governance requires a great deal of training and education. It is not an IT function, although many technical products and tools are used to administer governance. Data governance is a business responsibility, shared with IT but “owned” by the business entity and instituted across the enterprise. Like any other enterprise effort, successful data governance involves people, processes, tools, standards and activities that are managed at both strategic and operational levels. And, like any other successful enterprise initiative, data governance starts with a vision, which is communicated and sustained by the enterprise.
Data Stewardship: The process of having data stewards work with the data and metadata of an organization to ensure its quality, accuracy, formats, domain values, and that it is properly defined and understood across the enterprise.
Data Stewardship’s role is to ensure organizational data and metadata meet quality, accuracy, format and value criteria; ensuring that data is properly defined and understood (standardized) across the enterprise.
Data Stewards: The person or people responsible for working with the data and metadata. The data steward acts as the conduit between IT and the business. The data steward (which is often not just one person, but a collection of people) aligns the business needs with the IT systems supporting them (both decision support and operational). The data steward has the challenge of guaranteeing that one of the corporation’s most critical assets–its data and metadata–is being used to its fullest capacity.
Some people may say that their company does not have any data stewards, but this is not true. Every company has data stewards. There is always someone within the company to whom people turn with questions about what the data means. This person is the data steward, even if he or she doesn’t have the title.
Your company’s size, organization, and industry dictate how much effort you will need to place in data governance. Industries we have found to require greater data stewardship include pharmaceutical, certain government organizations (e.g., intelligence, military, energy), insurance, banking, and security brokers and investment advice.
Having had the opportunity to form several data stewardship organizations, we can attest that no two data stewardship groups are exactly the same. The Data Stewardship Framework provides guidelines for how these groups are formed. This framework is designed to provide corporations and government entities with the strategies and guidelines necessary to implement a highly successful data stewardship organization.
Types of Data Stewards
Throughout this section we use the term “data steward” to refer to the four types of data stewardship committee roles:
- Executive sponsor
- Chief steward
- Business steward
- Technical steward
As each role is reviewed in the following sections, keep in mind that (with few exceptions) they are not full-time jobs.
Any initiative that cuts across a company’s lines of business must have executive management support. It is imperative in breaking down the barriers and the “ivory towers” that exist in all of our companies. Do not underestimate the obstacles that political challenges present; they are the greatest challenge that any data stewardship committee faces.
Good executive sponsors do not need to attend every data stewardship meeting, nor do they need to participate in tasks like defining data definitions. Instead, the executive sponsor needs to provide the appropriate level of support for their business or technical stewards and for the data management program as a whole.
It can be more difficult to find a business executive sponsor than it is to find a technical executive sponsor. Look for five key qualities in your executive sponsor:
- Someone willing to be an executive sponsor
- A person with executive ranking
- Someone with high credibility
- Someone knowledgeable about problems within the company
- A person willing to challenge the company status quo
A large financial institution was looking to implement an enterprise-level data stewardship committee. We had a technical executive sponsor; however, she had not identified a business executive sponsor. As part of our engagement with this client, we conducted a readiness assessment of their metadata management project, which was part of their larger EIM effort. During this assessment we interviewed a member of the company’s executive management team. This person had worked at the company for over 20 years and was a very bright individual. He had a strong belief in his company’s need for data stewardship, and he clearly understood how the lack of data stewardship had cost his company significant dollars. After this meeting we went back to our client counterpart and stated that we had found a business executive sponsor.
The chief steward is responsible for the day-to-day organization and management of the data stewardship committee. Like any other organization, the data stewardship committee needs a leader or project manager. Typically, the chief steward will be a senior-level, as opposed to executive-level, individual within the organization.
The chief steward must be a highly credible person within your organization. He or she should have a sound knowledge of both the technical and the business sides of the corporation. This knowledge is vital, as some stewards are from the business and some are from the technical side. The chief steward needs to understand the politics within the organization and have the insight to navigate around those challenges. Most importantly, the chief steward must have strong leadership and communication skills to help guide the data stewardship committee. This is most evident when this person needs to attain consensus across disparate groups.
The business steward is responsible for defining the procedures, policies, data meanings, and requirements of the enterprise. Keep in mind that the business stewards can be organized from a departmental level (e.g. consumer lending, military branch, pharmacology) or by subject matter (e.g. logistics, shipping).
Business stewards need to have a strong knowledge of the business requirements and policies of the corporation. They must make sound decisions and work with key members of their business in order to gain consensus on their organizations’ business policies and requirements.
The technical steward is a member of the IT department. These people focus on the technical metadata and data that needs to be captured by the data stewardship committee.
The remainder of this article walks through the first tasks that the data governance team must address and the typical data stewardship activities in which it will be involved.
Preparing for Data Stewardship
The data stewardship committee must conduct a data governance assessment and complete the following tasks before they can capture and define business and technical metadata:
1. Form a charter
2. Define and prioritize the committee’s activities
3. Create committee rules of order
4. Establish roles for committee members
5. Design standard documents and forms
Form a Charter
The first task of the data stewardship committee is to form a documented charter for their activities. This charter states the business purposes that necessitated the data stewardship committee formation. The data stewardship charter should not be a voluminous document, as this document’s goal is to provide a clear direction as to the committee’s strategic business goals. Obviously, this charter needs to target the specific concerns and opportunities of the company. Best Practice: I like the data stewardship charter to fit on one single-spaced page. Anything longer is likely too long.
For example, pharmaceutical companies tend to have extensive and elaborate data stewardship committees. Their data stewardship charters traditionally focus on clinical trials, the process a pharmaceutical company goes through to research, develop, and attain government approval for new compounds (drugs). Developing a new drug costs, on average, between $150 million and $250 million and takes over 10 years before the drug can be brought to market. Every day that a new compound is delayed from reaching the market costs the company $1 million in lost revenue. This includes the extra time it will take to recoup sunk expenses (ever see the interest expense on $150 million?) and the possibility of a competitor creating a competing compound.
During these trials, government agencies like the FDA have rigorous standards that must be met before a new drug can gain approval. Passing these FDA (and other agency) tests is not easy. These organizations and the corresponding legislation require that a pharmaceutical company have very precise definitions for its data elements. Clearly, a pharmaceutical company’s data stewardship committee charter will focus very heavily on how it can expedite the passing of the FDA audits.
Define and Prioritize Committee Activities
Once the charter has been created, the data stewardship committee needs to define the specific activities that they will be performing. It is vital that these activities support the strategic objectives of the data stewardship charter.
Once these activities have been defined, they must be prioritized so the data stewardship team knows what to tackle first. At this point in the process we suggest using a matrix to show the possible activities. On the vertical axis, list which of these activities will be most beneficial to the organization (see Figure 2).
Figure 2: Prioritization Matrix
Create Committee Rules of Order
Once the activities of the data stewardship committee have been identified, the committee will have to create rules of order for their organization. Following is a sample of the types of rules of order that will need to be defined:
- Regular meeting schedule
- Meeting structure or agenda
- Issue documentation
- Issue resolution
- Meeting notes capture and dissemination
Establish Roles for Committee Members
After the data stewardship committee has defined their rules of order it will be important for this team to formally define their different data stewardship roles and responsibilities. Earlier we defined four data stewardship roles: executive sponsor, chief steward, business steward and technical steward. Certainly, these roles are a good beginning set for any new data stewardship committee; however, if you are like most companies you will tailor these roles, titles and descriptions to suit your company’s specific needs.
Design Standard Documents and Forms
Now it is time for the data stewardship committee to create any standard documents or forms that will support the data stewardship activities. This activity is important, as you do not want to have each steward creating their own document for each activity.
One of the most common documents that will be required is a change control document. Members of your company use it to formally document their data stewardship tasks. For example, suppose that a key task of your data stewardship committee is to define business metadata definitions. Certainly you will have business stewards working on these definitions; however, some people may not formally be part of the data stewardship committee. These people may want to recommend changes to the business definitions that your business stewards defined. Clearly you would need a form (optimally Web-based, tied to a managed metadata environment (MME)) that would allow them to provide feedback.
Another common form is a data stewardship feedback mechanism. It is important that the data stewardship committee is not viewed as a group that is in their own ivory tower. Allowing feedback on the things that your data stewardship committee is doing well, as well as recommendations on what they can do better helps to ensure that you are meeting the needs of your constituency.
Data Stewardship Activities
The specific activities of the data stewardship committee will vary from one organization to another and from industry to industry. However, there are some common activities, listed here:
- Define data domain values
- Establish data quality rules, validate and resolve them
- Set up business rules and security requirements
- Create business metadata definitions
- Create technical data definitions
The data stewards who work on these different activities will be primarily working with metadata; however, there will be occasions when they may need to work with actual data. The following sections walk through these activities and provide a set of guidelines and best practices for performing them. We discuss the typical data stewards who perform each task. It is important to note that sometimes there are people who are highly knowledgeable on the data and the business policies around the data, even though they do not belong to the particular stewardship group that I mention for the activity. For example, there may be some technical stewards who are as knowledgeable on the business policies and data values as any of the “official” business stewards. So even though I state in my guidelines that the business stewards should be creating the business metadata definitions, obviously you would want these technical stewards working with the business stewards to define the business metadata definitions.
For all of these activities, the chief steward will play a critical role in ensuring they are properly completed. A good chief steward ensures that the technical and business stewards work thoroughly and expediently, understanding how easy it is for a group to fall into “analysis paralysis”. The chief steward will act as a project manager or guide in each of these activities. Most importantly, the chief steward aids committee members in resolving any conflicts, and there will be conflicts.
Define Data Domain Values
Once the business stewards define the key data attributes, they need to define the domain values for those attributes. For example, if one of the attributes is state code, then the valid domain values would be the two-character abbreviations of the states (e.g. CA, FL, IL, NY).
As with all data stewardship tasks, this metadata will be stored in the MME. It is highly recommended that a Web-based front-end be developed so that the business stewards can easily key in this vital metadata.
In many cases data modelers input attribute domain values into their modeling tool. If this has occurred in your company, you can create a process to export that metadata from the modeling tool into the MME. This will give the business stewards a good starting point for entering domain values.
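The state code example above can be sketched in a few lines of Python. This is only an illustration of how a captured domain might be applied, not a description of any particular MME product: the names `STATE_CODE_DOMAIN` and `find_invalid_values` are hypothetical, and the domain holds only the four sample codes mentioned in the text.

```python
# Hypothetical domain for the "state code" attribute. Only the sample codes
# named in the text are listed here, not the full set of states.
STATE_CODE_DOMAIN = frozenset({"CA", "FL", "IL", "NY"})


def find_invalid_values(values, domain):
    """Return the values that fall outside the approved domain, in input order."""
    return [value for value in values if value not in domain]


# Lower-case and unknown codes are both reported as outside the domain.
incoming = ["CA", "ny", "IL", "ZZ"]
invalid = find_invalid_values(incoming, STATE_CODE_DOMAIN)
```

Keeping the check case-sensitive is a design choice in this sketch; the business stewards would decide whether "ny" should be rejected or normalized.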
Establish Data Quality Rules, Validate and Resolve Them
Data quality is the responsibility of both the business and technical stewards. It is the responsibility of the business steward to define data quality thresholds and data error criteria. For example, the data quality threshold for customer records that fail during a data warehouse load process may be 2%. Therefore, if the percentage of customer records in error is greater than 2%, then the data warehouse load run is automatically stopped. An additional rule can be included stating that if the percentage of records in error is 1% or greater but less than 2%, then a warning message is sent to the data warehouse staff; however, the data warehouse run is allowed to proceed. An example of data error criteria would be a rule defined for the “HOME_LOAN_AMT” field. This rule would state that the allowable values for the “HOME_LOAN_AMT” field are any numeric value between $0 and $3,000,000.
It is the responsibility of the technical stewards to make sure that the data quality rules are implemented and adhered to. In addition, the technical stewards will work with the business stewards on the specific data quality thresholds and data error criteria.
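The threshold and error-criteria examples above could be implemented along the following lines. This is a minimal sketch under stated assumptions: the function and constant names are hypothetical, and because the text does not say what happens at exactly 2%, the sketch treats that boundary as a warning rather than a stop.

```python
from enum import Enum


class LoadAction(Enum):
    PROCEED = "proceed"
    WARN = "warn"
    STOP = "stop"


# Thresholds from the example in the text; in practice the business
# stewards would define these values.
WARN_THRESHOLD = 0.01  # 1% or greater triggers a warning
STOP_THRESHOLD = 0.02  # greater than 2% stops the load run

# Allowable range for HOME_LOAN_AMT, in dollars, from the example in the text.
HOME_LOAN_AMT_MIN = 0
HOME_LOAN_AMT_MAX = 3_000_000


def is_valid_home_loan_amt(value):
    """Data error criterion: a numeric value between $0 and $3,000,000."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return HOME_LOAN_AMT_MIN <= value <= HOME_LOAN_AMT_MAX


def evaluate_load(total_records, error_records):
    """Decide what the load run should do based on its error rate."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    error_rate = error_records / total_records
    if error_rate > STOP_THRESHOLD:
        return LoadAction.STOP
    # Assumption: exactly 2% is handled as a warning, since the text
    # leaves that boundary unspecified.
    if error_rate >= WARN_THRESHOLD:
        return LoadAction.WARN
    return LoadAction.PROCEED
```

For instance, a load of 1,000 customer records with 15 in error would return `LoadAction.WARN`, while 25 in error would return `LoadAction.STOP`.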
Set Up Business Rules and Security Requirements
Business rules are some of the most critical metadata within an organization. Business rules describe how the business operates with its data. A business rule describes how the data values were derived and calculated, if the field relates (cardinality) to other fields, data usage rules and regulations, and any security requirements around a particular entity or attribute.
For example, a healthcare insurance company may have a field called “POLICY_TREATMENTS”. This field may list the specific medical treatments that a policy holder has undergone. The business rule for this field may state that it is an alphanumeric, 20-byte field whose “system of record” is “System A”. In addition, there may be security requirements on this field. Most health insurance companies provide coverage to their own employees, so the security requirement for this field is that the IT department cannot view this field or associate it with any fields that would identify the policy holder. When security rules like these are broken, the corporation is vulnerable to legal exposure.
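One way to picture the “POLICY_TREATMENTS” example is as a small metadata record with an access check attached. The sketch below is an assumption about how such a rule might be represented; the `BusinessRule` structure, the `can_view` function and the role names “IT” and “CLAIMS” are hypothetical and are not drawn from any specific tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessRule:
    """Business-rule metadata for one field (hypothetical structure)."""
    field_name: str
    data_type: str
    length_bytes: int
    system_of_record: str
    denied_roles: frozenset = frozenset()  # roles that may not view the field


# Values taken from the example in the text; the role name "IT" is illustrative.
POLICY_TREATMENTS_RULE = BusinessRule(
    field_name="POLICY_TREATMENTS",
    data_type="alphanumeric",
    length_bytes=20,
    system_of_record="System A",
    denied_roles=frozenset({"IT"}),
)


def can_view(rule, role):
    """Return True when the given role is allowed to view the field."""
    return role not in rule.denied_roles


it_allowed = can_view(POLICY_TREATMENTS_RULE, "IT")
claims_allowed = can_view(POLICY_TREATMENTS_RULE, "CLAIMS")
```

A deny list is used here only because the text phrases the requirement as a restriction on one department; an allow list would be the more conservative choice for sensitive fields.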
Create Business Metadata Definitions
One of the key tasks for the business stewards is to define the business metadata definitions for the attributes of a company. It is wise to begin by having the business stewards define the main subject areas of their company. Subject areas are the “nouns” of the corporation: customer, product, sale, policy, logistics, manufacturing, finance, marketing, and sales. Typically, companies have 25-30 subject areas, depending on their industry. Once the business stewards define the subject areas, each of these areas can be further drilled down. For example, a company may distinguish between the different lines of business or by subsidiary.
Some data elements require calculation formulas of some sort. Your company may have a data attribute called “NET_REVENUE”. “NET_REVENUE” may be calculated by subtracting “gross costs” from “gross revenues”. Any calculation formulas should be included in the business metadata definitions.
Once the key data elements are identified, then the business stewards can begin working on writing metadata definitions on the attributes. The process for capturing these definitions needs to be supported by an MME. The MME includes metadata tables with attributes to hold the business metadata definitions. In addition, a Web-based front-end would be given to the business stewards to key in the business metadata definitions. The MME captures and tracks these metadata definitions historically, using “from” and “to” dates on each of the metadata records. A metadata status code is also needed on each row of metadata. This status code shows if the business metadata definition is approved, deleted or pending approval.
When the first business metadata definitions are entered, it is common to mark them as “pending”. This allows the business data stewards to gain consensus on these elements before changing their status to “approved”.
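The “from” and “to” dates and the status code described above can be illustrated with a small sketch. The record layout, the `approve` function and the dates are all hypothetical; a real MME would hold these rows in database tables rather than in a Python list.

```python
from dataclasses import dataclass, replace
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending approval"
    APPROVED = "approved"
    DELETED = "deleted"


@dataclass(frozen=True)
class BusinessDefinition:
    """One historical row of business metadata (hypothetical layout)."""
    attribute: str
    definition: str
    status: Status
    from_date: date
    to_date: Optional[date] = None      # None marks the current row
    calculation: Optional[str] = None   # formula, when the attribute has one


def approve(history, attribute, on):
    """Close the current pending row and append an approved row.

    Returns a new list; earlier rows are kept so the history is preserved.
    """
    updated = []
    approved_row = None
    for row in history:
        is_current_pending = (
            row.attribute == attribute
            and row.to_date is None
            and row.status is Status.PENDING
        )
        if is_current_pending:
            updated.append(replace(row, to_date=on))
            approved_row = replace(row, status=Status.APPROVED, from_date=on)
        else:
            updated.append(row)
    if approved_row is None:
        raise LookupError(f"no pending definition for {attribute}")
    updated.append(approved_row)
    return updated


# Illustrative data only; the dates are made up for the example.
history = [
    BusinessDefinition(
        attribute="NET_REVENUE",
        definition="Gross revenues minus gross costs.",
        status=Status.PENDING,
        from_date=date(2024, 1, 1),
        calculation="gross revenues - gross costs",
    )
]
history = approve(history, "NET_REVENUE", on=date(2024, 2, 1))
```

Closing the old row instead of overwriting it is what lets the MME answer the question of what a definition said on a given date.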
Create Technical Meta Data Definitions
The technical stewards are responsible for creating the technical metadata definitions for the attributes of a company. It is important to understand that technical metadata definitions will fundamentally differ in form from business metadata definitions. As business metadata definitions are targeted to the business users, technical metadata definitions are targeted for an organization’s IT staff. Therefore it is perfectly acceptable to have SQL code and physical file and database locations included in the technical metadata definitions.
Usually it is too much work to have the technical stewards list all of the physical attributes within the company. Instead, begin with the technical stewards listing their key data attributes. By identifying the core data attributes, the IT department can focus technical metadata definitions on only the most important data attributes. Once your technical stewards have defined these initial physical attributes they can now start working on the remaining attributes.
The process for capturing these technical data definitions is a mirror image of the process to capture business metadata; in fact, the Web-based user screens should look very similar. The same functionality described in the business metadata definitions above (from and to dates, status codes, and so on) should also be included.
Once both the business and technical stewards define their metadata definitions, any discrepancies will almost immediately come to light–and there will be discrepancies. For example, the business stewards may define “product” as any product that a customer has purchased. The technical stewards may define “product” as a product that is marked as active. These two definitions are clearly different. Under the business stewards’ definition, any product (active or inactive) that is currently on an open order for a customer would be valid. Obviously, the IT staff will want to work with the business users to repair these hidden system defects.
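To make the “product” discrepancy concrete, the two definitions can be written as predicates and compared record by record. This is a simplified sketch: the `Product` structure and its flags are hypothetical, and the business definition is approximated here by whether the product is on an open order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Minimal product record with hypothetical flags."""
    product_id: str
    is_active: bool
    on_open_order: bool


def business_is_product(product):
    # Business view, simplified: any product a customer has purchased,
    # approximated by being on an open order (active or inactive).
    return product.on_open_order


def technical_is_product(product):
    # Technical view: only products marked as active.
    return product.is_active


def find_discrepancies(products):
    """Return the IDs of products on which the two definitions disagree."""
    return [
        p.product_id
        for p in products
        if business_is_product(p) != technical_is_product(p)
    ]


# Illustrative records only.
catalog = [
    Product("P1", is_active=True, on_open_order=True),
    Product("P2", is_active=False, on_open_order=True),
    Product("P3", is_active=True, on_open_order=False),
]
mismatched = find_discrepancies(catalog)
```

Here the inactive product on an open order and the active product that nobody has ordered both surface as mismatches, which are the cases the stewards would need to reconcile.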
Author’s Biography – Dr. David Marco, Fellow IIM, CDMP (Master), CBIP, CDP
Best known as the world’s foremost authority on data governance and data literacy, he is an internationally recognized expert in the fields of data management, data literacy, advanced analytics, data warehouse, business intelligence, metadata management and data stewardship. In 2004 David Marco was named the “Melvil Dewey of Metadata” by Crain’s Chicago Business as he was selected to their very prestigious “Top 40 Under 40” list and was named by DePaul University as one of their “Top 14 Alumni Under 40”. In 2008 David Marco earned the DAMA Data Management Professional Achievement Award. In 2020 he was awarded the title of Professional Fellow from the Institute of Information Management (their highest honor). He is the president of DataManagementU.com and is their lead contributor.
David Marco is the author of the two widely acclaimed, top-selling books in metadata management history, “Universal Meta Data Models” and “Building and Managing the Meta Data Repository” (available in multiple languages). In addition, he is a coauthor of multiple books and has published hundreds of articles, some of which have been translated into Mandarin, Russian, Portuguese and other languages.