Data models allow an organization to “know thyself.” Only by understanding the “what,” “when,” and “how” of data structures and workflows can a company successfully implement applications, cloud, and reporting across an enterprise system landscape.
A company’s survival depends on a shared understanding of how data is defined and communicated throughout the organization.
For that reason, most companies should prioritize using data models to establish and share a collective understanding of their data.
Enterprise Data Models (EDMs) provide detailed specifications of an organization’s data and establish a uniformity of meaning across the company. For data to be considered enterprise-significant, all stakeholders must share the same definition.
A comprehensive understanding of an organization’s relationship with its data results from clear definitions of the activities that determine its usage.
A successful EDM must exclude poorly implemented or redundant activities and instead be based on ideal definitions of activities. Distinguishing processes from functions can help a company differentiate between what it is actually doing and what it should be doing.
Process vs Function
Processes specify what activities an organization actually carries out and often include mechanisms.
Functions are abstract definitions of activities that an organization needs to perform in order to survive and thrive in the future. Function definitions do not include any mechanisms used to perform them.
The lower the degree of overlap between an organization’s processes and its functions, the more opportunities exist to re-engineer processes and improve operational effectiveness. The EDM should be resilient to mechanistic changes; any changes to systems or activities should not require the EDM to be re-defined. To achieve this, base the EDM on abstract function definitions rather than mechanistic process definitions. Of course, this is easier said than done. Some mechanistic definitions may be required due to legislative or regulatory requirements.
The Enterprise Data Model as a Communication Tool
Since no single stakeholder has the “big picture,” it’s imperative for an EDM Modeler to establish a list of SMEs and create an effective working relationship with them.
The EDM should be used as a communication tool to share the agreed understanding of the enterprise across the organization. To prevent the communication barrier that results from using abstract names and terms, the Data Modeler should always supplement definitions with synonyms and relevant concrete examples.
High-Level Data Modelling Process
- Create engagement: The Modeler should record a list of key stakeholders and plan communication with those individuals.
- Perform 3D modeling: Gather any raw information that feeds into data modeling activities; question, evaluate, and refine outcomes; define the data models and other related artifact definitions.
- Publish the data models to the enterprise’s audience after proper governance processes.
Note: An EDM should have longevity and require few updates over time. A common cause of reduced longevity is the use of mechanistic definitions.
Standardizing Attribute Names
The most commonly used attribute naming conventions have three potential elements:
- Attribute Subject – What entity characteristic is relevant to the data?
- Modifier – What is the subject qualifier?
- Domain Type – Which data domain is the attribute conformed to?
E.g. Subject: “Area,” Modifier: “Unit of Measure,” Domain Type: “Code”
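The three elements above can be sketched as a small helper that assembles an attribute name. This is only an illustration: the class name `AttributeName` and the ordering (subject, then modifier, then domain type) are assumptions, not something the convention itself prescribes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttributeName:
    """An attribute name built from the three convention elements."""
    subject: str                    # the entity characteristic being described
    domain_type: str                # the data domain the attribute conforms to
    modifier: Optional[str] = None  # optional qualifier of the subject

    @property
    def name(self) -> str:
        # Assumed ordering: subject, modifier (if present), domain type.
        parts = [self.subject, self.modifier, self.domain_type]
        return " ".join(part.strip() for part in parts if part)


area_uom = AttributeName(subject="Area", modifier="Unit of Measure", domain_type="Code")
print(area_uom.name)  # Area Unit of Measure Code
```

Keeping the elements separate until the name is rendered makes it easy to check, for example, that every attribute ends in a recognized domain type.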
Avoid technology-based naming conventions such as “Is Response Received,” “Full Name,” or “Creation Datestamp.”
Avoid adding a prefix to the attribute. Attributes in data models should be atomic: the data they contain should have no internal structure or patterns.
In the Logical Data Model, it’s important to recognize that no attribute should be complex.
The property of an Entity that allows each instance to be differentiated from the others is called its Unique Identifier. Every single instance of an Entity should have its uniqueness guaranteed using a Unique Identifier based upon one or more attributes.
When a natural key is not available for an Entity, the next best approach is adopting an external referencing system to master the Unique Identifier. I would highly recommend the tool Accurids for its state-of-the-art functionality and ability to provide stable, resolvable persistent identifiers.
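As a minimal sketch of the uniqueness rule, the check below reports identifier values shared by more than one instance. The function name, the dictionary representation of instances, and the sample "country/currency" rows are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping, Sequence


def duplicate_identifiers(instances: Iterable[Mapping[str, object]],
                          key_attributes: Sequence[str]) -> list[tuple]:
    """Return identifier values that occur on more than one instance."""
    counts = Counter(
        tuple(instance[attr] for attr in key_attributes)
        for instance in instances
    )
    return [key for key, n in counts.items() if n > 1]


# Hypothetical instances identified by a combination of two attributes.
rows = [
    {"country_code": "DE", "currency_code": "EUR"},
    {"country_code": "FR", "currency_code": "EUR"},
    {"country_code": "DE", "currency_code": "EUR"},  # repeats the first row
]
print(duplicate_identifiers(rows, ["country_code", "currency_code"]))
# [('DE', 'EUR')]
```

An empty result means the chosen attributes differentiate every instance in the sample; it does not prove the identifier will stay unique as new data arrives.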
Important points to consider for Enterprise Data Modelling
To prepare enterprise definitions that maximize support for localized data requirements, a company should consider:
- Currency: ISO currencies are advised for the domain.
- Language: Reference & Master Data might have to be recorded in different languages. Instead of relying on software-based translations, use a separate entity to record Reference & Master Data in a different language provided by an SME. For Transactional data, use a translation tool instead.
- Regional Type Entities: Different values must be captured in different parts of the world. Add an Entity that allows you to record the jurisdiction.
- Attribution – Enterprise vs Division: One of the biggest issues when defining models is attribute proliferation. As you receive input from each country and the model develops, you need to add more attributes. This proliferation of attributes creates a lot of complexity and is not sustainable. To address this issue, use concrete definitions for data model structures, entities, and attributes. Support regional and locally applicable attributes with metadata definitions.
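Two of the points above, the separate translation entity and metadata-defined local attributes, can be sketched roughly as follows. Every name here (`ReferenceTranslation`, `LocalAttributeDefinition`, `Customer`, the language codes, and the sample values) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReferenceTranslation:
    """SME-provided translation of one reference data value."""
    reference_code: str  # identifier of the reference/master data value
    language_code: str   # e.g. "en" or "de" (assumed coding scheme)
    label: str


@dataclass(frozen=True)
class LocalAttributeDefinition:
    """Metadata describing an attribute that applies in one jurisdiction only."""
    entity: str
    attribute: str
    jurisdiction: str


@dataclass
class Customer:
    """Enterprise-level entity; local attributes are kept outside the core."""
    customer_id: str
    jurisdiction: str
    local_values: dict[str, str] = field(default_factory=dict)


def label_for(translations, reference_code, language_code, default_language="en"):
    """Look up the SME-provided label, falling back to the default language."""
    by_key = {(t.reference_code, t.language_code): t.label for t in translations}
    return by_key.get((reference_code, language_code),
                      by_key.get((reference_code, default_language)))


def undefined_local_attributes(customer, definitions):
    """Return local attribute names with no metadata definition for the jurisdiction."""
    allowed = {d.attribute for d in definitions
               if d.entity == "Customer" and d.jurisdiction == customer.jurisdiction}
    return sorted(set(customer.local_values) - allowed)


translations = [
    ReferenceTranslation("STATUS_ACTIVE", "en", "Active"),
    ReferenceTranslation("STATUS_ACTIVE", "de", "Aktiv"),
]
definitions = [LocalAttributeDefinition("Customer", "Tax Identifier Code", "DE")]
customer = Customer("C-1", "DE", {"Tax Identifier Code": "123", "State Code": "BY"})

print(label_for(translations, "STATUS_ACTIVE", "de"))     # Aktiv
print(undefined_local_attributes(customer, definitions))  # ['State Code']
```

The enterprise entity stays small, while each jurisdiction's extra attributes are governed by metadata rather than by new columns on the shared definition.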
Benefits of an Enterprise Data Model
An EDM provides a comprehensive foundation for the definition of data across an organization. The enterprise data model comprises:
- Conceptual Data Model: Communicates high-level entities and their meaning across an organization. For big firms, it’s advisable to break the conceptual model into individual focus areas such as “Product,” “Customer,” “Finance,” or “Marketing.”
- Logical Data Model: The LDM is fully normalized and derived from the conceptual data model. The purpose of the LDM is to define the organization’s data scope, definition, and structures.
- Physical Data Model: The PDM describes the definition and structure of data elements at rest. The PDM generation process transforms the LDM’s implementation-agnostic representation into the technology-dependent representation of the PDM.
- Canonical Data Model: Describes the definition and structure of the data elements that flow across a firm’s system landscape. It provides a crucial basis for validating the specification of interfaces and APIs.
Note: Only modifications with more generalized significance should be carried from the more specific models into the more abstract models.
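The step from a logical to a physical model can be illustrated with a small generator that maps logical domain types to the column types of one target technology. This is a sketch under stated assumptions: SQLite is picked arbitrarily as the target, and the `DOMAIN_TO_SQLITE` mapping, the function names, and the sample "Product" entity are hypothetical.

```python
import sqlite3

# Assumed mapping from logical domain types to SQLite column types.
DOMAIN_TO_SQLITE = {
    "Code": "TEXT",
    "Name": "TEXT",
    "Amount": "NUMERIC",
    "Date": "TEXT",  # SQLite has no dedicated date storage class
}


def physical_name(logical_name: str) -> str:
    """Turn a logical name such as 'Unit of Measure' into 'unit_of_measure'."""
    return "_".join(logical_name.lower().split())


def to_create_table(entity: str, attributes: list[tuple[str, str]],
                    identifier: list[str]) -> str:
    """Render one logical entity as a CREATE TABLE statement."""
    columns = [
        f"{physical_name(name)} {DOMAIN_TO_SQLITE[domain]}"
        for name, domain in attributes
    ]
    key = ", ".join(physical_name(name) for name in identifier)
    columns.append(f"PRIMARY KEY ({key})")
    return f"CREATE TABLE {physical_name(entity)} (\n  " + ",\n  ".join(columns) + "\n)"


ddl = to_create_table(
    entity="Product",
    attributes=[("Product Code", "Code"), ("Product Name", "Name"),
                ("Area Unit of Measure Code", "Code")],
    identifier=["Product Code"],
)
print(ddl)

# Check that the target technology accepts the generated statement.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.close()
```

Only the mapping table and the rendering function know about the target technology, so the logical definitions passed in remain implementation-agnostic.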
Enterprise Data Lexicons
The enterprise data lexicon provides reference definitions that are entirely consistent with the visual structural data models. It also provides an alternative and complementary communication mechanism that consists of:
- Enterprise data dictionary
- Enterprise semantic definitions:
  - Catalog of meanings, examples, and abbreviations
  - Regional or colloquial terms
  - Ontology or taxonomies of terms
- Data lifecycle definitions
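A lexicon entry of this kind could be represented along the following lines, so that abbreviations and regional terms all resolve to a single agreed definition. The `LexiconEntry` structure, the `build_index` helper, and the "Postal Code" example are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """One semantic definition with its alternative spellings."""
    term: str
    meaning: str
    examples: tuple[str, ...] = ()
    abbreviations: tuple[str, ...] = ()
    regional_terms: tuple[str, ...] = ()


def build_index(entries):
    """Map every term, abbreviation, and regional term to its entry."""
    index = {}
    for entry in entries:
        for alias in (entry.term, *entry.abbreviations, *entry.regional_terms):
            index[alias.casefold()] = entry  # case-insensitive lookup key
    return index


postal_code = LexiconEntry(
    term="Postal Code",
    meaning="Code assigned by a postal authority to a delivery area.",
    examples=("10115", "SW1A 1AA"),
    regional_terms=("ZIP Code", "Postcode"),
)
index = build_index([postal_code])
print(index["zip code"].term)  # Postal Code
```

Looking up a colloquial term returns the enterprise term and its meaning, which is the complementary communication role the lexicon plays alongside the visual models.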
- Dave Knifton, Enterprise Data Architecture (2014)
- Dave Knifton, The Data Model Toolkit (2016)