Canonical Models

Introduction
A canonical model is a standard, shared data model that acts as a universal translator between different systems, applications, or domains. It provides a common language and format for data, simplifying integration by transforming all data into a single, consistent representation before it is exchanged. This avoids the need for a separate point-to-point connection between every pair of systems.
How it works
- Central hub: It establishes a central "hub" for data. Systems don't need to connect to every other system directly; they only need to connect to the hub.
- Inbound transformation: When data enters the hub from a source system, it is translated from that system's native format into the canonical model's format.
- Outbound transformation: When data needs to be sent to another system, it is translated from the canonical format into that system's native format.
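- Simplification: This "hub-and-spoke" model reduces the number of integrations from quadratic growth to linear growth. Connecting n systems directly to one another can require up to n(n-1)/2 connections, whereas the hub requires only n, because each new system needs just one integration point to the central hub.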
- Future-proofing: If a system is replaced, only the mapping to the canonical model needs to be updated, protecting the rest of the integrations from change.
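The difference in integration count is easy to check with a little arithmetic. The sketch below is illustrative only; the function names are hypothetical, and it counts each pair of systems as one bidirectional connection. If each direction is counted as a separate mapping, the figures become n(n-1) for point-to-point and 2n for the hub (one inbound and one outbound mapping per system).

```python
def point_to_point_links(n: int) -> int:
    """Connections needed when each of n systems links directly to every other one."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Connections needed when each of n systems links only to the canonical hub."""
    return n


# Compare how the two approaches grow as systems are added.
for n in (3, 5, 10, 20):
    print(f"{n:>2} systems: point-to-point={point_to_point_links(n):>3}, hub={hub_links(n):>2}")
```

With 10 systems, for instance, the point-to-point approach needs 45 connections while the hub needs 10, and adding an eleventh system adds 10 more connections in the first case but only one in the second.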
Benefits of using a canonical model
- Interoperability: Enables systems with different technological foundations to easily exchange data.
- Standardization: Ensures consistency in data definitions, formats, and semantics across the organization.
- Simplification: Creates a unified model that is easier for various systems to use.
- Consistency and quality: Provides a single, authoritative definition of each data entity, leading to more accurate and consistent data across different applications.
- Reusability: The standard model can be reused across the organization, making new integrations faster and easier.
Example
Imagine a company with an employee contact list scattered across a CRM, an HR system, and a spreadsheet. Each system might store the data differently. A canonical data model could define a standard format for "employee contact" with fields like firstName, lastName, email, and phoneNumber, each with an agreed name, type, and meaning. Data from each source would be transformed into this canonical format, and any system needing contact information would receive it in that format (or have it translated into its own native format), so the data looks the same regardless of where it originated.
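The employee contact example can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the canonical field names come from the example, but the native field names for the CRM and HR system, the spreadsheet column order, and the "payroll" consumer are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmployeeContact:
    """Canonical 'employee contact' record. Field names follow the example text."""
    firstName: str
    lastName: str
    email: str
    phoneNumber: str


# Inbound transformations: native format -> canonical format.
# All source field names below are hypothetical.
def from_crm(record: dict) -> EmployeeContact:
    # Assumes the CRM stores a single "First Last" name string.
    first, _, last = record["full_name"].partition(" ")
    return EmployeeContact(first, last, record["email_address"].lower(), record["phone"])


def from_hr(record: dict) -> EmployeeContact:
    return EmployeeContact(
        record["given_name"],
        record["family_name"],
        record["work_email"].lower(),
        record["tel"],
    )


def from_spreadsheet(row: list[str]) -> EmployeeContact:
    # Assumed column order: last name, first name, email, phone.
    last, first, email, phone = row
    return EmployeeContact(first, last, email.lower(), phone)


# Outbound transformation: canonical format -> a consumer's native format.
def to_payroll(contact: EmployeeContact) -> dict:
    return {
        "name": f"{contact.lastName}, {contact.firstName}",
        "contact_email": contact.email,
    }


# The same person, as stored by three different systems.
crm = from_crm({"full_name": "Ada Lovelace", "email_address": "Ada@Example.com", "phone": "555-0100"})
hr = from_hr({"given_name": "Ada", "family_name": "Lovelace", "work_email": "ada@example.com", "tel": "555-0100"})
sheet = from_spreadsheet(["Lovelace", "Ada", "ada@example.com", "555-0100"])

# All three sources yield an identical canonical record.
print(crm == hr == sheet)
print(to_payroll(crm))
```

Each source needs only its own inbound function, and each consumer only its own outbound function. If the CRM were replaced, only from_crm would change; the other mappings and the payroll consumer would be untouched.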




