Content Unimodel – Can We Standardize Content Modeling?
At the CMS Connect 25 conference in Montreal, I led a breakout group of CMS experts in a discussion of the topic, “Content Modeling for Structured Content: What Does it Mean to You and Can We Standardize?”
This post outlines our discussion and adds several takeaways of my own.
Proposed Benefits of a Unified Content Model
The group focused on themes of experimentation and opportunity:
Can the content management community establish a common, shared and unified content model across domains and organizations?
We considered the benefits of creating a unified content model across the content management domain, a “content unimodel”. Here are some hypotheses that resulted:
Alignment & Standardization: The content management community uses general but sometimes unclear, overlapping or conflicting terms to describe content model primitives. A unified content model could clarify and formalize concepts and references.
Efficiency & Portability: Ragged and denormalized content models can generate spaghetti and sprawl even within the same organization. A unified content model could streamline migration and re-platforming through reduced ETL steps among content sets and systems.
Cohesion & Extension: The Web relies operationally on shared frameworks, patterns and standards. The community could employ and extend a unified content model alongside other Web conventions such as design systems, front-end libraries and linked data.
To illustrate: the dotCMS content model offers contentlets, containers and VTL files, while the Drupal content model offers nodes, blocks and views. An organization that builds its content model in one of these systems uses that system’s built-in, foundational model as a base model for the organizational content model.
The proposition is that a unified content model could bridge or transcend these in-house models and systems. Content Management Interoperability Services (CMIS) and Java Content Repository (JCR) could be viewed as earlier, similar approaches.
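One way to picture such a bridge is a lookup table from each system's native primitives to shared terms. The sketch below is hypothetical: the unified names (content_item, placement, rendering_logic, listing) and the pairings themselves are my own assumptions for illustration, not equivalences asserted by dotCMS, Drupal or our group.

```python
from typing import Optional, Set

# Hypothetical mapping from system-native primitives to shared terms.
# Both the unified names and the pairings are illustrative assumptions.
UNIFIED_TERMS = {
    ("dotCMS", "contentlet"): "content_item",
    ("dotCMS", "container"): "placement",
    ("dotCMS", "VTL file"): "rendering_logic",
    ("Drupal", "node"): "content_item",
    ("Drupal", "block"): "placement",
    ("Drupal", "view"): "listing",
}


def to_unified(system: str, primitive: str) -> Optional[str]:
    """Return the shared term for a native primitive, or None if unmapped."""
    return UNIFIED_TERMS.get((system, primitive))


def shared_terms(system_a: str, system_b: str) -> Set[str]:
    """Return the unified terms that both systems map onto."""
    terms_a = {term for (system, _), term in UNIFIED_TERMS.items() if system == system_a}
    terms_b = {term for (system, _), term in UNIFIED_TERMS.items() if system == system_b}
    return terms_a & terms_b


if __name__ == "__main__":
    print(to_unified("Drupal", "node"))
    print(sorted(shared_terms("dotCMS", "Drupal")))
```

Even in a toy table like this, some primitives land on a shared term and others do not, which hints at how much negotiation a real unified model would require.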
Rationales for a Unified Content Model
Several rationales could drive the content management community toward a unified content model and its benefits:
Lingua Franca: As a community, we already communicate somewhat interchangeably about content primitives: notably page, block, section, template, etc.
Coalescence: There exist rough approximations of unified content models on the Web: block-built landing pages, product information pages, courseware, etc.
Norms: There exist both official and de facto frameworks and standards for many other aspects of Web content: HTML, CSS, RDF, WCAG, Material Design, Bootstrap, etc.
Linked Data: Schema.org, Dublin Core and other schemas contain content management-related elements that could support a unified content model: name, description, ImageObject, VideoObject, AudioObject, mainEntityOfPage, Title, Date, Type, Format, Relation, etc.
Semantic Web: There exist dedicated, industry-oriented ontologies that can be extended for specific domains, e.g. FIBO, IOF, SNOMED. Similarly, a content ontology could be blended into individual organizational domains.
Public Benefit: Many organizations share their design systems publicly. Likewise, members of the content management community could learn from each other’s public content models and consolidate their elements into a unified content model.
AI Optimization: A unified content model supporting well-structured and easily scrapeable content might better inform artificial intelligence technologies and improve information retrieval and quality.
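To make the Linked Data rationale above more concrete, here is a minimal sketch of one content item described with Schema.org and Dublin Core terms as a JSON-LD-style document. The function name, the choice of the Article type and all sample values are assumptions for illustration; only the vocabulary terms come from the published schemas.

```python
import json


def describe_article(title, summary, page_url, image_url, published):
    """Build a JSON-LD-style dict mixing Schema.org and Dublin Core terms."""
    return {
        "@context": {
            "@vocab": "https://schema.org/",          # default vocabulary
            "dcterms": "http://purl.org/dc/terms/",   # Dublin Core terms prefix
        },
        "@type": "Article",
        "name": title,
        "description": summary,
        "mainEntityOfPage": page_url,
        "image": {"@type": "ImageObject", "contentUrl": image_url},
        "dcterms:date": published,
        "dcterms:format": "text/html",
    }


if __name__ == "__main__":
    # Placeholder values; example.com URLs and the date are hypothetical.
    doc = describe_article(
        "Content Unimodel",
        "Notes from a breakout discussion on content modeling.",
        "https://example.com/content-unimodel",
        "https://example.com/cover.png",
        "2025-01-01",
    )
    print(json.dumps(doc, indent=2))
```

The point is not this particular shape but that existing vocabularies already cover several of the primitives a unified model would need.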
Assumptions Underlying a Unified Content Model
A team involved in realizing a unified content model would need to assert and challenge current assumptions, among them:
Value: The content management community would find value in adopting and participating in a unified content model.
Viability: A unified content model that grows large can be maintained against semantic drift and schema drift.
Feasibility: A unified content model can be effectively documented, shared, engineered and extended.
Usability: The content management community can easily understand and implement a unified content model.
Consolidation: A unified content model and its structured elements can transcend any single domain or organization.
Formalization: A unified content model can be formalized as a standards document, ontology, JSON representation, or other technical artifact.
Structured Content: The structure of content can be contained or reflected in:
existing content models, types and templates
existing data models and metadata fields
elements within content formats, e.g. HTML, JSON, XML
hyperlinks among content items
programmatic relationships such as aggregations and queries
technical implementations such as JCR and CMIS
taxonomy values applied to content
semantic relationships among content items
narrative types, as described by Deane Barker
Narrow Domains: Content models are still concentrated in and narrowly tailored to specific business units, organizations, industries and domains. See the Recipe content model.
Technical Isolation: Established content management technologies such as dotCMS, Drupal, Sitecore, TYPO3 and WordPress have their own foundational yet siloed content models.
Customization: Even when an organizational content model is built out from a foundational content model, it tends to stay within the same organization and system.
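The Formalization assumption above mentions a JSON representation as one possible artifact. As a rough sketch of what that could look like, the code below defines a content type as plain data and serializes it to JSON. The class names (FieldDef, ContentType), the datatype labels and the Recipe fields are all hypothetical; no standard or CMS defines them this way.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class FieldDef:
    """One field of a content type (hypothetical structure)."""
    name: str
    datatype: str
    required: bool = False


@dataclass
class ContentType:
    """A named content type with an ordered list of fields."""
    name: str
    fields: List[FieldDef] = field(default_factory=list)


def to_json(content_type: ContentType) -> str:
    """Serialize a content type definition to a JSON string."""
    return json.dumps(asdict(content_type), indent=2, sort_keys=True)


# Illustrative fields only; a real Recipe model would be richer.
recipe = ContentType(
    name="Recipe",
    fields=[
        FieldDef("title", "string", required=True),
        FieldDef("ingredients", "list"),
    ],
)

if __name__ == "__main__":
    print(to_json(recipe))
```

A definition kept as data like this can be shared, diffed and versioned independently of any one CMS, which is the property the Feasibility and Formalization assumptions depend on.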
Challenges Involved in a Unified Content Model
We anticipated the main challenges to shaping and propagating a unified content model, which present a steep barrier to adoption:
Presentational References: At the conceptual level, many elements in contemporary content models remain closely tied to design and display concerns and involve presentational references, for example: banner, carousel, drawer, panel, tile, etc.
Scalability: At the logical level, it can already be challenging to achieve agreement within a single organization regarding a content model. It will prove challenging to reach consensus across an entire content management community.
Implementation: At the physical level, each organization would nonetheless be responsible for implementing and extending a unified content model in its own data layer; the costs and complications might prove impractical.
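The Presentational References challenge above is one that an organization can at least audit for in its own model. Here is a minimal sketch that flags content type names containing the presentational terms listed above. The function name and the sample type names are hypothetical, and a name-based check like this is only a rough heuristic.

```python
import re

# Presentational terms taken from the examples in this post.
PRESENTATIONAL_TERMS = {"banner", "carousel", "drawer", "panel", "tile"}


def flag_presentational(type_names):
    """Return the type names that contain a presentational term as a whole word.

    Names are split on capital letters and on non-letter separators, so
    "HeroBanner" and "product_tile" are both broken into separate words.
    """
    flagged = []
    for name in type_names:
        words = {word.lower() for word in re.findall(r"[A-Za-z][a-z]*", name)}
        if words & PRESENTATIONAL_TERMS:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Hypothetical content type names for illustration.
    print(flag_presentational(["Article", "HeroBanner", "product_tile", "Event"]))
```

Whole-word matching keeps a semantic name such as "Panelist" from being flagged just because it contains "panel".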
Conclusions – The Elusive Content Unimodel
Our CMS Experts group discussion of standardization around a unified content model arrived at the sense of an idea ahead of its time:
Untested Hypothesis: A common, shared and unified content model across domains and organizations remains an intriguing but still academic and multivariate proposal.
Internal Misalignment: The most pressing present need is for any organization to first achieve alignment around its own internal content model.
External Divergence: There seems to be no current, compelling use case, groundswell of support, or practical path toward a unified content model.
AI-Driven Standardization: The community may well see future standardization around a unified content model that better conveys content and meaning to artificial intelligence technologies.
Reduced Observability: Meanwhile, creative and innovative content modeling happening inside organizations today often stays hidden from the wider content management community.
Looking to the future, our group shared a perception that, in the end, we might already be more alike than different among our scattered organizational content models.
The group supported sharing content models publicly for community observability. Viewed together, those models might one day shape a common, shared and unified content model.