Content Unimodel – Can We Standardize Content Modeling?

At the CMS Connect 25 conference in Montreal, I led a breakout group of CMS experts in a discussion of the topic, “Content Modeling for Structured Content: What Does it Mean to You and Can We Standardize?”

This post outlines our discussion and adds several takeaways of my own.

Proposed Benefits of a Unified Content Model

The group focused on themes of experimentation and opportunity:

Can the content management community establish a common, shared and unified content model across domains and organizations?

We considered the benefits of creating a unified content model across the content management domain: a “content unimodel”. The following hypotheses resulted:

  • Alignment & Standardization: The content management community uses general but sometimes unclear, overlapping or conflicting terms to describe content model primitives. A unified content model could clarify and formalize concepts and references.

  • Efficiency & Portability: Ragged and denormalized content models can generate spaghetti and sprawl even within the same organization. A unified content model could streamline migration and re-platforming by reducing the extract, transform, load (ETL) steps between content sets and systems.

  • Cohesion & Extension: The Web relies operationally on shared frameworks, patterns and standards. The community could employ and extend a unified content model alongside other Web conventions such as design systems, front-end libraries and linked data.

To illustrate the current situation: the dotCMS content model offers contentlets, containers and VTL files, while the Drupal content model offers nodes, blocks and views. An organization that builds its content model in one of these systems uses that system’s built-in, foundational model as the base for its organizational content model.

The proposition is that a unified content model could bridge or transcend these in-house models and systems. Content Management Interoperability Services (CMIS) and Java Content Repository (JCR) could be viewed as earlier, similar approaches.
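
As a minimal sketch of that bridging idea, the following Python maps the system-specific primitives named above onto a shared vocabulary. The unified terms (`content_item`, `layout_region`, `template`, `listing`) and the pairings themselves are assumptions made purely for illustration; neither dotCMS nor Drupal defines such a mapping, and a real effort would need to negotiate these equivalences.

```python
from __future__ import annotations

# Hypothetical unified vocabulary. Both the unified terms and the pairings
# are illustrative assumptions, not definitions taken from either system.
UNIFIED_TERMS = {
    ("dotcms", "contentlet"): "content_item",
    ("drupal", "node"): "content_item",
    ("dotcms", "container"): "layout_region",
    ("drupal", "block"): "layout_region",
    ("dotcms", "vtl_file"): "template",
    ("drupal", "view"): "listing",
}


def to_unified(system: str, primitive: str) -> str:
    """Translate a system-specific primitive into the assumed unified term."""
    key = (system.lower(), primitive.lower())
    if key not in UNIFIED_TERMS:
        raise KeyError(f"No unified term for {system}:{primitive}")
    return UNIFIED_TERMS[key]


def shared_terms(system_a: str, system_b: str) -> set[str]:
    """Return the unified terms that both systems can express."""
    terms_a = {term for (sys, _), term in UNIFIED_TERMS.items() if sys == system_a}
    terms_b = {term for (sys, _), term in UNIFIED_TERMS.items() if sys == system_b}
    return terms_a & terms_b


if __name__ == "__main__":
    print(to_unified("Drupal", "Node"))
    print(sorted(shared_terms("dotcms", "drupal")))
```

Under these assumed pairings, only the terms that both systems share would migrate without translation; everything else marks where a unified model would still need a bridge.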

Rationales for a Unified Content Model

Several rationales could drive the content management community toward a unified content model and its benefits:

  • Lingua Franca: As a community, we already communicate somewhat interchangeably about content primitives: notably page, block, section, template, etc.

  • Coalescence: There exist rough approximations of unified content models on the Web: block-built landing pages, product information pages, courseware, etc.

  • Norms: There exist both official and de facto frameworks and standards for many other aspects of Web content: HTML, CSS, RDF, WCAG, Material Design, Bootstrap, etc.

  • Linked Data: Schema.org, Dublin Core and other schemas contain content management-related elements that could support a unified content model: name, description, ImageObject, VideoObject, AudioObject, mainEntityOfPage, Title, Date, Type, Format, Relation, etc.

  • Semantic Web: There exist dedicated, industry-oriented ontologies that can be extended for specific domains, e.g. FIBO, IOF, SNOMED. Similarly, a content ontology could be blended into individual organizational domains.

  • Public Benefit: Many organizations share their design systems publicly. Likewise, members of the content management community could learn from one another’s public content models and consolidate their elements in a unified content model.

  • AI Optimization: A unified content model supporting well-structured and easily scrapeable content might better inform artificial intelligence technologies and improve information retrieval and quality.
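
To make the Linked Data rationale above more concrete, here is a small Python sketch that describes a content item as JSON-LD using Schema.org terms from that list (`name`, `description`, `mainEntityOfPage`, `ImageObject`). The `to_json_ld` helper, the choice of the `Article` type and the example.com URLs are assumptions for illustration only.

```python
import json


def to_json_ld(title: str, summary: str, page_url: str, image_url: str) -> dict:
    """Describe a content item with Schema.org vocabulary (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",  # assumed type; any suitable CreativeWork subtype could apply
        "name": title,
        "description": summary,
        "mainEntityOfPage": page_url,
        "image": {"@type": "ImageObject", "contentUrl": image_url},
    }


# Placeholder values; the URLs are not real resources.
doc = to_json_ld(
    title="Content Unimodel",
    summary="Can we standardize content modeling?",
    page_url="https://example.com/content-unimodel",
    image_url="https://example.com/images/stencil.jpg",
)

if __name__ == "__main__":
    print(json.dumps(doc, indent=2))
```

Because the property names come from a shared public schema rather than from one CMS, any consumer that understands Schema.org could read this item without knowing which system produced it.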

Assumptions Underlying a Unified Content Model

A team involved in realizing a unified content model would need to assert and challenge current assumptions, among them:

  • Value: The content management community would find value in adopting and participating in a unified content model.

  • Viability: A unified content model that grows large can be maintained against semantic drift and schema drift.

  • Feasibility: A unified content model can be effectively documented, shared, engineered and extended.

  • Usability: The content management community can easily understand and implement a unified content model.

  • Consolidation: A unified content model and its structured elements can transcend any single domain or organization.

  • Formalization: A unified content model can be formalized as a standards document, ontology, JSON representation, or other technical artifact.

  • Structured Content: The structure of content can be contained or reflected in:

    • existing content models, types and templates

    • existing data models and metadata fields

    • elements within content formats, e.g. HTML, JSON, XML

    • hyperlinks among content items

    • programmatic relationships such as aggregations and queries

    • technical implementations such as JCR and CMIS

    • taxonomy values applied to content

    • semantic relationships among content items

    • narrative types, as described by Deane Barker

    • unstructured content not yet examined

  • Narrow Domains: Content models are still concentrated in and narrowly tailored to specific business units, organizations, industries and domains. See the Recipe content model.

  • Technical Isolation: Established content management technologies such as dotCMS, Drupal, Sitecore, TYPO3 and WordPress have their own foundational yet siloed content models.

  • Customization: Even when an organizational content model is built out from a foundational content model, it tends to stay within the same organization and system.
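
The Formalization assumption above mentions a JSON representation as one possible artifact. The Python sketch below shows what a very small one might look like: a content type defined as data, serialized to JSON, and used to check an item for missing required fields. The class names, field kinds and the `article` type are all hypothetical and are not drawn from any existing standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class FieldDef:
    """One field in a content type (hypothetical structure)."""
    name: str
    kind: str  # assumed kinds, e.g. "text", "date", "reference"
    required: bool = False


@dataclass
class ContentType:
    """A content type expressed as data so it can be shared as JSON."""
    name: str
    fields: list = field(default_factory=list)

    def missing_required(self, item: dict) -> list:
        """List the required field names that the item does not supply."""
        return [f.name for f in self.fields if f.required and f.name not in item]


# Illustrative content type; the fields are assumptions for this sketch.
article = ContentType(
    name="article",
    fields=[
        FieldDef("title", "text", required=True),
        FieldDef("published", "date", required=True),
        FieldDef("related", "reference"),
    ],
)

if __name__ == "__main__":
    print(json.dumps(asdict(article), indent=2))
    print(article.missing_required({"title": "Hello"}))  # ['published']
```

Keeping the definition as plain data is a deliberate choice here: the same JSON could be published, compared with another organization’s model, or loaded into a different system.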

Challenges Involved in a Unified Content Model

We anticipated the main challenges to shaping and propagating a unified content model, which present a steep barrier to adoption:

  • Presentational References: At the conceptual level, many elements in contemporary content models remain closely tied to design and display concerns and involve presentational references, for example: banner, carousel, drawer, panel, tile, etc.

  • Scalability: At the logical level, it can already be challenging to reach agreement on a content model within a single organization. Reaching consensus across the entire content management community would likely prove harder still.

  • Implementation: At the physical level, each organization would nonetheless be responsible for implementing and extending a unified content model in its own data layer; the costs and complications might prove impractical.

Conclusions – The Elusive Content Unimodel

Our CMS Experts group discussion of standardization around a unified content model arrived at the sense of an idea that has come too early:

  • Untested Hypothesis: A common, shared and unified content model across domains and organizations remains an intriguing but still academic and multivariate proposal.

  • Internal Misalignment: The most pressing present need is for any organization to first achieve alignment around its own internal content model.

  • External Divergence: There seems to be no current, compelling use case, groundswell of support, or practical path toward a unified content model.

  • AI-Driven Standardization: The community may well see future standardization around a unified content model that better conveys content and meaning to artificial intelligence technologies.

  • Reduced Observability: Meanwhile, creative and innovative content modeling happening inside organizations today often stays hidden from the wider content management community.

Looking to the future, our group shared a perception that, in the end, our scattered organizational content models might already be more alike than different.

The group supported sharing content models publicly for community observability. Viewed together, those models might one day shape a common, shared and unified content model.

