Ontology Engineering Tradecraft: This Is Ours, Own It

Ontology engineering has been around long enough that we should no longer be satisfied with vague answers about what it is.

We should not say, “Ontology engineers use OWL.” Lots of people use OWL. We should not say, “Ontology engineers use the Semantic Web stack.” So do other fields. We should not retreat into the familiar claim that ontology engineers are concerned with universals, or types, or formal representations of reality. We are not philosophers.

The question before our community is this:

What distinguishes ontology engineering from nearby disciplines, where one finds, for example, taxonomies, concept maps, data models, database management, coding, and machine learning?

If we cannot answer that question, we do not have a distinct discipline. A discipline requires shared methods, shared success conditions, shared training, and shared standards of evaluation. It requires a way to say, with some seriousness: this is what counts as good work, this is what counts as bad work, and this is why.

Here is the proposal:

Ontology engineering is the discipline concerned with constructing machine-interpretable artifacts designed to systematically disambiguate information, improve information quality, and facilitate information interoperability.

The Promise and the Confusion

Ontologies are formally well-defined, machine-interpretable controlled vocabularies designed to represent entities and the logical relations among them. They are designed to make explicit the meanings buried in datasets. They provide a semantic layer across information silos. They allow people, machines, and systems to coordinate around shared representations of the world.

This promise is often muddled because ontology engineers conflate different kinds of interoperability. We talk about “interoperability” as if it were one thing, but there are at least three axes:

  1. Human-human interoperability
  2. Human-machine interoperability
  3. Machine-machine interoperability

These axes are related, but they are not the same. They have different success conditions, and ontology engineers routinely confuse them.

Human-Human Interoperability

Human-human interoperability concerns whether people understand one another. It is the work of aligning meanings across stakeholders, communities, organizations, and domains.

This is where we elicit knowledge from experts, formulate competency questions, clarify key terms, and build consensus.

Ask three economists to define “GDP” and you may receive three answers. Ask three intelligence analysts to define “threat,” “risk,” “actor,” “source,” “capability,” or “event,” and watch the same thing happen.

Human-human interoperability is not trivial. It is not “just definitions.” It is the work of making disagreement visible enough that it can be addressed.

Ontology engineers operate on this axis when they extract key terms from domain experts, evaluate how those experts use language, and identify the points at which apparent agreement conceals real ambiguity.

But ontology engineers are not alone here. Dictionaries, thesauri, taxonomies, Wikipedia pages, standards documents, training materials, and institutional practices all operate, in different ways, along the human-human axis.

So this cannot be what uniquely distinguishes ontology engineering.

Human-Machine Interoperability

Human-machine interoperability concerns whether computational systems can interpret and act upon human-understandable descriptions of the world.

Here we translate expert knowledge into formal representations that machines can reason over. We use OWL, first-order logic, RDF, SPARQL, SHACL, reasoning engines, graph databases, and related tools.

This is the point at which a definition stops being a sentence in a document and starts becoming part of a computational artifact.

Ontology engineers operate on this axis when they formalize domain knowledge into machine-readable languages, build logical models, create class hierarchies, specify relations, add axioms, and test whether machines can infer what humans intended them to infer.
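
The last step, testing whether a machine infers what humans intended, can be sketched in a few lines. This is a minimal illustration in plain Python rather than OWL and a real reasoner; the class names, the individual "nile", and the helper functions are all hypothetical and chosen only for the example.

```python
# Toy knowledge base of (subject, predicate, object) triples.
# All class and individual names are hypothetical.
TRIPLES = {
    ("River", "subClassOf", "BodyOfWater"),
    ("BodyOfWater", "subClassOf", "MaterialEntity"),
    ("nile", "type", "River"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def inferred_types(individual, triples):
    """Return the asserted types of an individual plus all inherited ones."""
    asserted = {o for s, p, o in triples if s == individual and p == "type"}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(cls, triples)
    return inferred


# The competency question "Is the Nile a material entity?" is answered by
# inference, since that type is never asserted directly.
nile_types = inferred_types("nile", TRIPLES)
```

Here "MaterialEntity" appears among the inferred types even though no one asserted it of the Nile. A production ontology would delegate this to an OWL reasoner; the sketch only shows the kind of check an engineer runs against expert expectations.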

Progress here is evaluated by faithfulness of representation, human-machine integration, and the extent to which machines can extract useful explicit and implicit information from human-defined structures.

But again, ontology engineers are not alone here. Logicians, knowledge representation researchers, data modelers, software engineers, and AI researchers also work on this axis.

Machine-Machine Interoperability

Machine-machine interoperability concerns whether independent systems can exchange, interpret, and process data without humans in the middle.

This requires robust data structures, shared semantics, exchange protocols, APIs, query endpoints, and shared standards and conventions such as RDF, JSON-LD, OWL, REST, and SPARQL.

Success conditions include syntactic coherence, semantic alignment, computational feasibility, and the ability to query or reason across distinct datasets locally or over web protocols.

As should be obvious, ontology engineers are not alone here either; addressing interoperability challenges along this axis is not what makes our discipline unique.

Interoperability Is Not Enough

Ontology engineering is often sold as a solution to interoperability. Taking all three axes as within the scope of ontology engineering does not make the discipline unique. For one thing, it does not explain why one would want to address interoperability problems in the first place. For that, we turn to the fitness of information for its purpose, that is, information quality.

Interoperability and improving information quality must be pursued together.

Too much emphasis on interoperability leads to thin semantic artifacts that connect systems but lack the axiomatic richness needed to support serious validation, reasoning, and quality improvement.

Too much emphasis on information quality leads to isolated models that are locally impressive and globally useless.

We have all seen the patchwork quilts. An organization stitches together databases. Then it discovers inconsistencies. It stitches together more datasets to repair those inconsistencies. Then it re-checks the previous stitching. Then it updates the mappings. Then it prays the whole thing holds together long enough for the application to work.

Ontology engineering aims at addressing interoperability and information quality simultaneously. That means we need shared best practices: top-level ontologies, modularization, disciplined reuse, clear release strategies, explicit mappings, automated quality control, and governance mechanisms that treat ontology artifacts as serious infrastructure.

The Open Biological and Biomedical Ontology Foundry, the Industrial Ontologies Foundry, and related efforts exist because communities eventually learn the same lesson: if everyone builds their own conceptual universe from scratch, the result is not innovation. It is fragmentation with nicer tooling.

Each adopts a top-level ontology, which gives ontology engineers a shared starting point. It provides a common architecture for distinguishing material entities, processes, qualities, realizable entities, spatial regions, temporal regions, and information entities. This in turn provides a foundation for uncovering and avoiding subtle ambiguities...

Systematic Disambiguation

This leads to the final piece of our characterization of ontology engineering as a distinct discipline. At its heart is systematic disambiguation.

To disambiguate is not to find the one true interpretation of a phenomenon and declare victory. It is to identify and represent the logical space of relevant interpretations. It is to make distinctions explicit where data, language, systems, and human practices collapse them. For example:

  • Type versus instance: Algeria as a particular country versus country as a type.
  • Information versus what the information is about: an occupation code versus the person who holds an occupation.
  • Material versus immaterial entities: a river versus the site where a river used to flow.
  • Process versus product: ontology engineering versus the ontology produced through that engineering.

These distinctions sound simple; that is exactly why people overlook them. Let me say this more clearly.

These distinctions are so obvious you will forget to include them in your data.

Use a top-level ontology so you don't have to rely on some clever person to remember to disambiguate information from what it is about. Aim to disambiguate by design for the same reason. We should release ourselves from reliance on genius; we should democratize access to meaningful inferences currently hidden in our data.
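
What "disambiguate by design" could look like in practice is sketched below, using the occupation-code example. The idea is that once top-level categories are declared disjoint, a routine check catches the conflation without anyone having to remember it. The category names, the record identifiers, and the helper functions are assumptions made for illustration; this is plain Python, not a real ontology toolchain.

```python
# Hypothetical top-level categories declared mutually exclusive.
DISJOINT = [("InformationEntity", "MaterialEntity")]

# Hypothetical placement of domain classes under top-level categories.
PARENT = {
    "OccupationCode": "InformationEntity",
    "Person": "MaterialEntity",
}

# Typing assertions as they might arrive from two source systems.
ASSERTIONS = [
    ("record_17", "OccupationCode"),
    ("record_17", "Person"),  # conflates the code with the person it is about
    ("jane", "Person"),
]


def top_level(cls):
    """Walk up PARENT until reaching a class with no recorded parent."""
    while cls in PARENT:
        cls = PARENT[cls]
    return cls


def find_conflations(assertions):
    """Return sorted ids of individuals typed under two disjoint categories."""
    categories = {}
    for individual, cls in assertions:
        categories.setdefault(individual, set()).add(top_level(cls))
    return sorted(
        individual
        for individual, cats in categories.items()
        if any(a in cats and b in cats for a, b in DISJOINT)
    )


conflated = find_conflations(ASSERTIONS)
```

The check flags "record_17" because it is typed as both a piece of information and the thing that information is about. No domain expert had to notice this; the disjointness inherited from the top level did the work.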

This Is Ours

There is no other discipline focused on systematically disambiguating data in this manner. It is what we do and what we do best.

Own it.

Systematic disambiguation is what allows ontology engineering to address both interoperability and information quality. It allows us to build representations that are clear enough for humans, formal enough for machines, and stable enough for systems to exchange and reuse.

For example, a complete representation of any serious use case will eventually touch every major region of Basic Formal Ontology (BFO). It will involve material entities, processes, qualities, realizable entities, temporal regions, spatial regions, and information entities.

You cannot address deep interoperability challenges if you only model one sliver of reality and pretend the rest does not matter.

But this does not mean every application needs the full graph.

The Full Semantic Cloth

The semantic layer and the operational layer are not the same thing.

The full semantic representation is the cloth; the operational release is the cut.

Different applications need different subsets, different views, different modules, and different interfaces.

You should cut your operational release from the whole semantic cloth.

Do not confuse the fact that users need simplicity with the idea that the underlying representation should be simplistic. Users need useful views. Machines need coherent structure. Organizations need release strategies. Ontology engineers need to understand all three.
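
Cutting a release from the cloth can be pictured as selecting a subgraph. The sketch below assumes the full representation is available as a list of triples and that a release is defined by a set of terms an application needs; the graph contents, the term set, and the function name cut_release are hypothetical.

```python
# Hypothetical full semantic graph as (subject, predicate, object) triples.
FULL_GRAPH = [
    ("Pump", "subClassOf", "MaterialEntity"),
    ("PumpingProcess", "subClassOf", "Process"),
    ("Pump", "participatesIn", "PumpingProcess"),
    ("FlowRate", "subClassOf", "Quality"),
    ("MaintenanceLog", "subClassOf", "InformationEntity"),
    ("MaintenanceLog", "isAbout", "Pump"),
]

# Terms a hypothetical maintenance application cares about.
MAINTENANCE_TERMS = {
    "Pump",
    "MaintenanceLog",
    "MaterialEntity",
    "InformationEntity",
}


def cut_release(graph, terms):
    """Keep only triples whose subject and object both belong to the release.

    The full graph is left untouched; the release is a new list.
    """
    return [t for t in graph if t[0] in terms and t[2] in terms]


release = cut_release(FULL_GRAPH, MAINTENANCE_TERMS)
```

The release keeps three of the six triples and drops everything about processes and qualities, while the full graph remains the single source from which other applications can take different cuts. Real module extraction is more involved than a term filter, since it has to preserve the entailments the application relies on, so treat this as an illustration of the relationship between cloth and cut, not a method.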

Toward a Discipline

The new era of ontology engineering will be built by method.

But it will only become a discipline if we insist on the methods that distinguish it.

So let us stop hiding behind tool names. Let us stop defining ontology engineering by whatever syntax we happen to use this year. Let us stop pretending that a taxonomy becomes an ontology because someone exported it in RDF.

Ontology engineering is not taxonomizing. It is not concept mapping. It is not data modeling. It is not coding. It is not database management. It is not machine learning. It may use all of these. It may collaborate with all of these.

But it is not reducible to any of them.

Ontology engineering is the systematic construction of machine-interpretable artifacts that disambiguate information, improve information quality, and facilitate interoperability.

That is the tradecraft.

And if we are serious about the future of the field, that is where we begin.