Organisations that have invested in MedDRA, SNOMED CT, or carefully curated internal controlled vocabularies often assume they are already well positioned for AI applications. The vocabulary is there; surely the AI can use it. The assumption is understandable, but it is wrong, and seeing why requires clarity about what controlled vocabularies provide and what they do not.

What Controlled Vocabularies Provide

A controlled vocabulary provides a standardised set of terms for describing things: preferred labels, synonyms, and, in the better ones, hierarchical relationships. MedDRA tells you that "myocardial infarction" is preferred over "heart attack", and that it belongs to the Cardiac disorders system organ class. SNOMED CT tells you that myocardial infarction is a type of ischaemic heart disease and provides a code that any system can reference. These are genuinely valuable: they ensure that data entered using different terms can be aggregated and compared.
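As a rough illustration of what that amounts to in data terms, the sketch below models a controlled vocabulary as preferred labels, synonyms, and a single parent link. It is a simplified assumption, not the actual MedDRA or SNOMED CT data model: the `EX:` codes are placeholders rather than real identifiers, the hierarchy is flattened to one level, and the names `Term`, `normalise`, and `ancestors` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Term:
    code: str                       # placeholder ID, not a real MedDRA/SNOMED CT code
    preferred: str                  # preferred label
    synonyms: Tuple[str, ...] = ()  # alternative labels that map to the same concept
    parent: Optional[str] = None    # code of the broader term, if any


# Tiny illustrative vocabulary using the example from the text.
VOCAB: Dict[str, Term] = {
    "EX:001": Term("EX:001", "Cardiac disorders"),
    "EX:002": Term("EX:002", "Myocardial infarction",
                   synonyms=("heart attack",), parent="EX:001"),
}


def build_index(vocab: Dict[str, Term]) -> Dict[str, str]:
    """Map every label (preferred or synonym), case-folded, to its code."""
    index: Dict[str, str] = {}
    for term in vocab.values():
        for label in (term.preferred, *term.synonyms):
            index[label.casefold()] = term.code
    return index


INDEX = build_index(VOCAB)


def normalise(text: str) -> Optional[Term]:
    """Return the term whose label matches the input, or None if unknown."""
    code = INDEX.get(text.strip().casefold())
    return VOCAB[code] if code is not None else None


def ancestors(code: str) -> List[str]:
    """Walk the parent links and return the preferred labels of broader terms."""
    labels: List[str] = []
    parent = VOCAB[code].parent
    while parent is not None:
        labels.append(VOCAB[parent].preferred)
        parent = VOCAB[parent].parent
    return labels
```

Everything this structure can answer is a naming or grouping question: which label is preferred, and which broader category a term rolls up to. That is what makes aggregation possible, and it is also the limit of what the vocabulary knows.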

What AI Applications Actually Need

A large language model or retrieval-augmented generation pipeline needs more than a list of standardised terms. It needs to know not just what things are called, but what they are: their properties, their causal relationships to other concepts, the conditions under which they apply, and the logical constraints that make certain combinations of facts impossible. Without this relational context, an AI system making clinical or pharmacological judgements is pattern-matching over text. In regulated domains, pattern-matching over text produces answers that are approximately right in the common case and dangerously wrong in the edge cases that actually matter.
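To make "logical constraints" concrete, here is a minimal sketch of a check that flags combinations of facts that should not hold together. Everything in it is an assumption for illustration: `DrugA` and `ConditionB` are invented names, the `FORBIDDEN` rule is not a real clinical rule, and a production system would more likely express such constraints in an ontology language or rule engine than in a Python list.

```python
from typing import List, Set, Tuple

# A fact is a (subject, predicate, object) triple.
Fact = Tuple[str, str, str]

# Hypothetical constraint set: each entry is a pair of facts that must not
# both be asserted. The names are placeholders, not real clinical content.
FORBIDDEN: List[Tuple[Fact, Fact]] = [
    (("patient", "takes", "DrugA"),
     ("patient", "has_condition", "ConditionB")),
]


def violations(facts: Set[Fact]) -> List[Tuple[Fact, Fact]]:
    """Return every forbidden pair for which both facts are present."""
    return [(a, b) for a, b in FORBIDDEN if a in facts and b in facts]


record: Set[Fact] = {
    ("patient", "takes", "DrugA"),
    ("patient", "has_condition", "ConditionB"),
}
conflicts = violations(record)
```

A term list has no place to store a rule like this. It can name the drug and the condition, but it cannot say that the two together are a problem.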

The Knowledge Graph Gap

The gap between a controlled vocabulary and a knowledge graph is precisely where most AI applications in pharmaceutical and clinical settings fail. A knowledge graph adds the relational layer that controlled vocabularies lack: drug-indication relationships, mechanism-of-action links, contraindication structures, clinical trial eligibility criteria expressed as logical conditions, and adverse event patterns expressed as probability-weighted chains of relationships. Grounding AI outputs in this relational layer transforms a general-purpose language model into a domain-reliable tool. Building that layer on top of existing controlled vocabularies, rather than replacing them, is the most pragmatic path for organisations that have already invested in vocabulary governance.
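One way to picture "building on top of" a vocabulary is to keep the vocabulary's codes as node identifiers and add typed edges between them. The sketch below does that and then turns the edges for one concept into plain-text facts that a retrieval step could pass to a language model. It is a sketch under stated assumptions: the `EX:` codes, the labels, the predicate names, and the `grounding_facts` helper are all hypothetical, and a real deployment would typically use a graph database or RDF store rather than in-memory lists.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# Labels come from the existing controlled vocabulary (placeholder codes and names).
LABELS: Dict[str, str] = {
    "EX:D1": "DrugA",
    "EX:C1": "ConditionB",
    "EX:C2": "ConditionC",
    "EX:M1": "MechanismX",
}

# The relational layer: typed edges between vocabulary codes (illustrative only).
TRIPLES: List[Tuple[str, str, str]] = [
    ("EX:D1", "indicated_for", "EX:C2"),
    ("EX:D1", "contraindicated_in", "EX:C1"),
    ("EX:D1", "has_mechanism", "EX:M1"),
]

Graph = DefaultDict[str, List[Tuple[str, str]]]


def build_graph(triples: List[Tuple[str, str, str]]) -> Graph:
    """Index triples by subject so outgoing edges can be looked up by code."""
    graph: Graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def grounding_facts(graph: Graph, code: str) -> List[str]:
    """Render the outgoing edges of one concept as short sentences for a prompt."""
    return [
        f"{LABELS[code]} {predicate.replace('_', ' ')} {LABELS[obj]}"
        for predicate, obj in graph.get(code, [])
    ]


graph = build_graph(TRIPLES)
facts = grounding_facts(graph, "EX:D1")
```

Because the edges reference the same codes the vocabulary already governs, the existing terminology work is reused rather than replaced; the graph only adds what the vocabulary could not express.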

Why Controlled Vocabularies Alone Are Not Enough for Modern AI
