The translation gap between preclinical and clinical drug development is one of the most persistent and costly problems in pharmaceutical research. Compounds that demonstrate efficacy in preclinical models fail in clinical trials far more often than they succeed. While some of this failure reflects genuine biological differences between animal models and human disease, a significant portion reflects knowledge management failures: preclinical and clinical evidence is stored in systems that use incompatible ontologies, making systematic translational comparison difficult or impossible.

The Ontological Alignment Problem

Preclinical data is generated using concepts from animal biology: mouse gene nomenclature, rodent phenotype ontologies, pharmacokinetic parameters measured in rat or dog models, and histopathological findings described using veterinary pathology terminology. Clinical data uses human gene identifiers, human disease ontologies, clinical endpoint definitions, and adverse event coding in MedDRA (the Medical Dictionary for Regulatory Activities). These are not merely different vocabularies for the same concepts: some preclinical concepts have no direct human equivalent, and some clinical concepts have no standard preclinical representation. An ontological alignment layer maps each preclinical concept to its most appropriate human equivalent, documents the mapping relationship type and confidence, and flags cases where no reliable cross-species translation exists.
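The three responsibilities of the alignment layer described above (map, document the relationship type and confidence, flag the untranslatable) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the four relationship types, the 0 to 1 confidence scale, and the `EX_ANIMAL:` / `EX_HUMAN:` identifiers, which are placeholders rather than real ontology terms.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class MappingType(Enum):
    """Hypothetical relationship types; a real layer may need a richer set."""
    EXACT = "exact"                 # same concept in both species
    CLOSE = "close"                 # near-equivalent, usable with caveats
    BROAD = "broad"                 # human concept is more general
    NO_EQUIVALENT = "no_equivalent" # no reliable cross-species translation


@dataclass(frozen=True)
class ConceptMapping:
    preclinical_id: str
    human_id: Optional[str]         # None when no human equivalent exists
    mapping_type: MappingType
    confidence: float               # assumed curator-assigned score in [0, 1]
    rationale: str = ""

    @property
    def is_translatable(self) -> bool:
        return (self.human_id is not None
                and self.mapping_type is not MappingType.NO_EQUIVALENT)


class AlignmentLayer:
    """In-memory stand-in for the alignment layer of a knowledge graph."""

    def __init__(self) -> None:
        self._by_preclinical: Dict[str, List[ConceptMapping]] = {}

    def add(self, mapping: ConceptMapping) -> None:
        self._by_preclinical.setdefault(mapping.preclinical_id, []).append(mapping)

    def best_human_equivalent(self, preclinical_id: str,
                              min_confidence: float = 0.5) -> Optional[ConceptMapping]:
        """Return the highest-confidence usable mapping, or None."""
        candidates = [
            m for m in self._by_preclinical.get(preclinical_id, [])
            if m.is_translatable and m.confidence >= min_confidence
        ]
        return max(candidates, key=lambda m: m.confidence, default=None)

    def untranslatable(self) -> List[str]:
        """Preclinical concepts for which no usable human mapping is recorded."""
        return sorted(
            pid for pid, mappings in self._by_preclinical.items()
            if not any(m.is_translatable for m in mappings)
        )


# Usage with placeholder identifiers (not real ontology terms).
layer = AlignmentLayer()
layer.add(ConceptMapping("EX_ANIMAL:0001", "EX_HUMAN:0001", MappingType.CLOSE, 0.6,
                         "model reproduces part of the human phenotype"))
layer.add(ConceptMapping("EX_ANIMAL:0002", None, MappingType.NO_EQUIVALENT, 1.0,
                         "rodent-specific finding"))
print(layer.best_human_equivalent("EX_ANIMAL:0001"))
print(layer.untranslatable())
```

Recording "no equivalent" as an explicit mapping, rather than simply omitting the concept, is a deliberate choice in this sketch: it lets a query distinguish a concept that has been reviewed and found untranslatable from one that has not yet been curated.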

Translational Knowledge Queries

With a cross-species ontological alignment in place, translational queries become tractable. Programme scientists can ask: "In which preclinical studies did compound X show efficacy against a disease model aligned with indication Y, and what was the pharmacokinetic-pharmacodynamic (PK-PD) relationship at the efficacious dose?" Then: "What human exposure level would correspond to the efficacious animal exposure, based on established allometric scaling and protein binding parameters?" And: "In the human phase I study, was the projected exposure level achieved, and if not, is the dose range for phase II sufficient to cover the predicted efficacious exposure?" Each of these questions requires connecting preclinical pharmacology data, cross-species pharmacokinetic data, and clinical study data. An ontologically aligned knowledge graph makes that connection systematic.
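The second and third questions involve arithmetic that can be sketched in a few lines of Python. The sketch below is one simple approach, not a recommended method: it assumes that efficacy is driven by unbound AUC, that clearance scales with body weight to a fixed exponent of 0.75, and that units are consistent (AUC in mg·h/L, clearance in L/h, dose in mg). All function names and input values are hypothetical; real projections would use programme-specific PK models.

```python
def _check_fraction(name: str, value: float) -> None:
    if not 0.0 < value <= 1.0:
        raise ValueError(f"{name} must be in (0, 1], got {value}")


def human_target_total_auc(animal_total_auc: float,
                           fu_animal: float,
                           fu_human: float) -> float:
    """Total human AUC that matches the animal's unbound AUC.

    Assumes unbound exposure drives efficacy:
    AUC_human * fu_human = AUC_animal * fu_animal.
    """
    _check_fraction("fu_animal", fu_animal)
    _check_fraction("fu_human", fu_human)
    return animal_total_auc * fu_animal / fu_human


def scale_clearance(animal_cl: float,
                    animal_bw_kg: float,
                    human_bw_kg: float = 70.0,
                    exponent: float = 0.75) -> float:
    """Single-species allometric scaling of total clearance (L/h).

    The fixed exponent is an assumption, not a fitted value.
    """
    return animal_cl * (human_bw_kg / animal_bw_kg) ** exponent


def projected_human_dose(target_total_auc: float,
                         human_cl: float,
                         bioavailability: float = 1.0) -> float:
    """Dose (mg) expected to give the target AUC: dose = AUC * CL / F."""
    _check_fraction("bioavailability", bioavailability)
    return target_total_auc * human_cl / bioavailability


def exposure_achieved(observed_human_aucs, target_total_auc: float) -> bool:
    """True if any phase I cohort reached the projected efficacious AUC."""
    return any(auc >= target_total_auc for auc in observed_human_aucs)


# Illustrative inputs only; none of these values come from a real study.
target = human_target_total_auc(animal_total_auc=10.0, fu_animal=0.1, fu_human=0.05)
cl_human = scale_clearance(animal_cl=0.5, animal_bw_kg=0.25)
print("target human AUC (mg·h/L):", target)
print("scaled human clearance (L/h):", cl_human)
print("projected dose (mg):", projected_human_dose(target, cl_human, bioavailability=0.5))
print("reached in phase I:", exposure_achieved([5.0, 12.0, 21.0], target))
```

In a knowledge graph setting, the inputs to each function would come from different sources (preclinical PK studies, protein binding assays, the phase I report), which is why the ontological alignment is a precondition for running the calculation systematically.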

Improving Translational Prediction

Organisations that maintain a well-curated translational knowledge graph accumulate institutional learning about which preclinical models and endpoints are reliably predictive of clinical outcome in their therapeutic areas. This learning — "in our oncology portfolio, preclinical tumour growth inhibition at this exposure level has historically predicted clinical disease control rate with this reliability" — is currently embedded in the heads of experienced scientists and is lost when they leave. Capturing it formally in the knowledge graph, as qualified translational assertions with evidence provenance, transforms it into institutional knowledge that persists across personnel changes and informs future programme decisions.
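A qualified translational assertion of the kind quoted above could be represented roughly as follows. This is a hypothetical Python sketch: the field names, the use of a simple hit rate as the reliability measure, and the rule that an assertion needs both preclinical and clinical evidence are all assumptions, and the example values are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Evidence:
    source_id: str    # hypothetical study or report identifier
    kind: str         # "preclinical" or "clinical"
    recorded_by: str  # curator, so provenance survives personnel changes


@dataclass
class TranslationalAssertion:
    therapeutic_area: str
    preclinical_endpoint: str
    clinical_endpoint: str
    qualifier: str                  # conditions under which the claim holds
    programmes_evaluated: int
    programmes_predicted_correctly: int
    evidence: List[Evidence] = field(default_factory=list)

    def hit_rate(self) -> Optional[float]:
        """Fraction of past programmes where the prediction held, if any."""
        if self.programmes_evaluated == 0:
            return None
        return self.programmes_predicted_correctly / self.programmes_evaluated

    def has_full_provenance(self) -> bool:
        """Assumed rule: needs at least one preclinical and one clinical source."""
        kinds = {e.kind for e in self.evidence}
        return {"preclinical", "clinical"} <= kinds


# Invented example values, for illustration only.
assertion = TranslationalAssertion(
    therapeutic_area="oncology",
    preclinical_endpoint="tumour growth inhibition",
    clinical_endpoint="disease control rate",
    qualifier="at or above the efficacious exposure level",
    programmes_evaluated=4,
    programmes_predicted_correctly=3,
    evidence=[
        Evidence("EX-STUDY-001", "preclinical", "curator_a"),
        Evidence("EX-TRIAL-001", "clinical", "curator_b"),
    ],
)
print(assertion.hit_rate(), assertion.has_full_provenance())
```

Storing the counts rather than a bare reliability figure keeps the assertion auditable: a later reader can see how many programmes the claim rests on and trace each one through its evidence records.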