Grounding is the technical mechanism by which AI outputs are linked to explicit, verifiable knowledge representations. It ensures that a generated claim about a drug's safety profile can be traced to a specific assertion in a curated knowledge graph, rather than to a statistical pattern in a training corpus. Several grounding approaches exist, and the right choice depends on the precision-recall requirements, infrastructure constraints, and regulatory defensibility standards of the specific application.
Pre-retrieval Grounding
In pre-retrieval grounding, the knowledge graph is queried before the language model generates its response. The query results are placed in the prompt context, steering the model towards responses that are consistent with the retrieved knowledge. This is the pattern underlying retrieval-augmented generation (RAG), and it is the most common approach for knowledge-intensive question answering. Its limitation is that the response can only be as good as the retrieval: if the knowledge graph query returns incomplete or irrelevant results, the model's response will reflect those gaps.
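The retrieve-then-prompt flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the in-memory triple list stands in for a real graph store, the names Triple, retrieve, and build_grounded_prompt are hypothetical, and the call to the language model itself is left out.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Assumption: a toy in-memory graph with illustrative facts.
# A real system would query a graph database or SPARQL endpoint instead.
KG = [
    Triple("warfarin", "interacts_with", "aspirin"),
    Triple("warfarin", "has_adverse_effect", "bleeding"),
    Triple("metformin", "has_adverse_effect", "lactic acidosis"),
]


def retrieve(entity: str, kg: list[Triple]) -> list[Triple]:
    """Return every triple that mentions the entity as subject or object."""
    return [t for t in kg if entity in (t.subject, t.obj)]


def build_grounded_prompt(question: str, entity: str, kg: list[Triple]) -> str:
    """Query the graph first, then place the results in the prompt context."""
    facts = retrieve(entity, kg)
    if facts:
        fact_lines = "\n".join(f"- {t.subject} {t.predicate} {t.obj}" for t in facts)
    else:
        # Surface the retrieval gap explicitly instead of hiding it.
        fact_lines = "(no facts retrieved)"
    return (
        "Answer using only the facts below. If they are insufficient, say so.\n"
        f"Facts:\n{fact_lines}\n"
        f"Question: {question}"
    )


prompt = build_grounded_prompt("What are the risks of warfarin?", "warfarin", KG)
print(prompt)
```

The empty-retrieval branch makes the stated limitation visible: when the query returns nothing, the prompt says so, and the model has no grounded facts to work from.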
Post-hoc Verification
In post-hoc verification, the language model generates a response without ontological constraints, and a verification step then checks each factual claim in the response against the knowledge graph. Claims that cannot be verified are flagged, qualified, or removed. This approach is more computationally expensive than pre-retrieval grounding, because every claim must be extracted and checked after generation, but it handles cases where the knowledge relevant to the response is not known in advance. It is particularly useful for open-ended analysis tasks where the model must synthesise evidence from multiple knowledge domains.
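The verification step can be illustrated with a small Python sketch. It assumes that claim extraction has already turned the model's free-text response into (subject, predicate, object) triples, which is the hard part in practice and is not shown here. The function names verify_claims and annotate, the toy graph, and the "[UNVERIFIED]" marker are all hypothetical.

```python
Claim = tuple[str, str, str]  # (subject, predicate, object)

# Assumption: a toy knowledge graph held as a set of triples.
KG: set[Claim] = {
    ("warfarin", "interacts_with", "aspirin"),
    ("warfarin", "has_adverse_effect", "bleeding"),
}


def verify_claims(claims: list[Claim], kg: set[Claim]) -> list[tuple[Claim, bool]]:
    """Pair each extracted claim with whether the graph asserts it."""
    return [(claim, claim in kg) for claim in claims]


def annotate(checked: list[tuple[Claim, bool]]) -> list[str]:
    """Render claims as text, flagging those the graph could not confirm."""
    lines = []
    for (subject, predicate, obj), verified in checked:
        text = f"{subject} {predicate} {obj}"
        lines.append(text if verified else f"{text} [UNVERIFIED]")
    return lines


# Hypothetical claims extracted from an unconstrained model response.
model_claims: list[Claim] = [
    ("warfarin", "interacts_with", "aspirin"),
    ("warfarin", "has_adverse_effect", "hair loss"),
]

checked = verify_claims(model_claims, KG)
report = annotate(checked)
print("\n".join(report))
```

In this sketch a claim that is absent from the graph is marked as unverified rather than false. Whether such claims are then flagged, qualified, or removed is a policy decision for the application.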
Structured Output Grounding
For applications that need machine-readable output rather than natural language prose, structured output grounding has the model produce its response as a set of typed assertions in a predefined schema, each linked to a knowledge graph concept identifier. The model is constrained to use concept identifiers from the target ontology for every entity it references, which makes the output directly machine-processable and its factual claims directly verifiable. This approach is well suited to data extraction, entity normalisation, and knowledge graph population tasks, where the output must be validated before entering a curated knowledge store.
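A minimal Python sketch of the validation side of this approach follows. It assumes the model has been instructed to return a JSON array of objects with subject_id, predicate, and object_id fields. The "EX:" identifiers, the predicate vocabulary, and the names Assertion and parse_assertions are hypothetical placeholders and are not taken from any real ontology or library.

```python
import json
from dataclasses import dataclass

# Assumption: a toy ontology keyed by made-up concept identifiers.
ONTOLOGY_IDS = {"EX:0001": "warfarin", "EX:0002": "aspirin", "EX:0003": "bleeding"}
ALLOWED_PREDICATES = {"interacts_with", "has_adverse_effect"}


@dataclass(frozen=True)
class Assertion:
    subject_id: str
    predicate: str
    object_id: str


def parse_assertions(raw_json: str) -> tuple[list[Assertion], list[str]]:
    """Split model output into schema-valid assertions and error messages."""
    accepted: list[Assertion] = []
    errors: list[str] = []
    for i, item in enumerate(json.loads(raw_json)):
        try:
            fields = (item["subject_id"], item["predicate"], item["object_id"])
        except (KeyError, TypeError):
            errors.append(f"item {i}: missing required field")
            continue
        if not all(isinstance(f, str) for f in fields):
            errors.append(f"item {i}: fields must be strings")
            continue
        subject_id, predicate, object_id = fields
        unknown = [c for c in (subject_id, object_id) if c not in ONTOLOGY_IDS]
        if unknown:
            errors.append(f"item {i}: unknown concept id(s) {unknown}")
        elif predicate not in ALLOWED_PREDICATES:
            errors.append(f"item {i}: unknown predicate {predicate!r}")
        else:
            accepted.append(Assertion(subject_id, predicate, object_id))
    return accepted, errors


# Hypothetical model output: one valid assertion and one with an unknown identifier.
raw = json.dumps([
    {"subject_id": "EX:0001", "predicate": "interacts_with", "object_id": "EX:0002"},
    {"subject_id": "EX:0001", "predicate": "has_adverse_effect", "object_id": "EX:9999"},
])
accepted, errors = parse_assertions(raw)
print(accepted)
print(errors)
```

Only the accepted assertions would be passed on to the curated knowledge store; the rejected items can be logged or routed for review. Checking that an accepted assertion is also true according to the graph is a separate step that this sketch does not perform.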