Biomedical Concepts
The notion of biomedical concepts has been discussed within the CDISC community for almost 10 years now, with the most recent prominent appearance occurring during this April’s CDISC 360 project wrap-up webinar. In this post we will examine more closely what biomedical concepts are (or intend to be), what their use can possibly achieve, and what obstacles stand in the way of achieving the benefits they promise.
Let me start by saying that building clinical data standards on biomedical concepts is a great idea and would be a great thing to have, but that in itself is insufficient to judge how to go about it. Having a vaccine is a great idea too, yet many factors determine how to go about it: how to understand the scientific method, how to develop it, how to test it, how to manufacture and distribute it at scale, how to price it, how to message it so the public develops sufficient trust, how to plan for long-term adoption, how to deal with mutations, and many other factors. Almost the same questions can be asked verbatim about the development and adoption of biomedical concepts.
So, what is a biomedical concept? In a nutshell, a biomedical concept identifies a discrete unit of knowledge in any of the biomedical information sciences. This is very broad on purpose, so let’s drill down a little deeper. To describe a unit of knowledge means that we are looking for conceptual definitions and models rather than how this information can be collected and stored in operational databases. In other words, we are focused on identifying concepts, providing good definitions of concepts, and creating models of the semantic relationships between these concepts. It’s all about the meaning of information and not the bits and bytes. There are several ways in which such knowledge organization systems (KOS) can be represented, e.g. as ontologies, metadata schemas, vocabularies, taxonomies, thesauri etc.
We can drill down further by narrowing our focus to using biomedical concepts specifically for clinical data standards. Within this scope, most of what we need is confined to “small” concepts and “localized” models, e.g. it is fairly straightforward to define a biomedical concept for Vital Signs Heart Rate Observation, identify its constituent parts (observation datetime, observation method, observation result, observation result unit etc.), and fill out the definitions and relationships of the parts. This is very good news: “small” and “localized” means that there is no need for a complex and large upfront model, hence biomedical concepts can be developed in an incremental fashion. The bad news is that there are many of them. I don’t have an exact number, but 30k to 50k is probably a good range for a combination of what CDISC standards cover today and what large sponsor extensions may additionally need.
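To make “small” and “localized” a bit more tangible, here is a minimal Python sketch of the heart rate example. The class names, the datatype labels, and the wording of the definitions are my own illustrative assumptions and do not come from any published CDISC model; only the concept name and its four parts are taken from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ConceptPart:
    """One constituent part of a biomedical concept (hypothetical structure)."""
    name: str
    definition: str   # illustrative wording, not an official definition
    datatype: str     # hypothetical type label, e.g. "datetime" or "code"


@dataclass
class BiomedicalConcept:
    """A small, localized concept: a name, a definition, and its parts."""
    name: str
    definition: str
    parts: List[ConceptPart] = field(default_factory=list)

    def part_names(self) -> List[str]:
        """Return the names of the constituent parts in declaration order."""
        return [part.name for part in self.parts]


# The example from the text; every definition string is a placeholder.
heart_rate = BiomedicalConcept(
    name="Vital Signs Heart Rate Observation",
    definition="An observation of the number of heartbeats per unit of time.",
    parts=[
        ConceptPart("observation datetime", "When the observation was made.", "datetime"),
        ConceptPart("observation method", "How the observation was obtained.", "code"),
        ConceptPart("observation result", "The observed value.", "quantity"),
        ConceptPart("observation result unit", "The unit of the observed value.", "code"),
    ],
)

print(heart_rate.part_names())
```

Nothing in this sketch depends on a large upfront model: a second concept would simply be another small object of the same shape, which is what allows incremental development.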
Up until 2016 the leading idea in CDISC was to take BRIDG as the KOS foundation, but I have always argued against that for several reasons, most importantly because of the model impedance mismatch (BRIDG is a domain model, not a KOS) and the fact that BRIDG is a very complex model that doesn’t sit well with “small” and “localized”. This is not to say that good ideas cannot be taken from BRIDG, but in my opinion it shouldn’t be used as the starting model out of the box. My own recommendation is to use the ISO 11179 Metadata Registry standard as the foundational layer, and it seems that CDISC 360 has started to adopt that idea. Let’s have a look at why there are good arguments to be made for such a decision.
A biomedical concept model for clinical standards would be a great knowledge resource, but in and of itself it would not be that great an incentive for sponsors to spend time and resources, first on developing systems to support it, and second on developing all the needed content. The reason is that some semantic information is already available from the CDISC Controlled Terminology (CT). Although the published CT files are quite limited compared to the original terminology maintained in the NCI Thesaurus (which is a proper KOS maintained as an OWL ontology), their content has been sufficient for sponsors to get by, at least for now. For biomedical concepts to be adopted, there needs to be sufficient incentive to do so. This brings us to a key issue of the current CDISC standards, beyond the lack of consistent conceptual standards.
For many years, and still today, CDISC standards for collection, tabulation, analysis, and exchange have been developed in isolation by different teams following their own versioning and timelines. It is almost a mirror image of the silo processes we find today in most biometrics organizations. As a result, these standards are disconnected and inconsistent. Biomedical concepts can be a driver for more coherent standards development: when standards are developed for a certain subject area, they should be published as one consistent package that covers concepts, collection, tabulation, analysis, and controlled terminology. In fact, such a package should contain even more, covering planning-related aspects (study design, eligibility, objectives, endpoints, activities) so that true end-to-end standards can be achieved from protocol to submission. This aspect has been recognized in TransCelerate’s DDF Hackathon challenge, where biomedical concepts act as the link between study endpoints and the study activities planned in the Schedule of Activities.
This brings us back to considering the underlying metadata model and how to integrate operational standards with conceptual standards. This is already baked into the ISO 11179 Metadata Registry model. Many organizations use a metadata registry (MDR) to manage clinical data standards. The Nurocor Clinical Platform (NCP) Metadata Registry (MDR) does exactly that based on the ISO 11179 standard. ISO 11179 also lays out how to organize operational and conceptual metadata, and how to connect those two layers, while leaving sufficient latitude for implementers to choose a specific type of KOS. This is the route we followed for the NCP Metadata Registry. We extended the MDR model in two directions beyond covering the well-known operational data standards: link the operational metamodel to a biomedical concept layer and extend clinical standards to cover study planning standards (study design, eligibility, objectives, endpoints, activities). By doing so, we have demonstrated the feasibility of implementing operational and conceptual clinical standards from protocol to submission based on the ISO 11179 model, and the use of those standards to drive digital clinical information flow in digital protocols and lean protocol processes.
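As a rough illustration of how the two layers connect, the following Python sketch uses the core ISO 11179 building blocks: a data element concept (an object class paired with a property), a value domain, and a data element that binds the two. Everything else is an assumption made for this example: the class layout, the `standard` and `variable` fields, the `elements_for` helper, and the sample mappings. It is not the NCP metamodel and not a complete rendering of ISO 11179.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataElementConcept:
    """Conceptual layer: an object class paired with a property."""
    object_class: str
    property_name: str


@dataclass(frozen=True)
class ValueDomain:
    """Representation: how values of a data element are expressed."""
    name: str
    datatype: str


@dataclass(frozen=True)
class DataElement:
    """Operational layer: a concept bound to a value domain in one standard."""
    standard: str   # assumed label for the operational standard
    variable: str   # assumed variable name within that standard
    concept: DataElementConcept
    value_domain: ValueDomain


def elements_for(concept: DataElementConcept,
                 registry: List[DataElement]) -> List[DataElement]:
    """Return every operational data element that realizes the given concept."""
    return [element for element in registry if element.concept == concept]


# Illustrative content only; the mappings are simplified for the example.
hr_result = DataElementConcept("Heart Rate Observation", "result")
hr_unit = DataElementConcept("Heart Rate Observation", "result unit")
text_value = ValueDomain("free text result", "string")
unit_codes = ValueDomain("unit code list", "code")

registry = [
    DataElement("CDASH", "VSORRES", hr_result, text_value),
    DataElement("SDTM", "VSORRES", hr_result, text_value),
    DataElement("SDTM", "VSORRESU", hr_unit, unit_codes),
]

for element in elements_for(hr_result, registry):
    print(element.standard, element.variable)
```

The point of the design is that the concept is defined once and the collection and tabulation variables merely point at it, so the question of which operational variables carry a given piece of meaning becomes a simple lookup.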
This all sounds great, except for one major caveat. Although CDISC has concluded the CDISC 360 project, there are currently no formal materials or white papers available about the actual metamodel for biomedical concepts, there is no published methodology for developing standards driven by such a model, nor is there any substantial clinical standards content available based on biomedical concepts. In other words, the biomedical concept in its current state is effectively an empty box. Nurocor’s NCP implementation makes the best of this situation by establishing a minimal set of assumptions for the conceptual part of the model, so that customers can implement clinical standards from protocol to submission either without biomedical concepts or with biomedical concepts as soon as they become available at scale (i.e., “building the box” into NCP now so that the box can be filled later on without introducing major rework and disruption). That way we protect existing clinical standards investments, but at the same time we are prepared to adopt biomedical concepts whenever they become available.