In this blog, Stuart Malcolm, our Head of Standards, Efficiency and Automation, discusses his 2021 PHUSE US Connect presentation, ‘Automation of ADaM using statistical data and metadata exchange (SDMX).’ Stuart considers the pharmaceutical crisis that motivated his talk.
The pharmaceutical industry has faced a longstanding crisis of declining productivity. Indeed, one study found that the number of new drugs approved per billion US dollars of R&D spending has halved roughly every nine years since 1950. Authors of a 2012 Nature Reviews Drug Discovery article coined the phrase ‘Eroom’s law’ to describe this negative phenomenon – an inversion of ‘Moore’s law’, the observation that the number of transistors on a chip doubles roughly every two years.
One of the ways that we have tried to overcome ‘Eroom’s law’ is by harnessing the power of automation. As we strive for greater efficiency across the drug discovery and development value chains, we increasingly apply automated methods – from laboratory high-throughput screening and robotics to data analysis.
Data analysis for clinical trials offers scope for improvements and efficiency gains, with many organisations exploring new methods to streamline workflows. Yet one factor that holds us back from making strides towards greater efficiency is the tendency to focus on improving the status quo, which leads to incremental enhancements at best, rather than tackling a problem from a new angle. While our sector has unique demands as a regulated environment, other vertical markets face – and have tackled – similar challenges. To fully embrace innovation and usher in a new era of automation and elevated productivity, we should perhaps take the opportunity to learn from others rather than reinventing the wheel. We tend to overestimate the differences between our own and other sectors while underestimating the similarities and synergies. I believe that if we put our heads above the parapet, there are practical, implementable solutions that could help us tackle ‘Eroom’s law’.
One approach with the potential for significant improvement over ‘business as usual’ is to move from data as 2-dimensional tables to multi-dimensional knowledge graphs. However, the challenge is how to represent statistical analyses in a standard way that we can exchange.
This is the type of challenge that I believe is best approached by building on similar problems which have been solved in other industries.
Take, for example, the SDMX (Statistical Data and Metadata eXchange) standard. This comprehensive, domain-neutral ISO standard for exchanging statistical data and metadata was first released in 2004 and comes with a full suite of technical standards, statistical guidelines, and IT architecture and tools. While first established for use in banking and government organisations, this flexible framework could have robust applications within the clinical trials industry. It could provide answers to some of the clinical trial reporting bottlenecks we regularly encounter and enable us to unlock the potential of metadata-driven approaches.
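To make the idea more concrete, here is a minimal Python sketch of the kind of structure SDMX describes: a data structure definition that names the dimensions, the measure and the attributes, and observations that are identified by one value per dimension. It is a toy illustration rather than an SDMX implementation. The class names, the clinical dimension names (`TRT`, `PARAM`, `AVISIT`, `STAT`) and the numbers are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataStructure:
    """Toy stand-in for an SDMX-style data structure definition."""
    dimensions: tuple        # concepts that together identify one observation
    measure: str             # concept that holds the observed value
    attributes: tuple = ()   # optional concepts that qualify a value


@dataclass
class Cube:
    """Observations keyed by one value per dimension."""
    structure: DataStructure
    observations: dict = field(default_factory=dict)

    def add(self, value, attributes=None, **key):
        attributes = attributes or {}
        # Every dimension must be supplied, and nothing else.
        if set(key) != set(self.structure.dimensions):
            raise ValueError(f"key must use exactly {self.structure.dimensions}")
        unknown = set(attributes) - set(self.structure.attributes)
        if unknown:
            raise ValueError(f"unknown attributes: {sorted(unknown)}")
        ordered = tuple(key[d] for d in self.structure.dimensions)
        self.observations[ordered] = {self.structure.measure: value, **attributes}

    def select(self, **fixed):
        """Return the observations whose key matches the fixed dimension values."""
        position = {d: i for i, d in enumerate(self.structure.dimensions)}
        return {
            k: v for k, v in self.observations.items()
            if all(k[position[d]] == wanted for d, wanted in fixed.items())
        }


# Hypothetical structure for summary statistics; names and values are made up.
dsd = DataStructure(
    dimensions=("TRT", "PARAM", "AVISIT", "STAT"),
    measure="OBS_VALUE",
    attributes=("UNIT",),
)
cube = Cube(dsd)
cube.add(120.5, {"UNIT": "mmHg"}, TRT="Placebo", PARAM="SYSBP", AVISIT="Week 4", STAT="MEAN")
cube.add(118.2, {"UNIT": "mmHg"}, TRT="Active", PARAM="SYSBP", AVISIT="Week 4", STAT="MEAN")

active_only = cube.select(TRT="Active")
```

Because every result carries its own dimension values, a consumer can slice by treatment, parameter or visit without knowing how a particular output table happened to be laid out.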
Within the pharma sector, CDISC is well-embedded as our submission standard. As a standard that plays well with other frameworks, SDMX could dovetail with CDISC and help improve efficiency without affecting pre-existing data submission processes.
Among the advantages SDMX offers are the way it describes both statistical data and domain-specific metadata, and its strong alignment with CDISC’s own concept for producing analysis based on linked data – an area of intense focus now within the programming community.
Many of the challenges that we face when trying to manage change within an analysis project are caused by our ‘waterfall’ model of data capture to analysis. We set up the data capture tool, produce the documentation, write programs to analyse the data, create the analysis, and review the results. That’s all well and good until, upon review, we find that we need to make a change. These adjustments have a ripple effect across the whole of our analysis, sending us back to an earlier stage of the process and often requiring us to make multiple changes. No doubt all of us have been in the situation during a project where carefully planned timelines go off track because of these data interdependencies. We risk finding ourselves scrambling to complete the outputs to a deadline because we cannot revise an assumption at one central point and have the change cascade across the deliverables.
The knowledge graph offers a potential solution to this deep-seated problem. By converting our clinical data to linked data models first, then onwards to CDISC ADaM, we could produce tables, figures and listings (TFLs) more seamlessly without manually changing programs. The broad premise is to use a standards-based linked data model as an engine that generates the submission-ready datasets in CDISC format, taking advantage of its metadata-driven approach.
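As a rough illustration of that premise, the Python sketch below flattens a small table into subject-predicate-object statements, then rebuilds an ADaM-like dataset from a mapping held as metadata rather than written into the program. It is only a sketch under stated assumptions: the source rows are made-up values, `to_triples`, `build_dataset` and `SPEC` are hypothetical names, and the output borrows ADaM variable names (`USUBJID`, `AVISIT`, `AVAL`) without claiming to be a compliant ADaM dataset.

```python
# Hypothetical source records (made-up values), as they might arrive in a 2-D table.
rows = [
    {"USUBJID": "S-001", "VISIT": "Baseline", "SYSBP": 130.0},
    {"USUBJID": "S-001", "VISIT": "Week 4", "SYSBP": 122.0},
]


def to_triples(rows):
    """Flatten each table row into (subject, predicate, object) statements."""
    triples = []
    for i, row in enumerate(rows):
        node = f"obs/{i}"  # one graph node per source record
        for column, value in row.items():
            triples.append((node, column, value))
    return triples


# Hypothetical metadata: output variable -> predicate to read from the graph.
SPEC = {"USUBJID": "USUBJID", "AVISIT": "VISIT", "AVAL": "SYSBP"}


def build_dataset(triples, spec):
    """Rebuild one output record per node, driven only by the spec."""
    nodes = {}
    for subject, predicate, obj in triples:
        nodes.setdefault(subject, {})[predicate] = obj
    return [
        {variable: props[source] for variable, source in spec.items()}
        for props in nodes.values()
    ]


triples = to_triples(rows)
dataset = build_dataset(triples, SPEC)
```

In this toy version, a late change to the mapping means editing `SPEC` and regenerating the dataset, while `build_dataset` itself stays untouched. That is the behaviour we would want from a metadata-driven engine at full scale.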
This is, of course, just one of many potential methods and ideas that we could translate from other sectors into our own industry. Investing some time as a community to explore these cross-sector opportunities could yield impactful results and help us to defeat Eroom’s law – de-risking our next ventures into automation for clinical trial reporting and freeing up our talent for strategic initiatives.
Attending the 2021 PHUSE EU Connect? Our talented Programmers (including Stuart Malcolm) will be giving a range of insightful presentations.