
R is one of the most popular programming languages for data science, both across industry sectors and within academia. It is renowned for its statistical analysis and visualisation capabilities, its open-source environment, and its thriving, collaborative community of users.

Within the pharmaceutical industry, SAS continues to be the primary statistical programming language, especially within regulatory settings, but R’s prevalence and range of applications have been increasing for several years. Early-career Statisticians and Programmers have typically learned R during their university courses and research, and are often keen to keep using it as they embark on their careers. As an open-source tool, R benefits from the wealth of talent and energy available within its community. For example, many of the latest statistical methodologies are developed with R code and packages to support them. Professional interest in the R language and its potential is buoyant among biopharma statistics and programming industry associations, resulting in numerous collaborative initiatives to drive up adoption and practical applications.

R is particularly well suited to exploratory work, which furthers our understanding of data, and to explanatory work, which communicates it. With respect to the latter, R’s capabilities for data visualisation are substantial, through its ggplot2 and Shiny packages and user-generated libraries of easily accessible figures. There is a hive of activity within our sector, among the PSI community of statisticians and the PHUSE community of programming professionals, exploring the potential of the language for this and other applications. For example, PSI’s visualisation special interest group, which I founded, hosts monthly ‘Wonderful Wednesday’ webinars showcasing visualisation techniques, and most of the submissions are in R. Alongside the webinars, the special interest group’s accompanying blog provides several practical examples, which I’d encourage you to read.

In many respects, R is now the leading tool for visualisation within the pharma industry and beyond. In day-to-day life, as well as within the drug development environment, we now routinely see more complex visualisations applied for their storytelling capabilities. Reporting on COVID-19 exemplified this broader trend, with analysts from the Financial Times, BBC, and others becoming well known for their visualisation and graphical capabilities. As all good Statisticians know, compelling data visualisations are critical tools that allow us to communicate more effectively with stakeholders and inform better decision-making across the product life cycle.

With this in mind, our team at Veramed has been working enthusiastically on various internal and external R initiatives. As part of the PSI community, we have contributed to the above-mentioned visualisation special interest group to share our capabilities and learnings with our peers. Several members of our team presented a poster at this year’s PSI conference on how to influence decisions with data visualisations, specifically by applying Gestalt principles, a set of ideas about information processing and pattern recognition, to aid understanding.

As part of our internal training efforts, we provide regular sessions on data visualisation to strengthen the capabilities of our team and benefit from one another’s experiences. Notably, these sessions not only cover technical tips and tricks for creating better visualisations in R, but also give our team strategic context on the importance of tailoring a graphic to its audience, channel and message for any given task.

Increasingly, we are sharing these training initiatives externally with clients, as part of broader services that develop R-based data visualisations and the underpinning processes that enable their adoption.

Looking ahead, I expect to see deepening investment in R initiatives by sponsors and CROs alike. Larger pharmaceutical organisations, especially, are increasingly creating validated R environments and promoting a new paradigm that adopts more visualisations and moves away from traditional tables.

Naturally, R will continue to find a home in non-regulatory settings, including publications and HTA submissions. The more fundamental paradigm shift, however, is the use of R within regulatory environments for approval submissions, and we are now moving closer to making this a reality. In the last 12 months, the R Consortium submitted a pilot package in R to the FDA, following eCTD specifications and including a proprietary R package and the other required eCTD components.

It’s clear that the increased use of the R language represents new opportunities for efficiency, collaboration, and better-informed decisions across the product life cycle. As Statisticians and leaders, it’s essential that we not only adapt to these innovations but also embrace them and help to drive them forward.
