Over the last few years I’ve become interested in opportunities for using R in production, in the broad sense of code with users other than its original author. The reason is that this is an important avenue for R users to bring value to others. In the words of David Robinson, “anything still on your computer is, to a first approximation, useless.”
My public work in the R community has also largely fallen into this category, including production-grade metrics and logging packages, connecting R to RabbitMQ (which is very popular in enterprise production systems), and writing R’s first external profiler – a tool for answering questions like “why is my R code slow in production?”
My private work in R has included platforms and tools to build and host many dozens of internal packages, as well as the development of production R APIs and Shiny apps. I was also intimately involved in migrating many of these workloads to Kubernetes – increasingly the target for production workloads in industry, and a tool I feel strongly about as part of R’s future.
In my experience it can be empowering for R users to have a clear path from ad-hoc analysis to “data products” – graphics, emails, reports, APIs, or even interactive applications – as they need them. And so I’ve been advocating for these data products to have production-friendly defaults (or at least production-friendly stories), all while personally discovering what it means, operationally, to actually take R to production.
A few months ago I had the realisation that if I’m really serious about expanding the frontier for R in production, the best place to do so is at RStudio. No organisation is as plugged into the broader ecosystem as they are, and no organisation has as broad a reach to improve these tools and techniques for the wider data science community, in R and beyond.
So I’m happy to announce that I have joined RStudio, at least in part to work on improving the story for R in production, and to help out the folks making that leap.