I’m a fan of Bryan Cantrill’s argument that we ought to think about “platform values” when assessing technologies, highlighted in a recent talk, but explained more thoroughly in an earlier one on his experience with the node.js community.
In my reading, he argues that many of the values that programming languages or platforms may hold (such as performance, security, or expressiveness) are in conflict, and that a platform inevitably emphasises some of them over others. These decisions are a reflection of explicit or implicit “platform values”.
Cantrill illustrates this with a series of examples, but, no surprise, the R platform does not make his shortlist. I couldn’t resist trying to cook up my own taxonomy of platform values for the R language and its community.
More broadly, though, Cantrill believes that the values of a platform affect the projects that adopt it, and that conflicts can arise between the values of a project and those of its chosen platform. This strongly echoes my own experience in the R community, and I’ve gotten a measure of clarity for future and existing projects by learning to articulate these conflicts.
R’s Values: The List
Note: The effect of the following tables will be diminished without this site’s CSS.
Cantrill gives the following selection of values that we might claim as “core” to a platform, although he doesn’t consider them exhaustive:
• Approachability | • Integrity | • Robustness |
• Availability | • Maintainability | • Safety |
• Compatibility | • Measurability | • Security |
• Composability | • Operability | • Simplicity |
• Debuggability | • Performance | • Stability |
• Expressiveness | • Portability | • Thoroughness |
• Extensibility | • Resiliency | • Transparency |
• Interoperability | • Rigour | • Velocity |
He also gives a few illustrations of platform values; for example, he formulates the core values of the C programming language as follows:
• Approachability | • Integrity | • Robustness |
• Availability | • Maintainability | • Safety |
• Compatibility | • Measurability | • Security |
• Composability | • Operability | • Simplicity |
• Debuggability | • Performance | • Stability |
• Expressiveness | • Portability | • Thoroughness |
• Extensibility | • Resiliency | • Transparency |
• Interoperability | • Rigour | • Velocity |
This is not to say that these values preclude us from writing safe, robust, or composable C code – in fact, this is expected of expert C programmers – but that the language itself does not offer strong support for doing so. Many features that would make C programs safer have been sacrificed at the altar of performance.
In contrast, these are what he sees as the core values of the JavaScript community in general and node.js in particular:
• Approachability | • Integrity | • Robustness |
• Availability | • Maintainability | • Safety |
• Compatibility | • Measurability | • Security |
• Composability | • Operability | • Simplicity |
• Debuggability | • Performance | • Stability |
• Expressiveness | • Portability | • Thoroughness |
• Extensibility | • Resiliency | • Transparency |
• Interoperability | • Rigour | • Velocity |
There are some additional examples for C++, Awk, K, OpenBSD – which values only Security – and others in his talks.
I would formulate the core values of the R platform as follows:
• Approachability | • Integrity | • Robustness |
• Availability | • Maintainability | • Safety |
• Compatibility | • Measurability | • Security |
• Composability | • Operability | • Simplicity |
• Debuggability | • Performance | • Stability |
• Expressiveness | • Portability | • Thoroughness |
• Extensibility | • Resiliency | • Transparency |
• Interoperability | • Rigour | • Velocity |
The presence of Approachability and Velocity in this list should be no surprise: R is and always has been a language heavily geared to those learning a bit of programming to fulfil their need to do data analysis and statistics. It has a long history of efforts to get that group up and running as fast as possible. I also think that the recent push to build tools and packages that make working with R “fun” or “easy” – complete with the requisite emoji – is a re-expression of these values as core to the platform.
Similarly, it is clear to me that R holds Extensibility by its users, through the development of functions and then packages, in very high regard. It is a widely-held view that any non-trivial R code ought to be bundled up into a package so that it can be distributed painlessly to others. In fact, this view is so pervasive that the traditional “scripting” options for R are limited: Rscript did not become part of the language until 2007, and none of the built-in tooling (e.g. NAMESPACE or DESCRIPTION files, or R CMD check) works with standalone R scripts.
It’s worth explaining why I have highlighted Integrity. What I have in mind here is R’s language-level support for the notion of “missing values” (NA). All functions must handle missing values, and function authors are constantly on the hook for deciding how their function ought to behave in the absence of data (na.rm, anyone?). This represents a remarkable commitment to ensuring the integrity of R’s many mathematical programs.[1]
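To make the point about missing values concrete, here is a minimal sketch in base R. The vector x and its values are made up for illustration; the behaviour shown is that of the built-in mean() function and of ordinary comparison operators.

```r
# Hypothetical measurements with one missing observation.
x <- c(2.5, NA, 4.0)

# By default, missingness propagates: the mean of partly unknown data is unknown.
default_mean <- mean(x)

# The caller has to opt in, explicitly, to dropping the missing value.
trimmed_mean <- mean(x, na.rm = TRUE)

# Comparisons involving NA are themselves NA, rather than TRUE or FALSE.
comparison <- x > 3

print(default_mean)  # NA
print(trimmed_mean)  # 3.25
print(comparison)    # FALSE, NA, TRUE
```

The design choice worth noticing is the default: R returns NA unless you say otherwise, so silently ignoring missing data is something you have to ask for.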
R’s Values: The Consequences
Part of the larger argument about why platform values are important is that they affect the nature of the projects that are built on these platforms. Some of this is unequivocally positive: for instance, I would argue that efforts by folks like DataCamp to teach R as widely as possible have been much helped by the platform’s preexisting commitment to Approachability.
Similarly, one might well attribute the enormous popularity of R as an implementation language for cutting-edge statistical methods to the platform’s focus on Velocity and Extensibility, at least in the case of the academic statistical community.
And I have no doubt that R’s focus on Integrity has had a quiet but meaningful impact on the many statistical analyses that are undertaken with it.
Still, we can’t have everything, and in the case of the R platform, there is no doubt that you can encounter hurdles if you pursue projects that align with different values. In my view, the most pressing concern is the way R’s platform values diverge from those held by people running applications of one kind or another.
Much of my recent professional life has been focused on divining ways for my team to run R code in “production” environments, where values like Stability, Resiliency, Robustness, Measurability, and Debuggability become dramatically more important. This has had a substantial impact on the nature of the R code we write, and the kinds of tools we build to help us write it better.
With the advent of Shiny and Plumber, these past few years have witnessed a sea change in the ability of R users to write true applications. It remains to be seen whether R’s platform values – as they exist today – will curtail the growth of these projects, or whether these projects will have some success in changing the culture of R itself.
[1] One might also argue that the popular tidyverse ecosystem attempts to bring Composability and Compatibility more into the foreground. You can see this in the current mission statement of the project:
The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.