Policy for Language Evaluation and Selection

Posted on Feb 8, 2014

In 2014, two language families qualify as industry standard Systems Programming Languages: C/C++ and Java. I define Systems Programming languages as the category of languages that have applicability outside web development or desktop application development. C# and Objective-C are a little unclear, but because they tend to be closely associated with a specific platform, and because of their connection to the C/C++ family of languages, I will lump them in with that family.

The semantics of “industry standard” are debatable, but I’ll take the most naive approach: consulting one of the popular indices of developer language use, such as the TIOBE Index. In the Systems Programming category, the C and Java families have a representation one to two orders of magnitude larger than that of their nearest peers.

Developer usage shouldn’t be treated as an explicit measure of quality, but if we accept that quality was a consideration in language selection, the index carries some derivative information about quality. It’s probably safe to make some other second-order inferences from it as well, such as the level of proficiency one is likely to find in the general population of systems programmers, and the availability of tools and resources for developing in the language.

I want to make it clear that I am not arguing that the most popular programming language is the best, but to make an objective decision in an environment where new languages are introduced regularly, we must start somewhere. If it were possible to measure quality directly, we’d likely see less uniformity in the numbers for the second tier of languages in the usage index. There’s good news, and it’s that the designers of these languages are clearly doing something right.

Rather than speculate about why C and Java comprise this exclusive category of >10% share, let’s focus on a more general question:

“Why should we use anything other than the industry standard languages?”

To non-programmers, this question probably seems like an obvious place to start, but anyone who has spent time in the software development business can almost hear the indignant snorts erupting around the conference table, or the staccato of keystrokes from outraged online readers.

I know that many programmers, especially enthusiasts of languages in the 0-10% usage category, misinterpret this honest question as a rhetorical one motivated by arrogance or ignorance. Most programmers pride themselves on their ability to be rational, but when you contrast the kind of cold, data-driven logic they might apply to selecting a new motherboard and CPU with the kind of passionate hallway arguments that erupt over language selection, you might wonder whether language selection bears a stronger resemblance to religion than to science. In terms of the level of objectivity involved, I think this is probably true.

Some parts of the question are easily answered; let’s try those now.

First, why have a policy about language selection at all? More specifically, who would need one? Businesses, not individuals, typically bother with formal language selection policies. Most individual programmers I know have one too, but it’s more of a list than an algorithm. Let’s focus on business.

When writing new code, or replacing old code where a language change is an option, hiring and training practices will be affected. The management of these teams therefore has a financial stake in the selection, but individual contributors may have trouble understanding why a modest increase in training or staffing should be such a problem. Only when dozens or hundreds of teams are all responding to trends in technology does the high cost become visible. Keeping these costs predictable calls for a sound strategy for standardizing on languages.

Once we accept that a language selection policy has a place in business, the question of why we should use anything other than the industry standard languages can be refined further. What we are really asking is this: assuming that the bulk of businesses are efficient, and that the language selection policies employed by those businesses (presuming they used one) resulted in C/C++ and Java leading usage industry-wide, what characteristics of those languages motivated their broad selection?

This answer is also easy: the selection was motivated by efficiency. What is not easy is determining exactly where the efficiency comes from. It could be as simple as lowering operating expense by reducing program instruction execution counts, or some higher-order efficiency like reduced tool cost, faster development cycle time, and so on. I can only speculate, but it is safe to assume that any company which does not make engineering decisions based on efficiency is unlikely to be represented in the language usage index for long. We can infer that these languages assisted them to that end.

The real question is, therefore, not “why should we use anything other than the industry standard languages”, but rather “adjusting for training and staffing costs, what efficiencies are offered by allowing development teams to adopt a new programming language?”

This is the second question that I expect many programmers to snort at.

Is This Good For the Company?

{% img center http://gwb.blob.core.windows.net/bcaraway/WindowsLiveWriter/CompanyMaturity_DE0E/good_for_company_10.jpg %}

Many developers look at language selection as a kind of inalienable freedom, and for some good reasons. Our curiosity and intellectual development as programmers are well served by occupational exposure to new languages and concepts. Furthermore, we see ourselves as uniquely knowledgeable about the technical design requirements of the projects we work on, and having bean counters interfere with that decision making strikes us as an abomination of ignorance.

It is a complication that programmers are also, in many cases (certainly always among elite companies), participating in profit-sharing arrangements with their employers. How can a programmer simultaneously worry over the company’s symbol on the stock ticker yet make engineering decisions without regard to the company’s financial efficiency? We must find a way to resolve this duality.

I think this calls for yet another rephrasing of the original question:

Adjusting for training and staffing costs, using empirical
measurement, what efficiencies are offered by allowing development
teams to adopt a new programming language?

The answer is now harder still. What objective measurements of efficiency can we make of a programming language? I know that I can generally write a given program in Python more quickly than I can write it in C++, but I can’t tell you how much faster, and more importantly I can’t guarantee it in all cases. I also suspect that the code I’ve worked on in Erlang has far fewer errors than similar code I’ve written in C++, but benchmarking this would probably require switching languages first and then mining the bug tracking system to estimate the improvement.

Many programmers will also argue that evaluating languages objectively misses many of the distinguishing characteristics that make other languages great. That may be true, but just as with my Python example, if we can’t measure something, we can’t make predictions about it. It is fair for managers to place the burden of devising novel ways of measuring quality and efficiency on the programmers making those arguments.

Where programmers do have enormous leverage is in producing cost savings by discovering efficiencies. Cost is uncontroversial. Even in cases where the cost of a change is disputed, we can measure the cost after the fact and learn from the experience.

I believe that a Language Selection Policy should have reducing costs by uncovering efficiencies as its fundamental criterion. If we start with a “Nobody Ever Got Fired for Buying IBM” language selection policy that says, strictly, “Use C/C++ and/or Java”, then amending the policy entails demonstrating through careful study how accepting another language yields new efficiencies. By “careful study” I mean collecting empirical data subject to internal peer review. In other words, the onus is on the applicant.

Many companies also have a few other languages in wide use in addition to C/C++ and Java, such as Python and Ruby. It seems perfectly sensible to incorporate those languages into the initial language selection policy.

Here is a set of the metrics I use for my own studies, and that I would ask to see in proposals for accepting new languages:

Micro-benchmarks

Micro-benchmarks of common task-oriented functions, measured in the number of x86_64 instructions executed per item. For byte-compiled languages, the VM cost cannot be excluded from the micro-benchmark measure, but it can be amortized over millions of benchmark iterations; a sketch of such a harness follows the list.

  • Start a task (whatever the unit of concurrency is)

  • Read/Write data to disk, per 1KB read/written

  • Read/Write data to socket, per 1KB read/written

  • Open/Close a socket

  • Read/write a row to/from your data storage reference (psql, riak, etc)

  • Resolve a DNS A record
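
As a concrete starting point, here is a minimal sketch of such a harness, assuming a Linux host with `perf` installed. The benchmark binary name (`./bench_disk_write`) and the iteration count are hypothetical placeholders, not part of any real policy:

``` python
#!/usr/bin/env python3
"""Minimal micro-benchmark harness sketch: amortized instruction counts.

Assumes a Linux host with `perf` available. The benchmark binary
(./bench_disk_write) and the iteration count are hypothetical.
"""
import subprocess

ITERATIONS = 1_000_000  # amortize VM startup cost over many iterations


def instructions_for(cmd):
    """Run cmd under `perf stat` and return the instruction count."""
    # -x, selects CSV output (written to stderr); instructions:u counts
    # user-space instructions only, excluding kernel work.
    result = subprocess.run(
        ["perf", "stat", "-x", ",", "-e", "instructions:u"] + cmd,
        capture_output=True,
        text=True,
        check=True,
    )
    for line in result.stderr.splitlines():
        if "instructions" in line:
            return int(line.split(",")[0])
    raise RuntimeError("no instruction count found in perf output")


if __name__ == "__main__":
    # Hypothetical benchmark binary that performs ITERATIONS 1KB writes.
    total = instructions_for(["./bench_disk_write", str(ITERATIONS)])
    print(f"instructions per 1KB write: {total / ITERATIONS:.1f}")
```

The same wrapper works for any of the items above; only the benchmark binary changes.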

Macro-level costs

  • Vsize/RSS for hello world program

  • Vsize/RSS measure for hello world program that spawns 1000 tasks
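
On Linux, both numbers can be sampled directly from /proc. A minimal sketch, assuming the hello-world binary (`./hello_1000_tasks`) exists and stays alive long enough to be measured:

``` python
#!/usr/bin/env python3
"""Sample Vsize/RSS for a running process from /proc on Linux.

The program under test (./hello_1000_tasks) is a hypothetical
hello-world binary that spawns 1000 tasks and then sleeps.
"""
import subprocess
import time


def vsize_rss_kb(pid):
    """Return (VmSize, VmRSS) in kB as reported by /proc/<pid>/status."""
    sizes = {}
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith(("VmSize:", "VmRSS:")):
                key, value, _unit = line.split()
                sizes[key.rstrip(":")] = int(value)
    return sizes["VmSize"], sizes["VmRSS"]


if __name__ == "__main__":
    proc = subprocess.Popen(["./hello_1000_tasks"])
    time.sleep(1.0)  # crude settling delay; a real harness would synchronize
    vsize, rss = vsize_rss_kb(proc.pid)
    print(f"Vsize: {vsize} kB, RSS: {rss} kB")
    proc.terminate()
```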

Reference Implementations

Every language policy should entail some kind of reference to be implemented in the language, along with some objective measures of quality. This is more subjective, but over time I think it becomes clear to reviewers what matters. Satisfying this requirement with third-party code is even better, provided it meets certain provenance rules. A good example would be a client library for an internal system of moderate complexity, or the implementation of some API in a server (like a REST API).

The reference implementation should also give evidence about the cost (in x86_64 instructions executed) of compiling and testing the program, and suitable micro-benchmarks should be required, e.g. the cost in x86_64 instructions of processing a given REST API request.
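
One hedged way to produce that last number is to attach `perf` to the running server while driving a known volume of requests at it, then divide. The server binary, endpoint, and request count below are all hypothetical placeholders:

``` python
#!/usr/bin/env python3
"""Sketch: amortized x86_64 instruction cost per REST request.

Attaches `perf stat` to a running server while a known number of
requests are issued. The server binary (./rest_server), the endpoint,
and the request volume are hypothetical placeholders.
"""
import signal
import subprocess
import time
import urllib.request

REQUESTS = 10_000
URL = "http://localhost:8080/api/items"  # hypothetical endpoint

if __name__ == "__main__":
    server = subprocess.Popen(["./rest_server"])
    time.sleep(1.0)  # crude wait for the server to begin listening

    # Attach perf to the server process for the duration of the load.
    perf = subprocess.Popen(
        ["perf", "stat", "-x", ",", "-e", "instructions", "-p", str(server.pid)],
        stderr=subprocess.PIPE,
        text=True,
    )
    for _ in range(REQUESTS):
        urllib.request.urlopen(URL).read()

    perf.send_signal(signal.SIGINT)  # as with Ctrl-C, makes perf print counts
    _, stats = perf.communicate()
    for line in stats.splitlines():
        if "instructions" in line:
            total = int(line.split(",")[0])
            print(f"instructions per request: {total / REQUESTS:.0f}")
    server.terminate()
```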

Instruction count may be wholly unsuitable for your business; most of my own experience is in targeting COGS spending, where the volume of CPU instructions issued can be translated into power and cooling or CAPEX costs intuitively. Other measurements, like implementation time or lines-of-code counts, might serve your purposes better.
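
As a back-of-the-envelope illustration of that translation, here is a tiny sketch; every constant in it is an assumed placeholder, not a measured or recommended value:

``` python
# Back-of-the-envelope COGS translation. Every constant here is an
# assumed placeholder, not a measured or recommended value.
INSTRUCTIONS_PER_SEC = 2e9  # assumed sustained throughput of one core
WATTS_PER_CORE = 10.0       # assumed per-core draw, including cooling overhead
USD_PER_KWH = 0.10          # assumed blended power cost


def power_cost_per_quadrillion_instructions():
    seconds = 1e15 / INSTRUCTIONS_PER_SEC
    kwh = WATTS_PER_CORE * seconds / 3_600_000  # watt-seconds -> kWh
    return kwh * USD_PER_KWH


print(f"${power_cost_per_quadrillion_instructions():.2f} per 10^15 instructions")
```

With these made-up numbers the answer is about $0.14 per 10^15 instructions; the point is only that once instruction counts are in hand, the conversion to dollars is simple arithmetic.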