Ivar's rambling: November 2016

Tuesday, 22 November 2016

ROC theorem: Readable, Optimised and Correct code. Pick two.

Labels: code

ROC theorem

With databases you have the well-known CAP theorem: Consistency, Availability and Partition tolerance. You can only have two. Databases have to make compromises between these pillars; you cannot fully have all three.

With code you also have to decide on compromises between readable code, optimised code and correct code (ROC). And you cannot fully have all three.

This often creates arguments between people on soap boxes from the various camps.

Correct code

Correct code: clever, terse, generic code that avoids the need to handle a lot of edge cases. It is often functional code that can be very elegant, with little to no side effects in theory, and it is easily composed as part of other code.

Correct code can be readable and fast, but it is also sometimes horrible to understand and very costly to train people in, write and maintain.

Optimised code

Optimised code: fast and scalable. It has no unnecessary cruft and takes short cuts to achieve the end result, so it is performant from day one.

Scalable code: code you optimise to support horizontally scalable solutions, with little state and easy to restart.

It may discourage a typed system, e.g. in favour of a message-based actor system, and may involve multiple layers of caching or overuse of parallelised futures to avoid bottlenecks.

Optimised code can be "correct" code, but it is often full of unclear and undocumented short cuts, and frustratingly slow and bug-prone to develop.

Readable code

Readable code: simple to read and quick to understand, by people of different levels of skill. It is easy to spot bugs in and is maintainable by anyone. It is pragmatic in its approach and quick to develop. It can be terse if that is the most readable form, but it is often more verbose.

Readable code can be "correct", as the flow is easier to understand, but it is often not particularly performant and can be more at risk of bugs because more code is exposed.
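As a small illustration of the tension between the readable and the optimised pillars, here is a minimal Python sketch. The function names are hypothetical and made up for this example. Both functions answer the same question, "is n a power of two?", but one spells out the flow while the other relies on a bit trick that is hard to follow without a comment.

```python
def is_power_of_two_readable(n: int) -> bool:
    """Readable version: keep halving until we can not halve any more."""
    if n < 1:
        return False
    while n % 2 == 0:
        n //= 2
    # A power of two is reduced to exactly 1 by repeated halving.
    return n == 1


def is_power_of_two_optimised(n: int) -> bool:
    """Optimised version: a power of two has exactly one bit set,
    so clearing the lowest set bit with n & (n - 1) must leave zero."""
    return n > 0 and (n & (n - 1)) == 0
```

Without the docstring, the second version is exactly the kind of unclear short cut that is fast but costly for the next reader. Whether the speed-up matters depends entirely on where the function is used.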

Not mutually exclusive

You can have all three pillars for some smaller sets of code, but not for whole code bases, and the more code you try to cover, the higher the cost.

This is more about the priority and focus of the code you write.

Will others work on the code base, today, next week, next year? Then readable is important.
Are multiple people working on the same part of the code base? From different teams? With varied experience, or even just different locations? Then readable is important.
Is the solution used by millions? Does big O make any difference? Then consider optimised code. Note: very, very few companies/products actually need this.
Has a bottleneck been proven in production? Then optimised code is valid. But not necessarily across the code base.
Is the product business critical? With heavy integration dependants? Then correct code may be a priority.
Is the team highly skilled? Not that large, and low churn? Then correct code is an option.
Do you pair program 100%, from day 1 of onboarding? Then correct code is an option.

With unlimited time to implement and continuous heavy training, and therefore a lot of money, you may achieve higher levels of two of them or even all three. But that is not realistic. Pick your priorities. These are not mutually exclusive, but they come at the cost of each other.

All the ROCs

You may detect my preferences. I prefer very simple and readable code, that is functional, and scalable. In that order.

I like that anyone can pick up and work on a task for most parts of a code base. I subscribe to the idea of frequent pair rotation across tasks and systems to make sure multiple people are aware of, and have had input into, any part of an architecture. That leads towards readable code, so the overhead of swapping is low.

It is valuable that a new member of our team, or someone from another team, can in six months' time easily contribute a small pull request to "our" code base without learning "our take" on category theory.

I prefer functional programming, with proper type checking, using functors and monads for composition. I like terse code that I can trust, but it must still be readable and maintainable by people other than the original author(s).

So overuse of higher-kinded types, free monads, etc. adds too much cruft for me, and creates risky recruitment demands. (That is my view at the moment. I am prone to evolve and may have completely changed my mind by the time you read this...)

Horizontally scalable, concurrent code is in the back of my mind for most solutions. No state, futures, REST principles, etc. are core to all my code.

But I detest premature optimisation. Only occasionally in my career have I had to modify any code to support some optimised flow. I do not work for Facebook/Twitter/Google (yet), but I have worked for financial, telco and games companies with enormous traffic, and still this was rarely a problem at code level.

I often spot potential optimisations and consciously say no, not yet, if it is not also the most readable and correct alternative. I even avoid parallelisation of futures if there is not yet any obvious need, especially if it makes the code less readable.
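To make the point about futures concrete, here is a minimal sketch in Python using asyncio. The functions fetch_user and fetch_orders are hypothetical stand-ins for two independent remote calls, invented for this illustration, and the short sleeps only simulate latency. The sequential version reads top to bottom; the parallel version is the optimisation I would postpone until there is a proven need.

```python
import asyncio


async def fetch_user(user_id: int) -> dict:
    # Hypothetical remote call; the sleep simulates network latency.
    await asyncio.sleep(0.01)
    return {"id": user_id, "name": f"user-{user_id}"}


async def fetch_orders(user_id: int) -> list:
    # Hypothetical remote call, independent of fetch_user.
    await asyncio.sleep(0.01)
    return [f"order-{user_id}-1", f"order-{user_id}-2"]


async def profile_sequential(user_id: int) -> dict:
    # One step after the other: slower, but the flow is obvious.
    user = await fetch_user(user_id)
    orders = await fetch_orders(user_id)
    return {"user": user, "orders": orders}


async def profile_parallel(user_id: int) -> dict:
    # Both calls run concurrently: faster, but failures and ordering
    # now need more thought from the reader.
    user, orders = await asyncio.gather(
        fetch_user(user_id), fetch_orders(user_id)
    )
    return {"user": user, "orders": orders}
```

Both versions can be run with asyncio.run(profile_sequential(7)) or asyncio.run(profile_parallel(7)) and return the same result. In a case this small the readability gap is minor; it tends to grow as more calls, dependencies between them, and error handling are added.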