Ivar's rambling: November 2014

Tuesday, 25 November 2014

Evolve or wither slowly

3 years ago (wow that went fast!) I wrote a blog post called Do not rewrite. The main points in it was to never rewrite whole applications. Never. Instead gradually rewrite one feature at the time whilst still delivering new business value.

I also wrote a more recent post about Paper cuts and broken windows. Which emphasises the importance of fixing problems straight away and not let them fester.

I want to extend these posts to include you should rewrite all the time, even when not immediately necessary.

Recap

Extracting the main points in the previous blogs why big rewrites are bad and continual rewrites are good, and to also sum the obvious why no writes at all are very bad.

Big rewrite disadvantages

Cost

No business feature - no value

Risk

Big deployments
Forgotten features

Likely to never finish
Not fully replacing old system

Support yet another system

Continual rewrite benefits

Reduced risk

Smaller delta

Quicker feedback
Modularising

Faster - less bottlenecks

Leaner

Remove features and bloat

Actually finishing

No rewrite disadvantages

Death by thousand paper cuts

Even the smallest change becomes slow and painful

Broken windows

People do not care if they introduce bugs or break other things

Staff exodus

Few wants to work with painful systems

Less obvious benefits

There are other perhaps unexpected but important benefits not raised in the previous post.

Business domain knowledge

Every time a refactor or minor rewrite takes place the knowledge of that part is refreshed and stay current in the company. If a system is not touched for a long while the knowledge of it may be forgotten especially if staff involved has moved on.

If a critical bug or urgent feature needs to be added to that system then the turn around is exponentially different between a recently updated system and one forgotten.

Technical knowledge

This is also true for the technical part of a system, the how, not just the what and why it does something. An old COBOL or complicated legacy custom build Java application will take a long time to work out compared to a contemporary tech stack application. You might even have to hire new staff or expensive contractors to fix it.

With up to date knowledge it will be a lot less risk that they might not do it properly.

Quick turnaround

If a new feature needs to be added then to any system then getting people ramped up and delivering it will be much quicker if it is on a contemporary and well known tech stack. If a new system needs to be integrated with existing systems then the API integration will be smoother and a lot quicker.

Technical migration

If a mass migration of a technology stack is needed, perhaps migrating from in house data warehouse to a public cloud based provider, or moving from monolithic to horizontally auto scaled instances, then having most systems on a contemporary and probably quite similar technology stack will speed any urgent migrations. This will avoid/reduce the need to for many deep cave explorations of old customised mystery legacy applications.

Staff retention

A very important reason to rewrite applications and features and in general keep technology up to date is staff retention. (Obviously a million other reasons exists for staff churn as well).

If you avoid death by a thousand paper cuts and broken windows staff will not mind working on the products. A negative culture will be less likely.

If you migrate to newer technologies and let people frequently learn new stuff they will be much more interested in the work and less likely to want to move elsewhere. (refer to my blog post Peak interest - the learning and sharing curve).

Recruitment

If word spreads that you keep an up to date and interesting technology stack then hiring new members of staff will be a lot easier, and you will hopefully attract more qualified candidates. On boarding will also be quick and people will be up to speed quicker with newer well known technology.

Lean technology - lean organisation

If the applications are continually refactored and evolved (as long as it is not done in a hacky ninja rock star development fashion) the architectures will become modularised, scalable and leaner in general.

Not necessarily a micro service architecture but a lean architecture that are more likely to be adaptable for the future.

Hopefully a lean architecture and highly skilled retained staff will also lead to / need a lean organisation so the business is more likely to succeed.

Rewrite when not needed

I would also suggest people and organisations refactor and rewrite when they do not see the obvious reasons for it.

I do not entirely mean rewriting applications on a whim when there is no features to be added or bugs to be fixed, after all I do emphasise the need for delivering business value along with all rewrites.

But I do think even when the obvious pain is not there that you should try to refactor and update applications. For example if the application is just one or two version behind the latest but still seems fine then still update it. It will be less painful than when it next time is 3-4 version behind with a bigger delta of change.

Rewrite encapsulation

One type of no business value rewrite I would suggest is a good idea is to encapsulate legacy systems API as soon as you come across it.

Even if you are not changing a legacy system but merely reading data from for example, then adding a contemporary facade to it straight away will be of value in the future when you do need to update it.

Experiment and rewrite experiments

Another important staff retention technique on top generally keep up to date with technology is to let them experiment with new technology.

I would avoid core systems, especially customer facing systems, but internal tools is prime candidates for field testing technology.

This will also lead to discovering technology that you can use in other applications if proven useful that otherwise would have been missed or delayed.

Naturally these experimental applications should also not be abandonware and be rewritten as often as other systems. Though they will probably always be good candidates for next set of technology experimentations.

Rewrite exploration and euthanasia

Certain system never comes involved in new features so will not be included in a normal rewrite. These will consciously have to be found and rewritten without business value. Though probably just as a contemporary facade initially.

Leaving these as abandonware is not a good idea. Either kill them or update them.

Summary

The main points was already covered in my previous blogs, but I hope people see the value of rewriting frequently to especially keep staff, organisational architecture and company in general up to date for whatever happens in the future.

A utopian expectation of all applications being up to date is too much to hope for. At the other end where companies that rarely and/or minimally update their applications, they are doomed to fail. The grey scale in between decide whether companies will wither and eventually die or survive.

Friday, 21 November 2014

Don't download the internet. Share Maven and Ivy repositories with Docker containers

Etiketter: code, docker, ivy, maven, virtualization

Downloading the internet. A term commonly used when building Maven projects. It is due to downloading your applications dependencies and their transitive dependencies. With a normal application those dependencies can add up to quite a few jars and a lot of megabytes to download. Especially on a 'clean' machine without cached dependencies in your local repository.

Running a new container in Docker sometimes feels the same as every image layer it is built on top of also has to be downloaded. Eventually these are also cached in the Docker cache.

If you then have a Maven based application in Docker you have the perfect storm of bandwidth hogging. Especially as your Docker image's Maven configuration will by default not reuse any cached dependencies and instead always download everything from Maven Central. And for each new build or launch it will re-download the internet as it knows of no local cached dependencies.

Work around, not solution

My work around is to mount my local host repositories for Maven, and for its derivative Ivy, as data volumes for the Docker container. This then avoids re-downloading the internet on every further build and launch and will also reuse dependencies I most likely have already downloaded on the host.

It is not the best solution as your images should ideally not be influenced by what you have on your host machine, but it saves a lot of time, and a lot of grief.

Vagrant

When I run my docker containers inside a Vagrant instance I first mount my host repositories by adding these 2 lines to the Vagrantfile:

config.vm.synced_folder "/Users/myusername/.m2", "/home/vagrant/.m2"

config.vm.synced_folder "/Users/myusername/.ivy2", "/home/vagrant/.ivy2"

Boot2docker

Since Docker 1.3 the OSX /Users path has been by default shared with the boot2docker VM on the same path. So to share the maven repositories you need to link the /Users path to your docker user home folder.

boot2docker ssh;

ln -s /Users/mysername/.m2 /home/docker/.m2;

ln -s /Users/mysername/.ivy2 /home/docker/.ivy2

Data volume container

For easy sharing these folders between many docker containers I mount them as a data volume container and naming it ‘maven’.

docker run -d -P --name maven \

-v ~/.m2:/root/.m2:rw \
-v ~/.ivy2:/root/.ivy2:rw ubuntu

I am basing it on the basic Ubuntu image, and it will stop immediately as no process is running. That is fine, the mounted volumes will still work.

Alternatively instead of mounting the host folders you can have a persistent container with these folders exposed as VOLUME in the Dockerfile so it is shared in a similar manner.

Launch container

docker run -d --volumes-from maven \
yourapplicationimage

Obviously you probably have other options on your docker launch, but the important bit here is the ‘--volumes-from maven’ which maps all the volumes from the data volume container called maven into this container.

Now in theory all containers with Maven and Ivy based build tools such as SBT should first look at the mounted Maven and Ivy repositories for their dependencies.

Build problem

Unfortunately if you build new images frequently you will have the same problem still as the above solution is only for running containers not building. During the build stage Docker does not allow you mount any volumes. This is so it is reproducible anywhere.

In your Dockerfile you can ADD or COPY whole repositories into your image, but that will make it very bloated with a lot of irrelevant dependencies unless you somehow construct and maintain a perfect repository. (Ps. don’t do this).

Maven repository manager

One solution that makes sense in general is to add you companies Maven repository manager’s public Maven Central mirror to the image. That way at least you will only download the intranet, not the internet.

For a Maven build add a settings.xml file to your image folder with at least these settings if using Nexus:

<mirrors>
<mirror>
  <id>Nexus</id>
  <name>Nexus Public Mirror</name>
  <url>http://nexusmachine/nexus/content/groups/public</url>
  <mirrorOf>central</mirrorOf>
</mirror>
</mirrors>

Then add this to your Dockerfile:

ADD settings.xml /root/.m2/settings.xml

Note this will be overwritten if you later during the run stage mount .m2 folder on top of it, but for most that is fine as the .m2 folder usually contain a relevant settings.xml file anyway.

For a SBT build add this repositories file to your image folder with this content:

[repositories]

local

maven-local

company-repo: http://nexusmachine/nexus/content/groups/public

scala-tools-releases

maven-central

Then add this to your Dockerfile:

ADD repositories /root/.sbt/repositories

Repository manager container

A further solution is to run a repository manager as another container and have all Maven/SBT builds refer to it instead. This avoids the involvement of the host computer yet saves any network traffic as everything is localhost.

Pull down something like https://registry.hub.docker.com/u/mattgruter/artifactory/ .

docker pull mattgruter/artifactory

Configure then run and name it:

docker run -d --name artifactory \
mattgruter/artifactory

Modify the above settings.xml and repositories to refer to you locally linked artifactory name as url instead, e.g this one liner:

company-ivy-repo: http://artifactory/artifactory/repo/,
[organization]/[module]/[revision]/[type]s/[artifact](-[classifier]).[ext]
company-repo: http://artifactory/artifactory/repo/

(The example above added it as an Ivy repository which Artifactory supports).

Hopefully these tips will avoid downloading the internet too often and save a few grey hairs.

Thursday, 13 November 2014

Learning whilst writing, and relearn later

Etiketter: blog, write

Martin Fowler recently made a comment that he writes books so that he can learn about a subject.

That resonated well with me, as whilst I don’t write books I do write technical howto documents on my website and my blog. And it is true I mostly do write these documents as I am learning the topic myself.

The main reason I write and share these documents is still to help others. Hopefully someone else will be able to stand on my shoulders and have an easier time learning the same thing. I certainly use a mixture of other’s docs as I learn stuff and hopefully I reference them appropriately in my docs.

Focus and relearn

But I write howtos also to help myself. By documenting each step I keep focus on learning it properly. It also encourages me to achieve certain meaningful levels before I bounce on to the next shiny thing.

An additional unintentional benefit to me is that in e.g. 6 months, 2 years or even just a few weeks later when I need to use/learn the same technology, I have ready made revision notes for me to get up to speed instantly.

Accidental expert

People mistakenly/naïvely think I am expert on the domain of each howto I write.

On some topics covered in a few older docs I may be an experienced fountain of knowledge…, but with most I only scratched the surface. Quite a few I never touched again and have more or less completely wiped any knowledge from my little mind.

So writing these docs can be a little bit of a curse. Of my top five most visited blog posts three of them are related to Hibernate from years ago. I am not an Hibernate expert nor do I want to be, but this could trick people into assuming that I am. My most visited (by a country mile) howto doc is about setting up an email server. I know a bit about the subject but it is not my job nor interest, I merely wrote a good howto in 2003.

But in general sharing is worth it. The amount of thank notes I have received and general good feeling I get from these are gold.

Evangelical

Another benefit to myself is that I now have reference points when I evangelise about a certain topic. I can refer to my own work and words instead of soon to be forgotten spoken words.

I also starting to be confident enough to present, submit CFPs and in general talk about subjects which are often built on documents and blog posts I have previously written.

Dockerise it all - containerised addiction

Etiketter: code, docker, dogma, virtualization

TL;DR: CDD - Container Driven Development

(CDL - Container Driven Life)

Occasionally you come across new interesting technology that is easy to use, and suddenly a veil is unveiled and you become aware of so many opportunities and potential. Maybe it was the same the first time you discovered programming, or OO or FP programming, message queues or NoSQL databases, or distributed source control, or moving from IDE only to proper reproducible build tools (Ant, Maven, Gradle, SBT etc). With certain tools it becomes your favourite hammer that you carry everywhere searching for nails to hit it with.

Docker is such a revelation, at least for me. And I am hammering a wide variety of nails with it. And with the hammer-nail pattern/anti-pattern some of those nails really didn’t need hammering but it was fun.

Docker history

Docker was only revealed in March 2013 by Solomon Hykes and other people at dotCloud . It got a lot of publicity in the hacker news section of the world as people could see the possibly potential in this new tangent of virtualisation. It baked and matured through 2013 as people occasionally showed examples of applying it to their workflows, but still it was too rough for most to use.

But during 2014 especially since the summer the publicity and widespread use of Docker has exploded. Downloads from Docker’s registry hub of ready made images has exponentially rocketed sine the summer of 2014 by 1387%! Now the majority of people I talk to or follow on twitter mention Docker and show examples of how they use it. Conferences and tech news are flooded with new ways to use Docker. And projects that extend or are based on Docker are multiplying all the time.

As someone who has always been interested in virtualisation, automation, reproducibility, build tools and provisioning, introducing containers like Docker has been a welcomed evolution and a revelation. Being already heavily invested in the use of Vagrant for development and systems the migration to/ combination with Docker has been smooth.

Docker addiction step 1. Exploration

First steps are usually to try the easy hello-world-ish examples. Downloading an Ubuntu image and launching bash inside a Docker container. Basically exploring Docker and its commands, which is what I cover in my how to install, basic use patterns and many handy tips with Docker howto.

Docker addiction step 2. Imagination

My howto document also takes you into step 2 which is more about creating your own basic images. Creating a simple Apache or Nginx container with a simple volumised website. Creating images ready with a Java SDK or node.js etc. I have made some simple Dockerfiles of these type of images available at github.com/flurdy/Dockerfiles.

Then you usually also wrap a simple application of your own in an image and running it on other computers. You start to understand how and where Docker can be used.

Docker addiction step 3. Diversity

Once you got one application inside a container you start to experiment with other applications. Might take a look at other types of applications and more likely support applications such as databases inside a container.

You might start to link containers to each other such as an application to a database container and some of the other handy Docker tips in my howto.

Docker addiction step 4. Proliferation

Now you have tried a mixture of applications with Docker you might spread the use to even more core applications and more. You start to link multiple containerised applications to each other.

Replacing third party integration and old legacy applications with a container image, wiring up and switching between different versions of an application via container linking, etc.

As you start to basically recreate environments using Docker containers you might start to facilitate promotion between staging environments only using container images. And eventually production is containerised.

You are probably already sharing and discovering images at work and on the internet. Using the public Docker registry hub and/or internally with tools such as Quay.io

Docker addiction step 5. Eccentricity

When comfortable with writing Dockerfiles and fully aware of the Docker way of layers and process then you start to experiment with more unconventional Docker images.

You might create Android SDK in an image, ready made SSH tunnels to apps or databases in another. You add tools such as a local maven repository manager, local DNS servers, mail servers, etc. into containers. You realise desktop applications can actually be containerised.

Eventually tools to facilitate easier and more manageable Docker life such as Fig, Panamax, Shipyard, Dokku, etc. are maybe among your toolset.

Docker addiction step 6. Ubiquitous

By now all your core applications are run/runnable inside a container.

They link to databases, message queues, 3rd party adapters, and other smaller applications that are all also inside containers.

Your testing environments, staging environments, production environment, development environment and REPL are all running inside containers.

When you create a new application or touch an old legacy application your first step is to write a Dockerfile and containerise the application. You design applications and processes from the ground up to be containerised.

This is what I call CDD - Container Driven Development.

Possibly by now you are convinced everything belongs inside a container, so a better name is perhaps CDL- Container Driven Life.