In April a colleague and I went to visit our partner in Delhi. Over breakfast in the hotel we reflected on how the restaurant service there differed from what we were used to in Stockholm (this was a normal chain business hotel).
First, there were more people serving in the restaurant than there were hotel guests having breakfast. Second, most people working in the restaurant had a very specialised responsibility. Finally, people working in the restaurant had time for the extra details: chatting, decorating and… smiling :-).
After a breakfast or two we came up with a number of simple options to make the restaurant more efficient, but then we stopped. It was apparent that, for this hotel, resource efficiency was not a goal.
However, if you were to move this hotel to Stockholm, a lot of changes would have to be made.
Balance: our Cultural Aspect
Remember we had Continuous Delivery as part of our vision (part 1). At the time we had a large off-shore team (14 people) handling operations, including daily monitoring and deploys. Daily operations went smoothly, but we started questioning the size of the team. Also, deploys were painful events: each one required lots of planning and a full night’s work for around ten people. This meant both high risk and high cost. Plus, people needed to work nights.
Doing deploys every hour or even just every day seemed very far away.
We decided to gradually bring operations back to Stockholm. The team we had in Stockholm consisted of two people, so we needed a radical change. A miracle!
Balance: Replacing Manual Work with Tools
What we found (and partly already knew) was that most of the daily operations work was done manually: for instance, every morning there was a detailed manual check of all log files. This would not work for us: we needed alarms with relevant notifications when things went bad, not reactive, time-consuming analysis. In general, procedures were carried out in a manual and very complex way. This not only required a lot of people. Unlike the service in the breakfast restaurant, all the manual work also meant high risk: so many things could and did go wrong. We needed to automate everything that could be automated, making the process more repeatable.
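To make the idea of alarms instead of a morning log review concrete, here is a minimal Python sketch. It is not the tooling we actually used: the pattern, the `Alert` type and the function names are all hypothetical, and a real setup would send the resulting message to a chat channel or pager rather than just build a string.

```python
import re
from dataclasses import dataclass

# Hypothetical severity keywords; a real setup would tune these per service.
ALARM_PATTERN = re.compile(r"\b(ERROR|CRITICAL|FATAL)\b")


@dataclass
class Alert:
    source: str       # which log the line came from
    line_number: int  # 1-based position in that log
    message: str      # the offending line, stripped of whitespace


def scan_log(source, lines):
    """Return one Alert for every line that matches the alarm pattern."""
    alerts = []
    for number, line in enumerate(lines, start=1):
        if ALARM_PATTERN.search(line):
            alerts.append(Alert(source, number, line.strip()))
    return alerts


def format_notification(alerts):
    """Build a short message for humans; empty string means nothing to report."""
    if not alerts:
        return ""
    header = f"{len(alerts)} problem(s) found:"
    body = [f"- {a.source}:{a.line_number} {a.message}" for a in alerts]
    return "\n".join([header] + body)
```

The point of the design is that nobody reads the logs when everything is fine: `format_notification` returns an empty string unless something matched, so a notification only goes out when there is something to act on.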
Improving Deploys – How did it Go?
We set up a plan to stick to a deploy every second week and, each time, to improve at least one part of the process. After three months the off-shore team could be reduced to half its size, and after six months it could be removed completely.
Deploys now took about two hours and could be done during the daytime, with Ansible (our automation tool) sharing status updates on Slack 🙂