Welcome to part 2 in a series on applying DevOps to Data Science. You can read part 1, Applying DevOps to Data Science, here. In this blog I want to define what DevOps is and begin to understand why DevOps can help a Data Scientist deploy models faster.
What is DevOps?
The divide between those who develop and those who deploy has been a struggle in traditional software development for a long time. Software developers would typically work in isolation building their applications. Once an application was built, it would be handed over to the operations department for deployment or migration to production. This process could take a long time.
The delay means the development team have to wait longer to deploy changes, code becomes stagnant and the developers are reliant on the availability of operations. When the Agile Methodology became popular, a shift in the deployment process began to emerge. Developers were working with operations as part of the same team to enable sharing of responsibilities. DevOps emerged.
The term DevOps is a portmanteau of dev and ops, dev relating to developers and ops relating to operational staff. In traditional software development, there has typically been a separation between the developers who are building an application and the operations staff who are tasked with deploying and monitoring the application once it is live. The two roles are very different and have different skills, techniques and interests. Developers are interested in whether a feature has been built well; operations are interested in whether the application and infrastructure are performing as required (Kim et al, 2016 pp xxii).
Prior to DevOps, the roles of development and operations were distinct and had little crossover. Unfortunately, this meant that the lead time to deployment was long and prone to errors. At the start of 2010, industry experts began talking about applying the principles of Lean production to software development.
Lean practices were responsible for halving the time taken to manufacture vehicles, with 95% of orders being shipped on time (Kim et al, 2016 pp xxii). The term DevOps was coined to encapsulate a series of processes and culture changes aimed at reducing the amount of time required for an application to go into production and for changes to propagate to production.
Jez Humble defined DevOps as “a cross-disciplinary community of practice dedicated to the study of building, evolving and operating rapidly-changing resilient systems at scale.” (Muller, E, 2010). Muller goes on to add his own definition: “DevOps is the practice of operations and development engineers participating together in the entire service lifecycle, from design through the development process to production support” (Muller, E, 2010). In the largest survey of DevOps practitioners, Puppet propose that “DevOps is an understood set of practices and cultural values that has been proven to help organizations of all sizes improve their software release cycles, software quality, security, and ability to get rapid feedback on product development” (Puppet, 2017).
There are many differing definitions of the term DevOps; however, they all share a common theme around culture/community, tools, software lifecycle, process and support, with the goal of improving software release cycles.
Culture is only one aspect of DevOps. To implement DevOps fully, a series of concepts and associated tools also needs to be adopted. DevOps encapsulates the following areas:
- Culture
- Source Control
- Continuous integration
- Continuous deployment
- Infrastructure as code
- Configuration as code
- Automation
- Operational Monitoring
These areas can be ordered by importance, as seen in Figure 1.
Why is DevOps important?
A fundamental part of DevOps is automation. In his article “Knightmare: A DevOps Cautionary Tale”, Doug Seven documents how an American stock trading firm lost over $400,000,000 in 45 minutes due to a failed manual deployment on one of their eight servers (Seven, 2014). Seven notes that this could and should have been avoided through good DevOps principles and automated deployments. DevOps could have saved this business over 400 million dollars. Take a moment for that to sink in.
This problem is not limited to software development. It could just as easily have been a rogue machine learning model. If a model is checking for fraud and starts incorrectly flagging all transactions as fraudulent, that could have a huge impact on the business in question. We will come back to this discussion at greater length later in the series.
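To make the rogue model scenario concrete, here is a minimal sketch of the kind of automated check a deployment pipeline could run before promoting a fraud model. Everything in it is an assumption for illustration: the function names `flag_rate` and `safe_to_promote`, the expected flag rate and the tolerance are hypothetical and would need to come from your own historical data.

```python
def flag_rate(predictions):
    """Return the fraction of predictions flagged as fraud (truthy values)."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return sum(1 for p in predictions if p) / len(predictions)


def safe_to_promote(predictions, expected_rate, tolerance):
    """Hypothetical gate: block the release if the share of transactions
    flagged as fraud drifts too far from the historically expected rate."""
    return abs(flag_rate(predictions) - expected_rate) <= tolerance


# A model that flags every transaction is stopped before it reaches production.
rogue_model_output = [1] * 100
healthy_model_output = [1] * 2 + [0] * 98

print(safe_to_promote(rogue_model_output, expected_rate=0.02, tolerance=0.05))
print(safe_to_promote(healthy_model_output, expected_rate=0.02, tolerance=0.05))
```

A check like this does not prove a model is correct, it only catches the most obvious failures automatically rather than relying on someone noticing after the release.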
One of the foundations of DevOps is to strive towards shorter release cycles. Shorter release cycles mean code changes are promoted to production faster. This directly correlates with higher throughput and more stable, higher quality code (Puppet, 2017). DevOps is on the rise: in a survey conducted in 2016, 16% of respondents worked on a DevOps team, and in 2017 this number increased to 27% (Puppet, 2017). This indicates that DevOps is becoming more popular and organisations are beginning to see the value it offers. It may also indicate a shift in culture to one of collaboration, a goal DevOps attempts to achieve.
IT performance can be measured by the throughput of code and the stability of systems (Puppet, 2017 pp20). Throughput is measured by how frequently code is deployed and how fast a team can move from committing code to deploying it. Stability is defined as how quickly the “system can recover from downtime and how many changes succeed” (Puppet, 2017). High performing teams saw 46 times more frequent deployments, 440 times shorter lead times for changes, 96 times faster mean time to recovery and failure rates 5 times lower than companies who did not implement DevOps (Puppet, 2017 pp21). That is staggering!
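The throughput and stability measures above can be calculated from a simple log of deployments. The sketch below is one possible way of doing that and is not how the Puppet survey collected its figures. The `Deployment` record, its field names and the `summarise` function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of a single deployment to production."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did the change cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored


def hours_between(start, end):
    return (end - start).total_seconds() / 3600


def summarise(deployments, period_days):
    """Return the throughput and stability measures for a period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deployments if d.failed]
    recoveries = [
        hours_between(d.deployed_at, d.restored_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        # Throughput
        "deployments_per_day": len(deployments) / period_days,
        "mean_lead_time_hours": mean(
            hours_between(d.committed_at, d.deployed_at) for d in deployments
        ),
        # Stability
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery_hours": mean(recoveries) if recoveries else None,
    }
```

Tracking these four numbers over time gives a team a baseline, so any improvement from adopting DevOps practices can be measured rather than assumed.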
So why don’t all development houses apply DevOps? Well, because it is difficult and requires complete buy-in from all levels of the organisation. The numbers listed in the Puppet survey are an ideal and are not guaranteed. However, if a company sees even a fraction of these benefits, DevOps will provide a return on investment.
Faster deployments, fewer errors and faster changes are all important when looking at production machine learning. Traditional software development and machine learning engineering are not that different.
In the next blog, we will explore the key parts of DevOps in more detail.
Author
Terry McCann