
How to adopt machine learning

Machine learning is more than just a tool or capability—it’s a core product feature that fuels innovation. And we should treat it as such.

Tyler Deutsch | October 7, 2016

Machine learning (ML) is being leveraged by many of the top technology companies like Google, Facebook, Microsoft, and Amazon, and is embedded in many of the services we use every day—from online shopping recommendations, to preventing fraudulent purchases, to providing right-place, right-time advertising to your mobile device. Now that almost every company has the power to be a data company, taking advantage of that data with intelligent automated apps is a core competitive advantage.

How machine learning is used today

Computer gaming pioneer Arthur Samuel defined machine learning as a subset of artificial intelligence “that gives computers the ability to learn without being explicitly programmed.”

Forward-thinking firms are viewing machine learning as not only a tool or capability, but as something that can be embedded into software products via APIs.

Take the self-driving car, for example. Automotive companies are now focusing more on the software going inside the car—which is highly reliant on machine learning—instead of mainly focusing on the car’s physical features. And machine learning is the core feature of self-driving cars that makes them aware of their surroundings and enables them to learn.

Another example is Slack. Internal messaging apps may seem like old news, but with integrated bots that learn how people communicate, Slack is making it much easier to communicate with fewer emails.

Internal enterprise apps and tools are also getting a much-needed makeover, moving away from spreadsheets and email toward more interactive, intelligent, self-learning apps that look and feel like consumer apps.

By taking a more product-driven approach, companies become more innovative and forward-thinking, and can take strides past competitors—especially when product managers and data scientists work hand in hand.

Here are three ways to successfully implement machine learning:

1. Bring your own code

Data scientists want to code in ways that feel most natural and intuitive to them, and they want to apply the right algorithm or approach for the use case without being tied down by the “enterprise one-tool standard.” By supporting multiple languages like R, Python, Java, Go, Octave, and Julia, you ensure new features and algorithms are at your disposal, and you hedge your bets in case the community around one language starts to shift toward another. Letting data scientists bring their own tried-and-true approaches in the language of their choice makes them that much more productive.

The downside can be having to support all those different codebases and move the code to production (see The Data Mining Group’s article). But with proper versioning (e.g., Git), branching and commit strategies, DevOps processes, and standards for environment automation and promotion, you can get the best of both worlds.
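For instance, a model built in Python can be exported to PMML, the Data Mining Group’s portable model format, so production systems don’t have to run the data scientist’s original code. The sketch below is only an illustration, assuming the open source sklearn2pmml package (which requires a local Java runtime) and placeholder data rather than a real training set.

```python
# A minimal sketch of exporting a Python model to PMML, assuming the
# sklearn2pmml package and a Java runtime are installed. The data below is
# a random placeholder standing in for a real training set.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

X = np.random.rand(500, 4)               # placeholder features
y = (X[:, 0] + X[:, 1] > 1).astype(int)  # placeholder labels

# Wrap the estimator in a PMMLPipeline so it can be serialized to PMML.
pipeline = PMMLPipeline([("classifier", LogisticRegression())])
pipeline.fit(X, y)

# Write out a language-neutral model file that a separate scoring engine
# can consume in production, independent of the training language.
sklearn2pmml(pipeline, "model.pmml")
```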

“Machine learning is the core feature of self-driving cars that makes them aware of their surroundings and enables them to learn.”

2. Embrace DevOps

Much has been written about DevOps and its ability to speed up time-to-value and innovation, and machine learning is no different. New approaches and algorithms, such as deep learning, are coming out all the time, and data scientists are trying them out through code and relying less on GUI-based interfaces. After a new approach has been tested in a sandbox environment with limited scope, it’s time to move it through development, QA, and finally production. Each of these environments can be automated with DevOps tools like Jenkins, Puppet, Chef, Ansible, and Docker. ML, like any type of analytics, can be treated as a one-off exercise to quickly inform a decision; instead, companies should treat ML models as software products: something to be deployed and maintained in quick development cycles using agile and continuous integration methodologies.
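As a concrete illustration, the gate between environments can be as simple as a training script whose exit code fails the build when model quality drops. This is only a sketch: the file name, metric, and 0.85 threshold below are assumptions, not a prescription.

```python
# A minimal sketch of a CI-friendly training script. In a Jenkins (or similar)
# pipeline, a non-zero exit code fails the build, so a model that regresses
# never gets promoted to the next environment. File name and threshold are
# hypothetical.
import sys
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

ACCURACY_THRESHOLD = 0.85  # assumed quality gate; tune per use case

df = pd.read_csv("train.csv")                      # hypothetical training data
X, y = df.drop(columns=["label"]), df["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"holdout accuracy: {accuracy:.3f}")
sys.exit(0 if accuracy >= ACCURACY_THRESHOLD else 1)
```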

Code written in R or Python can be versioned with Git just like any other software. Data scientists typically work on the same model by branching to try out different input variables and training data sets, as well as tweaking the model’s parameters in different ways. When one scientist achieves strong predictive performance with a particular model, they make a “pull request” to merge the code back into the main branch. In this way, machine learning models can be thought of as “features” in a product-driven world.
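One lightweight way to make those branch-by-branch experiments reviewable is to keep the model’s parameters in a small, versioned config file, so a pull request shows exactly which knobs changed. The file name and parameters below are hypothetical.

```python
# A minimal sketch: hyperparameters live in a versioned params.json
# (hypothetical file), so each Git branch's experiment is a small,
# reviewable diff rather than a change buried in a notebook.
import json
from sklearn.ensemble import GradientBoostingClassifier

# e.g. {"learning_rate": 0.1, "n_estimators": 200, "max_depth": 3}
with open("params.json") as f:
    params = json.load(f)

model = GradientBoostingClassifier(**params)
# Training and evaluation stay identical to the main branch, so results
# are comparable across branches before a pull request is opened.
```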

3. Use pre-made cloud environments

Machine learning development isn’t something that should be done on local computers using locally stored data. Data scientists can be much more productive by having a pre-made environment with all the tools, packages, languages, and data sets ready for development—typically done in cloud environments. These environments can be built and “rented” for periods of time, then shut down when they’re not in use. Even better, with proper imaging, virtualization, and snapshots, new data scientists who come on board get an environment identical to everyone else’s without having to create their own from scratch; the pre-made dev, QA, and production environments are already there. Firms like Domino have made a business of building such environment platforms.
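As a rough sketch of what “renting” can look like in practice, the snippet below uses AWS’s boto3 library to launch an instance from a pre-built machine image and terminate it when the session is over. The image ID and instance type are placeholders, not a recommendation of any particular setup or vendor.

```python
# A minimal sketch of renting a pre-built data science environment: launch an
# instance from a prepared machine image, do the work, then shut it down so it
# stops costing money. ImageId and InstanceType are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch one instance from a pre-baked image that already has the team's
# languages, packages, and data connections installed.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical pre-built DS image
    InstanceType="p2.xlarge",          # GPU-backed type for deep learning work
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print(f"Environment running: {instance_id}")

# ... work happens here ...

# When the session is over, shut the environment down.
ec2.terminate_instances(InstanceIds=[instance_id])
```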

“Data scientists can be much more productive by having a pre-made environment with all the tools, packages, languages, and data sets ready for development.”

Data science environments often use GPUs (graphics processing units), which are well suited to machine learning because they can handle much of the heavy compute and memory required by advanced ML approaches like deep learning.

These environments can also be updated and redeployed as new algorithms and frameworks are developed, so everyone has access to the latest enhancements. Google’s TensorFlow is one example of an open source library that can be made available through these environments.
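To make that concrete, here is a small sketch of the kind of TensorFlow training job a shared environment can support out of the box, using TensorFlow’s Keras API; the data, layer sizes, and training settings are arbitrary placeholders.

```python
# A minimal sketch of training a model with TensorFlow's Keras API, the kind
# of library a shared environment can expose to every data scientist. The
# dataset and architecture here are stand-ins, not a real use case.
import numpy as np
import tensorflow as tf

X = np.random.rand(1000, 20).astype("float32")   # stand-in feature matrix
y = (X.sum(axis=1) > 10).astype("float32")       # stand-in binary labels

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print("training accuracy:", model.evaluate(X, y, verbose=0)[1])
```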

By treating machine learning as a core product feature, rather than a data analytics capability or tool, companies can become more innovative, better understand users, increase adoption, and differentiate themselves from competitors.


Tyler Deutsch
Tyler Deutsch is a solution principal in Slalom’s Chicago office. He helps companies take advantage of big data, cloud, and machine learning to deliver innovative products and data-as-a-service. Follow him on Twitter: @TylerDeutsch.