How can we tell if a drink is beer or wine? Machine learning, of course! In this episode of Cloud AI Adventures, Yufeng walks through the 7 steps involved in applied machine learning.
The world is filled with data. Lots and lots of data: pictures, music, words, spreadsheets, videos, and more. It doesn’t look like it’s going to slow down anytime soon. Machine learning brings the promise of deriving meaning from all of that data.
From detecting skin cancer, to sorting cucumbers, to detecting escalators in need of repair, machine learning has granted computer systems entirely new abilities.
But how does it really work under the hood? Let’s walk through a basic example, and use it as an excuse to talk about the process of getting answers from your data using machine learning.
Let’s pretend that we’ve been asked to create a system that answers the question of whether a drink is wine or beer. The question-answering system that we build is called a “model”, and this model is created via a process called “training”. The goal of training is to create an accurate model that answers our questions correctly most of the time. But in order to train a model, we need to collect data to train on. This is where we begin.
Wine or Beer?
Our data will be collected from glasses of wine and beer. There are many aspects of the drinks that we could collect data on, from the amount of foam to the shape of the glass.
For our purposes, we’ll pick just two simple ones: the color (as a wavelength of light) and the alcohol content (as a percentage). The hope is that we can split our two types of drinks along these two factors alone. We’ll call these our “features” from now on: color and alcohol.
The first step in our process will be to run out to the local grocery store and buy up a bunch of different beers and wines, as well as some equipment to do our measurements: a spectrometer for measuring the color, and a hydrometer for measuring the alcohol content. Our grocery store has an electronics hardware section 🙂
Gathering Data
Once we have our equipment and booze, it’s time for our first real step of machine learning: gathering data. This step is very important because the quality and quantity of data that you gather will directly determine how good your predictive model can be. In this case, the data we collect will be the color and the alcohol content of each drink.
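To make the shape of this data concrete, here is a minimal Python sketch of how the measurements might be recorded. Everything in it is an assumption made for illustration: the `DrinkSample` type, the field names, the choice of nanometers for the wavelength, and above all the numbers, which are made-up placeholders rather than real measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrinkSample:
    """One measured drink (hypothetical record type, not from the article)."""
    color_nm: float     # color as a wavelength of light; nanometers is an assumed unit
    alcohol_pct: float  # alcohol content as a percentage
    label: str          # the known answer for this drink: "wine" or "beer"


# Placeholder values for illustration only; these are NOT real measurements.
samples = [
    DrinkSample(color_nm=610.0, alcohol_pct=5.0, label="beer"),
    DrinkSample(color_nm=599.0, alcohol_pct=13.0, label="wine"),
    DrinkSample(color_nm=693.0, alcohol_pct=14.0, label="wine"),
]


def to_features_and_labels(samples):
    """Split samples into (color, alcohol) feature pairs and their labels."""
    features = [(s.color_nm, s.alcohol_pct) for s in samples]
    labels = [s.label for s in samples]
    return features, labels


features, labels = to_features_and_labels(samples)
```

Each row pairs the two features with the known answer for that drink, which is roughly the form the data needs to take before it can be used to train a model.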
https://medium.com/towards-data-science/the-7-steps-of-machine-learning-2877d7e5548e