In my last post, I walked
you through the process of running an experiment with Azure Machine
Learning. Before you jump into your own experiment, you would do well to
check out some examples first. Microsoft has provided many examples of
experiments, and you may be able to find something similar to what you want to
accomplish.
A list of sample
experiments can be found here.
You can browse either Trending
Experiments or Microsoft Examples. Either way, a search panel with
filters is available to find an experiment that is similar to what you want to
run. Once you find what you are looking for, you can open it in ML Studio
to see how it was built. You can run it as is, or modify it to
see what happens.
When you are finished
playing with the samples, you can use them as a template for your own
experiment. Again, all of this can be done in your free Studio workspace.
One of the trickier aspects
of running predictive analytics is choosing the correct algorithm.
Microsoft has provided a Cheat Sheet that will help you choose the right
one. You can download the Cheat Sheet here.
You can keep it
on your computer, or, if you want to print it, it fits an 11” x 17” tabloid
sheet. This Cheat Sheet will hopefully point you in the right
direction. It was designed for people who already have a firm understanding
of machine learning. So, the Cheat Sheet provides only a generalized
overview that should give some guidance. It will not point you to the
specific algorithm you need. In fact, most of the available algorithms
are not even listed in the Cheat Sheet.
Choosing the right
algorithm is a process of trial and error. Even data scientists cannot
always predict which algorithm will work best. The factors that come into
play when choosing the right algorithm include the size, quality, and nature of
the data being analyzed. How you intend to use the answer will also play
an important role. With the Cheat Sheet, you should be able to narrow it
down to a few candidates. Then you will have to run some experiments in
order to find which algorithm provides you with the best solution.
On the Cheat Sheet, working
from the START button, read the path and algorithm labels like this: “For
<path label>, use <algorithm>.”
The Cheat Sheet isn’t supposed to give
you the exact answer, only point you in the right direction. Data scientists
will tell you that the only sure way to find the best algorithm is to try them
all and compare the results.
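
To make the "try them and compare" idea concrete, here is a minimal sketch in plain Python. It is not how ML Studio does it, and nothing in it comes from the Cheat Sheet: the toy dataset, the three candidate algorithms, and every function name are assumptions I made up for illustration. The point is only the pattern: train each candidate on the same training data, score each one on the same held-out data, and rank the results.

```python
import random


def make_data(n, seed=0):
    """Hypothetical toy dataset: two noisy 2-D clusters, labeled 0 and 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.choice([0, 1])
        center = 0.0 if label == 0 else 3.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        rows.append((point, label))
    return rows


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def majority_baseline(train):
    """Candidate 1: always predict the most common training label."""
    labels = [y for _, y in train]
    most_common = max(set(labels), key=labels.count)
    return lambda point: most_common


def nearest_centroid(train):
    """Candidate 2: predict the class whose mean point is closest."""
    centroids = {}
    for label in set(y for _, y in train):
        pts = [p for p, y in train if y == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return lambda point: min(
        centroids, key=lambda k: squared_distance(point, centroids[k])
    )


def one_nearest_neighbor(train):
    """Candidate 3: copy the label of the single closest training point."""
    def predict(point):
        _, label = min(train, key=lambda row: squared_distance(point, row[0]))
        return label
    return predict


def accuracy(model, test):
    """Fraction of held-out rows the model labels correctly."""
    return sum(model(p) == y for p, y in test) / len(test)


def compare(candidates, train, test):
    """Train every candidate on the same data and rank by held-out accuracy."""
    scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


data = make_data(400, seed=42)
train, test = data[:300], data[300:]  # simple holdout split

candidates = {
    "majority baseline": majority_baseline,
    "nearest centroid": nearest_centroid,
    "1-nearest neighbor": one_nearest_neighbor,
}

results = compare(candidates, train, test)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Accuracy on a single holdout split is only one way to judge the candidates. As noted above, how you intend to use the answer matters too, so in a real experiment you may care just as much about training time or how easy the model is to explain.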