To round out the algorithms, this week we will look more deeply at neural networks, SVMs, and Bayesian algorithms.
Neural network algorithms are used when the result you
are looking for is a moving target and there are a large number of possible
inputs, none of which has a strong individual correlation to the final
result. Programmers may use them to teach a computer how to recognize
handwriting or images. They are also commonly used for fraud detection.
There are innumerable varieties of neural network algorithms, but Azure only
deals with Directed Acyclic Graphs (“DAGs”). All Decision Trees are DAGs,
but not all DAGs are Decision Trees. “Directed” means that every
connection between two nodes has a direction. “Acyclic” means that no
matter which node you start with, if you walk through the nodes
following the directions, you will never return to the node you started
with. “Graph” here means a set of nodes joined by connections, not a
chart or plot. So, for example, a family tree is directed because parents lead
to children, and cannot go the other way. It is acyclic because your
ancestors can never be your descendants. A first-year university student
is faced with a DAG problem. They must choose subjects that follow
requirements. You cannot take a Data Science course until you have taken
a prerequisite like R programming. By adding all the subjects with their
prerequisites into a graph, you will have a DAG. In a neural network, the
nodes are simple calculations and the directed connections pass each result
on to the next node. By combining many of these simple calculations, your
model is able to learn sophisticated class boundaries and data trends.
Neural network algorithms are what power the deep learning behind the
artificial intelligence now appearing in many areas. Deep learning models
can take a long time to train.
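The course-prerequisite example above can be sketched in a few lines of Python. This is only an illustration of what "directed" and "acyclic" mean, not anything Azure ML does internally. The course names other than Data Science and R Programming, and the helper names study_order and is_dag, are made up for the example. It assumes Python 3.9 or later, where the standard library includes graphlib.

```python
from graphlib import CycleError, TopologicalSorter

# Each course maps to the set of courses that must be taken first.
# "Statistics" and "Machine Learning" are hypothetical additions.
prerequisites = {
    "R Programming": set(),
    "Statistics": set(),
    "Data Science": {"R Programming", "Statistics"},
    "Machine Learning": {"Data Science"},
}


def study_order(prereqs):
    """Return one valid order in which the courses can be taken."""
    return list(TopologicalSorter(prereqs).static_order())


def is_dag(prereqs):
    """A directed graph is acyclic exactly when a valid order exists."""
    try:
        study_order(prereqs)
        return True
    except CycleError:
        return False


order = study_order(prerequisites)

# Making R Programming depend on Machine Learning closes a loop,
# so the graph is still directed but no longer acyclic.
cyclic = {**prerequisites, "R Programming": {"Machine Learning"}}

print(order)
print(is_dag(prerequisites), is_dag(cyclic))
```

Every prerequisite appears before the course that needs it, and the version with the loop is rejected because no such order can exist.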
Support Vector Machines (SVMs) are supervised
learning methods that are used for classification, regression, and outlier
detection. When two classes of data are hard to tell apart, an SVM will
find the boundary line that separates them with the widest possible
margin. The two-class SVM in Azure ML will only separate the two classes using a
straight line. Because this ML algorithm is kept simple, it is fast, and
not prone to overfitting. Azure ML has a second kind of SVM called the
two-class locally deep SVM. This ML algorithm combines several small
linear SVM problems to produce a non-linear separation boundary, which
helps when a straight line cannot separate the classes. For detecting
anomalies in your data, Azure ML provides a separate one-class SVM.
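To make the "widest possible margin" idea concrete, here is a minimal sketch of a linear two-class SVM in plain Python. It is not how Azure ML implements its SVM. It uses ordinary batch subgradient descent on the regularized hinge loss, and the data points, learning rate, regularization strength, and function names are all assumptions chosen for illustration.

```python
def train_linear_svm(points, labels, lr=0.01, lam=0.01, epochs=10000):
    """Fit weights w and bias b by batch subgradient descent on
    lam/2 * ||w||^2 + mean(max(0, 1 - y * (w . x + b)))."""
    n = len(points)
    n_features = len(points[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        # The regularization term pulls the weights toward zero,
        # which is what favors a wide margin.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score < 1:  # point is inside the margin or misclassified
                for j in range(n_features):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify x as +1 or -1 by which side of the line it falls on."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Two small, linearly separable clusters (made-up data).
points = [(1, 1), (2, 1), (1, 2), (4, 4), (5, 4), (4, 5)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
print(w, b, predictions)
```

Only the points closest to the boundary end up influencing the final line, which is part of why a linear SVM stays simple and fast.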
Bayesian ML algorithms are excellent for ensuring you
have not overfitted your data. Azure ML has two types of Bayesian
models: the two-class Bayes’ point machine and Bayesian linear regression.
Two-class Bayes’ point machines were originally developed at Microsoft, so they
are particularly robust in Azure and definitely a source of pride at
Microsoft. Bayesian statistics treats quantities of interest as random
variables. While the core ideas of Bayesian analysis are centuries old, they
have had their biggest impact in the last 20 years through ML. The
flexibility of Bayesian analysis allows for structured models of real-world
phenomena. It allows you to trade off the complexity of a model against
how closely it fits the data.
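A small worked example may help show why treating a quantity as a random variable guards against overfitting. This is a textbook Beta-Binomial update, not the Bayes’ point machine or Bayesian linear regression that Azure ML provides. The Beta(2, 2) prior, the data, and the function names are assumptions picked for illustration.

```python
def beta_posterior(prior_alpha, prior_beta, successes, failures):
    """Update a Beta prior on a success rate after observing the data."""
    return prior_alpha + successes, prior_beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Tiny data set: 3 successes out of 3 trials.
successes, failures = 3, 0

# Fitting the data exactly says the success rate is 100%.
naive_estimate = successes / (successes + failures)

# A mild Beta(2, 2) prior says "probably somewhere near 50%".
alpha, beta = beta_posterior(2, 2, successes, failures)
posterior_mean = beta_mean(alpha, beta)

print(naive_estimate)   # 1.0
print(alpha, beta)      # 5 2
print(posterior_mean)   # 5/7, roughly 0.714
```

With only three observations, the Bayesian estimate stays well short of 100%. As more data arrives, the prior matters less and the two estimates move closer together.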
If what you are looking for in your data is very
specific, Microsoft has a few specialized algorithms that will get you
the information quickly and easily. For example, if you are looking for
anomalies in your data, instead of using a generic SVM, Azure ML has an SVM
that is designed specifically to root them out. Other algorithms have
been tuned to find specific kinds of items, which makes your job of
finding them that much easier.