Thursday, July 14, 2016
Overview of Azure Analytics offerings
In June, Microsoft announced that it would pay $26.2 billion to purchase LinkedIn. That is roughly a 50% premium over LinkedIn’s share price, which had been plummeting as the company reported substantial losses. It is the largest acquisition in Microsoft’s history and one of the largest technology acquisitions ever. Analysts have been scratching their heads trying to justify this purchase.
Although Microsoft is the world’s largest software maker, its main focus since Satya Nadella took over as CEO in 2014 has been cloud computing, machine learning, and artificial intelligence. Acquiring LinkedIn fits well with this focus. In addition to the synergies Microsoft hopes to leverage, LinkedIn has a well-respected team of data scientists who have been coveted by other tech firms.
When it comes to analyzing data in the cloud, Microsoft is going all in. The money, corporate culture, intellectual capital, and advertising it has committed all indicate how important this is to the company. The variety of robust products and services now on offer in Azure, and their integration across Microsoft’s other tools and platforms, is the proof that Microsoft is very serious about this.
My keenest interest in Azure lies with its analytics. Analytics in Azure is actually a substantial group of services that allow you to organize and analyze data. Which tool you need depends on the nature of the data and the information you are trying to tease out of it. Microsoft has them all. Here is an overview of the analytics tools on offer through Azure:
Data Lake Analytics: A data lake is a very large collection of raw data. Data lakes are a relatively new phenomenon (the term dates from around 2010) that grew as “Big Data” became a thing. When you have a steady “stream” of data filling a data lake, analytics allows you to find the subsets of data that point to correlations or trends.
HDInsight: The “HD” stands for “Hadoop Distribution”. HDInsight is only available on Azure. It provides a framework to manage and analyze big data and to generate reports from it.
Machine Learning: As I already mentioned, ML is one of Microsoft’s main focuses. ML is used to find hidden insights without having to explicitly tell the computer where to look. I covered this topic previously in a series of blog posts.
Stream Analytics: (Continuing with the water analogy.) Stream Analytics is a high-throughput, low-latency analytics service that allows for immediate understanding of real-time data.
Data Factory: Like any factory, raw materials come in, they are processed, and products (not necessarily finished) come out. A data factory accepts raw data and processes it into ready-to-use data for consumption or further analysis.
Event Hubs: An event hub is a place where millions of data points collected from millions of sensors (welcome to the Internet of Things) are received, integrated, and processed, and then shared back to the devices and applications that make use of the integrated information.
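Data Catalog: The Azure Data Catalog is a fully managed service that makes it easy to find the data sources you need. Think of it as a shared registry: users register and annotate data sources so that others in the organization can discover and understand them.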
Power BI: Power BI is Microsoft’s suite of BI tools for setting up dashboards that monitor and process your data quickly. It provides visual displays of your data that give you the big picture on any device.
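To make the Stream Analytics entry above a little more concrete, here is a minimal Python sketch of a tumbling-window average, the kind of aggregation a streaming engine computes continuously over incoming events. It does not use any Azure API. The function name `tumbling_window_average`, the `(timestamp, value)` event layout, the 10-second window, and the sample readings are all assumptions made up for illustration.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds=10):
    """Group (timestamp, value) events into fixed, non-overlapping windows
    and return the average value per window, keyed by window start time."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in events:
        # Integer division maps each timestamp to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
        counts[window_start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Hypothetical sensor readings: (seconds since start, temperature)
readings = [(1, 20.0), (4, 22.0), (12, 30.0), (19, 32.0), (25, 18.0)]
averages = tumbling_window_average(readings)
print(averages)
```

A real streaming service would emit each window's result as soon as the window closes rather than waiting for the whole list, but the grouping logic is the same idea.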
In my next blog, I will begin to explore each of these analytics tools in more detail.