Tuesday, October 4, 2016

Data Stream Analytics

Back in November 2015, both Amazon and Microsoft announced within a few days of each other that they would be opening cloud data centers in the UK.  On September 7, 2016, Microsoft claimed victory as the first to open, offering complete cloud services with data residency in the UK.  The Amazon data center isn’t expected to open until the end of the year, or early 2017.

Being first with its offering in the water, Microsoft has already claimed some big fish.  None are bigger than the Ministry of Defence!  The UK Ministry of Defence will be moving its computing from a secure military network to Microsoft’s Azure cloud.

That is an incredible vote of confidence in the reliability and security of Microsoft’s Azure cloud.

Now, off for a light stroll along the stream.

I have briefly mentioned data streams in my previous blog posts about Data Lakes and Data Lake Stores.  Then, I described a Data Stream as simply the source of the data that fills up the lake.  Now I will try to clear the water a bit more.

It is called a “stream” because, like its aqueous cousin, it provides a constant flow that originates from one or many sources, or “tributaries”, that all flow into a large holding tank.  Like the water systems you are familiar with, a Data Stream can vary in size from a slow trickle to an enormous flow.

Twitter is a commonly cited example of a Data Stream that many researchers have used to conduct trend analysis.  PubNub, for example, provides a link to a live syndicated Data Stream that you can use.  It is a real-time stream of messages delivered at a maximum rate of 50 tweets per second.  Microsoft’s Azure tutorials also work with Twitter’s streaming API, so you can practice with it and perhaps even conduct some research of your own.

Other examples of Data Streams that would be of more interest to a corporation would be computer network traffic, credit card transactions for fraud detection, ATM transactions, or sensor data.  Walmart streams its sales information to a central lake in order to track performance and ensure appropriate inventory.  GM monitors the sensors in your vehicle to warn you of problems, or even automatically notify 911 if you have been in an accident.

A Data Stream that may be of personal interest would be stock prices.  If you manage your own stock portfolio, you have likely set an alert to notify you when a certain stock goes above or below set values.  Sometimes these alerts trigger an automatic buy or sell in your account.  Your alert is sent out in real time, not at the end of the day.
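To make the stock alert idea concrete, here is a minimal Python sketch.  It is only an illustration, not how any brokerage or Azure service actually works: the ticker "XYZ", the prices, and the function name price_alerts are all made up for the example.  The point is that each price is checked the moment it arrives, rather than in an end-of-day batch.

```python
from typing import Iterable, Iterator, Tuple


def price_alerts(
    ticks: Iterable[Tuple[str, float]], low: float, high: float
) -> Iterator[str]:
    """Yield an alert as soon as a price leaves the [low, high] band."""
    for symbol, price in ticks:
        if price > high:
            yield f"{symbol} rose above {high}: {price}"
        elif price < low:
            yield f"{symbol} fell below {low}: {price}"


# Hypothetical sample data standing in for a live price feed.
ticks = [("XYZ", 101.0), ("XYZ", 104.5), ("XYZ", 110.2), ("XYZ", 98.7), ("XYZ", 94.9)]

alerts = list(price_alerts(ticks, low=95.0, high=110.0))
for alert in alerts:
    print(alert)
```

Because price_alerts is a generator, it would work the same way on an endless feed: each alert is produced as soon as the offending price is read.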

Instead of waiting for the data to be deposited in a lake for later analysis, Azure has tools that allow your Data Stream to be analyzed in real time.  If, for example, an ATM is about to run out of money or has malfunctioned at 11:00 AM, you don’t want to have to wait for an overnight process to tell you that there is a problem.  Customers have come to expect that your services will not be down for long.  If they are, a new stream of complaints will be produced on Twitter, Yelp, Amazon, or any of a million other places.

In Azure, you can analyze a Data Stream in real time using a language very similar to SQL.  Azure will easily scale to meet your needs, whether it is keeping tabs on a small but vital stream, or overseeing a 1 GB/second torrent and extracting the immediate concerns.
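A SQL-like streaming query typically groups incoming events into short time windows and aggregates each window.  The Python sketch below imitates that idea for the ATM example, under stated assumptions: the event layout (timestamp, ATM id, cash remaining), the 60-second window, the 1,000 threshold, and the function name low_cash_by_window are all hypothetical, and it runs over a small finite list rather than a live stream.  It is not Azure’s query language, just a picture of the kind of grouping such a query expresses.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical event layout: (timestamp in seconds, atm_id, cash_remaining)
Event = Tuple[int, str, float]


def low_cash_by_window(
    events: Iterable[Event], window_s: int, threshold: float
) -> List[Tuple[int, str, float]]:
    """Group events into fixed, non-overlapping windows and report every
    (window_start, atm_id, lowest_cash) whose lowest cash is below threshold."""
    lowest: Dict[Tuple[int, str], float] = {}
    for ts, atm_id, cash in events:
        window_start = ts - ts % window_s  # start of the window this event falls in
        key = (window_start, atm_id)
        lowest[key] = min(cash, lowest.get(key, cash))
    return sorted((w, a, c) for (w, a), c in lowest.items() if c < threshold)


# Made-up readings from two ATMs over roughly a minute and a half.
events = [
    (0, "ATM-1", 5000.0),
    (20, "ATM-2", 900.0),
    (45, "ATM-1", 4200.0),
    (70, "ATM-2", 400.0),
    (95, "ATM-1", 800.0),
]

report = low_cash_by_window(events, window_s=60, threshold=1000.0)
for window_start, atm_id, cash in report:
    print(f"window starting at {window_start}s: {atm_id} is down to {cash}")
```

A real streaming engine would emit each window’s result as soon as that window closes, instead of waiting for the whole list, but the grouping logic is the same.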

Sometimes the alert you need isn’t as simple as “the temperature in your computer room is too high.”  In the case of a political party monitoring public sentiment about specific news events, your Data Stream can be filtered through Azure’s Machine Learning tools to extract some more nuanced information.  That is one of the advantages of having all of the Azure tools available on an as-needed basis.

If you are ready to dip your toe into a stream and practice, there are tutorials built into Azure that you can access for free.  You can start your learning here: http://bit.ly/2dGjQvv

If you want to jump straight to some social media practice, Microsoft has a complete tutorial for real-time Twitter analysis here: https://azure.microsoft.com/en-us/documentation/articles/stream-analytics-twitter-sentiment-analysis-trends/

Finally, if you just want to watch someone else do it first, you can watch a 15-minute demonstration here: http://bit.ly/2dGkvgk
