Wednesday, April 27, 2016

SQL Data Warehouse and Azure ML!

Great news, on April 4th Microsoft announced support for Azure SQL Data Warehouse as a data source and a destination in Azure Machine Learning.  It is exciting to now be able to use the Azure SQL Database connection options in the Reader and Writer modules to interact with the Azure SQL Data Warehouse.  It is important to keep in mind that when you’re using the Writer module, the destination tables must already exist in SQL Data Warehouse.  So what does that mean for Azure ML?
In many machine learning projects, you want to consume data for training.  That data comes from a relational database.  Azure ML has supported reading data coming from an Azure SQL database for quite sometime. The exciting news this month is that now, we can have access to Azure SQL Data Warehouse as well.  This new feature allows us the ability to process a huge volume of data regardless of weather it is relational or not.

This new feature allows 2-way communication to the Azure SQL DW.  The process to connect to the Azure SQL DW is only slightly different from the Azure SQL DB in that the table must exist before you execute the experiment.  In addition, you must specify the number of rows that you want to write to at a single time.

To get started, first, you must create your Azure SQL DW server through the portal.  Use the reader module to read from the Azure SQL DW.  Create a new blank experiment and add a reader module.  The connection is the same as you would use for an Azure SQL DB; add the server and database using a username and password.

Remember when using Azure SQL DW the T-Sql syntax is slightly different from Azure SQL DB.  Take a look here to see what is different.  To create a table in Azure SQL DW you cannot use the GUI, you need to write it in code and that is available here.

When you are ready to write to the Azure SQL DW you will use the Writer module and specify you will only write 1 row at a time.  This limitation is due to the fact that the Writer module uses table value constructors a syntax that is incompatible with Azure SQL DW.  To fully understand the insert syntax, take a look here.

Due to the limitation of writing a single row at a time, if you need to write a large data set, it may be worth while to write to blob storage first.  Once that is done, then use the external table feature in Azure SQL DW to import the data.

Although there are limitations, they are minor.  Just remember the table must exist and you can only write 1 row at a time.  Feel free to use the external table feature to speed things up and enjoy this new feature.

No comments:

Post a Comment