Control Flow Introduced in Azure Data Factory V2

One of the most exciting new features in Azure Data Factory will be very familiar to data developers, data engineers and ETLers familiar with hand-coding data integration jobs or designing SSIS packages …

“Control Flow”

In general computer science terms, Control Flow is described as “the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated,” (Wikipedia https://en.wikipedia.org/wiki/Control_flow). In the Microsoft Data Integration world, tools like SQL Server Integration Services (SSIS) enables control flow via tasks and containers (https://docs.microsoft.com/en-us/sql/integration-services/control-flow/control-flow). Azure Data Factory’s V1 service was focused on building sequenced pipelines for Big Data Analytics based on time windows. The V2 (preview) version of ADF now includes workflow capabilities in pipelines that enable control flow capabilities that include parameterization, conditional execution, loops and if conditions.

If-Then

In ADF pipelines, activities are equivalent to SSIS tasks. So an IF statement in ADF Control Flow will need to be written as an activity of type “IfCondition”:

"name": "MyIfCondition",
"type": "IfCondition",
"typeProperties": { "expression":
  { "value": "@bool(pipeline().parameters.routeSelection)",
    "type": "Expression" }

IfTrue and IfFalse are properties of that activity which perform subsequent actions via Activities:

"ifTrueActivities":
  [ { "name": "CopyFromBlobToBlob1", "type": "Copy" ... ]

"ifFalseActivities":
  [ { "name": "CopyFromBlobToBlob2", "type": "Copy" ... ]

https://docs.microsoft.com/en-us/azure/data-factory/control-flow-if-condition-activity

Loops

Do-Until: https://docs.microsoft.com/en-us/azure/data-factory/control-flow-until-activity

"name": "Adfv2QuickStartPipeline",
"properties":
  { "activities":
    [ { "type": "Until", "typeProperties":
        { "expression": { "value": "@equals('false', pipeline().parameters.repeat)",
           "type": "Expression" }, "timeout": "00:01:00", "activities":
              [ { "name": "CopyFromBlobToBlob", "type": "Copy" ...

For-Each: https://docs.microsoft.com/en-us/azure/data-factory/control-flow-for-each-activity

"name": "<MyForEachPipeline>",
    "properties":
        { "activities":
             [ { "name": "<MyForEachActivity>", "type": "ForEach",
                   "typeProperties": { "isSequential": "true",
                    "items": "@pipeline().parameters.mySinkDatasetFolderPath", "activities":
                       [ { "name": "MyCopyActivity", "type": "Copy", ...

 

Parameters

Parameters are very common in SSIS packages as well to make for very flexible execution so that packages can be generalized and re-used. This is now available in ADF via parameters in control flow:

https://docs.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions

You can build a pipeline that is parameterized like this example with parameterized paths:

"name": "Adfv2QuickStartPipeline",
"properties": {
"activities": [
 {
"name": "CopyFromBlobToBlob",
"type": "Copy",
"inputs": [
 {
"referenceName": "BlobDataset",
"parameters": {
"path": "@pipeline().parameters.inputPath"
 },
"type": "DatasetReference"
 }
 ],
"outputs": [
 {
"referenceName": "BlobDataset",
"parameters": {
"path": "@pipeline().parameters.outputPath"
 },
"type": "DatasetReference"

You can then execute that pipeline via the Invoke Pipeline PowerShell cmdlet: https://docs.microsoft.com/en-us/powershell/module/azurerm.datafactoryv2/Invoke-AzureRmDataFactoryV2Pipeline?view=azurermps-4.4.1. That command will let you either pass the parameters for inputPath and outputPath either on the command line PS or define the values dynamically via parameter files, i.e. (params.json):

{
"inputPath": "markkdhi/tmp",
"outputPath": "markkdhi/user"
}

 

 

Advertisements

Azure Data Factory V2 Quickstart Template

If you’d like to try a quick & easy way to get up & running with your first V2 pipeline of Azure Data Factory, check out the video that I just recorded walking you through how to use the Azure Quickstart Template.

To get to the template, you can search the Azure Quickstart template gallery, or here is a direct like to the ADF V2 template.

Deploying this template to your Azure subscription will build a very simple ADF pipeline using the V2 public preview service, which was announced at the end of September at Ignite.

The pipeline executes a copy activity that will copy a file in blob store. In the example video I linked to above, I use it to make a copy of a file in the same blob container but with a new filename.

To learn more about the new features in ADF V2, see our updated public documentation home page: https://docs.microsoft.com/en-us/azure/data-factory/.

How to Make Use of Azure Cognitive Services

Azure Cognitive Services is a service that is available from your Azure portal which allows you to set-up accounts that enable the use of Microsoft’s pre-built algorithms and models for common advanced analytical functions such as Text Analytics, Recommendation Engines, speech and vision recognition as well as many more service-oriented APIs. These models have been trained and are ready for you to use and consume in your solutions, such as Power BI business intelligence solutions.

There is a very common use case emerging that I’m finding in the Cloud Analytics space that relates specifically to a portion of the Text Analytics function where there are a number of solutions that can take advantage of the Sentiment Analysis capabilities that Azure Cognitive Services provides. By sending in a text string to the API service, you can leverage the existing Microsoft sentiment engine to determine if the message that you sent in is positive, negative or neutral based upon a score.

The best way to demonstrate this is with Power BI. I am going to briefly touch on 2 of the most common social media sources to pull real-time data from and core the messages for sentiment analysis: Twitter & Facebook.

First, the easiest with Power BI (PBI) is Facebook. There is an out-of-the-box connector for Facebook included with PBI, so you can use Get Data to bring data directly into a Power BI model from Facebook very easily:

pbi4

Now, to hook into the Microsoft Sentiment Analysis, follow the method described very well on this PBI community page and from the associated PASS Virtual Chapter presentation on this very topic by Gil from DataChant: http://community.powerbi.com/t5/Community-Blog/Sentiment-Analysis-in-Power-BI/ba-p/55898. His full working sample is awesome and you can play around with it here: https://app.powerbi.com/view?r=eyJrIjoiMjU5NWM3MTMtMTg0NS00YzE2LTk0YTYtMmZkZWM4NjYwMmFjIiwidCI6IjIyNzNjNDFiLWI4ZDAtNDVhZi1iZWU2LWUwODQ5NmFlNjcxOCIsImMiOjN9.

To download and use a similar sample in the PBI Desktop, just download the PBIX from my Github. I recommend going into the detailed PBI Community link from Gil above, it rocks and gives you the background you need to customize this for yourself with step-by-step instructions on how this was made.

For the demo above that I uploaded to Github, I included Gil’s Trump & Clinton Facebook election sentiment analysis data on Tab 2 of the report view on PBI. Tab 1 has Microsoft BI Facebook sentiment data where I used the Microsoft BI Facebook page as the data source. Each of those datasets is run through the Cognitive Services API for Sentiment Analysis, part of the Text Analytics suite.

To make it work for you, follow Gil’s instructions on getting an API key from Cognitive Services and plug in your API Key in the GetSentimentResults function in the Power Query / Edit query tool. You can use the free limited service for Sentiment Analysis or go to the Azure Portal and create a full-use Cognitive Services account. The code for calling out to the API and saving the sentiment score in the PBI model is in a Power Query function call in the PBIX and looks like this:

(Source as table) as any =>
let
JsonRecords = Text.FromBinary(Json.FromValue(Source)),
JsonRequest = “{“”documents””: ” & JsonRecords & “}”,

JsonContent = Text.ToBinary(JsonRequest, TextEncoding.Ascii),
Response =
Web.Contents(“https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment?&#8221;,
[
Headers = [#”Ocp-Apim-Subscription-Key”= “your api key here”,
#”Content-Type”=”application/json”, Accept=”application/json”],
Content=JsonContent
]),
JsonResponse = Json.Document(Response,1252)
in
JsonResponse

As you can tell from my sample Sentiment Facebook dashboard below, I did not put as much time into the report as Gil did. So take the PBIX and add your own visuals to make it better!

pbi

Now onto the 2nd example, using Twitter data. There is no out-of-the-box connector directly into PBI for live Twitter feeds, so I used the Solution Template for Twitter (Marketing use case) from the Power BI Solution Template site: https://bpsolutiontemplates.com/.

That site provides you with a “wizard” that walks through the set-up of an Azure SQL Database, your Cognitive Services API as I mentioned above (free or the Azure account) and the Twitter handles/keywords that you want to use as a filter. This creates a much more robust solution to grab the Twitter data in that the solution template with spin-up Logic Apps, App Service and API connectors to Twitter and Sentiment Analysis on the backend instead of directly from Power BI as the first example with Facebook above.

So, essentially Power BI will just connect to your Azure SQL DB that you’ve chosen from the template and that SQL DB will get populated from the Azure App that is built automatically for you. This separates the data from the business logic and from the presentation layer in PBI. The code to call Twitter and Cognitive Services are all in the “middle tier” on the App Service in Azure, which you can open directly from your Azure Portal. Modify the API calls, strings and sentiment bins in that code:

//Sentiment analysis – Cognitive APIs
string sentiment = await MakeSentimentRequest(tweet);
sentiment = (double.Parse(sentiment) * 2 – 1).ToString(CultureInfo.InvariantCulture);
string sentimentBin = (Math.Floor(double.Parse(sentiment) * 10) / 10).ToString(CultureInfo.InvariantCulture);
string sentimentPosNeg = String.Empty;
if (double.Parse(sentimentBin) > 0)
{
sentimentPosNeg = “Positive”;
}
else if (double.Parse(sentimentBin) < 0)
{
sentimentPosNeg = “Negative”;
}
else
{
sentimentPosNeg = “Neutral”;
}

The out-of-the-box reports from these Power BI templates are quite good with a social graph visualization and a number of report tabs for different pivots of the Twitter data and sentiment scores.

pbi2

pbi3

Azure Marketplace Solutions

Yes, it’s been a while and I’m just now getting back into the swing of things, keeping MS-SQL Dude running, after my 2 year hiatus in the Open Source world.

To get back into the swing of things, I’ll share with you all a series of videos that I put together with the Microsoft Azure Partner team around the database, analytics and big data offerings from other software vendors that are available as solution template images on the Azure Marketplace:

We first started at the fundamental data level with my old friends at DataStax and their Cassandra database product

We then moved up the stack into batch analytics, streaming and distributed execution engines on Hadoop with my friends at Hortonworks

Lastly, we finished up with the end-user data visualization and BI layer from Looker

All of these data solutions are available from the Azure Marketplace link above.

You can also find several of the presentations from this series at my Slideshare site here.

 

It’s a SQL Server Weekend!

Tomorrow is SQL Saturday Philly in our Microsoft Technology Center in Malvern, PA!!

Here is the SlideShare link to my presentation on Microsoft Cloud BI

This is my blog post from today on SQL Server Pro Magazine BI Blog for Microsoft Cloud BI

Here is the SlideShare link to my presentation on Microsoft DW for SQL Server DBAs

Have a great SQL Server weekend all!!  Br, Mark

New MSSQLDUDE Videos: SQL Inventory with MAPS and WP7 Cloud BI

If you would like a short 10-minute intro on how to get started using the Microsoft Assessment and Planning Toolkit (MAPS) to inventory, discovery and record your SQL Server footprint, versions, utilization, users, licenses, etc., then you can have a look at the video that I just posted on our Microsoft YouTube channel for SQL Server data warehouse & BI group here in the East Coast U.S. field organization: http://www.youtube.com/watch?v=VdHs8Ee1iZI. BTW, the MAPS tool is a FREE download from Microsoft.com, see it here. This is version 6.5 of this tool and if you have any version prior to it, I suggest that you upgrade to it and run another SQL Server assessment on your network. This version of MAPS now has the awesome feature of virtualization discovery, proposal and recommendations built-in. This makes MAPS the premier SQL Server tool when consolidation your SQL Server environment.

And if you have a Windows Phone 7 and would like to see what a WP7 Mobile Cloud BI App would look like, take a look at this video that I just posted for our Microsoft field national DW & BI team: http://www.youtube.com/watch?feature=player_embedded&v=P1akSjiZasU#!. This describes the updated V2 improved version of the WP7 App for Microsoft Cloud BI on a mobile device which I described in my slide presentation here.

These are some of the BI data visualizations that you get out of the box with this WP7 app on your phone through the native app: