Azure Data Factory V2 Quickstart Template

If you’d like a quick and easy way to get up and running with your first Azure Data Factory V2 pipeline, check out the video I just recorded walking you through how to use the Azure Quickstart Template.

To get to the template, you can search the Azure Quickstart template gallery, or here is a direct link to the ADF V2 template.

Deploying this template to your Azure subscription will build a very simple ADF pipeline using the V2 public preview service, which was announced at the end of September at Ignite.

The pipeline executes a copy activity that copies a file in Blob storage. In the example video linked above, I use it to make a copy of a file in the same blob container, but with a new filename.

To learn more about the new features in ADF V2, see our updated public documentation home page: https://docs.microsoft.com/en-us/azure/data-factory/.


SQL Server on Azure VMs – Updated Tips April 2017

There are a lot of “best practice” and “performance guide” links available on the Internet that provide guidance on migrating SQL Server instances and databases from on-prem to Azure VMs (aka Azure IaaS). In this post, I wanted to share a few findings from the field that I’ve encountered recently while working with customers on projects that migrate from on-prem SQL Server to Azure VMs. This is not a guide to migrating to Azure SQL Database, and I do not cover all of the issues, migration steps or performance measures exhaustively; I will share a number of good links to help in that respect. But Azure as a platform is changing and evolving very rapidly, so it seemed like a good time to catch my breath and share a few tips here.

Tip #1: Take the time to analyze and evaluate current on-prem workloads prior to database lift and shift

Don’t just jump right into SSMS and use “Migrate to Azure”. Benchmark your current database and resource utilization first. Measure IOPS, TX/s, CPU, RAM, network … all the classic SQL Server constraints. This will help you determine the proper class of VM for your SQL Server instances. Azure VMs come in classes of resources, and you need to make sure that you have enough capacity. I generally recommend DS_v2 and G-series VMs. You can always scale up to a higher-numbered VM in the same series in Azure. But note that resource capacity limits (RAM, IOPS, disks, network) are grouped together in each series, and they all scale up together when you upgrade your VM.
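If you want a starting point from inside SQL Server itself, the I/O DMVs give you cumulative per-file numbers. Here is a minimal sketch; sample it twice, a few minutes apart, and diff the counters to approximate IOPS and MB/s for your current workload:

-- Cumulative I/O statistics per database file since instance startup.
-- Capture this twice, N minutes apart, and diff the numbers to
-- approximate IOPS and throughput per file.
SELECT  DB_NAME(vfs.database_id) AS database_name,
        mf.physical_name,
        vfs.num_of_reads,
        vfs.num_of_writes,
        vfs.num_of_bytes_read,
        vfs.num_of_bytes_written,
        vfs.io_stall_read_ms,
        vfs.io_stall_write_ms
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  vfs.database_id = mf.database_id
        AND vfs.file_id = mf.file_id
ORDER BY vfs.num_of_bytes_read + vfs.num_of_bytes_written DESC;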

Tip #2: Look at Azure Managed Disks for production workloads

Microsoft recently released Managed Disks (https://azure.microsoft.com/en-us/services/managed-disks/), which eliminate the need to manually manage the details of the Azure IO subsystem in terms of storage accounts, relax the bandwidth and IOPS limitations that storage accounts imposed, and provide a higher level of VM reliability. The absolute best detailed description of Managed Disks is here by Igor Pagliai.

You can choose MDs whether you build a SQL Server VM from a base Windows image in Azure or from a pre-built SQL Server image. Whether you choose MDs or UMDs, make sure to use Premium disks for production environments. Azure provides a 99.9% SLA for single-instance VMs, but only if they use Premium disks.

Tip #3: Backup to URL

You are running your SQL Server locally on a VM in the cloud, so make sure that you use the Backup-to-URL feature in SQL Server to store your backups in Azure. You can use standard storage for your backups, and consider GRS to keep geo-redundant copies of your backups, relying on Azure’s behind-the-scenes async replication.
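Here is a minimal T-SQL sketch of backup-to-URL; the storage account, container, credential and database names are placeholders, so substitute your own:

-- One-time setup: a credential holding the storage account name and access key
CREATE CREDENTIAL AzureBackupCred
WITH IDENTITY = 'mystorageaccount',   -- storage account name (placeholder)
     SECRET = '<storage account access key>';
GO
-- Back up straight to blob storage; use a standard/GRS account for backup blobs
BACKUP DATABASE AdventureWorks
TO URL = 'https://mystorageaccount.blob.core.windows.net/sqlbackups/AdventureWorks.bak'
WITH CREDENTIAL = 'AzureBackupCred',
     COMPRESSION,
     STATS = 10;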

Tip #4: Understand the Azure storage subsystem

If you are not going to use Managed Disks (see above), i.e. you stay on unmanaged disks (UMDs), then you will need to manage the storage accounts yourself. Here is a good SQL Server on Azure IaaS VMs perf guide with an Azure storage explainer that I like: http://www.faceofit.com/performance-tuning-best-practices-for-sql-on-azure-vm/.

Here are some tips below on storage, disk and LUN usage within SQL Server VMs in Azure. I indicate relevance to managed disks (MDs) and unmanaged disks (UMDs):

  1. (UMD) Build 2 additional Storage Accounts (SAs) after provisioning the VM. Keep the original VHDs where they are and add new Premium disks for data and log in separate SAs. You do this from the VM’s Disks blade. Make sure to add the new disks one at a time; this way you can keep better track of the LUN #s and be sure that the volume being presented to the Windows OS is the LOG vs. the DATA LUN. SQL Server also allows you to map database files directly to blob storage using URLs, as opposed to presenting LUNs to the OS; see here. There are a number of limitations listed in that document, and since mapping raw disks to volumes and drives is very common and storage/database admins are comfortable with that approach, I typically do not recommend it at this time.
  2. (UMD/MD) Keep TempDB mapped to local disks, which are SSDs on the VM. It is also a good practice to use the local SSDs for BPE (see the T-SQL sketch after this list): https://blogs.technet.microsoft.com/dataplatforminsider/2014/09/25/using-ssds-in-azure-vms-to-store-sql-server-tempdb-and-buffer-pool-extensions/.
  3. (UMD/MD) Determine whether the database needs to be optimized for reads or writes. For read workloads, set the disk caching policy to “ReadOnly”. For write-intensive disks (and for the SQL log file disk and the TempDB disk, if you store TempDB on separate VHDs), set the disk caching policy to “None”. This is on the Attach New Disk blade.
  4. (UMD/MD) Those are raw volumes, so you need to map them to drive letters in Server Manager from Windows Server on the VM. Create a folder for LOG and a folder for DATA in each of the new volumes from the respective drive letter home dir.
  5. (UMD/MD) For TempDB, create a data (rows) file for each CPU core exposed to the VM, i.e. if the VM has 8 cores, generate 8 TempDB data files (see the T-SQL sketch after this list): http://sqlmag.com/storage/sql-server-storage-best-practices
  6. (UMD) Your I/O is now striped across multiple disks and SAs, and multi-threaded, maximizing available resources and ensuring that you are not getting capped on IOPS and TX/sec.
  7. (UMD/MD) Azure disks are limited to 1 TB, so if you need files and drives larger than 1 TB, create a virtual volume using Windows Storage Spaces to pool multiple VHDs together.
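For items 2 and 5 above, here is a minimal T-SQL sketch; the D:\TempDb and D:\BPE paths, file names and sizes are illustrative, so adjust them for your VM series and core count (the BPE statement requires SQL Server 2014 or later):

-- Item 2: move the primary TempDB data file to the local SSD (takes effect after a restart)
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'D:\TempDb\tempdb.mdf');
GO
-- Item 5: one data file per core; repeat for tempdb3 .. tempdb8 on an 8-core VM
ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'D:\TempDb\tempdb2.ndf', SIZE = 8GB, FILEGROWTH = 0);
GO
-- Item 2 (optional): put Buffer Pool Extensions on the local SSD as well
ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION ON (FILENAME = 'D:\BPE\ssdbpe.bpe', SIZE = 32 GB);
GO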

Tip #5: SQL Server images from the Azure Gallery are easy and quick, but be careful

By far, the quickest and easiest way to start up a new SQL Server in Azure is to pick the SQL Server image that you want to spin up from the Azure gallery. Just be aware that you must either (a) pick bring-your-own-license to use your existing Microsoft SQL Server licenses or (b) accept pay-as-you-go pricing that is baked into the monthly cost you see on your bill for the amount of time the VM is running. Also, you cannot upgrade SQL Server on a VM created from the gallery; you will need to migrate your databases to the next version of SQL Server on a new image.

Tip #6: High Availability & Disaster Recovery

Best practice in production is to use Availability Groups. There is a pre-built image option available in the Azure gallery for SQL Server that builds out the networking and secondary nodes for an AG for you. It also creates the ILB/ELB that Azure requires to handle routing to the primary node. If you run single-instance VMs for your SQL Server, make sure that you choose Premium Storage to get that 99.9% SLA from Microsoft Azure.
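Once the AG is up, a quick sanity check from T-SQL is to query the HADR DMVs; a minimal sketch that works on any SQL Server 2012+ replica:

-- Check the role and synchronization health of each replica in the AG
SELECT  ag.name AS ag_name,
        ar.replica_server_name,
        ars.role_desc,                  -- PRIMARY / SECONDARY
        ars.synchronization_health_desc -- HEALTHY / PARTIALLY_HEALTHY / NOT_HEALTHY
FROM    sys.availability_groups ag
JOIN    sys.availability_replicas ar
        ON ag.group_id = ar.group_id
JOIN    sys.dm_hadr_availability_replica_states ars
        ON ar.replica_id = ars.replica_id;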

That’s all that I’ll cover for now. These are some of the basics, updated as of April 2017. Since joining the Azure team @ MSFT 2 years ago, I’ve learned to be very careful about regularly updating these sorts of lists and indicating their dates of validity / spoil-by-date!

Best, Mark

How to Make Use of Azure Cognitive Services

Azure Cognitive Services is available from your Azure portal and allows you to set up accounts that enable the use of Microsoft’s pre-built algorithms and models for common advanced analytical functions such as Text Analytics, recommendation engines, and speech and vision recognition, as well as many more service-oriented APIs. These models have been trained and are ready for you to use and consume in your solutions, such as Power BI business intelligence solutions.

A very common use case is emerging in the Cloud Analytics space that relates specifically to a portion of the Text Analytics function: a number of solutions can take advantage of the Sentiment Analysis capability that Azure Cognitive Services provides. By sending a text string to the API service, you can leverage the existing Microsoft sentiment engine to determine whether the message you sent is positive, negative or neutral based upon a score.
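To give a flavor of the API, the v2.0 Sentiment endpoint takes a batch of documents and returns a score between 0 and 1 for each (closer to 1 is more positive). A request and response look roughly like this; the text and score shown are made-up examples:

Request:
{ "documents": [ { "language": "en", "id": "1", "text": "I love this new dashboard!" } ] }

Response:
{ "documents": [ { "id": "1", "score": 0.96 } ], "errors": [] }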

The best way to demonstrate this is with Power BI. I am going to briefly touch on 2 of the most common social media sources to pull real-time data from and score the messages for sentiment analysis: Twitter & Facebook.

First, the easiest with Power BI (PBI) is Facebook. There is an out-of-the-box connector for Facebook included with PBI, so you can use Get Data to bring data directly into a Power BI model from Facebook very easily.


Now, to hook into the Microsoft Sentiment Analysis, follow the method described very well on this PBI community page and from the associated PASS Virtual Chapter presentation on this very topic by Gil from DataChant: http://community.powerbi.com/t5/Community-Blog/Sentiment-Analysis-in-Power-BI/ba-p/55898. His full working sample is awesome and you can play around with it here: https://app.powerbi.com/view?r=eyJrIjoiMjU5NWM3MTMtMTg0NS00YzE2LTk0YTYtMmZkZWM4NjYwMmFjIiwidCI6IjIyNzNjNDFiLWI4ZDAtNDVhZi1iZWU2LWUwODQ5NmFlNjcxOCIsImMiOjN9.

To download and use a similar sample in the PBI Desktop, just download the PBIX from my GitHub. I recommend going through Gil’s detailed PBI Community link above; it rocks and gives you the background you need to customize this for yourself, with step-by-step instructions on how this was made.

For the demo above that I uploaded to Github, I included Gil’s Trump & Clinton Facebook election sentiment analysis data on Tab 2 of the report view on PBI. Tab 1 has Microsoft BI Facebook sentiment data where I used the Microsoft BI Facebook page as the data source. Each of those datasets is run through the Cognitive Services API for Sentiment Analysis, part of the Text Analytics suite.

To make it work for you, follow Gil’s instructions on getting an API key from Cognitive Services and plug your API key into the GetSentimentResults function in the Power Query / Edit Queries tool. You can use the free limited service for Sentiment Analysis or go to the Azure Portal and create a full-use Cognitive Services account. The code that calls out to the API and saves the sentiment score in the PBI model is in a Power Query function in the PBIX and looks like this:

(Source as table) as any =>
let
    // Serialize the incoming table of documents to JSON
    JsonRecords = Text.FromBinary(Json.FromValue(Source)),
    JsonRequest = "{""documents"": " & JsonRecords & "}",
    JsonContent = Text.ToBinary(JsonRequest, TextEncoding.Ascii),
    // POST the batch to the Text Analytics v2.0 sentiment endpoint
    Response =
        Web.Contents("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
        [
            Headers = [#"Ocp-Apim-Subscription-Key" = "your api key here",
                       #"Content-Type" = "application/json",
                       Accept = "application/json"],
            Content = JsonContent
        ]),
    // Parse the JSON response (one score per document id)
    JsonResponse = Json.Document(Response, 1252)
in
    JsonResponse

As you can tell from my sample Facebook sentiment dashboard, I did not put as much time into the report as Gil did. So take the PBIX and add your own visuals to make it better!


Now on to the 2nd example, using Twitter data. There is no out-of-the-box connector in PBI for live Twitter feeds, so I used the Solution Template for Twitter (marketing use case) from the Power BI Solution Template site: https://bpsolutiontemplates.com/.

That site provides you with a “wizard” that walks through the set-up of an Azure SQL Database, your Cognitive Services API account as I mentioned above (free or the Azure account) and the Twitter handles/keywords that you want to use as a filter. This creates a much more robust solution to grab the Twitter data, in that the solution template will spin up Logic Apps, App Service and API connectors to Twitter and Sentiment Analysis on the backend, instead of calling the APIs directly from Power BI as in the first example with Facebook above.

So, essentially, Power BI will just connect to the Azure SQL DB that you’ve chosen from the template, and that SQL DB will get populated by the Azure App that is built automatically for you. This separates the data from the business logic and from the presentation layer in PBI. The code that calls Twitter and Cognitive Services is all in the “middle tier” in the App Service in Azure, which you can open directly from your Azure Portal. Modify the API calls, strings and sentiment bins in that code:

// Sentiment analysis - Cognitive APIs
// The API returns a score between 0 and 1; rescale it to the -1..+1 range
string sentiment = await MakeSentimentRequest(tweet);
sentiment = (double.Parse(sentiment) * 2 - 1).ToString(CultureInfo.InvariantCulture);
// Bin the rescaled score to one decimal place
string sentimentBin = (Math.Floor(double.Parse(sentiment) * 10) / 10).ToString(CultureInfo.InvariantCulture);
string sentimentPosNeg = String.Empty;
if (double.Parse(sentimentBin) > 0)
{
    sentimentPosNeg = "Positive";
}
else if (double.Parse(sentimentBin) < 0)
{
    sentimentPosNeg = "Negative";
}
else
{
    sentimentPosNeg = "Neutral";
}

The out-of-the-box reports from these Power BI templates are quite good with a social graph visualization and a number of report tabs for different pivots of the Twitter data and sentiment scores.


Azure Marketplace Solutions

Yes, it’s been a while and I’m just now getting back into the swing of things, keeping MS-SQL Dude running, after my 2 year hiatus in the Open Source world.

To kick things off again, I’ll share with you a series of videos that I put together with the Microsoft Azure Partner team around the database, analytics and big data offerings from other software vendors that are available as solution template images in the Azure Marketplace:

We first started at the fundamental data level with my old friends at DataStax and their Cassandra database product

We then moved up the stack into batch analytics, streaming and distributed execution engines on Hadoop with my friends at Hortonworks

Lastly, we finished up with the end-user data visualization and BI layer from Looker

All of these data solutions are available from the Azure Marketplace link above.

You can also find several of the presentations from this series at my Slideshare site here.


SQL Server 2012 Database Backups in Azure

SQL Server 2012 includes new & updated features in SQL Server Management Studio (SSMS) that provide out-of-the-box integration with the Windows Azure platform-as-a-service cloud from Microsoft. It goes beyond just SQL Azure integration: you can now also connect to Windows Azure storage accounts where you can store files and blobs, including DACPAC & BACPAC files, essentially providing DBAs with out-of-the-box backup-to-cloud capabilities.

From a DBA’s perspective, this can be very beneficial because it allows you to take your SQL Server backups and post them into the Azure cloud, where files are automatically protected and replicated for high availability. This also eliminates the need for you to maintain infrastructure for backups locally on your site; you utilize Azure Storage for maintenance, retrieval and disaster recovery of your database backups. Here is a link with more details on Azure Storage.

Here are the steps to backup from SSMS 2012 to Windows Azure:

  1. First thing to note is that you will need to sign up for a Windows Azure account and configure a storage container. You can click here for a free trial.
  2. Now, in SSMS, choose a database to back up. But instead of the normal Backup task, select “Export Data-Tier Application”. This walks you through the process of exporting the schema with the data as a “BACPAC” file output.

  3. On the next screen in the wizard, you will select the Azure storage account and container where SQL Server will store the export. Note that it will first back up the database (schema & data) to a local file and then upload it to the cloud for you.

  4. Once the process is complete, you will see your exported backup as a BACPAC in your storage container. To restore a BACPAC, right-click the file from your container and select “Import Data-Tier Application”.

BTW, this process is identical to the way you export & import databases in SQL Azure. You can also now easily move your databases (schema and/or data) between SQL Server and SQL Azure with these BACPACs. This is not a full-service TLOG and data file backup like SQL Server native backups; it is more of a database export/import mechanism. But that is pretty much the most interactive that a DBA will get with SQL Azure anyway, because you do not perform any TLOG maintenance on SQL Azure databases. Microsoft takes care of that maintenance for you.

Nice New SQL Azure Integrations in SQL Server 2012

Have you downloaded the newest RTM (release to manufacturing) version of SQL Server 2012 yet? If not, download it here.

And if you are a big cloud computing enthusiast like I am, you will definitely want this latest version of SQL Server. SQL Server 2012 RC0 is missing some of the newly-released integrations with Azure that I am going to highlight below. These are only available in the new RTM version of SQL Server 2012:

  1. Connect directly to your Azure Storage from the Azure Storage connector in SSMS Object Explorer. Here you can use Azure blob storage for your database backups, eliminating the need for tape libraries or local storage for database backups. Azure Storage is automatically redundant, with 3 copies of your data kept safe in Microsoft data centers.
  2. It is now super-easy to move database schemas from on-premises to the Azure cloud in SQL Azure using the updated Data-Tier Application (DACPAC) capabilities. This function is much less sensitive to unsupported database objects than the first version of DAC in SQL Server 2008 R2, and it adds the ability to deploy your database to SQL Azure with a single click.
  3. SQL Azure has introduced a new built-in programmatic capability to design scale-out OLTP SQL Server solutions using a sharding technology known as Federations, which are now manageable and visible natively in SSMS (see the sketch below).
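To give a flavor of the Federations syntax in item 3, here is a minimal T-SQL sketch; the federation and key names are made up for illustration (these statements ran only against SQL Azure at the time):

-- Create a federation keyed on a BIGINT distribution key
CREATE FEDERATION Orders_Federation (CustId BIGINT RANGE)
GO
-- Route this connection to the federation member that contains CustId = 100
USE FEDERATION Orders_Federation (CustId = 100) WITH RESET, FILTERING = OFF
GO
-- Scale out by splitting a member into two at a key boundary
ALTER FEDERATION Orders_Federation SPLIT AT (CustId = 1000)
GO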

New MSSQLDUDE Videos: SQL Inventory with MAPS and WP7 Cloud BI

If you would like a short 10-minute intro on how to get started using the Microsoft Assessment and Planning Toolkit (MAPS) to inventory, discover and record your SQL Server footprint, versions, utilization, users, licenses, etc., then have a look at the video that I just posted on our Microsoft YouTube channel for the SQL Server data warehouse & BI group here in the East Coast U.S. field organization: http://www.youtube.com/watch?v=VdHs8Ee1iZI. BTW, the MAPS tool is a FREE download from Microsoft.com; see it here. This is version 6.5 of the tool, and if you have any version prior to it, I suggest that you upgrade and run another SQL Server assessment on your network. This version of MAPS now has the awesome feature of virtualization discovery, proposal and recommendations built in, which makes MAPS the premier tool when consolidating your SQL Server environment.

And if you have a Windows Phone 7 and would like to see what a WP7 mobile Cloud BI app looks like, take a look at this video that I just posted for our Microsoft field national DW & BI team: http://www.youtube.com/watch?feature=player_embedded&v=P1akSjiZasU#!. It describes the updated and improved V2 version of the WP7 app for Microsoft Cloud BI on a mobile device, which I described in my slide presentation here.

These are some of the BI data visualizations that you get out of the box with the native WP7 app on your phone: