How to Make Use of Azure Cognitive Services

Azure Cognitive Services is available from your Azure portal and lets you set up accounts that give you access to Microsoft’s pre-built algorithms and models for common advanced analytics functions such as Text Analytics, Recommendation Engines, speech and vision recognition, and many more service-oriented APIs. These models are already trained and ready for you to consume in your solutions, such as Power BI business intelligence solutions.

A very common use case I’m finding in the Cloud Analytics space centers on one part of the Text Analytics function: the Sentiment Analysis capability that Azure Cognitive Services provides. By sending a text string to the API, you can leverage Microsoft’s existing sentiment engine to determine whether the message is positive, negative or neutral based on a score.
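
If you are curious what that call looks like outside of Power BI, here is a minimal C# sketch (my own illustration, not taken from the samples below) that posts a single document to the same Text Analytics v2.0 sentiment endpoint used later in this post. The region, API key and sample text are placeholders you would swap for your own:

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class SentimentQuickDemo
{
    // Placeholder region and key - use your own Cognitive Services account values
    private const string Endpoint =
        "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment";
    private const string ApiKey = "your api key here";

    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", ApiKey);

            // The v2.0 sentiment API expects a "documents" array with id, language and text
            string body = "{\"documents\":[{\"id\":\"1\",\"language\":\"en\",\"text\":\"I love this product!\"}]}";
            var content = new StringContent(body, Encoding.UTF8, "application/json");

            HttpResponseMessage response = await client.PostAsync(Endpoint, content);

            // The response includes a score per document between 0 (negative) and 1 (positive)
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}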

The best way to demonstrate this is with Power BI. I am going to briefly touch on two of the most common social media sources for pulling real-time data and scoring the messages for sentiment: Twitter & Facebook.

First, the easiest with Power BI (PBI) is Facebook. There is an out-of-the-box connector for Facebook included with PBI, so you can use Get Data to bring data directly into a Power BI model from Facebook very easily:

[Screenshot: Get Data in Power BI showing the Facebook connector]

Now, to hook into Microsoft’s Sentiment Analysis, follow the method described very well on this PBI community page and in the associated PASS Virtual Chapter presentation on this very topic by Gil from DataChant: http://community.powerbi.com/t5/Community-Blog/Sentiment-Analysis-in-Power-BI/ba-p/55898. His full working sample is awesome, and you can play around with it here: https://app.powerbi.com/view?r=eyJrIjoiMjU5NWM3MTMtMTg0NS00YzE2LTk0YTYtMmZkZWM4NjYwMmFjIiwidCI6IjIyNzNjNDFiLWI4ZDAtNDVhZi1iZWU2LWUwODQ5NmFlNjcxOCIsImMiOjN9.

To download and use a similar sample in Power BI Desktop, just grab the PBIX from my GitHub. I recommend going through Gil’s detailed PBI Community post linked above; it rocks and gives you the background you need to customize this for yourself, with step-by-step instructions on how it was built.

For the demo that I uploaded to GitHub, I included Gil’s Trump & Clinton Facebook election sentiment data on Tab 2 of the PBI report view. Tab 1 has Microsoft BI Facebook sentiment data, where I used the Microsoft BI Facebook page as the data source. Each of those datasets is run through the Cognitive Services API for Sentiment Analysis, part of the Text Analytics suite.

To make it work for you, follow Gil’s instructions on getting an API key from Cognitive Services and plug your API key into the GetSentimentResults function in the Power Query / Edit Queries tool. You can use the free limited service for Sentiment Analysis or go to the Azure Portal and create a full-use Cognitive Services account. The code that calls out to the API and saves the sentiment score in the PBI model is a Power Query function in the PBIX and looks like this:

(Source as table) as any =>
let
    // Serialize the incoming table of documents into a JSON array
    JsonRecords = Text.FromBinary(Json.FromValue(Source)),
    // Wrap the records in the "documents" payload that the sentiment API expects
    JsonRequest = "{""documents"": " & JsonRecords & "}",
    JsonContent = Text.ToBinary(JsonRequest, TextEncoding.Ascii),
    // POST the payload to the Text Analytics v2.0 sentiment endpoint
    Response =
        Web.Contents("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment?",
        [
            Headers = [#"Ocp-Apim-Subscription-Key" = "your api key here",
                       #"Content-Type" = "application/json", Accept = "application/json"],
            Content = JsonContent
        ]),
    // Parse the JSON response (Windows-1252 code page) into sentiment scores
    JsonResponse = Json.Document(Response, 1252)
in
    JsonResponse

As you can tell from my sample Sentiment Facebook dashboard below, I did not put as much time into the report as Gil did. So take the PBIX and add your own visuals to make it better!

[Screenshot: sample Facebook sentiment dashboard in Power BI]

Now on to the second example, using Twitter data. There is no out-of-the-box connector in PBI for live Twitter feeds, so I used the Solution Template for Twitter (Marketing use case) from the Power BI Solution Template site: https://bpsolutiontemplates.com/.

That site provides a “wizard” that walks you through the setup of an Azure SQL Database, your Cognitive Services API key as mentioned above (free tier or a full Azure account) and the Twitter handles/keywords that you want to use as a filter. This creates a much more robust solution for grabbing the Twitter data: the solution template will spin up Logic Apps, an App Service and API connections to Twitter and Sentiment Analysis on the backend, rather than calling the API directly from Power BI as in the Facebook example above.

So, essentially, Power BI just connects to the Azure SQL DB that you chose in the template, and that SQL DB gets populated by the Azure App that is built automatically for you. This separates the data from the business logic and from the presentation layer in PBI. The code that calls Twitter and Cognitive Services is all in the “middle tier” on the App Service in Azure, which you can open directly from your Azure Portal. Modify the API calls, strings and sentiment bins in that code:

// Sentiment analysis - Cognitive APIs
// The API returns a score between 0 (most negative) and 1 (most positive);
// rescale it to -1..1 so that 0 represents neutral
string sentiment = await MakeSentimentRequest(tweet);
sentiment = (double.Parse(sentiment) * 2 - 1).ToString(CultureInfo.InvariantCulture);

// Bin the rescaled score into 0.1-wide buckets (e.g. 0.73 becomes 0.7)
string sentimentBin = (Math.Floor(double.Parse(sentiment) * 10) / 10).ToString(CultureInfo.InvariantCulture);

// Label the bin for reporting
string sentimentPosNeg = String.Empty;
if (double.Parse(sentimentBin) > 0)
{
    sentimentPosNeg = "Positive";
}
else if (double.Parse(sentimentBin) < 0)
{
    sentimentPosNeg = "Negative";
}
else
{
    sentimentPosNeg = "Neutral";
}
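
The MakeSentimentRequest helper that the snippet above awaits lives elsewhere in the generated App Service project and is not shown here. As a rough sketch of what that call boils down to (my own simplification, not the template’s actual code; the endpoint, key handling and naive JSON parsing are assumptions for illustration only), it posts the tweet text to the Text Analytics v2.0 sentiment endpoint and returns the 0-to-1 score as a string:

using System.Net.Http;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

static class SentimentClientSketch
{
    private static readonly HttpClient Client = new HttpClient();

    // Placeholder endpoint and key - replace with your own Cognitive Services values
    private const string Endpoint =
        "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment";
    private const string ApiKey = "your api key here";

    public static async Task<string> MakeSentimentRequest(string tweetText)
    {
        Client.DefaultRequestHeaders.Remove("Ocp-Apim-Subscription-Key");
        Client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", ApiKey);

        // Wrap the tweet in the "documents" payload that the v2.0 sentiment API expects
        string body = "{\"documents\":[{\"id\":\"1\",\"language\":\"en\",\"text\":\""
                      + tweetText.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"}]}";

        HttpResponseMessage response = await Client.PostAsync(
            Endpoint, new StringContent(body, Encoding.UTF8, "application/json"));
        string json = await response.Content.ReadAsStringAsync();

        // Naive parse: pull out the first "score" value (0 = most negative, 1 = most positive)
        Match match = Regex.Match(json, "\"score\"\\s*:\\s*([0-9.]+)");
        return match.Success ? match.Groups[1].Value : "0.5";
    }
}

The template’s real code handles JSON serialization and batching more carefully, so treat this only as a way to reason about the data flow before you modify it.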

The out-of-the-box reports from these Power BI templates are quite good with a social graph visualization and a number of report tabs for different pivots of the Twitter data and sentiment scores.

[Screenshots: out-of-the-box report pages from the Power BI Twitter solution template]

Azure Marketplace Solutions

Yes, it’s been a while, and I’m just now getting back to keeping MS-SQL Dude running after my 2-year hiatus in the Open Source world.

To get back into the swing of things, I’ll share with you all a series of videos that I put together with the Microsoft Azure Partner team around the database, analytics and big data offerings from other software vendors that are available as solution template images on the Azure Marketplace:

We first started at the fundamental data level with my old friends at DataStax and their Cassandra database product

We then moved up the stack into batch analytics, streaming and distributed execution engines on Hadoop with my friends at Hortonworks

Lastly, we finished up with the end-user data visualization and BI layer from Looker

All of these data solutions are available from the Azure Marketplace link above.

You can also find several of the presentations from this series at my Slideshare site here.

 

Monitor SQL Server CDC with a Custom Data Collector

I’ve posted several times over the years here @MSSQLDUDE on the benefits of the SQL Server Management Data Warehouse (MDW) for baselining your SQL Server performance and sending those measurements to a central server for monitoring. MDW makes use of the SQL Server Data Collector feature and comes out of the box. Here are some of my writings on MDW here @MSSQLDUDE: http://en.search.wordpress.com/?q=mdw&site=mssqldude.wordpress.com.

There is a debate among SQL Server DBAs about the worthiness of MDW and Data Collectors because they can appear to be a “heavy burden” on your server, adding overhead that you may not want on a production system. That being said, working with databases is always a give and take between performance, monitoring, scalability and providing the functionality that our business users & customers need to do their jobs. Any monitoring and baselining product incurs a small performance cost on your servers. There are a number of options in the market, and many DBAs write their own scripts to store DMV data in tables and files to provide similar functionality.

What I wanted to point you all to is something that I think proves the value of Data Collectors (and MDW) through extending the functionality with custom data collectors.

I’ve been working on a CDC project recently in a large BI/DW scenario, and one thing you notice when working with CDC, as opposed to longer-established SQL Server features like Replication, is that there is not much support for CDC in SSMS or other built-in SQL Server tools.

But what is really nice is the custom data collector for CDC that is included on MSDN in this terrific article on Monitoring the Change Data Capture Process. Scroll down to the Data Collector portion and the code to paste into SSMS is right there for you to create and start your data collector. This is a great way to monitor your CDC log processes.

BTW, you have to make one update to the XML in the code from MSDN … look for the SELECT @parameters code and replace the entire SELECT … XML statement with this:

-- Replace UserDB with the name of your CDC-enabled database;
-- the collected rows land in the cdc_log_scan_data output table
SELECT @parameters = CONVERT(xml, 
    N'<ns:TSQLQueryCollector xmlns:ns="DataCollectorType">
        <Query>
          <Value>SELECT * FROM UserDB.sys.dm_cdc_log_scan_sessions</Value>
          <OutputTable>cdc_log_scan_data</OutputTable>
        </Query>
      </ns:TSQLQueryCollector>');

Enjoy! Best, Mark

Big Data and the Telecom World

The complicated world of telecommunications continues to be a primary driver behind complex data analytics solutions, and I find it mentioned time and time again in Big Data use cases and scenarios.

Those of us who have lived in this world for years will probably agree with me that we’ve been pioneers in “Big Data techniques” ever since we were asked to build CDR (call detail record) data warehouses. My first CDR solution was for customer service and marketing at AT&T in the 1990s. We used Oracle for the DW and hired PhD statisticians to build models for predictive analytics on top of that data.

The marketing department was able to utilize that data to better understand customer patterns of usage and make data-driven decisions about how to package subscriptions and products. The call center team used the analytics from our cubes for market basket and association algorithms that provided reps with the ability to cross-sell to customers, which was also used by sales for up-sell opportunities to corporate accounts.

Then there were also the massive amounts of streaming data coming from network equipment, which were used by engineering and the NOC for troubleshooting, predicting outages and tuning the network. Rules for correlation, thresholds and root cause were needed to make sense of thousands of events per second without overwhelming systems and dashboards.

Does that sound familiar to today’s “Big Data use cases”? It should. We used to call these techniques CEP (complex event processing) and VLDB (very large databases). Really, at the end of the day, it meant that our DBAs, architects and developers needed to think about distributed architectures to handle the scale we were dealing with.

Today, it is a nice evolution of these techniques to see Hadoop, sharded databases, NoSQL and in-database analytics providing packaged, easier ways to process and manage systems of TB & PB scale.

Essentially, this means that these techniques are now available to all IT departments, and solutions like churn & customer analytics (churn management is the holy grail of telcos) become better and faster, with improved data sampling, because of new, emerging Big Data technologies.

I found this story on the Internet by Harish Vadada from Telecom Cloud here. It talks about T-Mobile using Big Data technologies such as Hadoop alongside databases like Oracle & SQL Server to improve the delivery of customer & churn analytics and drive both the bottom line and top line of their business. Very impressive and spot-on to what I am saying here in this post.

Cheers! Mark

 

SQL Saturday #200 Philadelphia – Microsoft Malvern – June 1

June 1 is the date for the 2013 SQL Saturday Philly @ the Microsoft Malvern office again this year. Here is the home page for the event. You can register for FREE there and also submit a session.

The event is an entire Saturday filled with FREE training on SQL Server, business intelligence, data warehousing and Big Data. The only cost is an optional $10 for the catered lunch. Or bring your own lunch, your choice!

I will present an updated version of my Big Data material called “Microsoft Big Data in the Real World” and I’ll share the updated material here on MSSQLDUDE, MSDN and KromerBigData once I have that completed and checked in.

Thanks! Best, Mark

New MSDN Blog Home for MSSQLDUDE

Hey MSSQLDUDE fans! I’m migrating over to MSDN blogs now that I’m in a new group @ Microsoft. We focus on developers, which is why I chose MSDN over TechNet as the home for the MSSQLDUDE blog. I’ll cross-post for a while, but then I’ll use this MSDN home for all Microsoft-specific postings. Anything that falls outside that scope and is more personal opinion and observation, I’ll keep @ https://mssqldude.wordpress.com/.

It’s exciting to be back @ MSDN! I haven’t maintained an MSDN blog since DBCF & MEC in 2009! http://blogs.msdn.com/b/makrom/.

For those new to MSSQLDUDE, you can come here for a variety of SQL Server topics, tips & tricks and links. Mostly, I focus on BI, DW, Cloud (Azure) and Big Data with a few posts on DBA stylings. For the most part, I stick with latest & greatest SQL Server technologies. There are plenty of excellent blogs out there on the previous versions of SQL Server that can help nearly every SQL Server DBA or developer.

Thanks! Best, Mark

And now for a quick break from Big Data blogging (sort of) …

… My apologies to long-time readers who have felt inundated with my recent focus on Big Data here @ MSSQLDUDE!

Well, today, I want to send you over to Robert Davis’ always-excellent blog where he has an in-progress series called 31 Days of SQL Server Disaster Recovery.

There is very important and great stuff in that series; I’ve started reading through it this week to see if there are things in the posts that I haven’t run into or tried before.

As was the case with Robert when he hosted brown bag training sessions for us @ MSFT, I have found some new things and good tidbits in there.

OK, now back to reality: even in the SQL Server world, database platforms are moving more & more toward schemaless-on-write designs, distributed processing, in-memory and columnar storage. These are all underpinnings of Big Data platforms, so I’m not going to let you go without bringing up my new focus area again!

Big Data will (probably this year) become part and parcel of any data warehouse or analytics solution. Now that the business community, CIOs and vendors are on board with these techniques, BI, DBA, data architect and DW pros can no longer ignore the massive amounts of data that typically sit outside of the data warehouse as “exhaust data”.

So keep learning these new technologies and capabilities and incorporate them into your 2013 strategies and architectures. You won’t be able to set them aside any longer!

Best, Mark