SQL Azure Federations

On the latest update to the SQL Azure Database Management Tool, I see that (a) the updated UI looks a lot like the preview of the Project Barcelona SQL Server lineage tool: http://www.sqlmag.com/blog/sql-server-bi-blog-17/business-intelligence/sql-server-denali-dependencies-amp-lineage-140558. And (b) SQL Azure Federations are enabled!

What are SQL Azure Federations? Think about it as the ability to scale-out SQL Server databases: http://blogs.msdn.com/b/usisvde/archive/2011/10/18/sql-azure-roadmap-simplifies-scale-out-needs-with-bigger-data-sql-azure-federation.aspx. I know alot of SQL Server DBAs and developers that have wanted scale-out for SQL Server transactional systems for YEARS. Well, what is nice is that you can build your own scale-out solutions now with this feature built direclty into SQL Azure meant exactly for scale-out.

However, this is not yet available for SQL Server traditional on-premises databases and still requires you to manually partition and shard your database applications, but do so using the built-in SQL Azure Federations capability. Check-out this Cloud Ninja sample on Codeplex: http://shard.codeplex.com/. And the coding for SQL Azure Federations looks to be quite nice & easy:

-- Connect to federation

USE FEDERATION Orders_Federation (CustId = '00000000-0000-0000-0000-000000000000') WITH RESET, FILTERING = OFF

GO

— Create Customers table

CREATE TABLE Customers(   CustomerID uniqueidentifier NOT NULL,   CompanyName nvarchar(50) NOT NULL,   FirstName nvarchar(50),   LastName nvarchar(50),   PRIMARY KEY (CustomerId) ) FEDERATED ON (CustId = CustomerID)

GO

And there are screens available to you on the SQL Azure Management Tool to monitor and manage your Federations (which is additionally good since you won’t see this in SSMS as SQL Server does not have this capability of Federations):

That’s just my sample. When you actually start using a sharded database application, this GUI becomes very helpful to watch the Federations SPLIT and grow:

Very, very nice. But for scale-out SQL Server (non-Azure), you will need either PDW or sharding OLTP SQL Server databases with distributed partitioned views or other application-layer mechanisms. For now, that is, anyway.

SQL Azure: Scale-Out and Big Data

Perhaps the NoSQL / Big Data trend in high-performance computing, made popular by Hadoop and MapReduce will end up being the “killer app” or at least “killer capability” for cloud databases?

I find it be an interesting thought because the currently available Microsoft cloud databsae, SQL Aure, is a complete SQL Server-based transactional database complete with support for SQL Server tools, T-SQL, ODBC, referential integrity, etc. The current maximium single stand-alone database size is 50 GB.

But Microsoft has recently shown a lot of interest in providing support for scaled-out large database workloads, first with SQL Server Parallel Data Warehouse (PDW) and then the recent announcement of PDW support for Hadoop. Scale-out applications built on traditional SQL Server have been around for some time. The typical mechanisms used to do this are based on partitioning or “sharding” the data to fan-out the data and queries across SQL Servers using MS-DTC, replication or Service Broker.

SQL Azure is coming out with a built-in capability to enable this style of “sharded” partitioning called Database Federations. This is a link to a terrific write-up of using these concepts in a Big-Data application, written by Roger Jennings for Visual Studio Magazine.

Note that this capability is not yet available even in CTP (beta) for SQL Azure yet at the time that I am writing this blog post. I like the fact that these capabilities are being surfaced as a built-in T-SQL command. There will be accompanying ADO.NET library changes with APIs and methods to manipulate the distributed data and to query it appropriately as well.

Very interesting, exciting ways that SQL Azure can be used. Once I get access to the CTPs, I’ll start building out distributed apps using that capability and blog my results here for you. In the meantime, that article link above gives you some code samples to start thinking about your Big Data architectures.