Service Broker Replication – Table of Contents
Using AlwaysOn Availability Groups with Service Broker Replication
Well, it has certainly been a while since the last installment of this blog series, and now I’m working at a new company and doing a pretty different kind of work. However, I’ve been getting a lot of requests to complete this series, and I still have the source-code, so let’s get to it!
When we last left off on our little adventure, I described the general architecture of my Service Broker Replication system and why I made the design decisions that I did. In this blog post, I’m going to further that discussion a bit by explaining how Service Broker and AlwaysOn Availability Groups can be used together to increase the availability of the system.
Why is Availability Important to Service Broker Replication?
Well, other than the obvious answer that availability is important to EVERY system, there is one key part of the Service Broker Replication topology that is especially susceptible to a failure, and where a failure would be devastating: the distributor database. If the distributor database fails, then not only will messages fail to be sent through the environment (and therefore the replication partners would get out of sync), but we would also lose the ability to bring new replication partners online, as we wouldn’t have the message history needed to bring them up to speed. For these reasons, some kind of availability solution is critical to Service Broker Replication.
In addition to preventing data-loss, we also need to minimize the amount of time that the distributor is unavailable during an outage. This is because the longer the distributor is inaccessible, the more our replication partners will get out of sync, and the more likely we are to have conflicts occur because expected updates aren’t being replicated across the environment. With that said, we could just implement something like Database Mirroring, Log Shipping, or a Failover Cluster Instance, and all of those are certainly viable options. However, what would happen if there were a loss of connectivity at the data-center housing our distributor database? What if that outage lasted for a couple minutes? How about a couple hours? A couple days? A couple weeks? I think you get the idea. If we implement one of the availability solutions I mentioned above, we don’t really have an answer to those questions (yes, Log Shipping and geo-clustering can potentially solve the issue, but they each have caveats I’d like to avoid, like data-loss due to backup timing and prohibitively expensive and complex SAN hardware).
Enter SQL Server 2012 Enterprise Edition and AlwaysOn Availability Groups. Chances are, if you’re at an organization that needs a solution like Service Broker Replication, then you’re probably already running Enterprise Edition. If not, you may want to consider increasing your licensing budget a bit, because Enterprise Edition has some pretty amazing features! One of those amazing features is AlwaysOn Availability Groups, which combine the best features of Database Mirroring and Failover Cluster Instances. For more information on AlwaysOn Availability Groups and why they solve a lot of availability problems, check out my AlwaysOn Availability Groups page.
The reason AlwaysOn Availability Groups are a big win for Service Broker Replication is that, unlike Failover Cluster Instances, Availability Groups don’t require any shared cluster storage. Therefore, geo-clustering with AlwaysOn Availability Groups gets MUCH easier (and cheaper) than it is with a Failover Cluster Instance. So, if you have an Availability Group that spans two data-centers, and the first data-center’s Internet connection fails, your Availability Group can automatically fail over to the node in the working data-center, and your Service Broker Replication topology will remain up and running, with no data-loss (keep in mind that automatic failover requires synchronous-commit mode; if you’re running in asynchronous-commit mode, failover is a manual step and there is a possibility for data-loss, but it’s usually pretty minimal).
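To make that a bit more concrete, here is a minimal sketch of the plumbing that lets a replication partner keep talking to the distributor after a failover. The idea is that the partner’s route points at the Availability Group listener rather than at any one server. Every name here is a placeholder I made up for illustration (the endpoint name, the route name, the service name, the listener’s DNS name, and port 4022), and it assumes Windows authentication between the instances, so adjust it to your own environment.

```sql
-- On EVERY instance that hosts a replica of the distributor database:
-- a Service Broker endpoint that listens on all IP addresses, so it also
-- answers on the listener's IP after a failover.
-- (SsbEndpoint and port 4022 are placeholder choices.)
CREATE ENDPOINT SsbEndpoint
    STATE = STARTED
    AS TCP (LISTENER_PORT = 4022, LISTENER_IP = ALL)
    FOR SERVICE_BROKER (AUTHENTICATION = WINDOWS);

-- In the replication partner's database:
-- route messages for the distributor's service to the listener name,
-- not to an individual server name.
-- (DistributorRoute, the service name, and the DNS name are hypothetical.)
CREATE ROUTE DistributorRoute
WITH SERVICE_NAME = N'//SBR/DistributorService',
     ADDRESS      = N'TCP://sbr-dist-listener.example.com:4022';

-- On the primary replica: check which commit and failover modes
-- each replica is actually running in.
SELECT replica_server_name,
       availability_mode_desc,   -- SYNCHRONOUS_COMMIT or ASYNCHRONOUS_COMMIT
       failover_mode_desc        -- AUTOMATIC or MANUAL
FROM sys.availability_replicas;
```

Because the route targets the listener, nothing on the partner side needs to change when the primary moves to the other data-center.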
Another beautiful thing about this combination is that even if messages do fail to send to the distributor or from the distributor to a replication partner during the failover process, those messages will remain queued by Service Broker on the sending side and will be resent once connectivity is restored, which is usually within a minute or two. Therefore, as long as we don’t have any data-loss at the distributor database, our replication partners will synchronize as though no failure even happened.
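If you want to watch this happen during a failover test, one way (a sketch, not the only approach) is to look at the sys.transmission_queue catalog view in the sending database. Outgoing messages sit there until the remote side acknowledges them, so the row count should climb while the distributor is unreachable and drain once it comes back.

```sql
-- Run in the sending database (a replication partner or the distributor).
-- Shows outgoing messages that have not yet been acknowledged by the target.
SELECT to_service_name,
       message_type_name,
       enqueue_time,
       transmission_status   -- empty when no error; otherwise the last send error
FROM sys.transmission_queue
ORDER BY enqueue_time;
```

A non-empty transmission_status is usually the fastest hint as to why a message is stuck, such as a route or endpoint that cannot be reached.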
In addition to making the distributor database a member of an Availability Group, you can also reap the benefits of Availability Groups at each of your replication partners, and keep your local databases and applications up and running in the event of patching, hardware failures, and losses of connectivity. However, I would consider running Availability Groups at the replication partners a lower priority than running an Availability Group at the distributor, so if cash is short, at least make sure your distributor is protected.
Stay tuned! This series gets a lot more juicy in the next installment, as I dive into the different message types that Service Broker Replication sends and how they each work. This is where we make the leap from theoretical to practical, so you won’t want to miss it!