Monday, 9 February 2015

Event Hubs Publisher Policy - Explained

Event Hubs, in a nutshell, is a large-scale, very high-throughput, highly durable event ingestion pipeline. Incoming events can come from a wide variety of sources, such as diagnostic logs published from web roles or telemetry flowing from devices all the way to the cloud in IoT scenarios. At the scale Event Hubs handles, the number of publishers sending data can vary from hundreds (like web roles sending diagnostic logs) to millions (of small devices sending a few bytes every minute). In most of these cases, fine-grained control over who may publish data to Event Hubs is crucial.

The scenarios include, but are not limited to:
1.      Revoking a specific rogue device, one that is compromised or malfunctioning, so that it can no longer send telemetry.
2.      Temporarily disabling diagnostics collection for a particular web role, because the role has a known problem and its logs are not required.

In such scenarios, three capabilities become extremely important:
a)      Having control at the per-sender level
b)      Identifying a particular sender
c)      Being able to revoke access for a particular sender (aka publisher)

But with SAS authentication alone, we can create a maximum of only 12 SAS rules with ‘send only’ permissions, which cannot address the problem of securing access for a million senders! Even if the Service Bus service allowed creating a SAS rule per sender, creating a million rules would not be a scalable solution. Moreover, declaring and storing the list of all million senders, most of whom will never have their access revoked, is a complete waste of resources. In fact, the only piece of information the system needs is who should NOT be allowed to send!

What we want is a secure way for every sender to authenticate to the Service Bus Event Hubs service and be identified by it. The service then only has to keep a list of the senders it will not allow, which will be very small compared to the list of all senders (a million).

So, in short, Publisher Policy enables per-publisher access control for Event Hubs, where the number of publishers can be on the order of a million. Here’s how it works:
1.      Every publisher (aka sender) can be assigned a unique token that identifies that publisher.
2.      The token combined with the Event Hubs endpoint is sufficient to be able to send.
3.      All messages that arrive with that token are implicitly identified as coming from that publisher (the sender client doesn’t need to add any publisher property to the actual message).
4.      The developer can declaratively revoke access for a particular publisher, and later restore it, using the management APIs.
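
To make the deny-list idea concrete, here is a minimal sketch of the decision the service has to make for each send. This is an illustrative in-memory model only, not how Event Hubs is implemented: the PublisherGate class, its method names, and the tokenIsValid flag (a stand-in for the service's signature and expiry check) are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of per-publisher access control.
// The real check happens inside the Event Hubs service.
public sealed class PublisherGate
{
    // Only revoked publishers are stored, never the full list of senders.
    private readonly HashSet<string> revoked = new HashSet<string>();

    public void Revoke(string publisherId)
    {
        revoked.Add(publisherId);
    }

    public void Restore(string publisherId)
    {
        revoked.Remove(publisherId);
    }

    // tokenIsValid stands in for verifying the token's signature and expiry.
    public bool CanSend(string publisherId, bool tokenIsValid)
    {
        return tokenIsValid && !revoked.Contains(publisherId);
    }
}

public static class PublisherGateDemo
{
    public static void Main()
    {
        var gate = new PublisherGate();
        Console.WriteLine(gate.CanSend("device-42", true));

        gate.Revoke("device-42");
        Console.WriteLine(gate.CanSend("device-42", true));

        gate.Restore("device-42");
        Console.WriteLine(gate.CanSend("device-42", true));
    }
}
```

The point of the sketch is the storage cost: the set grows with the number of revoked publishers, not with the number of publishers that exist.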

Implementing Event Hubs Publisher Policy using the .NET Service Bus SDK:

Step - 1: Assigning a per-publisher token
Fetch a SAS policy with Send-only permissions from the Azure portal. We will use its SAS key to generate per-publisher tokens.
Because this is a SAS token, the SharedAccessSignatureTokenProvider class offers a utility method that generates tokens given a publisher name. Remember that the ‘publisher’ parameter of this method is the identifier that you will later use to revoke that publisher’s access to send.

var publisherId = "PUBLISHER_NAME";
var namespaceUri = ServiceBusEnvironment.CreateServiceUri("sb", "ServiceBusNamespaceName", string.Empty);
// "sasKeyName" and "sasKey" identify the Send-only SAS rule; the token is valid for 10 days.
var publisherToken = SharedAccessSignatureTokenProvider.GetPublisherSharedAccessSignature(namespaceUri, "EVENTHUBNAME", publisherId, "sasKeyName", "sasKey", TimeSpan.FromDays(10));

Instead of distributing the SAS keys directly to all senders, issue these sender-scoped tokens to them. Let’s say you are using Event Hubs as a stream of telemetry events from your devices; then each device ID (or a function of the device ID, such as the device ID suffixed with a region code) can be the PublisherId.

The generated token, publisherToken, carries a SHA-2 based signature (HMAC-SHA256) computed with the SAS key over the information you passed to GetPublisherSharedAccessSignature. So no one can recover the SAS key from the token, and only the Service Bus service can validate it!
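
To show roughly what such a token contains, here is a minimal sketch that builds a Service Bus style SAS token using only the .NET base class library. It is not the SDK's implementation. The PublisherSasSketch class is hypothetical, and the sample resource URI (namespace, Event Hub, and publisher path), key name, and key are assumptions made up for illustration; the exact URI that GetPublisherSharedAccessSignature signs may differ.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: builds a SAS token of the form
// "SharedAccessSignature sr=...&sig=...&se=...&skn=...".
public static class PublisherSasSketch
{
    public static string CreateToken(string resourceUri, string keyName, string key, DateTimeOffset expiresAt)
    {
        string encodedUri = WebUtility.UrlEncode(resourceUri);
        string expiry = expiresAt.ToUnixTimeSeconds().ToString(CultureInfo.InvariantCulture);

        // The signature covers the encoded resource URI and the expiry time.
        string stringToSign = encodedUri + "\n" + expiry;

        using (var hmac = new HMACSHA256(Encoding.UTF8.GetBytes(key)))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign));
            string signature = Convert.ToBase64String(hash);

            return string.Format(
                CultureInfo.InvariantCulture,
                "SharedAccessSignature sr={0}&sig={1}&se={2}&skn={3}",
                encodedUri,
                WebUtility.UrlEncode(signature),
                expiry,
                keyName);
        }
    }

    public static void Main()
    {
        // All values below are made-up placeholders.
        string token = CreateToken(
            "sb://contoso.servicebus.windows.net/myhub/publishers/device-42",
            "sendRule",
            "hypothetical-sas-key",
            DateTimeOffset.UtcNow.AddDays(10));

        Console.WriteLine(token);
    }
}
```

Note that the resource URI (and therefore the publisher name), the expiry, and the key name travel in clear text inside the token; only the SAS key itself stays secret.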

Note: **Remember that you will be using this SAS key to generate tokens for all of your publishers.** If this SAS key is compromised, all your devices are too! Do not distribute SAS keys when using Publisher Policy.

If the token is not generated correctly in this step, the actual send calls will fail. For example, if there is a typo in the Event Hub name or the Service Bus namespace name, the GetPublisherSharedAccessSignature API cannot detect it!

This step issues a token to a particular sender, which means it should not run in the sender’s context: the code that generates tokens holds the SAS key.

Step - 2: Publish data to Event Hubs

Now that we have a token for the sender, let’s dive into the Service Bus SDK API that instantiates a sender using this token.

var connectionString = ServiceBusConnectionStringBuilder.CreateUsingSharedAccessSignature(ServiceBusEnvironment.CreateServiceUri("sb", "ServiceBusNamespaceName", string.Empty), "EVENTHUBNAME", "PUBLISHER_NAME", publisherToken);
var sender = EventHubSender.CreateFromConnectionString(connectionString);
// Await the send (inside an async method) so that failures surface as exceptions.
await sender.SendAsync(new EventData(Encoding.UTF8.GetBytes("From PUBLISHER_NAME")));

NOTE: All messages sent using a publisher identifier are hashed onto the same Event Hubs partition. The Event Hubs partition key (set in EventData) cannot be different from the publisher identifier.

Step - 3: Publisher identification

The Service Bus gateways look at the authorization token that arrives with the message, make sure that the message came with the right ‘publisherToken’, and then forward the message to the hashed Event Hubs partition.
All the messages sent from a publisher are guaranteed to land on the same Event Hubs partition.

On the consuming side, the publisher identifier that was set while sending the message can be retrieved using:
var publisher = eventData.SystemProperties[EventDataSystemPropertyNames.Publisher];

Step - 4: Revoke/Restore access to a publisher

What we did so far was to give the Service Bus Event Hubs service a way to extract the publisher identifier of every sender talking to a particular Event Hub. In essence, without the above steps, all senders publishing data to an Event Hub are one and the same!
Now that we have a way to *identify* publishers, we have the power to revoke any one of them!

Let’s see how we can do this. All management operations offered by Service Bus are present in the NamespaceManager class, and so is the revoke operation. To revoke access for a publisher, you need ‘Manage’ permissions on that particular Event Hub. Remember that the SAS key created earlier only had ‘Send’ permissions. The ‘RootManageSharedAccessKey’ will typically have this (refer to Dan’s blog if you want to know how to get the connection string from the Azure portal).

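// manageConnectionString must carry 'Manage' permissions; it is not the publisher connection string from Step 2.
var nsManager = NamespaceManager.CreateFromConnectionString(manageConnectionString);
nsManager.RevokePublisher("EVENTHUBNAME", "PUBLISHER_NAME");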

Here’s how to restore access to that publisher:
nsManager.RestorePublisher("EVENTHUBNAME", "PUBLISHER_NAME");

Note: A combination of SAS keys and Publisher Policy can be used to implement a notion of logical security groups.
For example, suppose the first batch of devices is issued publisher tokens generated with SAS key SK1, and the second batch is issued tokens generated with SK2. If there is a bug in the devices released in batch 2 and you want to stop all of them from sending data to Event Hubs, all you need to do is roll the SAS key SK2. Tokens signed with the old SK2 stop working, so once the problem is fixed, the batch 2 devices need new tokens generated with the new key.