vCloud Director Meets vShield App

This article was originally posted on the VMware vCloud corporate blog. I am re-posting here for the convenience of the readers of my personal blog.

By: Massimo Re Ferre’ (Staff Systems Engineer – Global CoE) and Joe Sarabia (Sr. Consultant – Global CoE)

Background

In the last few years I have seen a rise in interest in vCloud Director use cases where multiple virtual machines (in a vApp or across vApps) share a single Layer 2 network and yet are secured at the vNIC level.

The good news is that VMware vCloud Network and Security App (formerly vShield App) does exactly that. The bad news is that vShield App is not yet consumable in self-service by a vCloud Director tenant.

The following is a slide I presented at VMworld 2011 (in session CIM2231):

As you can see, I pointed out that these security groups (aka trusted zones or enclaves) could be configured by the vShield Admin but not by the tenant.

This is what the out of the box vCloud Director experience allows you to consume from a network and security perspective:

Note: this was based on vCloud Director 1.5. With vCloud Director 5.1 the Load Balancing services are now exposed via the vCD UI/APIs as well.

Previous workarounds

In the VMworld presentation I offered a couple of solutions to work around the fact that vCloud Director does not expose vShield App functionality for tenant consumption.

The first one is what I referred to as “Managed Services”:

In essence a tenant would need to open a ticket with the cloud service provider (private or public) and ask them to put the proper tenant’s VMs inside the proper security groups – easy to implement, as it doesn’t require any development or customization, but not very “cloudy”.

The second solution I offered is what I referred to as “Self-service with customization”:

This is a good option if you are using a custom portal (where you can dispatch API calls to both vCD and vShield Manager). However, not all cloud service providers want to develop a custom portal, so it may not be a viable workaround for many customers and partners.

Fast forward a couple of years.

A better solution

New scenarios and possibilities are arising thanks to three things: the introduction of Notifications and Blocking Tasks in vCloud Director 1.5; the fact that vCO is becoming more and more core to how you build a VMware-based IaaS cloud; and the introduction of new functionality in vCloud Director 5.1, such as API extensions and Metadata tagging.

In this post I am going to focus in particular on the Metadata tagging scenario.

In vCloud Director 5.1 almost all objects (including obviously VMs) can be tagged with a key/value mechanism. For example you can say that a MySQL VM is tagged with the value DATABASE in the key SECURITYGROUP:
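As a rough illustration of the key/value idea (not the vCloud API itself), the tags on a VM can be thought of as a small dictionary. The names `vm_metadata` and `tag_value` below are made up for this sketch.

```python
# Hypothetical in-memory view of the metadata attached to the MySQL VM.
# In vCloud Director the tags live on the VM object; a dict is only a model.
vm_metadata = {"SECURITYGROUP": "DATABASE"}


def tag_value(metadata, key):
    """Return the value stored under `key`, or None if the VM is not tagged."""
    return metadata.get(key)


print(tag_value(vm_metadata, "SECURITYGROUP"))
```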

This opens up a huge number of opportunities in the context of consuming vShield App from within vCloud. Joe Sarabia and I brainstormed a bit around this a few days ago and he decided to go ahead and build a small prototype to demonstrate this. More on this later.

Before we jump into this prototype, I need to share a bit more context around what Joe implemented.

This is a graphical representation of the new scenario with a high level flow.

At a (very) high level, this is what happens:

  • A VM is tagged with a particular key/value pair

  • At Power-on a blocking task is used to stop the (Power-on) operation and call out to the AMQP bus

  • vCenter Orchestrator receives and reads the message on the AMQP bus

  • vCenter Orchestrator parses the message and matches the VM tag with a vShield App security group

  • vCenter Orchestrator runs a workflow against vShield Manager to put the VM into the proper vShield App security group
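The steps above can be sketched in a few lines of Python. This is not Joe's vCO workflow (which runs inside vCenter Orchestrator); it only mirrors the logic. The JSON message shape, the `vm_id` field, the group IDs and the two injected callables are all assumptions made for this sketch, and resuming the blocked Power-on task is left out.

```python
import json

# Hypothetical mapping from tag values to pre-created security group IDs.
SECURITY_GROUPS = {
    "DATABASE": "securitygroup-db",
    "WEB": "securitygroup-web",
}


def handle_power_on(raw_message, get_vm_tags, add_to_group):
    """Handle one blocking-task message and return the group the VM was placed in.

    get_vm_tags(vm_id) -> dict and add_to_group(vm_id, group_id) are injected
    stand-ins for the calls to vCloud Director and vShield Manager.
    """
    message = json.loads(raw_message)  # assumed JSON body with a "vm_id" field
    vm_id = message["vm_id"]
    tag_value = get_vm_tags(vm_id).get("SECURITYGROUP")
    group_id = SECURITY_GROUPS.get(tag_value)  # None when untagged or unknown
    if group_id is not None:
        add_to_group(vm_id, group_id)
    return group_id


# Usage with in-memory stand-ins instead of real vCD / vShield Manager calls.
placements = []
placed_in = handle_power_on(
    '{"vm_id": "vm-42"}',
    get_vm_tags=lambda vm_id: {"SECURITYGROUP": "DATABASE"},
    add_to_group=lambda vm_id, group_id: placements.append((vm_id, group_id)),
)
print(placed_in, placements)
```

Passing the two calls in as functions keeps the matching logic separate from whichever mechanism actually talks to the products.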

Please note that the nature of this small prototype is such that security groups are pre-created and rules defining (blocked and allowed) traffic are pre-configured.

In essence, with the logic Joe prototyped, you can consume existing security plumbing, but you cannot modify it. The idea could be that these settings are managed through tickets with the cloud service provider, while the placement of VMs in the proper security group is dynamic and automatic (policy based, according to the tagging).

You can go as far as you want with this. You can create enough logic inside vCO so that, if the metadata value doesn’t match an existing security group, the security group gets created (along with some default rules perhaps).
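A minimal sketch of that "create it if it is missing" logic, assuming security groups are kept in a plain dictionary. The function name `resolve_group`, the rule string and the data layout are hypothetical; a real workflow would call vShield Manager instead.

```python
def resolve_group(tag_value, groups, default_rules=("deny-all-inbound",)):
    """Return the group for `tag_value`, creating it with default rules if needed.

    `groups` maps a group name to a dict with "rules" and "members" lists.
    The default rule name is a placeholder, not a real vShield App rule.
    """
    if tag_value not in groups:
        groups[tag_value] = {"rules": list(default_rules), "members": []}
    return groups[tag_value]


# One pre-created group; CACHE does not exist yet and is created on demand.
groups = {"DATABASE": {"rules": ["allow-3306-from-web"], "members": []}}
resolve_group("DATABASE", groups)["members"].append("vm-42")
resolve_group("CACHE", groups)["members"].append("vm-43")
print(sorted(groups))
```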

Alternatively, the cloud administrator could leverage an existing service portal where the user can create, delete, and update security groups and associate traffic rules for later consumption via vCD.

It can be as complex and rich as you want.

Use cases

There are many use cases where this may be useful. Right now, in vCloud Director, the only way to segment traffic and protect workloads is via the Edge Gateway.

This is all good but the moment you have a lot of microsegments to deal with, you end up burning a lot of Layer 2 networks. Not to mention that an Edge Gateway, as of today, supports up to 10 networks.

This is when a mechanism that allows you to create micro security zones on a single Layer 2 network becomes very handy. Imagine a vCD virtual datacenter (aka Organization vDC) with a single Edge Gateway that maps to an External Network (Internet or Corporate Network) and to a private Routed Organization Network. On top of this Org Network you can create dozens of those security enclaves without creating additional Layer 2 networks connected to the Edge.

So far we (Joe and I) have primarily thought about microsegmenting a Routed or Internal Organization Network. We haven’t thought through the details of microsegmenting an Organization Network configured as a Direct Connect to an External Network (note that the prototype Joe built tactically uses an External Network because it was easier for him to demo that setup).

This would in turn allow different tenants to share the same External Network, each with a native external address (no NAT or static routing through the Edge), and still be protected by means of these vShield App security groups. This requires a bit of additional thinking because sharing a Layer 2 network among different tenants may have deeper implications if not properly planned. Microsegmenting a private Routed or Internal Organization Network has fewer implications and security exposures.

vShield Plugin

Those of you familiar with vCenter Orchestrator may have spotted that Joe has used the REST APIs plugin to connect to vShield Manager. I would like to say we have done this to demonstrate that vCO can connect to and orchestrate pretty much everything, but the reality is we had to do so because, at the time of this writing, a vCloud Network and Security plugin for vCO is not yet available.

This makes things a bit more time consuming because there are no native workflows and actions available to interact with the vCNS API. Instead, you have to build these yourself by parsing and building XML and using things like the HTTP-REST plug-in to generate workflows.
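To give a feel for what "parsing and building XML yourself" means, here is a small Python sketch using the standard library. In vCO this work would be done in workflow scripting rather than Python, and the element names (`member`, `vmId`, `securityGroup`, `name`) are invented for illustration; they are not the vCNS schema.

```python
import xml.etree.ElementTree as ET


def build_member_request(vm_id):
    """Build a request body that names the VM to add (hypothetical schema)."""
    root = ET.Element("member")
    ET.SubElement(root, "vmId").text = vm_id
    return ET.tostring(root, encoding="unicode")


def parse_group_names(xml_text):
    """Extract group names from a response listing groups (hypothetical schema)."""
    root = ET.fromstring(xml_text)
    return [group.findtext("name") for group in root.iter("securityGroup")]


# Made-up response body, standing in for what a REST call might return.
sample_response = (
    "<list>"
    "<securityGroup><name>DATABASE</name></securityGroup>"
    "<securityGroup><name>WEB</name></securityGroup>"
    "</list>"
)
print(build_member_request("vm-42"))
print(parse_group_names(sample_response))
```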

The potential consumption model for these extensions

This is where things become interesting and “architecturally elegant”.

An advanced (DevOps?) vCloud consumer could use these tags at run-time, when building a 3-tier application, to set the proper security characteristics on a per-VM basis.

Alternatively these tags could be assigned to a vApp by a cloud administrator or catalog administrator so that a less smart vCloud consumer could deploy the vApp from a catalog and inherit the security settings (tags) pre-defined in the vApp template in vCD.

Even more interesting, now a higher-level tool like vCloud Automation Center can leverage this infrastructure security plumbing and set those metadata tags when a blueprint gets deployed on vCD.

The beauty of this is that you don’t have to create 1:1 integrations across all products in the stack. You can implement extensions or policy enforcements at the vCD level so that both a vCD consumer and a consumer above vCD (like vCAC) can benefit from it. No need to re-invent the wheel at each layer.

Flexibility (and openness)

We intended to document this as a reference framework and architecture on how to use metadata tags to enforce policies on the platform. There are customers that are, for example, exploring the applicability of this framework to enforce affinity and anti-affinity VM placement policies on vSphere.

Others are thinking about assigning backup policies to VMs based on this metadata (e.g. a VM with BACKUP=GOLD is backed up every night while a VM with BACKUP=BRONZE is backed up every week). Of course this framework, as is, does not take into account the restore process, only the description and enforcement of the backup policy, i.e. how data needs to be protected.
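The backup example boils down to a lookup from tag value to schedule. A minimal sketch, assuming the two tiers mentioned above and a made-up helper name `backup_interval`:

```python
from datetime import timedelta

# Intervals taken from the example: GOLD is nightly, BRONZE is weekly.
BACKUP_INTERVALS = {
    "GOLD": timedelta(days=1),
    "BRONZE": timedelta(weeks=1),
}


def backup_interval(metadata):
    """Return how often a VM should be backed up, or None if it has no policy."""
    return BACKUP_INTERVALS.get(metadata.get("BACKUP"))


print(backup_interval({"BACKUP": "GOLD"}))
print(backup_interval({"BACKUP": "BRONZE"}))
```

A VM without a BACKUP tag, or with an unknown value, simply gets no schedule in this sketch.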

This is also open enough to leverage third party security mechanisms. We have documented, and Joe has prototyped, an integration with RabbitMQ, vCenter Orchestrator, vShield Manager and vShield App, but nothing would stop you from using your technology of choice to cover any of these specific areas. Did I say “open”? Really?

Last but not least, this is a great example of how cloud service providers could extend the vCloud platform without compromising compatibility with the core. This metadata approach can be enabled in any given cloud, thus allowing a user to tag VMs to get them protected (as an additional non-standard service). Nothing would stop the same tenant from deploying the same vApp in another vCloud-based cloud even if the new target doesn’t have these metadata security enforcements (essentially he could still tag the VMs, but that would have no effect).

This feature enables freedom to move for the tenant, while also allowing the cloud administrator to extend the core features and differentiate.

Conclusions

This is, in my opinion, a great example of the extensibility and richness of the platform. Note we have only discussed here the metadata tagging approach, which is geared towards policy-based enforcement at deployment time. We haven’t talked about the vCloud API extension approach, which opens up an even broader and richer set of capabilities. This could cover many other use cases (for example how you can manipulate security groups and how you can restore VMs backed up based on metadata tagging using a vCloud API call).

The use case we described here (vCD and vShield App integration) may end up being built into the core vCloud Suite one day. The other 10 million use cases that can be implemented with this framework may not.

And finally, below is a short demonstration of the prototype Joe built:

https://youtu.be/gz8OrZ1ETVk

Massimo.