My Journey with Red Hat CloudForms 4.2, a Cloud Management Platform

Introduction

I first discovered the cloud about five years ago. Since then, the revolutionary technology has turned me from someone who visits a physical data center into a couch potato!

However, a question started to bother me as I realized that on a larger scale, outsourcing IT maintenance might not be as appealing. How are enterprise organizations going to adopt a cloud-based IT model given their requirements for governance, risk, and compliance? I also wondered, as the cloud movement grew, how will these organizations leverage this elastic capability in a cost-effective way?

There are numerous admirable characteristics about the cloud, but to me, the chief one is its nature of utility. If you are not leveraging this characteristic, you are not realizing the cost savings cloud technology can offer. For example, with cloud utility, a business can switch servers on and off automatically in a compliant fashion. If a server is not making the business money, switch it off. If the business could be making more money by having an additional server, switch a new one on. If a developer wants to quickly try something, they can do so via the service catalog in the private cloud's self-service portal.

Readers may be thinking that this all sounds deceptively easy. First, let’s think through the details of the full stack, because it is not just the server that we need to worry about. There is the mixed bag of firewalls, load balancers, Domain Name System (DNS), middleware, and other items on the technology front. On the governance side, there are aspects related to change control, testing, business continuity, disaster recovery, ITIL, and more. Automating server builds, the components that make up the application platform, and the deployment of the application code itself is well established. But organizations will have invested in particular technologies for governance and technical architectural reasons. For example, ELB just does not have the technical feature their million-dollar application needs, so they’ll have vendor load balancer appliances.

With all of this complexity, how would one pull all these technologies, processes, and standards together?

Ultimately, we need glue for this hybrid and bespoke world. There is also a loopback in this requirement, in that the glue needs to be compliant, too. This is where Red Hat CloudForms comes in. The following content is a compilation of some things that I’ve learned over the last few projects working with CloudForms, and it will serve as an introduction for organizations interested in leveraging the technology.

My Findings and Learnings

The platform has many excellent benefits, including the following:

As you can imagine, there is certain knowledge that comes from using a platform like this in the real world, and CloudForms does not have an easy on-ramp. Although readers should stay tuned for other use cases, I have focused this article on vCenter and CloudForms Service Catalog-based requests for now. Here are some things I’ve picked up:

An orchestrator orchestrates. CloudForms is an orchestrator. Therefore, my mantra is: “Go somewhere else first and if it is not possible, then use CloudForms”. That is, if you have an IPAM system, don’t make CloudForms do IP management. If you have a configuration management system, use that to configure systems and services. CloudForms should only be used to tell the configuration management system to create a load balancer with certain characteristics.

Figure 1: Service Object Relationships

In terms of code reuse, this is a big deal. I might want to do a Microsoft Active Directory group check at the approval stage (request) and again at the end stage (task) of provisioning. The framework provides mechanisms to manage this.

This allows the script to try another context if the first fails. This works when using the script within a context, e.g. request or task. But how do we reuse code that runs in both the request and the task context?

For this, we check the vmdb_object_type and from the example above, you can see there are not just the two contexts request and task, but also vm and others.
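
A minimal sketch of this check follows. It is not the article's original snippet: the helper name provisioning_object is hypothetical, root is assumed to behave like $evm.root, and the type strings ('miq_request', 'miq_provision', 'vm') are the ones I would expect to see, so confirm them against an Object Walker dump in your own environment.

```ruby
# Sketch: return the object that matches the context the method runs in.
# `root` is assumed to behave like $evm.root (hash-style lookup by key).
def provisioning_object(root)
  case root['vmdb_object_type']
  when 'miq_request'
    root['miq_request']      # request (approval) context
  when 'miq_provision'
    root['miq_provision']    # task (provisioning) context
  when 'vm'
    # Called against an existing VM, e.g. from a button.
    vm = root['vm']
    vm && vm.miq_provision
  else
    # Unknown type: try each context in turn and use the first one present.
    root['miq_request'] || root['miq_provision']
  end
end
```

Inside an Automate method this would be called as provisioning_object($evm.root), so the same script can be attached to states in both the approval and the provisioning state machines.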

Object Walker can be called from your code block and it will dump the context to the automation.log. The vmdb_object_type value is always provided at the end. Object Reader formats the dump in a more human-friendly format. Object Walker provides you with detailed information, describing all CloudForms objects available for use in the script from which it was called. For example, userid can be accessed to “know” who the requester is. There is a wealth of very useful objects.

The requester/user logs into the portal and is presented with a service catalog form: Please fill in this form and hit Submit.

CloudForms has out-of-the-box capability to generate the Service Catalog form. Automate > Customization > Service Dialogs is the starting point.

Figure 2: Customization form generator

This GUI allows you to build the form framework. Each field in the form is an Element and the element’s name value is the variable that stores the user’s input.

Figure 3: Element name

In this case, the variable dialog_ip_addr will be created by the CloudForms automation engine and can thus be retrieved for use in the code.
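
As a rough illustration of that retrieval, the sketch below reads the value and treats a blank field as "not provided". The helper name requested_ip is hypothetical, and root is assumed to behave like $evm.root, where dialog values are typically found.

```ruby
# Sketch: read the dialog_ip_addr value entered on the service catalog form.
# `root` is assumed to behave like $evm.root (hash-style lookup by key).
def requested_ip(root)
  value = root['dialog_ip_addr'].to_s.strip
  value.empty? ? nil : value   # nil when the user left the field blank
end
```

In an Automate method this would be called as requested_ip($evm.root).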

Once we have our form structure ready, we tie the Service Dialog to a catalog item, via Services > Catalogs > Catalog Items.

Figure 4: Catalog Item

The catalog item provides entry points to the automation engine. In the case of non-generic types (VMware, AWS, etc.), it provides default values to provisioning requests (e.g. a network, IP address, number of CPUs, vCenter cluster, etc.). As previously mentioned, these default values can then be changed based on user inputs from the form. Catalog Items are then added to Catalogs (Services > Catalogs > Catalogs).

The user completes the form and then hits Submit.

Welcome to the State Machine. It is the mechanism that abstracts the developer from some important application architectural patterns such as retry, timeout, on error, etc. and therefore, makes CloudForms an enterprise orchestrator, without the enterprise-size development team. The developer only needs to write Ruby scripts, not complete software code with enterprise architectural patterns.

Figure 5: An Approval State Machine

One way to describe the CloudForms state machines is that they are the mechanisms that control the execution of standalone Ruby scripts. CloudForms steps through each state in the state machine, applying On Entry, On Exit, etc. controls for each if defined.
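
To make the retry pattern concrete, here is a small sketch of how a method can tell the state machine what to do next by setting ae_result (and, for a retry, ae_retry_interval) on the root object. The helper name and the finished flag are hypothetical; in a real method the flag would come from polling whatever the state is waiting on.

```ruby
# Sketch: signal the state machine from within a method.
# `root` is assumed to behave like $evm.root; `finished` is a hypothetical
# flag saying whether the work this state waits for has completed.
def signal_state_result(root, finished)
  if finished
    root['ae_result'] = 'ok'                 # move on to the next state
  else
    root['ae_result'] = 'retry'              # re-run this state later
    root['ae_retry_interval'] = '1.minute'   # how long to wait before retrying
  end
  root['ae_result']
end
```

The script only reports the outcome; the retry bookkeeping and timeout handling stay in the state machine.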

Once the service catalog form has been submitted, the approval process starts. Part of this process includes a resource quota validation. The flow enters the approval state machine. This is typically the first opportunity for customization of a CloudForms state machine. It is likely that the organization will have a change control system (ServiceNow, BMC Remedy, etc.). The developer can write integration code to automate the change creation process. For this, the state machine will typically be extended with a few states (Ruby scripts to execute) and attributes. State machine attributes are values provided to all scripts in the state machine as variables. In Figure 5 above, approval_type is an attribute, and a script can read it from the current object as $evm.object['approval_type'].
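
A minimal sketch of reading a state machine attribute such as approval_type is shown here. The helper name approval_type_for is hypothetical, and evm is assumed to behave like the $evm handle, whose object method returns the currently running instance.

```ruby
# Sketch: read the approval_type attribute of the current state machine.
# `evm` is assumed to behave like $evm inside an Automate method.
def approval_type_for(evm)
  # Normalise the value so that 'Auto' and 'auto' are treated the same.
  evm.object['approval_type'].to_s.downcase
end
```

In an Automate method this would be called as approval_type_for($evm).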

When it comes to integration with external systems, the state machines generally contain attributes for the external system’s API URL and authentication details (passwords are encrypted by CloudForms). That way, they are not repeated in multiple scripts; each script can just retrieve and use the attribute from the state machine. The password attribute encryption helps, too!

Approval state machines can be extended for any number of reasons, since each organization will have their own process before proceeding, which might involve checking Active Directory to ensure the requester is a member of a specific AD group.

Once the approval state machine is complete, CloudForms will check whether the requester has enough resource quota to fulfil the request. It is because of the quota policy check that the developer needs to specify the number of VMs to be provisioned at the service catalog (request) stage.

We now have an approved request, so it is time to execute it and provision something. Figure 4 showed the Provisioning Entry Point. This is the instance that will be called next. An instance can be described as a localised state machine. It will inherit your state machine’s schema, but you can change the values of each state. Usually an instance executes a method (Ruby script), so the developer now has a way to overwrite values passed to the method. This allows more granular control of a method in cases where the global state machine values don’t work.

Typically, for service catalog-based requests, the CatalogItemInitialization instance is the first script to execute. The primary function of this script is to parse the inputs provided by the requester via the form and then set up variables so that they can be used by child tasks later in the flow. For example, given the input of a user selecting a specific data center, we can retrieve the value once the flow reaches the data center placement process.

The flow now starts the actual provisioning stage. This is of course done via a state machine.

Figure 6: A Provisioning state machine

This state machine will again be extended and modified to suit each organization’s requirements. Here you are likely to write integration scripts for IPAM, DNS, AD, Configuration Management, and ITIL.

The standalone execution of methods (Ruby scripts) via states in state machines is great, but you need to know how to pass variables between methods in a state machine and also pass them to methods in other state machines. To pass variables within a state machine, use the state variable. Variables are actually available between methods in a state machine, but if a state machine retry loop is triggered, all variables are lost. To safeguard against this, use state variables to declare variables in a state machine.
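
A short sketch of state variables follows, using the set_state_var, get_state_var and state_var_exist? calls on the $evm handle. The helper names and the :reserved_ip key are hypothetical, chosen only for illustration.

```ruby
# Sketch: keep a value across states (and retries) of one state machine.
# `evm` is assumed to behave like $evm inside an Automate method.
def remember_reserved_ip(evm, ip)
  evm.set_state_var(:reserved_ip, ip)
end

def recall_reserved_ip(evm)
  # Guard with state_var_exist? so a missing value is an explicit nil.
  evm.state_var_exist?(:reserved_ip) ? evm.get_state_var(:reserved_ip) : nil
end
```

An earlier state would call remember_reserved_ip($evm, ip) and a later state in the same state machine would call recall_reserved_ip($evm).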

To persist variables between state machines, use the option function.

This is especially handy for workflows that involve a change control record — one that is created within the approval state machine and which then needs to be closed once provisioned.
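
The change control case could be sketched as below, using set_option and get_option on the request object. The helper names and the :change_id key are hypothetical, and reading the value back through task.miq_request is an assumption about how the task reaches its parent request in your workflow.

```ruby
# Sketch: hand a change record ID from the approval state machine
# to the provisioning state machine via request options.

# Approval stage: `request` is assumed to behave like $evm.root['miq_request'].
def store_change_id(request, change_id)
  request.set_option(:change_id, change_id)
end

# Provisioning stage: `task` is assumed to behave like $evm.root['miq_provision'].
def change_id_for(task)
  task.miq_request.get_option(:change_id)
end
```

The provisioning state that closes the change record can then look up the ID with change_id_for without the approval script still being in scope.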

Another use case for variable persistence is the server object. For instance, the IPAM IP ID is required to release the IP address. The ID is provided during provisioning and can be attached to the server object via custom attributes. The value can then be retrieved during retirement.
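
A sketch of that round trip with custom attributes is shown here, using custom_set and custom_get on the VM object. The helper names and the 'ipam_ip_id' key are hypothetical, and the value is stored as a string on the assumption that custom attributes hold text.

```ruby
# Sketch: attach the IPAM ID to the VM at provisioning time and read it
# back at retirement. `vm` is assumed to behave like a CloudForms VM object.
def save_ipam_id(vm, ipam_ip_id)
  vm.custom_set('ipam_ip_id', ipam_ip_id.to_s)
end

def ipam_id_from(vm)
  vm.custom_get('ipam_ip_id')
end
```

During retirement, the ID returned by ipam_id_from can be passed to the IPAM integration script to release the address.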

Figure 7: Example datastore structure

To test a code change without touching the original, copy the method under consideration into a temporary domain that executes before the original domain. You can now change the code and test it without affecting the original domain, and instantaneously revert to the original by deleting the copied method.

Figure 8: Using domain sequence for testing

Further reading and help

  1. The open source community is active and helpful. Join the forum at http://talk.manageiq.org
  2. Check out ManageIQ on Gitter
  3. I also highly recommend that you read Peter McGowan’s “Mastering Automation in CloudForms”
  4. And of course, my favourite—when you are stuck, use Object Walker

Daniel Wessels

Cloud, DevOps and Automation Consultant

I am an end-to-end Systems Engineer & Consultant with more than eighteen years' experience in various industries including Government, Investment & Retail Banking, Media, Marketing, Telecommunications, Transportation and Pharmaceuticals. I provide infrastructure and application consultancy to businesses with existing cloud footprints or that are looking at cloud adoption.
