A deep look into OpenShift v2

The PaaS (Platform As A Service) market is dominated by two worthy technologies – OpenShift from Red Hat and Pivotal’s CloudFoundry. It is amazing that a disruptive technology category like PaaS is overwhelmingly dominated by these two open source ecosystems , which results in great choice for consumers.

I have used OpenShift extensively and have worked with customers on large & successful deployments. While I do believe that CloudFoundry is robust technology as well, this post will focus on what I personally know better – OpenShift.

Platform as a Service (PaaS) is a cloud application delivery model that is typically between IaaS and SaaS.

The Three Versions of OpenShift

OpenShift is Red Hat’s PaaS technology for both private and public clouds. There are three different versions: OpenShift Origin,Online and Enterprise.

OpenShift Origin, the community ( and open source) version of OpenShift, is the upstream project for the other two versions. It is hosted on GitHub and released under an Apache 2 license.

OpenShift Online is the public PaaS as a Service, currently hosted on Amazon AWS.

OpenShift Enterprise is the hardened version of OpenShift with ISV & vendor certifications.

3_flavors_of_openshift

OpenShift Terminology

The Broker and the Node are two main server types in OSE. The Broker is the manager cum orchesterator of the overall infrastructure and the overall brains behind the operation. The Nodes are VMs or baremetal servers where end user applications live.

The Broker exposes a WebGUI (a charming throwback to the 1990s) but more importantly an enterprise class and robust REST API. Typically one or more brokers that manage multiple nodes. Multiple brokers can be clustered for HA purposes.

Application

This is the typical web application or BPM application or integration application that will run on OpenShift. OpenShift v2 was focused on webapp workloads but offered a variety of mature extension points as cartridges.

You can interact with the OpenShift platform via RHC client command-line tools
you install on your local machine, the OpenShift Web Console, or a plug-in you
install in Eclipse to interact with your application in the OpenShift cloud.

Gear

A gear is a server container with a set of resources that allows users to run their
applications.

Cartridge

Cartridges give gears a personality and make them containers for specialized applications. Cartridges are the plug-
ins that house the framework or components that can be used to create and run an
application. One or more cartridges run on each gear, and the same cartridge can
run on many gears for clustering or scaling. There are two kinds of cartridges:
Standalone & Embedded. You can  also create a custom cartridge like its running in OSE. EAP+Autoconfig for some F5 stuff. E.g. Create an F5 cartridge that can call out the F5 API every-time it detects a load situation.

Gears

Scalable application

Application scaling enables your application to react to changes in traffic and au‐
tomatically allocate the necessary resources to handle your increased demand. OpenShift is unique in that it can do application scaling both ways – scale & descale – dynamically. The OpenShift infrastructure monitors incoming web traffic and automatically brings
up new gears with the appropriate web cartridge online to handle more requests.
When traffic decreases, the platform retires the extra resources. There is a web page
dedicated to explaining how scaling works on OpenShift.

Foundational Blocks of OpenShift v2 

The OpenShift Enterprise multi-tenancy model is based on Red Hat Enterprise Linux, and it provides a secure isolated environment that incorporates the following three security mechanisms:
SELinux
SELinux is an implementation of a mandatory access control (MAC) mechanism in the Linux kernel. It checks for allowed operations at a level beyond what standard discretionary access controls (DAC) provide. SELinux can enforce rules on files and processes, and on their actions based on defined policy. SELinux provides a high level of isolation between applications running within OpenShift Enterprise because each gear and its contents are uniquely labeled.
Control Groups (cgroups)
Control Groups allow you to allocate processor, memory, and input and output (I/O) resources among applications. They provide control of resource utilization in terms of memory consumption, storage and networking I/O utilization, and process priority. This enables the establishment of policies for resource allocation, thus ensuring that no system resource consumes the entire system and affects other gears or services.
Kernel Namespaces
Kernel namespaces separate groups of processes so that they cannot see resources in other groups. From the perspective of a running OpenShift Enterprise application, for example, the application has access to a running Red Hat Enterprise Linux system, although it could be one of many applications running within a single instance of Red Hat Enterprise Linux.
How does the overall system flow work? 

1. As mentioned above, the Broker and the Node are two server types in OSE. The Broker is the manager and orchesterator of the overall infrastructure. The Nodes are where the applications live.

The MongoDB that sits behind the broker manages all the metadata about the application; the Broker also manages the DNS stuff with a BIND plugin; Auth piece manages credentials and stuff.

2. As mentioned, the gear is the resource container and is where the application lives. The gear is actually CGroups and SELinux config. Using CGroups and SE Linux enables one to create high density, multitenant apps.

3. The cartridge is the framework and stack definition. E.g. Create a Tomcat definition with spring, struts installed on it.

4. The overall flow is that the user uses the REST API to request the Broker to create a Tomcat application for instance and asks for a Scaled or a NonScaled application for a given gear size(this is all completely customizable).

The Broker communicates with the Nodes and enquires which of those have the capacity and “can you host this application?”. The Broker then gets back a simple health-check from the nodes and decides to place the app on the nodes that have been identified based on a health indicator. All this communication happens via MCollective (which is based on ActiveMQ).

5. A bunch of things happen at that point of time once the Broker decides where to place the workload.

a) A UNIX level user is created on the gear. Ever application gets a UUID and a user associated with that.So it creates a home directory for the user, put the SELinux policies associated with that user on that gear, so that they have very limited and scoped access to whatever is on that gear only. If you go and if you do a ps, you cannot see anything that is not yours. It is all controlled by SELinux and all that stuff gets setup when you create an application.

b) The gear is created with whatever cgroups config with memory,cpu and disk space etc based on what the user picked.

c) The next step is the actual cartridge install and puts all actual stack (tomcat in this instance) and the associated libraries and all that stuff. Then it will go startup Tomcat on that node, do all the DNS resolution so that the app is publicly addressable.

d) Base OpenShift ships with lots of default cartridges with all kinds of applications. Spec 2.0 made it a lot easy to create new cartridges.

You can write custom cartridges than expose more ports and services across gears.

The other thing is you can build hybrid cartridges so that the first gear spun up would spin up the first framework and the next gear would spin up the next framework. So that a webfront end and then a vertex cluster running on the backend and all the intergear communication is handled by OSE. It is all very customizable.

For instance, you can add EAP and JDG/Business Rules in one hybrid cartridge and have it all be autoscaled. Or you can build multiple cartridges and one will be the main cartridge and have all these embedded cartridges back that up.

The difference between the hybrid and the single framework is that when it receives a scale-up event, it has to decide what to install on a given gear, so the logic gets a little more complicated.

openshift-enterprise-workflow

e) Networking and OSE.

The first thing to keep in mind is that all of OpenShift Online (which a lot of people know and understand) is really designed around an Amazon EC2 infrastructure. The 5 ports per gear limitation is really around EC2 limitations and does not really apply in enterprise where you are typically running this on a fully controlled VM.

Intergear communication between the gears of an app is via IPTables.And it is all configurable as well as SELinux policies with the ports etc. It is all customizable in the Enterprise version.

It is important to understand how routing works on a node to better understand the security architecture of OpenShift Enterprise. An OpenShift Enterprise node includes several front ends to proxy traffic to the gears connected to its internal network.
The HTTPD Reverse Proxy front end routes standard HTTP ports 80 and 443, while the Node.js front end similarly routes WebSockets HTTP requests from ports 8000 and 8443. The port proxy routes inter-gear traffic using a range of high ports. Gears on the same host do not have direct access to each other.
In a scaled application, at least one gear runs HAProxy to load balance HTTP traffic across the gears in the application, using the inter-gear port proxy.
OpenShift_Networking

THE COMPLETE PICTURE –
————————————

All OpenShift applications are built around a Git source control workflow – you code locally, then push your changes to the server. The server then runs a number of hooks to build and configure your application, and finally restarts your application. Optionally, applications can elect to be built using Jenkins, or run using “hot deployment” which speeds up the deployment of code to OpenShift.

As a developer on OpenShift, you make code changes on your local machine, check those changes in locally, and then “push” those changes to OpenShift.

Following a workflow –

The git push is really the trigger of a build/deploy/a start and stop. All that happens via a git push. GIT here is not really for source control but as a communication mechanism for an application push to the node.

So if one follows the workflow from the developer side, they typically use the client tools that speak REST/CLI/Eclipse to the broker i.e. create the app,or,start and stop the application. Once that application is created; everything you do from changing Tomcat config, changing your source, deploying a new war etc is all going to go through GIT.

The process is that I get to whatever client tool I use, REST/CLI/Eclipse etc to communicate to the broker asking for an application creation. The broker,via MCollective, goes to the nodes and picks the ones it intends to create the gears on.

Once the gears are created along with a Unix user/ Cgroups/SELinux/home directory/cartridge config etc based on whatever you pickup.

There it does routing configuration on the gear as all the applications bind to a loopback address on the nodes so they’re not externally routable. But on every node, there is also an Apache instance running so it multiplexes all that traffic so that all nodes are externally routable.

There is a callback to the Broker to go and handle all that DNS and the application is now accessible from the browser. At this point, all you have is a simple hello world templated app.

Also check out the below link on customizing  OSE autoscale behavior via HAProxy

https://www.openshift.com/blogs/customizing-autoscale-functionality-in-openshift

GIT and OSE

Every OpenShift application you create has its own Git repository that only you can access. If you create your application from the command line, rhc will automatically download a copy of that repository (Git calls this ‘cloning’) to your local system. If you create an application from the web console, you’ll need to tell Git to clone the repository. Find the Git URL from the application page, and then run:

$ git clone

Once you make changes, you’ll need to ‘add’ and ‘commit’ those changes – ‘add’ tells Git that a file or set of files will become part of a larger check in, and ‘commit’ completes the check in. Git requires that each commit have a message to describe it.

$ git add .
$ git commit -m “A checkin to my application”

Finally, you’re ready to send your changes to your application – you’ll ‘push’ these changes with:

$ git push

And thats it!

OpenShift v3, which was just launched this week at Red Hat Summit 2015, introduces big changes with a move to Docker and Kubernetes in lieu of vanilla linux containers & barebones orchestration. Subsequent posts will focus on v3 as it matures.

2 thoughts on “A deep look into OpenShift v2”

  1. Wonderful post on OpenShift v2 especially the workflow of git-push deployments.You described a highly complex process so succinctly in such an easy to understand manner. We are currently using Origin v2x and are considering v3. Waiting for you to write an equally deep post on v3. Thanks again.

Leave a Reply

Your email address will not be published. Required fields are marked *