Hadoop in Finance: Big Data in the Pursuit of Big Bucks

I recently had the honor of being interviewed by Nick Ismail, editor at Information Age UK. We discussed the IT trends driving financial services worldwide. For this week's blogpost, I wanted to share the resulting article, published in the magazine on 22 March 2017 (http://www.information-age.com/hadoop-finance-big-data-pursuit-big-bucks-123465206/).

Apache Hadoop (the freely available data crunching engine) can be used by financial institutions to grow their top lines while reducing compliance risks often associated with “sticky” financial relationships.

One of the great features of Hadoop is that it allows enterprises to store, analyse and share multiple data streams simultaneously, a fact that is immensely useful helping the users of this technology detect anomalies in their company’s output.

There are very few industries that are as data-centric as banking and insurance. If there is one certainty in today's rapidly changing financial world, it is that both the amount and the value of data collected are increasing day by day.

Every interaction that a client or partner system has with a banking institution produces actionable data that has potential business value associated with it.

After the 2008 crisis, regulatory constraints have required the recording and reporting of more data than ever before, and, as capital and liquidity reserve requirements have also increased, the need to know exactly how much capital needs to be reserved, based on current exposures, is critical.

In order to keep up with market trends and run services more effectively, financial institutions are turning to Apache Hadoop, a platform for storing and analysing massive amounts of data that is currently gaining momentum. Its market is forecast to grow at a compound annual growth rate of 58%, surpassing $1 billion by 2020.

As data becomes digital, financial institutions can no longer ignore the importance of harnessing information. Major banks, insurers and capital markets firms are now able to leverage Hadoop for better decision making, fraud detection, new product development and more accurate forecasts.

This article will delve into some of the various opportunities presented by Hadoop in the financial sector.

Distributed computing for smarter decision making

Hadoop is often used in the provision of financial services due to its power in both data processing and analysis. Banks, insurance companies and securities firms rely on it to store and process the huge amounts of data they accrue in the course of business, which would otherwise require prohibitively expensive hardware and software licenses.

Large retail banks receive thousands of incoming applications for checking and savings accounts every week. Practice dictates that bankers would normally consult 3rd-party risk scoring services before opening an account, or granting a loan.

They can (and do) override do-not-open recommendations for applicants with poor banking histories. Many of these high-risk accounts overdraw and charge-off due to mismanagement or fraud, costing banks millions of dollars in losses. Some of this cost is passed on to the customers who responsibly manage their accounts.

Applications built on Hadoop can store and analyse multiple data streams and help, for example, regional bank managers control new account risk in their branches.

They can match banker decisions with the risk information presented at the time of decision and thereby control risk by highlighting if any individuals should be sanctioned, whether policies need updating, and whether patterns of fraud are identifiable.

Over time, the accumulated data informs algorithms that may detect subtle, high-risk behaviour patterns unseen by the bank’s risk analysts.
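
To make the above concrete, below is a minimal PySpark sketch of the kind of analysis such an application might run against the data lake – joining account-opening decisions with the third-party risk scores available at decision time and ranking branches by their rate of overridden "do not open" recommendations. The dataset paths and column names are purely illustrative.

```python
# A minimal sketch: join account-opening decisions with the third-party risk
# scores that were available at decision time, then surface branches with a
# high rate of overridden "do not open" recommendations. Paths and column
# names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("new-account-risk").getOrCreate()

decisions = spark.read.parquet("/data/lake/account_decisions")   # app_id, branch_id, banker_id, decision
risk = spark.read.parquet("/data/lake/third_party_risk_scores")  # app_id, recommendation, risk_score

overrides = (
    decisions.join(risk, "app_id")
    .withColumn("override",
                (F.col("decision") == "OPEN") & (F.col("recommendation") == "DO_NOT_OPEN"))
)

# Branches ranked by how often bankers override do-not-open recommendations
branch_risk = (
    overrides.groupBy("branch_id")
    .agg(F.count("*").alias("applications"),
         F.sum(F.col("override").cast("int")).alias("overrides"))
    .withColumn("override_rate", F.col("overrides") / F.col("applications"))
    .orderBy(F.desc("override_rate"))
)
branch_risk.show(20)
```

Over time, the same accumulated decision and outcome data can feed the pattern-detection algorithms described above.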

This is especially important to the insurance industry, which seeks to gather actionable intelligence on prospective clients quickly in order to make better decisions concerning risk, especially in "sticky" financial relationships (like mortgages) which often last for more than 10 years.

Being successful in this means millions in savings a year, which can then be passed on to low-risk clientele or company shareholders. Hadoop also supports predictive modelling, allowing enterprises to quantify future risk and develop market strategies accordingly.

This information can also then be turned into publicly reportable intelligence, increasing the perceived expertise (and thus brand value) of the enterprise engaged in data analysis.

Turning data into fraud protection

Not only does Hadoop help build a company's prestige, it also protects the company from professional malfeasance that could damage it. This deserves careful study in the financial sector, since protection against fraud is one of the main benefits firms obtain from their data analysis efforts.

Because of the massive capacity for storage in Data Lakes, extensive records can be constantly collated and updated, including what decisions were made, what risks were present at the time of decision, how internal policies on the issue have changed over time, and whether there have been emerging patterns of fraud.

This is crucial also because maintaining control over data and understanding how to query it are set to be massively important considerations for regulatory reporting and adherence. Banks need to understand their existing customer data to predict and modify mortgages as appropriate for borrowers in financial distress.

Tracking money laundering

Following the establishment of international anti-money laundering (AML) requirements to prevent the cross-border flow of funds to criminal and terrorist organisations, malicious actors have chosen to hide in the ever more complex world of trading, away from banks.

This is a global business valued at more than $18.3 trillion, formed of an intricate web of different languages and legal systems.

Due to storage limitations, leading institutions have been unable to archive historical trading data logs, reducing the amount of information available for risk analysis until after close of business. This gap creates a window of time in which money laundering or rogue trading can go unnoticed.

Hadoop is now able to provide unprecedented speed-to-analytics and an extended data retention timeline, allowing more visibility into all trading activities for a comprehensive and thorough scrutiny.

The trading risk group accesses this shared data lake to process more position, execution and balance data. They can do this analysis on data from the current workday, and it remains highly available for at least five years – much longer than before.

The bottom line

These benefits, ranging from faster data analysis to money-laundering protection, come from being able to act on hitherto undetectable evidence and customer patterns.

One of the great features of Hadoop is that it allows enterprises to store, analyse and share multiple data streams simultaneously, a fact that is immensely useful helping the users of this technology detect anomalies in their company’s output.

A regional bank manager, for example, might have to control new account risk in their branches, the indicators of which can be extremely subtle and nigh imperceptible to the average employee.

Using Hadoop, this information can be collected, scanned for anomalies over various branches, and sent by an automated ticker plant in real time to the relevant decision maker, saving time and improving operational margins.

All the factors outlined above have real-world effects on the bottom line of modern institutions, and this is only truer for the financial industry. The fact that big data has been warmly welcomed by the financial community only makes the competitive disadvantage felt by tech laggards more pronounced. Data matters more and more, just as data surrounds us more and more, so it is important that businesses accept and learn to navigate this environment.

Sourced by Vamsi Chemitiganti, GM financial services, Hortonworks

The Definitive Reference Architecture for Market Surveillance (CAT, UMIR and MiFID II) in Capital Markets..

We have discussed the topic of market surveillance reporting in some depth in previous blogs, e.g. http://www.vamsitalkstech.com/?p=2984. Over the last decade, global financial markets have embraced the high speed of electronic trading. This trend has only accelerated with the concomitant explosion in trading volumes. The diverse range of instruments & the proliferation of trading venues pose massive regulatory challenges in the area of market conduct supervision and abuse prevention. Banks, broker dealers, exchanges and other market participants across the globe are now shelling out millions of dollars in fines for failure to accurately report on market abuse violations. In response to this complex world of high volume & low touch electronic trading, global capital markets regulators have been hard at work across different jurisdictions & global hubs, e.g. FINRA in the US, IIROC in Canada and ESMA in the European Union. Regulators have created extensive reporting regimes for surveillance with a view to detecting suspicious patterns of trade behavior (e.g. dumping, quote stuffing & non-bona fide fake orders). The intent is to increase market transparency on both the buy and the sell side. Given the scrutiny capital markets players are under, a Big Data Analytics based architecture has become a "must-have" to ensure timely & accurate compliance with these mandates. This blog attempts to discuss such a reference architecture.

Business Technology Requirements for Market Surveillance..

The business requirements for the Surveillance architecture are covered at the below link in more detail but are reproduced below in a concise fashion.

A POV on European Banking Regulation.. MAR, MiFiD II et al

Some of the key business requirements that can be distilled from regulatory mandates include the below:

  • Store heterogeneous data – Both MiFID II and MAR mandate the need to perform trade monitoring & analysis on not just real time data but also historical data spanning a few years. Among others, this will include data feeds from a range of business systems – trade data, eComms, aComms, valuation & position data, order management systems, position management systems, reference data, rates, market data, client data, front, middle & back office data, voice, chat & other internal communications etc. To sum up, the ability to store a range of cross-asset (almost all kinds of instruments), cross-format (structured & unstructured including voice) and cross-venue (exchange, OTC etc.) trading data with a higher degree of granularity is key.
  • Data Auditing – Such stored data needs to be fully auditable for 5 years. This implies not just being able to store it but also putting capabilities in place to ensure strict governance & audit trail capabilities.
  • Manage a huge volume increase in data storage requirements (5+ years) due to extensive Record keeping requirements
  • Perform Realtime Surveillance & Monitoring of data – Once data is collected, normalized & segmented, the system will need to support realtime monitoring of data (around 5 seconds) to ensure that every trade can be tracked through its lifecycle. Detecting patterns that could indicate market abuse and monitoring for best execution are key.
  • Business Rules – The core logic that identifies some of the above trade patterns is created using business rules. Business rules have been covered in various areas of this blog; they primarily work on an IF..THEN..ELSE construct (a minimal sketch follows this list).
  • Machine Learning & Predictive Analytics – A variety of supervised and unsupervised learning approaches can be used to perform extensive behavioral modeling & segmentation of transaction behavior, with a view to identifying behavioral patterns of traders & any outlier behaviors that connote potential regulatory violations.
  • A Single View of an Institutional Client- From the firm’s standpoint, it would be very useful to have a single view capability for clients that shows all of their positions across multiple desks, risk position, KYC score etc.
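
To illustrate the IF..THEN..ELSE nature of the business rules mentioned above, here is a deliberately tiny sketch in Python. The thresholds, field names and rule choices are illustrative assumptions only, not a statement of what any particular surveillance product does.

```python
# Minimal illustration of IF..THEN..ELSE style surveillance rules.
# Thresholds and field names are illustrative assumptions.
def evaluate_order(order, avg_daily_volume):
    """Return a list of alert dicts if the order breaches simple surveillance rules."""
    alerts = []

    # IF the order is unusually large relative to average daily volume THEN alert
    if order["quantity"] > 0.10 * avg_daily_volume:
        alerts.append({"rule": "LARGE_ORDER", "order_id": order["order_id"]})

    # IF cancels/replacements cluster in a very short window THEN flag possible quote stuffing
    if order["cancel_count"] > 50 and order["window_seconds"] <= 5:
        alerts.append({"rule": "EXCESSIVE_CANCELS", "order_id": order["order_id"]})

    # ELSE (no breach) the order passes through untouched for downstream processing
    return alerts

# Example usage
order = {"order_id": "ORD-1", "quantity": 500_000, "cancel_count": 3, "window_seconds": 60}
print(evaluate_order(order, avg_daily_volume=2_000_000))
```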

A Reference Architecture for Market Surveillance ..

This reference architecture aims to provide generic guidance to banking business IT architects building solutions in the realm of market & trade surveillance. It supports a host of hugely important global regulatory reporting mandates – CAT, MiFID II, MAR etc. – that capital markets firms need to comply with. While the concepts discussed in this solution architecture are definitely Big Data oriented, they are largely agnostic to any cloud implementation – private, public or hybrid.

A Market Surveillance system needs to include both real time surveillance of trading activity as well as a retrospective (batch oriented) analysis component. The real time component includes the ability to perform realtime calculations (concerning thresholds, breached limits etc.) and real time queries with the goal of triggering alerts. Both these kinds of analytics span structured and unstructured data sources. For the batch component, the analytics range from data queries and simple to advanced statistics (min, max, avg, std deviation, sorting, binning, segmentation) to running data science models involving text analysis & search etc.

The system needs to process tens of millions to billions of events in a trading window while providing the highest uptime guarantees. Batch analysis is always running in the background.
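
As a concrete (and hedged) illustration of the real time component, the sketch below uses Spark Structured Streaming to read trade events from Kafka and flag traders whose notional in a 5 second window breaches a limit. The topic name, schema and threshold are assumptions, and running it requires the Spark-Kafka connector on the classpath.

```python
# Hedged sketch of the real time component: read trade events from Kafka and
# flag traders whose notional in a 5-second window breaches a limit.
# Topic name, schema and threshold are assumptions; requires the
# spark-sql-kafka connector package at submit time.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("realtime-surveillance").getOrCreate()

schema = StructType([
    StructField("trader_id", StringType()),
    StructField("instrument", StringType()),
    StructField("notional", DoubleType()),
    StructField("event_time", TimestampType()),
])

trades = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "trades")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("t"))
    .select("t.*")
)

# Aggregate notional per trader over tumbling 5-second windows and keep breaches
breaches = (
    trades.withWatermark("event_time", "30 seconds")
    .groupBy(F.window("event_time", "5 seconds"), "trader_id")
    .agg(F.sum("notional").alias("window_notional"))
    .filter(F.col("window_notional") > 50_000_000)   # illustrative limit
)

query = breaches.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```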

A Hadoop distribution that includes components such as Kafka and HBase along with near real time components such as Storm & Spark Streaming provides a good fit for a responsive architecture. Apache NiFi, with its ability to ingest data from a range of sources, is preferred for its ability to support complex data routing, transformation, and system mediation logic in a complex event processing architecture. The capabilities of Hortonworks Data Flow (the enterprise version of Apache NiFi) are covered in the below blogpost in much detail.

Use Hortonworks Data Flow (HDF) To Connect The Dots In Financial Services..(3/3)

A Quick Note on Data Ingestion..

Data volumes in the area of regulatory reporting range from huge to insanely massive. For instance, at large banks they can go up to hundreds of millions of transactions a day. At market venues such as stock exchanges, they easily run into hundreds of billions of messages every trading day. However, the data itself is extremely powerful & really is business gold in terms of allowing banks to not just file mundane regulatory reports but also to perform critical line of business processes such as Single View of Customer, Order Book Analysis, TCA (Transaction Cost Analysis), Algo Backtesting, Price Creation Analysis etc. The architecture thus needs to support multiple modes of storage, analysis and reporting, ranging from compliance reporting to data science to business intelligence.

Real time processing in this proposed architecture is powered by Apache NiFi. There are five important reasons for this decision –

  • First of all, complex rules can be defined in NiFi in a very flexible manner. As an example, one can execute SQL queries in processor A against incoming data from any source (data that isn't from a relational database but JSON, Avro etc.) and then route different results to different downstream processors based on the processing needs while enriching the data. For example, Processor A could be event driven and, if any data is routed there, a field can be added or an alert sent to XYZ. Essentially this can be very complex, equivalent to a nested rules engine so to speak (a minimal sketch of this style of routing follows this list).
  • From a throughput standpoint, a single NiFi node can typically handle somewhere between 50MB/s and 150MB/s depending on your hardware spec and data structure. Assuming average message sizes of 100-500 KB and a target throughput of 600MB/s, the architecture can be sized at about 5-10 NiFi nodes. It is important to note that the latency of inbound message processing depends on the network and can be extremely small. Under the hood, you are sending data from the source to a NiFi node (disk), extracting some attributes in memory for processing, and delivering to the target system.
  • Data quality can be handled via the aforementioned “nested rules engine” approach, consisting of multiple NiFi processors. One can even embed an entire rules engine into a single processor. Similarly, you can define simple authentication rules at the event level. For instance, if Field A = English, route the message to an “authenticated” relationship; otherwise send it to an “unauthenticated” relationship.

  • One of the cornerstones of NiFi is "Data Provenance", which gives you end to end traceability. Not only can the event lifecycle of trade data be traced, but you can also track the time at which a change happened, the user role that made it and metadata around why it happened.

  • Security – NiFi enables authentication at ingest. One can authenticate data via the rules defined in NiFi, or leverage target system authentication which is implemented at processor level. For example, the PutHDFS processor supports kerberized HDFS, the same applies for Solr and so on.
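
The routing, data quality and event-level authentication logic described above is configured in NiFi using processors and relationships rather than code; the small Python sketch below merely illustrates the equivalent "nested rules engine" logic in an executable form. Field names and routes are hypothetical.

```python
# Illustration only: NiFi expresses this logic as processors and relationships;
# this sketch shows the equivalent nested-rules routing in plain Python.
# Field names and route names are hypothetical.
def route_record(record):
    """Return (relationship, record) for a single flow-file record."""
    # Data quality gate: reject records missing mandatory fields
    if not record.get("trade_id") or record.get("notional") is None:
        return "invalid", record

    # Simple event-level authentication rule (mirrors the "Field A = English" example above)
    if record.get("language") != "English":
        return "unauthenticated", record

    # Enrichment plus conditional routing, akin to nested IF..THEN..ELSE processors
    record["priority"] = "high" if record["notional"] > 10_000_000 else "normal"
    return ("alerts" if record["priority"] == "high" else "standard"), record

# Example usage
print(route_record({"trade_id": "T1", "notional": 25_000_000, "language": "English"}))
```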

Overall Processing flow..

The below illustration shows the high-level conceptual architecture. The architecture is composed of core platform services and application-level components to facilitate the processing needs across three major areas of a typical surveillance reporting solution:

  • Connectivity to a range of trade data sources
  • Data processing, transformation & analytics
  • Visualization and business connectivity
Reference Architecture for Market Surveillance Reg Reporting – CAT, MAR, MiFID II et al

The overall processing of data follows the order shown below, as depicted in the diagram above –

  1. Data Production – Data related to trades and their lifecycle is produced from a range of business systems, including (but not limited to) trade data, valuation & position data, order management systems, position management systems, reference data, rates, market data, client data, front, middle & back office data, voice, chat & other internal communications.
  2. Data Ingestion – Data produced by the above layer is ingested using Apache NiFi from the range of sources described above. Data can also be filtered and alerts can be set up based on complex event logic. For time series data, HBase can be leveraged along with OpenTSDB. For CEP requirements, such as sliding windows and complex operators, NiFi can be leveraged along with a Kafka and Storm pipeline. Using NiFi makes it easier to load data into the data lake while applying guarantees around the delivery itself. Data can be streamed in real time as it is created in the feeder systems. Data is also loaded at the end of the trading day based on the P&L sign off and the end of day close processes. The majority of the data will be fed in from Book of Record Trading systems as well as from market data providers.
  3. As trade and other data is ingested into the data lake, it is important to note that the route by which certain streams are processed will differ from how other streams are processed. Thus the ingest architecture needs to support multiple types of processing, ranging from in memory processing to intermediate transformation processing on certain data streams to produce a different representation of the stream. This is where NiFi adds critical support, not just in handling a huge transaction throughput but also in enabling "on the fly processing" of data in pipelines. As mentioned, NiFi does this via the concept of "processors".
  4. The core data processing platform is then based on a datalake pattern which has been covered in this blog before. It includes the following pattern of processing.
    1. Data is ingested in real time into an HBase database (which uses HDFS as the underlying storage layer). Tables are designed in HBase to store the profile of a trade and its lifecycle.
    2. Producers are authenticated at the point of ingest.
    3. Once the data has been ingested into HDFS, it is taken through a pipeline of processing (L0 to L3) as depicted in the below blogpost.

      http://www.vamsitalkstech.com/?p=667

    4. Historical data (defined as T+1) in the HDFS tier is taken through layers of processing as discussed above. One of the key areas of processing is to run machine learning on the data to discover any hidden patterns in the trades themselves – patterns that can connote a range of suspicious behavior. Most surveillance applications are based on a search for data that breaches thresholds and seek to match sell & buy orders. The idea is that when these rules are breached, alerts are generated for compliance officers to conduct further investigation. However, this method falls short with complex types of market abuse. A range of supervised learning techniques can then be applied to the data, such as creating a behavioral profile of different kinds of traders (for instance junior and senior) by classifying & then scoring them based on their likelihood to commit fraud. Thus a range of surveillance analytics can be performed on the data. Apache Spark is highly recommended for near realtime processing, not only due to its high performance characteristics but also due to its native support for graph analytics and machine learning – both of which are critical to surveillance reporting. For a deeper look at data science, I recommend the below post.

      http://www.vamsitalkstech.com/?p=1846

    5. The other important value driver in deploying data science is to perform Advanced Transaction Monitoring Intelligence. The core idea is to get years' worth of trade data into one location (i.e. the data lake) & then apply unsupervised learning to glean patterns in those transactions (a minimal clustering sketch appears after this list). The goal is then to identify profiles of actors with the intent of feeding them into existing downstream surveillance & TM systems.
    6. This knowledge can then be used to constantly learn transaction behavior for similar traders. This can be a very important capability in detecting fraud across traders, customer accounts and instruments. Some of the use cases are –
      • Profile trading activity of individuals with similar traits (types of customers, trading desks & instruments, geographical areas of operations etc.) to perform Know Your Trader
      • Segment traders by similar experience levels and behavior
      • Understand common fraudulent behavior typologies (e.g. spoofing) and cluster such (malicious) trading activities by trader, instrument, volume etc., with the goal of raising cases in the appropriate downstream investigation case management system
      • Using advanced data processing techniques like Natural Language Processing, constantly analyze electronic communications and join them up with trade data sources to both detect under-the-radar activity and keep the false positive rate low.
    7. Graph Database – Given that most kinds of trading fraud happen in groups of actors – traders acting in collusion with verification & compliance staff – the ability to view complex relationships of interactions and the strength of those interactions can be a significant monitoring capability.
    8. Grid Layer – To improve performance, I propose the usage of a distributed in-memory data fabric like JBoss Data Grid or Pivotal GemFire. This can aid in two ways –

      a. Help with fast lookup of data elements by the visualization layer
      b. Help perform fast computation process by overlaying a framework like Spark or MapReduce directly onto a stateful data fabric.

      The choice of tools here is dependent on the language choices that have been made in building the pricing and risk analytic libraries across the bank. If multiple language bindings are required (e.g. C# & Java) then the data fabric will typically be a different product than the Grid.
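
Here is the minimal clustering sketch referenced in step 5 above – KMeans in Spark MLlib over a handful of per-trader behavioral features to surface outlier profiles. The feature names and the dataset path are hypothetical; real feature engineering for trader surveillance would be far richer.

```python
# Hedged sketch: KMeans over illustrative per-trader behavioral features to
# surface outlier trader profiles. Feature names and the path are hypothetical.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler, StandardScaler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("trader-behavior-clustering").getOrCreate()

# One row per trader: e.g. trade_count, avg_notional, cancel_ratio, after_hours_ratio
features = spark.read.parquet("/data/lake/trader_features")

assembler = VectorAssembler(
    inputCols=["trade_count", "avg_notional", "cancel_ratio", "after_hours_ratio"],
    outputCol="raw_features")
scaler = StandardScaler(inputCol="raw_features", outputCol="features")
kmeans = KMeans(k=8, seed=42, featuresCol="features")

assembled = assembler.transform(features)
scaled = scaler.fit(assembled).transform(assembled)
model = kmeans.fit(scaled)

# Cluster assignments can feed downstream surveillance & case management systems;
# small or distant clusters are candidates for analyst review.
clustered = model.transform(scaled)
clustered.groupBy("prediction").count().orderBy("prediction").show()
```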

Data Visualization…

The visualization solution chosen should enable the quick creation of interactive dashboards that provide KPIs and other important business metrics from a process monitoring standpoint. Various levels of dashboards need to be created, ranging from compliance officer toolboxes to executive dashboards, to help identify trends and discover valuable insights.

Compliance Officer Toolbox (Courtesy: Arcadia Data)

Additionally, the visualization layer shall provide

a) A single view of Trader or Trade or Instrument or Entity

b) Investigative workbench with Case Management capability

c) The ability to follow the lifecycle of a trade

d) The ability to perform ad hoc queries over multiple attributes

e) Activity correlation across historical and current data sets

f) Alerting on specific metrics and KPIs

To Sum Up…

The solution architecture described in this blogpost is designed with peaceful enterprise co-existence in mind, in the sense that it interacts and integrates with a range of BORT systems and other enterprise systems – ERP, CRM, legacy surveillance systems and any other line of business solutions that typically exist as shared enterprise resources.

The Industrial Internet Comes of Age..

In 2017, the chief strategic concerns for global product manufacturers are manifold. These range from their ability to drive growth in new markets by creating products that younger customers need, to cutting costs via efficient high volume manufacturing spanning global supply chains & effective distribution and service. While the traditional lifecycle has always been a huge management challenge, the question now is how digital technology can help create new markets and drive higher margins in established areas. In this blogpost, we will consider how IIoT (Industrial Internet of Things) technology can do all of the above and foster new business models by driving customer value on top of the core product.

An Industry in Digital Dilemma

The last decade has seen tectonic changes in leading manufacturing economies. Along with a severe recession, employment in the industry has moved along the technology curve to a more skilled workforce. The services component of the industry is also steadily increasing, i.e. manufacturing now consumes business services and is also presented as a service in certain sectors. The point is well made that this industry is not monolithic and that there are distinct sectors with their own specific drivers for business success [1].

           The diverse sectors within Global Manufacturing (McKinsey [1])

Global manufacturing operations have evolved differently across each of the above five sectors (e.g. global innovators for local markets, regional processing, energy intensive commodities). Each of the sectors has different geographical locations where production takes place, diverse supply chains, support models, efficiency requirements, technological focus areas and competitive forces.

However the trend that is broadly applicable to all of them is the “Industrial Internet”.

The Industrial Internet of Things (IIoT) can be defined as an ecosystem of capabilities that interconnects machines, personnel and processes to optimize the manufacturing lifecycle. The foundational technologies that IIoT leverages are smart assets, Big Data, realtime analytics, enterprise automation and cloud based services.

Globally integrated manufacturers must constantly assess and fine-tune their strategy across the eight lifecycle stages listed below. A key aspect is being able to collect data throughout the process to derive real-time insights from the lifecycle, suppliers and customers. IoT technologies allied with Big Data techniques provide ways to store this data and to derive real-time & historical analytic insights. Thus the manufacturing industry is moving to an entirely virtual world across its lifecycle, ranging from product development and customer demand monitoring to production and inventory management. This trend is being termed Industry 4.0 or Connected Manufacturing. As devices & systems become more interactive and intelligent, the data they send out can be used to optimize the lifecycle across the value chain, thus driving higher utilization of plant capacity and improved operational efficiencies.

Let us consider the impact of the IIOT across the above sectors.

IIOT moves the Manufacturing Industries from Asset Centric to Data Centric

The Generic Product Manufacturing Lifecycle Overview as depicted in the above illustration covers the most important activities that take place in the manufacturing process. Please note that this is a high level overview and in future posts we will expand upon each stage accordingly.

The overall lifecycle can be broken down into the following eight steps:

  1. Globally Integrated Product Design
  2. Prototyping and Pre-Production
  3. Mass production
  4. Sales and Marketing
  5. Product Distribution
  6. Activation and Support
  7. Value Added Services
  8. Resale and Retirement

How IIoT drives Product Design and Innovation

IoT technology can have a profound impact on the above traditional lifecycle in the following ways –

  1. The ability to connect the different aspects of the value chain that hitherto have been disconnected. This will fundamentally transform the asset lifecycle, leading to higher manufacturing efficiencies, reduced wastage and more customer centric manufacturing (thus reducing recall rates)
  2. The ability to manage and integrate diverse data – from sensors, machine data from operational systems, supplier channels & social media feedback – to drive real time insights
  3. The connected asset lifecycle also leads to better inventory management and drives optimal resupply decisions
  4. Create new business models that leverage data across the lifecycle to enable better product usage, pay for performance or outcome based services, or even a subscription based usage model
  5. The ability to track real time insights across the customer base, leading to a more optimized asset lifecycle
  6. Reducing costs by allowing more operations – ranging from product maintenance to product demos and customer experience sessions – to occur remotely

Manufacturers have been connecting the value chain together for many years now. M2M (machine to machine) implementations have already led to rounds of improvements in the so-called 'ilities' metrics – productivity, quality, reliability etc. The real opportunity with IIoT is being able to create new business models that result from the convergence of Operational Technology (OT) with Information Technology (IT). This journey primarily consists of taking a brick and mortar industry and slowly turning it into a data driven industry.

The benefits of adopting the IIoT include improved quality owing to better aligned, efficient and data driven processes; higher operational efficiency overall; products better aligned with changing customer requirements; and tighter communication across interconnected products and supplier networks.

The next post in this series will delve deeper into the Manufacturing Servitization phenomenon.

References

  1. McKinsey & Company  – Global Manufacturing Outlook  – http://www.mckinsey.com/business-functions/operations/our-insights/the-future-of-manufacturing

How Big Data & Advanced Analytics can help Real Estate Investment Trusts (REITS)

                                                         Image Credit – Kiplinger’s

Introduction…

Real Estate Investment Trusts (REITS) are financial companies that own various forms of commercial and residential real estate. These assets include office buildings, retail shopping centers, hospitals, warehouses, timberland and hotels. Real estate is growing quite nicely as a component of the global financial business. Given their focus on real estate investments, REITS have always occupied a specialized position in global finance.

Fundamentally, there are three types of REITS –

  1. Equity REITS which exclusively deal in acquiring, improving and selling properties with the aim of higher returns for their investors
  2. Mortgage REITS only buy and sell mortgages
  3. Hybrid REITS which do both #1 and #2 above

REITS have a reasonably straightforward business model – you take the yields from the properties you own and reinvest the funds to be able to pay your investors (REITS are required to distribute at least 90% of taxable income as dividends). Most of the traditional REIT business processes are well handled by conventional technology. However, more and more REITS are being challenged to develop a compelling Big Data strategy that leverages their tremendous data assets.

The Five Key Big Data Applications for REITS… 

Let us consider the five key areas where advanced analytics built on a Big Data foundation can immensely help REITS.

#1 Property Acquisition Modeling 

REIT owners can leverage the rich datasets available around renter demographics, preferences, seasonality and economic conditions in specific markets to better guide capital decisions on acquiring property. This modeling needs to take into account land costs, development costs, fixture costs & any other sales and marketing costs needed to appeal to tenants. I'd like to call this the macro business perspective. From a micro business perspective, being able to better study individual properties using a variety of widely available data – MLS listings for similar properties, foreclosures, closeness to retail establishments, work sites, building profiles, parking spaces, energy footprint etc. – can help them match tenants to their property holdings. All this is critical to getting their investment mix right to meet profitability targets.
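
As a simple illustration of the screening described above, the sketch below ranks candidate properties by yield on cost from expected rent, occupancy and all-in acquisition costs. All figures and field names are hypothetical.

```python
# Minimal, illustrative acquisition screen: yield-on-cost from expected rent,
# occupancy and all-in acquisition/development costs. All figures hypothetical.
candidates = [
    {"id": "PROP-A", "annual_rent": 2_400_000, "occupancy": 0.93, "opex": 650_000,
     "land_cost": 9_000_000, "development_cost": 14_000_000, "fixture_and_marketing": 1_200_000},
    {"id": "PROP-B", "annual_rent": 1_100_000, "occupancy": 0.88, "opex": 300_000,
     "land_cost": 4_500_000, "development_cost": 6_000_000, "fixture_and_marketing": 500_000},
]

def yield_on_cost(p):
    noi = p["annual_rent"] * p["occupancy"] - p["opex"]          # net operating income
    total_cost = p["land_cost"] + p["development_cost"] + p["fixture_and_marketing"]
    return noi / total_cost

# Rank candidates by yield on cost, highest first
for p in sorted(candidates, key=yield_on_cost, reverse=True):
    print(f'{p["id"]}: yield on cost = {yield_on_cost(p):.2%}')
```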

                                  Click on the Image for a blogpost discussing Predictive Analytics in Capital Markets

#2 Portfolio Modeling 

REITS can leverage Big Data to perform more granular modeling of their MBS portfolios. As an example, they can feed in a lot more data into their existing models as discussed above. E.g.  Demographic data, macroeconomic factors et al.  

A simple scenario would be: if interest rates go up by X basis points, what does that mean for my portfolio exposure, default rate, cost picture, optimal times to buy certain MBS's etc.? REITS can then use that information to enter hedges etc. to protect against any downside. Big Data can also help with a range of predictive modeling across all of the above areas as discussed below. An example is to build a 360 degree view of a given investment portfolio.
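
The sketch below illustrates one very simplified way such a rate-shock scenario could be run: a first-order, duration-based price impact plus an assumed stressed default rate per position. Durations, shock sizes and loss assumptions are purely illustrative.

```python
# Hedged sketch of a rate-shock scenario on an MBS portfolio: duration-based
# price impact plus an assumed stressed default rate. All figures illustrative.
portfolio = [
    {"cusip": "MBS-1", "market_value": 50_000_000, "duration": 5.2, "base_default_rate": 0.010},
    {"cusip": "MBS-2", "market_value": 30_000_000, "duration": 3.1, "base_default_rate": 0.015},
]

def scenario_impact(position, rate_shock_bps, default_multiplier):
    dy = rate_shock_bps / 10_000.0
    price_change = -position["duration"] * dy * position["market_value"]   # dP ~ -D * dy * P
    stressed_defaults = position["base_default_rate"] * default_multiplier
    credit_loss = stressed_defaults * position["market_value"] * 0.4       # assumed 40% loss severity
    return price_change, credit_loss

for shock in (100, 200, 300):   # +100/200/300 bps scenarios
    impact = 0.0
    for p in portfolio:
        price_change, credit_loss = scenario_impact(p, shock, default_multiplier=2.0)
        impact += price_change - credit_loss
    print(f"+{shock} bps: estimated portfolio P&L impact = {impact:,.0f}")
```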

                                                         Click on Image for a Customer 360 discussion 

#3 Risk Data Aggregation & Calculations 

The instruments underlying the portfolios themselves carry large amounts of credit & interest rate risk. Big Data is a fantastic platform for aggregating and calculating many kinds of risk exposures, as the below link discusses in detail.

  

                                            Click on Image for a discussion of Risk Data Aggregation and Measurement 

 

#4 Detect and Prevent Money Laundering (AML)

Due to the global nature of investment funds flowing into real estate, REITS are highly exposed to money laundering and sanctions risks. Whether REITS operate in high risk geographies (India, China, South America, Russia etc.) or have complex holding structures, they need to file SARs (Suspicious Activity Reports) with FinCEN. There has always been a strong case to be made that shady foreign entities and individuals were laundering ill gotten proceeds to buy US real estate. In early 2016, FinCEN began implementing Geographic Targeting Orders (GTOs). Title companies based in the United States are now required to clearly identify the real owners of limited liability companies (LLCs), partnerships and other legal entities being used to purchase high end residential real estate with cash.

AML as a topic is covered exhaustively in the below series of blogposts (please click on image to open the first one).

                                                         Click on Image for a Deepdive on AML

#5 Smart Cities, Net New Investments and Property Management

In the future, REITS will want to invest in Smart Cities, which are positioned to be leading urban centers offering mobility, green technology, personalized medicine, safe services, clean water, traffic management and other forward looking urban amenities. These Smart Cities target a new kind of client – upwardly mobile, technologically savvy, environmentally conscious millennials. According to RBC Capital Markets, Smart Cities present a massive investment opportunity for REITS. Such investments could offer REITS income yields of around 10-20%. (Source – Ben Forster @ Schroders)

Smart Cities will be created using a number of high end technologies such as IoT, AI, Virtual Reality, Device Meshes etc. By 2020, it is estimated that these buildings will be generating an enormous amount of data that needs to be stored and analyzed by landlords.

As the below graphic from Cisco attests, the ability to work with IoT data to analyze a range of these micro investment opportunities is a Big Data challenge.

The ongoing maintenance and continuous refurbishment of rental properties is a large portion of the business operation of a REIT. The availability of smart sensors and such IoT devices that can track air quality, home appliance malfunction etc can help greatly with preventive maintenance.

Conclusion..

As can be seen from some of the above business areas, most REITS data needs require a holistic approach across the value chain (capital sourcing, investment decisions, portfolio management & operations). This approach spans various horizontal functions like Customer Segmentation, Property Acquisition, Risk, Finance and Business Operations.
The need of the hour for larger REITS is to move to a common model for data storage, model building and testing.  It is becoming increasingly obvious that Big Data can provide massive business opportunities for REITS.

Why the Internet of Things (IoT) is about Data Driven Ecosystems (& not really about the Devices)..

"The Internet of Things (IoT) will have a great impact on the economy by transforming many enterprises into digital business and facilitating new business models, improving efficiency, and generating new forms of revenue. However, the ways in which enterprises can actualize any benefits will be diverse and, in some cases, painful" – Jim Tully, vice president and distinguished analyst at Gartner, 2015.

The IoT is one of the most hyped paradigms floating around at the moment. However, the hype is not all unjustified. Analyst projections have about 25 billion devices connected to the internet by 2020, delivering cumulative business value of $2 trillion [1] across many industry verticals. Enterprise IT needs to begin developing capabilities to harness this information to serve end customers. This blogpost discusses foundational IoT business elements that are common across industries.

                                                         Image Credit – ThinkStock

The Immense Market Opportunity around IoT 

The IoT has rapidly become one of the most familiar — and perhaps, most hyped — expressions across business and technology. That hype, however, is entirely justified and is backed up by the numbers as one can glean from the below graphic. The estimated business value of this still nascent market is expected to be around $10 trillion plus by 2022.

                                                         Credit – Tamara Franklin (Oracle Research)

Thinking around IoT has long been dominated by passive devices such as Industrial Sensors, RFID tags and Actuators. As pointed out in my “Gartner’s Trends for 2017” article, these devices are beginning to form a smart mesh. Field devices now have increased ‘smart’ capabilities to communicate with each other and with the internet – typically using an IP protocol -resulting in the combined intelligence of groups of such ‘things’.  The IoT now enables not just machine to machine communication but also the human to machine and human to IoT ecosystems. While the media plays up stories of IoT aware devices such as Google Nest or Amazon Echo etc – it is also shaking up vertical industries.

Virtually every industry out there has a significant number of connected devices already deployed. This includes Retail, Energy & Utilities, Manufacturing, Healthcare, Transportation and Financial Services. Having said that, let us consider the six key industrial uses for the IoT space that will yield tremendous business value over the short to medium term – the next 2-5 years.

The Six Key Industry Applications of IoT 

Consider the above graphic (courtesy BCG): the real business value in IoT lies in analytics and the applications built on those analytics. In fact, BCG expects that by 2020 these higher order layers will have captured 60% of the growth in the IoT market [2]. In such a scenario, the rest of the technology elements – connected things, cloud platforms & data architectures – merely enable the upper two layers in delivering business value.

Let us then consider the key industrial use cases for IoT –
  1. Retailers implementing IoT are working to ensure that their customers gain a seamless experience while browsing products in the store. For example, the industry has begun adopting smart shelves that restock themselves, installing beacons in stores that communicate with shopping apps on consumers' smartphones, and using NFC (Near Field Communications) to enable customers to make contact-less payments. Internal operations such as supply chains are also benefiting in a big way from this realtime insight.
  2. In the area of commercial real estate, facilities management is an area where companies spend massive amounts of money on energy consumption. According to Deon Newman at IBM Watson [3], global conglomerates like Siemens own hundreds of thousands of buildings which produce enormous volumes of emissions. Here, IoT analytics is being leveraged to reduce this huge carbon footprint.
  3. In the Utilities Industry – as Smart Meters have proliferated in the industry, IoT is driving use-cases ranging from Predictive Maintenance of equipment to optimizing Grid usage. For instance, in water utilities, smart sensors track everything from quality to pressure to usage patterns. Utilities are creating software platforms that provide analytics on usage patterns and forecast demand spikes in the grid .
  4. The Manufacturing industry is moving to an entirely virtual world across its lifecycle, ranging from product development, customer demand monitoring to production to inventory management. This trend is being termed as Industry 4.0 or Connected Manufacturing. As devices & systems become more interactive and intelligent, the data they send out can be used to optimize the lifecycle across the value chain thus driving higher utilization of plant capacity and improved operational efficiencies.
  5. The biggest trend in the Transportation industry is undoubtedly self driving connected cars & buses. The Connected Vehicle concept enables a car or a truck to behave as a giant smart app – sending out data and receiving inputs to optimize its functions. With every passing year, car makers are adding more and more smart features, so vehicles have more automatic capabilities built in – ranging from navigation and requesting roadside assistance to self parking. Applications are being built which will enable these devices to be tracked on the digital mesh, enabling easy inter-vehicle communication for traffic management, pollution reduction and public safety.
  6. With Smart Cities, governments across the globe are increasingly focused on traffic management, pollution management, public services etc. – all with a view to improving quality of life for their citizens. All of these ecosystems will be adopting IoT technology in the days and years to come.

Conclusion..

It can be seen from the above that the applications are myriad. Thus, while one cannot recommend a generic IT approach to IoT that is applicable to every industry, familiar themes do emerge from a core IT capability standpoint.

The next post will consider the five key & common technology capabilities that Enterprise CIOs need to ensure that their organizations begin to develop to win in the IoT era.

References

[1] Gartner July 2015 – “The Internet of Things is a Revolution waiting to happen” – http://www.gartner.com/smarterwithgartner/the-internet-of-things-is-a-revolution-waiting-to-happen/

[2] BCG Analysis – “Winning in the IoT is about business processes” – https://www.bcgperspectives.com/content/articles/hardware-software-energy-environment-winning-in-iot-all-about-winning-processes/

[3] “Cognitive Computing and the future of smart buildings” – Deon Newman, IBM Watson IoT

https://www.ibm.com/blogs/internet-of-things/cognitive-computing-future-smart-buildings/

A POV on Bank Stress Testing – CCAR & DFAST..

"The recession of 2007 to 2009 was still the most painful since the Depression. At its depths, $15 trillion in household wealth had disappeared, ravaging the pensions and college funds of Americans who had thought their money was in good hands. Nearly 9 million workers lost jobs; 9 million people slipped below the poverty line; 5 million homeowners lost homes."
― Timothy F. Geithner, Former Secretary of the US Treasury – "Reflections on Financial Crises" (2014)

A Quick Introduction to Macroeconomic Stress Testing..

The concept of stress testing in banking is not entirely new. It has been practiced for years in global banks across specific business functions that deal with risk. The goal of these internal tests has been to assess firm wide capital adequacy in periods of economic stress. However, the 2008 financial crisis clearly exposed how unprepared the Bank Holding Companies (BHCs) were for systemic risk brought on as a result of severe macroeconomic distress. Thus the current raft of regulator driven stress tests is motivated by the taxpayer funded bailouts in 2008. Back then, banks were neither adequately capitalized to cope with stressed economic conditions, nor were their market and credit risk losses across portfolios sustainable.

In 2009, SCAP (Supervisory Capital Assessment Program) was enacted as a stress testing framework in the US that only 19 leading financial institutions (banks, insurers etc.) had to adhere to. The exercise focused not only on the quantity of capital available but also on its quality – Tier 1 common capital – held by the institution. The emphasis on Tier 1 Common Capital is important as it provides an institution with a higher loss-absorption capacity while minimizing losses to higher capital tiers. Tier 1 Common Capital can also be managed better during economic stress by adjusting dividends, share buybacks and related activities.

Though it was a one off, the SCAP was a stringent and rigorous test. The Fed performed SCAP audits on the results of all the 19 BHC’s – some of whom failed the test.

Following this, in 2010 the Dodd-Frank Act was enacted under the Obama Administration. The Dodd-Frank Act also introduced its own stress test – DFAST (Dodd-Frank Act Stress Testing). DFAST requires BHCs with assets of $10 billion & above to run annual stress tests and to make the results public. The goal of these stress tests is multifold, but they are conducted primarily to assure the public and the regulators that BHCs have adequately capitalized their portfolios. BHCs are required to present detailed capital plans to the Fed.

The SCAP’s successor, CCAR (Comprehensive Capital Adequacy Review) was also enacted around that time. Depending on the overall risk profile of the institution, the CCAR mandates several qualitative & quantitative metrics that BHCs need to report on and make public for several stressed macroeconomic scenarios.


Comprehensive Capital Analysis and Review (CCAR) is a regulatory framework introduced by the Federal Reserve in order to assess, regulate, and supervise large banks and financial institutions – collectively referred to in the framework as Bank Holding Companies (BHCs).
– (WikiPedia)

  • Every year, an increasing number of Tier 2 banks come under the CCAR mandate. CCAR basically requires specific BHCs to develop a set of internal macroeconomic scenarios or use those developed by the regulators. Regulators would then get the individual results of these scenario runs from firms across a nine quarter time horizon. Regulators also develop their own systemic stress tests to verify if a given BHC can withstand negative economic scenarios and continue to operate their lending operations. CCAR coverage primarily includes retail banking operations, auto & home lending, trading, counter party credit risk, AFS (Available For Sale)/HTM (Hold To Maturity) securities etc. The CCAR covers all major kinds of risk – market, credit, liquidity and OpsRisk.
CCAR kicked off global moves by regulators to enforce the same of banks in their respective jurisdictions. In Europe, the EBA requires its own stress testing; in the UK, the Prudential Regulation Authority runs a similar regime. Emerging markets such as India and China are also following this trend. Every year, more and more BHCs become subject to CCAR reporting mandates.

Similarities & Differences between CCAR and DFAST..

To restate – the CCAR is an annual exercise by the Federal Reserve to assess whether the largest bank holding companies operating in the United States have sufficient capital to continue operations throughout times of economic and financial stress, and that they have robust, forward-looking capital-planning processes that account for their unique risks. As part of this exercise, the Federal Reserve evaluates institutions' capital adequacy, internal capital adequacy assessment processes, and their individual plans to make capital distributions, such as dividend payments or stock repurchases. Dodd-Frank Act stress testing (DFAST) – an exercise similar to CCAR – is a forward-looking stress test conducted by the Federal Reserve for smaller financial institutions. It is supervised by the Federal Reserve to help assess whether institutions have sufficient capital to absorb losses and support operations during adverse economic conditions.

As part of CCAR reporting guidelines, the BHC’s have to explicitly call out

  1. their sources of capital given their risk profile & breadth of operations,
  2. the internal policies & controls for measuring capital adequacy &
  3. any upcoming business decisions (share buybacks, dividends etc) that may impact their capital adequacy plans.

While both CCAR and DFAST look very similar from a high level – they both mandate that banks conduct stress tests – they do differ in the details. DFAST is applicable to banks that have assets between $10 and $50 billion. During the planning horizon phase, CCAR allows the BHCs to use their own capital action assessments while DFAST enforces a standardized set of capital actions. The DFAST scenarios represent baseline, adverse and severely adverse conditions. DFAST is supervised by the Fed, the OCC (Office of the Comptroller of the Currency) and the FDIC.

                                                Summary of DFAST and CCAR (Source: E&Y) 

As can be seen from the above table, while DFAST is complementary to CCAR, both efforts are distinct testing exercises that rely on similar processes, data, supervisory exercises, and requirements. The Federal Reserve coordinates these processes to reduce duplicative requirements and to minimize regulatory burden. CCAR results are reported twice a year, and BHCs are also required to incorporate Basel III capital ratios in their reports, with Tier 1 capital ratios calculated using existing rules. DFAST is reported annually and also includes Basel III reporting.

In a Nutshell…

In CCAR (and DFAST), the Fed is essentially asking the BHC’s the following questions –

(1) For your defined risk profile, please define a process of understanding and mapping the key stakeholders to carry out this process.

(2) Please ensure that you use clean internal data to compute your exposures in the event of economic stress. The entire process of data sourcing, cleaning, computation, analytics & reporting needs to be auditable.

(3) What macroeconomic stress scenarios did you develop in working with key lines of business ? What are the key historical assumptions in developing these? What are the key what-if scenarios that you have developed based on the stressed scenarios? The scenarios need to be auditable as well.

(4) We are then going to run our own macroeconomic numbers & run our own scenarios using our own exposure generators on your raw data.

(5) We want to see how close both sets of numbers are.

Both CCAR and DFAST scenarios are expressed in stressed macroeconomic factors and financial indicators. The regulators typically provide these figures on a quarterly basis a few reporting periods in advance.

What are some examples of these scenarios?
  • Measures of Index Turbulence – E.g. In a certain quarter, regulators might establish that the S&P 500 would go down 30%; Decrease in key indices like home, commercial property & other asset prices.
  • Measures of  Economic Activity – E.g. US unemployment rate spikes, higher interest rates, increased inflation. What if unemployment ran to 14%? What does that do to my mortgage portfolio – the default rates increase and this is what it looks like.
  • Measures of Interest Rate Turbulence –  E.g. US treasury yields, Interest rates on US mortgages etc.

Based on this information, banks then assess the impact of these economic scenarios as reflected in market and credit losses to their portfolios. This helps them estimate how their capital base would behave in such a situation. These internal CCAR metrics are then sent over to the regulators. Every bank has its own models, based on its own understanding, which the Fed also needs to review for completeness and quality.
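
To make this concrete, here is a deliberately simplified sketch of the kind of internal exercise described above: mapping a stressed unemployment path to portfolio default rates and accumulating expected losses over a nine-quarter horizon. The exposure, PD sensitivity and LGD figures are illustrative assumptions, not supervisory models.

```python
# Simplified illustration only: link a stressed unemployment path to default
# rates and accumulate expected losses over nine quarters. All parameters are
# illustrative assumptions, not supervisory or bank-internal models.
EAD = 4_000_000_000        # exposure at default for a hypothetical mortgage book
LGD = 0.35                 # assumed loss given default
BASE_PD = 0.02             # annualized probability of default at 5% unemployment
PD_SENSITIVITY = 0.004     # assumed PD increase per point of unemployment above 5%

severely_adverse_unemployment = [6.5, 7.8, 9.0, 10.1, 10.0, 9.6, 9.1, 8.5, 8.0]  # nine quarters

total_loss = 0.0
for quarter, unemployment in enumerate(severely_adverse_unemployment, start=1):
    stressed_pd = BASE_PD + PD_SENSITIVITY * max(0.0, unemployment - 5.0)
    quarterly_loss = EAD * (stressed_pd / 4) * LGD      # spread the annualized PD over quarters
    total_loss += quarterly_loss
    print(f"Q{quarter}: unemployment {unemployment:.1f}% -> expected loss {quarterly_loss:,.0f}")

print(f"Cumulative expected loss over the horizon: {total_loss:,.0f}")
```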

The Fed uses the CCAR and DFAST results to evaluate capital adequacy, the quality of the capital adequacy assessment process and then evaluates the BHC’s plans to make capital distributions using dividends & share repurchases etc in the context of the results. The BHC’s boards of directors are required to approve and sign off on these plans.

What do CCAR & DFAST entail of Banks?

Well, six important things as the above illustration captures –

    1. CCAR is fundamentally very different from other umbrella risk types in that it has a strong external component in terms of reporting internal bank data to the regulatory authorities. CCAR reporting is done by sending internal bank Book of Record Transaction (BORT) data from lending systems (with hundreds of manual adjustments) to the regulators for them to run their models to assess capital adequacy. Currently, most banks do some model reporting internally, based on canned CCAR algorithms in tools like SAS/Spark, computed for a few macroeconomic stress scenarios.
    2. Both CCAR and DFAST stress the same business processes, data resources and governance mechanisms. They are both a significant ask on the BHCs from the standpoint of planning, execution and governance. BHCs have found them daunting, and the new D-SIBs that enter the mandate are faced with implementing programs that need significant organizational and IT spend.
    3. Both CCAR and DFAST challenge the banks on data collection, quality, lineage and reporting. The Fed requires that data be accurate, comprehensive and clean. Data quality is the single biggest challenge to stress test compliance. Banks need to work on a range of BORT (Book of Record Transaction) systems – core banking, lending portfolios, position data and any other data needed to accurately reflect the business. There is also a reconciliation process that is typically used to reconcile risk data with the GL (General Ledger). For instance, a BHC's lending portfolio may appear to be $4 billion based on the raw summary data but only around $3 billion once reconciliation and adjustments are performed; if the regulator runs the aforesaid macroeconomic scenarios at $4 billion, the exposures are naturally off. (A simplified sketch of this reconciliation appears after this list.)
    4. Contrary to popular perception, the heavy lifting is typically not in creating and running the exposure calculations for stress testing; creating these is relatively straightforward. Banks have historically had their own analytics groups produce these macroeconomic models, and they already have dozens of libraries in place that can be modified to create the supervisory scenarios for CCAR/DFAST – baseline, adverse & severely adverse. The critical difference with stress testing is that silo-ed models and scenarios need to be unified along with the data.
    5. Model development in banks usually follows a well defined lifecycle. Most liquidity assessment and liquidity groups within banks currently have a good base of quants with a clean separation of job duties. For instance, while one group produces scenarios, others work on the exposures that feed into liquidity engines to calculate liquidity. The teams running these liquidity assessments are good candidates to run the CCAR/DFAST models as well. The calculators themselves will need to be rewritten for Big Data using something like SAS or Spark.
    6. Transparency must be demonstrated down to the source data level. And banks need to be able to document all capital classification and computation rules to a sufficient degree to meet regulatory requirements during the auditing and review process.
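
To make the reconciliation example in point (3) concrete, here is a minimal sketch of reconciling a raw lending-portfolio total against the General Ledger before any scenarios are run; the adjustment categories and amounts are invented for illustration.

# Hypothetical reconciliation of raw BORT lending data against the General Ledger.
# Adjustment categories and amounts are invented for illustration.

raw_portfolio_total = 4_000_000_000          # summed straight from the lending BORT systems
gl_balance = 3_000_000_000                   # the General Ledger figure

adjustments = {
    "duplicate_facilities": -350_000_000,    # the same loan booked in two source systems
    "intercompany_exposures": -400_000_000,  # exposures to affiliates excluded from the GL view
    "charged_off_balances": -250_000_000,    # written-off loans still present in the raw extracts
}

reconciled_total = raw_portfolio_total + sum(adjustments.values())

print(f"Raw total:        ${raw_portfolio_total / 1e9:.2f}B")
print(f"Reconciled total: ${reconciled_total / 1e9:.2f}B")
print(f"Matches GL:       {abs(reconciled_total - gl_balance) < 1_000_000}")
# Running the supervisory scenarios against the unreconciled $4B figure would
# overstate the stressed exposures by roughly a third.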

The Technology Implications of CCAR/DFAST..

It can clearly be seen that regulatory stress testing derives inputs from virtually every banking function. It should come as no surprise, then, that there are several implications from a technology point of view:

    • CCAR and DFAST impact a range of systems, processes and controls. The challenges that most banks have in integrating front-office trading desk data (position data, pricing data and reporting) with back-office risk & finance systems make the job of accurately reporting stress numbers all the more difficult. These gaps force most BHCs to resort to manual data operations, manual analytics and complicated reconciliation processes across the front, middle and back offices.
    • Beyond standardizing computation & reporting libraries, banks need common data storage for the data coming from a range of BORT systems.
    • Banks also need to standardize on data taxonomies across all of these systems.
    • To that end, banks need to stop creating more data silos across the Risk and Finance functions; as I have often advocated in this blog, a move to a Data Lake enabled architecture is an appropriate way of eliminating silos and the unclean data that is sure to invite regulatory sanction.
    • Banks need to focus on data cleanliness by setting appropriate governance and auditability policies.
    • Move to a paradigm of bringing compute to large datasets instead of the other way around.
    • Move towards in-memory analytics to transform, aggregate and analyze data in real time across many dimensions, to obtain an understanding of the bank’s risk profile at any given point in time (a minimal sketch follows below).
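
As one way of realizing the last two points, the sketch below uses PySpark to aggregate exposures across several dimensions in memory; the positions dataset, its path and its column names are hypothetical assumptions, not a reference to any particular bank's schema.

# Sketch of in-memory, multi-dimensional aggregation of risk data using PySpark.
# The dataset path and column names are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stress-exposure-rollup").getOrCreate()

# Positions sourced from the data lake after reconciliation (path and schema assumed).
positions = spark.read.parquet("/datalake/reconciled/positions")

# Aggregate exposure across several dimensions in a single pass, caching the result
# in memory so the same DataFrame can feed multiple downstream stress calculations.
exposure_by_dimension = (
    positions
    .groupBy("legal_entity", "business_line", "asset_class")
    .agg(F.sum("exposure_usd").alias("total_exposure_usd"),
         F.countDistinct("counterparty_id").alias("counterparties"))
    .cache()
)

exposure_by_dimension.show(truncate=False)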

A Reference Architecture for CCAR and DFAST..

I recommend readers review the post below on FRTB architecture, as it contains core architectural and IT themes that are broadly applicable to CCAR and DFAST as well.

A Reference Architecture for the FRTB (Fundamental Review of the Trading Book)

Conclusion..

As can be seen from the above, both CCAR & DFAST require a holistic approach across the value chain (model development, data sourcing, reporting) spanning the Risk, Finance and Treasury functions. Further, regulators are increasingly demanding an automated process for risk & capital calculations under various scenarios, using accurate and consistent data. The need of the hour for BHCs is to move to a common model for data storage, stress modeling and testing. Doing so will ensure that capital adequacy metrics and outputs can be produced accurately and in a timely manner, satisfying the regulatory mandate.

References –

[1] Federal Reserve CCAR Summary Instructions 2016

https://www.federalreserve.gov/newsevents/press/bcreg/bcreg20160128a1.pdf

Why Platform as a Service (PaaS) Adoption will take off in 2017..


Since the time Steve Ballmer went ballistic professing his love for developers, it has been a virtual mantra in the technology industry that developer adoption is key to the success of a given platform. On the face of it, Platform as a Service (PaaS) is a boon to enterprise developers who are tired of the inefficiencies of old school application development environments & stacks. Further, a couple of years ago, PaaS seemed to be the flavor of the future given the focus on Cloud Computing. This blogpost focuses on the advantages of the generic PaaS approach while discussing its slow rate of adoption in the cloud computing market as compared with its cloud cousins – IaaS (Infrastructure as a Service) and SaaS (Software as a Service).

Platform as a Service (PaaS) as the foundation for developing Digital, Cloud Native Applications…

Call them Digital or Cloud Native or Modern. The nature of applications in the industry is slowly changing, as are the cultural underpinnings of the development process itself – from waterfall to agile to DevOps. At the same time, Cloud Computing and Big Data are enabling the creation of smart data applications. Leading business organizations are cognizant of the need to attract and retain the best possible talent – often competing with the FANGs (Facebook, Amazon, Netflix & Google).

Couple all this with the immense industry and venture capital interest around container-oriented & cloud native technologies like Docker, and you have a vendor arms race in the making. The prize is to be chosen as the standard for building industry applications.

Thus, infrastructure is an enabler, but in the end it is the applications that are Queen or King.

That is where PaaS comes in.

Why Digital Disruption is the Cure for the Common Data Center..

Enter Platform as a Service (PaaS)…

Platform as a Service (PaaS) is one of the three main cloud delivery models, the other two being IaaS (infrastructure such as compute, network & storage services) and SaaS (business applications delivered over a cloud). A collection of different cloud technologies, PaaS focuses exclusively on application development & delivery. PaaS advocates a new kind of development based on native support for concepts like agile development, unit testing, continuous integration and automatic scaling, while providing a range of middleware capabilities. Applications developed on a PaaS can be deployed as services & managed across thousands of application instances.

In short, PaaS is the ideal platform for creating & hosting digital applications. What can PaaS provide that older application development toolchains and paradigms cannot?

While the overall design approach and features vary from one PaaS vendor to another, there are five generic advantages at a high level –

  1. PaaS enables a range of application, data & middleware components to be delivered as API-based services to developers on any given Infrastructure as a Service (IaaS). These capabilities include messaging as a service, database as a service, mobile capabilities as a service, integration as a service, workflow as a service, analytics as a service for data driven applications etc. Some PaaS vendors also provide the ability to automate & manage APIs for business applications deployed on them – API Management. (A minimal sketch of an application consuming such platform-provided services appears after this list.)
  2. PaaS provides easy & agile access to the entire suite of technologies used while creating complex business applications. These range from programming languages to application server (and lightweight) runtimes to CI/CD toolchains to source control repositories.
  3. PaaS provides services that enable a seamless & highly automated approach to the complete lifecycle of building and delivering web applications and services on the internet. Industry players are infusing software delivery processes with practices such as continuous integration (CI) and continuous delivery (CD). For large scale applications such as those built in web scale shops, financial services, manufacturing, telecom etc, PaaS abstracts away the complexities of building, deploying & orchestrating infrastructure, thus enabling instantaneous developer productivity. This is a key point – with its focus on automation, PaaS can save application and system administrators precious time and resources in managing the lifecycle of elastic applications.
  4. PaaS enables your application to be cloud agnostic, so it can run on any cloud platform, whether public or private. This means that a PaaS application developed on Amazon AWS can be ported with relative ease to Microsoft Azure, VMware vSphere, Red Hat RHEV etc.
  5. PaaS can help smoothen organizational culture and barriers – The adoption of a PaaS forces an agile culture in your organization, one that pushes cross pollination among different business, dev and ops teams. Most organizations that are just now beginning to go bimodal for greenfield applications can benefit immensely from choosing a PaaS as a platform standard.
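
To ground the first advantage in something tangible, the toy service below follows the common PaaS convention of consuming platform-provided backing services purely through environment variables injected at deploy time; the variable names, defaults and endpoint are my own illustrative choices, not any particular vendor's API.

# Minimal sketch of a service written to run unchanged on any PaaS that injects its
# configuration through environment variables. Variable names and defaults are
# illustrative assumptions, not a specific platform's conventions.

import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Backing services (database, message broker) are bound at deploy time by the platform.
DATABASE_URL = os.environ.get("DATABASE_URL", "postgres://localhost:5432/dev")
BROKER_URL = os.environ.get("MESSAGE_BROKER_URL", "amqp://localhost:5672")
PORT = int(os.environ.get("PORT", "8080"))  # most platforms tell the app which port to bind


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A health endpoint the platform can probe when scaling instances up or down.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", PORT), HealthHandler).serve_forever()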

The Barriers to PaaS Adoption Will Continue to Fall In 2017..

In general, PaaS market growth rates do not line up well when compared with the other broad sections of the cloud computing space, namely IaaS (Infrastructure as a Service) and SaaS (Software as a Service). 451 Research’s Market Monitor forecasts that the total market for cloud computing (including PaaS, IaaS and infrastructure software as a service – ITSM, backup, archiving) will hit $21.9B in 2016, more than doubling to $44.2B by 2020. Of that, some analyst estimates contend that PaaS will be a relatively small $8.1 billion.

(Chart: 451 Research forecast of PaaS vs. SaaS and IaaS market growth. Source – 451 Research)

Ironically, the very qualities that give PaaS its advantages have also contributed to its relatively low rate of adoption as compared to IaaS and SaaS.

The reasons for this anemic rate of adoption include, in my opinion  –  

  1. Poor Conception of the Business Value of PaaS – This is the biggest factor holding back explosive growth in this category. PaaS is a tremendously complicated technology, & vendors have not helped by stressing the complex technology underpinnings (containers, supported programming languages, developer workflows, orchestration, scheduling etc.) as opposed to helping clients understand the tangible business drivers & value that enterprise CIOs can derive from this technology. Common drivers include faster time to market for digital capabilities, man hours saved in maintaining complex applications, the ability to attract new talent etc. These factors will vary for every customer, but it is up to frontline sales teams to deliver this message in a manner that is appropriate to the client.
  2. Yes, you can do DevOps without PaaS, but PaaS helps a great deal – Many Fortune 500 organizations are drawing up DevOps strategies that do not include a PaaS & are based on a simplified CI/CD pipeline. This is to the detriment of both the customer organization & the industry, as a PaaS can vastly simplify a range of complex runtime & lifecycle services that would otherwise need to be cobbled together by the customer as the application moves from development to production. There is simply a lack of knowledge in the customer community about where a PaaS fits in a development & deployment toolchain.
  3. Smorgasbord of Complex Infrastructure Choices – The average leading PaaS includes a range of open source technologies, from containers to runtimes to datacenter orchestration to scheduling to cluster management tools. This makes it very complex from the perspective of corporate IT – not just in terms of running POCs and initial deployments but also in managing a highly complex stack. It is incumbent on the open source projects to abstract away the complex inner workings to drive adoption – whether by design or by technology alliances.
  4. You don’t need Cloud for PaaS, but not enough Technology Leaders get that – This one is about perception. The presence of an infrastructure cloud computing strategy is not a necessary precondition for adopting a PaaS.
  5. The false notion that PaaS is only fit for massively scalable, greenfield applications – Industry leading PaaS offerings (like Red Hat’s OpenShift) support a range of technology approaches that can help cut technical debt. They do not limit deployment to any one stack: applications built on an application server platform such as JBoss EAP, WebSphere or WebLogic, or on a lightweight framework like Spring, can all be hosted.
  6. PaaS will help increase automation, thus cutting costs – For developers of applications in greenfield/new age spheres such as IoT, PaaS can enable the creation of thousands of instances in a “serverless” fashion. PaaS-based applications can be composed of microservices that are essentially self maintaining – i.e. self healing and self scaling up or down; these microservices are typically delivered by IT as Docker containers using automated toolchains. The biggest cost in large datacenters – human involvement – is drastically reduced when a PaaS is used, while agility, business responsiveness and efficiency increase.

Conclusion…

My goal for this post was to share a few of my thoughts on the benefits of adopting a game changing technology. Done right, PaaS can provide a tremendous boost to building digital applications, thus boosting the bottom line. Beginning in 2017, we will witness PaaS satisfying critical industry use cases as leading organizations build end-to-end business solutions that cover many architectural layers.

References…

[1] http://www.forbes.com/sites/louiscolumbus/2016/03/13/roundup-of-cloud-computing-forecasts-and-market-estimates-2016/#3d75915274b0