Why Banks, Payment Providers and Insurers Should Digitize Their Risk Management..

“When models turn on, brains turn off.” – Dr. Til Schuermann, formerly Research Officer in the Banking Studies function at the Federal Reserve Bank of New York, currently Partner at Oliver Wyman & Company.

There are two primary reasons for enterprises such as Banks, Insurers, Payment Providers and FinTechs to pursue best-in-class Risk Management processes and platforms. The first is compliance, driven by various regulatory reporting mandates such as the Basel Reporting Requirements, the FRTB, the Dodd-Frank Act, Solvency II, CCAR and CAT/MiFID II in the United States & the EU. The second is the need to drive top-line sales growth by leveraging Digital technology. This post advocates the application of Digital technology to Risk Management across both areas.

Image Credit – Digital Enterprise

Recapping the Goals of Regulatory Reform..

There are many kinds of Risk – ranging from the three keystone kinds (Credit, Market and Operational) to those covered by the Basel II.5/III accords, FRTB, Dodd-Frank etc. The best enterprises not only manage Risk well but also turn it into a source of competitive advantage. Leading banks have recognized this: according to McKinsey forecasts, while risk-operational processes such as credit administration today account for some 50 percent of the Risk function’s staff, and analytics just 15 percent, by 2025 those figures will be around 25 percent and 40 percent respectively. [1]

Whatever the kind of Risk, certain themes are common from a regulatory intention standpoint –

  1. Limiting risks that may cause wider harm to the economy, e.g. by preventing banks with retail operations from engaging in proprietary trading
  2. Requiring that banks increase the amount and quality of capital held in reserve to back their assets, and that they hold higher liquidity positions
  3. Ensuring that banks put in place appropriate governance standards, so that boards and management interact not just internally but also with regulators and their clients
  4. Upgrading governance standards to enable a fundamental change in bank governance and in the way boards interact with both management and regulators – ambitions expressed in various new post-crisis rules and approaches
  5. Tackling the “too big to fail” challenge for highly complex businesses spanning multiple geographies, product lines and multifaceted customer segments; accurate risk reporting ensures adequate capital conservation buffers

Beyond the standard models used for Risk regulatory reporting, Banks & FinTechs are pushing risk modeling into new areas such as retail and SME lending. Since the crisis of 2008, new entrants have begun offering alternatives to traditional financial services in areas such as payments, mortgage loans, cryptocurrency, crowdfunding, alternative lending, and investment management. The innovative use of Risk analytics lies at the core of the FinTechs’ success.

Across these segments, risk models are being leveraged in diverse areas such as marketing analytics – to gain customers, defend against competition etc. For instance, real-time analytic tools are also being used to improve credit granting processes. The intention is to gain increased acceptance by pre-approving qualified customers quickly, without the manual intervention that can cause weeks of delay. Again according to McKinsey, leading Banks aim to approve up to 90 percent of consumer loans in seconds and to generate efficiencies of 50 percent, leading to revenue increases of 5 to 10 percent. Thus, leading institutions are using Risk Analytics to rethink their business models and to expand their product portfolios. [2]

Over the last two years, this blog has extensively covered areas such as cyber security, fraud detection and anti-money laundering (AML) from a data analytics standpoint. The industry has treated Risk as yet another defensive function, but over the next 10 years the Risk function is expected to become an integral part of all of the above areas – driving business revenue growth as well as detecting financial fraud and crime. There is no doubt that Risk is a true cross-cutting concern across a range of business functions, not just the traditional Credit, Market, Liquidity and Operational silos. Risk strategy needs to be a priority at the highest levels of an organization.

The Challenges with Current Industry Risk Architectures..

Almost a year ago, we discussed these technology issues in the blogpost below. To recap – most industry players have a mishmash of organically developed & shrink-wrapped IT systems. These platforms run critical applications ranging from Core Banking to Trade Lifecycle to Securities Settlement to Financial Reporting. Each of these systems operates in its own application, workflow and data silo, with its own view of the enterprise. These are all kept in sync largely via data replication & stovepiped process integration. Siloed risk functions further ensure that different risk reporting applications are developed using duplicative technology paradigms, causing massive IT spend. Finally, the preponderance of complex vendor-supplied systems leads to lengthy release cycles and complex data center deployment requirements.

The Five Deadly Sins of Financial Services IT..

Industry Risk Architectures Suffer From Five Limitations

 A Roadmap for Digitization of Risk Architectures..

The end state – what a Digital Risk function will look like – will vary for every institution embarking on this journey, but we can still point out a few foundational guideposts.

#1 Automate Back & Mid Office Processes Across Risk and Compliance  –

As discussed, many business processes across the front, mid and back office involve risk management. These range from risk data aggregation, customer onboarding, loan approvals and regulatory compliance (AML, KYC, CRS & FATCA) to enterprise financial reporting & cyber security. It is critical to move any and all manual steps in these business functions to a highly automated model. Doing so will not only reduce operational costs in a huge way but also demonstrate substantial auditability to regulatory authorities.
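As a simple illustration of what automating such a manual step might look like, the sketch below runs a set of onboarding compliance checks and records every outcome in an audit trail that could be shown to a regulator. The rule names and data model are purely hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Applicant:
    name: str
    country: str
    on_sanctions_list: bool
    documents_verified: bool

@dataclass
class Decision:
    approved: bool
    audit_trail: list = field(default_factory=list)

def run_kyc_checks(applicant, high_risk_countries):
    """Run each KYC rule in turn, recording every outcome for auditability."""
    decision = Decision(approved=True)
    checks = [
        ("sanctions_screen", not applicant.on_sanctions_list),
        ("country_risk", applicant.country not in high_risk_countries),
        ("document_verification", applicant.documents_verified),
    ]
    for name, passed in checks:
        decision.audit_trail.append((name, "pass" if passed else "fail"))
        if not passed:
            decision.approved = False
    return decision

d = run_kyc_checks(Applicant("Acme Ltd", "DE", False, True), {"XX"})
# d.approved is True, and every check is logged in d.audit_trail
```

The point of the sketch is that the audit trail is produced as a by-product of the automation itself, rather than being assembled manually after the fact.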

#2 Design Risk Architectures to handle Real time Data Feeds –

A critical component of Digital Risk is the need to incorporate real-time data feeds across Risk applications. While Risk algorithms have traditionally dealt with historical data, new regulations such as FRTB explicitly call for analysis across various time horizons, implying that Banks need to run a full spectrum of analytics across many buckets on data seeded from real-time interactions. While the focus has been on the overall quality and auditability of data, the real-time requirement becomes critical as one moves from front office applications such as customer onboarding, loan qualification & pre-approval to key areas such as market, credit and liquidity risk. Why is this critical? We have discussed the need for real-time decision-making insights for business leaders. Understanding risk exposures and performing root cause analysis in real time is a huge business capability for any Digital Enterprise.
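A minimal sketch of the real-time idea: instead of batch-computing exposures overnight, maintain a running per-counterparty exposure as each trade event arrives, so the current figure can be queried at any moment. The event shape and field names here are assumptions for illustration.

```python
from collections import defaultdict

def aggregate_exposures(trade_stream):
    """Maintain a running risk exposure per counterparty as trades stream in,
    yielding a snapshot after every event instead of waiting for end of day."""
    exposure = defaultdict(float)
    for trade in trade_stream:
        exposure[trade["counterparty"]] += trade["notional"]
        yield dict(exposure)  # snapshot after each event

trades = [
    {"counterparty": "BankA", "notional": 5.0},
    {"counterparty": "BankB", "notional": 3.0},
    {"counterparty": "BankA", "notional": -2.0},  # offsetting trade
]
snapshots = list(aggregate_exposures(trades))
# final snapshot: {"BankA": 3.0, "BankB": 3.0}
```

In a production setting the generator would be fed by a message bus rather than a list, but the incremental-aggregation pattern is the same.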

#3 Experiment with Advanced Analytics and Machine Learning 

In response to real-time risk reporting, the analytics themselves will begin to get considerably more complex. This technology complexity is only compounded when multiple teams work on these areas independently, which calls for standardization of the calculations across the firm. It also implies that, from an analytics standpoint, a large number of scenarios must be run on a large volume of data. For Risk to become a truly digital practice, the innovative uses of Data Science seen in areas such as customer segmentation, fraud detection and social graph analysis must all make their way into risk management. Insurance companies and Banks are already deploying self-learning algorithms in applications that deal with credit underwriting, employee surveillance and fraud detection, and Wealth Managers are deploying them in automated investment advisory. Thus, machine learning will support critical risk-influenced areas such as Loan Underwriting, Credit Analytics and a Single View of Risk – all of which will leverage predictive modeling to drive better business decisions across the board.
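To make the idea concrete, here is a heavily simplified sketch of machine-learning-style credit scoring: a logistic function maps borrower features to a default probability that can gate instant approval. The features, weights and threshold are invented for illustration; a real model would be fit on historical loan data and validated before use.

```python
import math

def credit_score(features, weights, bias):
    """Logistic scoring: map borrower features to a default probability."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative weights - in practice these are learned from historical data.
weights = [0.8, -1.2]        # [debt_to_income, years_employed]
bias = -0.5
p_default = credit_score([0.4, 5.0], weights, bias)
approve = p_default < 0.2    # auto-approve in seconds, no manual step
```

The automated threshold decision is what enables the "approve in seconds" experience the text describes, while borderline cases can still be routed to a human underwriter.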

#4 Technology Led Cross Organization Collaboration –

McKinsey predicts [1] that in the coming five to ten years, different regulatory ratios – capital, funding, leverage, total loss-absorbing capacity etc – will drive the composition of the balance sheet needed to support profitability. The risk function will therefore work with the finance and strategy functions to optimize the enterprise balance sheet across various economic scenarios, and then provide executives with strategic choices (increase or shrink a loan portfolio, for example) and their likely regulatory impacts across those scenarios. Leveraging analytical optimization tools, an improvement in return on equity (ROE) of anywhere between 50 and 400 basis points has been forecast.
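A toy sketch of this kind of balance-sheet optimization: enumerate candidate actions, discard those that breach the regulatory capital floor, and pick the highest projected ROE among the rest. The scenario numbers are invented for illustration; real optimizers work over far larger scenario sets and constraint systems.

```python
def best_action(actions, min_capital_ratio=0.08):
    """Pick the balance-sheet action with the highest projected ROE
    among those that keep the capital ratio above the regulatory floor."""
    feasible = [a for a in actions if a["capital_ratio"] >= min_capital_ratio]
    return max(feasible, key=lambda a: a["roe"]) if feasible else None

# Hypothetical scenarios for a loan portfolio decision.
actions = [
    {"name": "grow_loans",   "roe": 0.12, "capital_ratio": 0.075},  # breaches floor
    {"name": "hold_steady",  "roe": 0.10, "capital_ratio": 0.095},
    {"name": "shrink_loans", "roe": 0.09, "capital_ratio": 0.11},
]
choice = best_action(actions)  # -> "hold_steady": best ROE that stays compliant
```

Even this trivial version shows the shape of the executive conversation: the highest-ROE option is infeasible on regulatory grounds, so the optimizer surfaces the best compliant alternative.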

The Value Drivers in Digitization of Risk Architectures..

McKinsey contends that the automation of credit processes and the digitization of the key steps in the credit value chain can yield cost savings of up to 50 percent. The benefits of digitizing credit risk go well beyond even these improvements. Digitization can also protect bank revenue, potentially reducing leakage by 5 to 10 percent. [2]

To give an example, by putting real-time credit decision-making in the front line, banks reduce the risk of losing creditworthy clients to competitors as a result of slow approval processes. Additionally, banks can generate credit leads by integrating into their suite of products new digital offerings from third parties and FinTechs, such as unsecured lending platforms for business. Finally, credit risk costs can be further reduced through the integration of new data sources and the application of advanced-analytics techniques. These improvements generate richer insights for better risk decisions and ensure more effective and forward-looking credit risk monitoring. The use of machine-learning techniques, for example, can help banks improve the predictability of credit early-warning systems by up to 25 percent [2].

The Questions to Ask at the Start of Risk Transformation..

There are three questions every Enterprise needs to ask at the outset of this phase –

  • What customer focused business capabilities can be enabled across the organization by incorporating an understanding of the various kinds of Risk ?
  • What aspects of this Risk transformation can be enabled by digital technology? Where are the current organizational and technology gaps that inhibit innovation?
  • How do we measure ROI and business success across these projects before and after the introduction of digital technology? How do we benchmark ourselves, from a granular process standpoint, against the leaders?


As the above makes clear, traditional legacy approaches to risk data management and reporting do not lend themselves well to managing the business effectively. Even when things are going well, it is very difficult for executives and regulators to get a good handle on how the business is functioning; in the worst of times, the risk function can fail outright as models cease to perform effectively. It is not enough to take an incremental approach to improving current analytics. The need of the hour is to adopt state-of-the-art data management and analytic approaches based on Big Data, Machine Learning and Artificial Intelligence.


Apache Mesos: Cluster Manager for the Software Defined Data Center ..(3/7)

The previous (second) blog in this series (@ http://www.vamsitalkstech.com/?p=4670) discussed the technical challenges of running large-scale Digital Applications on traditional datacenter architectures. In this third post, we will take a deep dive into another important ecosystem platform – Apache Mesos, a project that abstracts away system resources (CPU, memory, network and disk) to present consuming digital applications with one giant cluster from which they can draw capacity – a key requirement of the Software Defined Datacenter (SDDC). The next blogpost will deep dive into Linux Containers & Docker.

Introduction and the need for Apache Mesos..

This blog has from time to time discussed how Digital applications are a diverse blend of several different and broad technology paradigms – Big Data, Intelligent Middleware, Messaging, Business Process Management, Data Science et al.

To that end, almost every Enterprise Datacenter supporting Digital workloads typically hosts clusters of varied applications. Most traditional datacenters have used either physical or virtual machines (VMs) as the primary runtime unit for such applications. These VMs are typically provisioned based on application requirements, and applications are deployed onto them. The VMs are then formed into logical clusters – essentially a series of machines serving a given business application in an n-tier architecture.

As load increases on these servers, more VMs are provisioned into the cluster, and so on. The challenge with this traditional model is that it is fairly static: machines are preallocated to run certain kinds of workloads (databases, webservers, developer servers etc). Digital and Cloud Native applications, by contrast, need to scale dynamically and treat the infrastructure as effectively infinite. These applications present various challenges that call for the Datacenter to be software defined, as we discussed in the last blog below. We will continue our look at the SDDC by considering one of the important projects in this landscape – Apache Mesos.

Why Digital Platforms Need A Software Defined Datacenter..(2/6)

Apache Mesos is a project that was developed at the University of California, Berkeley circa 2009. While it began life as a research project in cluster provisioning and scaling (Apache Spark was originally built as a framework to run on it), Mesos evolved to become a general-purpose centralized cluster manager. The central idea of Mesos is to pool together all the physical resources of the cluster and make them available as a single reservoir of highly available resources for different applications (or frameworks) to consume. Over time, Mesos has come to support complex n-tier application platforms leveraging capabilities such as Hadoop, Middleware, Jenkins, Kafka, Spark and Machine Learning.

As with almost all innovative Cloud & Big Data projects, adoption of Apache Mesos has primarily been in the webscale arena. Prominent users include highly technical engineering shops such as Twitter, Netflix, Airbnb, Uber, eBay, Yelp and Apple. However, there is early-adopter activity and increasing acceptance in the Fortune 100: for instance, Verizon signed on in 2015 to use Mesosphere DC/OS (based on Apache Mesos) for datacenter orchestration.

The Many Definitions of Mesos..

At its simplest, Mesos is an open source cluster manager. What does that mean? Mesos can be described as a cluster manager because it ensures that datacenter hardware resources are managed and advantageously shared among multiple distributed technologies – Big Data, Message-Oriented Middleware, Application Servers, Mobile apps etc. Mesos also enables applications to scale with a high degree of resiliency, without having to bother about the details of the underlying infrastructure.

The resource allocation model followed by Mesos allows a range of constituents – sysadmins, developers & DevOps teams – to request resources (CPU, RAM, Storage) from a shared pool, much as they would from a cloud provider.

Mesos has alternatively been described as a Datacenter Kernel as it provides a single unified view of node resources to software frameworks that wish to consume them via APIs. Mesos performs the role of an Intelligent global level scheduler that can match a massive pool of hardware resources to distributed applications that want to consume these resources. Mesos aggregates all the resources into a large virtual pool using not just virtual machines and containers but primitives such as CPU, I/O and RAM. It also breaks applications into small units that can be assigned across this pool. Mesos also provides APIs in multiple languages to allow applications to be built for it. Apache Spark, the most popular data processing engine, was built originally as a Mesos framework.

It is also called a Data Center Operating System (DCOS) because it performs for the datacenter a role similar to that of an operating system for a single machine. Any application that can run on Linux can run on Mesos.


To illustrate how Mesos works, consider two clusters in a datacenter – Cluster A and Cluster B. Cluster A has 8 nodes, each with 4 CPUs and 64 GB RAM; Cluster B has 5 nodes, each with 4 CPUs and 64 GB RAM. Mesos can combine both clusters into one virtual cluster of 52 CPUs and 832 GB RAM. The advantage of this approach is that cluster usage is greatly improved, because applications share resources much more efficiently.
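The arithmetic of pooling can be sketched in a few lines (the data structure here just illustrates the idea, and is not a Mesos API):

```python
def pooled_capacity(clusters):
    """Combine per-cluster node counts into one virtual pool, as Mesos does."""
    cpus = sum(c["nodes"] * c["cpus_per_node"] for c in clusters)
    ram = sum(c["nodes"] * c["ram_gb_per_node"] for c in clusters)
    return cpus, ram

clusters = [
    {"nodes": 8, "cpus_per_node": 4, "ram_gb_per_node": 64},  # Cluster A
    {"nodes": 5, "cpus_per_node": 4, "ram_gb_per_node": 64},  # Cluster B
]
print(pooled_capacity(clusters))  # (52, 832)
```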

Mesos and Cloud Native Applications..

We discussed the differences between Cloud Native and legacy applications in the previous post (@ http://www.vamsitalkstech.com/?p=4670). Mesos has been most impactful when running stateless Cloud Native applications, as opposed to traditional applications built on a stateful/vertical-scaling paradigm. While the defining features of Cloud Native applications are worthy of a dedicated blogpost, these applications can scale to handle massive & increasing load while tolerating failure without impacting service. They are also intrinsically distributed and typically composed of loosely coupled microservices. Examples include stateless web applications running on a Platform as a Service (PaaS), CI/CD workloads on Jenkins, and NoSQL databases such as HBase, Cassandra, Couchbase and MongoDB. Stateful applications that persist data to disk via an RDBMS are not yet good workloads for Mesos.

When Cloud Native Digital applications are run on Mesos, several of the headaches encountered in running these on legacy datacenters are ameliorated, namely –

  1. Clusters can be dynamically provisioned by Mesos  based on demand spikes
  2. Location independence for microservices
  3. Fault tolerance

As it has matured, Mesos has also begun supporting multi-datacenter deployments, with webscale shops like Uber running Cassandra as a framework across datacenters at scale. In Uber’s case, each datacenter has its own Mesos cluster with independent frameworks that exchange information periodically. The Cassandra database includes a seed node that bootstraps the gossip process for new nodes joining the cluster; a custom seed provider was created to launch Cassandra nodes, which allows new nodes to be rolled out automatically into the Mesos cluster in each datacenter. (Credit – Abhishek Verma – Uber)

Mesos Architecture..

There are three main architectural primitives in Mesos – Master, Slave, Frameworks. The central orchestrator in the Mesos system is called a Master and the worker processes are called Slaves.

As depicted below, the Master process manages the overall cluster and delegates tasks to the slaves based on the resources requested by Frameworks.

The core Mesos process is installed on all nodes, and each node’s personality (Master or Slave) is assigned at runtime. The Slaves run the application workloads requested by the frameworks. This overall setup of Master and Slave daemons makes up a Mesos cluster.

Frameworks, commonly called Mesos applications, are composed of two main components: a scheduler, which registers with the Master to receive resource offers, and executors, which launch workloads or tasks on the slaves. A resource offer is simply a list of a slave’s available capacity – CPU and Memory. The Master receives these offers from the slaves and then provides them to the frameworks. A task can be anything really – a simple script or command, a MapReduce job, or the initialization of a Jetty/Tomcat/JBoss AS instance, etc.

The Mesos executor is a process on the Slave that runs tasks. No matter which isolation module is used, the executor packages up the required resources and runs the task on the slave node. When the task is complete, the containers are destroyed and the Slave’s resources are released back to the Master.

For Master high availability, you can run multiple Masters with only one active at a given time communicating with the slave nodes. When the active Master fails, Apache Zookeeper manages leader election to promote a standby Master, as depicted. Master quorum requires a minimum of 3 nodes, but most production deployments are recommended to have 5 Master nodes. Once a new Master is elected, the frameworks submit their cluster/slave and framework state to it so that the pre-failure state can be reconstructed. Mesos has elaborate recovery processes for frameworks, schedulers and Slave nodes.

Apache Mesos Architecture comprises of Master Nodes, Slave Nodes and Frameworks.

By some measures, Mesos is a very straightforward concept: frameworks need to run tasks, and those tasks are traffic-managed by Masters, which coordinate work on worker machines called Slaves.

From a production deployment standpoint, the following components are required: an odd number of Mesos Masters, as many Slave machines as are needed to run applications, a Zookeeper ensemble for HA configurations, and an optional Docker engine running on each slave.

The Mesos Resource Allocation Process..

Mesos follows a default resource scheduling model known as two-tier scheduling. This model may seem a little convoluted but it is important to keep in mind that it was designed to satisfy the requirements & constraints of many different frameworks without having to know details of each.

The Master’s allocation module receives resource reports from the slaves and forwards resource offers to the framework schedulers. These offers specify not just which resources are available but how much of each to offer. The framework schedulers can accept or reject the Master’s offers based on their current capacity requirements. The allocation module is customizable to the specific requirements of implementing enterprises. The default allocation algorithm, known as Dominant Resource Fairness (DRF), is based on fair sharing of cluster resources among requesting applications: DRF equalizes each framework’s dominant share, i.e. a CPU-hungry application’s share of total CPU is kept equal to a memory-intensive application’s share of total RAM.
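A simplified simulation of DRF (not the actual Mesos allocator code): at each step, the next task goes to the framework whose dominant share is currently lowest. With the classic example of a 9-CPU/18-GB cluster shared by a memory-heavy and a CPU-heavy framework, both end up with equal dominant shares of two-thirds.

```python
def drf_allocate(total, demands, rounds):
    """Dominant Resource Fairness: repeatedly give the next task to the
    framework whose dominant share (the largest fraction of any one
    resource it holds) is currently lowest."""
    alloc = {f: {r: 0.0 for r in total} for f in demands}
    free = dict(total)
    for _ in range(rounds):
        def dom_share(f):
            return max(alloc[f][r] / total[r] for r in total)
        # frameworks whose next task still fits in the free pool
        ready = [f for f in demands
                 if all(free[r] >= demands[f][r] for r in total)]
        if not ready:
            break
        f = min(ready, key=dom_share)
        for r in total:
            alloc[f][r] += demands[f][r]
            free[r] -= demands[f][r]
    return alloc

# 9 CPUs / 18 GB shared by a memory-heavy and a CPU-heavy framework.
total = {"cpu": 9.0, "mem": 18.0}
demands = {"A": {"cpu": 1.0, "mem": 4.0},   # memory-heavy tasks
           "B": {"cpu": 3.0, "mem": 1.0}}   # CPU-heavy tasks
alloc = drf_allocate(total, demands, rounds=10)
# A ends with <3 CPUs, 12 GB>, B with <6 CPUs, 2 GB>:
# dominant shares 12/18 and 6/9 - both two-thirds.
```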

Mesos follows a two level resource allocation policy (Image Credit – Apache Mesos Project Documentation)

To better illustrate the resource allocation method in Mesos, let us discuss the sequence of events in the above figure from the Apache Mesos documentation [1] –

  1. The Slave Node – as depicted, Agent 1 reports 4 CPUs and 4 GB of memory available for allocation to any framework that can use them. It reports this capacity to the Master, whose allocation policy module decides that framework 1 should be offered these resources.
  2. The Master sends a resource offer describing what is available on agent 1 to framework 1.
  3. The framework’s scheduler then replies to the Master with information on two tasks to run on the agent, using <2 CPUs, 1 GB RAM> for the first task and <1 CPU, 2 GB RAM> for the second.
  4. The master sends the tasks to the agent, which allocates appropriate resources to the framework’s executor, which in turn launches the two tasks (depicted with dotted-line borders in the figure). Because 1 CPU and 1 GB of RAM are still unallocated, the allocation module may now offer them to framework 2.
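The sequence above can be sketched as a toy simulation (this is not the Mesos API; the dictionaries merely mimic an offer and the framework's task demands):

```python
def offer_cycle(agent_capacity, task_demands):
    """Simulate one Mesos offer cycle: the agent's capacity is offered to a
    framework, the framework replies with tasks, and whatever the tasks do
    not consume remains available to offer to the next framework."""
    remaining = dict(agent_capacity)
    launched = []
    for task in task_demands:
        if all(remaining[r] >= task[r] for r in task):
            launched.append(task)
            for r in task:
                remaining[r] -= task[r]
    return launched, remaining

agent1 = {"cpu": 4, "mem_gb": 4}
tasks = [{"cpu": 2, "mem_gb": 1}, {"cpu": 1, "mem_gb": 2}]
launched, leftover = offer_cycle(agent1, tasks)
# leftover {"cpu": 1, "mem_gb": 1} can now be offered to framework 2
```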

Mesos integration with other SDDC components – Linux Containers, Docker, OpenStack, Kubernetes etc

The Mesosphere stack (Credit – Alexander Rukletsov)

As with the other platforms discussed in this series, Mesos does not stand alone in the SDDC; it leverages other technologies as needed, as discussed in the last post (@ http://www.vamsitalkstech.com/?p=4670). It should be noted, however, that Mesos at times has overlapping functionality with technologies such as Kubernetes and OpenStack.

Let us consider the integration points between these technologies –

  1. Linux Containers – Over the last few years, Linux containers have emerged as a viable, lightweight alternative to hypervisors as a way of running multiple applications on a given OS. Different containers share one underlying OS and perform with less overhead than virtual machines. Given that one of the chief goals of Mesos is to run multiple frameworks on the same set of hardware, Mesos implements what are called isolation modules and isolation mechanisms to achieve multi-tenancy for different applications running on the same hardware. Mesos supports popular technologies for process isolation – cgroups, Solaris Zones and Docker containers. The first two are the default; the Mesos project added Docker as an isolation mechanism more recently.
  2. Schedulers – There is no single widely accepted definition of what constitutes a Container Orchestration technology. The tooling for launching containers at scale has become one of the trickiest parts of the discussion, with multiple projects attempting to capture this market. The requirement in the case of Mesos is straightforward – frameworks constitute applications that need to make the most efficient use of hardware, which means avoiding the overhead of VMs and leveraging containers (cgroups, Docker, Rocket etc). Hence Mesos needs to support container orchestration as a core feature, and it follows a pluggable model by supporting schedulers such as Kubernetes, YARN, Marathon and Docker Swarm. All of these tools organize containers into clusters, run them on specified servers, and handle the overall lifecycle management and scheduling of applications running as containers. At large webscale properties, massive container-oriented environments running hundreds of microservices are managed with this combination of tools on Mesos, which must also be able to start and stop services in response to failure conditions etc.
  3. Private and Public Cloud Infrastructure-as-a-Service (IaaS) Providers – Mesos works at a different layer of abstraction than an IaaS provider such as OpenStack and aims to solve different problems. While OpenStack provisions infrastructure across OS, Storage, Networking etc, Mesos aims for better cloud instance utilization. Mesos integrates well with OpenStack, running on top of the resources OpenStack offers up and scheduling frameworks onto them. Mesos itself runs as a small Linux process on each node of an existing OpenStack deployment, though it can just as easily run on bare metal. It is also significantly simpler than OpenStack – it takes only a few hours, if that, to get up and running.
    Mesos has also been deployed on public cloud infrastructure, on both Microsoft Azure and Amazon AWS. Azure’s container service is built on Mesos. Netflix leverages Mesos extensively on their EC2 cloud and has written an advanced scheduling library called Fenzo. Fenzo follows a first-fit style of assignment, where tasks are ‘bin packed’ onto Agents by their requested CPU, memory and network bandwidth. Fenzo also autoscales cluster usage based on demand and spreads the tasks of a given job across EC2 availability zones for high availability. [2] With the stage set from a technology standpoint, let us look at a few real-world use cases where Mesos has been deployed in mission-critical applications at Netflix.
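A first-fit placement of the kind Fenzo performs can be sketched like so (a simplification of the idea, not Fenzo's actual API; real Fenzo also weighs network bandwidth, availability-zone spread and autoscaling rules):

```python
def first_fit(agents, task):
    """Fenzo-style first-fit placement: put the task on the first agent
    with enough spare CPU and memory, packing agents tightly so that
    idle agents can be scaled down."""
    for agent in agents:
        if agent["cpu"] >= task["cpu"] and agent["mem"] >= task["mem"]:
            agent["cpu"] -= task["cpu"]
            agent["mem"] -= task["mem"]
            return agent["name"]
    return None  # no capacity anywhere: a signal to autoscale the cluster up

agents = [{"name": "agent-1", "cpu": 2, "mem": 4},
          {"name": "agent-2", "cpu": 8, "mem": 16}]
placed = first_fit(agents, {"cpu": 4, "mem": 8})  # lands on "agent-2"
```

Bin packing concentrates load onto fewer agents, which is what makes demand-based downscaling of the remaining idle agents possible.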

Mesos Deployment @ Netflix..

Netflix is one of the largest adopters of and contributors to Mesos, using it across a wide variety of business capabilities. These use cases include real-time anomaly detection, the data science lifecycle (training and model-building batch jobs, machine learning orchestration), and other business applications. These workloads span a range of technical architectures – batch processing, stream processing and microservices-based applications.

Netflix runs its business applications as a collection of microservices deployed on Amazon EC2, and its first use of Mesos was to perform fine-grained resource allocation for compute tasks to gain greater unit efficiency on EC2. The first use case for Mesos at large enterprises is typically increasing the usage and efficiency of elastic cloud services; in Netflix’s case, they needed the cluster scheduler to increase both agent ephemerality and demand-based agent autoscaling.

Major Application Use Cases –

  • Mantis – Netflix deals with a large amount of operational data constantly streaming into its environment, with use cases including real-time dashboarding, alerting, anomaly detection, metric generation, and ad-hoc interactive exploration of streaming data. To serve these, Mantis is a reactive stream processing platform deployed as a cloud native service and focused on operational data streams. Another goal of Mantis is to make it easy for different development teams to obtain access to real-time events and build applications on them. The current throughput of Mantis is around 8 million events per second, with Apache Mesos running hundreds of stream-processing jobs around the clock. For certain kinds of streaming applications, this amounts to tracking millions of unique combinations of data all the time.

    Mantis Architecture is based on Apache Mesos ..
  • Titus – As mentioned above, Netflix runs its application services stack on Amazon EC2, with most workloads running in Linux containers. Netflix created Titus as a container management platform to provision Docker containers on EC2 – necessary because Amazon ECS was not yet up to par as a container orchestration solution for EC2. The use cases supported by Titus include batch jobs that help with algorithm training (similar titles for recommendations, A/B test cell analysis, etc.) as well as hourly ad-hoc reporting and analysis jobs. Titus recently added support for service-style invocation for Netflix resources that provide consistent development environments and more fine-grained resource management.

  • Titus is a Container management platform that provisions Docker containers on EC2.

  • Meson – One of the most important capabilities Netflix possesses is its uncanny ability to predict which movies and shows its subscribers want to watch, based on their viewing history and similar segmentation data. Netflix excels at personalizing video recommendations, a capability powered by machine learning algorithms. To ensure that a very large number of machine learning workflow pipelines can be efficiently created, scheduled and managed, Netflix created Meson on top of Apache Mesos. For this system to scale, and for the algorithms themselves to be fast, reliable and efficient, these pipelines are run over a large cluster of Amazon AWS instances. As depicted below, Meson manages a large number of jobs with differing CPU, memory and disk requirements. Once the slaves/agents are chosen, Spark jobs are run on these shared clusters. Meson uses Linux cgroups-based isolation, and all of the resource scheduling is handled via Fenzo (described above).

    Meson is a platform used to create high velocity Data Science pipelines that power much of Netflix’s intelligent applications.
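Several of the platforms above, Mantis in particular, exist to serve alerting and anomaly detection over operational event streams. The core idea can be sketched in a few lines of Python as a rolling z-score detector; the window size and threshold below are illustrative choices, not Mantis parameters.

```python
from collections import deque
from statistics import mean, pstdev

def anomaly_detector(stream, window=30, threshold=3.0):
    """Flag values more than `threshold` standard deviations away
    from the rolling statistics of the previous `window` observations."""
    history = deque(maxlen=window)
    for value in stream:
        if len(history) == window:
            mu, sigma = mean(history), pstdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield value  # anomalous observation
        history.append(value)

# A steady stream of ~100 events/sec with one spike at the end:
events = [100, 101, 99, 100, 102, 98, 100, 101] * 5 + [500]
print(list(anomaly_detector(events, window=20)))  # [500]
```

A production system like Mantis runs thousands of such operators in parallel over partitioned streams; the sketch only shows the per-key detection logic.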


Apache Mesos is a promising new technology which attempts to solve the scaling and clustering challenges encountered in the Software Defined Datacenter (SDDC). Its biggest benefits are more efficient use of infrastructure across complex applications, with native support for multitenancy. Mesos ensures that multiple kinds of applications or frameworks can share a given set of nodes, which yields not just more efficient sharing of hardware but also fault tolerance and load balancing for complex Cloud Native applications.
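Mesos accomplishes this sharing through a two-level scheduling model: the master offers each node's spare resources to registered frameworks, and each framework accepts the offers it can use and declines the rest. The toy model below illustrates the idea only; the node names, framework names and resource shapes are invented, and this is not the actual Mesos API.

```python
# Toy model of Mesos-style two-level scheduling: the "master" advertises
# resource offers, and each framework accepts only offers it can use.

OFFERS = [  # (node, free_cpus, free_mem_gb)
    ("node-1", 8, 32),
    ("node-2", 2, 4),
    ("node-3", 16, 64),
]

FRAMEWORKS = {          # framework -> per-task (cpus, mem_gb)
    "spark-batch": (4, 16),
    "web-service": (1, 2),
}

def offer_cycle(offers, frameworks):
    placements = []
    for node, cpus, mem in offers:
        for name, (need_cpu, need_mem) in frameworks.items():
            if cpus >= need_cpu and mem >= need_mem:
                placements.append((name, node))      # framework accepts
                cpus, mem = cpus - need_cpu, mem - need_mem
            # otherwise the framework declines and the remainder
            # is re-offered to other frameworks
    return placements

print(offer_cycle(OFFERS, FRAMEWORKS))
```

In real Mesos deployments the frameworks (Spark, Marathon, custom schedulers) make these accept/decline decisions continuously, which is how heterogeneous workloads end up packed onto the same nodes.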

While Mesos has had a good degree of adoption among the webscale properties where it was first created (Twitter, Netflix, Uber and Airbnb, to name the most prominent), it still needs to prove itself as a dependable and robust platform in the enterprise datacenter.

The next post in this series will explore another exciting technology – Docker, the emerging standard in the Linux container space.


[1] Apache Mesos Documentation – http://mesos.apache.org/documentation/latest/architecture/

[2] Distributed Resource Scheduling with Apache Mesos at Netflix – Medium.com


A Framework for Digital Transformation in the Retail Industry..(2/2)

Our environment embraces a lot of change — we have to, because the internet is changing and the technologies we use are changing… for somebody who hated change, I imagine high tech would be a pretty bad career. It would be very tough. There are much more stable industries and they should probably choose one of those more stable industries with less change. They’ll probably be happier there.” – Jeff Bezos Chairman and CEO – Amazon, May-2016

The retail carnage continued over the last month, with more household names such as Macy’s, Michael Kors, JC Penney and Abercrombie & Fitch announcing store closures. Long-tenured management teams are also departing at struggling retailers unable to make the digital cut. It is now clear that players that are primarily brick and mortar need to urgently reinvent themselves via Digital Innovation. This is easier said than done: delivering a transformative customer experience in such a highly competitive industry requires a cultural ability to embrace change and to thrive in it. From a core technology standpoint, the industry’s digital divide between leaders and laggards manifests itself across four high level dimensions – Cloud Computing, Big Data, Predictive Analytics & Business Culture. Investments in these areas are needed to improve the key customer value drivers – Increased Consumer Choice, Better Pricing, Frictionless Shopping & Checkout, Ease of Payments, Speedy Order Fulfillment and Operations. In this blogpost, we will discuss a transformation framework for legacy retailers across both dimensions – business and technology.

Digital Reinvention in Retail..

We’re taking a look at the reasons for the storefront pullback in the Retail industry. For those catching up on the business background, please read the first post below.

Here Is What Is Causing The Great Brick-And-Mortar Retail Meltdown of 2017..(1/2)

Amazon, the Gold Standard of Retail..

We have seen how an ever increasing percentage of global retail sales is gradually moving online. It follows naturally that a business model primarily focused on Digital e-commerce capabilities is a must-have in the industry. Amazon is setting new records for online sales – selling an ever larger online catalog of products (including furniture) and generating record revenues from seemingly unrelated areas of its business (e.g. AWS, Alexa, Echo). Amazon’s ability to continue generating cash is critically important, as it increases its financial capacity to compete with incumbent retailers such as Walmart. According to a research report by the ILSR [4], as of the end of 2016, Amazon had a market cap twice that of its biggest competitor – Walmart – even though Amazon reported only around $1 billion in profits over the preceding five-year reporting period, while Walmart generated about $80 billion over the same timeframe. Amazon plows everything back into growing its diverse businesses and aims to grow market share at the expense of quarterly profits.

Amazon’s market cap is worth more than all major brick and mortar retailers put together. (Credit – Equitykicker)

Amazon is the envy of every retailer out there, with mammoth online sales of almost $80 billion in 2016. Walmart and Apple are a distant #2 and #3, with e-commerce sales of $13.5 billion and $12 billion respectively [1].

Why has Amazon been and will continue to be successful?

I should confess that I am an Amazon fan going back several years. Please find a backgrounder on their business strategy written over two years ago.

Amazon declares results..and stuns!

Amazon has largely been successful for a few important reasons, the biggest being that it has used technology to completely rewrite the rules of Retail across the key processes – Consumer Choice, Frictionless Shopping & Checkout, Ease of Payments, Fulfillment and Innovation.

Consider the following –

  1. Platforms and Ecosystems – Amazon has built platforms that serve a host of areas in retail – ranging from books to online video to groceries to video games to virtually any kind of e-commerce. Across all of these platforms, it has constantly invested in business strategies that offer a superior customer experience from choice to fulfillment – be it free shipping (Amazon Prime) or innovation in drone based delivery.
  2. Constant Platform Innovation – Amazon has given its customers the largest (and ever growing) catalog of products to choose from, instant 1-click ordering, rapid delivery via Amazon Prime, online marketplaces for sellers, and more. By diversifying into areas like streaming video (via Amazon Prime Video), it is also turning into a content producer. With the launch of storefronts such as Amazon Go, it is striving to provide a seamless multichannel (digital and physical) experience so consumers can move effortlessly from one channel to another. For example, many shoppers use smartphones to reserve a product online and pick it up in a store.
  3. Always Push the Envelope on Advanced Technology – Digital product innovation implies an ability to create new products and services that meet changing customer demands – to bring products to market faster, and then to refine them based on customer feedback. Amazon supplements every platform it builds with ecosystems based on advanced technology. For instance, customers can now try any of the hundreds of voice control apps built on Amazon Alexa to order products across Amazon’s huge catalog. In April 2017, Amazon launched Echo Look, a digital assistant that can perform various functions – order a ride via Uber, order pizza from Domino’s, etc. Beyond obeying commands and reading out the news, it also comes with a camera that can take pictures and perform advanced image recognition. Owners can even try on a few outfits, upload the pictures to the device, and the Style Check function will tell them which combination looks best. [5]
  4. Create a Data Driven Customer Experience – Amazon uses data in ways that improve the efficiency of back end supply chains while creating micro opportunities to influence every customer interaction. It leverages big data and advanced analytics to better understand customer behavior. For example, gaining insight into customers’ buying habits – with their consent, of course – can lead to an improved customer experience and increased sales through more effective bundling.
  5. Streamlined Operations – Finally, Amazon’s operations are the envy of the retail world. With Amazon Web Services (AWS), Amazon is the Public Cloud leader. Amazon has constantly proved its capabilities in automating operations and digitizing business processes using robotic process automation. This is important because it enables quicker shipping times to customers while cutting operating waste and costs. As an example, on the 2015 earnings call, Amazon’s CFO touted its use of robotics in its large warehouses to lower costs: “We’re using software and algorithms to make decisions rather than people, which we think is more efficient and scales better.” [3]

Again, according to the ILSR [4], “Today, half of all U.S. households are subscribed to the membership program Amazon Prime, half of all online shopping searches start directly on Amazon, and Amazon captures nearly one in every two dollars that Americans spend online. Amazon sells more books, toys, and by next year, apparel and consumer electronics than any retailer online or off, and is investing heavily in its grocery business.”

Retail Transformation Roadmap

Outlined below are the four prongs of a progressive strategy that Retailers can adopt to survive and thrive in today’s competitive marketplace. Needless to say, the common theme across these strategies is “Amazon-lite”, i.e. leverage Digital technologies to create an immersive & convenient cross channel customer experience.

How can Retailers transform themselves to better compete in the Digital Age

Step #1 Develop a (Customer Focused) Digital Strategy..

This is a two pronged phase. First, customers across age groups use a variety of channels – mobile phones, apps, in-store kiosks, tablets, etc. – to purchase products. Second, despite all of the attractive features around convenient ordering, frictionless payments and ease of delivery, the primary factor driving purchases in certain channels is price. It is thus key, in defining the overall digital strategy, to identify the critical focus channels, the customer segments (based on loyalty & other historical data), and customers’ willingness to pay by channel and product mix. Once the strategy is defined, key metrics that drive top line growth need to be identified at the board level.

It is important to understand that brick and mortar sales will continue to lead for a long time. Thus, investing in highly efficient store layouts, customer traffic analysis, in-store mapping applications, etc. is clearly called for. Brick and mortar retailers also have a significant ability to drive higher customer foot traffic through capabilities such as in-store pickup of online orders. These are advantages that need to be leveraged.

There are several questions every Retailer needs to ask at the outset of this phase –

  • What customer focused business capabilities can be enabled across the organization?
  • What aspects of this transformation can digital technology best enable? Where are the current organizational and technology gaps that inhibit innovation?
  • How do we measure ROI and Business success across these projects? How do we benchmark ourselves from a granular process standpoint against the leaders?

Step #2 Accelerate Investments in New Technology..

The need of the hour for legacy Retail IT is to implement flexible digital platforms built around efficient use of data, real time insights and predictive analytics. Leaders are building platforms based not just on Big Data Analytics but also on Deep Learning – examples include digital assistants such as chatbots and mobile applications that can perform image recognition. What business capabilities can be driven from this? Tons.

The ability to run quicker product tests and to modify products per customer feedback is fast becoming the norm. Depending on the segment of Retail you operate in (e.g. Apparel), the need is to make customer tryouts more convenient with the aid of both technology and humans – knowledgeable sales associates.

Further, the ability to mine customer, supplier and partner data enables retailers to offer customers relevant products, a wide ranging catalog, promotions & coupons, and complementary products such as store/private label credit cards.

The Three Core Competencies of Digital – Cloud, Big Data & Intelligent Middleware

Step #3 Use Data to Drive Operations

It is no longer sufficient to perform brand studies that only examine historical customer transactions and behavior. There is a strong degree of correlation between customer purchases and the products recommended across their social networks – such as Facebook, Pinterest and Twitter. Such advanced analytics are needed to drive product development, promotions and advertising.
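As an aside, the simplest way to test such a correlation claim against your own data is the Pearson coefficient over per-product counts. The sketch below is purely illustrative; the mention and sales figures are made up.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-product counts: social-network mentions vs. units sold
mentions = [12, 45, 3, 30, 8]
units_sold = [140, 520, 40, 380, 95]
r = pearson(mentions, units_sold)
print(round(r, 3))  # close to 1.0 for strongly correlated series
```

A coefficient near 1.0 supports the claim for that product set; near 0 it does not. Real studies would of course control for confounders rather than rely on a raw correlation.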

Demystifying Digital – the importance of Customer Journey Mapping…(2/3)

The tremendous impact of AI (Artificial Intelligence) based approaches such as Deep Learning & Robotic Process Automation are beginning to be felt across early adopter industries like Banking.

Retailers playing catchup need to re-examine business and technology strategy across six critical prongs –

  1. Product Design
  2. Inventory Optimization
  3. Supply Chain Planning
  4. Transportation and Logistics
  5. IoT driven Store design
  6. Technology driven warehousing and order fulfillment

Step #4 Drive a Digital Customer Experience..

We have discussed the need to provide an immersive customer experience. Big Data Analytics drives business use cases in Digital in myriad ways – key examples include  –

  1. Obtaining a realtime Single View of an entity (typically a customer across multiple channels, product silos & geographies) – this drives customer acquisition, cross-sell, pricing and promotion. 
  2. Customer Segmentation by helping retailers understand their customers down to the individual as well as at a segment level. This has applicability in marketing promotions and campaigns.
  3. Customer sentiment analysis by combining internal organizational data, clickstream data, sentiment analysis with structured sales history to provide a clear view into consumer behavior.
  4. Product Recommendation engines which provide compelling personal product recommendations by mining realtime consumer sentiment and product affinity information alongside historical data.
  5. Market Basket Analysis – observing consumer purchase history and enriching it with social media, web activity, and community sentiment regarding past purchases and future buying trends.
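To give a flavor of the last technique, a minimal market basket co-occurrence count can be sketched as follows. The baskets are toy data; production systems run association-rule algorithms (e.g. Apriori or FP-Growth) over far larger datasets.

```python
from itertools import combinations
from collections import Counter

def pair_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"butter", "bread", "jam"},
    {"beer", "chips"},
]
print(pair_counts(baskets).most_common(1))  # [(('bread', 'butter'), 3)]
```

Frequently co-occurring pairs feed directly into promotions and bundling decisions – the "customers who bought X also bought Y" pattern.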

Demystifying Digital – Why Customer 360 is the Foundational Digital Capability – ..(1/3)


Retail as an industry continues to present interesting paradoxes in 2017. Traditional retailers continue to suffer with store closings while online entrants with their new approaches continue to thrive by taking market share away from the incumbents. The ability to adopt a Digital mindset and to offer technology platforms that enhance customer experiences will largely determine survival.


[1] “Amazon and Walmart are the top e-commerce retailers” http://wwd.com/business-news/financial/amazon-walmart-top-ecommerce-retailers-10383750/

[2] “Four Reasons Why Amazon’s stock will keep doubling every three years ”  https://www.forbes.com/sites/petercohan/2017/04/28/four-reasons-amazon-stock-will-keep-doubling-every-three-years/#45a4b68923c8

[3] “Wal-Mart, others speed up deliveries to shoppers” http://www.chicagotribune.com/business/ct-faster-holiday-deliveries-20151016-story.html

[4] Report on Amazon by the Institute of Local Self Reliance (ILSR) – https://ilsr.org/wp-content/uploads/2016/11/ILSR_AmazonReport_final.pdf

[5] “How Amazon stays more agile than most startups” – https://www.forbes.com/sites/howardhyu/2017/05/02/how-amazon-stays-more-agile-than-most-startups/#76ab2b572103

Why Digital Platforms Need A Software Defined Datacenter..(2/7)

The first blog in this seven part series (@ http://www.vamsitalkstech.com/?p=1833) introduced and discussed a reference architecture for Software Defined Data Centers (SDDC). The key runtime technology paradigm that enables Digital applications is agility in the underlying datacenter infrastructure. Using an SDDC approach, complex underlying infrastructure (primarily compute, storage and network) is abstracted away from the applications running on it. This second blog post will discuss the challenges traditional datacenters face in running large scale Digital Applications.

Image Credit – Datacenter Dynamics


Every Enterprise in the middle of Digital reinvention realizes that the transformation is critically based on technology – a mix of Big Data, Cloud, IoT, Predictive Analytics etc. It follows that traditional IT assets & the enterprise datacenter are in need of a substantial refresh. Systems that dominate the legacy landscape – mainframes, midrange servers, proprietary storage systems – are slowly being phased out in favor of Cloud platforms running commodity x86 servers with the Linux OS, Big Data workloads, Predictive Analytics etc.

Traditional datacenters were built with application specific workloads in mind with silos of monitoring tools whereas Digital implies a move to fluid applications with changing workload requirements and more unified monitoring across the different layers.

We have dwelt on how the Digital platforms are underpinned by Cloud, Big Data and Intelligent Middleware.

The Three Core Competencies of Digital – Cloud, Big Data & Intelligent Middleware

It comes as no surprise that according to Gartner Research, by 2020, the Software-Defined Datacenter (SDDC) will become the dominant architecture in at least 75 percent of global data centers[1]. With the increasing adoption of APIs across the board and rapid increase in development of cloud-native digital applications using DevOps methodologies, the need for SDDC is only forecast to increase.

For those new to the concept of SDDC, attached is a link to the first blog in this series below where we discussed the overall technical concept along with a reference architecture.

Why Software Defined Infrastructure & why now..(1/6)

Legacy Datacenter vs SDDC..

For the last two decades, the vast majority of enterprise software applications were based on monolithic architectures, typically created by dispersed teams who modeled their designs around organizational silos and the resulting fragmented patterns of communication. Globally siloed developer teams would build these applications and then pass the deployment artifacts over to the operations team. The applications were then deployed in datacenters, typically on high end servers, using Vertical Scaling, where multiple instances of an application run on a few high end servers. As load on the application increased, adding more CPU, RAM etc. to these servers increased their ability to scale. These applications were typically deployed, managed and updated in silos.

Thus, much of what exists in the data centers across enterprise are antiquated technology stacks. These range from proprietary hardware platforms to network devices & switches to monolithic applications running on them. Other challenges surrounding these systems include inflexible, proprietary integration & data architectures.

The vast majority of current workloads are focused on systems such as ERP and other back office applications, and the infrastructure running them is unsuited to cloud native applications such as Digital Platforms, which support very large numbers of users and need real time insights around customer engagement.

Quite often these legacy applications have business & process logic tightly coupled with infrastructure code. This results in complex manual processes, monolithic applications, out-of-compliance systems with out-of-date patch levels, and tightly coupled systems integration. Some of these challenges have been termed Technical Debt.

While it is critical for datacenters to operate in a manner that maximizes efficiency, they also need to manage costs from an infrastructure, power and cooling standpoint while ultimately delivering the right business outcomes for the organization.

IDC forecasts that by 2018, 50% of new datacenter infrastructure investments will be for systems of engagement, insight, and action rather than maintaining existing systems of record.[2]

A great part of this transformation is also cultural. It is clear and apparent to me that the relationship lines of business (LOBs) have with their IT teams – typically central & shared – is completely broken at a majority of large organizations. Neither side seems able to appreciate the perspective or the priorities of the other. This dangerous dysfunction usually leads to multiple complaints from the business, examples of which include –

  • IT is perceived to be glacially slow in providing the infrastructure needed to launch new business initiatives or to amend existing ones. This leads to the phenomenon of ‘Shadow IT’, where business applications are run on public clouds, bypassing internal IT
  • Something seems to be lost in translation while conveying requirements to different teams within IT
  • IT is too focused on technological capabilities – Virtualization, Middleware, Cloud, Containers, Hadoop et al without much emphasis on business value drivers

Rapid provisioning of IT resources is a huge bottleneck, which frequently leads lines of business to adopt the public cloud to run their workloads. According to Rakesh Kumar, managing vice president at Gartner: “For over 40 years, data centers have pretty much been a staple of the IT ecosystem. Despite changes in technology for power and cooling, and changes in the design and build of these structures, their basic function and core requirements have, by and large, remained constant. These are centered on high levels of availability and redundancy, strong, well-documented processes to manage change, traditional vendor management and segmented organizational structures. This approach, however, is no longer appropriate for the digital world.” [1]

Further, cloud-native applications are evolving into enterprise architectures built on granular microservices, with each microservice running in its own Linux container. Digital architectures are thus evolving into highly standardized stacks that scale “horizontally”. Horizontal Scaling refers to increasing the overall footprint of an application’s architecture by quickly adding more servers, as opposed to increasing the capacity of existing servers.
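In practice, horizontal scaling decisions are automated. Below is a minimal sketch of the target-tracking arithmetic behind a typical autoscaler – the same shape of rule used by systems such as the Kubernetes Horizontal Pod Autoscaler. The utilization targets and fleet bounds are illustrative assumptions.

```python
import math

def desired_instances(current, util_pct, target_pct, min_n=2, max_n=100):
    """Target-tracking rule: resize the fleet so that average
    utilization returns to the target, clamped to fleet bounds."""
    desired = math.ceil(current * util_pct / target_pct)
    return max(min_n, min(max_n, desired))

# 10 instances running at 90% CPU against a 60% target -> grow to 15
print(desired_instances(10, 90, 60))  # 15
# 10 instances at 30% against a 60% target -> shrink to 5
print(desired_instances(10, 30, 60))  # 5
```

The rounding up and the min/max clamps matter: rounding up avoids oscillating below target, and the bounds keep a noisy metric from scaling the fleet to zero or to an unaffordable size.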

The below illustration depicts the needs of a Digital datacenter as opposed to the traditional model.

The Five Challenges of Running Massively distributed Architectures..

The SDDC, with its focus on software controlling commodity hardware, enables a degree of flexibility and cost savings that was simply not possible before. In the next section, we will consider the requirements Digital Applications impose on a traditional datacenter.

What Do Digital Applications Require From Data Center Infrastructure..

As one can see from the above, traditional approaches to architecting data centers do not scale well, from both a technology and a cost standpoint, as far as Digital Applications are concerned. As the diagram below captures, there are five main datacenter challenges encountered while architecting and deploying large or medium scale digital applications.

Running Digital Applications in legacy data centers requires surmounting five important challenges.

#1– Digital Applications Need Fast Delivery of Complex, Multivaried Application Stacks 

Digital applications are a combination of several different technology disciplines – Big Data, Intelligent Middleware, Mobile applications etc. Thus, data centers will need to run clusters of multi-varied applications at scale. Depending on the scope – a given application will consist of web servers, application servers, Big Data processing clusters, message queues, business rules and process management engines et al.

In the typical datacenter configuration, servers follow a vertical scaling model, which limits their ability to host multi tenant applications: they cannot natively separate workloads of different kinds running on the same underlying hardware. The traditional approach to ameliorate this has been to invest in multiple sets of hardware (servers, storage arrays) to physically separate applications, resulting in increased running costs, higher personnel requirements, and manual processes around system patching and maintenance.

#2– Digital Applications Need Real Time Monitoring & Capacity Management of complex Architectures

Digital Applications also call for the highest degrees of infrastructure and application reliability across the stack. This implies not only a high level of monitoring but also seamless deployment of large scale applications across large clusters. Digital Applications are data intensive: data flows into them in realtime from various sources for processing. These applications are subject to spikes in usage, and as a result the underlying infrastructure hosting them can exhibit poor response times and availability issues.

Further, these applications are owned by combined teams of Developers and Operations. Owing to microservice architectures, the number of failure points also increases. Datacenter infrastructure is thus shared between both teams, with each expected to understand the other discipline and even participate in it.

Traditional datacenters suffered from over-provisioned capacity and low utilization rates. Capacity Management is critical across compute, network and storage. Sizing these resources (vCPU, vRAM, virtual network etc.) and dynamically managing their placement is a key requirement for digital application elasticity.

The other angle is that Digital applications typically work on a chargeback model, where central IT charges each line of business only for the IT services actually consumed. This implies that IT must meter capacity consumption on a real time basis using APIs. Monitoring, capacity management and chargeback thus all need to be an integrated capability.
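A minimal sketch of such a chargeback calculation follows. The metric names and unit rates are invented for illustration; a real system would pull metered usage from the monitoring APIs just described.

```python
# Illustrative chargeback: price metered usage per line of business.
RATES = {  # hypothetical unit rates in dollars
    "vcpu_hours": 0.05,
    "ram_gb_hours": 0.01,
    "storage_gb_months": 0.03,
}

def chargeback(usage):
    """Return the bill for one line of business, given metered usage."""
    return sum(RATES[metric] * qty for metric, qty in usage.items())

retail_lob = {"vcpu_hours": 2000, "ram_gb_hours": 8000, "storage_gb_months": 500}
print(f"${chargeback(retail_lob):.2f}")  # $195.00
```

The hard part in practice is not the arithmetic but attributing shared-cluster consumption to the right tenant, which is exactly why metering has to be integrated with the scheduler and the monitoring stack.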

#3– Digital Applications Call for Dynamic Workload Scheduling

The ability to provide policy driven application & workload scheduling is a key criterion for Digital Applications. These applications work best on a self service paradigm, leveraging APIs to reconfigure & reprovision infrastructure resources dynamically based on workload needs. For instance, most Digital applications leverage Linux containers, which need to be dynamically scheduled and migrated across different hosts. Digital Applications thus need to be fluid in how they scale across multiple hosts.

#4– Digital Applications Need Speedy Automation Across the Layers 

We discussed how one of the critical differentiators for Digital Enterprise applications is the standardization of architectural stacks. Depending on the scale, size and complexity of applications, the choices of web development frameworks, libraries, application servers, databases and Big Data stacks need to be whittled down to a manageable few. This increases dependencies for applications across the infrastructure. From a horizontal scalability perspective, thousands of instances of popular applications will need to run on large scale infrastructure. The key is ensuring a high degree of automation from a cloud system administration standpoint. Automation spans a variety of topics – line of business self service, server automation, dynamic allocation of infrastructure, intelligent deployments, template based configuration of runtime elements, patching and workflow management.

#5– Seamless Operations and Deployment Management at Scale

Traditional datacenters typically take weeks to months to deliver new applications. Digital Applications call for multiple weekly deployments and the ability to roll versions forward or back quickly. Application deployment and security patch management need to cover a range of use cases, such as rolling deployments that ensure zero downtime, canary deployments to test functionality with a subset of users, and sharded deployments. From an application maintenance standpoint, understanding where performance issues such as delayed response times are occurring is of critical importance in ensuring customer satisfaction.
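As an illustration, the canary pattern reduces to deterministically routing a small, stable slice of users to the new version and watching its error rate. The sketch below is a toy model; real platforms implement this at the load balancer or service mesh layer, and the 5% slice is an illustrative choice.

```python
import hashlib

def assign_version(user_id, canary_percent=5):
    """Deterministically route a fixed slice of users to the canary,
    so the same user always sees the same version."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < canary_percent else "stable"

users = [f"user-{i}" for i in range(1000)]
canary_share = sum(assign_version(u) == "canary" for u in users) / len(users)
print(f"{canary_share:.1%} of traffic on the canary")  # roughly 5%
```

Hashing the user id (rather than picking randomly per request) is what makes the experience consistent for each user, and makes rollback a one-line config change: set the canary percentage back to zero.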

For instance, in the Retail industry, online shopping cart abandonment is as high as 70% when website response times are slow.

The lack of support for any of these operational features can be fatal to user acceptance of Digital Applications, and can ultimately result in a range of issues – increased CapEx and OpEx, high server-to-sysadmin ratios and unacceptably high downtimes.

In summary, the traditional datacenter is not a good fit for the new age Digital Platform.

The SDDC Technology Ecosystem

It is evident from the first post (@ http://www.vamsitalkstech.com/?p=1833) that Software Defined Datacenters have evolved into large & complex ecosystems dominated by open source technology.

It has become increasingly difficult for enterprise CXOs and IT leadership to identify which projects do what and how they all fit together.

I believe the current SDDC technology ecosystem can be broken down into four complementary categories –

  1. Cloud Infrastructure – Includes IaaS providers (AWS, Azure, OpenStack etc)  and Service Management Platforms such as ManageIQ
  2. Provisioning & Configuration Management – Tools like Puppet, Ansible and Chef.
  3. Serverless Infrastructure & DevOps – Includes a range of technologies but primarily PaaS providers such as OpenShift and CloudFoundry who use Linux containers (such as Docker, Rocket) as the basic runtime unit
  4. Cloud Orchestration & Monitoring- Includes a range of projects such as Apache Mesos, Kubernetes

Readers will detect a distinct tilt in my thinking towards open source but it is generally accepted that open technology communities are the ones leading most of the innovation in this space – along with meaty contributions from public cloud providers especially Amazon and Google.

The Roadmap for the rest of the blogs in this series..

In this blog series, we will highlight specific cloud projects that are leading market adoption in the above categories.

The third and next post in this series will deep dive into Apache Mesos.

Subsequent posts in this series will cover best of breed projects – Docker & Kubernetes, ManageIQ, OpenStack, OpenShift in that order. The final post will round it all together with a sample real-world flow bringing all these projects together using a sample application provisioning flow.


Progressive enterprise IT teams have begun learning from the practices of the web-scale players and adopting agile ways of developing applications. They have also begun slowly introducing disruptive technologies around Cloud Computing (IaaS & PaaS), Big Data, agile developer toolsets, DevOps style development pipelines & deployment automation. Traditional datacenters are siloed in the sense that the core foundational components – servers, networking and storage – are deployed, managed and monitored by separate teams. This is the antithesis of Digital, where all these areas converge in a highly fluid manner.

The next post in this series will discuss Apache Mesos, an exciting new technology project that strives to provide a global cluster manager for the vast diversity of applications found in Digital projects.


[1] Gartner – ” Five Reasons Why a Modern Data Center Strategy Is Needed for the Digital World” – http://www.gartner.com/newsroom/id/3029231

[2] IDC Asia/Pacific Unveils its Top Datacenter Predictions for APeJ for 2017 and Beyond –

Why the PSD2 will Spark Digital Innovation in European Banking and Payments….

Banking may be on the cusp of an industrial revolution. This is being propelled by technology on the supply side and the financial crisis on the demand side. The upshot could be the most radical reconfiguration of banking in centuries.” – Andrew Haldane, Chief Economist, Bank of England, 2013 [1]

This blog has discussed Banking and Payments industry trends quite extensively over the last year. Perhaps the most groundbreaking directive from a supranational regulatory standpoint has been the promulgation of the Payment Services Directive revision 2 (PSD2) in the European Union (EU). The PSD2 is a technology driven directive that aims to foster competition, digital innovation and security across Retail Banking & Internet Payments.

Banking and Payments Innovation in the EU..

The first Payment Services Directive (PSD1) came into force in the EU in 2009. With the stated goal of creating a Single Euro Payments Area (SEPA), the PSD1 created rules and frameworks for modern payment services and opened up payments to new entrants. The goal of the Single Euro Payments Area is to standardize the way euro payments are made across the EU and to make all cross border payments in euro as seamless as making domestic payments within a given member state. SEPA covers the whole of the EU as well as non-EU European countries such as Iceland, Norway, Switzerland and Monaco.

A revised PSD (PSD2) was proposed in 2013 and adopted as PSD2 – EU Directive 2015/2366. The PSD2 carries monumental consequences for two lines of Global Banking – Retail Lines of Business (which typically include consumer savings & checking accounts, auto loans, mortgages and Small & Medium Enterprise Lending) and Payments (card payments, corporate payments, credit transfers, direct debits etc).

Many leading European Banks were propped up by the EU Central Bank during the financial crisis. However, most have not innovated in any meaningful manner; their market shares have largely stayed intact while consumers still face difficulties in cross border transactions. The EU clearly wants to use PSD2 as a vehicle to drive banking & payments innovation. Added to this is the Digital trend driven largely by global companies in the US and Asia. The intent of the PSD2 is to jumpstart the slow pace of innovation in the EU.

The PSD2 aims to foster a single market for consumer payments and eventually banking services. It intends to provide a framework for EU companies to respond to competitive changes in the payments landscape which have largely been driven by technology. The PSD2 also aims to drive further improvements in payment services across Europe by providing a number of enhancements to the PSD1 around the areas of mobile & online payments. It also harmonizes pricing and security among all member states. EU member state companies have until January 2018 to implement the PSD2.

It needs to be stated that ‘one leg out’ transactions – where at least one party is located inside the EU – are within the scope of the PSD2.

Open Banking, GDPR and PSD2…

The core themes of the PSD2 may not be all that new for UK banks, since Her Majesty’s Treasury is putting the finishing touches on the Open Banking Standard (OBS). While the topic has been covered quite exhaustively before in this blog, the OBS themes are very similar to those of the PSD2 –

A Reference Architecture for The Open Banking Standard..

While the General Data Protection Regulation (GDPR) deserves its own blogpost, it certainly seems to impose an opposite effect on the industry as compared with PSD2. Let me explain – while the PSD2 forces banks to unlock customer data via APIs, the GDPR imposes stringent requirements on them to protect customer data. It becomes effective in May 2018 (a few months after PSD2). Given the scope of both PSD2 and GDPR, banks will need to carefully assess and calibrate changes to a range of areas across the organization – security, lines of business communication, data management, partner ecosystems, outsourcing agreements etc.

So what does the PSD2 entail..

As mentioned above, the PSD2 moves the EU towards a single payment zone by creating explicit new institutional roles in the banking landscape. The goal is to clearly increase competition by changing the rules of participation in the banking & payments industry.

Under the PSD2, Banks and Payment Providers in the EU will need to unlock access to their customer data via Open APIs

First off, Banks need to begin opening up their customer data to third party providers of financial services, under the XS2A (Access to account) rule. They need to begin offering Open APIs (Application Programming Interface) to the TPPs (Third Party Providers).

This change creates three new types of roles for account and payment service providers –

  1. PISPs (Payment Initiation Service Providers) – who will initiate online payments on behalf of consumers without needing to use existing payment networks and schemes. These will clearly provide new payment options to consumers in areas such as account to account payment transfers and bill pay. Example of a scenario – when an EU customer purchases a product from a retailer sometime in 2018, the retailer can initiate a payment request directly to the consumer’s Bank (via a secure API call) without going through any intermediaries.
  2. AISPs (Account Information Service Providers) – who will be able to access customer core banking data and provide value added personal financial management tools such as account aggregation. Example of a scenario – an AISP will offer a consumer with multiple banking accounts a single aggregated view of all those accounts, plus value added services such as personal financial management tools built on all the transaction & historical data.
  3. ASPSPs (Account Servicing Payment Service Providers) – these are Credit Institutions (Banks that offer multiple services) and Payment Institutions (payment services providers) which are required to offer open APIs to the PISPs and the AISPs. These providers can charge a small price per transaction for the PISPs but cannot charge differently for payments initiated through their own products.

The PISPs, AISPs and ASPSPs will all be registered, licensed and regulated by an EU agency – the European Banking Authority (EBA). They will also be required to negotiate contracts with the individual banks, and they will all need to use Strong Customer Authentication (SCA) mechanisms to access customer data, thus reducing fraud in PSD2 transactions.

Open Banking via Open APIs..

The use of application programming interfaces (APIs) has been well documented across web scale companies such as Facebook, Amazon and Google. APIs are widely interoperable, relatively easy to create and form the front end of many Digital Platforms. They provide access to the core services of these platforms and can be used to create partner and customer ecosystems. Leading firms such as PayPal and Amazon & FinTechs such as Square and Mint have overwhelmingly used APIs not only to open their platforms to millions of developers but also to offer innovative services. It is anticipated that the high margin services created as a result of the PSD2 – consumer & SME lending, financial advisory, peer to peer payments, crowdfunding, comparison shopping, chatbots etc – will lead to Banking ‘App Stores’ for widespread download and use. The AISPs and PISPs will definitely target high margin services such as financial advisory and lending.

APIs enable the creation of new business models that can deliver differentiated experiences (source – IBM)

It is expected that the EBA will define standards for the PSD2 Open API, encompassing API definitions for standard banking operations such as checking account balances, performing transfers, viewing transaction histories and processing payments. Vendors in the API space have already begun offering models for specific banking workflows. Security models for PSD2 should include support for two factor authentication, consent management etc using standards such as OpenID Connect.
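To make the account-information idea concrete, here is a rough Python sketch of what an AISP balance query against a bank’s Open API might look like. To be clear, the endpoint path, header names and field names below are illustrative assumptions only – the actual contract will be defined by the EBA’s technical standards and individual bank implementations.

```python
import json

# Hypothetical PSD2-style AISP request builder. The URL scheme, the
# "Consent-ID" header and the payload shape are assumptions for
# illustration, not taken from any official specification.
def build_balance_request(base_url, iban, access_token, consent_id):
    """Assemble the account-balance request a licensed AISP might send
    to an ASPSP's Open API after Strong Customer Authentication."""
    return {
        "method": "GET",
        "url": f"{base_url}/v1/accounts/{iban}/balances",
        "headers": {
            # Bearer token obtained via an OAuth2 / OpenID Connect
            # flow during Strong Customer Authentication (SCA)
            "Authorization": f"Bearer {access_token}",
            # Reference proving the customer consented to data access
            "Consent-ID": consent_id,
            "Accept": "application/json",
        },
    }

request = build_balance_request(
    "https://api.examplebank.eu", "DE89370400440532013000",
    "example-token", "consent-42")
print(json.dumps(request, indent=2))
```

The key point the sketch captures is that every TPP call carries both an authentication credential and an explicit consent reference, which is what lets the ASPSP audit and control third party access.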

Strategic Implications for Banks & Payment Providers..

With PSD2, the European Parliament has adopted the legal foundation for the creation of an EU-wide single payments area (SEPA). While the goal of the PSD is to establish a set of modern, digital industry rules for all payment services in the European Union, it has significant ramifications for the financial services industry as it will surely disrupt current business models & foster new areas of competition. The key message from a regulatory standpoint is that consumer data can be opened up to other players in the payment value chain. This will lead to a clamor by players to own more of the customer’s data with a view to selling business services (e.g. accurate credit scoring, access to mortgage & other consumer loans, mutual funds etc) based on that information.

The top five implications of the PSD2 for Banks will be –

  1. Increased competition for revenues in their existing customer base – It is expected that a whole range of nimble competitors such as FinTechs and other financial institutions will jockey to sell products to bank customers.
  2. Banks that are unable to respond to PSD2 in a nimble manner will be commodified into utilities – Banks will lose their monopoly on being their customers’ primary front end. As FinTechs take over areas such as mortgage loans (an area where they’re much faster than banks in granting loans), Banks that cannot change their distribution and product models will be commodified. The challenges start with inflexible core banking systems that maintain customer demographics, balances and product information, and other BORT (Book Of Record Transaction) Systems that store a range of loan, payment and risk data. These architectures will slowly need to transition from their current (largely) monolithic designs to composable units. There are various strategies that Banks can follow to ‘modernize the core’; that may be the subject of a followup post.
  3. Lost Revenues – Over time, under PSD2, Banks and Payment providers will lose substantial revenues to the PISPs. The elimination of card surcharges and the Interchange Fee Regulation (IFR) caps on credit card transactions will not only disintermediate but also negatively impact card schemes such as Visa and MasterCard.
  4. A High Degree of IT Spend – To comply with the PSD2, Banks will spend tens to hundreds of millions of dollars implementing Open APIs, retrofitting these on legacy systems and complying with increased security requirements mandated by the PSD2.
  5. Implications for Regulatory Reporting and Risk Management – Clearly, the Banks are at a disadvantage here compared to the new entrants. The Banks still have to adhere to the Basel frameworks and AML (Anti Money Laundering) controls. The AISPs, on the other hand, are not subject to any of these restrictions, nor do they need to hold capital in reserve. PISPs will merely need to prove access to minimal capital reserves. Both AISPs and PISPs will need to explain their business plans and models clearly to regulators. They will also need to prove that their access to consumer data does not violate the intended use.

Why PSD2 is an Enormous Opportunity for Banks and Payment Providers..

At various times, we have highlighted various business & innovation issues with Banking providers in the areas of Retail Banking, Payment Providers and Capital Markets. Regimes such as PSD2 will compel staid industry players to innovate faster than they otherwise would.

After the PSD2 takes effect, banks face various choices, which can be grouped into three strategic options.

  1. Minimally Compliant Banks – Here we should categorize Banks that seek to provide bare bones compliance with the Open API. While this may be the starting point for several banks, staying too long in this segment will mean gradual market share erosion as well as a loss of customer lifetime value (CLV) over time. The reason for this is that FinTechs and other startups will offer a range of services such as Instant mortgages,  personal financial management tools, paperless approval processes for a range of consumer accounts etc. It is also anticipated that such organizations will treat PSD2 as a localized effort and will allocate personnel to the project mainly around the front office and marketing.
  2. Digital Starters – Banks that have begun opening up customer data and will support the core Open API while also introducing their own proprietary APIs. While this approach may work in the short to medium term, it will only impose integration headaches on the banks as time goes on.
  3. Digital Innovators – The Digital Innovators will lead the way in adopting open APIs. These banks will fund dedicated teams in lines of business serving their particular customer segments, either organically or through partnerships with TPPs. They will not only adhere to the PSD2 APIs but also extend the spec to create their own, with a focus on data monetization. Examples of such products and services will include Robo-advisors and Chatbots.

Recommendations for Banks on how to be a Digital Innovator….

In the PSD2 age, financial institutions need to embrace digital technology as a way of disarming competition and increasing their wallet share of customer business. They need to move beyond transactional banking to a customer centric model by offering value added services on the customer data that they already provide. Capabilities such as Customer Journey Mapping (CJM) and Single View of Customer (SVC) are the minimum table stakes that they need to provide.

Demystifying Digital – Why Customer 360 is the Foundational Digital Capability – ..(1/3)

So, the four strategic business goals that PSD2 compliant Digital Innovators need to drive towards in the long run are –

  1. Digitize The Customer Journey – Bank clients who use services like Uber, Zillow and Amazon in their daily lives are now very vocal in demanding a seamless experience across all of their banking services using digital channels. The vast majority of Bank applications still lag the innovation cycle, are archaic & are separately managed. The net issue with this is that the client is faced with distinct user experiences ranging from client on-boarding to servicing to transaction management. Such applications need to provide anticipatory or predictive capabilities at scale while understanding the specific customer’s lifestyle, financial needs & behavioral preferences.
  2. Provide Improved Access to Personal Financial Management (PFM) Tools & Improved Lending Processes  –  Provide consumers with a single aggregated picture of all their accounts without customers needing to engage a TPP (Third Party Provider). Also improve lending systems by providing more efficient access to loans by incorporating a large amount of contextual data in the process.
  3. Automate Back & Mid Office Processes Across Lending, Risk, Compliance & Fraud – PSD2 will impose substantial compliance costs on top of an already demanding regulatory arena. The need to forge a closer banker/client experience is not just driving demand around data silos & streams themselves but also forcing players to move away from paper based models to a seamless, digital & highly automated model, reworking a ton of existing back & front office processes. These processes range from risk data aggregation, supranational compliance (AML, KYC, CRS & FATCA) and financial reporting across a range of global regions to Cyber Security. Can the Data architectures & the IT systems that leverage them be created in such a way that they permit agility while constantly learning & optimizing their behaviors across national regulations, InfoSec & compliance requirements? Can every piece of actionable data be aggregated, secured, transformed and reported on in such a way that its quality across the entire lifecycle is guaranteed?
  4. Tune Existing Business Models Based on Client Tastes and Feedback – While the initial build out of the core architecture may focus on digitizing interactions and exposing data via APIs, what follows fast is strong predictive modeling capability working at large scale, where systems constantly learn and optimize their interactions, responsiveness & services based on client needs & preferences.

Recommendations for Payment Service Providers on how to be a Digital Innovator….

Banks must revise their Payments Strategy and adopt five components to be successful as an Everyday Payments provider in the new regulatory environment:

  1. Frictionless and integrated payments –  working with interested 3rd parties in facilitating multimode payments through a variety of front ends
  2. Payments Ecosystems – Payment providers should work on creating smart ecosystems with TPPs that not only offer payment services but also leverage their knowledge of customers to offer value added tools for personal financial planning
  3. Real time Payments innovation – driving realtime cross border payments that are seamless, reliable, cost effective for both corporates and individuals
  4. Customer Data Monetization – Payment providers have been sitting on petabytes of customer data and have only now begun waking up to the possibilities of monetizing this data. An area of increasing interest is providing sophisticated analytics to merchants as a way of driving merchant rewards programs. Retailers, Airlines and other online merchants need to understand which segments their customers fall into as well as what the best avenues are to market to each of them, e.g. web app, desktop or tablet. Using all of the Payment Data available to them, Payment providers can help Merchant Retailers understand their customers better as well as improve their loyalty programs.
  5. Enhancing the Digital experience in corporate payments – Using the learnings from the more dynamic consumer payments spectrum, payment providers should offer their business clients the same experience in a range of areas such as wire transfers and cash management services using mobile devices. The below blogposts provide more reading around the capabilities payment providers need to develop in the Digital arena.
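The merchant analytics idea in point 4 can be illustrated with a toy sketch: a payment provider segments a retailer’s customers directly from raw card transactions. The thresholds and segment names below are invented for demonstration; a real provider would use far richer features and models.

```python
from collections import defaultdict

# Toy customer segmentation from payment data. The spend/frequency
# thresholds and the segment labels are illustrative assumptions.
def segment_customers(transactions, high_spend=500.0, frequent=5):
    """transactions: iterable of (customer_id, amount) tuples.
    Returns a {customer_id: segment_label} mapping."""
    spend = defaultdict(float)
    count = defaultdict(int)
    for customer, amount in transactions:
        spend[customer] += amount
        count[customer] += 1
    segments = {}
    for customer in spend:
        if spend[customer] >= high_spend and count[customer] >= frequent:
            segments[customer] = "loyal-high-value"
        elif count[customer] >= frequent:
            segments[customer] = "frequent"
        else:
            segments[customer] = "occasional"
    return segments

txns = [("c1", 120.0)] * 5 + [("c2", 30.0)] * 2 + [("c3", 10.0)] * 6
print(segment_customers(txns))
```

Even this crude spend-and-frequency split is enough to drive differentiated loyalty offers per segment, which is the essence of the data monetization opportunity described above.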


With the PSD2, EU Banks and Payment service providers will need to accelerate the transition to a customer oriented mindset. They will be pushed to share data through open standards, become highly digitized in interacting with consumers and begin leveraging their first mover advantage. They need to use their vast internal data (about customers, transaction histories, financial preferences, operational insights etc) to create new products or services or to enhance the product experience.


[1] Andy Haldane – ‘Banking may be on the cusp of an industrial revolution’ – http://www.wired.co.uk/article/a-financial-forecast-from-the-bank-of-england

[2] PSD2 EU Directive – PSD2 – EU Directive 2015/2366

A Digital Reference Architecture for the Industrial Internet Of Things (IIoT)..

A few weeks ago on the invitation of DZone Magazine, I jointly authored a Big Data Reference Architecture along with my friend & collaborator, Tim Spann (https://www.linkedin.com/in/timothyspann/). Tim & I distilled our experience working on IIoT projects to propose an industrial strength digital architecture. It brings together several technology themes – Big Data , Cyber Security, Cognitive Applications, Business Process Management and Data Science. Our goal is to discuss a best in class architecture that enables flexible deployment for new IIoT capabilities allowing enterprises to build digital applications. The abridged article was featured in the new DZone Guide to Big Data: Data Science & Advanced Analytics which can be downloaded at  https://dzone.com/guides/big-data-data-science-and-advanced-analytics

How the Internet Of Things (IoT) leads to the Digital Mesh..

The Internet of Things (IoT) has become one of the four most hyped technology paradigms affecting the world of business, the other usual suspects being Big Data, AI/Machine Learning & Blockchain. Cisco predicts that the IoT will encompass about 25 billion connected things by 2020 and affect about $2 trillion of economic value globally across a diverse range of verticals. These devices are not just consumer oriented devices such as smartphones and home monitoring systems but dedicated industry objects such as sensors, actuators, engines etc.

The interesting angle to all this is the fact that autonomous devices are already beginning to communicate with one another using IP based protocols, largely exchanging state & control information. With the growth of computational power on these devices, we are not far off from them sending more granular and interesting streaming data – about their environment, performance and business operations – all of which will enable a higher degree of insightful analytics to be performed. Gartner Research has termed this interconnected world, where decision making & manufacturing optimization can occur via IoT, the “Digital Mesh”.

The evolution of technological innovation in areas such as Big Data, Predictive Analytics and Cloud Computing now enables the integration and analysis of massive amounts of device data at scale while performing a range of analytics and business process workflows on the data.

Image Credit – Sparkling Logic

According to Gartner, the Digital Mesh will thus lead to an interconnected information deluge powered by continuous data from these streams. These streams will encompass classical IoT endpoints (sensors, field devices, actuators etc) sending data in a variety of formats – text, audio, video & social data streams – along with new endpoints in areas as diverse as Industrial Automation, Remote Healthcare, Public Transportation, Connected Cars and Home Automation. These intelligent devices will increasingly begin communicating with their environments in a manner that will encourage collaboration in a range of business scenarios. The industrial cousin of the IoT is the Industrial Internet of Things (IIoT).

Defining the Industrial Internet Of Things (IIoT)

The Industrial Internet of Things (IIoT) can be defined as an ecosystem of capabilities that interconnects machines, personnel and processes to optimize the industrial lifecycle. The foundational technologies that IIoT leverages are Smart Assets, Big Data, Realtime Analytics, Enterprise Automation and Cloud based services.

The primary industries impacted the most by the IIoT will include Industrial Manufacturing, the Utility industry, Energy, Automotive, Transportation, Telecom & Insurance.

According to Markets and Markets, the annual worldwide Industrial IoT market is projected to exceed $319 billion in 2020, which represents an 8% compound annual growth rate (CAGR). The top four segments are projected to be manufacturing, energy and utilities, auto & transportation and healthcare.[1]

Architectural Challenges for Industrial IoT versus Consumer IoT..

Consumer based IoT applications generally receive the lion’s share of media attention. However, the ability of industrial devices (such as sensors) to send ever richer data about their operating environment and performance characteristics is driving a move to Digitization and Automation across a range of industrial manufacturing.

Thus, there are four distinct challenges that we need to account for in an Industrial IoT scenario as compared to Consumer IoT.

  1. The IIoT needs Robust Architectures that are able to handle millions of device telemetry messages per second. The architecture needs to take into account all kinds of devices, operating in environments ranging from the highly constrained to the well connected.
  2. IIoT also calls for the highest degrees of Infrastructure and Application reliability across the stack. For instance, a lost or dropped message in a healthcare or a connected car scenario may mean life or death for a patient, or an accident.
  3. An ability to integrate seamlessly with existing Information Systems. Let’s be clear – these new age IIoT architectures need to augment existing systems such as Manufacturing Execution Systems (MES) or Traffic Management Systems. In Manufacturing, MES systems continually improve the product lifecycle and perform better resource scheduling and utilization. This integration helps these systems leverage the digital intelligence and insights from (potentially) millions of devices across complex areas of operation.
  4. An ability to incorporate richer kinds of analytics than has been possible before that provide a great degree of context. This ability to reason around context is what provides an ability to design new business models which cannot be currently imagined due to lack of agility in the data and analytics space.

What will IIoT based Digital Applications look like..

Digital Applications are being designed for specific device endpoints across industries. While the underlying mechanisms and business models differ from industry to industry, all of these use predictive analytics based on a combination of real time data processing & data science algorithms. These techniques extract insights from streaming data to provide digital services on existing toolchains, provide value added customer service, predict device performance & failures, improve operational metrics etc.

Examples abound. For instance, a great example in manufacturing is the notion of a Digital Twin, which Gartner called out last year. A Digital Twin is a software personification of an intelligent device or system. It forms a bridge between the real world and the digital world. In the manufacturing industry, digital twins can be set up to function as proxies of Things like sensors and gauges, coordinate measuring machines, vision systems and white light scanning. This data is sent to a cloud based system where it is combined with historical data to better maintain the physical system.

The wealth of data being gathered on the shop floor will ensure that Digital Twins will be used to reduce costs and increase innovation. Thus, in global manufacturing, Data Science will soon make its way onto the shop floor to enable the collection of insights from these software proxies. We covered the phenomenon of Servitization in manufacturing in a previous blogpost.
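A minimal sketch of the Digital Twin concept described above: a software proxy that mirrors a physical sensor’s readings and compares the latest value against its own history to flag maintenance. The class shape, field names and the three-sigma rule are illustrative assumptions, not a production design.

```python
import statistics

# Toy Digital Twin: a software proxy for a physical device that keeps
# the device's reading history and flags anomalies against it.
class DigitalTwin:
    def __init__(self, device_id, history=None):
        self.device_id = device_id
        self.history = list(history or [])   # past sensor readings

    def update(self, reading):
        """Mirror a new reading arriving from the physical device."""
        self.history.append(reading)

    def needs_maintenance(self, sigma=3.0):
        """Flag the asset when the latest reading deviates more than
        `sigma` standard deviations from the historical mean."""
        if len(self.history) < 3:
            return False
        *past, latest = self.history
        mean = statistics.mean(past)
        stdev = statistics.pstdev(past) or 1e-9
        return abs(latest - mean) > sigma * stdev

twin = DigitalTwin("pump-17", history=[70.0, 71.0, 69.0, 70.0])
twin.update(70.5)
print(twin.needs_maintenance())   # normal reading
twin.update(120.0)
print(twin.needs_maintenance())   # anomalous reading
```

In a real deployment the twin would be fed from the cloud ingest pipeline and would combine far more signals (vibration, pressure, firmware state), but the pattern of comparing live state against mirrored history is the same.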

In the Retail industry, an ability to detect a customer’s location in realtime and combining that information with their historical buying patterns can drive real time promotions and an ability to dynamically price retail goods.

Solution Requirements for an IIoT Architecture..

At a high level, the IIoT reference architecture should support six broad solution areas-

  1. Device Discovery – Discovering a range of devices (and their details)  on the Digital Mesh for an organization within and outside the firewall perimeter
  2. Performing Remote Lifecycle Configuration of these devices ranging from startup to modification to monitoring to shut down
  3. Performing Deep Security level introspection to ensure the patch levels etc are adequate
  4. Creating Business workflows on the Digital Mesh. We will do this by marrying these devices to enterprise information systems (EISs)
  5. Performing Business oriented Predictive Analytics on these devices – this is critical to extracting business value from the device data
  6. On a futuristic basis, support optional integration with the Blockchain to support a distributed organizational ledger that can coordinate activity across all global areas that an enterprise operates in.

Building Blocks of the Architecture

Listed below are the foundational blocks of our reference architecture. Though the requirements will vary across industries, an organization can reasonably standardize on a number of foundational components as depicted below and then incrementally augment them as the interactions between different components increase based on business requirements.

Our reference architecture includes the following major building blocks –

  • Device Layer
  • Device Integration Layer
  • Data & Middleware Tier
  • Digital Application Layer

It also includes the following cross cutting concerns which span across the above layers –

  • Device and Data Security
  • Business Process Management
  • Service Management
  • UX Design
  • Data Governance – Provenance, Auditing, Logging

The next section provides a brief overview of the reference architecture’s components at a logical level.

A Big Data Reference Architecture for the Industrial Internet depicting multiple functional layers

Device Layer – 

The first requirement of IIoT implementations is to support connectivity from the Things themselves, or the Device layer depicted at the bottom. The Device layer includes a whole range of sensors, actuators, smartphones, gateways and industrial equipment. The ability to connect with devices and edge devices like routers and smart gateways using a variety of protocols is key. Network protocols such as Ethernet, WiFi and Cellular can all connect directly to the internet; other protocols such as Bluetooth, RFID, NFC and Zigbee need a gateway device to connect. Devices can connect directly with the data ingest layer shown above, but it is preferred that they connect via a gateway which can perform a range of edge processing.

This is important from a business standpoint. For instance, in certain verticals like healthcare and financial services, there exist stringent regulations that govern when certain identifying data elements (e.g. video feeds) can leave the premises of a hospital or bank. A gateway can not only perform intelligent edge processing but also connect thousands of device endpoints and facilitate bidirectional communication with the core IIoT architecture.

The ideal tool for these constantly evolving devices, metadata, protocols, data formats and types is Apache NiFi.  These agents will send the data to an Apache NiFi gateway or directly into an enterprise Apache NiFi cluster in the cloud or on-premise.

Apache NiFi Eases Dataflow Management & Accelerates Time to Analytics In Banking (2/3)..

A subproject of Apache NiFi – MiNiFi – provides a complementary data collection approach that supplements the core tenets of NiFi in dataflow management. Due to its small footprint and low resource consumption, it is well suited to handle dataflow from sensors and other IoT devices. It provides central management of agents while providing full chain of custody information on the flows themselves.

For remote locations and more powerful devices like the Arrow BeagleBone Black Industrial and the MyPi Industrial, it is very simple to run a tiny Java or C++ MiNiFi agent for secure connectivity needs.

The data sent by the device endpoints are then modeled into an appropriate domain representation based on the actual content of the messages. The data sent over also includes metadata around the message. A canonical model can optionally be developed (based on the actual business domain) which can support a variety of applications from a business intelligence standpoint.

Apache NiFi supports the flexibility of ingesting changing file formats, sizes, data types and schemas. The devices themselves can send a range of feeds in different formats – e.g. XML now and, based on upgraded capabilities, richer JSON tomorrow. NiFi supports ingesting any file type that the devices or the gateways may send. Once the messages are received by Apache NiFi, they are enveloped in security, with every touch to each flow file controlled, secured and audited. NiFi flows also provide full data provenance for each file, packet or chunk of data sent through the system. NiFi can work with specific schemas if there are special requirements for file types, but it can work with unstructured or semi structured data just as well. From a scalability standpoint, NiFi can ingest 50,000 streams concurrently on a zero-master, shared nothing cluster that horizontally scales via easy administration with Apache Ambari.

Data and Middleware Layer – 

The IIoT Architecture recommends a Big Data platform with native message oriented middleware (MOM) capabilities to ingest device mesh data. This layer will also process device data in whatever fashion – batch or real-time – the business needs demand.

Application protocols such as AMQP, MQTT, CoAP, WebSockets etc are all deployed by many device gateways to communicate application-specific messages. The reason for recommending a Big Data/NoSQL-dominated data architecture for IIoT is quite simple: these systems provide Schema on Read, an innovative data handling technique. In this model, a format or schema is applied to data as it is accessed from a storage location, as opposed to being applied while the data is ingested. From an IIoT standpoint, one must deal not just with the data itself but also with metadata such as timestamps, device IDs, and other firmware data such as software version, device manufacture date etc. The data sent from the device layer will consist of time series data and individual measurements.
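The schema-on-read idea can be sketched in a few lines of Python: raw telemetry is stored untouched, and a read-time schema projects and coerces only the fields a given consumer cares about. The field names and records here are illustrative:

```python
import json

# Raw telemetry lands in the store exactly as the gateways sent it.
RAW_STORE = [
    '{"device_id": "t-1", "temp_c": "21.4", "ts": 1496300000}',
    '{"device_id": "t-2", "temp_c": "22.9", "ts": 1496300060, "fw": "3.2"}',  # extra field tolerated
]

# The schema is supplied at READ time, not at ingest time.
READ_SCHEMA = {"device_id": str, "temp_c": float, "ts": int}

def read_with_schema(raw_records, schema):
    """Project and type-coerce each raw record against the schema as it is read."""
    for raw in raw_records:
        rec = json.loads(raw)
        yield {field: cast(rec[field]) for field, cast in schema.items()}

rows = list(read_with_schema(RAW_STORE, READ_SCHEMA))
print(rows[0])
```

Note that the second record carries a field (`fw`) the schema never mentions: nothing breaks at ingest, and a later consumer can read the same raw store with a richer schema.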

The IIoT data stream can thus be visualized as a constantly running data pump. A Big Data pipeline takes the raw telemetry data from the gateways, decides which feeds are of interest and discards the ones not deemed significant from a business standpoint. Apache NiFi is your gateway and gatekeeper. It ingests the raw data, manages the flow of thousands of producers and consumers, and performs basic data enrichment, in-stream sentiment analysis, aggregation, splitting, schema translation, format conversion and other initial steps to prepare the data. It does all that with a user-friendly web UI and an easily extensible architecture. It will then send raw or processed data to Kafka for further processing by Apache Storm, Apache Spark or other consumers. Apache Storm is a distributed real-time computation engine that reliably processes unbounded streams of data. Storm excels at handling complex streams of data that require windowing and other complex event processing. While Storm processes stream data at scale, Apache Kafka distributes messages at scale. Kafka is a distributed pub-sub real-time messaging system that provides strong durability and fault tolerance guarantees. NiFi, Storm and Kafka naturally complement each other, and their powerful cooperation enables real-time streaming analytics for fast-moving big data. All the stream processing is handled by the NiFi-Storm-Kafka combination.
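As a rough, in-process caricature of this pipeline, the sketch below uses a `deque` to stand in for a Kafka topic and two plain functions to stand in for the NiFi (filter + enrich) and Storm (aggregate) stages. The device names, threshold and enrichment field are invented for the example:

```python
from collections import defaultdict, deque

# In-process stand-ins: the deque plays the Kafka topic,
# the two functions play the NiFi and Storm stages.
topic = deque()

def ingest(events, threshold=0.0):
    """'NiFi' stage: drop insignificant readings, enrich, publish to the topic."""
    for ev in events:
        if ev["value"] > threshold:                   # discard readings not deemed significant
            topic.append({**ev, "site": "plant-a"})   # basic enrichment

def aggregate():
    """'Storm' stage: drain the topic and compute a per-device average."""
    sums, counts = defaultdict(float), defaultdict(int)
    while topic:
        ev = topic.popleft()
        sums[ev["device"]] += ev["value"]
        counts[ev["device"]] += 1
    return {d: sums[d] / counts[d] for d in sums}

ingest([{"device": "m1", "value": 10.0},
        {"device": "m1", "value": 14.0},
        {"device": "m2", "value": -1.0}])   # the negative reading is filtered out
averages = aggregate()
print(averages)
```

In production the producer and consumer would of course be separate processes connected through a real broker; the point here is only the filter → enrich → aggregate shape of the flow.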

Apache NiFi, Storm and Kafka integrate very closely to manage streaming dataflows.


Appropriate logic is built into the higher layers to support device identification, ID lookup, secure authentication and transformation of the data. This layer will process data (cleanse, transform, apply a canonical representation) to support Business Automation (BPM), Business Intelligence (BI) and visualization for a variety of consumers. The data ingest layer will also provide notifications and alerts via Apache NiFi.

Here are some typical uses for this event processing pipeline:

a. Real-time data filtering and pattern matching

b. Enrichment based on business context

c. Real-time analytics such as KPIs, complex event processing etc

d. Predictive Analytics

e. Business workflow with decision nodes and human task nodes
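A toy example of (a) and (c) above – a simple complex-event-processing rule that flags any run of consecutive readings above a threshold. The threshold and window length are illustrative:

```python
from collections import deque

def detect_pattern(stream, threshold, run_length=3):
    """Flag every run of `run_length` consecutive readings above `threshold` (simple CEP)."""
    window = deque(maxlen=run_length)
    alerts = []
    for value in stream:
        window.append(value)
        # Fire an alert when the sliding window is full and every reading exceeds the threshold.
        if len(window) == run_length and all(v > threshold for v in window):
            alerts.append(tuple(window))
    return alerts

readings = [70, 82, 85, 88, 91, 60]
alerts = detect_pattern(readings, threshold=80)
print(alerts)
```

A real deployment would express this as a windowed bolt/operator in Storm or Spark Streaming, but the sliding-window logic is the same.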

Digital Application Tier – 

Once IIoT knowledge has become part of the Hadoop based Data Lake, all the rich analytics, machine learning and deep learning frameworks, tools and libraries now become available to Data Scientists and Analysts.   They can easily produce insights, dashboards, reports and real-time analytics with IIoT data joined with existing data in the lake including social media data, EDW data, log data.   All your data can be queried with familiar SQL through a variety of interfaces such as Apache Phoenix on HBase, Apache Hive LLAP and Apache Spark SQL.   Using your existing BI tools or the open sourced Apache Zeppelin, you can produce and share live reports.   You can run TensorFlow in containers on YARN for deep learning insights on your images, videos and text data; while running YARN clustered Spark ML pipelines fed by Kafka and NiFi to run streaming machine learning algorithms on trained models.
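To illustrate the "familiar SQL" point without a running Hive or Spark cluster, the sketch below uses Python's built-in sqlite3 as a stand-in: IIoT readings are joined with an existing lake table. Both tables and their columns are invented for the example:

```python
import sqlite3

# sqlite stands in for Hive LLAP / Spark SQL here: join IIoT readings
# against reference data already in the lake, with plain SQL.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE iiot_readings (device_id TEXT, temp_c REAL);
CREATE TABLE asset_registry (device_id TEXT, plant TEXT);
INSERT INTO iiot_readings VALUES ('d1', 71.0), ('d1', 75.0), ('d2', 40.0);
INSERT INTO asset_registry VALUES ('d1', 'plant-a'), ('d2', 'plant-b');
""")
rows = con.execute("""
    SELECT a.plant, AVG(r.temp_c) AS avg_temp
    FROM iiot_readings r JOIN asset_registry a USING (device_id)
    GROUP BY a.plant ORDER BY a.plant
""").fetchall()
print(rows)
```

The same query text would run largely unchanged against Hive or Spark SQL; only the connection layer differs.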

A range of predictive applications are suitable for this tier. The models themselves should seek to answer business questions around things like asset failure, the key performance indicators in a manufacturing process and how they’re trending, insurance policy pricing etc.

Once the device data has been ingested into a modern data lake, key functions that need to be performed include data aggregation, transformation, enriching, filtering, sorting etc.

As one can see, this can get very complex very quickly – both from a data storage and a processing standpoint. A Cloud based infrastructure, with its ability to provide highly scalable compute, network and storage resources, is a natural fit to handle bursty IIoT applications. However, IIoT applications add their own diverse requirements for computing infrastructure, namely the ability to accommodate hundreds of kinds of devices and network gateways – which means that IT must be prepared to support a large diversity of operating systems and storage types.

The tier is also responsible for the integration of the IIoT environment into the business processes of an enterprise. The IIoT solution ties into existing line-of-business applications and standard software solutions through adapters or Enterprise Application Integration (EAI) and business-to-business (B2B) gateway capabilities. End users in business-to-business or business-to-consumer scenarios will interact with the IIoT solution and the special-purpose IIoT devices through this layer. They may use the IIoT solution or line-of-business system UIs, including apps on personal mobile devices such as smartphones and tablets.

Security Implementation

The topic of Security is perhaps the most important cross cutting concern across all layers of the IIoT architecture stack. Needless to say, each of the layers must support the strongest data encryption, authentication and authorization capabilities for devices, users and partner applications. Accordingly, capabilities must be provided to ingest and store security feeds, IDS logs for advanced behavioral analytics, server logs and device telemetry. These feeds must be constantly analyzed across three domains – the Device domain, the Business domain and the IT domain. The below blogpost delves into some of these themes and is a good read to get a deeper handle on this issue from a SOC (security operations center) standpoint.

An Enterprise Wide Framework for Digital Cybersecurity..(4/4)


It is evident from the above that IIoT will create enormous opportunity for businesses globally. It will also create layers of complexity and opportunity for Enterprise IT. The creation of smart digital services on the data served up will further depend on the vertical industry. Whatever the kind of business model – tracking behavior, location-sensitive pricing, business process automation etc – the end goal of IT architecture should be to create enterprise business applications that are ultimately data native and analytics driven.


Why Data Silos Are Your Biggest Source of Technical Debt..

“Any enterprise CEO really ought to be able to ask a question that involves connecting data across the organization, be able to run a company effectively, and especially to be able to respond to unexpected events. Most organizations are missing this ability to connect all the data together.” – Tim Berners-Lee (English computer scientist, best known as the inventor of the World Wide Web)

Image Credit – Device42

We have discussed vertical industry business challenges across sectors like Banking, Insurance, Retail and Manufacturing in some level of detail over the last two years. Though enterprise business models vary depending on the industry, there is a common Digital theme raging across all industries in 2017. Every industry is witnessing an upswing in the number of younger and digitally aware customers. Estimates of this influential population are as high as 40% in areas such as Banking and Telecommunications. They represent a tremendous source of revenue but can also defect just as easily if the services offered aren’t compelling or easy to use, as the Banking industry illustrates.

These customers are Digital Natives, i.e. they are highly comfortable with technology and use services such as Google, Facebook, Uber, Netflix and Amazon almost hourly in their daily lives. As a consequence, they expect a similarly seamless & contextual experience while engaging with Banks, Telcos, Retailers and Insurance companies over (primarily) digital channels. Enterprises then have a twofold challenge – to store all this data as well as harness it for real time insights in a way that is connected with internal marketing & sales.

As many studies have shown, companies that constantly harness data about their customers and perform speedy advanced analytics outshine their competition. Does that seem a bombastic statement? Not when you consider that almost half of all online dollars spent in the United States in 2016 were spent on Amazon, and almost all digital advertising revenue growth in 2016 was accounted for by two biggies – Google and Facebook. [1]

According to The Economist, the world’s most valuable commodity is no longer Oil, but Data. The few large companies depicted in the picture are now virtual monopolies[2] (Image Credit – David Parkins)

Let us now return to the average Enterprise. The vast majority of industrial applications (numbering around 1,000+ applications at the average large enterprise, according to research firm NetSkope) generally lag the innovation cycle. This is because they’re created using archaic technology platforms by teams that conform to rigid development practices. The Fab Four (Facebook, Amazon, Google, Netflix) and others have shown that Enterprise Architecture is a business differentiator, but the Fortune 500 have not gotten that message as yet. Hence they largely predicate their software development on vendor provided technology instead of open approaches. This anti-pattern is further exacerbated by legacy organizational structures, which ultimately leads to these applications holding a very parochial view of customer data. These applications can typically be classified into one of the buckets – ERP, Billing Systems, Payment Processors, Core Banking Systems, Service Management Systems, General Ledger, Accounting Systems, CRM, Corporate Email, Salesforce, Customer On-boarding etc.

These enterprise applications are then typically managed by disparate IT groups scattered across the globe. They often serve different stakeholders who seem to have broad overlapping interests but have conflicting organizational priorities for various reasons. These applications then produce and consume data in silos – localized by geography, department, line of business or channel.

Organizational barriers only serve to impede data sharing for various reasons – ranging from competitive dynamics around who owns the customer relationship and regulatory concerns to internal politics. You get the idea – it is all a giant mishmash.

Before we get any further, we need to define that dreaded word – Silo.

What Is a Silo?

A mind-set present in some companies when certain departments or sectors do not wish to share information with others in the same company. This type of mentality will reduce the efficiency of the overall operation, reduce morale, and may contribute to the demise of a productive company culture. (Source – Business Dictionary [2])

Data is the Core Asset in Every Industry Vertical but most of it is siloed in Departments, Lines of Business across Geographies..

Let us be clear, most Industries do not suffer from a shortage of data assets. Consider a few of the major industry verticals and a smattering of the kinds of data that players in these areas commonly possess – 

Data In Banking– 

  • Customer Account data e.g. Names, Demographics, Linked Accounts etc
  • Core Banking Data going back decades
  • Transaction Data which captures the low level details of every transaction (e.g debit, credit, transfer, credit card usage etc)
  • Wire & Payment Data
  • Trade & Position Data
  • General Ledger Data e.g AP (accounts payable), AR (accounts receivable), cash management & purchasing information etc.
  • Data from other systems supporting banking reporting functions.


Data In Healthcare – 
  • Structured Clinical data e.g. Patient ADT information
  • Free hand notes
  • Patient Insurance information
  • Device Telemetry 
  • Medication data
  • Patient Trial Data
  • Medical Images – e.g. CAT Scans, MRIs, CT images etc


Data In Manufacturing – 
  • Supply chain data
  • Demand data
  • Pricing data
  • Operational data from the shop floor 
  • Sensor & telemetry data 
  • Sales campaign data

The typical flow of data in an enterprise follows a familiar path –

  1. Data is captured in large quantities as a result of business operations (customer orders, e commerce transactions, supply chain activities, Partner integration, Clinical notes et al). These feeds are captured using a combination of techniques – mostly ESB (Enterprise Service Bus) and Message Brokers.
  2. The raw data streams then flow into respective application-owned silos where, over time, a great amount of data movement (via copying, replication and transformation operations – the dreaded ETL) occurs using proprietary vendor developed systems. Vendors in this space have not only developed shrink wrapped products that make them tens of billions of dollars annually but have also imposed massive human capital requirements on enterprises to program & maintain these data flows.
  3. Once all of the relevant data has been normalized, transformed and processed, it is copied over into business reporting systems where it is used to perform a range of functions – typically reporting for use cases such as Customer Analytics, Risk Reporting, Business Reporting, Operational improvements etc.
  4. Rinse and repeat..
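Steps 1 through 3 above can be caricatured in a few lines of Python – capture, silo-local normalization, and a copy into a reporting aggregate. The record layout and region codes are invented for the example:

```python
# Step 1: operational data is captured as-is, warts and all.
captured = [
    {"order_id": 1, "amount": "100.50", "region": "emea"},
    {"order_id": 2, "amount": "75.00", "region": "EMEA"},
]

def normalize(record):
    """Step 2: silo-local ETL – coerce types and standardize codes."""
    return {"order_id": record["order_id"],
            "amount": float(record["amount"]),
            "region": record["region"].upper()}

# Step 3: 'copy' the cleaned rows into the reporting system, then aggregate.
reporting_table = [normalize(r) for r in captured]
revenue_by_region = {}
for row in reporting_table:
    revenue_by_region[row["region"]] = revenue_by_region.get(row["region"], 0.0) + row["amount"]
print(revenue_by_region)
```

Even this toy version shows why the cycle is slow: every downstream report depends on the copy-and-normalize pass having run, which in a real enterprise happens on a batch schedule.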

Due to this old-school methodology of working with customer and operational data, most organizations have no real time data processing capabilities in place & they thus live in a largely reactive world. What that means is that their view of a given customer’s world is typically a week to 10 days old.

Another factor to consider – the data sources described above are what can be called structured or traditional data. However, organizations are now on-boarding large volumes of unstructured data, as captured in the below blogpost. Oftentimes, it is easier for Business Analysts, Data Scientists and Data Architects to get access to external data than to internal data.

Getting access to internal data typically means jumping through multiple hoops – which department is paying for the feeds, the format of the feeds, regulatory issues, cyber security policy approvals, SOX/PCI compliance et al. The list is long and impedes the ability of the business to get things done quickly.

Infographic: The Seven Types of Non Traditional Data that can drive Business Insights

Data and Technical Debt… 

Since Ward Cunningham coined the term ‘Technical Debt‘, it has typically been used in an IT-DevOps-Containers-Data Center context. However, technology areas like DevOps, PaaS, Cloud Computing with IaaS, Application Middleware, Data centers etc in and of themselves add no direct economic value to customers unless they are able to intelligently process Data. Data is the most important technology asset compared to other IT infrastructure considerations. You do not have to take my word for that: The Economist just published an article discussing the fact that the likes of Google, Facebook, Amazon et al are now virtual data monopolies and that global corporations are far behind in the competitive race to own Data [1].

Thus, it is ironic that while the majority of traditional Fortune 500 companies are still stuck in silos, Silicon Valley companies are not just fast becoming the biggest owners of global data but are also monetizing it on the way to record profits. Alphabet (Google’s corporate parent), Amazon, Apple, Facebook and Microsoft are the five most valuable listed firms in the world. Case in point – their profits were around $25bn in the first quarter of 2017, and together they make up more than half the value of the NASDAQ composite index. [1]

The Five Business Challenges that Data Fragmentation causes (or) Death by Silo … 

How intelligently a company harnesses its data assets determines its overall competitive position. This truth is being evidenced in sectors like Banking and Retail, as we have seen in previous posts.

What is interesting is that in some countries concerned about the pace of technological innovation, national regulatory authorities are creating legislation to force slow-moving incumbent corporations to unlock their data assets. For example, in the European Union, as a result of regulatory mandates – PSD2 & the Open Bank Standard – a range of agile players across the value chain (e.g. FinTechs) will soon be able to obtain seamless access to a variety of retail bank customer data using standard & secure APIs.

Once obtained, the data can be reimagined by these companies in manifold ways to offer new products & services that the banks themselves cannot. A simple use case: they can provide personal financial management platforms (PFMs) that help consumers make better personal financial decisions, at the expense of the Banks owning the data. Surely, FinTechs have generally been able to make more productive use of client data than have banks. They do this by providing clients with intuitive access to cross asset data, tailoring algorithms based on behavioral characteristics and providing clients with a more engaging and unified experience.

Why can the slow-moving established Banks not do this? They suffer from a lack of data agility due to the silos that have been built up over years of operations and acquisitions. None of these challenges afflict the FinTechs, which can build off a greenfield technology environment.

To recap, let us consider the five ways in which Data Fragmentation hurts enterprises – 

#1 Data Silos Cause Missed Top line Sales Growth  –

Data produced by disparate applications and stored in scattered silos makes it challenging to enable a Single View of a customer across channels, products and lines of business. This makes everything across the customer lifecycle a pain – from smooth on-boarding to customer service to marketing analytics. It impedes the ability to segment customers intelligently and to perform cross-sell & up-sell. This sheer inability to understand customer journeys (across different target personas) also leads to customer retention issues. When underlying data sources are fragmented, communication between business teams moves over to other internal mechanisms such as email, chat and phone calls. This is a recipe for delayed business decisions, which are ultimately ineffective as they depend more on intuition than on data.

#2 Data Silos are the Root Cause of Poor Customer Service  –

Across industries like Banking, Insurance, Telecom & Manufacturing, the ability to get a unified view of the customer & their journey is at the heart of the enterprise’s ability to understand customer preferences & needs. This is also crucial in promoting relevant offerings and in detecting customer dissatisfaction. Currently most enterprises are woefully inadequate at putting together this comprehensive Single View of their Customers (SVC). Due to operational silos, each department possesses a siloed & limited view of the customer relative to other departments (or channels). These views are typically inconsistent in and of themselves as they lack synchronization with other departments. The net result is that companies typically miss a high number of potential cross-sell and up-sell opportunities.

#3 – Data Silos produce Inaccurate Analytics 

First off, most Analysts need to wait a long time to acquire the relevant data they need to test their hypotheses. And since the data they work with is of poor quality as a result of fragmentation, so are the analytics that operate on it.

Let us take an example from Banking. Mortgage Lending, an already complex business process, has been made even more so by the data silos built around Core Banking, Loan Portfolio and Consumer Lending applications. Qualifying borrowers for Mortgages needs to be based not just on the historical data used as part of the origination & underwriting process (credit reports, employment & income history etc) but also on data that was not mined hitherto (social media data, financial purchasing patterns etc). It is a well known fact that there are huge segments of the population (especially the millennials) who are broadly eligible but under-banked, as they do not satisfy some of the classical business rules needed to obtain approvals on mortgages. Each of the silos stores partial customer data. Thus, Banks do not possess an accurate and holistic picture of a customer’s financial status and are unable to quickly qualify the customer for a mortgage at the best available custom rate.

#4 – Data Silos hinder the creation of new Business Models  

The abundance of data created over the last decade is changing the nature of business. If it follows that enterprise businesses are being increasingly built around data assets, then it must naturally follow that data as a commodity can be traded or re-imagined to create new revenue streams. As an example, pioneering payment providers now offer retailers analytical services to help them understand which products perform best and how to improve the micro-targeting of customers. Thus, data is the critical prong of any digital initiative. This has led to efforts to monetize data by creating platforms that support ecosystems of capabilities. To vastly oversimplify, the ability to monetize data has two prongs – centralizing it in the first place, and then performing strong predictive modeling at large scale, where systems constantly learn and optimize their interactions, responsiveness & services based on client needs & preferences. Data Silos hurt this overall effort more than the typical enterprise can imagine.

#5 – Data Silos vastly magnify Cyber, Risk and Compliance challenges – 

Enterprises have to perform a range of back-office functions such as Risk Data Aggregation & Reporting, Anti Money Laundering Compliance and Cyber Security Monitoring.

Cybersecurity – The biggest threat to the Digital Economy..(1/4)

It must naturally follow that as more and more information assets are stored across the organization, it becomes a manifold headache to secure each and every silo against a range of bad actors – extremely well funded and sophisticated adversaries ranging from criminals to cyber thieves to hacktivists. On the business compliance front, sectors like Banking & Insurance need to maintain large AML and Risk Data Aggregation programs – silos are the bane of both. Every industry needs fraud detection capabilities as well, which need access to unified data.


My intention for this post is clearly to raise more questions than provide answers. There is no question that Digital Platforms are a massive business differentiator, but they need access to an underlying store of high quality, curated and unified data to perform their magic. Industry leaders need to begin treating high quality Data as the most important business asset they have & work across the organization to rid it of Silos.


[1]  The Economist – “The world’s most valuable resource is no longer oil, but data” – http://www.economist.com/news/leaders/21721656-data-economy-demands-new-approach-antitrust-rules-worlds-most-valuable-resource

[2] Definition of Silo Mentality – http://www.businessdictionary.com/definition/silo-mentality.html

Here Is What Is Causing The Great Brick-And-Mortar Retail Meltdown of 2017..(1/2)

Amazon and other pure plays are driving toward getting both predictive and prescriptive analytics. They’re analyzing and understanding information at an alarming rate. Brands have pulled products off of Amazon because they’re learning more about them than the brands themselves.” — Todd Michaud, Founder and CEO of Power Thinking Media

By April 2017, 17 major retailers had announced plans to close stores (Image Credit: Clark Howard)

We are witnessing a meltdown in Storefront Retail..

We are barely halfway through 2017, and the US business media is rife with stories of major retailers closing storefronts. The truth is inescapable: the Retail industry is in the midst of structural change. According to a research report from Credit Suisse, around 8,600 brick-and-mortar stores will shutter their doors in 2017. The number was 2,056 stores in 2016 and 5,077 in 2015, which points to industry malaise. [1]

The Retailer’s bigger cousin – the neighborhood Mall – is not doing any better. There are around 1,200 malls in the US today and that number is forecast to decline to just about 900 in a decade.[3]

It is clear that in the coming years, Retailers (and malls) across the board will remain under pressure due to a variety of changes – technological, business model and demographic.

So what can legacy Retailers do to compete with and disarm the online upstart?

Six takeaways for Retail Industry watchers..

Six takeaways from the recent headlines that should make industry watchers take notice –

  1. The brick and mortar retail store pullback has accelerated in 2017 – a year of otherwise strong economic expansion. Typical indicators that influence consumer spending on retail are generally pointing upwards. Just sample the financial data – the US has seen increasing GDP for eight straight years, the last 18 months have seen wage growth for middle & lower income Americans, and gas prices are at all-time lows.[3] These relatively strong consumer data trends cannot explain a slowdown in physical storefronts. Consumer spending is not shrinking due to declining affordability/spending power.
  2. Retailers that have either declared bankruptcy or announced large scale store closings include marquee names across the different categories of retail, ranging from Apparel to Home Appliances to Electronics to Sporting Goods. Just sample some of the names – Sports Authority, RadioShack, HHGregg, American Apparel, Bebe Stores, Aeropostale, Sears, Kmart, Macy’s, Payless Shoes, JC Penney etc. This is clearly a trend across various sectors in retail and not confined to a given area, for instance, women’s apparel.
  3. Some of this “Storefront Retail bubble burst” can definitely be attributed to hitherto indiscriminate physical retail expansion. The first indicator here is the glut of residual excess retail space.  The WSJ points out that the retail expansion dates back almost 30 years, when retailers began a “land grab” to open more stores – not unlike the housing boom a decade or so ago. [1] North America now has a glut of both retail stores and shopping malls while per capita sales have begun declining. The US especially has almost five times the retail space per capita of the UK. American consumers are also swapping materialism for more experiences.[3]  Thus, an over-buildout of retail space is one of the causes of the ongoing crash.

    The US has way more shopping space compared to the rest of the world. (Credit – Cowan and Company)
  4. The dominant retail trend in the world is online ‘single click’ shopping. This is evidenced by declining in-store Black Friday sales in 2016 compared with record Cyber Monday (online) sales. As online e-commerce volume increases year on year, online retailers led by Amazon are surely taking market share away from struggling brick-and-mortar Retailers who have not kept up with the pace of innovation. The uptick in online retail is unmistakable, as evidenced by the below graph (src – ZeroHedge) depicting the latest retail figures. Department-store sales rose 0.2% on the month, but were down 4.5% from a year earlier. Online retailers such as Amazon posted a 0.6% gain from the prior month and an 11.9% increase from a year earlier.[3]

    Retail Sales – Online vs In Store Shopping (credit: ZeroHedge)
  5. Legacy retailers are trying to play catch-up with the upstarts who excel at technology. This has sometimes translated into acquisitions of online retailers (e.g. Walmart’s purchase of Jet.com). However, the Global top 10 Retailers are dominated by the likes of Walmart, Costco, Kroger, Walgreens etc. Amazon comes in only at #10, which implies that this battle is only in its early days. However, legacy retailers are saddled with huge fixed costs & their investors prefer dividend payouts to investments in innovation. Thus their CEOs are incentivized to focus on the next quarter, not the next decade like Amazon’s Jeff Bezos, who is famously known not to evidence any signs of increasing Amazon’s profitability. Though traditional retailers have begun accelerating investments (both organic and via acquisition) in the critical areas of Cloud Computing, Big Data, Mobility and Predictive Analytics – the web scale majors such as Amazon are far ahead of the typical Retail IT shop.

  6. The fastest growing Retail industry brands are companies that use Data as a core business capability to impact the customer experience, versus as just another component of an overall IT system. Retail is a game of micro customer interactions that drive sales and margin. This implies a Retailer’s ability to work with real-time customer data – whether sentiment data, clickstream data or historical purchase data – to drive marketing promotions, personally relevant services, order fulfillment, show-rooming, loyalty programs etc. On the back end, the ability to streamline operations by pulling together data from operations and supply chains is helping retailers fine-tune & automate operations, especially from a delivery standpoint.

    In Retail, Technology Is King..

    So, what makes Retail somewhat of a unique industry in terms of its data needs? I posit that there are four important characteristics –

    • First and foremost, Retail customers, especially the millennials, are very open about sharing their brand preferences and experiences on social media. There is a treasure trove of untapped data out there which needs to be collected and monetized. We will explore this in more detail in the next post.
    • Secondly, leaders such as Amazon use customer data, product data and a range of other technology capabilities to shape the customer experience, versus the other way around for traditional retailers. They do this based on predictive analytic approaches such as machine learning and deep learning. Case in point is Amazon, which has now morphed from an online retailer to a Cloud Computing behemoth with its market leading AWS (Amazon Web Services). In fact, its best in class IT enabled it to experiment with retail business models – e.g. the $99-a-year Amazon Prime subscription, which includes free two-day delivery and a music and video streaming service that competes with Netflix. As of March 31, 2017 Amazon had 80 million Prime subscribers in the U.S., an increase of 36 percent from a year earlier, according to Consumer Intelligence Research Partners.[3]
    • Thirdly, Retail organizations need to become Data driven businesses. What does that mean or imply? They need to rely on data to drive every core business process – e.g. realtime insights about customers, supply chains, order fulfillment and inventory. This data spans everything from traditional structured data (sales data, store level transactions, customer purchase histories, supply chain data, advertising data etc) to non traditional data (social media feeds – as there is a strong correlation between the products people rave about and what they ultimately purchase – location data, economic performance data etc). This Data variety represents a huge challenge to Retailers in terms of managing, curating and analyzing these feeds.
    • Fourth, Retailers need to begin aggressively marrying the IoT capabilities they already have in place with Predictive Analytics. This implies tapping and analyzing data from in store beacons, sensors and actuators across a range of use cases, from location based offers to restocking shelves.

      ..because it enables new business models..

      None of the above analysis claims that physical stores are going away. They serve a very important function, allowing consumers to try on products and preserving the human experience. However, online is clearly where the growth will primarily be.

      The Next and Final Post in this series..

      It is very clear from the above that it now makes more sense to talk about a Retail Ecosystem which is composed of store, online, mobile and partner storefronts.

      In that vein, the next post in this two part series will describe the below four progressive strategies that traditional Retailers can adopt to survive and favorably compete in today’s competitive (and increasingly online) marketplace.

      These are –

    • Reinventing Legacy IT Approaches – Adopting Cloud Computing, Big Data and Intelligent Middleware to re-engineer Retail IT

    • Changing Business Models by accelerating the adoption of Automation and Predictive Analytics – Increasing Automation rates of core business processes and infusing them with Predictive intelligence thus improving customer and business responsiveness

    • Experimenting with Deep Learning Capabilities – the use of Advanced AI such as Deep Neural Nets to impact the entire lifecycle of Retail

    • Adopting a Digital or a ‘Mode 2’ Mindset across the organization – No technology can transcend a large ‘Digital Gap’ without the right organizational culture

      Needless to say, the theme across all of these strategies is to leverage Digital technologies to create immersive cross channel customer experiences.


[1] WSJ – “Three hard lessons the internet is teaching traditional stores” – https://www.wsj.com/articles/three-hard-lessons-the-internet-is-teaching-traditional-stores-1492945203

[2] The Atlantic – “The Retail Meltdown” – https://www.theatlantic.com/business/archive/2017/04/retail-meltdown-of-2017/522384/

[3] WSJ – “Retail Sales fall for the second straight month” – https://www.wsj.com/articles/u-s-retail-sales-fall-for-second-straight-month-1492173415

What Banks, Retailers & Payment Providers Should Do About Exploding Online Fraud in 2017..

Despite the introduction of new security measures such as EMV chip technology, 2016 saw the highest number of victims of identity fraud in more than a decade, according to a new report from Javelin Strategy & Research and identity-theft-protection firm LifeLock Inc.[1]

Image Credit: Wall Street Journal


Players across the global credit card industry are facing new business pressures in strategic areas. Chief among these shifts are burgeoning online transaction volumes, increased regulatory pressures (e.g. PSD2 in the European Union) and disruptive competition from FinTechs.

As discussed in various posts in this blog in 2016 – Consumers, Banks, Law Enforcement, Payment Processors, Merchants and Private Label Card Issuers are faced with yet another critical & mounting business challenge – payment card fraud. Payment card fraud continued to expand at a massive clip in 2016, despite the introduction of security measures such as EMV chip cards, multi-factor authentication, secure point of sale terminals etc. As the accessibility and modes of usage of credit, debit and other payment cards burgeon and transaction volumes increase across the globe, Banks are losing tens of millions of dollars annually to fraudsters.

Regular readers of this blog will recollect that we spent a lot of time last year discussing Credit Card and Fraud in some depth. I have reproduced some of these posts below for background reading.

Big Data Counters Payment Card Fraud (1/3)…

Hadoop counters Credit Card Fraud..(2/3)

It’s time for a 2017 update on this issue.

Increasing Online Payments means rising Fraud

The growing popularity of alternative payment modes like Mobile Wallets (e.g. Apple Pay, Chase Pay and Android Pay) is driving increased payment volumes across both open loop and closed loop payments. Couple this with in-app payments (e.g. Uber) and banking providers’ digital wallets, and mobile payment volumes are only set to increase. Retailers like Walmart, Nordstrom and Tesco have also been offering more convenient in-store payments.

This relentless & secular trend towards online payments is clearly visible in all forms of consumer and merchant payments across the globe. It will only accelerate in 2017 as smartphone manufacturers continue to produce devices with more onscreen real estate, driving more mobile commerce. With IoT technology taking center stage, the day is not far off when connected devices (e.g. wearables) make their own payments.

However, the convenience of online payments also confers anonymity, which increases the risk of fraud. Most existing fraud platforms were designed for a previous era of point of sale payments, with their focus on magnetic stripes, chips and EMV technology. Online payments thus present challenges that Banks and Merchants have never had to deal with on such a large scale.

According to the WSJ [1], more consumers (15.4 million in the US) became victims of identity fraud in 2016 than at any point in more than a decade. Despite new security protections implemented by the industry in the form of EMV, about $16 billion was lost to fraudulent purchases, with online fraud cases rising 15%.

Fraud is a pernicious problem which in a lot of cases leads to a much worse crime- identity theft. The U.S. Department of Justice (DOJ) terms Identity theft as “one of the most insidious forms of white collar crime”. Identity theft typically results in multiple instances of fraud, which exact a heavy toll on consumers, merchants, banks and the overall economy. Let us look at some specific recommendations for Payment providers to consider.

Sadly, the much hyped chips on your cards are useless in countering online fraud..

Javelin Research noted in their study that the vast majority of identity theft fraud was linked to credit cards.[2]

Most credit card holders in the USA will remember 2016 as the year when electronic chip technology became ubiquitous and required at the majority of retail establishments. The media buzz around chips was that they would curtail fraudster activity. However, their rollout has been accompanied by a large rise in online theft. Card-not-present (CNP) fraud – when a thief buys something online or by phone – rose 40%.[2]

So did account takeover fraud, where thieves access existing customer accounts and change the contact details/security information. These incidents increased 61% compared to 2015, and totaled around 1.4 million.[2]

It is very clear that the bulk of fraud now happens in online transactions. It is here that Banks must focus. And online is a technology game.

How should Banks, Retailers & Payment Providers Respond..

Online card fraud revolves around the unauthorized stealing of an individual’s financial data. Fraudsters are engaging in a range of complex behaviors such as counterfeiting cards, committing mail fraud to open unauthorized accounts, online Card Not Present (CNP) transactions etc. Fraud patterns are quickly copied and reproduced across diverse geographies.

Let us consider five key areas where industry players need to make investments.

#1 Augment traditional Fraud Detection Systems & Architectures with Big Data capabilities

Traditional fraud detection systems have been built leveraging expert systems or rules engines. These expert systems are highly mature, as they encode the domain experience and intuition of fraud analysts. Fraud patterns are captured as business rules in an IF..THEN.. format and made available in these systems. These rules describe a range of well understood patterns, as shown below.

IF Consumer Credit = yes AND Transaction Amount ≤ 1000 AND Card Present = yes THEN Fraud = no

Typically hundreds of such rules are applied in realtime to incoming transactions.
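Such a rules engine can be sketched in a few lines of Python. This is a hypothetical illustration of the IF..THEN pattern above, not any specific vendor's engine; all rule values and field names are made up.

```python
# A minimal expert-system sketch: each rule is a predicate over a
# transaction dict plus the verdict it fires. All values are illustrative.
RULES = [
    # Mirrors the sample rule above: low-value, card-present consumer
    # credit transactions are treated as non-fraudulent.
    (lambda t: t["consumer_credit"] and t["amount"] <= 1000 and t["card_present"],
     "no_fraud"),
    # A known-bad IP address immediately flags the transaction.
    (lambda t: t["ip"] in {"203.0.113.7", "198.51.100.99"}, "fraud"),
]

def evaluate(txn):
    """Apply rules in order; first match wins, else escalate for review."""
    for predicate, verdict in RULES:
        if predicate(txn):
            return verdict
    return "review"

txn = {"consumer_credit": True, "amount": 250, "card_present": True,
       "ip": "192.0.2.1"}
print(evaluate(txn))  # → no_fraud
```

In a production engine the rule set would number in the hundreds and be externalized so fraud analysts can change rules without code deployments.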

Expert systems were built for the era of physical card usage and can thus only reason over a limited number of data attributes. In the online world they focus on factors such as known bad IP addresses or unusual login times, based on business rules and events. However, scammers have learnt to stay ahead of the scammed and are leveraging computing advances to come up with ever new ways of cheating the banks. Big Data can transform the detection process by enriching the data available to the fraud process, including traditional customer data, transaction data, third party fraud data, social data and location based data.

Big Data also provides the capability to tackle the most complex types of fraud and to learn from fraud data & patterns so as to stay ahead of criminal networks. It is recommended that fraud systems be built using a layering paradigm, i.e. providing multiple levels of detection capability, starting with a) configurable business rules (that describe a fraud pattern) as well as b) dynamic capabilities based on machine learning models (typically thought of as being more predictive). Fraud systems also need to adopt Big Data frameworks like Spark, Storm etc to move to a real time mode. Frameworks like Spark make it intuitive to implement advanced risk scoring based on user account behavior, suspicious behavior etc.
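The kind of real time velocity scoring such streaming frameworks enable can be sketched in plain Python. This is a simplified stand-in for an actual Spark or Storm job; the window length and threshold are illustrative, not tuned values.

```python
from collections import defaultdict, deque

# Sliding-window velocity check of the kind a streaming job would run
# per card: too many transactions within a short window raises risk.
WINDOW_SECONDS = 60
MAX_TXNS_PER_WINDOW = 5

class VelocityScorer:
    def __init__(self):
        self.history = defaultdict(deque)  # card_id -> recent timestamps

    def score(self, card_id, ts):
        window = self.history[card_id]
        window.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while window and ts - window[0] > WINDOW_SECONDS:
            window.popleft()
        return "high_risk" if len(window) > MAX_TXNS_PER_WINDOW else "normal"

scorer = VelocityScorer()
results = [scorer.score("card-42", t) for t in [0, 5, 10, 15, 20, 25, 30]]
print(results[-1])  # → high_risk (7 transactions within 30 seconds)
```

A real deployment would keep this state in a distributed store and combine velocity with many other signals, but the windowing logic is the same.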

Advanced fraud detection systems augment the Big Data approach by building models of customer behavior at the macro level, then using those models to detect anomalous transactions and flag them as potentially fraudulent.
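As a minimal sketch of such behavioral anomaly detection, a per-customer z-score can flag transactions far outside a customer's historical spend. Production models are far richer; the threshold and data here are purely illustrative.

```python
import statistics

# Per-customer behavioral baseline: flag transactions that deviate
# sharply from this customer's historical spend.
def is_anomalous(history, amount, z_threshold=3.0):
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        return amount != mean
    return abs(amount - mean) / stdev > z_threshold

history = [40, 55, 35, 60, 45, 50]  # typical card spend for this customer
print(is_anomalous(history, 48))    # → False
print(is_anomalous(history, 900))   # → True
```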

#2 Create Dynamic Single View of Cardholders

The Single View provides comprehensive business advantages, as captured here – http://www.vamsitalkstech.com/?p=2517. The SVC provides the ability to view a customer as a single entity (a Customer 360) across all channels, to profile them, and to segment customers into populations based on their behavior patterns. This will vastly improve anomaly detection capabilities while also helping reduce the false positive problem.
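At its simplest, assembling such a single view means merging channel records on a shared customer id, as in this sketch. The field names and conflict policy (first channel wins) are hypothetical.

```python
# Sketch of a Single View of Customer: merge per-channel records keyed
# on a shared customer id into one profile. Field names are illustrative.
def single_view(customer_id, *channel_records):
    profile = {"customer_id": customer_id, "channels": []}
    for record in channel_records:
        profile["channels"].append(record["channel"])
        for key, value in record.items():
            if key != "channel":
                profile.setdefault(key, value)  # first channel wins on conflicts
    return profile

branch = {"channel": "branch", "name": "A. Smith", "segment": "retail"}
mobile = {"channel": "mobile", "device_id": "ios-1234", "name": "A. Smith"}
view = single_view("cust-001", branch, mobile)
print(sorted(view["channels"]))  # → ['branch', 'mobile']
```

Real SVC implementations add entity resolution (fuzzy matching of names, addresses, devices) rather than assuming a clean shared key.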

#3 Adopt Graph Data processing capabilities

Fraudsters engage in a range of complex behaviors – counterfeiting cards, committing mail fraud to open unauthorized accounts, online Card Not Present (CNP) transactions etc – and, since fraudsters operate in concert, fraud patterns are quickly copied and reproduced across diverse geographies. Fraud thus displays a strong social element, which leads to a higher risk of repetitive fraud across geographies.

The ability to link social network identities with customer profiles – to expose synthetic (or fraudulent) customer profiles and reduce false identities – is a key capability to possess. As fraud detection algorithms constantly analyze thousands of data points, it is important to perform network based analysis to understand whether an account, IP address or fraud pattern is recurring across different and seemingly unrelated actors. Searching for shared telephone numbers, email accounts, social network profiles etc – in addition to machine data such as common IP addresses, device signatures and addresses – can be used to establish these connections. Graph and network analysis thus lends a different dimension to detection.
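The linkage idea can be sketched as a connected-components pass over accounts that share attributes. Pure-Python BFS stands in for a real graph engine here, and the account data is entirely illustrative.

```python
from collections import defaultdict

# Accounts sharing an email, phone, device or IP are joined into one
# component, exposing a potential fraud ring.
def fraud_rings(accounts):
    by_attribute = defaultdict(list)           # attribute value -> accounts
    for acct, attrs in accounts.items():
        for value in attrs:
            by_attribute[value].append(acct)

    adjacency = defaultdict(set)               # accounts sharing any attribute
    for members in by_attribute.values():
        for a in members:
            adjacency[a].update(m for m in members if m != a)

    seen, rings = set(), []
    for acct in accounts:                      # BFS over the linkage graph
        if acct in seen:
            continue
        component, frontier = set(), [acct]
        while frontier:
            node = frontier.pop()
            if node in component:
                continue
            component.add(node)
            frontier.extend(adjacency[node])
        seen |= component
        rings.append(sorted(component))
    return rings

accounts = {
    "acct-1": {"joe@example.com", "198.51.100.7"},
    "acct-2": {"198.51.100.7", "+1-555-0100"},   # shares an IP with acct-1
    "acct-3": {"ann@example.com"},               # unconnected
}
print(fraud_rings(accounts))  # → [['acct-1', 'acct-2'], ['acct-3']]
```

At production scale this is the job of a dedicated graph database or engine, but the underlying question – which seemingly unrelated accounts are connected – is the same.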

#4 Personalize Fraud Detection by Adopting Machine Learning

Incorporating as many sources of data (both deep and wide) into the decisioning process significantly improves fraud analysis. This data includes not just the existing customer databases and data on historical spending patterns, but also credit reports, social media data and other datasets (e.g. government watch-lists of criminal activity).

Some of these non-traditional sources are depicted below –

  • Geolocation Data
  • Purchase Channel Data
  • Website clickstream data
  • POS Sensor, Camera, ATM data
  • Social Media Data
  • Customer Complaint Data

Payment Providers assess the risk score of transactions in realtime based on hundreds of such attributes. Big Data enables reasoning on more detailed and granular attributes. Advanced statistical techniques are used to incorporate behavioral (e.g. a transaction is outside the norm for a consumer’s buying patterns), temporal and spatial signals. The models often weigh attributes differently from one another, thus separating the vast majority of good transactions from the small percentage of fraudulent ones.
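A logistic-style scorer that weighs attributes differently can be sketched as follows. The feature names and weights are illustrative placeholders, not a trained model; in practice the weights come from fitting on labeled fraud data.

```python
import math

# Logistic-style risk scoring that weighs attributes differently.
WEIGHTS = {
    "amount_vs_typical": 2.1,   # deviation of spend from the customer's norm
    "new_geolocation":   1.4,
    "card_not_present":  0.9,
    "odd_hour":          0.5,
}
BIAS = -4.0

def risk_score(features):
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))   # probability-like score in (0, 1)

normal = {"amount_vs_typical": 0.1, "new_geolocation": 0,
          "card_not_present": 1, "odd_hour": 0}
risky  = {"amount_vs_typical": 2.5, "new_geolocation": 1,
          "card_not_present": 1, "odd_hour": 1}
print(round(risk_score(normal), 3), round(risk_score(risky), 3))
```

The output score would typically be thresholded into tiers (approve, challenge, decline) rather than used as a hard boolean.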

We discussed the fact that fraud happens at every stage of the process – account opening, customer on-boarding, account validation & cross verification, card usage & chargebacks etc. It is imperative that fraud models be created and leveraged across the entire business workflow.

#5 Automate the Fraud Monitoring, Detection Lifecycle

Business Process Management (BPM) is a more prosaic and mature field compared to Big Data and Predictive Analytics. Pockets of BPM implementations exist at every large Bank in customer facing areas such as issuance, on-boarding, reporting and compliance. However, the ability to design and deploy automated processes is critical across the cards fraud lifecycle. Areas like dispute management and false positive case resolution depend upon a robust case management capability – which a good BPM platform or tool can provide.

Improvements can be seen in agent productivity, the number of cases handled per agent and customer satisfaction, while errors and lags due to manual processes come down. On the front end, providing customers with handy mobile apps to instantaneously report suspicious transactions, and tying those reports to automated handling, can drastically improve fraud detection – saving tens of millions of dollars. Major improvements can also be seen in compliance, dispute resolution and cross border customer service.


Online fraud keeps rising year after year, so enterprises – especially banks and retailers – must remain vigilant. Retail sales are expected to total nearly $28 trillion in 2020 [2] and it is a given that fraudsters will invent new techniques to steal customer data. Effective fraud prevention has become an essential part of the customer experience.


[1] WSJ – “Credit Card Fraud Keeps Rising Despite New Security Chips” – https://www.wsj.com/articles/credit-card-fraud-keeps-rising-despite-new-security-chipsstudy-1485954000

[2] Fortune – “That Chip on Your Credit Card Isn’t Stopping Fraud After All” – http://fortune.com/2017/02/01/credit-card-chips-fraud/

Hadoop is Not Failing, it is the Future of Data..

“Madam..What use is a new-born baby?” – Michael Faraday (apocryphal), when asked about the utility of electricity, then a new invention, in the 1800s…

Source – DataEconomy

Why Hadoop Is Thriving and Will Continue to do so…

As my readers are aware I have been heavily involved in the Big Data space for the last two years. This time has been an amazing and transformative personal experience as I have been relentlessly traveling the globe advising global banking leaders across all continents.

Thus, it should come as no surprise that the recent KDnuggets article somewhat provocatively titled “Hadoop is Failing – Why” managed to get me disagreeing right from the get go.

The author, though well meaning from what I can tell, bases the article on several unfounded assumptions. Before we delve into those, let us consider the following background.

The onset of Digital Architectures in enterprise businesses implies the ability to drive continuous online interactions with global consumers/customers/clients or patients. The goal is not just to provide engaging visualization but also to personalize services customers care about – while working across multiple channels/modes of interaction. Mobile applications first began forcing the need for enterprise applications to support multiple channels of interaction with their consumers. For example, Banking now requires an ability to engage consumers in a seamless experience across an average of four to five channels – Mobile, eBanking, Call Center, Kiosk etc. Healthcare is a close second, where caregivers expect patient, medication & disease data at their fingertips with a few finger swipes on an iPad app. Big Data technology evolved to overcome the limitations of existing data approaches (RDBMS & EDW) and to keep up with the data architecture & analysis challenges inherent in the Digital application stack.

These challenges include –

  1. The challenge of the data volume explosion – please read the blog below for a detailed discussion.


  2. The amazing data variety enterprises are now forced to deal with, traveling at high velocity


  3. Hadoop surely has its own technical constraints – the ability to support low latency BI (Business Intelligence) queries, for one. However, the sheer inability of pre-Hadoop approaches to scale with exploding data ingest and to manage massive data volumes caused two business challenges for Digital Architectures. The first is the inability to glean real time insights from vast streams of (structured & unstructured) data flowing into enterprise architectures. The second is the inability to apply advanced analytics – Predictive Analytics and Deep Learning – at high speeds (quite often tens of thousands to tens of millions of messages per second) to solve complex problems across domains. Hadoop turns these challenges into business opportunities for efficient adopters.

Why the Darwinian Open Source Ecosystem ensures Hadoop is a robust and mature technology platform 

Big Data is backed by the open source community, with most Hadoop ecosystem technology (25+ projects) incubated, developed and maintained in the Apache ecosystem. The open source community is inherently Darwinian in nature. If a project lacks code quality, industry adoption, a concrete roadmap and active committers, then it is surely headed for the graveyard. Put another way, there can be no stragglers in this ecosystem.

Let us now consider the chief assumptions made by the author in the above article.

Assumption 1  –  Hadoop adoption is staying flat at best

The best part of my job is working with multiple customers daily on their business initiatives and figuring out how to apply technology to solve these complex challenges. I can attest that adoption at the largest enterprises is anything but stagnating. While my view is certainly anecdotal and confined to the four walls of one company, adoption is skyrocketing at verticals like Banking, Telecom, Manufacturing & Insurance. The early corporate movers, working with the leading vendors, have more or less figured out the kinks in the technology as applied to their business challenges. The adoption patterns are maturing and they are realizing massive business value from it. A leading vendor, Hortonworks, moved to $100 million in annual revenues quicker than any other tech startup – which is testament to the potential of this space. Cloudera just went public. All this growth has been accompanied by somewhat declining revenues & stock prices at leading EDW vendors. I forecast that the first Big Data ‘startup’ to reach $1 billion in revenue will do so over the next five to seven years, at a somewhat faster rate than the revered open source pioneer Red Hat. At a minimum, Hadoop projects cut tens of millions of dollars from costly and inflexible enterprise data warehouse projects. Nearly every large organization has begun deploying Hadoop as an Enterprise Landing Zone (ELZ) to augment an EDW.

Assumption 2  – The business value of projects created using Hadoop is unclear

The author has a point here, but let me explain why this is an organizational challenge and not really the fault of any technology stack – Middleware, Cloud or Big Data. The challenge is that it is often a fine art to figure out the business value of Big Data projects while working across complex organizational structures. IT groups can surely start POCs as science or “one-off resume builder” projects, but the lines of business need to get involved from the get go – sooner than with any other technology category. Big Data isn’t about the infrastructural plumber’s job of storing massive volumes of data, but about creating business analytics on the data collected and curated. Whether those analytics are simply old school BI or Data Science oriented depends on the culture and innovativeness of the organization.

Organizations are using Big Data not only to solve existing business challenges (sell more products, detect fraud, run risk reports etc) but also to rapidly experiment with new business models using the insights gleaned from Big Data analytics. It falls to the office of an enlightened CDO (Chief Data Officer) to own the technology, create the appropriate internal costing models and to onboard lines of business (LOB) projects into the data lake.

There are two questions every CDO needs to ask at the outset –

  • What business capabilities are going to be enabled across the organization?
  • What aspects of digital transformation can be enabled best by Big Data?

Assumption 3  – Big Data is only a valid technical solution for massive data volumes in the Petabytes (PBs).

The author writes ‘You don’t need Hadoop if you don’t really have a problem of huge data volumes in your enterprise, so hundreds of enterprises were hugely disappointed by their useless 2 to 10TB Hadoop clusters – Hadoop technology just doesn’t shine at this scale.’

This could not be further from the observed reality for three reasons.

Firstly, most of the projects in the terabyte (TB) range exist as tenants in larger clusters. The real value of data lakes is being able to build out cross organizational data repositories that were simply too expensive or too hard to build before. Once you have all the data in one place, you can mash it up and analyze it in ways heretofore unknown.

Secondly, as I’ve covered in the below post, many players are leveraging Big Data to gain the crucial “speed” advantage while working with TBs of data.

Thirdly, I recommend that every client start ‘small’ and use a data lake to serve as an Enterprise Landing Zone (ELZ) for data produced as a result of regular business operations. Hadoop clusters not only serve as cheap storage but also perform a range of rote but compute intensive data processing tasks (data joining, sorting, segmentation, binning etc) that offload the EDW from a range of taxing operations.
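The sort of rote join-and-aggregate work an ELZ takes off the warehouse can be illustrated in the map/shuffle/reduce style that a MapReduce or Hive job would apply at scale. The data and field names here are purely illustrative.

```python
from collections import defaultdict

# Join transactions to a customer dimension and roll up spend by region,
# structured as the map/shuffle/reduce phases a Hadoop job would run.
customers = [("c1", "NY"), ("c2", "CA")]
transactions = [("c1", 120.0), ("c2", 80.0), ("c1", 45.5)]

# Map phase: tag each record with its join key and source.
mapped = [(cid, ("dim", region)) for cid, region in customers]
mapped += [(cid, ("fact", amount)) for cid, amount in transactions]

# Shuffle phase: group everything by join key.
grouped = defaultdict(list)
for key, value in mapped:
    grouped[key].append(value)

# Reduce phase: join within each key and aggregate spend by region.
spend_by_region = defaultdict(float)
for key, values in grouped.items():
    region = next(v for tag, v in values if tag == "dim")
    spend_by_region[region] += sum(v for tag, v in values if tag == "fact")

print(dict(spend_by_region))  # → {'NY': 165.5, 'CA': 80.0}
```

On a cluster the shuffle phase is what Hadoop distributes across nodes; the point is that this compute-heavy grunt work no longer taxes the EDW.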

Assumption 4  – Hadoop skills are hard to find.

In the author’s words – “..while 57% said that the skills gap was the major reason, a number that is not going to be corrected overnight. This coincides with findings from Indeed who tracked job trends with ‘Hadoop Testing’ in the title, with the term featured in a peak of 0.061% of ads in mid 2014, which then jumped to 0.087% in late 2016, an increase of around 43% in 18 months. What this may signal is that adoption hasn’t necessarily dropped to the extent that anecdotal evidence would suggest, but companies are simply finding it difficult to extract value from Hadoop from their current teams and they require greater expertise.”

The skills gap is real and exists in three primary areas – Data Scientists, Data Engineers and Hadoop Administrators.

However, this is nothing unique to Hadoop; it is common with every new technology. Companies need to bridge the gap by augmenting the skills of internal staff, working with the Global Systems Integrators (GSIs) who have all added Big Data practice areas, and engaging with academia. In fact, the prospect of working on Big Data projects can attract talent to the organization.

How Should large Organizations proceed on their Big Data journey?

So what are the best practices to avoid falling into the “Big Data does not provide value” trap?

  • Ensuring that Big Data and a discussion of its business and technical capabilities are conducted at the highest levels. Big Data needs to be part of an organization’s DNA and should be discussed in the context of the other major technology forces driving industry – Cloud, Mobility, DevOps, Social, APIs etc.
  • Creating or constituting a team under the CDO (Chief Data Officer). Teams can be physical or virtual and need to take into account organizational politics
  • Creating a COE (Center of Excellence) or such federated approach where central team works with lines of business IT on these projects
  • As part of the COE, institute a process to onboard the latest skills
  • Instituting appropriate governance and project oversight
  • Identifying key business metrics that will drive Big Data projects. This blog has covered many such areas but these include detailed analyses on expected growth acceleration, cost reduction, risk management and enabling competitive advantage.
  • Engaging the lines of business to develop these capabilities in an iterative manner. Almost all successful Big Data projects are delivered in a DevOps fashion.


The Big Data ecosystem and Hadoop technology provide mature, stable and feature rich platforms for global vertical organizations to implement complex Digital projects. However, technology maturity is only a necessary factor. The organization’s innovation oriented mindset is key in driving internal change, as is inculcating a learning mindset across business leadership, IT teams, internal domain experts and management. The universal maxim – “one gets out of something only as much as they put into it” – is truer than ever with Big Data. While it is easy to blame a technology, a vendor or a lack of skilled personnel for perceived project failures, one should guard against a status quo-ist mindset. You can rest assured that your competition is not sitting still.