Why Digital Disruption is the Cure for the Common Data Center..

“The foundation of digital business is the boundary-free enterprise, which is made possible by an array of time- and location-independent computing capabilities – cloud, mobile, social and data analytics plus sensors and APIs. There are no shortcuts to the digital enterprise.”

— Mike West, Analyst, Saugatuck Research, 2015

At its core, Digital is a fairly straightforward concept. It is essentially about offering customers more contextual and relevant experiences while creating internal teams that can turn on a dime to serve customers. It is clear that these kinds of consumer capabilities simply cannot be offered using an existing technology stack. This blogpost seeks to answer what this next-generation computing stack may look like.

What Digital has in Store for Enterprises…

Digital transformation is a daily fact of life at web-scale shops like Google, Amazon, Apple, Facebook and Netflix. These mega shops have built not just intuitive and appealing applications but have gradually evolved them into platforms offering discrete marketplaces that serve global audiences. They also provide robust support for mobile applications that deliver services such as content, video, e-commerce and gaming via these channels. In fact, they have heralded the age of new media and in doing so have been transforming both internally (business models, internal teams & their offerings) as well as externally.

CXOs at established Fortune 1000 enterprises have been unable to find resonance in these stories from the standpoint of their own enterprise’s reinvention. This makes a lot of sense: these established companies have legacy investments and legacy stakeholders – both of which represent change inhibitors that the FANGs (Facebook, Amazon, Netflix and Google) did not have. Enterprise practitioners need to understand how Digital technology can impact both existing technology investments and the future landscape.

Where are most Enterprises at the moment…

Much of what exists in datacenters across organizations is antiquated from a technology-stack standpoint. This ranges from hardware platforms to network devices & switches to the monolithic applications running on them. Connecting these applications are often proprietary or manual integration architectures. There are inflexible, proprietary systems & data architectures, lots of manual processes, monolithic applications and tightly coupled integration. Rapid provisioning of IT resources is a huge bottleneck, which frequently leads lines of business to adopt the public cloud to run their workloads. According to Rakesh Kumar, managing vice president at Gartner – “For over 40 years, data centers have pretty much been a staple of the IT ecosystem. Despite changes in technology for power and cooling, and changes in the design and build of these structures, their basic function and core requirements have, by and large, remained constant. These are centered on high levels of availability and redundancy, strong, well-documented processes to manage change, traditional vendor management and segmented organizational structures. This approach, however, is no longer appropriate for the digital world.” [2]

On that note, the blogpost below captured the three essential technology investments that make up Digital Transformation.

The Three Core Competencies of Digital – Cloud, Big Data & Intelligent Middleware

If Digital is to happen, IT is one of the largest stakeholders…

Digital applications present seamless experiences across channels & devices, are tailored to individual customers’ needs, understand their preferences & need to be developed in an environment of constant product innovation.

So, which datacenter capabilities are required to deliver this?

Figuring out the best architectural foundation to support, leverage & monetize digital experiences is complex. The past few years have seen the rapid evolution of many transformational technologies – Big Data, Cognitive Computing, Cloud technology (public clouds, OpenStack, PaaS, containers, software-defined networking & storage), the Blockchain – the list goes on. These are leading enterprises to a smarter way of developing enterprise applications and to more modern, efficient, scalable, cloud-based architectures.

So, what capabilities do Datacenters need to innovate towards?


                                         The legacy Datacenter transitions to the Digital Datacenter

While the illustration above is largely self-explanatory, a few points bear calling out. Enterprise IT will need to embrace Cloud Computing in a major way – whatever form the core offering takes: public, private or hybrid. The compute infrastructure will range from a mix of open-source virtualization to Linux containers. Containers essentially virtualize the operating system so that multiple workloads can run on a single host, instead of virtualizing a server to create multiple operating systems. Containers are easily ported across different servers without the need for reconfiguration and require less maintenance because there are fewer operating systems to manage. For instance, the OpenStack cloud project supports Docker (a de facto standard), a Linux container format designed to automate the deployment of applications as highly portable, self-sufficient containers.

Cloud computing will also enable rapid scale-up & scale-down across the gamut of infrastructure (compute – VMs/bare metal/containers; storage – SAN/NAS/DAS; network – switches/routers/firewalls etc.) in near real time (NRT). Investments in SDN (Software Defined Networking) will be de rigueur in order to improve software-based provisioning, network agility and time to market, and to drive network equipment costs down. The other vector of datacenter innovation is automation, i.e. vastly reducing manual effort in network and application provisioning. These capabilities will be key as the vast majority of digital applications are delivered as Software as a Service (SaaS).

An in-depth discussion of these Software Defined capabilities can be found in the blogpost below.

Financial Services IT begins to converge towards Software Defined Datacenters..

Applications developed for a Digital infrastructure will be built as small, nimble processes that communicate via APIs and over infrastructure such as service mediation components (e.g. Apache Camel). These microservices-based applications offer huge operational and development advantages over legacy applications. While one does not expect legacy but critical applications that still run on mainframes (e.g. core banking, customer order processing) to move to a microservices model anytime soon, customer-facing applications that need responsive digital UIs will definitely move.
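To make this concrete, here is a minimal sketch of what one such small, API-driven process might look like, using only the Python standard library. The endpoint path, field names and figures are hypothetical and purely for illustration; a production microservice would add authentication, error handling and its own datastore.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def account_summary(account_id: str) -> dict:
    """Hypothetical business logic; a real service would query its own datastore."""
    return {"accountId": account_id, "balance": 1250.75, "currency": "USD"}

class SummaryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expects requests of the form GET /accounts/<id>/summary
        parts = self.path.strip("/").split("/")
        if len(parts) == 3 and parts[0] == "accounts" and parts[2] == "summary":
            body = json.dumps(account_summary(parts[1])).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

def serve(port: int = 8080) -> HTTPServer:
    """Bind the service; call .serve_forever() on the result to start handling requests."""
    return HTTPServer(("localhost", port), SummaryHandler)
```

The point is the shape, not the code: each such service owns one narrow capability behind an HTTP/JSON contract, which is what lets teams deploy, version and scale it independently of any monolith.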

Which finally brings us to the most important capability of all – Data. The heart of any successful Digital implementation is data. Here, data includes internal data (e.g. customer data, transaction data, customer preference data), external datasets & other relevant third-party data (e.g. from retailers). While each source of data on its own may not radically change an application’s view of its customers, the combination of all of them promises to do just that.

The significant increase in mobile devices and IoT (Internet of Things) capable endpoints will ensure that data volumes grow exponentially. Digital applications will thus need to handle this data – not just to process it but also to glean real-time insights from it. Some of the biggest technology investments in ensuring a unified customer journey are in the areas of Big Data & Predictive Analytics. Enterprises should be able to leverage a common source of data that transcends silos (a data lake) to drive customer decisions in real time, using advanced analytics such as Machine Learning techniques and Cognitive computing platforms that can provide accurate and personalized insights to move the customer journey forward.

Can Datacenters incubate innovation?

Finally, one of the key IT architectural foundation strategies companies need to invest in is modern application development. Gartner calls such a feasible approach “Bimodal IT”. According to Gartner, “infrastructure & operations leaders must ensure that their internal data centers are able to connect into a broader hybrid topology”. [2] Let us consider Healthcare – a reasonably staid vertical – as an example. In a report released by EY, “Order from Chaos – Where big data and analytics are heading, and how life sciences can prepare for the transformational tidal wave,” [1] the services firm noted that an agile environment can help organizations create opportunities to turn data into innovative insights. Typical software development life cycles that require lengthy validations and quality control testing prior to deployment can stifle innovation. Agile software development, which is adaptive and rooted in evolutionary development and continuous improvement, can be combined with DevOps, which focuses on the integration between the developers and the teams who deploy and run IT operations. Together, these can help life sciences organizations amp up their application development and delivery cycles. EY notes in its report that life sciences organizations can significantly accelerate project delivery, for example, “from three projects in 12 months to 12 projects in three months.”

Finally, Big Data has evolved to enable the processing of data in batch, interactive or low-latency modes depending on the business requirements – a massive gain for Digital projects. Big Data and DevOps will go hand in hand to deliver new predictive capabilities.

Further, business can create digital models of client personas and integrate these with predictive analytic tiers in such a way that an API (Application Programming Interface) approach is provided to integrate these with the overall information architecture.


More and more organizations are adopting a Digital-first business strategy. The approach currently in vogue – treating these as one-off, tactical project investments – simply does not work or scale anymore. There are various organizational models one could employ to develop analytical maturity, ranging from a shared service to a line-of-business-led approach. An approach that I have seen work very well is to build a Digital Center of Excellence (COE) to create contextual capabilities, best practices and rollout strategies across the larger organization.

References –

[1] E&Y – “Order From Chaos” http://www.ey.com/Publication/vwLUAssets/EY-OrderFromChaos/$FILE/EY-OrderFromChaos.pdf

[2] Gartner – “Five Reasons Why a Modern Data Center Strategy Is Needed for the Digital World” – http://www.gartner.com/newsroom/id/3029231

Why Big Data & Advanced Analytics are Strategic Corporate Assets..

The industry is all about Digital now. The explosion in data storage and processing techniques promises to create new digital business opportunities across industries. Business Analytics concerns itself with deriving insights from data produced as a byproduct of business operations, as well as from external data that yields customer insights. Owing to its critical importance in decision making, Business Analytics is now a boardroom matter and not one confined to IT teams. My goal in this blogpost is to quickly introduce the analytics landscape before moving on to the significant value drivers that only Predictive Analytics can provide.

The Impact of Business Analytics…

The IDC “Worldwide Big Data and Analytics Spending Guide 2016” predicts that the big data and business analytics market will grow from $130 billion at the end of this year to $203 billion by 2020 [1], at a compound annual growth rate (CAGR) of 11.7%. This double-digit growth is being experienced across industry verticals such as banking & insurance, manufacturing, telecom, retail and healthcare.

Further, over the next four years, IDC finds that large enterprises with 500+ employees will be the main driver of big data and analytics investment, accounting for about $154 billion in revenue. The US will lead the market with around $95 billion in investments during that period, followed by Western Europe & the APAC region [1].

The two major kinds of Business Analytics…

When we discuss the broad topic of Business Analytics, it needs to be clarified that there are two major disciplines – Descriptive and Predictive. Industry analysts from Gartner, IDC etc. will tell you that one also needs to widen the definition to include Diagnostic and Prescriptive. Having worked in the industry for a few years, I can safely say that these can be subsumed into the two major categories above.

Let’s define the major kinds of industrial analytics at a high level –

  • Descriptive Analytics is commonly described as being retrospective in nature, i.e. “tell me what has already happened”. It covers a range of areas traditionally considered BI (Business Intelligence) – a traditional & well-established analytical domain that takes a retrospective look at business data held in systems of record. BI focuses on supporting operational business processes like customer onboarding, claims processing and loan qualification via dashboards, process metrics and KPIs (Key Performance Indicators). It also supports a basic level of mathematical techniques for data analysis (such as trending & aggregation) to infer intelligence from the same. The goal of the Descriptive disciplines is primarily to surface macro or aggregate business trends across dimensions such as time, product lines, business units & operating geographies.

  • Predictive Analytics is the forward-looking branch of analytics, which tries to predict the future based on information about the past. It describes what “can happen based on the patterns in data”. It covers areas like machine learning, data mining, statistics, data engineering & other advanced techniques such as text analytics, natural language processing, deep learning and neural networks. A more detailed primer on both, along with detailed use cases, can be found here –

The Data Science Continuum in Financial Services..(3/3)
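The dimensional roll-ups at the heart of the Descriptive discipline are easy to picture in code. The sketch below, with made-up transaction records, totals a measure along any one dimension – essentially what a BI tool does behind a dashboard widget.

```python
from collections import defaultdict

# Made-up transaction records of the kind a BI dashboard aggregates.
transactions = [
    {"quarter": "Q1", "product": "Checking", "region": "EU", "amount": 120.0},
    {"quarter": "Q1", "product": "Savings",  "region": "US", "amount": 300.0},
    {"quarter": "Q2", "product": "Checking", "region": "EU", "amount": 180.0},
    {"quarter": "Q2", "product": "Checking", "region": "US", "amount": 250.0},
]

def aggregate(records, dimension):
    """Roll up the amount measure along one dimension (time, product line, geography...)."""
    totals = defaultdict(float)
    for record in records:
        totals[record[dimension]] += record["amount"]
    return dict(totals)

print(aggregate(transactions, "quarter"))  # {'Q1': 420.0, 'Q2': 430.0}
print(aggregate(transactions, "product"))  # {'Checking': 550.0, 'Savings': 300.0}
```

Real BI stacks do this with SQL `GROUP BY` over a warehouse star schema rather than in application code, but the operation is the same.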

The two main domains of Analytics are complementary yet different…

Predictive Analytics does not intend to, nor will it, replace the BI domain; rather, it adds significantly more sophisticated analytical capabilities, enabling businesses to do more with all the data they collect. It is not uncommon to find real-world business projects leveraging both analytical approaches.

However from an implementation standpoint, the only common area of both approaches is knowledge of the business and the sources of data in an organization. Most other things about them vary.

For instance, predictive approaches both augment & build on the BI paradigm by adding a “What could happen” dimension to the data.

The Descriptive Analytics/BI workflow…

BI projects tend to follow a largely structured process that has been well defined over the last 15-20 years. As the illustration below shows, data produced in operational systems is subjected to extraction and transformation and is eventually loaded into a data warehouse for consumption by visualization tools.


                                                                       The Descriptive Analysis Workflow 

Descriptive Analytics and BI add tremendous value to well defined use cases based on a retrospective look at data.
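As a rough sketch of the extract-transform-load sequence in that workflow (with invented row data and schema names), the three stages reduce to:

```python
# Extract: rows as they might arrive from an operational system (invented data).
raw_rows = [
    "2016-03-01,ACME Corp,1200.50",
    "2016-03-02,Beta LLC,830.00",
]

def extract(rows):
    """Parse delimited source rows into fields."""
    return [row.split(",") for row in rows]

def transform(parsed):
    """Conform types and field names to the warehouse schema."""
    return [
        {"txn_date": date, "customer": customer, "amount": float(amount)}
        for date, customer, amount in parsed
    ]

warehouse = []  # stand-in for a warehouse fact table

def load(records):
    warehouse.extend(records)

load(transform(extract(raw_rows)))
print(warehouse[0])  # {'txn_date': '2016-03-01', 'customer': 'ACME Corp', 'amount': 1200.5}
```

In practice each stage is a dedicated ETL tool or job rather than three functions, but the rigidity the next section complains about comes precisely from the fixed schema the `transform` step must target.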

However, key challenges with this process are –

  1. the lack of a platform to standardize and centralize data feeds leads to data silos, which cause all kinds of cost & data governance headaches across the landscape
  2. complying with regulatory initiatives (such as Anti Money Laundering or Solvency II) requires the warehouse to handle varying types of data, which is a huge challenge for most EDW technologies
  3. adding new & highly granular fields to the data feeds in an agile manner requires extensive relational modeling upfront to handle newer kinds of schemas

Big Data platforms have overcome past shortfalls in security and governance and are now being used in BI projects at most organizations. An example of the usage of Hadoop in a classic BI area such as Risk Data Aggregation is discussed in depth in the blog below.


That being said, this space serves a large existing base of customers, but the industry has been looking to Big Data as a way of constructing a central data processing platform that can help with the above issues.

BI projects are predicated on using an EDW (Enterprise Data Warehouse) and/or an RDBMS (Relational Database Management System) to store & analyze the data. Both kinds of storage and processing technology are legacy in terms of the data formats they support (row-column based) as well as the types of data they can store (structured data).

Finally, these systems fall short of processing the data volumes generated by digital workloads, which tend to be loosely structured (e.g. mobile application front ends, IoT devices such as sensors, ATMs or point-of-sale terminals), need business decisions to be made in near real time or in micro-batches (e.g. detecting credit card fraud, suggesting the next best action for a bank customer), and are increasingly cloud- & API-based to save on costs & provide self-service.

That is where Predictive Approaches on Big Data platforms are beginning to shine and fill critical gaps.

The Predictive Analytics workflow…

Though the field of predictive analytics has been around for years, it is witnessing a rapid rebirth with the advent of Big Data. Hadoop ecosystem projects enable the easy ingestion of massive quantities of data, helping the business gather far more attributes about their customers and their preferences.


                                                                    The Predictive Analysis Workflow

The Predictive Analytics workflow always starts with a business problem in mind. Examples of these would be “A marketing project to detect which customers are likely to buy new products or services in the next six months based on their historical & real time product usage patterns – which are denoted by x, y or z characteristics” or “Detect real-time fraud in credit card transactions.”

In use cases like these, the goal of the data science process is to segment & filter customers by corralling them into categories that enable easy ranking. Once this is done, the business is engaged to set up easy and intuitive visualizations that present the results.

A lot of the time, business groups have a hard time articulating what they would like to see – both the data and the visualization. In such cases, a prototype makes requirements gathering much easier. Once the problem is defined, the data scientist/modeler identifies the raw data sources (both internal and external) that bear on the business challenge. They spend a lot of time collating the data (from Oracle/SQL Server, DB2, mainframes, Greenplum, Excel sheets, external datasets, etc.). The cleanup process involves imputing missing values, repairing corrupted data elements, normalizing fields that indicate time and date, and so on.

The data wrangling phase involves writing code to join the various data elements so that a single client’s complete dataset is gathered in the Data Lake from a raw-features standpoint. If more data is obtained while the development cycle is underway, the Data Science team has no option but to go back & redo the whole process. The modeling phase is where algorithms come in – these can be supervised or unsupervised. Feature engineering takes business concepts & raw data features and creates predictive features from them. The Data Scientist takes the raw & engineered features and creates a model using a mix of various algorithms. Once the model has been repeatedly tested for accuracy and performance, it is typically deployed as a service. Models as a Service (MaaS) is the Data Science counterpart to Software as a Service: the MaaS takes in business variables (often hundreds of inputs) and provides as output business decisions/intelligence, measurements & visualizations that augment decision support systems.
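The cleanup, feature engineering and scoring steps just described can be compressed into a toy sketch. Everything here is invented for illustration: the records, the mean-imputation choice, and the “model”, which is just a hand-weighted sum standing in for something an algorithm would actually learn from historical data.

```python
# One hypothetical customer record per dict; None marks a missing value.
customers = [
    {"id": "c1", "monthly_spend": 420.0, "logins_per_week": 5,    "tenure_years": 3},
    {"id": "c2", "monthly_spend": None,  "logins_per_week": 1,    "tenure_years": 7},
    {"id": "c3", "monthly_spend": 150.0, "logins_per_week": None, "tenure_years": 1},
]

def clean(records):
    """Data cleanup: impute missing numeric fields with the column mean."""
    for field in ("monthly_spend", "logins_per_week"):
        known = [r[field] for r in records if r[field] is not None]
        mean = sum(known) / len(known)
        for r in records:
            if r[field] is None:
                r[field] = mean
    return records

def engineer(record):
    """Feature engineering: derive a predictive feature from raw ones."""
    record["engagement"] = record["logins_per_week"] * record["tenure_years"]
    return record

def score(record):
    """A trivial stand-in for a trained model: a weighted sum capped at 1.0."""
    z = 0.002 * record["monthly_spend"] + 0.05 * record["engagement"]
    return min(1.0, z)

# Propensity-to-buy per customer, as a MaaS endpoint might return it.
propensities = {r["id"]: round(score(engineer(r)), 3) for r in clean(customers)}
print(propensities)
```

Deployed as MaaS, the `score` function sits behind an API: hundreds of input variables go in, a decision or ranking comes out.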

 How Predictive Analytics changes the game…

Predictive analytics can bring about transformative benefits in the following six ways.

  1. Predictive approaches can be applied to a much wider & richer variety of business challenges, enabling an organization to achieve outcomes that were not really possible with the Descriptive variety. These use cases range from digital transformation to fraud detection to marketing analytics to IoT (Internet of Things) across industry verticals. Predictive approaches can also be real-time, not just batch-oriented like Descriptive approaches.
  2. When deployed strategically, they can scale to enormous volumes of data and help reason over them, reducing manual costs. They can take on problems that cannot be managed manually because of the huge amount of data that must be processed.
  3. They can predict the results of complex business scenarios by probabilistically predicting different outcomes across thousands of variables, perceiving minute dependencies between them. An example is social graph analysis to understand which individuals in a given geography are committing fraud and whether a ring is operating.
  4. They are vastly superior at handling fine-grained data of manifold types than the traditional approach or manual processing. The predictive approach also encourages the integration of previously “dark” data as well as newer external sources of data.
  5. They can also suggest specific business actions (e.g. based on the above outcomes) by mining data for hitherto unknown patterns. The data science approach keeps learning constantly in order to increase the accuracy of its decisions.
  6. Data Monetization – they can be used to interpret the mined data to discover solutions to business challenges and new business opportunities/models.
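The social graph analysis mentioned in point 3 boils down, in its simplest form, to finding connected components of the account graph that touch an already-flagged account. The sketch below uses invented account IDs and link data (a link meaning, say, a shared device, address or payment instrument):

```python
from collections import deque

# Invented edges: a link means two accounts share a device, address or card.
links = [("a1", "a2"), ("a2", "a3"), ("a4", "a5"), ("a6", "a6")]
flagged = {"a2", "a5"}  # accounts already flagged for suspicious activity

def build_graph(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def fraud_rings(edges, flagged_accounts):
    """Return connected components that contain at least one flagged account."""
    graph = build_graph(edges)
    seen, rings = set(), []
    for node in graph:
        if node in seen:
            continue
        component, queue = set(), deque([node])
        while queue:  # breadth-first walk of this component
            current = queue.popleft()
            if current in component:
                continue
            component.add(current)
            queue.extend(graph[current] - component)
        seen |= component
        if component & flagged_accounts:
            rings.append(sorted(component))
    return rings

print(fraud_rings(links, flagged))  # each inner list is a candidate ring
```

Production systems run this kind of traversal on graph engines over billions of edges and enrich it with scoring, but the core idea – guilt by verifiable association – is exactly this.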


[1] IDC Worldwide Semiannual Big Data and Business Analytics Spending Guide, Oct 2016 – “Double-Digit Growth Forecast for the Worldwide Big Data and Business Analytics Market Through 2020 Led by Banking and Manufacturing Investments, According to IDC”



Demystifying Digital – the importance of Customer Journey Mapping…(2/3)

The first post in this three part series on Digital Foundations @ http://www.vamsitalkstech.com/?p=2517 introduced the concept of Customer 360 or Single View of Customer (SVC).  We discussed specific benefits from both a business & operational standpoint that are enabled by SVC. This second post in the series introduces the concept of a Customer Journey. The third & final post will focus on a technical design & architecture needed to achieve both these capabilities.

Introduction to Customer Journey Mapping…

The core challenge many Banks have is their ability to offer a unified customer experience for banking services across different touch points. The lack of such a unified experience negatively impacts the quality of the overall banking experience.

Thus, Customer Journey Mapping refers to the process of creating a visual depiction of a customer’s adoption and usage of banking products across different channels or touch points (branch, mobile, phone, chat, email, etc.). The journey provides dynamic, real-time insight into the total customer lifetime value (CLV) as the person progresses through her or his life journey. The goal of customer journey mapping is to provide bank personnel with a way of serving the customer better while increasing the bank’s net economic value from serving that customer.

The result of the journey mapping process is to drive the overall engagement model from the customer’s perspective, not solely from the Bank’s internal processes.

Banks may be curious as to why they need a new approach to customer centricity. Quite simply, consider the sheer complexity of signing up for new banking products such as checking or savings accounts, or of receiving credit for a simple checking deposit. At many banks these activities can take a couple of days. Products with higher complexity, like home mortgage applications, can take weeks to process even for consumers with outstanding credit. Consumers constantly compare these slow cycle times to the real-time service they commonly obtain from online services such as Amazon, Apple Pay, Google Wallet, Airbnb or even FinTechs. For internal innovation to flourish, a customer-centric mindset, rather than an internal process-centric mindset, is what is called for at most incumbent Banks.

The Boston Consulting Group (BCG) has proposed a six part program for Banks to improve their customer centricity as a way of driving increased responsiveness and customer satisfaction[1]. This is depicted in the below illustration.


Customer Journey Mapping in Banking involves six different areas. Redrawn & readapted from BCG Analysis [1]
  1. Invest in intuitive interfaces for both customer & internal stakeholder interactions – Millennials who use services like Uber, Facebook, Zillow and Amazon in their daily lives are now very vocal in demanding a seamless experience across all of their banking services via digital channels. The first component of client-oriented thinking is to provide UI applications that smoothly deliver products reflecting individual customers’ lifestyles, financial needs & behavioral preferences. The user experience will offer different views to business users at various levels in the bank – client advisors, personal bankers, relationship managers, branch managers etc. The second component is to provide a seamless experience across all channels (mobile, eBanking, tablet, phone etc.) so that the overall journey is continuous and non-siloed. The implication is that clients should be able to begin a business transaction in channel A and continue it in channel B where that makes business sense.
  2. Technology Investments – The biggest technology investments in ensuring a unified customer journey are in the areas of Big Data & Predictive Analytics. Banks should be able to leverage a common source of data that transcends silos to drive customer decisions in real time, using advanced analytics such as Machine Learning techniques and Cognitive computing platforms that can provide accurate and personalized insights to move the customer journey forward. Such platforms need to be deployed in strategic areas such as the front office, the call center and loan qualification. Further, the business can create models of client personas and integrate these with the predictive analytic tier in such a way that an API (Application Programming Interface) approach is provided to integrate them with the overall information architecture.
  3. Agile Business Practices–  Customer Journey Mapping calls for cross organizational design teams consisting of business experts, UX designers, Developers & IT leaders. The goal is to create intuitive & smart client facing applications using a rapid and agile business & development lifecycle. 
  4. Risk & Compliance –  Scalable enterprise customer journey management also provides a way to integrate risk and compliance functions such as customer risk, AML compliance into the process. This can be achieved using a combination of machine learning & business rules.
  5. Process Workflow – It all starts with the business thinking outside the box and bringing in learnings from other verticals like online retailing, telecom and FinTechs to create journeys that reimagine existing business processes using technology. An example would be to reimagine the mortgage application process by having the bank grant a mortgage through a purely online process, after detecting that this may be the best product for a given consumer. Once the initial offer is validated using a combination of publicly available MLS (Multiple Listing Service) data & the consumer’s financial picture, the bank can team up with realtors to provide the consumer with an online home shopping experience and take the process to a close using eSigning.
  6. Value Proposition – It is key for financial services organizations to identify appropriate use cases as well as target audiences as they begin creating critical customer journeys. First identifying & then profiling key areas – such as customer onboarding, mortgage/auto loan applications, fraud claims management workflows in the retail bank, and digital investment advisory in wealth management – is essential. Once these are identified, putting in place strong value drivers with demonstrable ROI metrics is critical to getting management buy-in. According to BCG, banks that have adopted an incremental approach to customer journey innovation have increased their revenues by 25% and their productivity by 20% to 40% [1].


As financial services firms begin to embark on digital transformation, they will need to transition to a customer-oriented mindset. Along with a Single View of Client, Customer Journey Mapping is a big step toward realizing digitization. Banks that can make this incremental transition will surely realize immense benefits in customer lifetime value & retention compared to their peers. Furthermore, when a Bank embarks on Data Monetization – using its vast internal data (about customers, their transaction histories, financial preferences, operational insights etc.) to create new products or services or to enhance the product experience – journey mapping is a foundational capability it needs to possess.


[1] Boston Consulting Group, 2016 – “How digitized Customer Journeys can help Banks win hearts, minds and profits”

A POV on European Banking Regulation.. MAR, MiFID II et al

Today’s European financial markets hardly resemble those of 15 years ago. The high speed of electronic trading, the explosion in trading volumes, the diverse range of instrument classes & the proliferation of trading venues pose massive challenges. With all this complexity, market abuse patterns have also become more egregious, and banks are now shelling out millions of euros in fines for market abuse violations. In response to this complex world, European regulators have been hard at work. They have created rules for the surveillance of exchanges with a view to detecting suspicious patterns of trading behavior & increasing market transparency. In this blogpost, we will discuss the state of this regulatory raft and propose a Big Data led reengineering of data storage, record keeping & forensic analysis techniques to help Banks meet these requirements.

A Short History of Market Surveillance Regulation in the European Union..

The lobby of the European Securities and Markets Authority’s (ESMA) headquarters in Paris, France. Photographer: Balint Porneczi/Bloomberg

As we have seen in previous posts, firms in what is typically the riskiest part of Banking – Capital Markets – deal in complex financial products in a dynamic industry. Over the last few years, Capital Markets have been undergoing a rapid transformation – at a higher rate, perhaps, than Retail or Corporate Banking. This is being fueled by technology advances that produce ever-lower trading latencies, an array of financial products, differing (and newer) market participants, heavily quant-based trading strategies and multiple venues (exchanges, dark pools etc.) that compete for flow based on new products & services.

The Capital Markets value chain in Europe encompasses firms on the buy side (e.g. wealth managers), the sell side (e.g. broker dealers), firms that provide custodial services, and technology providers who supply platforms for post trade analytics. The crucial links across all of these are the execution venues themselves as well as the clearing houses. With increased globalization driving the capital markets and an increasing number of issuers, one finds ever increasing complexity across a range of financial instruments (stocks, bonds, derivatives, commodities etc).

Against this backdrop, the ESMA (European Securities and Markets Authority) has over the last few years slowly begun to harmonize the various pieces of legislation that were originally intended to protect the investor. We will focus on two major regulations that market participants in the EU now need to conform with: MiFID II (Markets in Financial Instruments Directive) and the MAR (Market Abuse Regulation). While these regulations have different effective dates, together they supplant the original MAD (Market Abuse Directive) passed in 2003. The global nature of capital markets ensured that the MAD quickly fell behind the needs of today’s financial system. Cases in point are the manipulation of the LIBOR (London Interbank Offered Rate) benchmark & the FX Spot Trading scandal in the UK – both of which clearly illustrated the limitations of regulation passed a decade ago. The latter concerned the FX (Foreign Exchange) market, the largest and most liquid financial market in the world, with daily turnover approaching $5.3 trillion as of 2014 and the bulk of it concentrated in London. In 2014, the FCA (Financial Conduct Authority) fined several leading banks a combined 1.1 billion GBP for market manipulation. With all that said, let us quickly examine the two major areas of regulation before we study the downstream business & technology ramifications.

Though we will focus on MiFID II and MAR in this post, the business challenges and technology architecture are broadly applicable to areas such as Dodd Frank CAT in the US & FX Remediation in the UK.

MiFID, MiFID II and MAR..

MiFID (Markets in Financial Instruments Directive) originated as the Investment Services Directive of the early 1990s. As EU law (2004/39/EC), it has been applicable across the European Union since November 2007. MiFID is a cornerstone of the EU’s regulation of financial markets, seeking to improve the competitiveness of EU financial markets by creating a single market for investment services and activities, and to ensure a high degree of harmonised protection for investors in financial instruments. MiFID sets out basic rules of market participant conduct across the EU financial markets. It is intended to cover market structure issues – best execution, equity & bond market supervision – and it also incorporates statutes for Investor Protection.

The financial crisis of 2008 (http://www.vamsitalkstech.com/?p=2758) led G20 leaders to demand safer and more resilient financial markets. This was for multiple reasons – ranging from overall confidence in the integrity of the markets, to the exposure of households & pension funds to these markets, to ensuring the availability of capital for businesses to grow. Regulators across the globe thus began to address these concerns. After extensive work, the political process concluded with the original MiFID evolving into two separate pieces of legislation – MiFID II & MiFIR. MiFID II expands on the original MiFID, goes live in 2018 [1], and has rules built in that deal with breaching thresholds, disorderly trading and other potential abuse [2].

The spot FX market is a wholesale financial market, and spot FX benchmarks (also known as “fixes”) are used to establish the relative value of two currencies. Fixes are used by a wide range of financial and non-financial companies, for example to help value assets or manage currency risk.

MiFID II’s transparency requirements cover a whole range of organizations in a broadly similar way, including –

  1. A range of trading venues including Regulated Markets (RM), Multilateral Trading Facilities (MTF) & Organized Trading Facilities (OTF)
  2. Investment firms (any entity providing investment services) and Systematic Internalizers (clarified as any firm designated as a market maker, or a bank with the ability to net out counterparty positions from its order flow)
  3. Ultimately, MiFID II affects the complete range of actors in the EU financial markets. This covers asset managers, custodial services, wealth managers etc, irrespective of where they are based (EU or non-EU)

The most significant ‘Transparency’ portion of MiFID II expands the regime that was initially created for equity instruments in the original directive, adding reporting requirements for both bonds and derivatives. Similar to the reporting requirements under Dodd Frank, this includes both trade reporting – public reporting of trades in realtime – and transaction reporting – regulatory reporting no later than T+1.

Beginning early January 2018 [1], when MiFID II goes into effect, both EU firms & regulators will be required to monitor a whole range of transactions as well as store more trade data across the lifecycle. Firms are also required to file Suspicious Transaction Reports (STRs) as and when they detect suspicious trading patterns that may connote forms of market abuse.

The goal of the Market Abuse Regulation (MAR) is to ensure that regulatory rules stay in lockstep with the tremendous technological progress around trading platforms, especially High Frequency Trading (HFT). The Market Abuse Directive (MAD) complements the MAR by ensuring that all EU member states adopt a common taxonomy of definitions for the range of market abuse behaviors.

Meanwhile, MAR defines inside information & insider trading with concrete examples of rogue behavior including collusion, ping orders, abusive squeezes, cross-product manipulation, floor/ceiling price patterns, phishing, improper matched orders, concealing ownership, wash trades, trash and cash, quote stuffing, excessive bid/offer spreads, and ‘pump and dump’.

The MAR went live in July 2016. Its goal is to ensure that rules keep pace with market developments, such as new trading platforms, as well as new technologies, such as High Frequency Trading (HFT) and algorithmic trading. The MAR also imposes identification requirements on the trader or algorithm responsible for an investment decision.

MiFID II clearly requires that firms have in place systems and controls that monitor such behaviors and are able to prevent disorderly markets.

The overarching intent of both MiFiD II & MAR is to maintain investor faith in the markets by ensuring market integrity, transparency and by catching abuse as it happens. Accordingly, the ESMA has asked for sweeping changes across how transactions on a range of financial instruments – equities, OTC traded derivatives etc – are handled. These changes have ramifications for Banks, Exchanges & Broker Dealers from a record keeping, trade reconstruction & market abuse monitoring, detection & prevention standpoint.

Furthermore, MiFID II expands transaction reporting requirements to cover firms such as High Frequency Trading firms, Direct Electronic Access (DEA) providers & General Clearing Members (GCM). The reporting granularity has also been extended to identifying the trader and the client across the order lifecycle for a given transaction.

Thus, beginning 3rd January 2018, when MiFID II goes into effect, both firms and regulators will be required to capture & report on the detailed order lifecycle of trades.

Key Business & Technology Requirements for MiFID II and MAR Platforms..

While these regulations have broad ramifications across a variety of key functions – compliance, compensation policies, surveillance etc – one of the biggest obstacles is technology, which we will examine below along with some guidance.

Some of the key business requirements that can be distilled from these regulatory mandates include the below:

  • Store heterogeneous data – Both MiFID II and MAR mandate trade monitoring & analysis on not just real time data but also historical data spanning a few years. Among others, this will include data feeds from a range of business systems – trade data, valuation & position data, reference data, rates, market data, client data, front, middle & back office data, voice, chat & other internal communications etc. To sum up, the ability to store a range of cross asset (almost all kinds of instruments), cross format (structured & unstructured, including voice), cross venue (exchange, OTC etc) trading data with a high degree of granularity is key.
  • Data Auditing – Such stored data needs to be fully auditable for 5 years. This implies not just being able to store it, but also putting capabilities in place to ensure strict governance & audit trails.
  • Manage a huge increase in data storage requirements (5+ years of data) due to extensive record keeping requirements.
  • Perform Realtime Surveillance & Monitoring of data – Once data is collected, normalized & segmented, the platform will need to support realtime monitoring (on the order of 5 seconds) to ensure that every trade can be tracked through its lifecycle. Detecting patterns that connote market abuse and monitoring for best execution are key.
  • Business Rules – The core logic that identifies many of the above trade patterns is expressed as business rules. Business Rules have been covered in various posts on this blog; they primarily work on an IF..THEN..ELSE construct.
  • Machine Learning & Predictive Analytics – A variety of supervised and unsupervised learning approaches can be used to perform extensive behavioral modeling & segmentation of transaction behavior, with a view to identifying the behavioral patterns of traders & any outlier behaviors that connote potential regulatory violations.
  • A Single View of an Institutional Client- From the firm’s standpoint, it would be very useful to have a single view capability for clients that shows all of their positions across multiple desks, risk position, KYC score etc.

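As an illustration of the Business Rules requirement above, the below Python snippet sketches a single IF..THEN..ELSE style rule – a hypothetical wash trade check that flags trades where the same entity appears on both sides. The Trade fields, rule and sample data are all invented for illustration; a production platform would run such rules inside a proper rules engine over normalized, cross-venue trade data.

```python
from dataclasses import dataclass

@dataclass
class Trade:
    trade_id: str
    instrument: str
    buyer: str
    seller: str
    price: float
    quantity: int

def wash_trade_rule(trade: Trade) -> bool:
    # IF the same entity is on both sides of the trade
    # THEN flag it as a potential wash trade, ELSE pass.
    if trade.buyer == trade.seller:
        return True
    return False

def run_rules(trades, rules):
    # Apply every rule to every trade and collect flagged trade ids.
    return [t.trade_id for t in trades for rule in rules if rule(t)]

trades = [
    Trade("T1", "XS0001", "DESK-A", "DESK-B", 101.5, 100),
    Trade("T2", "XS0001", "DESK-A", "DESK-A", 101.6, 100),
]
flagged = run_rules(trades, [wash_trade_rule])
print(flagged)  # ['T2'] - only the self-dealing trade is flagged
```

New rules – say for ping orders or improper matched orders – would simply be added to the list passed to `run_rules`, which is exactly the kind of extensibility both MiFID II and MAR demand.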
 The Design Ramifications of MiFiD II and MAR..

The below post captures the design of a market surveillance system to a good degree of detail. I had originally proposed it in the context of Dodd Frank CAT (Consolidated Audit Trail) Reporting in the US but we will extend these core ideas to MiFiD II and MAR as well. The link is reproduced below for review.

Design & Architecture of a Next Gen Market Surveillance System..(2/2)

Architecture of a Market Surveillance System..

The ability to perform deep & multi-level analysis of trade activity implies the capability not only of storing heterogeneous data in one place for years, but also of performing forensic analytics (Rules & Machine Learning) in place at very low latency. Interactive (SQL like) querying needs to be supported, as well as the ability to perform deep forensics on the data via Data Science. Further, quick & effective investigation of suspicious trader behavior requires that compliance teams can access and visualize patterns of trade, drilling into behavior to identify potential compliance violations. A Big Data platform is ideal for this complete range of requirements.
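As a toy illustration of the interactive (SQL like) querying described above, the below snippet uses Python’s built-in sqlite3 as a stand-in for a distributed SQL-on-Hadoop engine such as Hive or Impala. The order lifecycle schema and events are hypothetical; the query builds a per-trader cancel and fill profile, a common first cut when screening for patterns like quote stuffing.

```python
import sqlite3

# Hypothetical order-lifecycle store, per MiFID II record keeping mandates.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    order_id TEXT, trader TEXT, instrument TEXT, event TEXT, ts TEXT)""")
conn.executemany("INSERT INTO orders VALUES (?,?,?,?,?)", [
    ("O1", "TR-7", "DE0001", "NEW",    "2018-01-03T09:00:01"),
    ("O1", "TR-7", "DE0001", "AMEND",  "2018-01-03T09:00:02"),
    ("O1", "TR-7", "DE0001", "CANCEL", "2018-01-03T09:00:03"),
    ("O2", "TR-7", "DE0001", "NEW",    "2018-01-03T09:00:04"),
    ("O2", "TR-7", "DE0001", "FILL",   "2018-01-03T09:00:05"),
])

# Forensic query: how many orders does each trader cancel versus fill?
result = conn.execute("""
    SELECT trader,
           SUM(event = 'CANCEL') AS cancels,
           SUM(event = 'FILL')   AS fills
    FROM orders GROUP BY trader""").fetchall()
print(result)  # [('TR-7', 1, 1)]
```

The same aggregation, run over years of order data in HDFS instead of an in-memory table, is precisely the kind of interactive forensic query a compliance analyst would issue.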


                       Design and Architecture of a Market Surveillance System for MiFiD II and MAR

The most important technical features for such a system are –

  1. Support end to end monitoring across a variety of financial instruments and multiple venues of trading. Support a wide variety of analytics that enable the discovery of interrelationships between customers, traders & trades – the next major advance in surveillance technology. HDFS is the ideal storage repository for this data.
  2. Provide a platform that can ingest from tens of millions to billions of market events (spanning a range of financial instruments – Equities, Bonds, Forex, Commodities and Derivatives etc) on a daily basis from thousands of institutional market participants. Data can be ingested using a range of tools – Sqoop, Kafka, Flume, API etc
  3. The ability to add new business rules (via either a business rules engine and/or a model based system that supports machine learning) is a key requirement. As we have seen above, market manipulation is an activity that constantly pushes the boundaries in new and unforeseen ways. This can be met using open source languages like Python and R. Multifaceted projects such as Apache Spark allow users to perform exploratory data analysis (EDA) and data science based analysis via language bindings with Python & R for a range of investigative use cases.
  4. Provide advanced visualization techniques thus helping Compliance and Surveillance officers manage the information overload.
  5. The ability to perform deep cross-market analysis i.e. to be able to look at financial instruments & securities trading on multiple geographies and exchanges 
  6. The ability to create views and correlate data that are both wide and deep. A wide view is one that helps look at related securities across multiple venues; a deep view will look for a range of illegal behaviors that threaten market integrity such as market manipulation, insider trading, watch/restricted list trading and unusual pricing.
  7. The ability to provide in-memory caches of data for rapid pre-trade & post-trade compliance checks.
  8. Ability to create prebuilt analytical models and algorithms that pertain to trading strategy (pre-trade models, e.g. best execution analysis). The most popular way to link R and Hadoop is to use HDFS as the long-term store for all data, and use MapReduce jobs (potentially submitted from Hive or Pig) to encode, enrich, and sample data sets from HDFS into R.
  9. Provide Data Scientists and Quants with development interfaces using tools like SAS and R.
  10. The results of the processing and queries need to be exported in various data formats, a simple CSV/txt format or more optimized binary formats, JSON formats, or even into custom formats.  The results will be in the form of standard relational DB data types (e.g. String, Date, Numeric, Boolean).
  11. Based on back testing and simulation, analysts should be able to tweak the model and also allow subscribers (typically compliance personnel) of the platform to customize their execution models.
  12. A wide range of Analytical tools need to be integrated that allow the best dashboards and visualizations. This can be supported by platforms like Tableau, Qlikview and SAS.
  13. An intelligent surveillance system needs to store trade data, reference data, order data, and market data, as well as all of the relevant communications from a range of disparate systems, both internal and external, and then match these appropriately. The matching engine can be created using languages supported in Hadoop – Java, Scala, Python & R etc.
  14. Provide for multiple layers of detection capabilities starting with a) configuring business rules (that describe a trading pattern) as well as b) dynamic capabilities based on machine learning models (typically thought of as being more predictive). Such a system can also parallelize execution at scale to be able to meet demanding latency requirements for a market surveillance platform.

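To make the machine learning detection layer described above a little more concrete, the below sketch flags outliers in per-trader daily volumes using a plain z-score test from the Python standard library. This is only a stand-in for the richer unsupervised models (clustering, isolation forests etc) a real surveillance platform would run at scale on Spark; the volumes and the threshold are invented for illustration.

```python
from statistics import mean, stdev

def zscore_outliers(values, threshold=2.5):
    # Flag indices whose distance from the mean exceeds `threshold`
    # standard deviations - a crude unsupervised outlier detector.
    mu, sigma = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Daily traded volumes for one trader; the final spike is the anomaly.
volumes = [100, 110, 95, 105, 98, 102, 97, 103, 5000]
print(zscore_outliers(volumes))  # [8] - only the spike is flagged
```

In a production setting, the flagged outliers would feed the case management and visualization layers so that compliance officers can drill into the underlying order flow.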

[1] http://europa.eu/rapid/press-release_IP-16-265_en.htm
[2] http://europa.eu/rapid/press-release_MEMO-13-774_en.htm