Anti Money Laundering (AML) – Industry Insights & Reference Architectures…

This blog has from time to time discussed issues around the defensive portion of the financial services industry (Banking, Payment Processing, Insurance etc.). Anti Money Laundering (AML) is a critical area where institutions need to protect themselves and their customers from malicious activity. This post summarizes eight key blogs on the topic of AML published at VamsiTalksTech.com. It aims to serve as a handy guide for business and technology audiences tasked with implementing complex AML projects.

Image Credit – FIBA Anti-Money Laundering Compliance Conference

Introduction

Money laundering has emerged as an umbrella crime which facilitates public corruption, drug trafficking, tax evasion, terrorism financing etc. Banks and other financial institutions are expected to conduct business in a manner that protects their countries of operation and consumers from security risks such as laundering, terrorist financing, and corruption (the ML/TF risks). Given the global reach of financial products, a variety of regulatory authorities are concerned about money laundering. Technology has become key to meeting regulatory expectations as well as reducing costs in these onerous programs. As the below graphic from PwC [1] demonstrates, this is one of the most pressing issues facing the financial services industry.

The above infographic from PwC provides a handy visual guide to the state of global AML programs.

The Six Critical Gaps in Global AML Programs…

From an industry standpoint, the highest priority issues that are being pointed out by regulators include the following –

  1. Institutions failing to develop AML frameworks that are unique to the risks they run given their product and geographic mix
  2. Failure to develop real-time insights into business transactions and to assign them elevated risk based on their elements
  3. Failure to develop AML models that draw from the widest possible sources of data – both internal and external – to form a true picture of the business
  4. Failure to demonstrate a consistent approach across geographies
  5. Failure to leverage the latest developments in analytics, including Machine Learning, to enable the automation of AML programs
  6. Lack of appropriate business governance & change management in setting, monitoring and managing AML compliance programs, policies and procedures

With this background in mind, the complete list of AML blogs on VamsiTalksTech is included below.

# 1 – Why Banks should Digitize their Operations and how this will help their AML programs –

Digitization implies a mix of business models predicated on agile systems, rapid & iterative development and, more importantly, a Data First strategy. These have significant impacts on AML programs in addition to helping increase market share.

A Digital Bank is a Data Centric Bank..

# 2 – Why Data Silos are a huge challenge in many cross organization projects such as AML –

Organizational Data Silos inhibit the effectiveness of AML programs as compliance officers cannot gain a single view of a customer, a single view of a suspicious transaction, or a view of the social graph in critical areas such as trade finance. This blog discusses the Silo anti-pattern and ways to keep silos from proliferating.

Why Data Silos Are Your Biggest Source of Technical Debt..

# 3 – The Major Workstreams Around AML Programs

The headline is self-explanatory but we discuss the five major work streams on global AML projects – Customer Due Diligence, Entity Analysis, Downstream Analytics, Ongoing Monitoring and Investigation Lifecycle.

Deter Financial Crime by Creating an Effective Anti Money Laundering (AML) Program…(1/2)

# 4 – Predictive Analytics Across the AML workstreams –

Here we examine how Predictive Analytics can be applied across all of the five work streams.

How Big Data & Predictive Analytics transform AML Compliance in Banking & Payments..(2/2)

# 5 – The Business Need for Big Data in AML programs  –

This post discusses the most important developments in building AML systems using Big Data Technology –

Building AML Regulatory Platforms For The Big Data Era

# 6 – A Detailed Look at how Enterprises can use Big Data and Advanced Analytics to reduce AML costs –

How to leverage Big Data and Advanced Analytics to detect a range of suspicious transactions and actors.

Big Data – Banking’s New Weapon In War Against Financial Crime..(1/2)

# 7 – Reference Architecture for AML  –

We discuss a Big Data enabled Reference Architecture of an enterprise-wide AML program.

Big Data – Banking’s New Weapon In War Against Financial Crime..(2/2)

Conclusion

According to Pricewaterhouse Coopers, estimates of global money laundering flows were between 2% and 5% of global GDP [1] in 2016 – however, only 1% of these transactions were caught. Certainly, the global financial industry has a long way to go before it effectively stops these nefarious actors, but there should be no mistaking that technology is a huge part of the answer.

References –

  1. Pricewaterhouse Coopers 2016 AML Survey  http://www.pwc.com/gx/en/services/advisory/forensics/economic-crime-survey/anti-money-laundering.html

How Big Data & Predictive Analytics transform AML Compliance in Banking & Payments..(2/2)

The first blog in this two part series (Deter Financial Crime by Creating an effective AML Program) described how Money Laundering (ML) activities employed by nefarious actors (e.g. drug cartels, corrupt public figures & terrorist organizations) have gotten more sophisticated over the years. Global and Regional Banks are falling short of their compliance goals despite huge technology and process investments. Banks that fail to maintain effective compliance are typically fined hundreds of millions of dollars. In this second & final post, we will examine why Big Data Analytics, as a second generation effort, can become critical to shutting down the flow of illicit funds across the globe, thus ensuring financial organizations are compliant with efforts to reduce money laundering.

Where current enterprise-wide AML programs fall short..

As discussed in various posts and in the first blog in the series (below), the Money Laundering (ML) rings of today are highly sophisticated in their understanding of the business specifics across the domains of Banking  – Capital Markets, Retail & Commercial banking. They are also very well versed in the complex rules that govern global trade finance.

Deter Financial Crime by Creating an Effective Anti Money Laundering (AML) Program…(1/2)

Further, the more complex and geographically diverse a financial institution is, the higher its risk of AML (Anti Money Laundering) compliance violations. Other factors, such as an enormous volume of transactions across multiple distribution channels and geographies between thousands of counter-parties, always increase money laundering risk.

Most current AML programs fall short in five specific areas –

  1. Manual Data Collection & Risk Scoring – Banks’ response to AML statutes has been to bring in more staff, typically in the hundreds at large banks. These staff perform rote but key processes in AML such as Customer Due Diligence (CDD) and Know Your Customer (KYC). They extensively scour external sources like Lexis Nexis, Thomson Reuters, D&B etc. to manually score risky client entities, often pairing these with internal bank data. They also use AML watch-lists to verify individuals and business customers so that AML Case Managers can review the results before filing Suspicious Activity Reports (SAR). On average, about 50% of the cost of AML programs is incurred in terms of the large headcount requirements. At large Global Banks where the number of customer accounts exceeds 100 million, the data volumes can get very big very quickly, causing all kinds of headaches for AML programs from a data aggregation, storage, processing and accuracy standpoint. There is a crying need to automate AML programs end to end, not only to perform accurate risk scoring but also to keep costs down.
  2. Social Graph Analysis in areas such as Trade finance helps model the complex transactions occurring between thousands of entities. Each of these entities may have a complex holding structure with accounts that have been created using forged documents. Most fraud also happens in networks. An inability to dynamically understand the topology of the financial relationships among thousands of entities means that AML programs need to develop graph based analysis capabilities.
  3. AML programs extensively deploy rule based systems or Transaction Monitoring Systems (TMS) which allow an expert system based approach to set up new rules. These rules span areas like monetary thresholds, specific patterns that connote money laundering & also business scenarios that may violate these patterns. However, fraudster rings now learn (or know) these rules quickly & change their fraudulent methods constantly to avoid detection. Thus there is a significant need to reduce the high degree of dependence on traditional TMS – which are slow to adapt to the dynamic nature of money laundering.
  4. The need to perform extensive Behavioral modeling & Customer Segmentation to discover transaction behavior, with a view to identifying behavioral patterns of entities & outlier behaviors that connote potential laundering.
  5. Real time transaction monitoring in areas like Payment Cards presents unique challenges where money laundering is hidden within mountains of transaction data. Every piece of data produced as a result of bank operations needs to be commingled with historical data sets (for customers under suspicion) spanning years in making a judgment call about filing a SAR (Suspicious Activity Report).
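
To make the limitation of static rules (point 3 above) concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the 10,000 threshold, the three day window and the field names are hypothetical, and real Transaction Monitoring Systems are configured quite differently. It shows how a per-transaction threshold rule misses deposits that have been split up to stay under the limit, and how a rule that aggregates per account over a rolling window picks them up.

```python
from collections import defaultdict
from datetime import date, timedelta

THRESHOLD = 10_000          # hypothetical per-transaction reporting threshold
WINDOW = timedelta(days=3)  # hypothetical rolling aggregation window


def flag_single(transactions):
    """Static rule: flag any single transaction at or above the threshold."""
    return [t for t in transactions if t["amount"] >= THRESHOLD]


def flag_structuring(transactions):
    """Aggregate rule: flag accounts whose deposits inside a rolling window
    add up to the threshold, even when each deposit is below it."""
    by_account = defaultdict(list)
    for t in transactions:
        by_account[t["account"]].append(t)

    flagged = set()
    for account, txns in by_account.items():
        txns.sort(key=lambda t: t["date"])
        start = 0
        running = 0
        for t in txns:
            running += t["amount"]
            # Drop deposits that have fallen out of the rolling window.
            while t["date"] - txns[start]["date"] > WINDOW:
                running -= txns[start]["amount"]
                start += 1
            if running >= THRESHOLD:
                flagged.add(account)
                break
    return flagged


# Made-up sample data: account A splits a large sum across two days.
transactions = [
    {"account": "A", "date": date(2016, 5, 1), "amount": 9_500},
    {"account": "A", "date": date(2016, 5, 2), "amount": 9_000},
    {"account": "B", "date": date(2016, 5, 1), "amount": 12_000},
    {"account": "C", "date": date(2016, 5, 1), "amount": 4_000},
    {"account": "C", "date": date(2016, 5, 20), "amount": 4_500},
]

single_hits = flag_single(transactions)
structuring_hits = flag_structuring(transactions)
```

With this sample data the static rule only catches account B, while the aggregate rule also catches account A. Of course, a launderer who learns the window length can simply spread deposits further apart, which is exactly why a fixed rule set on its own is not enough.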

How Big Data & Predictive Analytics can help across all these areas..

Illustration: AML predictive analytics

  1. The first area where Big Data & Predictive Analytics have a massive impact is around Due Diligence and KYC (Know Your Customer) data. All of the above discussed data scraping from various sources can be easily automated by using tools in a Big Data stack to ingest information automatically. This is done by sending requests to data providers (the exact same ones that Banking institutions are currently using) via an API. Once this data is obtained, banks can use real time processing tools (such as Apache Storm and Apache Spark) to apply sophisticated algorithms that transform the collected data into a Risk Score or Rating. In Trade Finance, Text Analytics can be used to process a range of documents like invoices, bills of lading, certificates of shipping etc. to enable Banks to inspect a complex process across hundreds of entities operating across countries. This approach enables Banks to process massive amounts of diverse data in quick time (even seconds) and synthesize it into accurate risk scores. Implementing Big Data in this very important workstream can help increase efficiency and reduce costs.
  2. The second area where Big Data shines is in helping create a Single View of a Customer as depicted below. This is made possible by doing advanced entity matching with the establishment and adoption of a lightweight entity ID service. This service will consist of entity assignment and batch reconciliation. The goal here is to get each business system to propagate the Entity ID back into its Core Banking, loan and payment systems; transaction data will then flow into the lake with this ID attached, providing a way to do Customer 360. (Illustration: single view of the customer)
  3. To be clear, we are advocating for a mix of both business rules and Data Science. Machine Learning is recommended as it enables a range of business analytics across AML programs, overcoming the limitations of a TMS. The first use case for Data Science is this: give me all transactions in one place, give me all the Case Mgmt files in one place, give me all of the customer data in one place and give me all External data (TBD) in one place. The reason I want all of this is to perform exploratory, hypothesis-driven Data Science with the goal of uncovering areas of risk that were possibly missed before, finding areas that are not as risky as previously thought so that their risk score can be lowered, and constantly establishing the real Risk profile that your institution bears. E.g. downgrading investment in your Trade financing as you find a lot of Scrap Metal based fraudulent transactions.
  4. The other important value driver in deploying Data Science is to perform Advanced Transaction Monitoring Intelligence. The core idea is to get years’ worth of Banking data in one location (the datalake) & then apply unsupervised learning to glean patterns in those transactions. The goal is then to identify profiles of actors with the intent of feeding them into downstream surveillance & TM systems. This knowledge can then be used to –
  • Constantly learn transaction behavior for similar customers, which is very important in detecting laundering in areas like payment cards. It is very common to have retail businesses set up with the sole purpose of laundering money.
  • Discover transaction activity of trade finance customers with similar traits (types of businesses, nature of transfers, areas of operations etc.)
  • Segment customers by similar transaction behaviors
  • Understand common money laundering typologies and identify specific risks from a temporal and spatial/geographic standpoint
  • Improve and learn correlations between alert accuracy and suspicious activity report (SAR) filings
  • Keep the noise level down by weeding out false positives
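
As a rough illustration of comparing a customer against similar customers, the following Python sketch flags businesses whose monthly volume is far from the median of their peer segment. It is a deliberately simple stand-in, not the unsupervised learning described above: the segment labels are given rather than learned, and the field names, the sample data and the cutoff of 5 are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def peer_outliers(customers, cutoff=5.0):
    """Return ids of customers whose monthly volume is far from the median
    of their segment, measured in units of the segment's median absolute
    deviation (MAD)."""
    segments = defaultdict(list)
    for c in customers:
        segments[c["segment"]].append(c)

    outliers = []
    for members in segments.values():
        volumes = [c["monthly_volume"] for c in members]
        mid = median(volumes)
        mad = median(abs(v - mid) for v in volumes)
        if mad == 0:
            continue  # no spread in this segment, nothing to compare against
        for c in members:
            if abs(c["monthly_volume"] - mid) / mad > cutoff:
                outliers.append(c["id"])
    return outliers


# Made-up sample data for two peer segments.
customers = [
    {"id": "L1", "segment": "laundromat", "monthly_volume": 18_000},
    {"id": "L2", "segment": "laundromat", "monthly_volume": 20_000},
    {"id": "L3", "segment": "laundromat", "monthly_volume": 22_000},
    {"id": "L4", "segment": "laundromat", "monthly_volume": 21_000},
    {"id": "L5", "segment": "laundromat", "monthly_volume": 250_000},
    {"id": "G1", "segment": "grocery", "monthly_volume": 90_000},
    {"id": "G2", "segment": "grocery", "monthly_volume": 100_000},
    {"id": "G3", "segment": "grocery", "monthly_volume": 110_000},
    {"id": "G4", "segment": "grocery", "monthly_volume": 95_000},
]

flagged = peer_outliers(customers)
```

Here only the laundromat with a 250,000 monthly volume is flagged. The point of judging each customer against its own peer group is that a volume which is routine for one kind of business can be a strong signal for another, which also helps keep false positives down.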

Benefits of a forward looking approach..  

We believe that we have a fresh approach that can help Banks with the following value drivers & metrics –

  • Detect AML violations on a proactive basis thus reducing the probability of massive fines
  • Save on staffing expenses for Customer Due Diligence (CDD)
  • Increase accurate production of suspicious activity reports (SAR)
  • Decrease the percent of corporate customers with AML-related account closures in the past year by customer risk level and reason – thus reducing loss of revenue
  • Decrease the overall KYC profile update backlog across geographies
  • Help create Customer 360 views that can help accelerate CLV (Customer Lifetime Value) as well as Customer Segmentation from a cross-sell/up-sell perspective

Big Data shines in all the above areas..

Conclusion…

The AML landscape will rapidly change over the next few years to accommodate the business requirements highlighted above. Regulatory authorities should also lead the way in adopting a Hadoop/ML/Predictive Analytics based approach over the next few years. There is no other way to tackle large & medium AML programs in a lower cost and highly automated manner.

Deter Financial Crime by Creating an Effective Anti Money Laundering (AML) Program…(1/2)

THE AML CHALLENGE CONTINUES UNABATED…

As this blog has repeatedly catalogued over the last year here[1], here[2] and here[3], Money Laundering is a massive global headache and one of the biggest crimes against humanity. Not a month goes by when we do not hear of billions of dollars in ill-gotten funds being stolen from developing economies via corruption as well as from the proceeds of nefarious activity, whether it is the Panama papers or banks unwittingly helping drug cartels launder money.

I have seen annual estimates of global money laundering flows ranging anywhere from $1 trillion to $2 trillion – almost 5% of global GDP. Almost all of this is laundered via Retail & Merchant Banks, Payment Networks, Securities & Futures firms, Casino Services & Clubs etc. – which explains why annual AML related fines on Banking organizations run into the billions and are increasing every year. However, the number of SARs (Suspicious Activity Reports) filed by banking institutions is much higher than the numbers filed by these other businesses.

The definition of Financial Crimes is fairly broad & encompasses traditional money laundering activity, financial fraud like identity theft/check fraud/wire fraud, terrorist activity, tax evasion, securities market manipulation, insider trading and other kinds of securities fraud. Financial institutions across the spectrum of the market now need to comply with the regulatory mandate at both the global as well as the local market level.

What makes AML such a hard subject for Global Banks which should be innovating quite easily?

The issues which bedevil smooth AML programs include –

  • the complex nature of banking across retail, commercial, wealth management & capital markets; global banks now derive around 40% of revenue from non-traditional markets (i.e. outside North America & Western Europe)
  • the scale of customer activity ranging from 5 to 50 million at the large global banks
  • patchwork of local regulations, risk and compliance reporting requirements. E.g. Stringent compliance requirements in the US & UK but softer requirements elsewhere
  • tens of distribution channels
  • growing volumes of transactions causing requirements for complex analytics
  • the need to constantly integrate third-party lists of politically exposed persons (PEPs) using manual means
  • technology, which, while ensuring the availability of banking services to millions of underserved populations, also makes it easy for launderers to conduct & mask their activities

The challenges are hard but the costs of non-compliance are severe. Banks have been fined billions of dollars, compliance officers face potential liability & institutional reputation takes a massive hit. Supra national authorities like the United Nations (UN) and the European Union (EU) can also impose sanctions when they perceive that AML violations threaten human rights & the rule of law.

TECHNOLOGY IS THE ANSWER…

Many Banks have already put in rules, policies & procedures to detect AML violations and have also invested in substantial teams staffed by money laundering risk officers (MLRO) & headed by compliance officers. These rules work on thresholds and patterns, flagging transactions that breach such criteria. The issue with this is that the money launderers themselves are as capable as statisticians and constantly devise new ways to hide their tracks.

The various elements that make up the risk to banks and financial institutions and the technology they use to detect these can be broken down into five main areas & work streams as shown below.

Illustration: The Five Workstreams of AML programs

  1. Customer Due Diligence  – this involves gathering information from the client as well as on-boarding data from external sources to verify these details and to establish a proper KYC (Know Your Customer) program.
  2. Entity Analysis – identifying relationships between institutional clients as well as retail clients to understand the true social graph. Bank compliance officers now have gone beyond KYC (Know Your Customer) to know their customer’s customer, or KYCC.[4]
  3. Downstream Analytics – detecting advanced patterns of behavior among clients & the inter-web of transactions with a view to detecting hidden patterns of money laundering. This also involves assessing client risk at specific points in the banking lifecycle, such as account opening or transactions above a certain monetary value. These data points could signal potentially illegitimate activity based on any number of features associated with such transactions. Any transaction could also lead to the filing of a suspicious activity report (SAR).
  4. Ongoing Monitoring  – Help aggregate such customer transactions across multiple geographies for pattern detection and reporting purposes. This involves creating a corporate taxonomy of rules that capture a natural language description of the conditions, patterns denoting various types of financial crimes – terrorist financing, mafia laundering, drug trafficking, identity theft etc.
  5. SAR Investigation Lifecycle – These rules trigger downstream workflows to allow human investigation on such transactions
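
The Entity Analysis workstream (knowing your customer’s customer) lends itself to a small worked example. The Python sketch below is a minimal illustration under assumptions of my own: the entity names, the transfer list and the watchlist are made up, and a production system would use a graph database or a distributed graph engine rather than an in-memory dictionary. It builds an undirected graph from observed transfers and walks it breadth-first to find every entity a client is connected to, directly or through intermediaries.

```python
from collections import defaultdict, deque


def build_graph(transfers):
    """Build an undirected relationship graph from (sender, receiver) pairs."""
    graph = defaultdict(set)
    for sender, receiver in transfers:
        graph[sender].add(receiver)
        graph[receiver].add(sender)
    return graph


def connected_entities(graph, start):
    """Breadth-first search returning every entity reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical transfers between made-up entities.
transfers = [
    ("Acme Trading", "Shell Co 1"),
    ("Shell Co 1", "Shell Co 2"),
    ("Shell Co 2", "Watchlisted Ltd"),
    ("Corner Bakery", "Flour Supplier"),
]
watchlist = {"Watchlisted Ltd"}  # hypothetical watchlist

graph = build_graph(transfers)
acme_exposure = connected_entities(graph, "Acme Trading") & watchlist
bakery_exposure = connected_entities(graph, "Corner Bakery") & watchlist
```

In this toy data, Acme Trading never deals with the watchlisted entity directly, yet the graph walk surfaces the link through two intermediaries. The bakery’s network has no such exposure. A check that only looks at a client’s direct counterparties would miss the first case entirely.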

QUANTIFIABLE BENEFITS FROM DOING IT WELL…

Financial institutions that leverage new Age technology (Big Data, Predictive Analytics, Workflow) in these five areas will be able to effectively analyze financial data and deter potential money launderers before they are able to proceed, providing the institution with protection in the form of full compliance with the regulations.

The business benefits include –

  • Detect AML violations on a proactive basis thus reducing the probability of massive fines 
  • Save on staffing expenses for Customer Due Diligence (CDD)
  • Increase accurate production of suspicious activity reports (SAR)
  • Decrease the percent of corporate customers with AML-related account closures in the past year by customer risk level and reason – thus reducing loss of revenue
  • Decrease the overall KYC profile backlog across geographies
  • Help create Customer 360 views that can help accelerate CLV (Customer Lifetime Value) as well as Customer Segmentation from a cross-sell/up-sell perspective

CONCLUSION…

Virtually every leading banking institution, securities firm, payment provider understands that they need to enhance their AML capabilities by a few notches and also need to constantly evolve them as fraud itself morphs.

The question is whether they can form a true picture of their clients (both retail and institutional) on a real time basis, monitor every banking interaction while understanding its true context when merged with historical data, and detect unusual behavior. Further, creating systems that learn from these patterns truly helps minimize money laundering.

The next and final post in this two part series will examine how Big Data & Analytics help with each of the work streams discussed above.

REFERENCES…

[1] Building AML Regulatory Platforms for the Big Data Era – http://www.vamsitalkstech.com/?p=5

[2] Big Data – Banking’s New Weapon Against Financial Crime – http://www.vamsitalkstech.com/?p=806

[3] Reference Architecture for AML – http://www.vamsitalkstech.com/?p=833

[4] WSJ – Know Your Customer’s Customer is the New Norm – http://blogs.wsj.com/riskandcompliance/2014/10/02/the-morning-risk-report-know-your-customers-customer-is-new-norm/

Global Retail Banking Needs a Digital Makeover

“If you don’t like change, you will like irrelevance even less.” – General Eric Shinseki, Former US Secretary of Veterans Affairs

This blog has spent time documenting the ongoing digital disruption across the industry especially financial services. Is there proof that creative destruction is taking a hold in Banking? The answer is a clear & unequivocal “Yes”. Clearly, Retail Banking is undergoing a massive makeover. This is being driven by many factors – changing consumer preferences, the advent of technology, automation of business processes & finally competition from not just the traditional players but also the Fintechs. The first casualty of this change is the good old Bank Branch. This post looks at the business background of Retail Banking across the world & will try to explain my view on what is causing this shift in how Banks and consumers perceive financial services.

This blog post will be one of a series of five standalone posts on Retail Bank transformation. The intention for the first post is to discuss industry dynamics and the current state of competition, and to briefly introduce the forces causing a change in the status quo. The second post will categorize FinTechs across the banking landscape with key examples of how they disintermediate established players. The remaining posts will examine each of the other forces (Customer  in more detail along with specific and granular advice to retail banks on how to incorporate innovation into their existing technology, processes and organizational culture.

Introduction – 

Retail Banking is perhaps one of the most familiar and regular services that everyday citizens use in the course of their lives. Money is a commodity we touch every day in our lives when we bank, shop, pay bills, borrow etc. Retail lines of banking typically include personal accounts, credit cards, mortgages and auto loans. 

For large financial conglomerates that have operations spanning Commercial Banking, Capital Markets, Wealth & Asset Management etc., retail operations have always represented an invaluable source of both stability and balance sheet strength. The sheer size & economic exposure of retail operations ensures that they are not only staid and stable but also somewhat insulated from economic shocks. This is borne out by the policies of respective national central banks & treasury departments. Indeed, one of the main reasons regulators have bailed out banks in the past is the perception that Main Street & the common citizen’s banking assets could become a casualty of increased risk taking by traders in the capital markets divisions. This scenario famously played out at the onset of the Great Depression in the late 1920s and was a major factor in causing widespread economic contagion. A stock market crash quickly cascaded into a nation-wide economic depression.

Thus, retail banking is crucial not just to the owning corporation but also to diverse stakeholders in the world economy – deposit holders, the regulators (led by the Federal Reserve in the US) & a host of other actors.

The State of Global Retail Banking – 

In the financial crisis of 2008, retail banks not only held their own but also assumed a bigger share of revenues as the recovery got underway in the following years. According to a survey by Boston Consulting Group (BCG), retail banking activities accounted for 55 percent of the revenues generated across a global cohort of 140 banks, up from 45 percent in 2006.[1] 

However, the report also contends that retail revenues since 2008 have been slowly falling as investors have begun shifting their savings to deposits as a reaction to high profile financial scandals, thus putting pressure on margins. Higher savings rates have helped offset this somewhat & retail banks ended up maintaining better cost-to-income ratios (CIR) than did other areas of banking. Retail banks also performed better on a key metric, return on assets (ROA). The below graphic from the BCG captures this metric. In the Americas region, the average ROA was 162 percent higher than the average group ROA in 2008. From 2001 through 2006, it was 51 percent higher. Global banking revenues stood at $1.59 trillion in 2015 – a figure that is expected to hold relatively steady across the globe. [2]
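
For readers less familiar with the two metrics, here is a small Python sketch of how they are computed and what a statement like “162 percent higher” means arithmetically. The formulas are the standard definitions (CIR is operating costs divided by operating income, ROA is net income divided by total assets). All of the input figures are hypothetical and chosen only to reproduce the size of the gap quoted above; they are not taken from the BCG report.

```python
def cost_to_income_ratio(operating_costs, operating_income):
    """CIR: share of operating income consumed by operating costs."""
    return operating_costs / operating_income


def return_on_assets(net_income, total_assets):
    """ROA: net income generated per unit of assets held."""
    return net_income / total_assets


def percent_higher(value, baseline):
    """How much higher value is than baseline, as a percentage of baseline."""
    return (value - baseline) / baseline * 100


# Hypothetical figures (in billions) for a retail division and its parent group.
retail_cir = cost_to_income_ratio(operating_costs=55.0, operating_income=100.0)
retail_roa = return_on_assets(net_income=13.1, total_assets=1_000.0)
group_roa = return_on_assets(net_income=5.0, total_assets=1_000.0)

roa_gap = percent_higher(retail_roa, group_roa)
print(f"Retail CIR: {retail_cir:.0%}")
print(f"Retail ROA is {roa_gap:.0f} percent higher than group ROA")
```

With these made-up inputs, a retail ROA of 1.31% against a group ROA of 0.5% works out to a gap of about 162 percent, i.e. the retail figure is a little over two and a half times the group figure.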

It is also important to note that global performance of retail banks across the five major regions: the Americas, Europe, the Middle East, Asia, and Australia has generally varied based on a multitude of factors. And even within regions, banking performance has varied widely.[2]

Illustration 1 – Retail Banking is profitable and stable

As stable as this sector seems, it is also being roiled by four main forces that are causing every major player to rethink their business strategy. Left unaddressed, these changes will have huge negative impacts on competitive viability & profitability, and will also impact all-important growth over the next five years.

What is the proof that retail banking is beginning to change? The below graphic from CNN [1] says it all –

Illustration: Bank of America branches (CNN)

Bank of America has 23% fewer branches and 37% fewer employees than in 2009. That downward trend across both metrics is expected to continue as online transactions (from deposits to checks to online loans) have grown by a staggering 94%. The bank is expected to cut more positions, reflecting a shrinking headcount and branch footprint[1].

Pressure from the FinTechs:

The Financial Services and the Insurance industry are facing an unprecedented amount of change driven by factors like changing client preferences and the emergence of new technology: the Internet, mobility, social media, etc. These changes are immensely profound, especially with the arrival of “FinTech”, technology-driven applications that are upending long-standing business models across all sectors from retail banking to wealth management & capital markets. Further, members of a major new segment, Millennials, increasingly use mobile devices, demand more contextual services and expect a seamless unified banking experience, something akin to what they experience on web properties like Facebook, Amazon, Uber, Google or Yahoo. FinTechs cater to these expectations, expanding their wallet share of client revenues by offering contextual products tailored to individual client profiles. Their savvy use of segmentation data and predictive analytics enables the delivery of bundles of tailored products across multiple delivery channels (web, mobile, call center banking, point of sale, ATM/kiosk etc.).

Retail Banking must trend Digital to respond – 

The definition of Digital is somewhat nebulous, so I would like to define the key areas where its impact and capabilities will need to be felt for this gradual transformation to occur.

A true Digital Bank needs to –

  • Offer a seamless customer experience much like the one provided by the likes of Facebook & Amazon, i.e. highly interactive & intelligent applications that can detect a single customer’s journey across multiple channels
  • Offer data driven interactive services and products that can detect customer preferences on the fly, match them with existing history and provide value added services – services that not only provide a better experience but also foster a longer term customer relationship
  • Help the business prototype, test, refine and rapidly develop new business capabilities
  • Above all, treat Digital as a Constant Capability and not as an ‘off the shelf’ product or a one off way of doing things

The five areas that established banks need to change across are depicted below..

Illustration: Retail bank value drivers

  1. Convert branches to be advisory & relationship focused instead of centers for transactions – As the number of millennials keeps growing, the actual traffic to branches will only continue to decline.  Branches still have an area of strength in being intimate customer touch points. The branch of the future can be redesigned to have more self service features along with relationship focused advisory personnel instead of purely being staffed by tellers and managers. They need to be reimagined as Digital Centers, not unlike an Apple store, with highly interactive touch screens and personnel focused on building business through high margin products.
  2. Adopt a FinTech like mindset – FinTechs (or new Age financial industry startups) offer enhanced customer experiences built on product innovation and agile business models. They do so by expanding their wallet share of client revenues by offering contextual products tailored to individual client profiles. Their savvy use of segmentation data and predictive analytics enables the delivery of bundles of tailored products across multiple delivery channels (web, mobile, Point Of Sale etc.). Like banks, these technologies support multiple modes of payments at scale, but they aren’t bound by the same regulatory and compliance requirements as banks, which operate under a mandate that they must demonstrate that they understand their risk profiles. The best retail banks will not only seek to learn from, but sometimes partner with, emerging fintech players to integrate new digital solutions and deliver exceptional customer experience. To cooperate with and take advantage of fintechs, banks will require new partnering capabilities. To heighten their understanding of customers’ needs and to deliver products and services that customers truly value, banks will need new capabilities in data management and analytics.
  3. Understand your customer – Banks need to move to a predominantly online model, providing consumers with highly interactive, engaging and contextual experiences that span multiple channels—branch banking, eBanking, POS, ATM, etc. Further goals are increased profitability per customer for both micro and macro customer populations with the ultimate goal of increasing customer lifetime value (CLV).
  4. Business Process improvement – Drive Automation across lines of business  – Financial services are fertile ground for business process automation, since most banks across their various lines of business are simply a collection of core and differentiated processes. Examples of these processes are consumer banking (with processes including on boarding customers, collecting deposits, conducting business via multiple channels, and compliance with regulatory mandates such as KYC and AML); investment banking (including straight-through-processing, trading platforms, prime brokerage, and compliance with regulation); payment services; and wealth management (including modeling model portfolio positions and providing complete transparency across the end-to-end life cycle). The key takeaway is that driving automation can result not just in better business visibility and accountability on behalf of various actors. It can also drive revenue and contribute significantly to the bottom line. Automation enables enterprise business and IT users to document, simulate, manage, automate and monitor business processes and policies. It is designed to empower business and IT users to collaborate more effectively, so business applications can be changed more easily and quickly.
  5. Agile Culture – All of the above are only possible if the entire organization operates on an agile basis in order to collaborate across the value chain. Cross functional teams across new product development, customer acquisition & retention, IT Ops, legal & compliance must collaborate in short work cycles to close the traditional business & IT innovation gap.  One of DevOps’s chief goals is to close the long-standing gap between the engineers who develop and test IT capability and the organizations that are responsible for deploying and maintaining IT operations. Using traditional app dev methodologies, it can take months to design, test and deploy software. No business today has that much time—especially in the age of IT consumerization and end users accustomed to smart phone apps that are updated daily. The focus now is on rapidly developing business applications to stay ahead of competitors that can better harness Big Data’s amazing business capabilities.
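Item 3 above names customer lifetime value (CLV) as the ultimate goal. As a rough illustration, here is a minimal Python sketch of one common simplified CLV formula, which assumes a constant annual margin, a constant retention rate and a constant discount rate. The choice of formula, the function name and the sample numbers are assumptions made for illustration; they are not taken from the BCG research cited in this post.

```python
def customer_lifetime_value(annual_margin, retention_rate, discount_rate):
    """Simplified CLV = margin * r / (1 + d - r).

    Assumes a constant annual margin, a constant retention rate r and a
    constant discount rate d. Real CLV models are usually far richer.
    """
    if not 0 <= retention_rate <= 1:
        raise ValueError("retention_rate must be between 0 and 1")
    if discount_rate < 0:
        raise ValueError("discount_rate must not be negative")
    denominator = 1 + discount_rate - retention_rate
    if denominator <= 0:
        raise ValueError("formula is undefined when r = 1 and d = 0")
    return annual_margin * retention_rate / denominator


# Hypothetical customer: 200 of annual margin, 10% discount rate.
baseline_clv = customer_lifetime_value(200.0, retention_rate=0.80, discount_rate=0.10)
improved_clv = customer_lifetime_value(200.0, retention_rate=0.90, discount_rate=0.10)
print(round(baseline_clv, 2), round(improved_clv, 2))
```

Under these assumed inputs, lifting retention from 80% to 90% raises the simplified CLV from roughly 533 to roughly 900, which is why retention focused digital experiences tend to feature so heavily in the business case.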

How can all of this be quantified? –

The results of BCG’s sixth annual Global Retail-Banking Excellence benchmarking illustrate the value drivers. Forward looking banks that are working on some of the above aspects are able to reduce cycle times for core processes, thus improving productivity. The leaders in the survey are also reallocating resources from the middle and back office to customer facing roles.[3]

Again, according to BCG, digital reinvention comes with huge benefits to both the top and bottom lines. Their annual survey across the global retail banking sector estimates an average reduction in operating expenses of 15% to 25%, an increase in pretax profit of 20% to 30% and an average increase in margins before tax of 5% to 10%. [3] These numbers are highly impressive at the scale at which large banks operate.
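To get a feel for what those percentage ranges mean in absolute terms, here is a small Python sketch. The bank and its baseline figures are entirely hypothetical (they do not come from the BCG survey); only the percentage ranges are taken from the paragraph above.

```python
def impact_range(baseline, low_pct, high_pct):
    """Return the (low, high) absolute change for a percentage range."""
    return baseline * low_pct / 100, baseline * high_pct / 100


# Hypothetical bank, figures in millions; not taken from the BCG survey.
operating_expenses = 10_000
pretax_profit = 3_000

opex_savings = impact_range(operating_expenses, 15, 25)
profit_uplift = impact_range(pretax_profit, 20, 30)
print("Opex savings (millions):", opex_savings)
print("Pretax profit uplift (millions):", profit_uplift)
```

For this made-up bank the savings would land between 1,500 and 2,500 million and the profit uplift between 600 and 900 million.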

The question thus is, can the vast majority of Banks change before it’s too late? Can they find the right model of execution in the Digital Age before their roles are either diminished or dis-intermediated by competition?

We will dive deep into the FinTechs in the next post in the series.

References

[1] CNN Money – Bank of America has 23% fewer branches than 2009

[2] BCG Research – Winning Strategies Revisited for Retail Banking

[3] BCG Research – Global Capital Markets 2016: The Value Migration

Open Enterprise Hadoop – as secure as Fort Knox

Previous posts in this blog have discussed customers leveraging Open Source, Big Data and Hadoop related technologies for a range of use cases across industry verticals. We have seen how a Hadoop-powered “Data Lake” can not only provide a solid foundation for a new generation of applications that provide analytics and insight, but can also increase the number of access points to an organization’s data. As diverse types of both external and internal enterprise data are ingested into a central repository, the inherent security risks must be understood and addressed by a host of actors in the architecture. Security is thus essential for organizations that store and process sensitive data in the Hadoop ecosystem. Many organizations must adhere to strict corporate security policies as well as rigorous industry guidelines. So how does open source Hadoop stack up to demanding standards such as PCI-DSS?

We have, from time to time, noted the ongoing digital transformation across industry verticals. For instance, banking organizations are building digital platforms that aim to engage customers, partners and employees. Retailers & Banks now recognize that the key to winning the customer of the day is to offer a seamless experience across multiple channels of engagement. Healthcare providers want to offer their stakeholders – patients, doctors, nurses, suppliers etc – multiple avenues to access contextual data and services; the IoT (Internet of Things) domain is abuzz with the possibilities of Connected Car technology.

The aim of this blogpost is to dispel the notion, which floats around from time to time, that a Hadoop led 100% open source ecosystem is somehow insecure or unable to fit well into a corporate security model. On this point, the Open Source Alliance has noted that – “Open source enables anyone to examine software for security flaws. The continuous and broad peer-review enabled by publicly available source code improves security through the identification and elimination of defects that might otherwise be missed. Gartner for example, recommends the open source Apache Web server as a more secure alternative to closed source Internet Information servers. The availability of source code also facilitates in-depth security reviews and audits by government customers.” [2]

It is a well understood fact that data is the most important asset a business possesses and one that nefarious actors are usually after. Let us consider the retail industry – cardholder data such as card numbers or PANs (Primary Account Numbers) & other authentication data is much sought after by the criminal population.

The consequences of a data breach are myriad & severe and can include –

  • Revenue losses
  • Reputational losses
  • Regulatory sanction and fines etc

Previous blogposts have chronicled cybersecurity in some depth. Please refer to this post as a starting point for a somewhat exhaustive view of cybersecurity. This awareness has led to increased adoption of risk based security frameworks, e.g. ISO 27001, the US National Institute of Standards and Technology (NIST) Cybersecurity Framework and the SANS Critical Controls. These frameworks offer a common vocabulary and a set of guidelines that enable enterprises to identify and prioritize threats, quickly detect and mitigate risks and understand security gaps.

In the realm of payment card data, regulators, payment networks & issuer banks themselves recognize this and have enacted a compliance standard – the PCI DSS (Payment Card Industry Data Security Standard). PCI DSS is currently in its third generation incarnation or v3.0, which was introduced over the course of 2014. It is the most important standard for a host of actors – merchants, processors, payment service providers or really any entity that stores or uses payment card data. It is also important to note that the core process of compliance covers all applications and systems in a merchant or a payment service provider.

The PCI Security Standards Council specifies the following 12 requirements for PCI DSS, as depicted in the table below.

PCI_DSS_12_requirements_grande

Illustration: PCI Data Security Standard – high level overview (source: shopify.com)

While PCI covers a whole range of areas that touch payment data such as POS terminals, payment card readers, in store networks etc – data security is front & center.

It is to be noted though that, according to the standards body that oversees the creation of & guidance around PCI, a technology vendor or product cannot be declared as being “PCI Compliant.”

Thus, the standard has wide implications on two different dimensions –

1. The technology itself as it is incorporated at a merchant as well as

2. The organizational culture around information security policies.

My experience in working at both Hortonworks & Red Hat has shown me that open source is certified at hundreds of enterprise customers running demanding workloads in verticals such as financial services, retail, insurance, telecommunications & healthcare. The other important point to note is that these customers are all PCI, HIPAA and SOX compliant across the board.

It is a total misconception that off the shelf and proprietary point solutions are needed to provide broad coverage across the five pillars of security discussed below. Open enterprise Hadoop offers comprehensive and well rounded implementations across all five areas and, what is more, it is 100% open source.

Let us examine how security in Hadoop works.

The Security Model for Open Enterprise Hadoop – 

The Hadoop community has thus adopted both a top down and a bottom up approach to security, examining all potential access patterns across all components of the platform.

Hadoop and Big Data security needs to be considered across the below two prongs – 

  1. What do the individual projects themselves need to support to guarantee that business architectures built using them are highly robust from a security standpoint? 
  2. What are the essential pillars of security that the platform which makes up every enterprise cluster needs to support? 

Let us consider the first. The Apache Hadoop project contains 25+ technologies in the realm of data ingestion, processing & consumption. While anything beyond a cursory look is out of scope here, an exhaustive list of the security hooks provided in each of the major projects is covered here [1].

For instance, Apache Ranger manages fine-grained access control through a rich user interface that ensures consistent policy administration across Hadoop data access components. Security administrators have the flexibility to define security policies for a database, table and column, or a file, and can administer permissions for specific LDAP-based groups or individual users. Rules based on dynamic conditions such as time or geolocation, can also be added to an existing policy rule. The Ranger authorization model is highly pluggable and can be easily extended to any data source using a service-based definition.[1]

Administrators can use Ranger to define a centralized security policy for the following Hadoop components and the list is constantly enhanced:

  • HDFS
  • YARN
  • Hive
  • HBase
  • Storm
  • Knox
  • Solr
  • Kafka

Ranger works with standard authorization APIs in each Hadoop component, and is able to enforce centrally administered policies for any method used to access the data lake.[1]
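To make the idea of a centrally administered, fine-grained policy more concrete, here is a toy Python model of the kind of rule described above: a resource (database, table, column), the groups or users it applies to, a permission and an optional time-of-day condition. This is only a sketch of the concept. It is not Ranger's API or policy format, and every class, field and function name in it is a hypothetical one chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Optional, Set, Tuple


@dataclass
class Policy:
    """Toy access policy; NOT Ranger's real policy model."""
    database: str
    table: str
    columns: Set[str]                       # "*" means every column
    groups: Set[str] = field(default_factory=set)
    users: Set[str] = field(default_factory=set)
    permissions: Set[str] = field(default_factory=set)
    allowed_hours: Optional[Tuple[time, time]] = None  # dynamic condition

    def allows(self, user, user_groups, database, table, column, action, at):
        if (database, table) != (self.database, self.table):
            return False
        if "*" not in self.columns and column not in self.columns:
            return False
        if user not in self.users and not (self.groups & set(user_groups)):
            return False
        if action not in self.permissions:
            return False
        if self.allowed_hours is not None:
            start, end = self.allowed_hours
            if not (start <= at <= end):
                return False
        return True


def is_allowed(policies, **request):
    """Deny by default; allow only if some policy matches the request."""
    return any(policy.allows(**request) for policy in policies)


# Hypothetical policy: analysts may select two columns during office hours.
policies = [
    Policy(
        database="sales",
        table="transactions",
        columns={"amount", "merchant"},
        groups={"analysts"},
        permissions={"select"},
        allowed_hours=(time(8, 0), time(18, 0)),
    )
]

request = dict(
    user="alice", user_groups=["analysts"], database="sales",
    table="transactions", column="amount", action="select", at=time(10, 30),
)
print(is_allowed(policies, **request))
```

The point of the sketch is the shape of the decision: one central list of policies, evaluated the same way regardless of which component the request arrives through, with access denied unless a policy explicitly grants it.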

 Now the second & more important question from an overall platform perspective. 

There are five essential pillars from a security standpoint that address critical needs that security administrators place on data residing in a data lake. If any of these pillars is vulnerable from an implementation standpoint, it ends up creating risk built into the organization’s Big Data environment. Any Big Data security strategy must address all five pillars, with a consistent implementation approach to ensure their effectiveness.

Security_Pillars

                             Illustration: The Essential Components of Data Security

  1. Authentication – does the user possess appropriate credentials? This is implemented via the Kerberos authentication protocol & allied concepts such as Principals, Realms & KDC’s (Key Distribution Centers).
  2. Authorization – what resources is the user allowed to access based on business need & credentials? Implemented in each Hadoop project & integrated with an organization’s LDAP/AD directory.
  3. Perimeter Security – prevents unauthorized outside access to the cluster. Implemented via Apache Knox Gateway which extends the reach of Hadoop services to users outside of a Hadoop cluster. Knox also simplifies Hadoop security for users who access the cluster data and execute jobs.
  4. Centralized Auditing – implemented via Apache Atlas and its integration with Apache Ranger.
  5. Security Administration – deals with the central setup & control of all security information using a central console. This pillar uses Apache Ranger to provide centralized security administration and management. The Ranger Administration Portal is the central interface for security administration. Users can create and update policies, which are then stored in a policy database.
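Since the argument is that every one of the five pillars must be covered, a simple coverage check can be a useful design-time aid. The Python sketch below assumes a hypothetical inventory that maps each pillar to the components a cluster relies on for it; the pillar keys, the function name and the sample deployment are illustrative assumptions, not part of any Hadoop tooling.

```python
# The five pillars listed above, as short keys (names are illustrative).
PILLARS = (
    "authentication",
    "authorization",
    "perimeter",
    "audit",
    "administration",
)


def missing_pillars(deployment):
    """Return the pillars that have no component assigned to them."""
    return [pillar for pillar in PILLARS if not deployment.get(pillar)]


# Hypothetical cluster inventory in which auditing has been overlooked.
deployment = {
    "authentication": ["Kerberos"],
    "authorization": ["Apache Ranger"],
    "perimeter": ["Apache Knox"],
    "audit": [],
    "administration": ["Apache Ranger"],
}

gaps = missing_pillars(deployment)
print("Uncovered pillars:", gaps)
```

A check like this says nothing about how well a pillar is implemented; it only catches the case where one has been left out entirely.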

ranger_centralized_admin

                                           Illustration: Centralized Security Administration

It is also to be noted that as Hadoop adoption steadily grows, workloads that harness data for complex business analytics and decision-making may need more robust data-centric protection (namely data masking, encryption and tokenization). Thus, in addition to the above Hadoop projects such as Apache Ranger, enterprises can take an augmentative approach. Partner solutions that offer data centric protection for Hadoop data, such as Dataguise DgSecure for Hadoop, complement an enterprise ready Hadoop distribution (such as those from the open source leader Hortonworks) and are definitely worth a close look.
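To illustrate what data-centric protection means for card data, here is a minimal Python sketch of two of the techniques named above. The masking function shows only the first six and last four digits of a PAN, which is the most that PCI DSS permits to be displayed. The second function uses a keyed hash (HMAC-SHA256) as a simple stand-in for tokenization; production tokenization products typically rely on a token vault or format-preserving encryption instead. The function names and the sample key are assumptions for illustration and have nothing to do with how Dataguise or any other vendor implements these features.

```python
import hashlib
import hmac


def _digits_only(pan):
    """Strip spaces and dashes so only the card digits remain."""
    return "".join(ch for ch in pan if ch.isdigit())


def mask_pan(pan, visible_prefix=6, visible_suffix=4):
    """Mask a PAN, leaving only the leading and trailing digits visible."""
    digits = _digits_only(pan)
    if len(digits) < visible_prefix + visible_suffix:
        raise ValueError("PAN is too short to mask safely")
    hidden = len(digits) - visible_prefix - visible_suffix
    return digits[:visible_prefix] + "*" * hidden + digits[-visible_suffix:]


def pseudo_tokenize_pan(pan, secret_key):
    """Keyed-hash stand-in for tokenization (illustrative assumption).

    The same PAN and key always give the same token, so joins still work,
    but the PAN cannot be read back from the token.
    """
    digits = _digits_only(pan).encode("ascii")
    return hmac.new(secret_key, digits, hashlib.sha256).hexdigest()


# A well-known test card number, not a real account.
sample_pan = "4111 1111 1111 1111"
demo_key = b"demo-key-do-not-use-in-production"

masked = mask_pan(sample_pan)
token = pseudo_tokenize_pan(sample_pan, demo_key)
print(masked)
print(token)
```

Masked values are suitable for display, while tokens let analytics jobs join and count on card identity without the cleartext PAN ever landing in the data lake.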

Summary

While implementing Big Data architectures in support of business needs, security administrators should look to address coverage for components across each of the above areas as they design the infrastructure. A rigorous, bottom-up approach to data security makes it possible to enforce and manage security across the stack through a central point of administration, which will likely prevent any potential security gaps and inconsistencies. This approach is especially important for newer technology like Hadoop, where exciting new projects & data processing engines are always being incubated at a rapid clip. After all, the data lake is all about building a robust & highly secure platform on which data engines (Storm, Spark etc) and processing frameworks like MapReduce function to create business magic.

 References – 

[1] Hortonworks Data Security Guide
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_Security_Guide/

[2] Open Source Alliance of America
http://opensourceforamerica.org/learn-more/benefits-of-open-source-software/

The six megatrends helping enterprises derive massive value from Big Data..

“The world is one big data problem.” – Andrew McAfee, associate director of the Center for Digital Business at MIT Sloan

Though Data as a topic has been close to my heart, it was often a subject I would not deal much with given my preoccupation with applications, middleware, cloud computing & DevOps. However, I grabbed the chance to teach a Hadoop course in 2012 and it changed the way I looked at data – not merely as an enabler but as the true oil of business. Fast forward to 2016, and I have almost completed an amazing and enriching year at Hortonworks. It is a good time for retrospection about how Big Data is transforming businesses across the Fortune 500 landscape. Thus, I present what is not merely the ‘Art of the Possible’ but ‘Business Reality’ – distilled insights from a year of working with real world customers: companies pioneering Big Data in commercial applications to drive shareholder value & customer loyalty.

Six Megatrends

  Illustration – The Megatrends helping enterprises derive massive value from Big Data 

Presented below are the six megatrends that will continue to drive Big Data into enterprise business & IT architectures for the foreseeable future.

  1. The Internet of Anything (IoAT) – The rise of the machines has been well documented but enterprises have just begun waking up to the possibilities in 2016. The paradigm of harnessing IoT data by leveraging Big Data techniques has begun to gain industry wide adoption & cachet. For example, in the manufacturing industry, data is being gathered from a wide variety of sensors that are distributed geographically along factory locations running 24×7. Predictive maintenance strategies that pull together sensor data and prognostics are critical to efficiency & also to optimizing the business. In other verticals like healthcare & insurance, massive data volumes are now being reliably generated from diverse sources of telemetry such as patient monitoring devices as well as human manned endpoints at hospitals. In transportation, these devices include cars in the consumer space, trucks & other field vehicles, and geolocation devices. Others include field machinery in oil exploration & server logs across IT infrastructure. In the personal consumer space, they include personal fitness devices like FitBit and home & office energy management sensors. All of this constitutes the trend that Gartner terms the Digital Mesh. The Mesh really is built from coupling this machine data with the ever growing social media feeds, web clicks, server logs etc. The Digital Mesh leads to an interconnected information deluge which encompasses classical IoT endpoints along with audio, video & social data streams. This leads to huge security challenges as well as business opportunities for forward looking enterprises (including Governments). Applications that are leveraging Big Data to ingest, connect & combine these disparate feeds into one holistic picture of an entity – whether individual or institution – are clearly beginning to differentiate themselves. IoAT is starting to be a huge part of digital transformation initiatives with more use cases emerging in 2017 across industry verticals.
  2. The Emergence of Unified Architectures – The onset of Digital Architectures in enterprise businesses implies the ability to drive continuous online interactions with global consumers/customers/clients or patients. The goal is not just to provide engaging visualization but also to personalize the services clients care about across multiple modes of interaction. Mobile applications first began forcing enterprises to support multiple channels of interaction with their consumers. For example, Banking now requires an ability to engage consumers in a seamless experience across an average of four to five channels – Mobile, eBanking, Call Center, Kiosk etc. Healthcare is a close second, where caregivers expect patient, medication & disease data at their fingertips with a few finger swipes on an iPad app. What Big Data brings to the equation beyond its strength in data ingest & processing is a unified architecture. For instance, MapReduce is the original framework for writing applications that process large amounts of structured and unstructured data stored in the Hadoop Distributed File System (HDFS). Apache Hadoop YARN opened Hadoop to other data processing engines (e.g. Apache Spark/Storm) that can now run alongside existing MapReduce jobs to process data in many different ways at the same time. The result is that ANY kind of application processing can be run inside a Hadoop runtime – batch, realtime, interactive or streaming.
  3. Consumer 360 – Mobile applications first began forcing enterprises to support multiple channels of interaction with their consumers. For example, Banking now requires an ability to engage consumers in a seamless experience across an average of four to five channels – Mobile, eBanking, Call Center, Kiosk etc. The healthcare industry stores patient data across multiple silos – ADT (Admit Discharge Transfer) systems, medication systems, CRM systems etc. Data Lakes provide an ability to visualize all of a patient’s data in one place, thus improving outcomes. The Digital Mesh (covered above) only exacerbates this semantic gap in user experiences as information consumers navigate applications and consume services across the mesh, which is multi-channel and needs a 360 degree customer view across all these engagement points. Applications developed in 2016 and beyond must take a 360 degree based approach to ensuring a continuous client experience across the spectrum of endpoints and the platforms that span them from a Data Visualization standpoint. Every serious business needs to provide a unified view of a customer across tens of product lines and geographies.
  4. Machine Learning, Data Science & Predictive Analytics – Most business problems are data challenges, and an approach centered around data analysis helps extract meaningful insights from data, thus helping the business. It is now common for enterprises to possess the capability to acquire, store and process large volumes of data using a low cost approach leveraging Big Data and Cloud Computing. At the same time, the rapid maturation of scalable processing techniques allows us to extract richer insights from data. What we commonly refer to as Data Science – a combination of econometrics, machine learning, statistics, visualization, and computer science – extracts valuable business insights hiding in data and builds operational systems to deliver that value. Data Science has also evolved a new branch called “Deep Neural Nets” (DNN). DNNs are what make it possible for smart machines and agents to learn from data flows and to make the products that use them even more automated & powerful. Deep Machine Learning involves the art of discovering data insights in a human-like pattern. The web scale world (led by Google and Facebook) has been vocal about its use of Advanced Data Science techniques and the move of Data Science into Advanced Machine Learning. Data Science is an umbrella concept that refers to the process of extracting business patterns from large volumes of structured, semi structured and unstructured data. It is emerging as the key ingredient in enabling a predictive approach to the business. Data Science & its applications across a range of industries are covered in the blogpost http://www.vamsitalkstech.com/?p=1846
  5. Visualization – Mobile applications first began forcing enterprises to support multiple channels of interaction with their consumers. For example, Banking now requires an ability to engage consumers in a seamless experience across an average of four to five channels – Mobile, eBanking, Call Center, Kiosk etc. The average enterprise user is also familiar with BYOD in the age of self service. The Digital Mesh only exacerbates this gap in user experiences as information consumers navigate applications and consume services across a mesh that is both multi-channel and provides Customer 360 across all these engagement points. While information management technology has grown at a blistering pace, the human ability to process and comprehend numerical data has not. Applications being developed in 2016 are beginning to adopt intelligent visualization approaches that are easy to use, highly interactive and enable the user to manipulate corporate & business data using their fingertips – much like an iPad app. Tools such as intelligent dashboards, scorecards, mashups etc are helping change visualization paradigms that were based on histograms, pie charts and tons of numbers. Big Data improvements in data lineage and quality are greatly helping the visualization space.
  6. DevOps – Big Data powered by Hadoop has now evolved into a true application architecture ecosystem as mentioned above. The 30+ components included in an enterprise grade platform like the Hortonworks Data Platform (HDP) include APIs (Application Programming Interfaces) to satisfy every kind of data need that an application could have – streaming, realtime, interactive or batch. Couple that with improvements in predictive analytics, and in 2016 enterprise developers leveraging Big Data have been building scalable applications with data as a first class citizen. Organizations using DevOps are already reaping the rewards as they are able to streamline, improve and create business processes to reflect customer demand and positively affect customer satisfaction. Examples abound in the Webscale world (Netflix, Pinterest, and Etsy) but we now have existing Fortune 1000 companies in verticals like financial services, healthcare, retail and manufacturing who are benefiting from Big Data & DevOps. Thus, 2016 will be the year when Big Data techniques are no longer the preserve of classical Information Management teams but move to the umbrella application development area which encompasses the DevOps and Continuous Integration & Delivery (CI-CD) spheres.
    One of DevOps’s chief goals is to close the long-standing gap between the engineers who develop and test IT capability and the organizations that are responsible for deploying and maintaining IT operations. Using traditional app dev methodologies, it can take months to design, test and deploy software. No business today has that much time, especially in the age of IT consumerization and end users accustomed to smart phone apps that are updated daily. The focus now is on rapidly developing business applications to stay ahead of competitors that can better harness Big Data’s business capabilities. The microservices architecture approach advocated by DevOps, which combines the notion of autonomous, cooperative yet loosely coupled applications built as a conglomeration of business focused services, is a natural fit for the Digital Mesh. The most important addition to and consideration for microservices based architectures in 2016 is Analytics Everywhere.
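For readers who have not seen the MapReduce model mentioned under the Unified Architectures trend above, the following is a tiny single-process Python sketch of its three phases (map, shuffle, reduce) applied to a word count. It only mimics the programming model in memory; it does not use Hadoop, HDFS or YARN, and the function names and sample records are made up for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory dataset."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: three short payment descriptions.
counts = run_job(["wire transfer", "card payment", "wire payment"])
print(counts)
```

On a real cluster the map and reduce functions run in parallel across many nodes and the shuffle moves data over the network, but the contract each function has to satisfy is the same as in this sketch.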

The Final Word – 

We have all heard about the growth of data volumes & variety. 2016 is perhaps the first year where forward looking business & technology executives have begun capturing commercial value from the data deluge by balancing analytics with creative user experience.

Thus, modern data applications are making Big Data ubiquitous. Rather than existing as back-shelf tools for the monthly ETL run or for reporting, these modern applications can help industry firms incorporate data into every decision they make.  Applications in 2016 and beyond are beginning  to recognize that Analytics are pervasive, relentless, realtime and thus embedded into our daily lives.  

Global Banking faces its Uber Moment..

The neighborhood bank branch is on the way out and is being slowly phased out as the primary mode of customer interaction for Banks. Banks across the globe have increased their technology investments in strategic areas such as Analytics, Data & Mobile. The Bank of the future increasingly resembles a technology company.

“I have no doubt that the financial industry will face a series of Uber moments,” – Antony Jenkins (then CEO) of Barclays Bank, 2015

The Washington Post proclaimed in an article [1] this week that the bank branch on the corner of Main Street may not be there much longer.

Technology is transforming Banking, leading to dramatic changes in the landscape of customer interactions. We live in the age of the Digital Consumer – Banking in the age of the hyper-connected consumer. As millennials join the labor force, they expect to be able to bank from anywhere, be it from a mobile device or via internet banking on their personal computer.

As former Barclays CEO Antony Jenkins described it in a speech given last fall, the global banking industry, which is under severe pressure from customer demands for increased automation and contextual services, will slash employment and branches by 20 percent to 50 percent over the next decade.[2]

“I have no doubt that the financial industry will face a series of Uber moments,” he said in the late-November speech in London, referring to the way that Uber and other ride-hailing companies have rapidly unsettled the taxi industry.[2]

Banking must trend Digital to respond to changing client needs – 

The Financial Services and the Insurance industry are facing an unprecedented amount of change driven by factors like changing client preferences and the emergence of new technology—the Internet, mobility, social media, etc. These changes are immensely profound, especially with the arrival of “FinTech”—technology-driven applications that are upending long-standing business models across all sectors from retail banking to wealth management & capital markets. Further, members of a major new segment, Millennials, increasingly use mobile devices, demand more contextual services and expect a seamless unified banking experience—something akin to what they  experience on web properties like Facebook, Amazon, Uber, Google or Yahoo, etc.

The definition of Digital is somewhat nebulous, so I would like to define the key areas where its impact and capabilities will need to be felt for this gradual transformation to occur.

A true Digital Bank needs to –

  • Offer a seamless customer experience much like the one provided by the likes of Facebook & Amazon i.e highly interactive & intelligent applications that can detect a single customer’s journey across multiple channels
  • Offer data driven interactive services and products that can detect customer preferences on the fly, match them with existing history and provide value added services – services that not only provide a better experience but also foster a longer term customer relationship
  • Be able to help the business prototype, test, refine and rapidly develop new business capabilities
  • Above all, treat Digital as a Constant Capability and not as an ‘off the shelf’ product or a one off way of doing things

Though some of the above facts & figures may seem startling, it’s how individual banks put both data and technology to work across their internal value chain that will define their standing in the rapidly growing data economy.

Enter the FinTechs

FinTechs (or new Age financial industry startups) offer enhanced customer experiences built on product innovation and agile business models. They do so by expanding their wallet share of client revenues by offering contextual products tailored to individual client profiles. Their savvy use of segmentation data and predictive analytics enables the delivery of bundles of tailored products across multiple delivery channels (web, mobile, Point Of Sale, Internet, etc.). Like banks, these technologies support multiple modes of payments at scale, but they aren’t bound by the same regulatory and compliance regulations as are banks, who operate under a mandate that they must demonstrate that they understand their risk profiles. Compliance is an even stronger requirement for banks in areas around KYC (Know Your Customer) and AML (Anti Money Laundering) where there is a need to profile customers—both individual & corporate—to decipher if any of their transaction patterns indicate money laundering, etc.
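As one concrete example of what “transaction patterns that indicate money laundering” can look like, the Python sketch below flags customers who make several cash deposits just under a reporting threshold within a few days, a pattern commonly called structuring. This is a deliberately simple rule for illustration. The threshold, window and “near threshold” fraction are assumed parameters, the function name and sample data are made up, and real AML systems combine many such rules with statistical models and analyst review.

```python
from collections import defaultdict
from datetime import date, timedelta


def flag_structuring(transactions, threshold=10_000, window_days=3,
                     near_fraction=0.9):
    """Flag customers with 2+ near-threshold deposits inside one window.

    transactions: iterable of (customer_id, day, amount) tuples.
    All parameter defaults are illustrative assumptions.
    """
    near_threshold = defaultdict(list)
    for customer_id, day, amount in transactions:
        # Keep only deposits that sit just below the reporting threshold.
        if near_fraction * threshold <= amount < threshold:
            near_threshold[customer_id].append((day, amount))

    flagged = set()
    max_gap = timedelta(days=window_days - 1)
    for customer_id, deposits in near_threshold.items():
        deposits.sort()
        for index, (start_day, _) in enumerate(deposits):
            window = [amount for day, amount in deposits[index:]
                      if day - start_day <= max_gap]
            # Two or more such deposits that together cross the threshold.
            if len(window) >= 2 and sum(window) >= threshold:
                flagged.add(customer_id)
                break
    return flagged


# Hypothetical deposits for three customers.
sample_transactions = [
    ("c1", date(2016, 5, 2), 9_500),
    ("c1", date(2016, 5, 3), 9_800),
    ("c2", date(2016, 5, 2), 12_000),   # above threshold, reported anyway
    ("c3", date(2016, 5, 2), 9_500),
    ("c3", date(2016, 5, 20), 9_700),   # too far apart to share a window
]

suspicious = flag_structuring(sample_transactions)
print(suspicious)
```

Only customer c1 is flagged here: c2's single large deposit would be reported through the normal threshold process, and c3's two deposits fall outside the assumed three-day window.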

Banking produces the most data of any industry—rich troves of data that pertain to customer transactions, payments, wire transfers and demographic information. However, it is not enough for financial service IT departments to just possess the data. They must be able to drive change through legacy thinking and infrastructures as the industry changes—both from a data product as well as from a risk & compliance standpoint.

The business areas shown in the below illustration are a mix of legacy capabilities (Risk, Fraud and Compliance) and new value added areas (Mobile Banking, Payments, Omni-channel Wealth Management etc).

   Illustration – Predictive Analytics and Big Data are upending business models in Banking across multiple vectors of disruption 

Business Challenges facing banks today

Banks and other players across the financial spectrum face challenges across three distinct areas. First and foremost they need to play defense with a myriad of regulatory and compliance legislation across defensive areas of the business such as risk data aggregation and measurement and financial compliance and fraud detection. On the other hand, there is a distinct need to vastly improve customer satisfaction and stickiness by implementing predictive analytics capabilities and generating better insights across the customer journey thus driving a truly immersive digital experience. Finally, banks need to leverage their mountains of data assets to create new business models and go-to-market strategies. They need to do this by monetizing multiple data sources—both data-in-motion and data-at-rest—for actionable intelligence.

Data is the single most important driver of bank transformation, impacting financial product selection, promotion targeting, next best action and ultimately, the entire consumer experience. Today, the volume of this data is growing exponentially as consumers increasingly share opinions and interact with an array of smart phones, connected devices, sensors and beacons emitting signals during their customer journey.

Data Challenges – 

Business and technology leaders are struggling to keep pace with a massive glut of data from digitization, the internet of things, machine learning, and cybersecurity for starters. A data lake—which combines data assets, technology and analytics to create enterprise value at a massive scale—can help businesses gain control over their data.

Fortunately, Big Data driven predictive analytics is here to help. The Hadoop platform and its ecosystem of technologies have matured considerably and have evolved to support business critical banking applications. The emergence of cloud platforms is helping in this regard.

Positively impacting the banking experience requires data

Whether at the retail bank or at corporate headquarters, there are a number of ways to leverage technology in order to enable a successful consumer experience across all banking sectors:

Retail & Consumer Banking

Banks need to move to a predominantly online model, providing consumers with highly interactive, engaging and contextual experiences that span multiple channels—branch banking, eBanking, POS, ATM, etc. Further goals are increased profitability per customer for both micro and macro customer populations with the ultimate goal of increasing customer lifetime value (CLV).

Capital Markets

Capital markets firms must create new business models and offer superior client relationships based on their data assets. Those that leverage and monetize their data assets will enjoy superior returns and raise the bar for the rest of the industry. It is critical for capital market firms to better understand their clients (be they institutional or otherwise) from a 360-degree perspective so they can be marketed to as a single entity across different channels—a key to optimizing profits with cross selling in an increasingly competitive landscape.

Wealth Managers

The wealth management segment (e.g., private banking, tax planning, estate planning for high net worth individuals) is a potential high growth business for any financial institution. It is the highest touch segment of banking, fostered on long-term and extremely lucrative advisory relationships. It is also the segment most ripe for disruption due to a clear shift in client preferences and expectations for their financial future. Actionable intelligence gathered from real-time transactions and historical data becomes a critical component for product tailoring, personalization and satisfaction.

Corporate Banking

The ability to market complex financial products across a global corporate banking client base is critical to generating profits in this segment. It’s also important to engage in risk-based portfolio optimization to predict which clients are at risk for adverse events like defaults. In addition to being able to track revenue per client and better understand the entities they bank with, it is also critical that corporate banks track AML compliance.

The future of data for Financial Services

Understand the Customer Journey

Across retail banking, wealth management and capital markets, a unified view of the customer journey is at the heart of the bank’s ability to promote the right financial product, recommend a properly aligned portfolio of products, keep up with evolving preferences as the customer relationship matures and accurately predict future revenue from a customer. But currently most retail, investment and corporate banks lack a comprehensive single view of their customers. Due to operational silos, each department has a limited view of the customer across multiple channels. These views are typically inconsistent and result in limited internal collaboration when servicing customer needs. Leveraging the ingestion and predictive capabilities of a Big Data platform, banks can provide a user experience that rivals Facebook, Twitter or Google and obtain a full picture of the customer across all touch points.
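As a rough illustration of what a single customer view means in practice, the sketch below merges records held by separate departments into one profile per customer. The silo names, field names and the `build_single_view` helper are all hypothetical; a real platform would also need entity resolution, since departments rarely share a clean customer identifier.

```python
from collections import defaultdict

# Hypothetical per-department records; field names are illustrative only.
retail = [{"customer_id": "C1", "checking_balance": 2500.0}]
cards = [
    {"customer_id": "C1", "card_spend_90d": 1200.0},
    {"customer_id": "C2", "card_spend_90d": 300.0},
]
wealth = [{"customer_id": "C2", "assets_under_mgmt": 150000.0}]


def build_single_view(**silos):
    """Merge records from each silo into one profile per customer_id."""
    profiles = defaultdict(lambda: {"sources": []})
    for silo_name, records in silos.items():
        for record in records:
            profile = profiles[record["customer_id"]]
            # Remember which department contributed to this profile.
            profile["sources"].append(silo_name)
            for field, value in record.items():
                if field != "customer_id":
                    profile[field] = value
    return dict(profiles)


views = build_single_view(retail=retail, cards=cards, wealth=wealth)
```

Each entry in `views` now carries every attribute known about that customer plus the list of departments that supplied it, which is the minimum needed to market to the customer as one entity.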

Create Modern data applications

Banks, wealth managers, stock exchanges and investment banks are companies run on data—data on deposits, payments, balances, investments, interactions and third-party data quantifying risk of theft or fraud. Modern data applications for banking data scientists may be built internally or purchased “off the shelf” from third parties. These new applications are powerful and fast enough to detect previously invisible patterns in massive volumes of real-time data. They also enable banks to proactively identify risks with models based on petabytes of historical data. These data science apps comb through the “haystacks” of data to identify subtle “needles” of fraud or risk not easy to find with manual inspection.

These modern data applications make Big Data and data science ubiquitous. Rather than back-shelf tools for the occasional suspicious transaction or period of market volatility, these applications can help financial firms incorporate data into every decision they make. They can automate data mining and predictive modeling for daily use, weaving advanced statistical analysis, machine learning, and artificial intelligence into the bank’s day-to-day operations.

Conclusion – Banks need to drive Product Creation using the Latest Technology –  

A strategic approach to industrializing analytics in a Banking organization can add massive value and competitive differentiation in five distinct categories –

  1. Exponentially improve existing business processes, e.g. risk data aggregation and measurement, financial compliance, fraud detection
  2. Help create new business models and go to market strategies – by monetizing multiple data sources – both internal and external
  3. Vastly improve customer satisfaction by generating better insights across the customer journey
  4. Increase security while expanding access to relevant data throughout the enterprise to knowledge workers
  5. Help drive end to end digitization

If you really think about it – all that banks do is manipulate and deal in data. If that is not primed for an Über type of revolution, I do not know what is.

References
[1] https://www.washingtonpost.com/news/wonk/wp/2016/04/19/say-goodbye-to-your-neighborhood-bank-branch/

[2] http://www.theguardian.com/business/2015/nov/25/banking-facing-uber-moment-says-former-barclays-boss

A Digital Bank is a Data Centric Bank..

“There’s no better way to help a customer than to be there for them in the moments that matter.” — Lucinda Barlow, Google

The Banking industry produces the most data of any vertical out there with well defined & long standing business processes that have stood the test of time. Banks possess rich troves of data that pertain to customer transactions & demographic information. However, it is not enough for Bank IT to just possess the data. They must be able to drive change through legacy thinking and infrastructures as things change around the entire industry not just from a risk & compliance standpoint.

For instance, a major new segment is millennial customers, who increasingly use mobile devices and demand more contextual services as well as a seamless unified banking experience, akin to what they commonly experience on the internet at web properties like Facebook, Amazon, Uber, Google or Yahoo.

The Data Centric Bank

Banks, wealth managers, stock exchanges and investment banks are companies run on data—data on deposits, payments, balances, investments, interactions and third-party data quantifying risk of theft or fraud. Modern data applications for banking data scientists may be built internally or purchased “off the shelf” from third parties. These new applications are powerful and fast enough to detect previously invisible patterns in massive volumes of real-time data. They also enable banks to proactively identify risks with models based on petabytes of historical data. These modern data science applications comb through the “haystacks” of data to identify subtle “needles” of fraud or risk not easy to find with manual inspection.

The Bank of the future looks somewhat like the below –

                                            Illustration – The Data Driven Bank

How do Banks stay relevant in this race and how is the Digital Journey to be accomplished?

I posit that there are five essential steps –

  1. Build for the organization of the future by inculcating innovation into the cultural DNA. A good chunk of the FinTechs’ success is owed to a contrarian mindset in terms of creating business platforms using technology & generating a huge competitive advantage. The secret is following a strategy of continuous improvement. This is done by generating new ideas, being unafraid to cannibalize older (and even profitable) ideas and constantly experimenting across new businesses. For instance, Facebook is famous for not having a review board to which designers and engineers present PowerPoint slides. Instead, prototypes and pilot projects are presented directly to executives, even to CEO Mark Zuckerberg. Facebook pivoted in a couple of years from weak mobile offerings to becoming the #1 mobile app company (more users access their FB pages using mobile devices running iOS & Android than using laptops).
  2. Leverage Predictive Analytics across all data sets – A large part of the answer is to take an industrial approach to predictive analytics. The approach currently in vogue, treating these as one-off, tactical project investments, simply does not work or scale anymore. There are various organizational models that one could employ from the standpoint of developing analytical maturity. These range from a shared service to a line of business led approach. An approach that I have seen work very well is to build a Center of Excellence (COE) to create contextual capabilities, best practices and rollout strategies across the larger organization. These modern data applications make Big Data and data science ubiquitous. Rather than back-shelf tools for the occasional suspicious transaction or period of market volatility, these applications can help financial firms incorporate data into every decision they make. They can automate data mining and predictive modeling for daily use, weaving advanced statistical analysis, machine learning, and artificial intelligence into the bank’s day-to-day operations.
  3. Drive Automation across lines of business – Financial services are fertile ground for business process automation, since most banks across their various lines of business are simply a collection of core and differentiated processes. Examples are consumer banking (with processes including onboarding customers, collecting deposits, conducting business via multiple channels, and compliance with regulatory mandates such as KYC and AML); investment banking (including straight-through-processing, trading platforms, prime brokerage, and compliance with regulation); payment services; and wealth management (including modeling portfolio positions and providing complete transparency across the end-to-end life cycle). The key takeaway is that driving automation can result not just in better business visibility and accountability on behalf of various actors. It can also drive revenue and contribute significantly to the bottom line. Process automation tooling enables enterprise business and IT users to document, simulate, manage, automate and monitor business processes and policies. It is designed to empower business and IT users to collaborate more effectively, so business applications can be changed more easily and quickly.
  4. Adopt Open Source – Open Source, while still somewhat of an unknown challenge to the mass middle market enterprise, also represents a tremendous opportunity for most Banks & FinTechs across the spectrum of Financial Services. As one examines business imperatives & use-cases across the seven key segments (Retail & Consumer banking, Wealth management, Capital Markets, Insurance, Credit Cards & Payment processing, Stock Exchanges and Consumer Lending), it is clear that SMAC (Social, Mobile, Analytics, Cloud) and Data stacks can not only satisfy existing use-cases in terms of cost & business requirements but also help adopters build out Blue Oceans (i.e. new markets). Segments of open source include the Linux OS, Open Source Middleware, Databases and the Big Data ecosystem. Technologies like these have disrupted proprietary closed source products ranging from popular UNIX variants to Application Platforms, EDWs and RDBMSs.
  5. Understand the Customer – Across Retail Banking, Wealth Management and Capital Markets, a unified view of the customer journey is at the heart of the bank’s ability to promote the right financial product, recommend a properly aligned portfolio of products, keep up with evolving preferences as the customer relationship matures and accurately predict future revenue from a customer. But currently most retail, investment and corporate banks lack a comprehensive single view of their customers. Due to operational silos, each department has a limited view of the customer across multiple channels. These views are typically inconsistent and result in limited internal collaboration when servicing customer needs. Leveraging the ingestion and predictive capabilities of a Big Data based platform, banks can provide a user experience that rivals Facebook, Twitter or Google and obtain a full picture of the customer across all touch points.

Recommendations – 

Developing a strategic mindset toward digital transformation should be a board level concern. This entails –

  • To begin with – ensuring buy in & commitment in the form of funding at a Senior Management level. This support needs to extend across usecases in the entire value chain
  • Extensive but realistic ROI (Return On Investment) models built during due diligence with periodic updates for executive stakeholders
  • On a similar note, ensuring buy in using a strategy of co-opting & alignment with Quants and different high potential areas of the business (as covered in the usecases in the last blog)
  • Identifying leaders within the organization who can not only lead digital projects but also create compelling content to evangelize the use of predictive analytics
  • Begin to tactically bake in or embed data science capabilities across different lines of business and horizontal IT
  • Slowly moving adoption to the Risk, Fraud, Cybersecurity and Compliance teams as part of the second wave of digital. This is critical in ensuring that analysts across these areas move from a spreadsheet intensive model to adopting advanced statistical techniques
  • Creating a Digital Analytics COE (Center of Excellence) that enables cross pollination of ideas across the fields of statistical modeling, data mining, text analytics, and Big Data technology
  • Ensuring that issues related to data privacy, audit & compliance have been given a great deal of forethought
  • Identifying  & developing human skills in toolsets (across open source and closed source) that facilitate adapting to data lake based architectures. A large part of this is to organically grow the talent pool by instituting a college recruitment process

Summary – 

I have found myself spending the vast majority of my career working with a range of marquee financial services, healthcare, business services & Telco clients. More often than not, a vast percentage of these strategic discussions have centered around business transformation, enterprise architecture and overall strategy around Open Source initiatives & technology.

Global Banking is at an inflexion point. There is now an emerging sense of urgency in mainstream Financial Services organizations to create and expand on their Digital Transformation strategy. In the last few years, more and more of the technology oriented discussions have been focused around Cloud Computing, DevOps, Mobility & Big Data.

The prongs of digital range from Middleware to BPM to Cloud Computing (IaaS/PaaS/SaaS) to the culture & DevOps practices.

The rise of Open Standards and Open APIs has been the catalyst in the digital disruption. Neglect them at your peril.

Data Driven Decisions in Financial Services..(2/3)

“Data! Data! Data!” he cried impatiently. “I can’t make bricks without clay!” – Sherlock Holmes, in Conan Doyle’s The Adventure of the Copper Beeches

The first post in this three part series described the key ways in which innovative applications of data science are changing a somewhat insular and clubby banking & financial services industry. This disruption rages across the spectrum from both a business model as well as an organizational cultural standpoint. This second post examines key & concrete usecases enabled by a ‘Data Driven’ approach in the  Industry. The next & final post will examine foundational Data Science tasks & techniques commonly employed to get value from data.

Big Data platforms, powered by Open Source Hadoop, can not only economically store large volumes of structured, unstructured or semi-structured data but also help process it at scale. The result is a steady supply of continuous, predictive and actionable intelligence. With the advent of Hadoop and Big Data ecosystem technologies, Bank IT (across a spectrum of business services) is now able to ingest, onboard & analyze massive quantities of data at a much lower price point.

One can thus not only generate insights using a traditional ad-hoc querying (or descriptive intelligence) model but also build advanced statistical models on the data. These advanced techniques leverage data mining tasks (like classification, clustering, regression analysis, neural networks etc) to perform highly robust predictive modeling. Owing to Hadoop’s natural ability to work with any kind of data, this can encompass the streaming and realtime paradigms in addition to the traditional historical (or batch) mode.

Further, Big Data also helps Banks capture and commingle diverse datasets that can improve their analytics in combination with improved visualization tools that aid in the exploration & monetization of data.

Now, let’s break the above summary down into specifics.

Data In Banking

Corporate IT organizations in the financial industry have for many years been tackling data challenges caused by strict silo based approaches that inhibit data agility.

Consider some of the traditional (or INTERNAL) sources of data in banking –

  • Customer Account data e.g. Names, Demographics, Linked Accounts etc
  • Core Banking Data
  • Transaction Data which captures the low level details of every transaction (e.g debit, credit, transfer, credit card usage etc)
  • Wire & Payment Data
  • Trade & Position Data
  • General Ledger Data e.g AP (accounts payable), AR (accounts receivable), cash management & purchasing information etc.
  • Data from other systems supporting banking reporting functions.

To provide the reader with a wider perspective, the vast majority of the above traditional data is human generated. However, with the advent of smart sensors and enhancements in telemetry based devices like ATMs, POS terminals etc, machines are beginning to generate even more data. Thus, every time a banking customer clicks a button on their financial provider’s website, makes a purchase using a credit card or calls her bank using the phone, a digital trail is created. Mobile apps drive an ever growing number of interactions due to the sheer nature of interconnected services – banking, retail, airlines, hotels etc. The result is lots of data and metadata that is MACHINE & App generated.

In addition to the above internal & external sources, commercially available 3rd party datasets ranging from crop yields to car purchases to customer preference data (segmented by age or affluence categories) and social media feedback on financial & retail product usage are now widely available for purchase. As financial services firms sign up partnerships in Retail, Government and Manufacturing, these data volumes will only grow in size & velocity. The key point is that an ever growing number of customer facing interfaces are now available for firms to collect data in a manner that they had never been able to before.

Where can Predictive Analytics help – 

Let us now examine some of the main use cases out there, as depicted in the picture below –

                              Illustration – Data Science led disruption in Banking

Defensive Use Cases Across the Banking Spectrum (RFC) – Risk, Fraud & Security

Internal Risk & Compliance departments are increasingly turning to Data Science techniques to create & run models on aggregated risk data. Multiple types of models and algorithms are used to find patterns of fraud and anomalies in the data and to predict customer behavior. Examples include Bayesian filters, Clustering, Regression Analysis, Neural Networks etc. Data Scientists & Business Analysts have a choice of MapReduce, Spark (via Java, Python, R), Storm and SAS, to name a few, to create these models. Fraud model development, testing and deployment on fresh & historical data become very straightforward to implement on Hadoop.

  • Risk Data Aggregation and Measurement – Measure and project different kinds of banking risks (Market Risk, Credit Risk, Loan Default and Operational Risk). The applications for Data Science include predicting different risk metrics across market and credit risk in Capital Markets. In Consumer Banking sectors like mortgage banking, credit cards & other financial products, data science is heavily leveraged to classify products & customers into different risk categories, and then to predict risk scores and risk portfolio trends across thousands of variables.
  • Fraud Detection – Detect and predict institutional fraud for a range of usecases – Anti Money Laundering Compliance (AML), Know Your Customer (KYC), watchlist screening, tax evasion, Linked Entity Analysis etc. In the area of individual level fraud (credit card fraud & mortgage fraud), predictive models are developed which constantly analyze customer spending patterns, location & travel details, employment details and social networks to detect in real time if customer accounts are being compromised.
  • Cyber Security – Analyze clickstreams, network packet capture data, weblogs, image data and telemetry data to predict security compromises & to provide advanced security analytics.
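To make the idea of analyzing customer spending patterns concrete, here is a minimal sketch that flags a card transaction whose amount sits far above a customer’s historical spend. It uses a plain z-score test from the Python standard library, which is a deliberately simple stand-in for the Bayesian, clustering or neural network models mentioned above. The `is_anomalous` helper, the threshold of three standard deviations and the sample amounts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, amount, threshold=3.0):
    """Flag `amount` if it lies more than `threshold` sample standard
    deviations above the customer's historical mean spend."""
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts were identical; anything larger stands out.
        return amount > mu
    return (amount - mu) / sigma > threshold


# Hypothetical past card payments for one customer.
history = [42.0, 38.5, 51.0, 45.25, 40.0, 47.75]

typical = is_anomalous(history, 48.0)    # close to the usual spend
outlier = is_anomalous(history, 900.0)   # far above the usual spend
```

A production model would combine many more signals (location, merchant, device, linked entities) and would be trained on labelled fraud cases, but the shape of the decision is the same: score each event against a learned baseline and alert when the score crosses a threshold.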

Capital Markets, Consumer Banking, Payment Systems & Wealth Management

A) Capital Markets

  • Algorithmic Trading – Data Science augments trading infrastructures in several ways. It helps re-tool existing trading infrastructures so that they are more integrated yet loosely coupled and efficient, by helping plug in complex, quantitative, algorithm based trading strategies across a range of asset classes like equities, forex, ETFs and commodities. It also helps with trade execution once Hadoop incorporates newer & faster sources of data (social media, sensor data, clickstream data) and not just the conventional sources (market data, position data, M&A data, transaction data etc). Examples include retrofitting existing trade systems to accommodate a range of mobile clients who have a vested interest in deriving analytics, or marrying tick data with market structure information to understand why certain securities dip or spike at certain points (e.g. institutional selling or equity linked trades with derivatives).
  • Trade Analytics – Trade Strategy development is now a complex process where heterogeneous data – ranging from market data, existing positions, corporate actions, social & sentiment data are all blended together to obtain insights into possible market movements, trader yield & profitability across multiple trading desks.
  • Market & Trade Surveillance – An intelligent surveillance system needs to store trade data, reference data, order data, and market data, as well as all of the relevant communications from all the disparate systems, both internally and externally, and then match these things appropriately. The system needs to account for multiple levels of detection capabilities starting with a) configuring business rules (that describe a fraud pattern) as well as b) dynamic capabilities based on machine learning models (typically thought of as being more predictive) to detect complex patterns that pertain to insider trading and other market integrity compromises. Such a system also needs to be able to parallelize model execution at scale to be able to meet demanding latency requirements.
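The two detection layers described for surveillance, static business rules plus a model-driven score, can be sketched as below. Everything here is hypothetical: the `Trade` fields, the single rule, the thresholds, and `model_score`, which is a trivial stand-in for a trained machine learning model. The thread pool only illustrates fanning evaluation out over many trades; a real system would more likely distribute this work across a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    trader: str
    symbol: str
    quantity: int
    minutes_before_announcement: float  # hypothetical enrichment field


def rule_large_trade_before_news(trade):
    """Static business rule: large position taken shortly before news."""
    return trade.quantity >= 10_000 and trade.minutes_before_announcement <= 60


def model_score(trade):
    """Stand-in for a trained model; returns a risk score in [0, 1]."""
    return min(1.0, trade.quantity / 50_000)


RULES = [rule_large_trade_before_news]
SCORE_THRESHOLD = 0.8


def evaluate(trade):
    """Apply every rule and the model, returning the reasons that fired."""
    reasons = [rule.__name__ for rule in RULES if rule(trade)]
    if model_score(trade) >= SCORE_THRESHOLD:
        reasons.append("model_score")
    return trade, reasons


def surveil(trades, workers=4):
    """Evaluate trades concurrently and keep only those with alerts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [(t, r) for t, r in pool.map(evaluate, trades) if r]


trades = [
    Trade("T1", "ACME", 500, 600.0),
    Trade("T2", "ACME", 12_000, 30.0),
    Trade("T3", "INIT", 45_000, 5_000.0),
]
alerts = surveil(trades)
```

Keeping the rules in a plain list means compliance analysts can add or retire patterns without touching the scoring model, while the model layer can be retrained independently.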

B) Consumer Banking & Wealth Management

Data Science has been proven in several applications in consumer banking ranging from a single view of customer to mapping customer journey across multiple financial products & channels. Techniques like pattern analysis (detecting new patterns within and across datasets), marketing analysis (across channels), recommendation analysis (across groups of products) are becoming fairly common. One can see a clear trend in early adopter consumer banking & private banking institutions in moving to an “Analytics first” approach to creating new business applications.

  • Customer 360 & Segmentation –
    Currently most Retail and Consumer Banks lack a comprehensive view of their customers. Each department has a limited view of the customer, due to which the offers and interactions with customers across multiple channels are typically inconsistent. This also results in limited collaboration within the bank when servicing customer needs. Leveraging the ingestion and predictive capabilities of a Hadoop based platform, Banks can provide a user experience that rivals Facebook, Twitter or Google and obtain a full picture of the customer across all touch points
  • Some of the more granular business usecases that span the spectrum in Consumer Banking include –
    • Improve profitability per retail or cards customer across the lifecycle by targeting at both micro and macro levels (customer populations). This is done by combining rich, diverse datasets (existing transaction data, interaction data, social media feeds, online visits, cross channel data etc) as well as understanding customer preferences across similar segments
    • Detect customer dissatisfaction by analyzing transaction, call center data
    • Cross sell and upsell opportunities across different products
    • Help improve the product creation & pricing process

C) Payment Networks

The real time data processing capabilities of Hadoop allow it to process data in a continual, bursty, streaming or micro-batching fashion. Once payment data is ingested, it must be processed in a very small time period (hundreds of milliseconds), which is typically termed near real time (NRT). When combined with predictive capabilities via behavioral modeling & transaction profiling, Data Science can provide significant operational, time & cost savings across the below areas.

  • Obtaining a single view of customer across multiple modes of payments
  • Detecting payment fraud by using behavior modeling
  • Understand which payment modes are used more by which customers
  • Realtime analytics support
  • Tracking, modeling & understanding customer loyalty
  • Social network and entity link analysis
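The combination of micro-batching and behavior modeling can be sketched in a few lines. The example below groups a stream of payment events into small batches and scores each payment against a running per-customer average. The `micro_batches` and `SpendProfile` names, the batch size of two and the ratio of five times the average are all illustrative assumptions, and a plain Python loop stands in for a real streaming engine.

```python
from collections import defaultdict
from itertools import islice


def micro_batches(events, batch_size):
    """Yield successive lists of at most `batch_size` events from a stream."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


class SpendProfile:
    """Running per-customer average payment amount."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def score_and_update(self, customer, amount, ratio=5.0):
        """Return True if `amount` exceeds `ratio` times the running
        average, then fold the payment into the profile."""
        n = self.count[customer]
        suspicious = n > 0 and amount > ratio * (self.total[customer] / n)
        self.count[customer] = n + 1
        self.total[customer] += amount
        return suspicious


# Hypothetical (customer, amount) payment events in arrival order.
stream = [("C1", 20.0), ("C1", 25.0), ("C2", 300.0), ("C1", 400.0), ("C2", 310.0)]

profile = SpendProfile()
flagged = []
for batch in micro_batches(stream, batch_size=2):
    for customer, amount in batch:
        if profile.score_and_update(customer, amount):
            flagged.append((customer, amount))
```

Because the profile is updated incrementally, each payment is scored using only state already in memory, which is what makes a latency budget of hundreds of milliseconds plausible.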

The road ahead – 

How can leaders in the Banking industry leverage a predictive analytics based approach across each segment of the industry?

I posit that this will take place in four ways –

  • Using data to create digital platforms that better engage customers, partners and employees
  • Capturing & analyzing any and all data streams from both conventional and newer sources to compile a 360 degree view of the retail customer, institutional client or payment or fraud etc. This is critical to be able to market to the customer as one entity and to assess risk across that one entity as well as populations of entities
  • Creating data products by breaking down data silos and other internal organizational barriers
  • Using data driven insights to support a culture of continuous innovation and experimentation

The next & final post will examine specific Data Science techniques covering key algorithms and other computational approaches. We will also cover business & strategy recommendations to industry CXO’s embarking on Data Science projects.

Big Data & Advanced Analytics drive profits in Financial Services..(1/3)

“Silicon Valley is coming. There are hundreds of start-ups with a lot of brains and money working on various alternatives to traditional banking….the ones you read about most are in the lending business, whereby the firms can lend to individuals and small businesses very quickly and – these entities believe – effectively by using Big Data to enhance credit underwriting. They are very good at reducing the ‘pain points’ in that they can make loans in minutes, which might take banks weeks.” – Jamie Dimon, CEO of JP Morgan Chase, in his Annual Letter to Shareholders, Feb 2016 [1].

If Jamie Dimon’s opinion is anything to go by, the Financial Services industry is undergoing a major transformation and it is very evident that Banking as we know it will change dramatically over the next few years. This blog has spent some time over the last year defining the Big Data landscape in Banking. However the rules of the game are changing from mere data harnessing to leveraging data to drive profits. With that background, let us begin examining the popular applications of Data Science in the financial industry. This blog covers the motivation for and need of data mining in Banking. The next blog will introduce key usecases and we will round off the discussion in the third & final post by covering key algorithms, and other computational approaches.

The Banking industry produces the most data of any vertical out there with well defined & long standing business processes that have stood the test of time. Banks possess rich troves of data that pertain to customer transactions & demographic information. However, it is not enough for Bank IT to just possess the data. They must be able to drive change through legacy thinking and infrastructures as things change around the entire industry not just from a risk & compliance standpoint. For instance a major new segment are the millennial customers – who increasingly use mobile devices and demand more contextual services as well as a seamless unified banking experience – akin to what they commonly experience via the internet – at web properties like Facebook, Amazon,Uber, Google or Yahoo etc.

How do Banks stay relevant in this race? A large part of the answer is to make Big Data a strategic, boardroom level discussion and to take an industrial approach to predictive analytics. The approach currently in vogue, which treats these as one-off, tactical project investments, simply does not work or scale anymore. There are various organizational models that one could employ, ranging from a shared service to a line of business led approach. An approach that I have seen work very well is to build a Center of Excellence (COE) to create contextual capabilities, best practices and rollout strategies across the larger organization.

Banks need to lead with Business Strategy 

A strategic approach to industrializing analytics in a Banking organization can add massive value and competitive differentiation in five distinct categories –

  1. Exponentially improve existing business processes, e.g. risk data aggregation and measurement, financial compliance, fraud detection
  2. Help create new business models and go to market strategies – by monetizing multiple data sources – both internal and external
  3. Vastly improve customer satisfaction by generating better insights across the customer journey
  4. Increase security while expanding access to relevant data throughout the enterprise to knowledge workers
  5. Help drive end to end digitization

Financial Services gradually evolves from Big Data 1.0 to 2.0

Predictive analytics & data mining have only been growing in popularity in recent years. However, when coupled with Big Data, they are on their way to attaining a higher degree of business capability & visibility.

Let’s take a quick walk down memory lane…

In Big Data 1.0 (2009-2015), a large technical area of focus was ingesting huge volumes of data and processing them in a batch oriented fashion to serve a limited number of business use cases. In the era of 2.0, the focus is on enabling applications to perform complex processing at high, medium or low latency.
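
To make the contrast concrete, here is a minimal sketch in plain Python of the two processing styles. It is only an illustration: the transaction records, the `batch_totals` function and the `RunningTotals` class are hypothetical names invented for this example, and a real deployment would run on a cluster framework rather than in a single process.

```python
from collections import defaultdict

# Hypothetical transaction records: (customer_id, amount)
TRANSACTIONS = [
    ("c1", 120.0),
    ("c2", 40.0),
    ("c1", 80.0),
    ("c2", 60.0),
    ("c1", 100.0),
]


def batch_totals(transactions):
    """Batch style: scan the whole data set, report only at the end."""
    totals = defaultdict(float)
    for customer_id, amount in transactions:
        totals[customer_id] += amount
    return dict(totals)


class RunningTotals:
    """Low latency style: update state per event, answer immediately."""

    def __init__(self):
        self._totals = defaultdict(float)

    def on_event(self, customer_id, amount):
        # The updated total is available as soon as the event arrives.
        self._totals[customer_id] += amount
        return self._totals[customer_id]


batch_result = batch_totals(TRANSACTIONS)

stream = RunningTotals()
stream_results = [stream.on_event(c, a) for c, a in TRANSACTIONS]
```

Both styles arrive at the same final totals. The difference is when an answer is available: the batch job answers once per run, while the per-event version can feed a decision (for example, flagging a customer who crosses a spending threshold) while the transaction is still in flight.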

In the age of 1.0, Banking organizations across the spectrum, ranging from the mega banks to smaller regional banks to asset managers, have used the capability to acquire, store and process large volumes of data using commodity hardware at a much lower price point. This has resulted in a huge reduction in CapEx & OpEx spend on data management projects (Big Data also helps augment legacy investments in MPP systems, Data Warehouses, RDBMSs etc).

The age of Big Data 1.0 in financial services is almost over and the dawn of Big Data 2.0 is now upon the industry. One may ask, “What is the difference?” I would contend that while Big Data 1.0 largely dealt with the identification, on-boarding and broad governance of the data, 2.0 will begin the redefinition of business based on the ability to deploy advanced processing techniques across a plethora of new & existing sources of data. 2.0 will thus be about extracting richer insights from the onboarded data to serve customers better, stay compliant with regulation & create new businesses. The new role of ‘Data Scientist’, an interdisciplinary expert (part business strategist, part programmer, part statistician, part data miner & part business analyst), has come to represent one of the most highly coveted skill sets today.

Long before the term “Data Science” entered the technology lexicon, the Capital Markets employed advanced quantitative techniques. The emergence of Big Data has only opened up new avenues in machine learning, data mining and artificial intelligence.

                                                    Illustration: Data drives Banking

Why is that?

Hadoop, which is now really a platform ecosystem of 30+ projects as opposed to a standalone technology, has been reimagined twice and now forms the backbone of any financial services data initiative. Thus, Hadoop has evolved into a dual persona – an application platform in addition to being a platform for data storage & processing.

Why are Big Data and Hadoop the ideal platform for Predictive Analytics?

Big Data is dramatically changing the traditional approach to analytics with advanced solutions that are not only powerful and fast enough to detect fraud in real time but can also build models based on historical data (and deep learning) to proactively identify risks.

The reasons why Hadoop is emerging as the best choice for predictive analytics are –

  1. Access to advances in infrastructure & computing capabilities at a very low cost
  2. Monumental advances in the algorithmic techniques themselves, e.g. mathematical abilities, feature sets, performance etc
  3. Low cost & efficient access to tremendous amounts of data & the ability to store it at scale

Technologies in the Hadoop ecosystem such as ingestion frameworks (Flume, Kafka, Sqoop etc) and processing frameworks (MapReduce, Storm, Spark et al) have enabled the collection, organization and analysis of Big Data at scale. Hadoop supports multiple ways of running models and algorithms that are used to find patterns of customer behavior, business risks, cyber security violations, fraud and compliance anomalies in the mountains of data. Examples of these models include Bayesian filters, Clustering, Regression Analysis, Neural Networks etc. Data Scientists & Business Analysts have a choice of MapReduce, Spark (via Java, Python, R), Storm and SAS, to name a few, to create these models. Fraud model development, testing and deployment on fresh & historical data become very straightforward to implement on Hadoop.
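
As an illustration of one of the model families named above, here is a minimal sketch of a Bayesian filter (a naive Bayes classifier with Laplace smoothing) in plain Python. Everything in it is an assumption made for the example: the `HISTORY` records, the feature names and the `NaiveBayesFilter` class are hypothetical, and the code runs in a single process rather than on Hadoop or Spark. It is meant only to show the shape of "train on historical data, score fresh data".

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled history: (features, label)
HISTORY = [
    ({"channel": "web", "country": "domestic"}, "legit"),
    ({"channel": "web", "country": "domestic"}, "legit"),
    ({"channel": "branch", "country": "domestic"}, "legit"),
    ({"channel": "mobile", "country": "domestic"}, "legit"),
    ({"channel": "web", "country": "foreign"}, "fraud"),
    ({"channel": "mobile", "country": "foreign"}, "fraud"),
]


class NaiveBayesFilter:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows):
        self.label_counts = Counter(label for _, label in rows)
        # (label, feature name) -> counts of each observed value
        self.value_counts = defaultdict(Counter)
        # feature name -> set of all values seen in training
        self.vocab = defaultdict(set)
        for features, label in rows:
            for name, value in features.items():
                self.value_counts[(label, name)][value] += 1
                self.vocab[name].add(value)
        return self

    def log_scores(self, features):
        total = sum(self.label_counts.values())
        scores = {}
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            for name, value in features.items():
                seen = self.value_counts[(label, name)][value]
                # Smoothed log likelihood of this value given the label
                score += math.log((seen + 1) / (count + len(self.vocab[name])))
            scores[label] = score
        return scores

    def predict(self, features):
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Train on historical data, then score fresh transactions.
model = NaiveBayesFilter().fit(HISTORY)
fresh_web_foreign = model.predict({"channel": "web", "country": "foreign"})
fresh_branch_domestic = model.predict({"channel": "branch", "country": "domestic"})
```

On this toy history the foreign web transaction is scored as more likely fraudulent and the domestic branch transaction as legitimate. In practice the same train-then-score pattern would be expressed with a distributed library so that it can run over the full transaction history.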

However, the story around Big Data adoption in your average Bank is typically not all that revolutionary – it follows a more evolutionary cycle where a rigorous engineering approach is applied to gain small business wins before scaling up to more transformative projects. Leveraging an open enterprise Hadoop approach, Big Data centric business initiatives in financial services have begun realizing value in areas as diverse as the Defensive (Risk, Fraud and Compliance – RFC), achieving Competitive Parity (e.g. Single View of Customer) and the Offensive (Digital Transformation across their Retail Banking business, unified Trade Data repositories in Capital Markets).

With the stage thus set, the next post will describe compelling real world use cases for Predictive Analytics across the spectrum of 21st century banking.

References

1. http://www.businessinsider.com/jamie-dimon-shareholder-letter-and-silicon-valley-2015-4