Why Banks, Payment Providers and Insurers Should Digitize Their Risk Management..

"When models turn on, brains turn off." – Dr. Til Schuermann, formerly Research Officer in the Banking Studies function at the Federal Reserve Bank of New York, currently Partner at Oliver Wyman & Company.

There are two primary reasons for enterprises such as Banks, Insurers, Payment Providers and FinTechs to pursue best in class Risk Management processes and platforms. The first is compliance, driven by various regulatory reporting mandates such as the Basel reporting requirements, FRTB, the Dodd-Frank Act, Solvency II, CCAR and CAT/MiFID II in the United States & the EU. The second is the need to drive top-line sales growth by leveraging Digital technology. This post advocates the application of Digital technology to Risk Management across both areas.

Image Credit – Digital Enterprise

Recapping the Goals of Regulatory Reform..

There are many kinds of Risk, ranging from the three keystone categories – Credit, Market and Operational – to those addressed by the Basel II.5/III accords, FRTB, Dodd-Frank etc. The best enterprises not only manage Risk well but also turn it into a source of competitive advantage. Leading banks have recognized this: according to McKinsey forecasts, while risk-operational processes such as credit administration today account for some 50 percent of the Risk function’s staff and analytics just 15 percent, by 2025 those figures will be around 25 percent and 40 percent respectively. [1]

Whatever the kind of Risk, certain themes are common from a regulatory intent standpoint –

  1. Limiting risks that may cause wider harm to the economy by restricting certain activities such as preventing banks with retail operations from engaging in proprietary trading activities
  2. Requiring that banks increase the amount and quality of capital held in reserve to back their assets, and requiring higher liquidity positions
  3. Ensuring that banks put in place appropriate governance standards so that boards and management interact not just internally but also with regulators and their clients
  4. Upgrading governance standards more broadly, enabling a fundamental change in bank governance and the way boards interact with both management and regulators. These ambitions were expressed in various new post-crisis rules and approaches.
  5. Tackling the “too big to fail” challenge for highly complex businesses spanning multiple geographies, product lines and multifaceted customer segments. Accurate risk reporting ensures adequate capital conservation buffers.

Beyond the standard models used for regulatory Risk reporting, Banks & FinTechs are pushing the use of risk modeling into new areas such as retail and SME lending. Since the crisis of 2008, new entrants have begun offering alternatives to traditional financial services in areas such as payments, mortgage loans, cryptocurrency, crowdfunding, alternative lending, and investment management. The innovative use of Risk analytics lies at the core of these FinTechs’ success.

Across these areas, risk models are being leveraged in diverse ways, such as in marketing analytics to gain customers, defend against competition etc. For instance, realtime analytic tools are being used to improve the credit granting process. The intention is to gain increased acceptance by pre-approving qualified customers quickly, without the manual intervention that can cause weeks of delay. Again, according to McKinsey, leading Banks aim to approve up to 90 percent of consumer loans in seconds and to generate efficiencies of 50 percent, leading to revenue increases of 5 to 10 percent. Thus, leading institutions are using Risk Analytics to rethink their business models and to expand their product portfolios. [2]

Over the last two years, this blog has extensively covered areas such as cyber security, fraud detection, anti money laundering (AML) etc from a data analytics standpoint. The industry has treated Risk as yet another defensive function, but over the next 10 years the Risk function is expected to become an integral part of all of these areas, driving business revenue growth while detecting financial fraud and crime. There is no doubt that Risk is a true cross cutting concern across a range of business functions & not just the traditional Credit, Market, Liquidity and Operational silos. Risk strategy needs to be a priority at the highest levels of an organization.

The Challenges with Current Industry Risk Architectures..

Almost a year ago, we discussed these technology issues in the below blogpost. To recap – most industry players have a mishmash of organically developed & shrink wrapped IT systems. These platforms run critical applications ranging from Core Banking to Trade Lifecycle to Securities Settlement to Financial Reporting. Each of these systems operates in an application, workflow and data silo with its own view of the enterprise. These are all kept in sync largely via data replication & stovepiped process integration. Further, siloed risk functions ensure that different risk reporting applications are developed using duplicative technology paradigms, causing massive IT spend. Finally, the preponderance of complex vendor supplied systems ensures lengthy release cycles and complex data center deployment requirements.

The Five Deadly Sins of Financial Services IT..

Industry Risk Architectures Suffer From Five Limitations

 A Roadmap for Digitization of Risk Architectures..

The end state – what a Digital Risk function will look like – will vary for every institution embarking on this journey. We can still point out a few foundational guideposts.

#1 Automate Back & Mid Office Processes Across Risk and Compliance  –

As discussed, many business processes across the front, mid and back office involve risk management. These processes range from risk data aggregation, customer onboarding, loan approvals, regulatory compliance (AML, KYC, CRS & FATCA) and enterprise financial reporting to Cyber Security. It is critical to move any and all manual steps in these business functions to a highly automated model. Doing so will not only reduce operational costs in a big way but also demonstrate improved auditability to regulatory authorities.
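
To make this concrete, below is a minimal sketch of automating one compliance step that is often still manual – screening a new applicant against a watchlist during onboarding. The watchlist entries, field names and matching threshold are illustrative assumptions, not a production rule set.

```python
# A minimal sketch of automating one manual KYC step: screening new applicants
# against a sanctions/watch list. The watchlist entries, applicant fields and
# matching threshold are illustrative assumptions.
from difflib import SequenceMatcher

WATCHLIST = ["ACME SHELL HOLDINGS LLC", "JOHN Q FRAUDSTER"]  # hypothetical entries

def normalize(name: str) -> str:
    """Uppercase and strip punctuation so trivial formatting differences don't matter."""
    return "".join(ch for ch in name.upper() if ch.isalnum() or ch.isspace()).strip()

def screen_applicant(applicant_name: str, threshold: float = 0.85) -> dict:
    """Return the best watchlist match and whether the case needs analyst review."""
    candidate = normalize(applicant_name)
    best_score, best_entry = 0.0, None
    for entry in WATCHLIST:
        score = SequenceMatcher(None, candidate, normalize(entry)).ratio()
        if score > best_score:
            best_score, best_entry = score, entry
    return {
        "applicant": applicant_name,
        "closest_watchlist_entry": best_entry,
        "similarity": round(best_score, 2),
        "route_to_analyst": best_score >= threshold,  # only likely hits need a human
    }

if __name__ == "__main__":
    print(screen_applicant("John Q. Fraudster"))
```

Even a simple automated screen like this removes a repetitive manual check while leaving an auditable trail of what was matched and why.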

#2 Design Risk Architectures to handle Real time Data Feeds –

A critical component of Digital Risk is the need to incorporate real time data feeds across Risk applications. While Risk algorithms have traditionally dealt with historical data, new regulations such as FRTB explicitly call for liquidity horizons of various lengths. This implies that Banks need to run a full spectrum of analytics across many time-horizon buckets on data seeded from real time interactions. While the focus has been on the overall quality and auditability of data, the real time requirement is critical as one moves from front office applications such as customer onboarding, loan qualifications & pre-approvals to key areas such as market, credit and liquidity risk. Why is this critical? We have discussed the need for real time decision making insights for business leaders. Understanding risk exposures and performing root cause analysis in real time is a huge business capability for any Digital Enterprise.
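
As an illustration of what consuming such a feed can look like, here is a minimal sketch that reads trade events off a Kafka topic and maintains a running exposure per counterparty, alerting on a limit breach. The topic name, broker address, message schema and limit are assumptions made purely for the example.

```python
# A minimal sketch of feeding real-time trade events into a running exposure
# calculation using the kafka-python client. Topic, broker and the message
# fields ("counterparty", "notional") are hypothetical.
import json
from collections import defaultdict
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "trade-events",                          # hypothetical topic
    bootstrap_servers="localhost:9092",      # hypothetical broker
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

exposure = defaultdict(float)   # running notional exposure per counterparty
LIMIT = 50_000_000              # illustrative intraday limit

for message in consumer:
    trade = message.value
    cpty = trade["counterparty"]
    exposure[cpty] += trade["notional"]
    if exposure[cpty] > LIMIT:
        # In a real platform this would raise an alert to the risk dashboard.
        print(f"Exposure breach for {cpty}: {exposure[cpty]:,.0f}")
```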

#3 Experiment with Advanced Analytics and Machine Learning 

As real time risk reporting takes hold, the analytics themselves will begin to get considerably more complex. This technology complexity will only be made more difficult by multiple teams working on all of these areas, which calls for standardization of the calculations themselves across the firm. It also implies that, from an analytics standpoint, a large number of scenarios must be run on a large volume of data. For Risk to truly become a digital practice, the innovative uses of Data Science across areas such as customer segmentation, fraud detection and social graph analysis must all make their way into risk management. Insurance companies and Banks are already deploying self learning algorithms in applications that deal with credit underwriting, employee surveillance and fraud detection. Wealth Managers are deploying these in automated investment advisory. Thus, machine learning will support critical risk influenced areas such as Loan Underwriting, Credit Analytics, Single view of risk etc. All of these areas will need to leverage predictive modeling leading to better business decisions across the board.
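
By way of illustration, the sketch below trains a simple underwriting-style classifier on synthetic data using scikit-learn. The features and the synthetic default label are placeholders; a real model would be trained on governed historical loan performance data.

```python
# A minimal sketch of a self-learning credit model of the kind described above.
# The features and synthetic data are placeholders, not a real underwriting model.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 5_000
X = np.column_stack([
    rng.normal(650, 80, n),      # credit score
    rng.uniform(0, 1, n),        # utilization ratio
    rng.integers(0, 30, n),      # months since last delinquency
])
# Synthetic default flag loosely tied to the features, purely for illustration.
y = (X[:, 0] < 600) & (X[:, 1] > 0.6)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = GradientBoostingClassifier().fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```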

#4 Technology Led Cross Organization Collaboration –

McKinsey predicts [1] that in the coming five to ten years, different regulatory ratios such as capital, funding, leverage and total loss-absorbing capacity will drive the composition of the balance sheet to support profitability. Thus the risk function will work with the finance and strategy functions to help optimize the enterprise balance sheet across various economic scenarios, and then provide executives with strategic choices (e.g. increase or shrink a loan portfolio) along with the likely regulatory impacts across these scenarios. Leveraging analytical optimization tools, an improvement in return on equity (ROE) of anywhere between 50 and 400 basis points has been forecast.
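
To give a flavor of such analytical optimization, here is a minimal sketch that chooses how much balance sheet to allocate to two asset classes so that expected return is maximized subject to a risk-weighted capital constraint. All the numbers (yields, risk weights, capital, caps) are illustrative assumptions, not a real regulatory model.

```python
# A minimal sketch of balance-sheet optimization: maximize expected return
# subject to a capital-to-RWA constraint. All figures are illustrative.
from scipy.optimize import linprog

yields       = [0.06, 0.03]   # expected yield: loans, securities
risk_weights = [1.00, 0.20]   # illustrative regulatory risk weights
capital      = 10.0           # available capital ($bn)
min_ratio    = 0.105          # required capital / RWA ratio

# linprog minimizes, so negate yields to maximize return.
c = [-y for y in yields]
# Constraint: min_ratio * (rw1*x1 + rw2*x2) <= capital
A_ub = [[min_ratio * rw for rw in risk_weights]]
b_ub = [capital]
bounds = [(0, 80), (0, 80)]   # per-asset-class balance sheet caps ($bn)

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
loans, securities = result.x
print(f"Loans: {loans:.1f}bn, Securities: {securities:.1f}bn, "
      f"Expected return: {-result.fun:.2f}bn")
```

In practice the same idea scales to many asset classes and multiple simultaneous constraints (leverage, funding, TLAC), which is exactly where the risk, finance and strategy functions need to collaborate.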

The Value Drivers in Digitization of Risk Architectures..

McKinsey contends that the automation of credit processes and the digitization of the key steps in the credit value chain can yield cost savings of up to 50 percent. The benefits of digitizing credit risk go well beyond even these improvements. Digitization can also protect bank revenue, potentially reducing leakage by 5 to 10 percent. [2]

To give an example, by putting in place real-time credit decision making in the front line, banks reduce the risk of losing creditworthy clients to competitors as a result of slow approval processes. Additionally, banks can generate credit leads by integrating into their suite of products new digital offerings from third parties and FinTechs, such as unsecured lending platforms for business. Finally, credit risk costs can be further reduced through the integration of new data sources and the application of advanced-analytics techniques. These improvements generate richer insights for better risk decisions and ensure more effective and forward-looking credit risk monitoring. The use of machine-learning techniques, for example, can help banks improve the predictability of credit early-warning systems by up to 25 percent [2].

The Questions to Ask at the Start of Risk Transformation..

There are three questions every Enterprise needs to ask at the outset –

  • What customer focused business capabilities can be enabled across the organization by incorporating an understanding of the various kinds of Risk ?
  • What aspects of this Risk transformation can be enabled by digital technology? Where are the current organizational and technology gaps that inhibit innovation?
  • How do we measure ROI and business success across these projects before and after the introduction of digital technology? How do we benchmark ourselves from a granular process standpoint against the leaders?

Conclusion..

As the above makes clear, traditional legacy approaches to risk data management and reporting do not lend themselves well to managing the business effectively. Even when things are going well, it is difficult for executives and regulators to get a good handle on how the business is functioning. In the worst of times, the risk function can fail outright as models do not perform effectively. It is not enough to take an incremental approach to improving current analytics approaches. The need of the hour is to incorporate state of the art data management and analytic approaches based on Big Data, Machine Learning and Artificial Intelligence.

References

What Banks, Retailers & Payment Providers Should Do About Exploding Online Fraud in 2017..

Despite the introduction of new security measures such as EMV chip technology, 2016 saw the highest number of identity fraud victims in more than a decade, according to a new report from Javelin Strategy & Research and identity-theft-protection firm LifeLock Inc. [1]

Image Credit: Wall Street Journal

Background

Players across the Global Credit Card industry are facing new business pressures in strategic areas. Chief among these shifts are burgeoning online transaction volumes, increased regulatory pressure (e.g. PSD2 in the European Union) and disruptive competition from FinTechs.

As discussed in various posts in this blog in 2016 – Consumers, Banks, Law Enforcement, Payment Processors, Merchants and Private Label Card Issuers are faced with yet another critical & mounting business challenge – payment card fraud. Payment card fraud continued to expand at a massive clip in 2016 – despite the introduction of security measures such as EMV chip cards, multi-factor authentication, secure point of sale terminals etc. As the accessibility and modes of usage of credit, debit and other payment cards burgeon and transaction volumes increase across the globe, Banks are losing tens of millions of dollars on an annual basis to fraudsters.

Regular readers of this blog will recollect that we spent a lot of time last year discussing Credit Card Fraud in some depth. I have reproduced some of these posts below for background reading.

Big Data Counters Payment Card Fraud (1/3)…

Hadoop counters Credit Card Fraud..(2/3)

It’s time for a 2017 update on this issue.

Increasing Online Payments means rising Fraud

The growing popularity of alternative payment modes like Mobile Wallets (e.g. Apple Pay, Chase Pay and Android Pay) is driving increased payment volumes across both open loop and closed loop payments. Couple this with in-app payments (e.g. Uber) as well as Banking providers’ own Digital Wallets, and mobile payment volumes only keep increasing. Retailers like Walmart, Nordstrom and Tesco have also been offering more convenient in-store payments.

This relentless & secular trend towards online payments is being clearly seen in all forms of consumer and merchant payments across the globe. This trend will only continue to accelerate in 2017 as smartphone manufacturers continue to produce devices that have more onscreen real estate, driving more mobile commerce. With IoT technology taking center stage, the day is not far off when connected devices (e.g. wearables) make their own payments.

However, the convenience of online payments also confers anonymity, which increases the risk of fraud. Most existing fraud platforms were designed for a previous era – that of point of sale payments – with their focus on magnetic stripes, chips and EMV technology. Online payments thus present challenges that Banks and Merchants did not have to deal with on such a large scale.

According to the WSJ [1] more consumers (15.4 million in the US) became victims of identity fraud in 2016 than at any point in more than a decade. Despite new security protections implemented by the industry in the form of EMV – about $16 billion was lost to fraudulent purchases with online accounting for a 15% rise in cases.

Fraud is a pernicious problem which in a lot of cases leads to a much worse crime- identity theft. The U.S. Department of Justice (DOJ) terms Identity theft as “one of the most insidious forms of white collar crime”. Identity theft typically results in multiple instances of fraud, which exact a heavy toll on consumers, merchants, banks and the overall economy. Let us look at some specific recommendations for Payment providers to consider.


Sadly, the much hyped “Chips on your cards” are useless in countering online fraud..

Javelin Research noted in their study that the vast majority of identity theft fraud was linked to credit cards.[2]

Most credit card holders in the USA will remember 2016 as the year when electronic chip technology became ubiquitous and required at the majority of retail establishments. The media buzz around chips was that they would curtail fraudster activity. However, this has been accompanied by a large increase in online theft. Card-not-present (CNP) fraud, which is when a thief buys something online or by phone, rose 40%. [2]

So did account takeover fraud, where thieves access ongoing customer accounts and change the contact details/security information; these incidents increased 61% compared to 2015 and totaled around 1.4 million. [2]

It is very clear that the bulk of fraud happens over online transactions. It is here that the Banks must focus now. And online is a technology game.

How should Banks, Retailers & Payment Providers Respond..

Online card fraud revolves around the unauthorized stealing of an individual’s financial data. Fraudsters are engaging in a range of complex behaviors such as counterfeiting cards, committing mail fraud to open unauthorized accounts, online Card Not Present (CNP) transactions etc. Fraud patterns are quickly copied and reproduced across diverse geographies.

Let us consider five key areas where industry players need to make investments.


#1 Augment traditional Fraud Detection Systems & Architectures  with Big Data capabilities

Traditional fraud detection systems have been built leveraging expert systems or rules engines. These expert systems are highly mature as they take into account the domain experience and intuition of fraud analysts. Fraud patterns are captured as business rules in an IF..THEN.. format and made available in these systems. These rules describe a range of well understood patterns, as shown below.

If Consumer Credit = yes And Transaction amount ≤ 1000 And Card present = yes Then Fraud = no

Typically hundreds of such rules are applied in realtime to incoming transactions.
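
To make the mechanics concrete, the sketch below shows one way such rules can be encoded as simple predicates and evaluated against an incoming transaction. The specific rules and transaction fields are illustrative, not a production rulebook.

```python
# A minimal sketch of encoding IF..THEN fraud rules as predicates and applying
# them to incoming transactions. Rules and fields are illustrative only.
RULES = [
    # (rule name, predicate over a transaction dict)
    ("high_value_card_not_present",
     lambda t: not t["card_present"] and t["amount"] > 1000),
    ("foreign_ip_small_hours",
     lambda t: t["ip_country"] != t["home_country"] and t["hour"] < 5),
]

def evaluate(transaction: dict) -> list[str]:
    """Return the names of all rules a transaction trips."""
    return [name for name, predicate in RULES if predicate(transaction)]

txn = {"amount": 2500, "card_present": False, "ip_country": "RO",
       "home_country": "US", "hour": 3}
print(evaluate(txn))   # -> both rules fire; route the transaction for review
```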

Expert systems were built for the era of physical card usage and can thus only reason over a limited number of data attributes. In the online world they are focused on looking for factors such as known bad IP addresses or unusual login times, based on business rules and events. However, scammers have learnt to stay a step ahead and are leveraging computing advances to come up with ever new ways of cheating the banks. Big Data can help transform the detection process by enriching the data available to the fraud process, including traditional customer data, transaction data, third party fraud data, social data and location based data.

Big Data also provides capabilities to tackle the most complex types of fraud and to learn from fraud data & patterns so as to stay ahead of criminal networks. It is recommended that fraud systems be built using a layering paradigm, e.g. providing multiple levels of detection capability starting with a) configurable business rules (that describe a fraud pattern) as well as b) dynamic capabilities based on machine learning models (typically thought of as being more predictive). Fraud systems also need to adopt Big Data frameworks like Spark, Storm etc to move to a real time mode. Frameworks like Spark make it intuitive to implement advanced risk scoring based on user account behavior, suspicious behavior etc.

Advanced fraud detection systems augment the Big Data approach by building models of customer behavior at the macro level. They then use these models to detect anomalous transactions and flag them as potentially fraudulent.
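
A minimal sketch of that idea, assuming a per-customer history of spend, time-of-day and distance features, might look like the following; the features and the synthetic history are placeholders for illustration.

```python
# A minimal sketch of modeling customer behavior and flagging anomalous
# transactions with an unsupervised detector. Features and data are synthetic.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(7)
# Historical behavior: [amount, hour of day, distance from home in km]
history = np.column_stack([
    rng.normal(60, 20, 500),        # typical spend
    rng.normal(14, 3, 500),         # daytime purchases
    rng.exponential(5, 500),        # usually close to home
])

detector = IsolationForest(contamination=0.01, random_state=0).fit(history)

new_txns = np.array([
    [55, 13, 2],        # looks like normal behavior
    [900, 3, 4200],     # large amount, 3am, far from home
])
print(detector.predict(new_txns))   # 1 = normal, -1 = flagged as anomalous
```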


#2 Create Dynamic Single View of Cardholders

The Single View provides comprehensive business advantages, as captured here – http://www.vamsitalkstech.com/?p=2517. The SVC provides the ability to view a customer as a single entity (or Customer 360) across all channels, to profile customers, and to segment them into populations based on their behavior patterns. This will vastly help improve anomaly detection capabilities while also helping reduce the false positive problem.

#3 Adopt Graph Data processing capabilities

As noted above, fraudsters operate in concert, and fraud patterns are quickly copied and reproduced across diverse geographies. Thus, fraud displays a strong social element which leads to a higher risk of repetitive fraud across geographies.

The ability to link Social Network identities with customer profiles, to expose synthetic (or fraudulent) customer profiles and to reduce false identities is a key capability to possess. As fraud detection algorithms constantly analyze thousands of data points, it is important to perform network based analysis to understand whether an account, IP Address or fraud pattern is occurring across different and seemingly unrelated actors. The ability to search for the same telephone numbers, email accounts, social network profiles etc – in addition to machine data such as similar IP Addresses, device signatures and addresses – can be used to establish these connections. Thus, graph and network analysis lends a different dimension to detection.
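
The sketch below illustrates that kind of network analysis in miniature: link accounts that share identifying attributes (phone, email, device) and surface suspiciously connected clusters. The sample accounts are fabricated for illustration.

```python
# A minimal sketch of graph-based fraud-ring detection: connect accounts that
# share an attribute and inspect connected components. Sample data is fabricated.
import networkx as nx
from itertools import combinations
from collections import defaultdict

accounts = [
    {"id": "A1", "phone": "555-0100", "email": "x@mail.com", "device": "dev-9"},
    {"id": "A2", "phone": "555-0100", "email": "y@mail.com", "device": "dev-9"},
    {"id": "A3", "phone": "555-0199", "email": "y@mail.com", "device": "dev-7"},
    {"id": "A4", "phone": "555-0321", "email": "z@mail.com", "device": "dev-2"},
]

# Index accounts by each shared attribute value.
by_attribute = defaultdict(list)
for acct in accounts:
    for attr in ("phone", "email", "device"):
        by_attribute[(attr, acct[attr])].append(acct["id"])

# Build a graph with an edge between any two accounts sharing an attribute.
G = nx.Graph()
G.add_nodes_from(a["id"] for a in accounts)
for (attr, _), ids in by_attribute.items():
    for a, b in combinations(ids, 2):
        G.add_edge(a, b, shared=attr)

# Connected components larger than one account are candidate fraud rings.
rings = [c for c in nx.connected_components(G) if len(c) > 1]
print(rings)   # -> [{'A1', 'A2', 'A3'}]
```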


#4 Personalize Fraud Detection by Adopting Machine Learning

Incorporating as many sources of data (both deep and wide) into the decisioning process helps greatly in analyzing fraud. This data includes not just the existing sources – customer databases, data on historical spending patterns etc – but also credit reports, social media data and other datasets (e.g. Government watch-lists of criminal activity).

Some of these non-traditional sources are depicted below –

  • Geolocation Data
  • Purchase Channel Data
  • Website clickstream data
  • POS Sensor, Camera, ATM data
  • Social Media Data
  • Customer Complaint Data

Payment Providers assess the risk score of transactions in realtime based on hundreds of such attributes. Big Data enables this reasoning over more detailed and granular attributes. Advanced statistical techniques are used to incorporate behavioral (e.g. a transaction is outside the normal behavior for a consumer’s buying patterns), temporal and spatial signals. The models often weigh attributes differently from one another, thus separating the vast majority of good transactions from the small percentage of fraudulent ones.
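
As a toy illustration of weighting behavioral, temporal and spatial attributes differently, consider the sketch below. The weights, signals and thresholds are illustrative assumptions, not a calibrated production model.

```python
# A minimal sketch of combining differently weighted attributes into a single
# transaction risk score. Weights and signals are illustrative assumptions.
def risk_score(txn: dict, profile: dict) -> float:
    """Combine a few signals into a 0-1 score; higher means riskier."""
    signals = {
        # behavioral: spend relative to this customer's typical ticket size
        "amount_deviation": min(txn["amount"] / max(profile["avg_amount"], 1), 10) / 10,
        # temporal: purchases far outside the customer's usual hours
        "odd_hour": 1.0 if abs(txn["hour"] - profile["usual_hour"]) > 6 else 0.0,
        # spatial: distance from the customer's home region, capped at 1
        "distance": min(txn["km_from_home"] / 1000, 1.0),
        # channel: card-not-present carries more risk
        "card_not_present": 0.0 if txn["card_present"] else 1.0,
    }
    weights = {"amount_deviation": 0.4, "odd_hour": 0.15,
               "distance": 0.25, "card_not_present": 0.2}
    return sum(weights[k] * v for k, v in signals.items())

profile = {"avg_amount": 70, "usual_hour": 14}
txn = {"amount": 950, "hour": 3, "km_from_home": 4200, "card_present": False}
print(round(risk_score(txn, profile), 2))   # -> 1.0, well above any review threshold
```

In practice these weights are learned by the machine learning models discussed above rather than hand-tuned.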

We discussed the fact that fraud happens at every stage of the process – account opening, customer on-boarding, account validation & cross verification, card usage & chargebacks etc. It is imperative that fraud models be created and leveraged across the entire business workflow.


#5 Automate the Fraud Monitoring, Detection Lifecycle

Business Process Management (BPM) is a more prosaic and mature field compared to Big Data and Predictive Analytics. Pockets of BPM implementations exist at every large Bank in customer facing areas such as issuance, on-boarding, reporting, compliance etc. However, the ability to design and deploy automated processes is critical across the Cards fraud lifecycle. Areas like dispute management and false positive case resolution depend upon a robust Case Management capability – which a good BPM platform or tool can provide.

Improvements can be noticed in agent productivity, the number of cases handled per agent and overall customer satisfaction. Errors and lags due to issues in human driven manual processes come down. On the front end, providing customers with handy mobile apps to instantaneously report suspicious transactions, and tying those reports to automated handling, can drastically improve fraud detection, saving tens of millions of dollars. Major improvements can also be seen in compliance, dispute resolution and cross border customer service.

Conclusion  

Online fraud keeps rising year after year, so enterprises – especially banks and retailers – must remain vigilant. Online retail sales are expected to total nearly $28 trillion in 2020 [2] and it is a given that fraudsters will invent new techniques to steal customer data. Effective fraud prevention has become an essential part of the customer experience.

References

[1] WSJ – Credit Card Fraud Keeps Rising Despite New Security Chips – https://www.wsj.com/articles/credit-card-fraud-keeps-rising-despite-new-security-chipsstudy-1485954000

[2] Fortune – That Chip on Your Credit Card Isn’t Stopping Fraud After All – http://fortune.com/2017/02/01/credit-card-chips-fraud/

Demystifying Digital – Reference Architecture for Single View of Customer / Customer 360..(3/3)

The first post in this three part series on Digital Foundations @ http://www.vamsitalkstech.com/?p=2517 introduced the concept of Customer 360 or Single View of Customer (SVC). The second post in the series discussed the concept of Customer Journey Mapping (CJM) – http://www.vamsitalkstech.com/?p=3099. We discussed specific benefits from both a business & operational standpoint that are enabled by SVC & CJM. This third & final post focuses on the technical design & architecture needed to achieve both these capabilities.

Business Requirements for Single View of Customer & Customer Journey Mapping…

The following key business requirements need to be supported for three key personas – Customer, Marketing & Customer Service – from a SVC and CJM standpoint.

  1. Provide an Integrated Experience: A fully integrated omnichannel experience for both the customer and internal stakeholder (marketing, customer service, regulatory, managerial etc) roles. This means a few important elements – consistent information across all touchpoints, the right information to the right user at the right time, an ability to view the CJM graph with realtime metrics on Customer Lifetime Value (CLV) etc.
  2. Continuously Learning Customer Facing System: An ability for the customer facing portion of the architecture to learn constantly and fine-tune its understanding of the customer’s real time picture. This includes an ability to understand the customer’s journey.
  3. Contextual yet Seamless Movement across Channels: The ability for customers to transition seamlessly from one channel to the other while conducting business transactions.
  4. Ability to introduce Marketing Programs for existing Customers: An ability to introduce marketing, customer retention and other loyalty programs in a dynamic manner. This includes an ability to combine historical data with real time data about customer interactions and other responses like clickstreams – to provide product recommendations and real time offers.
  5. Customer Acquisition: An ability to perform low cost customer acquisition and to be able to run customized offers for segments of customers from a back-office standpoint.

Key Gaps in existing Single View (SVC) Architectures ..

It needs to be kept in mind that every organization is different from an IT legacy investment and operational standpoint. As such, a “one-size-fits-all” architecture is impossible to create. However, highlighted below are some common key data and application architecture gaps that I have observed from a data standpoint while driving to a SVC (Single View of Customer) with multiple leading enterprises.

  1. The lack of a single, unique & global customer identifier – The need to create a single universal customer identifier (based on various departmental or line of business identifiers) and to use it as a primary key in the customer master list
  2. Once the identifier is created in either the source system or in the datalake, organizations need to figure out a way to cascade that identifier into the Book of Record systems (CRM systems, webapps and ERP systems) so that the architecture can begin knitting together a single view of the customer. This may also involve periodically going out across the BOR systems, linking all the customer’s data and pulling it into the lake;
  3. Many companies deal with multiple customer on-boarding systems. At some point, these on-boarding processes need to be centralized. For instance in Banking, especially in Capital Markets, customer on-boarding is done in six or seven different areas; ideally all of these need to be consolidated into one.
  4. Graph Data Semantics – Once created, the Master Customer identifier should be mapped to all the other identifiers that lines of business use to uniquely identify their customer; the ability to use simple or more complex matching techniques (rule based matching, machine learning based matching & search based matching) is highly called for (a minimal matching sketch follows this list).
  5. MDM (Master Data Management) systems have traditionally automated some of this process by creating & owning that unique customer identifier. However Big Data capabilities help by linking that unique customer identifier to all the other ways the customer may be mapped across the organization. To this end,  data may be exported into an MDM system backed by a traditional RDBMS; or; the computation of the unique identifier can be done in a data lake and then exported into an MDM system.
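
As referenced in item 4 above, here is a minimal sketch of rule-based identity matching across line-of-business records to assign a master customer identifier. The field names and the match rule (shared email or shared normalized phone) are simplifying assumptions; a production implementation would sit behind an MDM engine or a proper union-find/graph-matching step.

```python
# A minimal sketch of rule-based identity matching across LOB records to
# assign a master customer identifier. Fields and match rules are illustrative.
import uuid
from collections import defaultdict

records = [
    {"lob": "cards",    "cust_id": "C-101", "email": "jane@x.com", "phone": "(555) 010-0001"},
    {"lob": "mortgage", "cust_id": "M-777", "email": "jane@x.com", "phone": "555-010-0001"},
    {"lob": "deposits", "cust_id": "D-432", "email": "bob@y.com",  "phone": "555-010-0099"},
]

def phone_key(p: str) -> str:
    """Normalize a phone number down to its digits."""
    return "".join(ch for ch in p if ch.isdigit())

# Group records that share an email or a normalized phone number.
clusters = defaultdict(set)
for rec in records:
    for key in (("email", rec["email"].lower()), ("phone", phone_key(rec["phone"]))):
        clusters[key].add((rec["lob"], rec["cust_id"]))

# Assign (or reuse) a master identifier per cluster. A production system would
# use a proper union-find / MDM engine to handle transitive chains robustly.
master_of = {}
for members in clusters.values():
    existing = {master_of[m] for m in members if m in master_of}
    master = existing.pop() if existing else str(uuid.uuid4())[:8]
    for m in members:
        master_of[m] = master

for (lob, cid), master in sorted(master_of.items()):
    print(f"{lob:9s} {cid} -> master {master}")
```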

Let us discuss the generic design of the architecture (depicted above) with a focus on the following subsystems –

A Reference Architecture for Single View of Customer/ Customer 360
  1. At the very top, different channels with different touch points are depicted. In today’s connected world, the customer experience spans multiple touch points throughout the customer lifecycle. A customer should be able to move through multiple touch points during the buying process. Customers should be able to start or pause transactions (e.g. an Auto Loan application) from one channel and restart/complete them from another.
  2. A Big Data enabled application architecture is chosen. This needs to account for two different data processing paradigms. The first is a realtime component. The architecture must be capable of handling events within a few milliseconds. The second is an ability to handle massive scale data analysis in a retrospective manner. Both these components are provided by a Hadoop stack. The real time component leverages – Apache NiFi, Apache HBase, HDFS, Kafka, Storm and Spark. The batch component leverages  HBase, Apache Titan, Apache Hive, Spark and MapReduce.
  3. The range of Book of Record and external systems send data into the central datalake. Both realtime and batch components highlighted above send the data into the lake. The design of the lake itself will be covered in more detail in the below section.
  4. Starting from the upper-left side, we have the Book of Record Systems sending across transactions. These are ingested into the lake using any of the different ingestion frameworks provided in Hadoop. E.g. Flume, Kafka, Sqoop, HDFS API for batch transfers etc.  The ingestion layer depicted is based on Apache NiFi and is used to load data into the data lake.  Functionally, it is made up of real time data loaders and end of day data loaders. The real time loaders load the data as it is created in the feeder systems, the EOD data loaders will adjust the data end of the day based on the P&L sign off and the end of day close processes.  The main data feeds for the system will be from the book of record transaction systems (BORTS) but there may also be multiple data feeds from transaction data providers and customer information systems.
  5. The UI Framework is standardized across all kinds of clients. For instance this could be an HTML 5 GUI Framework that contains reusable widgets that can be used for mobile and browser based applications. The framework also needs to deal with common mobile issues such as bandwidth and be able to automatically throttle the data back where bandwidth is limited. It also needs to facilitate the construction of large user defined pivot tables for ad hoc reporting. It utilizes UI framework components for its GUI construction and communicates with the application server via the web services layer.
  6. API access is also provided by Web Services for partner applications to leverage: this is the application layer that provides a set of RESTful web services that control the GUI behavior and that control access to the persistent data and the data that is cached on the data fabric.
  7. The transactions are taken through a pipeline of enrichment and the profiles of customers are stored in HBase (a minimal sketch of such a profile write follows this list).
  8. The core data processing platform is then based on a datalake pattern which has been covered in this blog before. It includes the following pattern of processing.
    1. Data is ingested in real time into an HBase database (which uses HDFS as the underlying storage layer). Tables are designed in HBase to store the profile of a trade and its lifecycle.
    2. Producers are authenticated at the point of ingest.
    3. Once the data has been ingested into HDFS, it is taken through a pipeline of processing (L0 to L3) as depicted in the below blogpost.

      http://www.vamsitalkstech.com/?p=667

  9. Speed Layer: The computational grid that makes up the Speed layer can be a distributed in memory data fabric like Infinispan or GemFire, or a computation process can be overlaid directly onto a stateful data fabric technology like Spark or GemFire. The choice is dependent on the language choices that have been made in building the other key analytic libraries. If multiple language bindings are required (e.g. C# & Java) then the data fabric will typically be a different product than the Grid.
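
As referenced in steps 7 and 8.1 above, here is a minimal sketch of persisting an enriched customer profile row into HBase using the happybase Thrift client. The table name, column family layout and row-key scheme are assumptions; in practice they would be governed by the lake’s data model, and the write would sit inside the streaming enrichment pipeline.

```python
# A minimal sketch of a real-time profile upsert into HBase via happybase.
# Table, column family and row-key scheme are hypothetical.
import json
import happybase

connection = happybase.Connection(host="hbase-thrift.example.internal", port=9090)
table = connection.table("customer_profile")   # hypothetical table with family "p"

def upsert_profile(customer_id: str, enriched: dict) -> None:
    """Write the latest enriched attributes under a customer-keyed row."""
    row_key = f"cust#{customer_id}".encode()
    table.put(row_key, {
        b"p:segment":        enriched["segment"].encode(),
        b"p:lifetime_value": str(enriched["clv"]).encode(),
        b"p:last_event":     json.dumps(enriched["last_event"]).encode(),
    })

upsert_profile("12345", {
    "segment": "mass-affluent",
    "clv": 18450.75,
    "last_event": {"channel": "mobile", "type": "loan_prequal"},
})
```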

Data Science for Customer 360

 Consider the following usecases that are all covered under Customer 360 –

  1. The ability to segment customers into categories based on granular data attributes
  2. Improve customer targeting for new promotions & increasing acquisition rate
  3. Increasing cross sell and upsell rates
  4. Understanding influencers among customer segments & helping these net promoters recommend products to other customers
  5. Performing market basket analysis of what products/services are typically purchased together
  6. Understanding customer risk profiles
  7. Creating realtime views of customer lifetime value (CLV)
  8. Reducing customer attrition

The obvious capability that underlies all of these is Data Science. Thus, Predictive Analytics is the key compelling paradigm that enables the buildout of the dynamic Customer 360.

The Predictive Analytics workflow always starts with a business problem in mind. Examples of these would be “A marketing project to detect which customers are likely to buy new products or services in the next six months based on their historical & real time product usage patterns – which are denoted by x,y or z characteristics” or “Detect realtime fraud in credit card transactions.” or “Perform certain algorithms based on the predictions”. In usecases like these, the goal of the data science process is to be able to segment & filter customers by corralling them into categories that enable easy ranking. Once this is done, the business is involved to setup easy and intuitive visualization to present the results. In the machine learning process, an entire spectrum of algorithms can be tried to solve such business problems.

A lot of times, business groups working on Customer 360 projects have a hard time explaining what they would like to see – both the data and the visualization. In such cases, a prototype makes things much easier from a requirements gathering standpoint. Once the problem is defined, the data scientist/modeler identifies the raw data sources (both internal and external) which are needed to tackle the business challenge. They spend a lot of time in the process of collating the data (from Oracle, DB2, Mainframe, Greenplum, Excel sheets, external datasets etc). The cleanup process involves fixing missing values, corrupted data elements, formatting fields that indicate time and date etc.

The Data Scientist working with the business needs to determine how much of this raw data is useful and how much of it needs to be massaged to create a Customer 360 view. Some of this data needs to be extrapolated to form the features using formulas – so that a model can be created. The models created often involve using languages such as R and Python.

Feature engineering takes in business features in the form of feature vectors and creates predictive features from them. The Data Scientist takes the raw features and creates a model using a mix of various algorithms. Once the model has been repeatedly tested for accuracy and performance, it is typically deployed as a service.
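
A minimal sketch of that feature-engineering-plus-model flow, expressed as a scikit-learn Pipeline so the same transformations run at training and scoring time, might look like the following. The raw columns and the churn label are illustrative assumptions.

```python
# A minimal sketch of a feature-engineering + model pipeline. The raw columns
# and the hypothetical "churned" label are placeholders for illustration.
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression

raw = pd.DataFrame({
    "tenure_months": [3, 48, 12, 60, 7, 24],
    "monthly_spend": [20.5, 310.0, 75.0, 410.0, 15.0, 120.0],
    "channel":       ["mobile", "branch", "web", "branch", "mobile", "web"],
    "churned":       [1, 0, 1, 0, 1, 0],   # hypothetical label
})

features = ColumnTransformer([
    ("numeric",     StandardScaler(), ["tenure_months", "monthly_spend"]),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["channel"]),
])

model = Pipeline([("features", features),
                  ("classifier", LogisticRegression())])
model.fit(raw.drop(columns="churned"), raw["churned"])
print(model.predict_proba(raw.drop(columns="churned"))[:, 1].round(2))
```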

The transformation phase involves writing code to be able to join up like elements so that a single client’s complete dataset is gathered in the Data Lake from a raw features standpoint. If more data is obtained as the development cycle is underway, the Data Science team has no option but to go back & redo the whole process.

Models as a Service (MaaS) is the Data Science counterpart to Software as a Service. The MaaS takes in business variables (often hundreds of them as inputs) and provides as output business decisions/intelligence, measurements and visualizations that augment decision support systems.
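
A minimal sketch of the MaaS idea is to wrap a trained model behind a REST endpoint so that downstream decision systems can call it. The endpoint path, payload fields and the pickled model path below are assumptions for illustration only.

```python
# A minimal sketch of Models-as-a-Service: a trained model exposed behind a
# REST endpoint using Flask. Endpoint, payload shape and model path are
# hypothetical and would mirror the model's expected features.
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)
with open("model.pkl", "rb") as f:        # hypothetical serialized model/pipeline
    model = pickle.load(f)

@app.route("/score", methods=["POST"])
def score():
    payload = request.get_json()          # payload shape depends on the model's features
    probabilities = model.predict_proba(payload["features"])[:, 1]
    return jsonify({"scores": probabilities.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```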

Once these models are deployed and updated nightly based on their performance – the serving layer takes advantage of them to drive real time 360 decisioning.

To Sum Up…

In this short series we have discussed that customers and data about their history, preferences, patterns of behavior, aspirations etc are the most important corporate asset. Big Data technology and advances made in data storage, processing and analytics can help architect a dynamic Single View that can help maximize competitive advantage across every industry vertical.

How the Industrial Internet of Things (IIoT) Digitizes Industrial Manufacturing..

In 2017, the chief strategic concerns for Global Product Manufacturers are manifold. These range from their ability to drive growth in new markets by creating products that younger customers need, to cutting costs via efficient high volume manufacturing spanning global supply chains & effective distribution and service. While the traditional lifecycle has always been a huge management challenge, the question now is how digital technology can help create new markets and drive higher margins in established areas. In this blogpost, we will consider how IIoT (Industrial Internet of Things) technology can do all of the above and foster new business models – by driving customer value on top of the core product.

Global Manufacturing is evolving from an Asset based industry to an Information based Digital industry. (Image Credit – GE)

A Diverse Industry Caught in Digital Dilemmas..

The last decade has seen tectonic changes in leading manufacturing economies. Along with a severe recession, employment in the industry has moved along the technology curve to a more skilled workforce. The services component of the industry is also steadily increasing, i.e. manufacturing now consumes business services and is itself presented as a service in certain sectors. The point is well made that this industry is not monolithic and there are distinct sectors with their own specific drivers for business success. [1]

           The diverse sectors within Global Manufacturing (McKinsey [1])

Global manufacturing operations have evolved differently across industry segments. McKinsey identifies five diverse segments across the industry

  1. Global innovators for local markets – Industries such as Chemicals, Auto, Heavy Machinery etc.
  2. Regional processing – Industries such as Rubber and Plastics products, Tobacco and Fabricated Metal products
  3. Energy intensive commodities – Industries supplying wood products, Petroleum and coke refining and Mineral based products
  4. Global technologies and innovators – Industries supplying Semiconductors, Computers and Office machinery
  5. Labor intensive tradables – These include textiles, apparel, leather, furniture, toys etc.

Each of the above five sectors has different geographical locations where production takes place; they have diverse supply chains, support models, efficiency requirements and technological focus areas. Varying competitive forces operate within each of these industries.

However the trend that is broadly applicable to all of them is the “Industrial Internet”.

Defining the Industrial Internet Of Things (IIoT)

The Industrial Internet of Things (IIoT) can be defined as an ecosystem of capabilities that interconnects machines, personnel and processes to optimize the industrial lifecycle. The foundational technologies that IIoT leverages are Smart Assets, Big Data, Realtime Analytics, Enterprise Automation and Cloud based services.

The primary industries impacted the most by the IIoT will include Industrial Manufacturing, the Utility industry, Energy, Automotive, Transportation, Telecom & Insurance.

Globally integrated manufacturers must constantly assess and fine-tune their strategy across the eight lifecycle stages discussed below. A key aspect is to be able to collect data throughout the process to derive real-time insights from the lifecycle, suppliers and customers. IoT technologies allied with Big Data techniques provide ways to store this data and to derive real-time & historical analytic insights. Thus the Manufacturing industry is moving to an entirely virtual world across its lifecycle, ranging from product development and customer demand monitoring to production and inventory management. This trend is being termed Industry 4.0 or Connected Manufacturing. As devices & systems become more interactive and intelligent, the data they send out can be used to optimize the lifecycle across the value chain, thus driving higher utilization of plant capacity and improved operational efficiencies.

Let us consider the impact of the IIoT across the lifecycle of Industrial Manufacturing.

IIOT moves the Manufacturing Industries from Asset Centric to Data Centric

The Industrial Internet of Things (IIoT) is a key enabler in digitizing the legacy manufacturing lifecycle. IIoT, Big Data and Predictive Analytics enable Manufacturers to reinvent their business models.

The Generic Product Manufacturing Lifecycle Overview as depicted in the above illustration covers the most important activities that take place in the manufacturing process. Please note that this is a high level overview and in future posts we will expand upon each stage accordingly.

The overall lifecycle can be broken down into the following eight steps:

  1. Globally Integrated Product Design
  2. Prototyping and Pre-Production
  3. Mass production
  4. Sales and Marketing
  5. Product Distribution
  6. Activation and Support
  7. Value Added Services
  8. Resale and Retirement

Industry 4.0/ IIoT impacts Product Design and Innovation

IIoT technology can have a profound impact on the above traditional lifecycle in the following ways –

  1. The ability to connect the different aspects of the value chain that hitherto have been disconnected. This will fundamentally transform the asset lifecycle, leading to higher manufacturing efficiencies, reduced wastage and more customer centric manufacturing (thus reducing recall rates)
  2. The ability to manage and integrate diverse data from sensors, machine data from operational systems, supplier channels & social media feedback to drive real time insights
  3. A Connected asset lifecycle that leads to better inventory management and drives optimal resupply decisions
  4. The ability to create new business models that leverage data across the lifecycle to enable better product usage, pay for performance or outcome based services, or even a subscription based usage model
  5. The ability to track real time insights across the customer base, thus leading to a more optimized asset lifecycle
  6. Reduced costs by allowing more operations – ranging from product maintenance to product demos and customer experience sessions – to occur remotely
Manufacturers have been connecting the value chain together for many years now. M2M (machine to machine) implementations have already led to rounds of improvements in the so called ‘illities’ metrics – productivity, quality, reliability etc. The real opportunity with IIoT is being able to create new business models that result from the convergence of Operational Technology (OT) with Information Technology (IT). This journey primarily consists of taking a brick and mortar industry and slowly turning it into a data driven industry.

The benefits of adopting the IIOT range from improved quality owing to better aligned, efficient and data driven processes, higher operational efficiency overall, products better aligned with changing customer requirements, tighter communication across interconnected products and supplier networks.

Deloitte has an excellent take on the disruption ongoing in manufacturing ecosystems and holds all of the below terms as synonymous – [2]

  • Industrial Internet

  • Connected Enterprise

  • SMART Manufacturing

  • Smart Factory

  • Manufacturing 4.0

  • Internet of Everything

  • Internet of Things for Manufacturing

Digital Applications are already being designed for specific device endpoints by thought leaders across manufacturing industries such as the Automakers. While the underlying mechanisms and business models differ across the above five manufacturing segments, all of the new age Digital applications leverage Big Data, Cloud Computing and Predictive Analytics at a minimum. Predictive Analytics is largely based on a combination of real time data processing & data science algorithms. These techniques extract insights from streaming data to provide digital services on existing toolchains, provide value added customer service, predict device performance & failures, improve operational metrics etc.

Examples abound. For instance, an excellent example in manufacturing is the notion of a Digital Twin, which Gartner called out last year in their disruptive trends for 2017. A Digital Twin is a software personification of an intelligent device or system. It forms a bridge between the real world and the digital world. In the manufacturing industry, digital twins can be set up to function as proxies of Things like sensors and gauges, coordinate measuring machines, vision systems, and white light scanning. This data is sent to a cloud based system where it is combined with historical data to better maintain the physical system.
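
A toy sketch of the idea follows: a software proxy that mirrors a physical asset, accumulates its sensor history and flags readings that drift outside an expected operating envelope. The sensor names and tolerances are illustrative assumptions, not a real twin of any particular machine.

```python
# A minimal sketch of a digital twin: a software proxy of a physical asset
# that records sensor history and flags out-of-envelope readings.
from statistics import mean

class DigitalTwin:
    def __init__(self, asset_id: str, expected: dict, tolerance: float = 0.15):
        self.asset_id = asset_id
        self.expected = expected          # e.g. {"temp_c": 70, "vibration_mm_s": 2.0}
        self.tolerance = tolerance        # allowed relative deviation
        self.history = {k: [] for k in expected}

    def ingest(self, reading: dict) -> list[str]:
        """Record a sensor reading and return any out-of-envelope alerts."""
        alerts = []
        for sensor, value in reading.items():
            self.history[sensor].append(value)
            baseline = self.expected[sensor]
            if abs(value - baseline) / baseline > self.tolerance:
                alerts.append(f"{self.asset_id}: {sensor}={value} outside envelope")
        return alerts

    def rolling_average(self, sensor: str, window: int = 10) -> float:
        """Smoothed recent behavior, useful for trend/drift monitoring."""
        return mean(self.history[sensor][-window:])

twin = DigitalTwin("pump-17", {"temp_c": 70, "vibration_mm_s": 2.0})
print(twin.ingest({"temp_c": 71, "vibration_mm_s": 2.1}))   # -> []
print(twin.ingest({"temp_c": 93, "vibration_mm_s": 3.4}))   # -> two alerts
```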

The wealth of data being gathered on the shop floor will ensure that Digital Twins will be used to reduce costs and increase innovation. Thus, in global manufacturing, Data Science will soon make its way onto the shop floor to enable the collection of insights from these software proxies.

What About the Technical Architecture..

For those readers inclined to follow the technology arc of this emerging trend, the below blogpost discusses an IIoT Reference Architecture to a great degree of technical depth –

A Digital Reference Architecture for the Industrial Internet Of Things (IIoT)..

References

  1. McKinsey & Company  – Global Manufacturing Outlook 2017 – http://www.mckinsey.com/business-functions/operations/our-insights/the-future-of-manufacturing
  2. Deloitte Press on Manufacturing Ecosystems – https://dupress.deloitte.com/dup-us-en/focus/industry-4-0/manufacturing-ecosystems-exploring-world-connected-enterprises.html

How Big Data & Advanced Analytics can help Real Estate Investment Trusts (REITS)

                                                         Image Credit – Kiplinger’s

Introduction…

Real Estate Investment Trusts (REITS) are financial companies that own various forms of commercial and residential real estate. These assets include office buildings, retail shopping centers, hospitals, warehouses, timberland, hotels etc. Real estate is growing quite nicely as a component of the global financial business. Given their focus on real estate investments, REITS have always occupied a specialized position in global finance.

Fundamentally, there are three types of REITS –

  1. Equity REITS which exclusively deal in acquiring, improving and selling properties with the aim of higher returns for their investors
  2. Mortgage REITS only buy and sell mortgages
  3. Hybrid REITS which do both #1 and #2 above

REITS have a reasonably straightforward business model – you take the yields from the properties you own, pay out the bulk of that income to your investors as dividends (a mandated 90% or more of taxable income) and reinvest the rest. Most of the traditional REIT business processes are well handled by conventional types of technology. However more and more REITs are being challenged to develop a compelling Big Data strategy that leverages their tremendous data assets.

The Five Key Big Data Applications for REITS… 

Let us consider the five key areas where advanced analytics built on a Big Data foundation can immensely help REITS.

#1 Property Acquisition Modeling 

REITS owners can leverage the rich datasets available around renter demographics, preferences, seasonality and economic conditions in specific markets to better guide capital decisions on acquiring property. This modeling needs to take into account land costs, development costs, fixture costs & any other sales and marketing costs to appeal to tenants. I’d like to call this the macro business perspective. From a micro business perspective, being able to better study individual properties using a variety of widely available data – MLS listings for similar properties, foreclosures, closeness to retail establishments, work sites, building profiles, parking spaces, energy footprint etc – can help them match tenants to their property holdings. All this is critical to getting their investment mix right to meet profitability targets.

                                  Click on the Image for a blogpost discussing Predictive Analytics in Capital Markets

#2 Portfolio Modeling 

REITS can leverage Big Data to perform more granular modeling of their MBS portfolios. As an example, they can feed in a lot more data into their existing models as discussed above. E.g.  Demographic data, macroeconomic factors et al.  

A simple scenario would be: if interest rates go up by X basis points, what does that mean for my portfolio exposure, default rate, cost picture, optimal times to buy certain MBSs etc? REITS can then use that information to enter hedges to protect against any downside. Big Data can also help with a range of predictive modeling across all of the above areas as discussed below. An example is to build a 360 degree view of a given investment portfolio.
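
A toy version of that rate-shock question is sketched below: approximate the change in value of an MBS portfolio for a parallel shift in rates using effective duration and convexity. The holdings and their risk measures are fabricated for illustration; a real model would layer in prepayment, default and funding assumptions.

```python
# A minimal sketch of a parallel rate-shock scenario on an MBS portfolio using
# a duration/convexity approximation. Holdings and risk measures are fabricated.
portfolio = [
    # (name, market value $mm, effective duration, convexity)
    ("Agency MBS Pool A", 120.0, 4.8, -1.2),   # negative convexity from prepayment risk
    ("Agency MBS Pool B",  80.0, 6.1, -0.8),
    ("CMBS Tranche",       50.0, 5.4,  0.4),
]

def pnl_for_shock(shock_bps: float) -> float:
    """Approximate P&L ($mm) for a parallel rate move of shock_bps basis points."""
    dy = shock_bps / 10_000
    total = 0.0
    for _, value, duration, convexity in portfolio:
        total += value * (-duration * dy + 0.5 * convexity * dy ** 2)
    return total

for shock in (25, 100, 200):
    print(f"+{shock}bps: {pnl_for_shock(shock):+.2f}mm")
```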

                                                         Click on Image for a Customer 360 discussion 

#3 Risk Data Aggregation & Calculations 

The instruments underlying the portfolios themselves carry large amounts of credit & interest rate risk. Big Data is a fantastic platform for aggregating and calculating many kinds of risk exposures, as the below link discusses in detail.

  

                                            Click on Image for a discussion of Risk Data Aggregation and Measurement 

 

#4 Detect and Prevent Money Laundering (AML)

Due to the global nature of investment funds flowing into real estate, REITS are highly exposed to money laundering and sanctions risks. Whether REITS operate in high risk geographies (India, China, South America, Russia etc) or have complex holding structures, they need to file SARs (Suspicious Activity Reports) with FinCEN. There has always been a strong case to be made that shady foreign entities and individuals launder ill gotten proceeds by buying US real estate. In early 2016, FinCEN began implementing Geographic Targeting Orders (GTOs). Title companies based in the United States are now required to clearly identify the real owners of limited liability companies (LLCs), partnerships and other legal entities being used to purchase high end residential real estate with cash.

AML as a topic is covered exhaustively in the below series of blogposts (please click on image to open the first one).

                                                         Click on Image for a Deepdive on AML

#5 Smart Cities, Net New Investments and Property Management

In the future, REITS will want to invest in Smart Cities, which are positioned to be leading urban centers offering mobility, green technology, personalized medicine, safe services, clean water, traffic management and other forward looking urban amenities. These Smart Cities target a new kind of client – upwardly mobile, technologically savvy, environment conscious millennials. According to RBC Capital Markets, Smart Cities present a massive investment opportunity for REITS. Such investments could provide REITS with income yields of around 10-20%. (Source – Ben Forster @ Schroders)

Smart Cities will be created using a number of high end technologies such as IoT, AI, Virtual Reality, Device Meshes etc. By 2020, it is estimated that these buildings will be generating an enormous amount of data that needs to be stored and analyzed by landlords.

As the below graphic from Cisco attests, the ability to work with IoT data to analyze a range of these micro investment opportunities is a Big Data challenge.

The ongoing maintenance and continuous refurbishment of rental properties is a large portion of the business operation of a REIT. The availability of smart sensors and such IoT devices that can track air quality, home appliance malfunction etc can help greatly with preventive maintenance.

Conclusion..

As can be seen from some of the above business areas, most REITS data needs require a holistic approach across the value chain (capital sourcing, investment decisions, portfolio management & operations). This approach spans various horizontal functions like Customer Segmentation, Property Acquisition, Risk, Finance and Business Operations.
The need of the hour for larger REITS is to move to a common model for data storage, model building and testing.  It is becoming increasingly obvious that Big Data can provide massive business opportunities for REITS.

The Three Habits of Highly Effective Real Time Enterprises…

“All I do is sit at home and watch Netflix.” – Kylie Irving

The Power of Real Time

Anyone who has used Netflix to watch a movie or used Uber to hail a ride knows how simple, time efficient, inexpensive and seamless it is to do either. Chances are that most users of Netflix and Uber would never again use a physical video store or a taxi service unless they did not have a choice. Thus it should not come as a surprise that within a short span of a few years, these companies have acquired millions of delighted customers using their products (which are just apps) while developing market capitalizations of tens of billions of dollars.

As of early 2016, Netflix had about 60 million subscribers[1] and is finding significant success in producing its own content thus continuing to grab market share from the established players like NBC, Fox and CBS. Most Netflix customers opt to ditch Cable and are choosing to stream content in real time across a variety of devices.

Uber is nothing short of a game changer in the ride sharing business. Not just in busy cities but also in underserved suburban areas, Uber’s services save plenty of time and money in enabling riders to hail cabs. In congested metro areas, Uber also provides near instantaneous rides for a premium, which motivates more drivers to service riders. Having used Uber in almost every continent in the world, I am not surprised that as of 2016 Uber dominates in terms of market coverage, operating in 400 cities in 70+ countries. [2]

What is the common theme in ordering a cab using Uber or viewing a movie on Netflix?

Answer – Both services are available at the click of a button, they’re lightning quick and constantly build on their understanding of your tastes, ratings and preferences. In short, they are Real Time products.

Why is Real Time such a powerful business capability?

In the Digital Economy, the ability to interact intelligently with consumers in real time is what makes possible the ability to create new business models and to drive growth in existing lines of business.

So, what do Real Time Enterprises do differently?

What underpins a real time enterprise are three critical factors or foundational capabilities as shown in the below illustration. For any enterprise to be considered real time, the presence of these three components is what decides the pace of consumer adoption. Real time capabilities are part business innovation and part technology.

Let us examine these…

#1 Real Time Businesses possess a superior Technology Strategy

First and foremost, business groups must be able to define a vision for what they would like their products and services to be able to do in order to acquire younger and more dynamic consumers.

As companies adopt new business models, the technologies that support them must also change along with the teams that deliver them. IT departments have to move to more of a service model while delivering agile platforms and technology architectures for business lines to develop products around.

Why Digital Disruption is the Cure for the Common Data Center..

It needs to be kept in mind that these new approaches should be incubated gradually. They must almost always be business or use case driven at first.

#2 Real Time Enterprises are extremely smart about how they leverage data

The second capability is the ability to break down data silos in an organization. Most organizations have no idea what to do with all the data they generate. Sure, they use a fraction of it to perform business operations, but beyond that most of this data is simply let go. As a consequence, they fail to view their customer as a dynamic yet unified entity, and they have no idea how to market more products or estimate the risk being run on the customer's behalf. There is also a growing emphasis on the role of the infrastructure within service orientation. As the common factor present throughout an organization, the networking infrastructure is potentially the ideal tool for breaking down the barriers that exist between the infrastructure, the applications and the business. Consequently, adding greater intelligence into the network is one way of achieving the levels of virtualization and automation that are necessary in a real-time operation.

Across Industries, Big Data Is Now the Engine of Digital Innovation..

#3 Real Time Enterprises use Predictive Analytics and they automate the hell out of every aspect of their business

Real time enterprises get the fact that relying only on Business Intelligence (BI) dashboards is largely passé. BI implementations base their insights on data that is typically stale, often by days, and BI operates in a highly siloed manner based on long cycles of data extraction, transformation, indexing etc.

However, if products are to be delivered over mobile and other non-traditional channels, then BI is ineffective at providing the just-in-time analytics that can drive an understanding of a dynamic consumer's wants and needs. The Real Time enterprise demands that workers at many levels, ranging from line of business managers to executives, have fresh, high quality and actionable information on which they can base complex yet high quality business decisions. These insights are only enabled by Data Science and Business Automation. When deployed strategically, these techniques can scale to enormous volumes of data and help reason over them, reducing manual costs. They can take on business problems that can’t be managed manually because of the huge amount of data that must be processed.

Why Big Data & Advanced Analytics are Strategic Corporate Assets..

Conclusion..

Real time Enterprises do a lot of things right. They constantly experiment with creating new business capabilities and refining existing ones with a view to making them appealing to a rapidly changing clientele. They refine these using constant feedback loops and create cutting edge technology stacks that dominate the competitive landscape. Enterprises need to make the move to becoming Real time.

Neither Netflix nor Uber are sitting on their laurels. Netflix (which discontinued mail-in DVDs and moved to an online only model a few years ago) continues to expand globally, betting that the convenience of the internet will eventually turn it into a major content producer. Uber is prototyping self driving cars in Pittsburgh and intends to roll out its own fleet of self driving vehicles, eventually replacing its current 1.5 million drivers, while also starting a food delivery business around urban centers[4].

Sure, the ordinary organization is no Netflix or Uber and when a journey such as the one to real time capabilities is embarked on, things can and will go wrong in this process. However, the cost of continuing with business as usual can be incalculable over the next few years.  There is always a startup or a competitor that wants to deliver what you do at much lower cost and at a lightning fast clip. Just ask Blockbuster and the local taxi cab company.

References

[1] Netflix Statistics 2016 – Statistica.com

[2] Fool.com “Just how dominant is Uber” – http://www.fool.com/investing/general/2015/05/24/lyft-vs-uber-just-how-dominant-is-uber-ridesharing.aspx

[3] Expanded Ramblings – “Uber Statistics as of Oct 2016” http://expandedramblings.com/index.php/uber-statistics/

[4] Uber Self driving cars debut in Pittsburgh – http://www.wsj.com/articles/inside-ubers-new-self-driving-cars-in-pittsburgh-1473847202

Why Big Data & Advanced Analytics are Strategic Corporate Assets..

The industry is all about Digital now. The explosion in data storage and processing techniques promises to create new digital business opportunities across industries. Business Analytics concerns itself with deriving insights from data that is produced as a byproduct of business operations as well as from external data that reflects customer behavior. Due to its critical importance in decision making, Business Analytics is now a boardroom matter and not just one confined to IT teams. My goal in this blogpost is to quickly introduce the analytics landscape before moving on to the significant value drivers that only Predictive Analytics can provide.

The Impact of Business Analytics…

The IDC “Worldwide Big Data and Analytics Spending Guide 2016” predicts that the big data and business analytics market will grow from $130 billion at the end of this year to $203 billion by 2020[1], a compound annual growth rate (CAGR) of 11.7%. This rapid growth is being experienced across industry verticals such as banking & insurance, manufacturing, telecom, retail and healthcare.

Further, during the next four years, IDC finds that large enterprises with 500+ employees will be the main driver in big data and analytics investing, accounting for about $154 billion in revenue. The US will lead the market with around $95 billion in investments during the next four years – followed by Western Europe & the APAC region [1].
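
As a quick sanity check on the growth rate quoted above, a minimal calculation using only the 2016 and 2020 market-size figures (a four-year horizon is assumed) reproduces a CAGR in the same ballpark:

```python
# Implied CAGR from the quoted market sizes (figures in $ billions).
start, end, years = 130.0, 203.0, 4
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # roughly 11.8%, in line with the ~11.7% quoted by IDC
```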

The two major kinds of Business Analytics…

When we discuss the broad topic of Business Analytics, it needs to be clarified that there are two major disciplines – Descriptive and Predictive. Industry analysts from Gartner & IDC etc. will tell you that one also needs to widen the definition to include Diagnostic and Prescriptive. Having worked in the industry for a few years, I can safely say that these can be subsumed into the above two major categories.

Let’s define the major kinds of industrial analytics at a high level –

  • Descriptive Analytics is commonly described as being retrospective in nature, i.e. “tell me what has already happened”. It covers a range of areas traditionally considered as BI (Business Intelligence). BI focuses on supporting operational business processes like customer onboarding, claims processing, loan qualification etc. via dashboards, process metrics and KPIs (Key Performance Indicators). It also supports a basic level of mathematical techniques for data analysis (such as trending, aggregation etc.) to infer intelligence from the same. BI is a traditional & well established analytical domain that essentially takes a retrospective look at business data in systems of record. The goal of the Descriptive disciplines is primarily to look for macro or aggregate business trends across different dimensions such as time, product lines, business units & operating geographies.

  • Predictive Analytics is the forward looking branch of analytics which tries to predict the future based on information about the past. It describes what “can happen based on the patterns in data”. It covers areas like machine learning, data mining, statistics, data engineering & other advanced techniques such as text analytics, natural language processing, deep learning, neural networks etc. A more detailed primer on both, along with detailed use cases, can be found here –

The Data Science Continuum in Financial Services..(3/3)

The two main domains of Analytics are complementary yet different…

Predictive Analytics does not intend to, nor will it, replace the BI domain; rather, it adds sophisticated analytical capabilities that enable businesses to do more with all the data they collect. It is not uncommon to find real world business projects leveraging both these analytical approaches.

However from an implementation standpoint, the only common area of both approaches is knowledge of the business and the sources of data in an organization. Most other things about them vary.

For instance, predictive approaches both augment & build on the BI paradigm by adding a “What could happen” dimension to the data.

The Descriptive Analytics/BI workflow…

BI projects tend to follow a largely structured process which has been well defined over the last 15-20 years. As the illustration below describes, data produced in operational systems is extracted, transformed and eventually loaded into a data warehouse for consumption by visualization tools.

                                                                       Illustration – The Descriptive Analytics Workflow
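
To make the workflow above concrete, here is a minimal sketch of the classic extract-transform-load pattern behind most BI reporting. The database files, the orders table and the daily_revenue fact table are hypothetical placeholders, and a real pipeline would use a proper ETL or warehousing tool rather than two SQLite files:

```python
# A minimal ETL sketch: extract rows from an operational store, apply a simple
# aggregation, and load the result into a warehouse table for dashboards to query.
import sqlite3
import pandas as pd

source = sqlite3.connect("operational.db")   # stand-in for an OLTP system of record
warehouse = sqlite3.connect("warehouse.db")  # stand-in for the EDW

# Extract: pull order records from the operational system (hypothetical schema)
orders = pd.read_sql_query(
    "SELECT order_id, customer_id, amount, order_date FROM orders", source
)

# Transform: aggregate to the grain that the dashboards report on
daily_revenue = (
    orders.groupby("order_date", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total_revenue"})
)

# Load: append into the warehouse fact table consumed by the BI/visualization layer
daily_revenue.to_sql("daily_revenue", warehouse, if_exists="append", index=False)
```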

Descriptive Analytics and BI add tremendous value to well defined use cases based on a retrospective look at data.

However, key challenges with this process are –

  1. the lack of a platform to standardize and centralize data feeds leads to data silos, which cause all kinds of cost & data governance headaches across the landscape
  2. complying with regulatory initiatives (such as Anti Money Laundering or Solvency II etc.) needs the warehouse to handle varying types of data, which is a huge challenge for most EDW technologies
  3. adding new & highly granular fields to the data feeds in an agile manner is difficult, since extensive relational modeling is required upfront to handle newer kinds of schemas etc.

Big Data platforms have overcome past shortfalls in security and governance and are being used in BI projects at most organizations. An example of the usage of Hadoop in classic BI areas like Risk Data Aggregation are discussed in depth at the below blog.

http://www.vamsitalkstech.com/?p=2697

This space serves a large existing base of customers, but the industry has been looking to Big Data as a way of constructing a central data processing platform which can help with the above issues.

BI projects are predicated on using an EDW (Enterprise Data Warehouse) and/or RDBMS (Relational Database Management System) approach to store & analyze the data. Both these kinds of data storage and processing technologies are legacy in terms of both the data formats they support (Row-Column based) as well as the types of data they can store (structured data).

Finally, these systems fall short when processing the data volumes generated by digital workloads, which tend to be loosely structured (e.g. mobile application front ends, IoT devices like sensors, ATM machines or Point of Sale terminals), which need business decisions to be made in near real time or in micro batches (e.g. detecting credit card fraud or suggesting the next best action for a bank customer), and which are increasingly cloud & API based to save on costs & to provide self-service.

That is where Predictive Approaches on Big Data platforms are beginning to shine and fill critical gaps.
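
As a purely conceptual sketch of the micro-batch decisioning pattern described above (deliberately not tied to Spark, Storm or any other engine), the loop below reads loosely structured JSON events in small batches and applies a scoring rule to each batch. The file name, the event fields and the score_batch() stand-in for a deployed model are all hypothetical:

```python
# Score loosely structured events in micro-batches (conceptual sketch only).
import json
from itertools import islice

def score_batch(events):
    # Stand-in for a deployed model; here we simply flag large card transactions.
    return [e for e in events if e.get("amount", 0) > 5000]

def micro_batches(stream, batch_size=100):
    # Group an unbounded iterator of JSON lines into small batches.
    while True:
        batch = list(islice(stream, batch_size))
        if not batch:
            break
        yield [json.loads(line) for line in batch]

with open("card_events.jsonl") as stream:  # stand-in for a Kafka topic or event feed
    for batch in micro_batches(stream):
        for alert in score_batch(batch):
            print("Potential fraud:", alert.get("transaction_id"))
```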

The Predictive Analytics workflow…

Though the field of predictive analytics has been around for years, it is rapidly witnessing a rebirth with the advent of Big Data. Hadoop ecosystem projects are enabling the easy ingestion of massive quantities of data, thus helping the business gather far more attributes about their customers and their preferences.

                                                                    Illustration – The Predictive Analytics Workflow

The Predictive Analytics workflow always starts with a business problem in mind. Examples of these would be “A marketing project to detect which customers are likely to buy new products or services in the next six months based on their historical & real time product usage patterns – which are denoted by x, y or z characteristics” or “Detect real-time fraud in credit card transactions.”

In use cases like these, the goal of the data science process is to be able to segment & filter customers by corralling them into categories that enable easy ranking. Once this is done, the business is involved in setting up easy and intuitive visualizations to present the results.

A lot of times, business groups have a hard time explaining what they would like to see – both the data and the visualization. In such cases, a prototype makes things easier from a requirements gathering standpoint. Once the problem is defined, the data scientist/modeler identifies the raw data sources (both internal and external) which are needed to tackle the business challenge. They spend a lot of time collating the data (from Oracle/SQL Server, DB2, Mainframes, Greenplum, Excel sheets, external datasets, etc.). The cleanup process involves fixing missing values, correcting corrupted data elements, formatting fields that indicate time and date, etc.

The data wrangling phase involves writing code to be able to join various data elements so that a single client’s complete dataset is gathered in the Data Lake from a raw features standpoint.  If more data is obtained as the development cycle is underway, the Data Science team has no option but to go back & redo the whole process. The modeling phase is where algorithms come in – these can be supervised or unsupervised. Feature engineering takes in business concepts & raw data features and creates predictive features from them. The Data Scientist takes the raw & engineered features and creates a model using a mix of various algorithms. Once the model has been repeatedly tested for accuracy and performance, it is typically deployed as a service. Models as a Service (MaaS) is the Data Science counterpart to Software as a Service. The MaaS takes in business variables (often hundreds of inputs) and provides as output business decisions/intelligence, measurements, & visualizations that augment decision support systems.
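
A minimal sketch of this workflow, using pandas and scikit-learn, is shown below. The input file, the column names and the product-propensity use case are hypothetical placeholders; a real project would involve far more data wrangling, feature engineering, validation and model governance than this toy example suggests:

```python
# A toy end-to-end predictive workflow: cleanup, feature engineering, modeling,
# and a simple "Models as a Service"-style scoring function.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Data collation & cleanup: load the collated customer dataset and fix missing values
df = pd.read_csv("customer_history.csv")  # hypothetical collated dataset
df["monthly_spend"] = df["monthly_spend"].fillna(df["monthly_spend"].median())

# Feature engineering: derive a predictive feature from raw attributes
df["spend_per_product"] = df["monthly_spend"] / df["num_products"].clip(lower=1)

features = ["monthly_spend", "num_products", "tenure_months", "spend_per_product"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["bought_new_product"], test_size=0.3, random_state=42
)

# Modeling: fit and evaluate before any deployment decision
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("Test AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# "Models as a Service": the trained model exposed behind a simple scoring function
def score_customer(record: dict) -> float:
    row = pd.DataFrame([record])
    row["spend_per_product"] = row["monthly_spend"] / row["num_products"].clip(lower=1)
    return float(model.predict_proba(row[features])[:, 1][0])
```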

 How Predictive Analytics changes the game…

Predictive analytics can bring about transformative benefits in the following six ways.

  1. Predictive approaches can be applied to a much wider & richer variety of business challenges, thus enabling an organization to achieve outcomes that were not really possible with the Descriptive variety. These use cases range from Digital Transformation to fraud detection to marketing analytics to IoT (Internet of Things) across industry verticals. Predictive approaches are real-time and not just batch oriented like the Descriptive approaches.
  2. When deployed strategically, they can scale to enormous volumes of data and help reason over them, reducing manual costs. They can take on problems that can’t be managed manually because of the huge amount of data that must be processed.
  3. They can predict the results of complex business scenarios by probabilistically predicting different outcomes across thousands of variables, perceiving minute dependencies between them. An example is social graph analysis to understand which individuals in a given geography are committing fraud and whether there is a ring operating (a toy sketch of this idea follows this list).
  4. They are vastly superior at handling fine grained data of manifold types than the traditional approach or manual processing. The predictive approach also encourages the integration of previously “dark” data as well as newer external sources of data.
  5. They can also suggest specific business actions (e.g. based on the above outcomes) by mining data for hitherto unknown patterns. The data science approach constantly keeps learning in order to increase the accuracy of its decisions.
  6. Data Monetization – they can be used to interpret the mined data to discover solutions to business challenges and new business opportunities/models.
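
As a toy illustration of the social graph idea in point 3, the snippet below treats accounts linked by shared attributes or money flows as edges in a graph and surfaces connected clusters as candidate fraud rings. The edge list and account names are invented for the example; a production system would build the graph from real entity-resolution and transaction data:

```python
# Surface candidate fraud "rings" as connected components of an account graph.
import networkx as nx

# Each edge links two accounts that share a device, an address or a money flow (hypothetical data)
edges = [
    ("acct_1", "acct_2"), ("acct_2", "acct_3"), ("acct_3", "acct_1"),  # a dense cluster
    ("acct_7", "acct_8"),                                              # an isolated pair
]

G = nx.Graph(edges)

# Larger, denser components would be prioritized for analyst review in practice.
for component in nx.connected_components(G):
    if len(component) >= 3:
        print("Possible ring:", sorted(component))
```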

References

[1] IDC Worldwide Semiannual Big Data and Business Analytics Spending Guide – Oct 2016, “Double-Digit Growth Forecast for the Worldwide Big Data and Business Analytics Market Through 2020 Led by Banking and Manufacturing Investments, According to IDC”

http://www.idc.com/getdoc.jsp?containerId=prUS41826116

 

Embedding A Culture of Business Analytics into the Enterprise DNA..

“IT driven business transformation is always bound to fail” – Amber Storey, Sr Manager, Ernst & Young

The value of Big Data driven Analytics is no longer in question, both from a customer as well as an enterprise standpoint. Lack of investment in an analytics strategy has the potential to impact shareholder value negatively. Business Boards and CXOs are now concerned about the overall level and maturity of their investments in terms of business value – i.e. increasing sales, driving down business & IT costs & helping create new business models. It is thus increasingly clear that smart applications, & the ecosystems built around them, will dictate enterprise success.

Such examples among forward looking organizations abound across industries. These range from realtime analytics in manufacturing using IoT data streams across the supply chain, to the use of natural language processing to drive patient care decisions in healthcare, to more accurate insurance fraud detection & Digital interactions in Retail Banking, to name a few.

However, most global organizations currently adopt a fairly tactical approach to delivering traditional business intelligence (BI) and predictive analytics on their application platforms. This departmental approach is quite suboptimal, as scalable data driven decisions & culture not only empower decision-makers with up to date and realtime information but also help them develop long term insights into how globally diversified business operations are performing. Scale is the key word here, given rapidly changing customer trends, partner & supply chain realities and regulatory mandates.

Scale implies speed of learning and business agility across the organization – the ability to have globally diversified operations turn on a dime, thus ensuring that the business feels empowered.

A quick introduction to Business (Descriptive & Predictive) Analytics –

Business intelligence (BI) is a traditional & well established analytical domain that essentially takes a retrospective look at business data in systems of record. The goal for BI is primarily to look for macro or aggregate business trends across different aspects or dimensions such as time, product lines, business units & operating geographies.

BI is primarily concerned with “What happened and what trends exist in the business based on historical data?”. The typical use cases for BI include budgeting, business forecasts, reporting & key performance indicators (KPIs).

On the other hand, Predictive Analytics (a subset of Data Science) augments & builds on the BI paradigm by adding a “What could happen” dimension to the data in terms of –

  • being able to probabilistically predict different business scenarios across thousands of variables
  • suggesting specific business actions based on the above outcomes

Predictive Analytics does not intend to nor will it replace the BI domain but only adds significant business capabilities that lead to overall business success. It is not uncommon to find real world business projects leveraging both these analytical approaches.

Creating an industrial approach to analytics – 

Strategic business projects typically imbibe a BI/Predictive Analytics based approach as an afterthought to the other aspects of system architecture and buildout. This dated approach ensures that analytics remains external to the business system and eventually operates in a reactive mode.

Having said that, one does need to recognize that an industrial approach to analytics is a complex endeavor that depends on how an organization tackles the convergence of the below approaches –

  1. Organizational Structure
  2. New Age Technology 
  3. A Platforms Mindset
  4. Culture

        Illustration – Embedding A Culture of Business Analytics into the Enterprise DNA..

Let’s discuss them briefly –

Organizational Structure – The historical approach has been to staff analytics teams as a standalone division, often reporting to a CIO. This team has responsibility for both business intelligence and a siloed data strategy. Such a piecemeal approach to predictive analytics ensures that business & application teams adopt a “throw it over the wall” mentality over time.

So what needs to be done? 

In the Digital Age, enterprises should look to centralize both data management as well as the governance of analytics as core business capabilities. I suggest a hybrid organizational structure where a Center of Excellence (COE) is created which reports to the office of the Chief Data Officer (CDO) as well as individual business analytic leaders within the lines of business themselves.

 This should be done to ensure that three specific areas are adequately tackled using a centralized approach- 

  • Investing in creating a data & analytics roadmap by creating a center of excellence (COE)
  • Setting appropriate business milestones with “lines of business” value drivers built into a robust ROI model
  • Managing Risk across the enterprise with detailed scenario planning

New Age Technology –

The onset of Digital Architectures in enterprise businesses implies the ability to drive continuous online interactions with global consumers/customers/clients or patients. The goal is not just to provide engaging visualization but also to personalize services clients care about across multiple modes of interaction. Mobile applications first began forcing enterprises to support multiple channels of interaction with their consumers. We have seen how exploding data generation across the global economy has become a clear & present business & IT phenomenon, with data volumes rapidly expanding across industries. However, it is not just the production of data that has increased; so has the need for organizations to derive business value from it. This calls for the collection & curation of data from dynamic and highly distributed sources such as consumer transactions, B2B interactions, machines such as ATMs & geo location devices, click streams, social media feeds, server & application log files and multimedia content such as videos – using Big Data.

Cloud Computing is the ideal platform to provide the business with self service as well as rapid provisioning of business analytics. Every new application designed needs to be cloud native from the get go.

These digital interactions span multiple channels. Banking, for example, now requires an ability to engage consumers in a seamless experience across an average of four to five channels – Mobile, eBanking, Call Center, Kiosk etc.

A Platforms Mindset – 

As opposed to building standalone or one-off business applications, a Platform Mindset is a more holistic approach capable of producing higher revenues. Platforms abound in the webscale world at shops like Apple, Facebook & Google. Applications are constructed like lego blocks and reuse customer & interaction data to drive cross sell and up sell among different product lines. The key is to start off with products that have high customer attachment & retention. While increasing brand value, it is also important to ensure that customers & partners can collaborate on improvements to the various applications hosted on top of the platform.

Culture – Business value fueled by analytics is only possible if the entire organization operates on an agile basis in order to collaborate across the value chain. Cross functional teams across new product development, customer acquisition & retention, IT Ops, legal & compliance must collaborate in short work cycles to close the traditional business & IT innovation gap. Methodologies like DevOps, whose chief goal is to close the long-standing gap between the engineers who develop and test IT capability and the organizations that are responsible for deploying and maintaining IT operations, must be adopted. Using traditional app dev methodologies, it can take months to design, test and deploy software. No business today has that much time, especially in the age of IT consumerization and end users accustomed to smart phone apps that are updated daily. The focus now is on rapidly developing business applications to stay ahead of competitors that can better harness Big Data’s amazing business capabilities.

Summary- 

Enterprise wide business analytics approaches designed around the four key prongs (Structure, Culture, Technology & Platforms) will create immense operational efficiency, better business models, increased relevance and ultimately drive revenues. These will separate the visionaries and leaders from the laggards in the years to come.

Capital Markets Pivots to Big Data in 2016

Previous posts in this blog have discussed how Capital markets firms must create new business models and offer superior client relationships based on their vast data assets. Firms that can infuse a data driven culture in both existing & new areas of operation will enjoy superior returns and raise the bar for the rest of the industry in 2016 & beyond. 

Capital Markets are the face of the financial industry to the general public and generate a large percentage of the GDP of the world economy. Despite all the negative press they have garnered since the financial crisis of 2008, capital markets perform an important social function in that they contribute heavily to economic growth and are the primary vehicle for household savings. Firms in this space allow corporations to raise capital using the underwriting process. However, it is not just corporations that benefit from such money raising activity – municipal, local and national governments do the same as well; just that the overall mechanism differs. While business enterprises issue both equity and bonds, governments typically issue bonds. According to the Boston Consulting Group (BCG), the industry will grow to annual revenues of $661 billion in 2016 from $593 billion in 2015 – a healthy increase of nearly 12%. On the buy side, the asset base (AuM – Assets under Management) is expected to reach around $100 trillion by 2020, up from $74 trillion in 2014.[1]

Within large banks, the Capital Markets group and the Investment Banking group perform very different functions. Capital Markets (CM) is the face of the bank to the street from a trading perspective. The CM group engineers custom derivative trades that hedge exposure for their clients (typically Hedge Funds, Mutual Funds, Corporations, Governments, high net worth individuals and Trusts) as well as for their own treasury group. They may also do proprietary trading on the bank's behalf for a profit – although it is this type of trading that the Volcker Rule is seeking to eliminate.

If a Bank uses dark liquidity pools (DLP) they funnel their Brokerage trades through the CM group to avoid the fees associated with executing an exchange trade on the street.  Such activities can also be used to hide exchange based trading activity from the Street.  In the past, Banks used to make their substantial revenues by profiting from their proprietary trading or by collecting fees for executing trades on behalf of their treasury group or other clients.

Banking, and within it capital markets, continues to generate insane amounts of data. The producers range from news providers to electronic trading participants to stock exchanges, which are increasingly looking to monetize data. And it is not just the banks; regulatory authorities like FINRA in the US are processing peak volumes of 40-75 billion market events a day http://www.vamsitalkstech.com/?p=1157 [2]. In addition to data volumes, Capital Markets has always possessed a data variety challenge as well. Firms have tons of structured data around traditional banking data, market data, reference data & other economic data. You can then factor in semi-structured data around corporate filings, news, retailer data & other gauges of economic activity. With the additional creation of data from social media, multimedia etc., firms are presented with significant technology challenges and business opportunities.

Within larger financial supermarkets, the capital markets group typically leads the way in adopting cutting edge technology and in high tech spend. Most of the compute intensive problems are generated out of either this group or the enterprise risk group. These groups own the exchange facing order management systems, the trade booking systems, the pricing libraries for the products the bank trades as well as the tactical systems that are used to manage their market and credit risks, customer profitability, compliance and collateral systems. They typically hold about one quarter of a Bank's total IT budget. Capital Markets thus has the largest number of use cases for risk and compliance.

Players across the value chain – the buy side, the sell side, the intermediaries (stock exchanges & custodians) & technology firms such as market data providers – are all increasingly looking at leveraging these new data sets to unlock the value of data for business purposes beyond operational efficiency.

So what are the different categories of applications that are clearly leveraging Big Data in production deployments?

                      Illustration – How are Capital Markets leveraging Big Data in 2016

I have catalogued the major ones below based on my work with the majors in the spectrum over the last year.

  1. Client Profitability Analysis or Customer 360 view:  With the passing of the Volcker Rule, the large firms are now moving over to a model based on flow based trading rather than relying on prop trading. Thus it is critical for capital market firms to better understand their clients (be they institutional or otherwise) from a 360-degree perspective so they can be marketed to as a single entity across different channels—a key to optimizing profits with cross selling in an increasingly competitive landscape. The 360 view encompasses defensive areas like Risk & Compliance but also the ability to get a single view of profitability by customer across all of their trading desks, the Investment Bank and Commercial Lending.
  2. Regulatory Reporting – Dodd Frank/Volcker Rule Reporting: Banks have begun to leverage data lakes to capture every trade intraday and end of day across its lifecycle. They then validate that no proprietary trading is occurring on the bank's behalf.
  3. CCAR & DFast Reporting: Big Data can substantially improve the quality of  raw data collected across multiple silos. This improves the understanding of a Bank’s stress test numbers.
  4. Timely and accurate risk management: Running Historical VaR, statistical VaR (Value at Risk) or both to run the business and to compare with the enterprise risk VaR numbers (a minimal historical VaR sketch follows this list).
  5. Timely and accurate liquidity management: Looking at tiered collateral and its liquidity profile on an intraday basis to manage the unit’s liquidity. Firms also need to consider credit and market stress scenarios and assess the liquidity impact of those scenarios.
  6. Timely and accurate intraday Credit Risk Management: Understanding when & if a deal breaches a tenor bucketed limit before it is booked. For FX trading this means that you have about 9 milliseconds to determine if you can do the trade. This is a great place to use in-memory technology like Spark/Storm alongside a Hadoop based platform. These use cases are key to increasing the capital that can be invested in the business; to do this, desks need to convince upper management that they are managing their risks very tightly.
  7. Timely and accurate intraday Market Risk Management: Leveraging Big Data for market risk computations ensures that Banks have a real time view of any breaches of their tenor bucketed market limits.
  8. Reducing Market Data costs: Market Data providers like Bloomberg, Thomson Reuters and other smaller agencies typically charge a fee each time data is accessed. At a large firm, both the front office and Risk access this data on an ad-hoc, fairly uncontrolled basis. A popular way to save on cost is to negotiate the rights to access the data once and read it many times. The key is that you need a place to put it, and that is the Data Lake.
  9. Trade Strategy Development & Backtesting: Big Data is being leveraged to constantly backtest trading strategies and algorithms on large volumes of historical and real time data. The ability to scale up computations as well as to incorporate real time streams is key to this effort.
  10. Sentiment Based Trading: Today, large scale trading groups and desks within them have begun monitoring economic, political news and social media data to identify arbitrage opportunities. For instance, looking for correlations between news in the middle east and using that to gauge the price of crude oil in the futures space.  Another example is using weather patterns to gauge demand for electricity in specific regional & local markets with a view to commodities trading. The realtime nature of these sources is information gold. Big Data provides the ability to bring all these sources into one central location and use the gleaned intelligence to drive various downstream activities in trading & private banking.
  11. Market & Trade Surveillance: Surveillance is an umbrella term that usually refers to monitoring for a wide array of trading practices that serve to distort securities prices, thus enabling market manipulators to illicitly profit at the expense of other participants by creating information asymmetry. Market surveillance is generally carried out by Exchanges and Self Regulating Organizations (SRO) in the US – all of which have dedicated surveillance departments set up for this purpose. However, capital markets players on the buy and sell side also need to conduct extensive trade surveillance to report up internally. Pursuant to this goal, the exchanges & the SROs monitor transaction data, including orders and executed trades, & perform deep analysis to look for any kind of abuse and fraud.
  12. Buy Side (e.g. Wealth Management) – A huge list of use cases I have catalogued here – https://dzone.com/articles/the-state-of-global-wealth-management-part-2-big-d
  13. AML Compliance –  Covered in various blogs and webinars.
    http://www.vamsitalkstech.com/?s=AML
    https://www.boozallen.com/insights/2016/04/webinar-anti-money-laudering – 
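
To make point 4 a little more tangible, here is a minimal sketch of a one-day historical VaR calculation. The P&L series is simulated for illustration; in practice it would be built from revalued positions over a historical lookback window, and the real computation would run across many desks, books and tenors:

```python
# A toy one-day 99% historical VaR calculation on a simulated P&L series.
import numpy as np

rng = np.random.default_rng(0)
daily_pnl = rng.normal(loc=0.0, scale=1_000_000, size=500)  # hypothetical daily P&L, in USD

confidence = 0.99
# Historical VaR: the loss threshold exceeded on only (1 - confidence) of historical days
var_99 = -np.percentile(daily_pnl, (1 - confidence) * 100)
print(f"1-day 99% historical VaR: ${var_99:,.0f}")
```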

The Final Word

A few tactical recommendations to industry CIOs:

  • Firstly, capital markets players should look to create centralized trade repositories for Operations, Traders and Risk Management. This would allow consolidation of systems and a reduction in costs by providing a single platform to replace operations systems, compliance systems and desk centric risk systems. It would eliminate numerous redundant data & application silos, simplify operations, reduce redundant quant work and improve the understanding of risk.
  • Secondly, it is important to put in place a model to create sources of funding for discretionary projects that can leverage Big Data.
  • Third, Capital Markets groups typically have to fund their portion of AML, Dodd Frank, Volcker Rule, Trade Compliance, Enterprise Market Risk and Traded Credit Risk projects. These are all mandatory spends. Only after this do they typically get to tackle discretionary business projects, e.g. funding their liquidity risk, trade booking and tactical risk initiatives. These defensive efforts always get the short end of the stick and are not to be neglected while planning out new initiatives.
  • Finally, an area in which a lot of current players are lacking is the ability to associate clients using a Legal Entity Identifier (LEI). Using a Big Data platform to assign logical and physical entity IDs to every human and business the bank interacts with can have salubrious benefits. Big Data can ensure that firms can do this without having to redo all of their customer on-boarding systems. This is key to achieving customer 360 views, AML and FATCA compliance as well as accurate credit risk reporting.

It is no longer enough for CIOs in this space to think of tactical Big Data projects; they must be thinking about creating platforms, and ecosystems around those platforms, to be able to do a variety of pathbreaking activities that generate a much higher rate of return.

References

[1] “The State of Capital Markets in 2016” – BCG Perspectives