Why Data Garbage-In means Analytics Garbage-Out..

This is the third in a series of blogs on Data Science that I am jointly authoring with Maleeha Qazi (https://www.linkedin.com/in/maleehaqazi/). We have previously covered some of the inefficiencies that result from a siloed data science process @ http://www.vamsitalkstech.com/?p=5046 & the ideal way Data Scientists would like their models deployed for maximal benefit and use – as a Service @ http://www.vamsitalkstech.com/?p=5321. As the name of this third blog post suggests, the success of a data science initiative depends on data. If the data going into the process is “bad”, then the results cannot be relied upon. Our goal is also to suggest some practical steps that enterprises can take from a data quality & governance process standpoint.

“However, under the strong influence of the current AI hype, people try to plug in data that’s dirty & full of gaps, that spans years while changing in format and meaning, that’s not understood yet, that’s structured in ways that don’t make sense, and expect those tools to magically handle it.” – Monica Rogati (Data Science Advisor and ex-VP, Jawbone – 2017)

Image Credit – The Daily Omnivore

Introduction

Different posts in this blog have discussed Data Science and other analytical approaches in some depth. What is apparent is that whatever the kind of analytics – descriptive, predictive, or prescriptive – the availability of a wide range of quality data sources is key. However, along with the volume and variety of data, the veracity – the truth in the data – is just as important. This blog post discusses the main factors that determine the quality of data from a Data Scientist’s perspective.

The Top Issues of Data Quality

As highlighted in the above illustration, the top quality issues that data assets typically face are the following:

  1. Incomplete Data: The data provided for analysis should span the entire cross-section of known data about how the organization views its customers and products. This would include data generated from the various applications that belong to the business, and external data bought from vendors to enrich the knowledge base. The completeness criterion measures whether all of the information about the entities under consideration is available and usable.
  2. Inconsistent & Inaccurate Data: Consistency measures whether data values conflict with one another and therefore must be fixed, and whether all data elements conform to specific, uniform formats and are stored in a consistent manner. Inaccurate data has duplicate, missing or erroneous values, and does not reflect an accurate picture of the state of the business at the point in time it was pulled.
  3. Lack of Data Lineage & Auditability: The data framework needs to support auditability, i.e. provide an audit trail of how the data values were derived from source to analysis point, including the various transformations performed on them to arrive at the data set being considered for analysis.
  4. Lack of Contextuality: Data needs to be accompanied by meaningful metadata – data that describes the concepts within the dataset.
  5. Temporal Inconsistency: This measures whether the data is consistent and meaningful given the time at which it was recorded.
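To make the first two dimensions above concrete, here is a minimal sketch of how completeness and consistency checks could be automated with pandas. The column names (customer_id, date_of_birth, account_balance) and the input file are illustrative assumptions about a hypothetical customer extract, not a prescribed framework.

```python
# A minimal sketch of automated completeness and consistency checks on a
# hypothetical customer extract. Column names and the file are assumptions.
import pandas as pd

def completeness_report(df: pd.DataFrame) -> pd.Series:
    """Fraction of non-null values per column, lowest (most incomplete) first."""
    return df.notna().mean().sort_values()

def consistency_checks(df: pd.DataFrame) -> dict:
    """A few simple consistency/accuracy checks."""
    return {
        "duplicate_customer_ids": int(df["customer_id"].duplicated().sum()),
        "unparseable_dates_of_birth": int(pd.to_datetime(df["date_of_birth"], errors="coerce").isna().sum()),
        "negative_balances": int((df["account_balance"] < 0).sum()),
    }

if __name__ == "__main__":
    df = pd.read_csv("customers.csv")   # hypothetical source extract
    print(completeness_report(df))
    print(consistency_checks(df))
```

In practice these kinds of checks would be folded into a governance process and run on every refresh rather than ad hoc.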

What Business Challenges does Poor Data Quality Cause…

Image Credit – DataMartist

Poor data quality causes the following business challenges in enterprises:

  1. Customer dissatisfaction: Across industries like Banking, Insurance, Telecom & Manufacturing, the ability to get a unified view of the customer & their journey is at the heart of the enterprise’s ability to promote relevant offerings & detect customer dissatisfaction. Currently, most industry players are woeful at putting together this comprehensive Single View of their Customers (SVC). Due to operational silos, each department possesses its own siloed & limited view of the customer across multiple channels. These views are typically inconsistent, lack synchronization with other departments, & miss a large number of potential cross-sell and upsell opportunities. This is a data quality challenge at its core.
  2. Lost revenue: The Customer Journey problem has been an age-old issue which has gotten exponentially more complicated over the last five years as the staggering rise of mobile technology and the Internet of Things (IoT) have vastly increased the number of enterprise touch points that customers are exposed to in terms of being able to discover and purchase new products/services. In an OmniChannel world, an increasing number of transactions are being conducted online. In verticals like the Retail industry and Banking & Insurance industries, the number of online transactions conducted approaches an average of 40%. Adding to the problem, more and more consumers are posting product reviews and feedback online. Companies thus need to react in real-time to piece together the source of consumer dissatisfaction.
  3. Time and cost in data reconciliation: Every large enterprise nowadays runs expensive data re-engineering projects due to its data quality challenges. These are an inevitable first step in other digital projects and cause huge cost and time overheads.
  4. Increased time to market for key projects: Poor data quality causes poor data agility, which increases the time to market for key projects.
  5. Poor data means suboptimal analytics: Poor data quality causes the analytics done using it to be suboptimal – algorithms will end up giving wrong conclusions because the input provided to them is incorrect at best & inconsistent at worst.

Why is Data Quality a Challenge in Enterprises

Image Credit – DataMartist

The top reasons why data quality has been a huge challenge in the industry are:

  1. Prioritization conflicts: For most enterprises, the focus of their business is the product(s)/service(s) being provided; book-keeping is a mandatory but secondary concern. And since keeping the business running is the most important priority, keeping the books accurate for financial matters is the only data aspect that gets the technical attention it deserves. Other data aspects are usually ignored.
  2. Organic growth of systems: Most enterprises have gone through a series of book-keeping methods and applications, most of which have no compatibility with one another. Warehousing data from various systems as they are deprecated, merging in data streams from new systems, and fixing data issues as these processes happen are not prioritized until something on the business end fundamentally breaks. Band-aids are usually cheaper and easier to apply than thinking ahead to what the business will need in the future, building it, and back-filling it with all the previous systems’ data in an organized fashion.
  3. Lack of time/energy/resources: Nobody has infinite time, energy, or resources. Doing the work of making all the systems an enterprise chooses to use at any point in time talk to one another, share information between applications, and keep a single consistent view of the business is a near-impossible task. Many well-trained resources, and a great deal of time & energy, are required to make sure this can be set up and successfully orchestrated on a daily basis. But how much is a business willing to pay for this? Most do not see short-term ROI and hence lose sight of the long-term problems that could be caused by ignoring the quality of data collected.
  4. What do you want to optimize?: There are only so many balls an enterprise can have up in the air to focus on without dropping one, and prioritizing those can be a challenge. Do you want to optimize the performance of the applications that need to use, gather and update the data, OR do you want to make sure data accuracy/consistency (one consistent view of the data for all applications in near real-time) is maintained regardless? One will have to suffer for the other.

How to Tackle Data Quality

Image Credit – DataMartist


With the advent of Big Data and the need to derive value from ever-increasing volumes and variety of data, data quality becomes an important strategic capability. While every enterprise is different, certain common themes emerge as we consider the quality of data:

  1. The sheer number of transaction systems found in a large enterprise causes multiple challenges across the data quality dimensions. Organizations need to have valid frameworks and governance models to ensure the data’s quality.
  2. Data quality has typically been thought of as just data cleansing and fixing missing fields. However, it is very important to also address the originating business processes that cause this data to diverge into multiple versions of the truth. For example, centralize customer onboarding in one system across channels rather than having every system do its own onboarding.
  3. It is clear from the above that data quality and its management is not a one-time or siloed exercise. As part of a structured governance process, it is very important to adopt data profiling and other capabilities to ensure high-quality data.
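As a hedged illustration of what lightweight, recurring data profiling can look like (as opposed to a one-off cleanup), the sketch below computes a per-column profile of a daily snapshot. The file name and the idea of running it on a schedule are assumptions; a real governance program would typically use a dedicated profiling tool on top of checks like these.

```python
# A minimal sketch of recurring data profiling run as part of governance.
# The snapshot path is a placeholder assumption.
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column profile: type, null percentage, and distinct value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_pct": (df.isna().mean() * 100).round(2),
        "distinct": df.nunique(),
    })

if __name__ == "__main__":
    snapshot = pd.read_parquet("daily_customer_snapshot.parquet")  # placeholder
    print(profile(snapshot))
```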

Conclusion

Enterprises need to define both quantitative and qualitative metrics to ensure that data quality goals are captured across the organization. Once this is done, an iterative process needs to be followed to ensure that a set of capabilities dealing with data governance, auditing, profiling, and cleansing is applied to continuously ensure that data is brought up to, and kept at, a high standard. Doing so can have salubrious effects on customer satisfaction, product growth, and regulatory compliance.

Data Science in the Cloud A.k.a. Models as a Service (MaaS)..

This is the second in a series of blogs on Data Science that I am jointly authoring with Maleeha Qazi (https://www.linkedin.com/in/maleehaqazi/). We have previously covered some of the inefficiencies that result from a siloed data science process @ http://www.vamsitalkstech.com/?p=5046. All of the actors in the data science space can agree that becoming responsive to business demands is the overarching goal of the process. In this second blog post, we will discuss Model as a Service (MaaS), an approach to ensuring that models and their insights can be leveraged throughout a large organization.

Image Credit – Logistics Industry Blog

Introduction

Hardware as a Service (HaaS), Software as a Service (SaaS), Database as a Service (DBaaS), Infrastructure as a Service (IaaS), Platform as a service (PaaS), Network as a Service (NaaS), Backend as a service (BaaS), Storage as a Service (STaaS). While every IT delivery model is going the way of the cloud, does Data Science lag behind in this movement?  In such an environment, what do Data Scientists dream of to ensure that their models are constantly being trained on high quality and high volume production grade data?… Models as a Service (MaaS).

The Predictive Analytics workflow…

The Predictive Analytics workflow always starts with a business problem in mind. For example: “A marketing project to detect which customers are likely to buy new products or services in the next six months based on their historical & real time product usage patterns” or “Detect real-time fraud in credit card transactions.”

Illustration – The Predictive Analysis Workflow in a financial services setting

In use cases like these, the goal of the data science process is to be able to segment & filter customers by corralling them into categories that enable easy ranking. Once this is done, the business can set up easy and intuitive visualizations to present the results.

A lot of times, business groups have a hard time explaining what they would like to see – both in terms of input data and output format. In such cases, a prototype makes things easier from a requirements-gathering standpoint. Once the problem is defined, the data scientist/modeler identifies the raw data sources (both internal and external) which are pertinent to the business challenge. They spend a lot of time in the process of collating the data (from a variety of sources like Oracle/SQL Server, DB2, Mainframes, Greenplum, Excel sheets, external datasets, etc.). The cleanup process involves dealing with missing values, corrupted data elements, formatting fields so that they are homogeneous, etc.

This data wrangling phase involves writing code to join various data elements so that a complete dataset is gathered in the Data Lake from a raw features standpoint, at the correct granularity for the problem at hand. If more data is obtained as the development cycle is underway, the Data Science team has to go back & redo the process to incorporate the new data feeds. The modeling phase is where sophisticated algorithms come into play. Feature engineering takes in business concepts & raw data features and creates predictive features from them. The Data Scientist takes the raw & engineered features and creates a model by applying various algorithms & testing to find the best one. Once the model has been refined and tested for accuracy and performance, it is ideally deployed as a service.
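As a rough sketch of that modeling step – trying several algorithms on raw and engineered features and keeping the best performer – consider the following. The dataset, feature names and target column are assumptions made for illustration; they loosely map to the propensity-to-buy example earlier in this post.

```python
# A highly simplified sketch of the modeling phase: several algorithms are
# tried and the best cross-validated performer is kept. Feature names, the
# target column and the input file are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

df = pd.read_csv("customer_features.csv")            # wrangled dataset from the Data Lake
df["tenure_years"] = df["tenure_months"] / 12.0      # example of an engineered feature
X = df[["tenure_years", "monthly_usage", "support_calls"]]
y = df["purchased_new_product"]                      # label for the 6-month propensity question

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200),
    "gradient_boosting": GradientBoostingClassifier(),
}

# Score each candidate with 5-fold cross-validated AUC and keep the best.
scores = {name: cross_val_score(m, X, y, cv=5, scoring="roc_auc").mean()
          for name, m in candidates.items()}
best_name = max(scores, key=scores.get)
print(scores, "-> best:", best_name)
```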

Challenges with the existing approach

The challenges with the above approach are:

  1. Business Scalability – Predictive analytics as highlighted above resembles a typical line-of-business project or initiative. The benefits of the learnings from localized application initiatives are largely lost to the larger organization if multiple applications and business initiatives are not allowed to access the models built.
  2. Lack of Data Richness – The models created by individual teams are not always enriched by cross organizational data constantly being generated by different business applications. In addition to that, the vast majority of industrial applications do not leverage all possible kinds of unstructured data & 3rd party data in their business applications. Enabling the models to be exposed to a range of data (both internal and external) can only enrich the insights generated.
  3. Cross Application Applicability – This challenge deals with how business intelligence insights from disparate applications (which leverage different models) can be used to enhance business areas they weren’t originally created for. This could allow for customer-centered insights in real-time. For example, consider a customer sales application and a call center application. Can cross-application insights be used to understand that customers are calling into the call center because it has been hard to use the website to order products?
  4. Data Monetization – What is critical to the ability to create new commercial business models is agile analytics around existing and new data sources. If enterprise businesses are increasingly being built around data assets, then it naturally follows that data as a commodity can be traded or re-imagined to create new revenue streams. As an example, pioneering payment providers now offer retailers analytical services to help them understand which products perform best and how to improve the micro-targeting of customers. Thus, data is the critical prong of any digital initiative. This has led to efforts to monetize data by creating platforms that support ecosystems of capabilities. To vastly oversimplify this discussion, the ability to monetize data needs two prongs – to centralize it in the first place, and then to perform strong predictive modeling at large scale, where systems need to constantly learn and optimize their interactions, responsiveness & services based on client needs & preferences. Thus, centralizing models offers more benefits than the typical enterprise can imagine.

Enter Model As A Service…

MaaS takes in business variables (often hundreds or thousands of inputs) and provides as output model results upon which business decisions can be made, & visualizations that augment decision support systems. As depicted in the above illustration, once different predictive models are built, tested and validated, they are ready to be used in real world production deployments. MaaS is essentially a way of deploying these advanced models as part of software applications, where they are offered as a software subscription.

MaaS also enables a cleaner separation of the application development process and the Data Science workflow.
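A minimal sketch of what such a service can look like is shown below: a serialized model hosted behind an HTTP endpoint that any application can call. The model file, route, input fields and port are all assumptions for illustration rather than a reference implementation.

```python
# A minimal sketch of a model exposed "as a service" over HTTP.
# Model file, route, expected fields and port are illustrative assumptions.
import joblib
import pandas as pd
from flask import Flask, request, jsonify

app = Flask(__name__)
model = joblib.load("propensity_model_v1.joblib")   # trained offline by the Data Science team

@app.route("/models/propensity/v1/score", methods=["POST"])
def score():
    payload = request.get_json()                     # e.g. {"tenure_years": 3.5, "monthly_usage": 42, ...}
    features = pd.DataFrame([payload])
    probability = float(model.predict_proba(features)[0, 1])
    return jsonify({"model_version": "v1", "score": probability})

if __name__ == "__main__":
    app.run(port=8080)
```

A consuming application – a marketing campaign tool, a call-center dashboard – would simply POST a JSON document of business variables and receive a score back; keeping the model version in the URL also lets models be versioned independently of the applications calling them.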

Business Benefits from a MaaS approach

  1. MaaS exposes models to different lines of business, thus increasing their usefulness and opening them up to feedback that helps improve their accuracy.
  2. MaaS opens the models to any application that wants to take advantage of them. This brings Data Scientists into contact with business teams that are much broader than those they would normally have access to.
  3. The provision of dashboards and business intelligence across the organization becomes much easier than with a siloed approach.
  4. MaaS as an approach fundamentally encourages an agile approach to managing data assets and also to rationalizing them. For any MaaS initiative to succeed, timely access needs to be provided to potentially hundreds of data sources in an organization. MaaS encourages a move to viewing data as a reusable asset across the organization.

Technical advantages of the MaaS approach

  • Separation of concerns: software & data feeds maintained by IT, models maintained by Data Scientists.
  • Versioning of models can be separated from versioning of the system(s) using the models.
  • The same models can be utilized by multiple software packages for consistency.
  • Consistent handling of data sources: e.g. which “master” source provides what types of data for all the models, so that a customer looks the same regardless of the model acting on the data for insights.
  • A single point for putting a “watch” on the performance of a model.
  • Controlled usage of models.
  • MaaS ensures that the analytic process can be automated from a deployment standpoint.

Conclusion

MaaS can enable organizations to move their analytic practices and capabilities to the next level. It enables the best of both worlds – the ability to centralize the data science capabilities across an organization while keeping customer data securely inside the organization. Done right, it can enable the democratization of data science insights across a large enterprise.

What Your Data Science Team Needs From IT..

“Data matures like wine, applications like fish.” – James Governor, Principal Analyst & Founder of RedMonk, circa 2007

I would like to begin a series of posts on Data Science jointly authored with my friend, ex-colleague, & collaborator, Maleeha Qazi – Data Scientist (https://www.linkedin.com/in/maleehaqazi/). In these posts, we intend to bring to light several technology themes around the industrial use of Data Science and Deep Learning – Industrial Applications, Big Data, Cyber Security, Cognitive Applications, Business Process Management, and Cloud Computing. Our goal for this first post is to discuss the typical issues that bedevil every Data Science initiative at the beginning – namely, the top technical and cultural concerns to communicate to the IT Department every time a new project is begun.

Introduction

With Data Science emerging as a key enabler of digital, customer-focused applications, renewed focus is being placed on how the lifecycle of these newfangled applications happens alongside traditional IT development. This blogpost aims to highlight some of the key concerns involved when Data Science groups work with IT departments. Currently there is no “one size fits all” model for how advanced models are developed and deployed so that they can be accessed and used at scale by customers. It is our wager that almost every large enterprise working on these projects encounters these issues. We wanted to share our experience with the enterprise community over a series of blog posts.

It is clear that Data Science teams, product teams and IT need to collaborate to create business applications that learn from customer needs.

So what are the top asks that Data Science has for their IT groups? There are at least nine important focus areas:

#1 Understanding of the business challenge and agreeing on a common vocabulary 

It is a generally accepted fact that most IT/Data Science interactions are focused on the technology portion, which includes some of the following elements: the data sources within the organization, acquisition of and access to external data sources, the availability of tools & infrastructure to begin supporting the data science development process, cloud or on-prem, data ingestion engines (e.g. Kafka, Flume, Sqoop) to ingest and process the data, etc. While this is certainly part of the process, a distinct anti-pattern has begun to emerge when this interaction is driven by technology alone. The Data Science team is involved in creating models that typically reflect customer needs and drive business value for an organization’s customers, partners, regulators & employees. In that rather important context, technology at its core is just an engine and does not exist in a vacuum. The most vibrant enterprises understand this ground reality and always ensure that business needs drive both Data Scientists & IT, and not the other way around. It is thus highly important for both the Data Science team and the IT team to agree on the business challenge at hand to ensure that their interactions (long and short term) are being driven with business & competitive outcomes in mind. Examples of such goals are a common organization-wide business language (so that definitions agree semantically) across products, customers, logistics, supply chains & business domains. The shared emphasis of both teams should be on overall goals such as increased customer profitability, enhanced customer segmentation, customer service productivity, etc. Setting this tone upfront will not only ensure that outcomes for both teams are aligned but will also ensure that critical gaps in knowledge and capabilities are filled. One of the approaches that is working well is increased cross-pollination across both teams, collapsing artificial organizational barriers by adopting DevOps & ensuring that Data Science teams have a “slim IT” presence (e.g. an embedded data engineer and datacenter person) to rapidly fill in gaps in IT’s business knowledge or capability.

#2 IT needs to help Data Scientists acquire a deep understanding of the overall Data Architecture

Once business requirements have been identified, Data Scientists get right to work understanding the different data sources that will comprise the inputs to their models. In large enterprises, it is not inconceivable to find that there are many varied data sources from which data needs to be sourced. For instance, in Banking there are a range of Book of Record Transaction (BORT) systems from which data needs to be extracted. It is also key to supplement this data with external data sets. Models are only as good as the data they are given to work with. Garbage In, Garbage Out (GIGO) is the moniker given to the phenomenon where bad data ensures that models perform poorly. A lot of times, business groups have a hard time explaining what they would like to see – both in terms of data and visualization. In such cases, a prototype makes things easier from a requirements gathering standpoint. Once the problem is defined, the data scientist/modeler identifies the raw data sources (both internal and external) which are needed for the execution of the business challenge. They spend a lot of time in the process of collating the data (from Oracle/SQL Server, DB2, Mainframes, Greenplum, Excel sheets, external datasets, etc.). The cleanup/data-wrangling process includes fixing and standardizing missing value representations, identifying potentially corrupted data elements, formatting fields that indicate time and date in a consistent manner, etc.
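A minimal sketch of those last wrangling steps is shown below, assuming a hypothetical extract from a BORT system; the sentinel values, column names and file name are illustrative, not a standard.

```python
# A minimal sketch of cleanup steps: standardizing missing-value sentinels,
# normalizing date fields and flagging suspect numeric values.
# Column names and sentinel values are illustrative assumptions.
import pandas as pd

RAW_MISSING = ["", "N/A", "NULL", "?", "-999"]       # sentinels seen in source systems

df = pd.read_csv("bort_extract.csv", na_values=RAW_MISSING)

# Normalize date columns to a single representation regardless of the source system.
for col in ["account_open_date", "last_transaction_date"]:
    df[col] = pd.to_datetime(df[col], errors="coerce")

# Flag rows whose balance field cannot be parsed as a number, for later review.
df["suspect_balance"] = pd.to_numeric(df["balance"], errors="coerce").isna()
```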

#3 Infrastructure & IT Self Service Across Environments, Platforms and Tools 

This one is huge. The traditional IT model of hardware acquisition and vetting is typically a drawn-out process. Even with the public cloud, onerous security controls are sometimes added to infrastructure, which delays the Data Science team’s ability to develop their models in an agile manner. The dreaded term Shadow IT (where business & data science teams go around the IT team to procure compute and storage on the public cloud) is not just an issue with infrastructure software but is slowly creeping up to business intelligence and advanced analytics apps. The delays associated with provisioning legacy data silos, combined with using tools that are neither intuitive nor able to scale to deal with the increasing data deluge, are making timely business analysis almost impossible to perform. Insights delivered too late are not very valuable. Data Scientists dearly desire that the environments they need for development and testing are made available as soon as possible, and ideally via a self-service user interface. This calls for IT investments in Cloud computing platforms that enable agility and speedy provisioning of dev/test environments across compute, network and storage.

#4 – Collaboration with IT around the DS development lifecycle

Organizations typically have well-established development methodologies and processes. Currently, most data science development and traditional application development happen in two distinct tracks. Software development typically follows an Agile/DevOps process (a combination of Scrum/XP). The development lifecycle is divided into several stages, with each producing a working deliverable at the end. The deliverables are incrementally updated to arrive at an acceptable product which is then deployed for customer use. In this model, team members typically follow defined roles.

The Data Science development cycle is different. Data scientists/modelers are given a certain business problem to solve. They proceed to find the appropriate data they need, pull it into Hadoop or a Data Warehouse, wrangle it, try various algorithms to create the best possible models, test the models, and ensure that they perform well for the problem at hand. If they get more data during the process, they will go back and repeat the whole cycle. The issue is that IT needs to collaborate with the Data Science team to first strategize and then help provision the different environments (dev, test, prod) that enable data scientists to do iterative model development. They then need to help the Data Science team deploy these models in the appropriate deployment architecture.

#5 – Help Improve the Data Science User Experience

Using traditional app dev methodologies, it can take months to design, test and deploy software – which is simply unsustainable. One of the chief goals of the DevOps model is to close the long-standing gap between the engineers who develop and test IT capability and the business requirements for such capabilities. Accordingly, data science teams need best practice recommendations on using IDEs that support iterative model development & debugging. It is important that these development tools support programming languages such as R and Python – the most common go-to languages for data science – to rapidly develop code. It is critical that the IT group partners with the Data Scientists to enable these capabilities both from a development and a deployment standpoint.

#6 – Model Deployment

The data wrangling phase involves writing code to join various data sets so that a single complete dataset can be created from a raw features standpoint. If more data is obtained as the development cycle is underway, the Data Science team has no option but to go back and redo the whole process. Once the raw features are gathered, feature engineering can begin to create predictive features from the raw data, taking into account business concepts. The modeling phase is where the choice of algorithms comes into play. A Data Scientist takes the raw & engineered features and creates models using the most appropriate algorithms for the task. After the models have been repeatedly tested for accuracy and performance, the best one is typically deployed for use. Once the models have been developed, it is critical to ensure that they can be deployed rapidly, run automatically, and be changed as per business requirements and performance. How and where these models will get deployed depends on the business case; ideally, they should be deployed as a service. Models as a Service (MaaS) is the Data Science counterpart to Software as a Service. MaaS takes in business variables (often hundreds of inputs) and provides as output business decisions/intelligence, measurements, and visualizations that augment decision support systems. IT help is needed to ensure that the models can scale as customer usage of these Digital Platforms increases.
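One hedged sketch of the hand-off this implies – the Data Science team packaging a refined model with enough metadata for IT to host it behind a service – might look like the following. The pipeline, file names and metadata fields are assumptions; the stand-in training arrays represent the output of the wrangling phase.

```python
# A minimal sketch of packaging a trained model for deployment: the fitted
# pipeline is serialized along with minimal metadata for the hosting service.
# File names, metadata fields and the stand-in training data are assumptions.
import json
import joblib
import numpy as np
from datetime import datetime, timezone
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Stand-ins for the wrangled training data produced earlier in the workflow.
X_train = np.random.rand(500, 3)
y_train = np.random.randint(0, 2, 500)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X_train, y_train)

# Serialize the model together with the metadata IT needs to version and host it.
joblib.dump(pipeline, "propensity_model_v2.joblib")
with open("propensity_model_v2.json", "w") as f:
    json.dump({
        "version": "v2",
        "trained_at": datetime.now(timezone.utc).isoformat(),
        "features": ["tenure_years", "monthly_usage", "support_calls"],
    }, f)
```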

#7 Model Governance and Management

There need to be appropriate checks put into place to allow for the monitoring and maintenance of the models once they are in production. Model versioning must be handled so that customers aren’t affected during a maintenance cycle – old models must still function while the new ones are being put into place. And by keeping a check on the performance of models in production, the IT team can tell when a model stops performing optimally and call on the Data Science team to investigate why.
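A minimal sketch of such a production check is given below: compare the model's recent performance against the baseline agreed at deployment time and raise an alert if it degrades. The metric, thresholds and the file of scored outcomes are all assumptions for illustration.

```python
# A minimal sketch of a scheduled model-performance check: recent AUC is
# compared against an agreed baseline. Thresholds and data source are assumptions.
import pandas as pd
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.82          # agreed at deployment time
ALERT_MARGIN = 0.05

scored = pd.read_csv("recent_scored_outcomes.csv")   # predictions joined with observed outcomes
current_auc = roc_auc_score(scored["actual_outcome"], scored["model_score"])

if current_auc < BASELINE_AUC - ALERT_MARGIN:
    print(f"ALERT: model AUC dropped to {current_auc:.3f}; notify the Data Science team")
else:
    print(f"Model healthy: AUC {current_auc:.3f}")
```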

#8 Security and Compliance  

How are security constraints around different environments managed? Though IT maintains control over the vast domain of tools and environments in any organization, the Data Science team must maintain control of the models. Any random person updating the models could lead to performance degradations. This separation of concerns is akin to DB security over schemas/tables/columns – only certain individuals should be granted access to perform certain operations for the most optimal results.

#9 – Delivering Results to Business Users –

Once the model has been deployed the results need to be made available to business users. Depending on the application, model results might need to be served up in near-real-time, every day/week/month/year, ad-hoc on demand, or any other time frame in-between. Organizations need to deal with providing appropriate tools (e.g. apps, sandboxes, etc.) to enable end users to explore the results of the analysis, and to perform intelligent visualization of the data.  Visualizations include trend analysis over time, KPIs, list of interesting customers/accounts, etc.

Conclusion

Digital applications will continue to incorporate Data Science at an increasing scale. However, traditional IT Departments need to collaborate in the above specific areas to ensure that the algorithms developed for specific business issues are effective, forward looking and scalable.

A Digital Reference Architecture for the Industrial Internet Of Things (IIoT)..

A few weeks ago on the invitation of DZone Magazine, I jointly authored a Big Data Reference Architecture along with my friend & collaborator, Tim Spann (https://www.linkedin.com/in/timothyspann/). Tim & I distilled our experience working on IIoT projects to propose an industrial strength digital architecture. It brings together several technology themes – Big Data , Cyber Security, Cognitive Applications, Business Process Management and Data Science. Our goal is to discuss a best in class architecture that enables flexible deployment for new IIoT capabilities allowing enterprises to build digital applications. The abridged article was featured in the new DZone Guide to Big Data: Data Science & Advanced Analytics which can be downloaded at  https://dzone.com/guides/big-data-data-science-and-advanced-analytics

How the Internet Of Things (IoT) leads to the Digital Mesh..

The Internet of Things (IoT) has become one of the four most hyped-up technology paradigms affecting the world of business, the other usual suspects being Big Data, AI/Machine Learning & Blockchain. Cisco predicts that the IoT will encompass about 25 billion connected things by 2020 and affect about $2 trillion of economic value globally across a diverse range of verticals. These devices are not just consumer-oriented devices such as smartphones and home monitoring systems, but also dedicated industrial objects such as sensors, actuators, engines etc.

The interesting angle to all this is the fact that autonomous devices are already beginning to communicate with one another using IP-based protocols. They largely exchange state & control information about various variables. With the growth of computational power on these devices, we are not far off from their sending over more granular and interesting streaming data – about their environment, performance and business operations – all of which will enable a higher degree of insightful analytics to be performed on the data. Gartner Research has termed this interconnected world, where decision making & manufacturing optimization can occur via IoT, the “Digital Mesh“.

The evolution of technological innovation in areas such as Big Data, Predictive Analytics and Cloud Computing now enables the integration and analysis of massive amounts of device data at scale while performing a range of analytics and business process workflows on the data.

Image Credit – Sparkling Logic

According to Gartner, the Digital Mesh will thus lead to an interconnected information deluge powered by the continuous data from these streams. These streams will encompass classical IoT endpoints (sensors, field devices, actuators etc) sending data in a variety of formats – text, audio, video & social data streams – along with new endpoints in areas as diverse as Industrial Automation, Remote Healthcare, Public Transportation, Connected Cars, Home Automation etc. These intelligent devices will increasingly begin communicating with their environments in a manner that will encourage collaboration in a range of business scenarios. The industrial cousin of IoT is the Industrial Internet of Things (IIoT).

Defining the Industrial Internet Of Things (IIoT)

The Industrial Internet of Things (IIoT) can be defined as an ecosystem of capabilities that interconnects machines, personnel and processes to optimize the industrial lifecycle. The foundational technologies that IIoT leverages are Smart Assets, Big Data, Realtime Analytics, Enterprise Automation and Cloud based services.

The primary industries impacted the most by the IIoT will include Industrial Manufacturing, the Utility industry, Energy, Automotive, Transportation, Telecom & Insurance.

According to Markets and Markets, the annual worldwide Industrial IoT market is projected to exceed $319 billion in 2020, which represents an 8% compound annual growth rate (CAGR). The top four segments are projected to be manufacturing, energy and utilities, auto & transportation and healthcare.[1]

Architectural Challenges for Industrial IoT versus Consumer IoT..

Consumer-based IoT applications generally receive the lion’s share of media attention. However, the ability of industrial devices (such as sensors) to send ever richer data about their operating environment and performance characteristics is driving a move to Digitization and Automation across a range of industrial manufacturing.

Thus, there are four distinct challenges that we need to account for in an Industrial IOT scenario as compared to Consumer IoT.

  1. The IIoT needs Robust Architectures that are able to handle millions of device telemetry messages per second. The architecture needs to take into account all kinds of devices operating in environments ranging from the highly constrained to the well-connected.
  2. IIoT also calls for the highest degrees of Infrastructure and Application reliability across the stack. For instance, a lost or dropped message in a healthcare or connected car scenario may mean life or death for a patient, or an accident.
  3. An ability to integrate seamlessly with existing Information Systems. Let’s be clear: these new-age IIoT architectures need to augment existing systems such as Manufacturing Execution Systems (MES) or Traffic Management Systems. In Manufacturing, MES systems continually improve the product lifecycle and perform better resource scheduling and utilization. This integration helps these systems leverage the digital intelligence and insights across (potentially) millions of devices across complex areas of operation.
  4. An ability to incorporate richer kinds of analytics than have been possible before, providing a greater degree of context. This ability to reason around context is what provides the ability to design new business models which cannot currently be imagined due to a lack of agility in the data and analytics space.

What will IIoT based Digital Applications look like..

Digital Applications are being designed for specific device endpoints across industries. While the underlying mechanisms and business models differ from industry to industry, all of these use predictive analytics based on a combination of real time data processing & data science algorithms. These techniques extract insights from streaming data to provide digital services on existing toolchains, provide value added customer service, predict device performance & failures, improve operational metrics etc.

Examples abound. For instance, a great example in manufacturing is the notion of a Digital Twin, which Gartner called out last year. A Digital Twin is a software personification of an intelligent device or system. It forms a bridge between the real world and the digital world. In the manufacturing industry, digital twins can be set up to function as proxies of Things like sensors and gauges, coordinate measuring machines, vision systems, and white light scanning. This data is sent to a cloud-based system where it is combined with historical data to better maintain the physical system.

The wealth of data being gathered on the shop floor will ensure that Digital Twins will be used to reduce costs and increase innovation. Thus, in global manufacturing, Data Science will soon make its way onto the shop floor to enable the collection of insights from these software proxies. We covered the phenomenon of Servitization in manufacturing in a previous blogpost.

In the Retail industry, detecting a customer’s location in realtime and combining that information with their historical buying patterns can drive real-time promotions and an ability to dynamically price retail goods.

Solution Requirements for an IIoT Architecture..

At a high level, the IIoT reference architecture should support six broad solution areas-

  1. Device Discovery – Discovering a range of devices (and their details)  on the Digital Mesh for an organization within and outside the firewall perimeter
  2. Performing Remote Lifecycle Configuration of these devices ranging from startup to modification to monitoring to shut down
  3. Performing Deep Security level introspection to ensure the patch levels etc are adequate
  4. Creating Business workflows on the Digital Mesh. We will do this by marrying these devices to enterprise information systems (EISs)
  5. Performing Business-oriented Predictive Analytics on these devices; this is critical to deriving business value from the device data
  6. On a futuristic basis, support optional integration with the Blockchain to support a distributed organizational ledger that can coordinate activity across all global areas that an enterprise operates in.

Building Blocks of the Architecture

Listed below are the foundational blocks of our reference architecture. Though the requirements will vary across industries, an organization can reasonably standardize on a number of foundational components as depicted below and then incrementally augment them as the interactions between different components increase based on business requirements.

Our reference architecture includes the following major building blocks –

  • Device Layer
  • Device Integration Layer
  • Data & Middleware Tier
  • Digital Application Layer

It also includes the following cross cutting concerns which span across the above layers –

  • Device and Data Security
  • Business Process Management
  • Service Management
  • UX Design
  • Data Governance – Provenance, Auditing, Logging

The next section provides a brief overview of the reference architecture’s components at a logical level.

A Big Data Reference Architecture for the Industrial Internet depicting multiple functional layers

Device Layer – 

The first requirement of IIoT implementations is to support connectivity from the Things themselves, i.e. the Device layer depicted at the bottom. The Device layer includes a whole range of sensors, actuators, smartphones, gateways, industrial equipment etc. The ability to connect with devices and edge devices like routers and smart gateways using a variety of protocols is key. Network protocols such as Ethernet, WiFi, and Cellular can all connect directly to the internet. Other protocols, such as Bluetooth, RFID, NFC, Zigbee et al, need a gateway device to connect. Devices can connect directly with the data ingest layer shown above, but it is preferred that they connect via a gateway which can perform a range of edge processing.

This is important from a business standpoint. For instance, in certain verticals like healthcare and financial services, there exist stringent regulations that govern when certain identifying data elements (e.g. video feeds) can leave the premises of a hospital or bank etc. A gateway can not only perform intelligent edge processing but also connect thousands of device endpoints and facilitate bidirectional communication with the core IIoT architecture.

The ideal tool for these constantly evolving devices, metadata, protocols, data formats and types is Apache NiFi.  These agents will send the data to an Apache NiFi gateway or directly into an enterprise Apache NiFi cluster in the cloud or on-premise.

Apache NiFi Eases Dataflow Management & Accelerates Time to Analytics In Banking (2/3)..

A subproject of Apache NiFi, MiNiFi provides a complementary data collection approach that supplements the core tenets of NiFi in dataflow management. Due to its small footprint and low resource consumption, it is well suited to handle dataflow from sensors and other IoT devices. It provides central management of agents while providing full chain-of-custody information on the flows themselves.

For remote locations and more powerful devices like the Arrow BeagleBone Black Industrial and MyPi Industrial, it is very simple to run a tiny Java or C++ MiNiFi agent for secure connectivity needs.

The data sent by the device endpoints is then modeled into an appropriate domain representation based on the actual content of the messages. The data sent over also includes metadata around the message. A canonical model can optionally be developed (based on the actual business domain) which can support a variety of applications from a business intelligence standpoint.

 Apache NiFi supports the flexibility of ingesting changing file formats, sizes, data types and schemas. The devices themselves can send a range of feeds in different formats. E.g. XML now and based on upgraded capabilities – richer JSON tomorrow. NiFi supports ingesting any file type that the devices or the gateways may send.  Once the messages are received by Apache NiFi, they are enveloped in security with every touch to each flow file controlled, secured and audited.   NiFi flows also provide full data provenance for each file, packet or chunk of data sent through the system.  NiFi can work with specific schemas if there are special requirements for file types, but it can also work with unstructured or semi structured data just as well.  From a scalability standpoint, NiFi can ingest 50,000 streams concurrently on a zero-master shared nothing cluster that horizontally scales via easy administration with Apache Ambari.

Data and Middleware Layer – 

The IIoT Architecture recommends a Big Data platform with native message-oriented middleware (MOM) capabilities to ingest device mesh data. This layer will also process device data in whatever fashion – batch or real-time – the business needs demand.

Application protocols such as AMQP, MQTT, CoAP, WebSockets etc are all deployed by many device gateways to communicate application-specific messages. The reason for recommending a Big Data/NoSQL dominated data architecture for IIoT is quite simple. These systems provide Schema on Read, which is an innovative data handling technique. In this model, a format or schema is applied to data as it is accessed from a storage location, as opposed to doing the same while it is ingested. From an IIoT standpoint, one must deal not just with the data itself but also with metadata such as timestamps, device id, and other firmware data such as software version, device manufacture date etc. The data sent from the device layer will consist of time series data and individual measurements.
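To ground this, here is a minimal sketch of a device-side publisher using MQTT (one of the protocols listed above) that sends time-series readings along with the kind of metadata just described. The broker address, topic name, device id and readings are illustrative assumptions, and the paho-mqtt client is just one of several libraries that could be used.

```python
# A minimal sketch of a device-side telemetry publisher over MQTT, sending
# time-series readings plus metadata (device id, firmware version, timestamp).
# Broker address, topic and readings are illustrative assumptions.
import json
import time
import paho.mqtt.client as mqtt

client = mqtt.Client(client_id="sensor-0042")
client.connect("iiot-gateway.example.com", 1883)     # placeholder gateway/broker
client.loop_start()                                  # background network loop

while True:
    message = {
        "device_id": "sensor-0042",
        "firmware": "1.4.2",
        "timestamp": time.time(),
        "temperature_c": 71.3,                       # stand-in for a real sensor reading
        "vibration_mm_s": 2.9,
    }
    client.publish("plant1/line3/telemetry", json.dumps(message), qos=1)
    time.sleep(5)
```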

The IIoT data stream can thus be visualized as a constantly running data pump, handled by a Big Data pipeline that takes the raw telemetry data from the gateways, decides which messages are of interest, and discards the ones not deemed significant from a business standpoint. Apache NiFi is your gateway and gatekeeper. It ingests the raw data, manages the flow of thousands of producers and consumers, and does basic data enrichment, in-stream sentiment analysis, aggregation, splitting, schema translation, format conversion and other initial steps to prepare the data. It does all that with a user-friendly web UI and an easily extendible architecture. It will then send raw or processed data to Kafka for further processing by Apache Storm, Apache Spark or other consumers. Apache Storm is a distributed real-time computation engine that reliably processes unbounded streams of data. Storm excels at handling complex streams of data that require windowing and other complex event processing. While Storm processes stream data at scale, Apache Kafka distributes messages at scale. Kafka is a distributed pub-sub real-time messaging system that provides strong durability and fault tolerance guarantees. NiFi, Storm and Kafka naturally complement each other, and their powerful cooperation enables real-time streaming analytics for fast-moving big data. All the stream processing is handled by the NiFi-Storm-Kafka combination.

Apache NiFi, Storm and Kafka integrate very closely to manage streaming dataflows.
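As a hedged illustration of the consumer side of such a pipeline, the sketch below reads telemetry that has been routed into a Kafka topic and keeps only the business-significant readings. The topic, broker address, field names and threshold are assumptions, and the kafka-python client is used purely for brevity; in the architecture described here this filtering would more likely live in NiFi or Storm.

```python
# A minimal sketch of a Kafka consumer that filters IIoT telemetry for
# business-significant readings. Topic, broker and threshold are assumptions.
import json
from kafka import KafkaConsumer   # kafka-python client

consumer = KafkaConsumer(
    "plant1-telemetry",
    bootstrap_servers=["kafka-broker:9092"],
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for record in consumer:
    reading = record.value
    # Keep only readings that breach the vibration threshold of interest.
    if reading.get("vibration_mm_s", 0) > 7.0:
        print(f"Anomalous vibration on {reading['device_id']}: {reading['vibration_mm_s']} mm/s")
```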

 

Appropriate logic is built into the higher layers to support device identification, ID lookup, secure authentication and transformation of the data. This layer will process data (cleanse, transform, apply a canonical representation) to support Business Automation (BPM), BI (business intelligence) and visualization for a variety of consumers. The data ingest layer will also provide notifications and alerts via Apache NiFi.

Here are some typical uses for this event processing pipeline:

a. Real-time data filtering and pattern matching

b. Enrichment based on business context

c. Real-time analytics such as KPIs, complex event processing etc

d. Predictive Analytics

e. Business workflow with decision nodes and human task nodes

Digital Application Tier – 

Once IIoT knowledge has become part of the Hadoop based Data Lake, all the rich analytics, machine learning and deep learning frameworks, tools and libraries now become available to Data Scientists and Analysts.   They can easily produce insights, dashboards, reports and real-time analytics with IIoT data joined with existing data in the lake including social media data, EDW data, log data.   All your data can be queried with familiar SQL through a variety of interfaces such as Apache Phoenix on HBase, Apache Hive LLAP and Apache Spark SQL.   Using your existing BI tools or the open sourced Apache Zeppelin, you can produce and share live reports.   You can run TensorFlow in containers on YARN for deep learning insights on your images, videos and text data; while running YARN clustered Spark ML pipelines fed by Kafka and NiFi to run streaming machine learning algorithms on trained models.
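A minimal sketch of this kind of exploration over the lake is shown below, using Spark SQL from Python. The lake path, table shape and column names follow the telemetry example above and are assumptions.

```python
# A minimal sketch of querying IIoT telemetry in the data lake with Spark SQL.
# The lake path and column names are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iiot-analytics").getOrCreate()

telemetry = spark.read.parquet("/data/lake/iiot/telemetry")   # placeholder lake path
telemetry.createOrReplaceTempView("telemetry")

# Daily average vibration per device, as an input to a dashboard or a model.
daily = spark.sql("""
    SELECT device_id,
           to_date(from_unixtime(timestamp)) AS day,
           AVG(vibration_mm_s)               AS avg_vibration
    FROM telemetry
    GROUP BY device_id, to_date(from_unixtime(timestamp))
""")
daily.show()
```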

A range of predictive applications are suitable for this tier. The models themselves should seek to answer business questions around things like asset failure, the key performance indicators in a manufacturing process and how they’re trending, insurance policy pricing etc.

Once the device data has been ingested into a modern data lake, key functions that need to be performed include data aggregation, transformation, enriching, filtering, sorting etc.

As one can see, this can get very complex very quickly – both from a data storage and a processing standpoint. A Cloud-based infrastructure, with its ability to provide highly scalable compute, network and storage resources, is a natural fit to handle bursty IIoT applications. However, IIoT applications add their own diverse requirements on computing infrastructure, namely the ability to accommodate hundreds of kinds of devices and network gateways – which means that IT must be prepared to support a large diversity of operating systems and storage types.

The tier is also responsible for the integration of the IIoT environment into the business processes of an enterprise. The IIoT solution ties into existing line-of-business applications and standard software solutions through adapters or Enterprise Application Integration (EAI) and business-to-business (B2B) gateway capabilities. End users in business-to-business or business-to-consumer scenarios will interact with the IIoT solution and the special-purpose IIoT devices through this layer. They may use the IIoT solution or line-of-business system UIs, including apps on personal mobile devices, such as smartphones and tablets.

Security Implementation

The topic of Security is perhaps the most important cross-cutting concern across all layers of the IIoT architecture stack. Needless to say, each of the layers must support the strongest data encryption, authentication and authorization capabilities for devices, users and partner applications. Accordingly, capabilities must be provided to ingest and store security feeds, IDS logs for advanced behavioral analytics, server logs, and device telemetry. These feeds must be constantly analyzed across three domains – the Device domain, the Business domain and the IT domain. The below blogpost delves into some of these themes and is a good read to get a deeper handle on this issue from a SOC (security operations center) standpoint.

An Enterprise Wide Framework for Digital Cybersecurity..(4/4)

Conclusion

It is evident from the above that IIoT will create enormous opportunity for businesses globally. It will also create layers of complexity and opportunity for Enterprise IT. The creation of smart digital services on the data served up will further depend on the vertical industries. Whatever the kind of business model – whether tracking behavior, location-sensitive pricing, business process automation etc – the end goal of the IT architecture should be to create enterprise business applications that are ultimately data native and analytics driven.


Here Is What Is Causing The Great Brick-And-Mortar Retail Meltdown of 2017..(1/2)

“Amazon and other pure plays are driving toward getting both predictive and prescriptive analytics. They’re analyzing and understanding information at an alarming rate. Brands have pulled products off of Amazon because they’re learning more about them than the brands themselves.” — Todd Michaud, Founder and CEO of Power Thinking Media

By April 2017, 17 major retailers had announced plans to close stores (Image Credit: Clark Howard)

We are witnessing a meltdown in Storefront Retail..

We are barely halfway through 2017, and the US business media is rife with stories of major retailers closing storefronts. The truth is inescapable that the Retail industry is in the midst of structural change. According to a research report from Credit Suisse, around 8,600 brick-and-mortar stores will shutter their doors in 2017. The number was 2,056 stores in 2016 and 5,077 in 2015, which points to industry malaise [1].

The Retailer’s bigger cousin – the neighborhood Mall – is not doing any better. There are around 1,200 malls in the US today and that number is forecast to decline to just about 900 in a decade.[3]

It is clear that in the coming years, Retailers (and malls) across the board will remain under pressure due to a variety of changes – technological, business model and demographic.

So what can legacy Retailers do to compete with and disarm the online upstart?

Six takeaways for Retail Industry watchers..

Six takeaways from the recent headlines that should make industry watchers take notice –

  1. The brick-and-mortar retail store pullback has accelerated in 2017 – a year of otherwise strong economic expansion. Typical indicators that influence consumer spending on retail are generally pointing upwards. Just sample the financial data – the US has seen increasing GDP for eight straight years, the last 18 months have seen wage growth for middle & lower income Americans, and gas prices are at all-time lows.[3] These kinds of relatively strong consumer data trends cannot explain a slowdown in physical storefronts. Consumer spending is not shrinking due to declining affordability/spending power.
  2. Retailers that have either declared bankruptcy or announced large-scale store closings include marquee names across the different categories of retail, ranging from Apparel to Home Appliances to Electronics to Sporting Goods. Just sample some of the names – Sports Authority, RadioShack, HHGregg, American Apparel, Bebe Stores, Aeropostale, Sears, Kmart, Macy’s, Payless Shoes, JC Penney etc. So this is clearly a trend across various sectors in retail and not confined to a given area, for instance, women’s apparel.
  3. Some of this “Storefront Retail bubble burst” can definitely be attributed to hitherto indiscriminate physical retail expansion. The first indicator here is the glut of residual excess retail space. The WSJ points out that the retail expansion dates back almost 30 years, when retailers began a “land grab” to open more stores – not unlike the housing boom a decade or so ago. [1] North America now has a glut of both retail stores and shopping malls while per capita sales have begun declining. The US especially has almost five times the retail space per capita compared to the UK. American consumers are also swapping materialism for more experiences.[3] Thus, an over-buildout of retail space is one of the causes of the ongoing crash.

    The US has way more shopping space compared to the rest of the world. (Credit – Cowan and Company)
  4. The dominant retail trend in the world is online ‘single click’ shopping. This is evidenced by declining in-store Black Friday sales in 2016 when compared with record Cyber Monday (online) sales. As online e-commerce volume increases year on year, online retailers led by Amazon are surely taking market share away from the struggling brick-and-mortar Retailer who has not kept up with the pace of innovation. The uptick in online retail is unmistakable, as evidenced by the below graph (src – ZeroHedge) depicting the latest retail figures. Department-store sales rose 0.2% on the month, but were down 4.5% from a year earlier. Online retailers such as Amazon posted a 0.6% gain from the prior month and an 11.9% increase from a year earlier.[3]

    Retail Sales – Online vs In Store Shopping (credit: ZeroHedge)
  5. Legacy retailers are trying to play catch-up with the upstarts who excel at technology. This has sometimes translated into acquisitions of online retailers (e.g. Walmart’s buy of Jet.com). However, the Global top 10 Retailers are dominated by the likes of Walmart, Costco, Kroger, Walgreens etc. Amazon comes in only at #10, which implies that this battle is only in its early days. However, legacy retailers are saddled with huge fixed costs & their investors prefer dividend payouts to investments in innovation. Thus their CEOs are incentivized to focus on the next quarter, not the next decade like Amazon’s Jeff Bezos, who famously shows little concern for increasing Amazon’s near-term profitability. Though traditional retailers have begun accelerating investments (both organic and via acquisition) in the critical areas of Cloud Computing, Big Data, Mobility and Predictive Analytics – the web scale majors such as Amazon are far, far ahead of the typical Retail IT shop.

  6. The fastest growing Retail industry brands are companies that use Data as a core business capability to impact the customer experience, versus treating it as just another component of an overall IT system. Retail is a game of micro customer interactions that drive sales and margin. This implies a Retailer’s ability to work with realtime customer data – whether it’s sentiment data, clickstream data or historical purchase data – to drive marketing promotions, personally relevant services, order fulfillment, show-rooming, loyalty programs etc. (a minimal sketch follows this list). On the back end, the ability to streamline operations by pulling together data from operations and supply chains is helping retailers fine-tune & automate operations, especially from a delivery standpoint.
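
    To make the realtime-data point concrete, here is a minimal pandas sketch that joins a hypothetical clickstream feed with purchase history to flag customers for a targeted promotion. The table names, columns and thresholds are illustrative assumptions, not a reference implementation.

    ```python
    import pandas as pd

    # Hypothetical inputs: a clickstream feed and a purchase-history extract.
    clicks = pd.DataFrame({
        "customer_id": [1, 1, 2, 3, 3, 3],
        "product_category": ["shoes", "shoes", "tv", "shoes", "shoes", "jackets"],
    })
    purchases = pd.DataFrame({
        "customer_id": [1, 2, 3],
        "last_purchase_days_ago": [12, 180, 45],
    })

    # Count category views per customer and join with recency of last purchase.
    views = (clicks.groupby(["customer_id", "product_category"])
                   .size().reset_index(name="views"))
    scored = views.merge(purchases, on="customer_id", how="left")

    # Simple rule: heavy browsing + no recent purchase => send a promotion.
    scored["send_promo"] = (scored["views"] >= 2) & (scored["last_purchase_days_ago"] > 30)
    print(scored[scored["send_promo"]])
    ```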

    In Retail, Technology Is King..

    So, what makes Retail somewhat of a unique industry in terms of its data needs? I posit that there are four important characteristics –

    • First and foremost, Retail customers, especially the millennials, are very open about sharing their brand preferences and experiences on social media. There is a treasure trove of untapped data out there that needs to be collected and monetized. We will explore this in more detail in the next post.
    • Secondly, leaders such as Amazon use customer and product data, plus a range of other technology capabilities, to shape the customer experience – versus the other way around for traditional retailers. They do this based on predictive analytic approaches such as machine learning and deep learning. Case in point is Amazon, which has now morphed from an online retailer into a Cloud Computing behemoth with its market-leading AWS (Amazon Web Services). In fact its best-in-class IT has enabled it to experiment with retail business models – e.g. the $99-a-year Amazon Prime subscription, which includes free two-day delivery plus music and video streaming services that compete with Netflix. As of March 31, 2017 Amazon had 80 million Prime subscribers in the U.S., an increase of 36 percent from a year earlier, according to Consumer Intelligence Research Partners.[3]
    • Thirdly, Retail organizations need to become Data driven businesses. What does that mean or imply? They need to rely on data to drive every core business process – e.g. realtime insights about customers, supply chains, order fulfillment and inventory. This data spans everything from traditional structured data (sales data, store level transactions, customer purchase histories, supply chain data, advertising data etc.) to non-traditional data (social media feeds – there is a strong correlation between the products people rave about and what they ultimately purchase – location data, economic performance data etc.). This Data variety represents a huge challenge to Retailers in terms of managing, curating and analyzing these feeds.
    • Fourth, Retailers need to begin aggressively combining the IoT capabilities they already have in place with Predictive Analytics. This implies tapping and analyzing data from in-store beacons, sensors and actuators across a range of use cases, from location-based offers to restocking shelves (a minimal sketch follows this list).
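
    As a sketch of the in-store IoT idea above, the snippet below watches hypothetical shelf-sensor readings and raises restock alerts when stock falls below a threshold. The sensor feed, shelf names and threshold are illustrative assumptions.

    ```python
    import pandas as pd

    # Hypothetical shelf-sensor readings streamed from in-store devices.
    readings = pd.DataFrame({
        "shelf_id": ["A1", "A1", "B2", "B2"],
        "timestamp": pd.to_datetime(["2017-06-01 09:00", "2017-06-01 12:00",
                                     "2017-06-01 09:00", "2017-06-01 12:00"]),
        "units_on_shelf": [40, 6, 25, 22],
    })

    RESTOCK_THRESHOLD = 10  # assumed per-shelf minimum

    # Take the latest reading per shelf and flag anything below the threshold.
    latest = (readings.sort_values("timestamp")
                      .groupby("shelf_id").tail(1))
    alerts = latest[latest["units_on_shelf"] < RESTOCK_THRESHOLD]
    for _, row in alerts.iterrows():
        print(f"Restock shelf {row.shelf_id}: only {row.units_on_shelf} units left")
    ```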

      ..because it enables new business models..

      None of the above analysis claims that physical stores are going away. They serve a very important function in allowing consumers to try products in person and in providing a human experience. However, online is definitely where the growth will primarily be.

      The Next and Final Post in this series..

      It is very clear from the above that it now makes more sense to talk about a Retail Ecosystem which is composed of store, online, mobile and partner storefronts.

      In that vein, the next post in this two part series will describe the below four progressive strategies that traditional Retailers can adopt to survive and favorably compete in today’s competitive (and increasingly online) marketplace.

      These are –

    • Reinventing Legacy IT Approaches – Adopting Cloud Computing, Big Data and Intelligent Middleware to re-engineer Retail IT

    • Changing Business Models by accelerating the adoption of Automation and Predictive Analytics – Increasing Automation rates of core business processes and infusing them with Predictive intelligence thus improving customer and business responsiveness

    • Experimenting with Deep Learning Capabilities – the use of Advanced AI such as Deep Neural Nets to impact the entire lifecycle of Retail

    • Adopting a Digital or a ‘Mode 2’ Mindset across the organization – No technology can transcend a large ‘Digital Gap’ without the right organizational culture

      Needless to say, the theme across all of these strategies is to leverage Digital technologies to create immersive cross channel customer experiences.

References..

[1] WSJ – “Three Hard Lessons the Internet Is Teaching Traditional Stores” – https://www.wsj.com/articles/three-hard-lessons-the-internet-is-teaching-traditional-stores-1492945203

[2] The Atlantic – “The Retail Meltdown” – https://www.theatlantic.com/business/archive/2017/04/retail-meltdown-of-2017/522384/?_lrsc=2f798686-3702-4f89-a86a-a4085f390b63

[3] WSJ – “Retail Sales Fall for the Second Straight Month” – https://www.wsj.com/articles/u-s-retail-sales-fall-for-second-straight-month-1492173415

Why Big Data Analytics is the Future of CRM..

A question that I get a lot from customers is how Big Data can help augment CRM systems. The answer isn’t just about the ability to aggregate loads of information to produce much richer views of the data, but also about feeding this data into richer digital analytics.

Why Combine CRM with Big Data

Customer Relationship Management (CRM) systems primarily revolve around customer information and capture a customer’s interactions with a company. The strength of CRM systems is their ability to work with structured data such as customer demographic information (Name, Identifiers, Address, product history etc.).

Industry customers will want to use their core CRM customer profiles as a foundational capability and then augment it with additional data as shown in the below diagram –

  1. Core CRM records, shown at the bottom layer, storing structured customer contact data
  2. Extended attribute information from MDM systems
  3. Customer Experience Data such as Social (sentiment, propensity to buy), Web clickstreams, 3rd party data, etc. (i.e. behavioral, demographics, lifestyle, interests, etc).
  4. Any Linked accounts for customers
  5. The ability to move to a true Customer 360 or Single View
How Big Data Can Augment CRM systems (credit – Mike Ger)

All of these non-traditional data streams shown above and depicted below can be stored on commodity hardware clusters at a fraction of the cost of traditional SAN storage. The combined data can then be analyzed effectively in near real time, thus providing support for advanced business capabilities.

The Seven Kinds of Non-Traditional Data that have become prevalent over the last five years

Seven Common Business Capabilities

Once all of this data has been ingested into a data lake from CRM systems, Book of Record Transaction Systems (BORT), unstructured data sources etc., the following kinds of analysis can be performed on it. Big Data based on Hadoop can help join CRM data (customer demographics, sales information, advertising campaign info etc.) with this additional data; a minimal join sketch follows the list below. This rich view of a complete dataset can provide the below business capabilities.

  • Customer Segmentation – For a given population, predict for each individual the discrete set of classes that the individual belongs to. An example classification is – “Of all retail banking clients in a given population, who are most likely to respond to an offer to move to a higher segment?”
  • Pattern recognition and analysis – discover new combinations of business patterns within large datasets, e.g. combining a customer’s structured data with clickstream data analysis. A major bank in NYC is using this data to settle mortgage loans.
  • Customer Sentiment analysis is a technique used to find degrees of customer satisfaction and how to improve them, with a view to increasing customer Net Promoter Scores (NPS).
  • Market basket analysis is commonly used to find associations between products that are purchased together, with a view to improving product marketing – e.g. recommendation engines that determine which banking products to recommend to customers.
  • Profiling algorithms aim to characterize the normal or typical behavior of an individual or group within a larger population. They are frequently used in anomaly detection systems such as those that detect AML (Anti Money Laundering) issues and Credit Card fraud.
  • Clustering algorithms divide data into groups, or clusters, of items that have similar properties.
  • Causal Modeling algorithms attempt to find out what business events influence others.
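
As a sketch of the “join CRM data with additional data” step described above, the PySpark snippet below joins a hypothetical CRM extract with clickstream events in the data lake. The table paths and column names are assumptions for illustration, not a prescribed layout.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("crm-enrichment").getOrCreate()

# Hypothetical data-lake locations for the CRM extract and clickstream feed.
crm = spark.read.parquet("/lake/crm/customers")          # customer_id, segment, ltv
clicks = spark.read.parquet("/lake/web/clickstream")     # customer_id, page, ts

# Aggregate web behaviour per customer and join it onto the CRM profile.
behaviour = (clicks.groupBy("customer_id")
                   .agg(F.count("*").alias("page_views"),
                        F.countDistinct("page").alias("distinct_pages")))

enriched = crm.join(behaviour, on="customer_id", how="left")
enriched.write.mode("overwrite").parquet("/lake/analytics/customer_enriched")
```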

Four business benefits of combining Big Data with CRM Systems –

  1. Hadoop can make CRM systems more efficient and cost effective – Most CRM technology is based on an underlying relational database or enterprise data warehouse. These legacy data storage technologies suffer from data collection delays and processing challenges. Hadoop, with its focus on Schema On Read (SOR) and parallelism, can enable low cost storage combined with efficient processing (see the schema-on-read sketch after this list).
  2. This integration can focus on improving customer experience – Combining past interactions with historical data across both systems can provide a realtime single view of a customer, thus helping agents work better with their customers.
  3. Combine Data in innovative ways to create new products – Once companies have deep insights into customer behavior and purchasing patterns, they can combine the data to create new, or modify existing, services and products.
  4. Gain Realtime insights – Online transactions are increasing in number year on year. The onset of Digital Architectures in enterprise businesses implies the ability to drive continuous online interactions with global consumers/customers/clients or patients. The goal is not just to provide engaging visualizations but also to personalize services clients care about across multiple channels of interaction. The only way to attain digital success is to understand your customers at a micro level while constantly making strategic decisions on your offerings to the market. This essentially means operating in a real time world – which leads to Big Data.
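
A brief sketch of the schema-on-read point in item 1: rather than modeling tables upfront, Spark can infer the structure of raw CRM event dumps at read time. The file path and fields are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("schema-on-read").getOrCreate()

# No upfront relational modeling: the schema is inferred when the raw
# JSON interaction logs are read from the data lake.
interactions = spark.read.json("/lake/raw/crm_interactions/*.json")
interactions.printSchema()

# New fields added by the source system simply show up in the inferred
# schema on the next read, instead of requiring a warehouse schema change.
interactions.createOrReplaceTempView("interactions")
spark.sql("SELECT channel, COUNT(*) AS events FROM interactions GROUP BY channel").show()
```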

To Sum Up…

Combining CRM with Big Data can help maximize competitive advantage across every industry vertical. These advantages stem not only from cheaply storing and analyzing vastly richer data, but also from deploying the resulting business insights in areas such as marketing, customer service and new product ideation.

How Modern Taxation can benefit from Big Data Analytics..

We have repeatedly discussed how Predictive Analytics built on Big Data capabilities can create a tremendous amount of differentiation and value for Banking and Insurance companies. However, one of the most important areas of public finance – taxation – is just waking up to the possibilities of using this game changing technology, with the goal of increasing public revenues via the collection of indirect taxes. In this blogpost, we will examine both perspectives – the taxation authority as well as the Corporate under the burden of tax reporting.


National Tax Authorities begin information sharing..

The last few years have seen increased digitalization of national tax accounting systems. More and more citizens opt to file their returns electronically. Reporting requirements for businesses (such as Banks) that serve wealthy account holders with substantial financial assets spanning continents have also gone up. Various pieces of legislation have been promulgated in an effort to ensure that the exchange of information on tax matters is as seamless as possible – with the goal of curbing tax evasion.

The US Government’s tax arm – the IRS (Internal Revenue Service) – introduced FATCA (the Foreign Account Tax Compliance Act) in 2010 with the intention of detecting and tracking US nationals who reside abroad and owe US taxes. FATCA intends to prevent US taxpayers who hold non-US financial assets from avoiding taxation by ensuring that foreign banking institutions report on their US account holders or face a withholding of 30% on all US income.

The Organization for Economic Cooperation and Development (OECD), in conjunction with the US authorities, has adopted the Common Reporting Standard (CRS) as an information standard for the automatic exchange of taxpayer information. As part of CRS, more than 90 countries now share information on residents’ assets and incomes in conformance with the reporting standard. CRS is far wider than FATCA and imposes a significant burden on the compliance team in a bank.

As an example, both the US and the UK governments enforce compliance with CRS. This enables bilateral information sharing between both national taxation authorities. [1]

To that end, national and regional tax authorities have begun using Big Data techniques refined in the private sector to increase the collection of taxes from citizens while reducing fraud and waste in the system.

Across a range of national compliance regimes, Big Data can help improve both tax collection and the risk scoring process. It can do this by way of advanced analytics that operate on a much richer and wider set of data than was possible before. On the business side, it can help ensure compliance in reporting and avoid overpaying.

Catalogued below are the important areas in which Big Data Analytics can present a huge impact in the field of public taxation.

# 1: Help both Taxation Authorities and Finance Departments Efficiently Store and Process Enormous Amounts of Taxation Data 

The point is well made that the entire indirect taxation process is complex in terms of both the breadth and the depth of documentation – for corporate tax departments as well as the authorities. Businesses need to look at a complex range of multifaceted tax data across all of the thousands of national, state and local jurisdictions they operate in. Not being able to store and process this data in one place induces all the challenges caused by data silos. The Data Lake is a natural fit to store historical tax documents, Accounts Payables/Receivables, Expense information, business receipts, emails, call transcripts etc. Data should be ingested from the various source systems that drive the tax process: Master Data, Reference Data (containing fresh & accurate tax rate/jurisdiction data), the various Book of Record Systems, ERP, finance and legacy financial systems. The key point here is to automate this data movement and cut down manual ingestion processes, thus improving both the speed and quality of the process at this first & most important step.
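
As an illustrative sketch of automating that first ingestion step, the snippet below loads hypothetical Accounts Payable/Receivable extracts into the data lake as Parquet, partitioned by jurisdiction. The file paths and column names are assumptions, not a prescribed layout.

```python
import glob
import pandas as pd

# Hypothetical landing zone with AP/AR extracts dropped by source systems.
frames = []
for path in glob.glob("/landing/tax/ap_ar/*.csv"):
    df = pd.read_csv(path, parse_dates=["invoice_date"])
    df["source_file"] = path          # keep lineage back to the raw file
    frames.append(df)

ap_ar = pd.concat(frames, ignore_index=True)

# Basic standardization before the data hits the lake.
ap_ar["jurisdiction"] = ap_ar["jurisdiction"].str.upper().str.strip()

# Write to the lake partitioned by jurisdiction (requires pyarrow).
ap_ar.to_parquet("/lake/tax/ap_ar", partition_cols=["jurisdiction"], index=False)
```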

                                                    Click on Image to view a blogpost on Data Lakes 

# 2: Perform Accurate Customer Due Diligence (CDD) by Creating a Single View of Compliance

Given the complexity of both compliance regimes, Big Data can help automate Customer Due Diligence (CDD) / Know Your Customer (KYC) processes. This is important in helping improve both tax collection and the risk scoring process.

Big Data Analytics can make CDD highly automated, as opposed to a manual, laborious process, especially around tax avoidance watch lists or suspicious accounts. One of the first steps here is to create a 360 degree view of an entity – whether a high-risk individual or a corporate – for taxation purposes. Doing so enables better account servicing as well as a holistic view of potentially fraudulent activity. KYC programs are becoming increasingly daunting undertakings due to issues such as difficulty in identifying customers across multiple lines of business, and the lack of a consistent view of customer bank product use and transaction activity. Further complicating these challenges is the advent of new risks such as digital currencies, new and unique payment methods, and continued variation in global data privacy regulation – all of which are resulting in enhanced regulatory scrutiny of banks’ readiness in this area.
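
A minimal sketch of the 360-degree entity view idea: normalize identifiers from several account systems and roll accounts up to a single entity key. The column names and matching rule are illustrative assumptions; production CDD/KYC matching is far more involved.

```python
import pandas as pd

# Hypothetical account records pulled from two different systems.
accounts = pd.DataFrame({
    "system": ["core_banking", "brokerage", "core_banking"],
    "holder_name": ["ACME Holdings LLC", "Acme Holdings, LLC", "Jane Doe"],
    "tax_id": ["98-7654321", "98-7654321", "123-45-6789"],
    "balance": [1_200_000, 350_000, 42_000],
})

# Crude normalization: real programs use fuzzy matching & reference data.
accounts["entity_key"] = accounts["tax_id"].str.replace("-", "", regex=False)

# Roll up to one row per entity - the start of a single compliance view.
entity_view = (accounts.groupby("entity_key")
                       .agg(names=("holder_name", lambda s: sorted(set(s))),
                            systems=("system", "nunique"),
                            total_balance=("balance", "sum"))
                       .reset_index())
print(entity_view)
```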

# 3: Predicting Tax Yields (or Liabilities) More Accurately

Leveraging the ingestion and predictive capabilities of a Big Data based platform, tax authorities can create a full picture of an individual or entity across all accounts and geographies. Internal bank compliance teams can do the same with their client accounts. This can be used to predict tax yields and compliance numbers more accurately.

# 4: Vastly Improve Tax Fraud & Evasion Detection by Improving Risk Scoring

The Banking industry has already begun leveraging sophisticated fraud detection strategies using socio-demographic data and taxpayer behavior. Big Data can be used to look at tens of attributes per individual that were previously missed, even within internal structured data. Adding a machine learning process on top of this can help detect micro patterns of fraud across tens of accounts and geographies.

Big Data based analytics also operate at scale, thus eliminating the manual and cumbersome spreadsheet-based analysis that bedevils the ability to quickly and visually detect tax fraud and evasion.
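
To make the risk-scoring idea concrete, here is a minimal scikit-learn sketch that trains a classifier on a handful of hypothetical socio-demographic and filing attributes and scores accounts by fraud risk. The features and synthetic data are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in features: income, declared deductions, # of offshore
# accounts, years filing. Real models would use far richer attributes.
X = rng.normal(size=(1000, 4))
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=1000) > 1.2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Risk score = predicted probability of the "evasion" class.
risk_scores = model.predict_proba(X_test)[:, 1]
print("Highest-risk cases for audit review:", np.argsort(risk_scores)[-5:])
```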

# 5: Improve the Auditability & Accuracy of Regulatory Reporting

Most businesses currently use ERP systems, and tax engines within them, to help with their taxation process – for example, to calculate VAT (Value Added Tax) obligations. These traditional tax engines and tools suffer from the issues that plague traditional tax data storage: they operate on limited data sets which may or may not be accurate, and their manual processes and reconciliations lead to more frequent audits. Big Data, with its focus on data quality and overall governance, can help remedy these issues. It can improve the quality, timeliness and overall confidence in the reporting, thus leading to a lower number of audits.

Conclusion..

Opening the door to the latest data storage and processing techniques can help taxation authorities introduce a higher degree of automation into their core business functions. This will allow them to reduce manual data operations and avoid reconciliation & reporting discrepancies – thus reducing costly audits – and enable them to focus on tasks such as strategic forecasting and better tax planning.

References..

[1] Indirect Tax Compliance in an Era of Big Data – Gillis, McStocker and Percival, KPMG –   http://www.bna.com/indirect-tax-compliance-n17179927659/

How Big Data & Advanced Analytics can help Real Estate Investment Trusts (REITS)

                                                         Image Credit – Kiplinger’s

Introduction…

Real Estate Investment Trusts (REITS) are financial companies that own various forms of commercial and residential real estate. These assets include office buildings, retail shopping centers, hospitals, warehouses, timberland, hotels etc. Real estate is growing quite nicely as a component of the global financial business. Given their focus on real estate investments, REITS have always occupied a specialized position in global finance.

Fundamentally, there are three types of REITS –

  1. Equity REITS which exclusively deal in acquiring, improving and selling properties with the aim of higher returns for their investors
  2. Mortgage REITS only buy and sell mortgages
  3. Hybrid REITS which do both #1 and #2 above

REITS have a reasonably straightforward business model – you take the yields from the properties you own and reinvest the funds to be able to pay your investors (REITS are mandated to pay out at least 90% of their taxable income as dividends). Most of the traditional REIT business processes are well handled by conventional types of technology. However, more and more REITS are being challenged to develop a compelling Big Data strategy that leverages their tremendous data assets.

The Five Key Big Data Applications for REITS… 

Let us consider the five key areas where advanced analytics built on a Big Data foundation can immensely help REITS.

#1 Property Acquisition Modeling 

REITS owners can leverage the rich datasets available around renters’ demographics, preferences, seasonality and economic conditions in specific markets to better guide capital decisions on acquiring property. This modeling needs to take into account land costs, development costs, fixture costs & any other sales and marketing costs needed to appeal to tenants. I’d like to call this the macro business perspective. From a micro business perspective, being able to better study individual properties using a variety of widely available data – MLS listings for similar properties, foreclosures, proximity to retail establishments and work sites, building profiles, parking spaces, energy footprint etc. – can help them match tenants to their property holdings. All this is critical to getting their investment mix right to meet profitability targets.
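
A minimal sketch of the micro-perspective modeling described above: fit a simple regression relating hypothetical property attributes (size, proximity to retail, parking, energy cost) to achieved rent. The feature names and synthetic data are illustrative assumptions, not a reference model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)

# Hypothetical features per property: sq_ft, distance_to_retail_km,
# parking_spaces, energy_cost_per_sqft.
X = np.column_stack([
    rng.uniform(1_000, 20_000, 200),   # sq_ft
    rng.uniform(0.1, 10.0, 200),       # distance_to_retail_km
    rng.integers(0, 200, 200),         # parking_spaces
    rng.uniform(1.0, 5.0, 200),        # energy_cost_per_sqft
])
# Synthetic "achieved annual rent" target for illustration only.
y = 15 * X[:, 0] - 20_000 * X[:, 1] + 500 * X[:, 2] + rng.normal(0, 50_000, 200)

model = LinearRegression().fit(X, y)

candidate = np.array([[12_000, 0.8, 80, 2.5]])
print("Projected annual rent for candidate property:", model.predict(candidate)[0])
```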

                                  Click on the Image for a blogpost discussing Predictive Analytics in Capital Markets

#2 Portfolio Modeling 

REITS can leverage Big Data to perform more granular modeling of their MBS portfolios. As an example, they can feed a lot more data into their existing models as discussed above – e.g. demographic data, macroeconomic factors et al.

A simple scenario: if interest rates go up by X basis points, what does that mean for my portfolio exposure, default rate, cost picture, and the optimal times to buy certain MBS’s? REITS can then use that information to enter hedges to protect against any downside. Big Data can also help with a range of predictive modeling across all of the above areas, as discussed below. An example is to build a 360 degree view of a given investment portfolio.
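
A minimal sketch of the rate-shock scenario: using a first-order duration approximation (ΔP ≈ −D × Δy × P) to estimate how a hypothetical MBS portfolio’s value moves if rates rise by X basis points. The holdings and durations are made-up inputs, and real MBS modeling would also account for convexity and prepayment behavior.

```python
import pandas as pd

# Hypothetical MBS holdings with market value and effective duration.
portfolio = pd.DataFrame({
    "holding": ["MBS_A", "MBS_B", "MBS_C"],
    "market_value": [50_000_000, 30_000_000, 20_000_000],
    "effective_duration": [4.5, 6.2, 2.8],
})

rate_shock_bps = 50
dy = rate_shock_bps / 10_000.0

# First-order approximation: dP ~ -D * dy * P (ignores convexity/prepayment).
portfolio["value_change"] = -portfolio["effective_duration"] * dy * portfolio["market_value"]

print(portfolio)
print("Total portfolio impact:", portfolio["value_change"].sum())
```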

                                                         Click on Image for a Customer 360 discussion 

#3 Risk Data Aggregation & Calculations 

The instruments underlying the portfolios themselves carry large amounts of credit & interest rate risk. Big Data is a fantastic platform for aggregating and calculating many kinds of risk exposures, as the below link discusses in detail.

  

                                            Click on Image for a discussion of Risk Data Aggregation and Measurement 

 

#4 Detect and Prevent Money Laundering (AML)

Due to the global nature of investment funds flowing into real estate, REITS are highly exposed to money laundering and sanctions risks. Whether REITS operate in high-risk geographies (India, China, South America, Russia etc.) or have complex holding structures, they need to file SARs (Suspicious Activity Reports) with FinCEN. There has always been a strong case to be made that shady foreign entities and individuals were laundering ill-gotten proceeds to buy US real estate. In early 2016, FinCEN began implementing Geographic Targeting Orders (GTOs). Title companies based in the United States are now required to clearly identify the real owners of limited liability companies (LLCs), partnerships, and other legal entities being used to purchase high end residential real estate using cash.

AML as a topic is covered exhaustively in the below series of blogposts (please click on image to open the first one).

                                                         Click on Image for a Deepdive on AML

#5 Smart Cities, Net New Investments and Property Management

In the future, REITS will want to invest in Smart Cities, which are positioned to be leading urban centers offering mobility, green technology, personalized medicine, safe services, clean water, traffic management and other forward looking urban amenities. These Smart Cities target a new kind of client – upwardly mobile, technologically savvy, environmentally conscious millennials. According to RBC Capital Markets, Smart Cities present a massive investment opportunity for REITS. Such investments could offer REITS income yields of around 10-20%. (Source – Ben Forster @ Schroders).

Smart Cities will be created using a number of high end technologies such as IoT, AI, Virtual Reality, Device Meshes etc. By 2020, it is estimated that these buildings will be generating an enormous amount of data that needs to be stored and analyzed by landlords.

As the below graphic from Cisco attests, the ability to work with IoT data to analyze a range of these micro investment opportunities is a Big Data challenge.

The ongoing maintenance and continuous refurbishment of rental properties is a large portion of the business operation of a REIT. The availability of smart sensors and other IoT devices that can track air quality, home appliance malfunctions etc. can help greatly with preventive maintenance.
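
As a sketch of the preventive-maintenance idea, the snippet below flags anomalous readings from a hypothetical appliance sensor using a rolling z-score; the sensor feed, window size and threshold are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical hourly vibration readings from an HVAC unit, with a fault at the end.
readings = pd.Series(np.concatenate([rng.normal(1.0, 0.05, 200),
                                     rng.normal(1.6, 0.05, 5)]))

window = 48  # two days of hourly data
rolling_mean = readings.rolling(window).mean()
rolling_std = readings.rolling(window).std()

# Rolling z-score: how far the latest reading sits from recent behaviour.
z = (readings - rolling_mean) / rolling_std
alerts = readings[z.abs() > 4]

print(f"{len(alerts)} readings flagged for preventive maintenance")
```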

Conclusion..

As can be seen from the above business areas, most REITS’ data needs require a holistic approach across the value chain (capital sourcing, investment decisions, portfolio management & operations). This approach spans horizontal functions like Customer Segmentation, Property Acquisition, Risk, Finance and Business Operations.
The need of the hour for larger REITS is to move to a common model for data storage, model building and testing. It is becoming increasingly obvious that Big Data can provide massive business opportunities for REITS.

Why Big Data & Advanced Analytics are Strategic Corporate Assets..

The industry is all about Digital now. The explosion in data storage and processing techniques promises to create new digital business opportunities across industries. Business Analytics concerns itself with deriving insights from data produced as a byproduct of business operations, as well as from external data that reflects customer insights. Due to its critical importance in decision making, Business Analytics is now a boardroom matter and not just one confined to the IT teams. My goal in this blogpost is to quickly introduce the analytics landscape before moving on to the significant value drivers that only Predictive Analytics can provide.

The Impact of Business Analytics…

The IDC “Worldwide Big Data and Analytics Spending Guide 2016” predicts that the big data and business analytics market will grow from $130 billion at the end of this year to $203 billion by 2020 [1], at a compound annual growth rate (CAGR) of 11.7%. This growth is being experienced across industry verticals such as banking & insurance, manufacturing, telecom, retail and healthcare.

Further, during the next four years, IDC finds that large enterprises with 500+ employees will be the main driver in big data and analytics investing, accounting for about $154 billion in revenue. The US will lead the market with around $95 billion in investments during the next four years – followed by Western Europe & the APAC region [1].

The two major kinds of Business Analytics…

When we discuss the broad topic of Business Analytics, it needs to be clarified that there are two major disciplines – Descriptive and Predictive. Industry analysts from Gartner & IDC etc. will tell you that one also needs to widen the definition to include Diagnostic and Prescriptive. Having worked in the industry for a few years, I can safely say that these can be subsumed into the above two major categories.

Let’s define the major kinds of industrial analytics at a high level –

  • Descriptive Analytics is commonly described as being retrospective in nature, i.e. “tell me what has already happened”. It covers a range of areas traditionally considered as BI (Business Intelligence). BI is a traditional & well established analytical domain that takes a retrospective look at business data in systems of record: it supports operational business processes like customer onboarding, claims processing and loan qualification via dashboards, process metrics and KPIs (Key Performance Indicators), and applies basic mathematical techniques (such as trending & aggregation) to infer intelligence from that data. The goal of the Descriptive disciplines is primarily to look for macro or aggregate business trends across different aspects or dimensions such as time, product lines, business units & operating geographies.

  • Predictive Analytics is the forward looking branch of analytics which tries to predict the future based on information about the past. It describes what “can happen based on the patterns in data”. It covers areas like machine learning, data mining, statistics, data engineering & other advanced techniques such as text analytics, natural language processing, deep learning, neural networks etc. A more detailed primer on both, along with detailed use cases, can be found here –

The Data Science Continuum in Financial Services..(3/3)

The two main domains of Analytics are complementary yet different…

Predictive Analytics does not intend to, nor will it, replace the BI domain; it adds significantly more sophisticated analytical capabilities, enabling businesses to do more with all the data they collect. It is not uncommon to find real world business projects leveraging both these analytical approaches.

However from an implementation standpoint, the only common area of both approaches is knowledge of the business and the sources of data in an organization. Most other things about them vary.

For instance, predictive approaches both augment & build on the BI paradigm by adding a “What could happen” dimension to the data.

The Descriptive Analytics/BI workflow…

BI projects tend to follow a largely structured process which has been well defined over the last 15-20 years. As the illustration below describes, data produced in operational systems is extracted, transformed and eventually loaded into a data warehouse for consumption by visualization tools.


                                                                       The Descriptive Analysis Workflow 

Descriptive Analytics and BI add tremendous value to well defined use cases based on a retrospective look at data.

However, key challenges with this process are –

  1. the lack of a platform to standardize and centralize data feeds leads to data silos, which cause all kinds of cost & data governance headaches across the landscape
  2. complying with regulatory initiatives (such as Anti Money Laundering or Solvency II etc.) requires the warehouse to handle varying types of data, which is a huge challenge for most EDW technologies
  3. adding new & highly granular fields to the data feeds in an agile manner is difficult, since it requires extensive relational modeling upfront to handle newer kinds of schemas

Big Data platforms have overcome past shortfalls in security and governance and are being used in BI projects at most organizations. An example of the usage of Hadoop in classic BI areas like Risk Data Aggregation are discussed in depth at the below blog.

http://www.vamsitalkstech.com/?p=2697

This space serves a large existing base of customers, but the industry has been looking to Big Data as a way of constructing a central data processing platform which can help with the above issues.

BI projects are predicated on using an EDW (Enterprise Data Warehouse) and/or RDBMS (Relational Database Management System) approach to store & analyze the data. Both these kinds of data storage and processing technologies are legacy in terms of both the data formats they support (Row-Column based) as well as the types of data they can store (structured data).

Finally, these systems fall short of processing the data volumes generated by digital workloads, which tend to be loosely structured (e.g. mobile application front ends, IoT devices like sensors, ATM machines or Point of Sale terminals), which need business decisions to be made in near real time or in micro batches (e.g. detecting credit card fraud, suggesting the next best action for a bank customer etc.), and which are increasingly cloud & API based to save on costs & to provide self-service.

That is where Predictive Approaches on Big Data platforms are beginning to shine and fill critical gaps.

The Predictive Analytics workflow…

Though the field of predictive analytics has been around for years, it is rapidly witnessing a rebirth with the advent of Big Data. Hadoop ecosystem projects are enabling the easy ingestion of massive quantities of data, thus helping the business gather far more attributes about their customers and their preferences.


                                                                    The Predictive Analysis Workflow

The Predictive Analytics workflow always starts with a business problem in mind. Examples of these would be “A marketing project to detect which customers are likely to buy new products or services in the next six months based on their historical & real time product usage patterns – which are denoted by x, y or z characteristics” or “Detect real-time fraud in credit card transactions.”

In use cases like these, the goal of the data science process is to segment & filter customers by corralling them into categories that enable easy ranking. Once this is done, the business is involved to set up easy and intuitive visualizations to present the results.

A lot of the time, business groups have a hard time explaining what they would like to see – both the data and the visualization. In such cases, a prototype makes things easier from a requirements gathering standpoint. Once the problem is defined, the data scientist/modeler identifies the raw data sources (both internal and external) needed to tackle the business challenge. They spend a lot of time collating the data (from Oracle/SQL Server, DB2, Mainframes, Greenplum, Excel sheets, external datasets, etc.). The cleanup process involves fixing missing values and corrupted data elements, formatting fields that indicate time and date, etc.

The data wrangling phase involves writing code to be able to join various data elements so that a single client’s complete dataset is gathered in the Data Lake from a raw features standpoint.  If more data is obtained as the development cycle is underway, the Data Science team has no option but to go back & redo the whole process. The modeling phase is where algorithms come in – these can be supervised or unsupervised. Feature engineering takes in business concepts & raw data features and creates predictive features from them. The Data Scientist takes the raw & engineered features and creates a model using a mix of various algorithms. Once the model has been repeatedly tested for accuracy and performance, it is typically deployed as a service. Models as a Service (MaaS) is the Data Science counterpart to Software as a Service. The MaaS takes in business variables (often hundreds of inputs) and provides as output business decisions/intelligence, measurements, & visualizations that augment decision support systems.
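
A minimal sketch of the Models-as-a-Service pattern described above: a trained scikit-learn model wrapped in a small Flask endpoint that takes feature values in and returns a score. The feature count, route and training data are illustrative assumptions, not a reference deployment.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a stand-in model; in practice this would be loaded from a model store.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

app = Flask(__name__)

@app.route("/score", methods=["POST"])
def score():
    # Expects JSON like {"features": [0.1, -1.2, 0.4, 2.0]}
    payload = request.get_json()
    proba = model.predict_proba([payload["features"]])[0, 1]
    return jsonify({"score": float(proba)})

if __name__ == "__main__":
    app.run(port=5000)
```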

 How Predictive Analytics changes the game…

Predictive analytics can bring about transformative benefits in the following six ways.

  1. Predictive approaches can be applied to a much wider & richer variety of business challenges, thus enabling an organization to achieve outcomes that were not really possible with the Descriptive variety. These use cases range from Digital Transformation to fraud detection to marketing analytics to IoT (Internet of Things) across industry verticals. Predictive approaches can also operate in real time and are not just batch oriented like the Descriptive approaches.
  2. When deployed strategically, they can scale to enormous volumes of data and help reason over them, reducing manual costs. They can take on problems that can’t be managed manually because of the huge amount of data that must be processed.
  3. They can predict the results of complex business scenarios by probabilistically predicting different outcomes across thousands of variables and perceiving minute dependencies between them. An example is social graph analysis to understand which individuals in a given geography are committing fraud and whether there is a ring operating.
  4. They are vastly superior to the traditional approach or manual processing at handling fine-grained data of manifold types. The predictive approach also encourages the integration of previously “dark” data as well as newer external sources of data.
  5. They can also suggest specific business actions (e.g. based on the above outcomes) by mining data for hitherto unknown patterns. The data science approach keeps learning constantly in order to increase the accuracy of its decisions.
  6. Data Monetization – they can be used to interpret the mined data to discover solutions to business challenges and new business opportunities/models.

References

[1] IDC Worldwide Semiannual Big Data and Business Analytics Spending Guide – Oct 2016 – “Double-Digit Growth Forecast for the Worldwide Big Data and Business Analytics Market Through 2020 Led by Banking and Manufacturing Investments, According to IDC”

http://www.idc.com/getdoc.jsp?containerId=prUS41826116

 

Demystifying Digital – Why Customer 360 is the Foundational Digital Capability – ..(1/3)

The first post in this three part series on Digital Foundations introduces the concept of Customer 360 or Single View of Customer (SVC). We will discuss the need for & the definition of the SVC as the first step in any Digital Transformation endeavor. We will also discuss specific business & operational benefits that are enabled by SVC. The second post in the series introduces the concept of a Customer Journey. The third & final post will focus on the technical design & architecture needed to achieve both these capabilities.
 
In an era of exploding organizational touch points, how many companies can truly claim that they know & understand their customers, their needs & evolving preferences deeply and from a realtime perspective?  
How many companies can claim to keep up as a customer’s product & service usage matures, and keep them engaged by cross-selling new offerings? How many can accurately predict future revenue from a customer based on their current understanding of their profile?
The answer is not at all encouraging.
Across industries like Banking, Insurance, Telecom & Manufacturing, the ability to get a unified view of the customer & their journey is at the heart of the enterprise’s ability to promote relevant offerings & detect customer dissatisfaction.
  • Currently most industry players are woeful at putting together this comprehensive Single View of their Customers (SVC). Due to operational silos, each department possesses a siloed & limited view of the customer across multiple channels. These views are typically inconsistent, lack synchronization with other departments & miss a high amount of potential cross-sell and up-sell opportunities.
  • The Customer Journey problem has been an age-old issue which has gotten exponentially more complicated over the last five years, as the staggering rise of mobile technology and the Internet of Things (IoT) has vastly increased the number of enterprise touch points that customers are exposed to for discovering & purchasing new products/services. In an OmniChannel world, an increasing number of transactions are being conducted online. In verticals like Retail and Banking, the share of online transactions approaches an average of 40%. Adding to the problem, more and more consumers are posting product reviews and feedback online. Companies thus need to react in realtime to piece together the source of consumer dissatisfaction.
Another large component of customer outreach is Marketing analytics & the ability to run effective campaigns to recruit customers.

The most common questions that a lot of enterprises fail to answer accurately are –

  1. Is the Customer happy with their overall relationship experience?
  2. What mode of contact do they prefer? And at what time? Can Customers be better targeted at these channels at those preferred times?
  3. What is the overall Customer Lifetime Value (CLV), i.e. how much profit are we able to generate from this customer over their total lifetime?
  4. By understanding CLV across populations, can we leverage that to increase spend on marketing & sales for products that are resulting in higher customer value?
  5. How do we increase cross sell and up-sell of products & services?
  6. Does this customer fall into a certain natural segment and if so, how can we acquire more customers like them?
  7. Can different channels (Online, Mobile, IVR & POS) be synchronized? Can Customers begin a transaction in one channel and complete it in any of the others without having to resubmit their data?

The first element in Digital is Customer Centricity, & it naturally follows that a 360 degree view is a huge aspect of that.


                                       Illustration – Customer 360 view & its benefits

So what information is specifically contained in a Customer 360 –

The 360 degree view is a snapshot of the below types of data (a minimal record sketch follows the list) –

  • Customer’s Demographic information – Name, Address, Age etc
  • Length of the Customer-Enterprise relationship
  • Products and Services purchased overall
  • Preferred Channel & time of Contact
  • Marketing Campaigns the customer has responded to
  • Major Milestones in the Customer’s relationship
  • Ongoing activity – Open Orders, Deposits, Shipments, Customer Cases etc
  • Ongoing Customer Lifetime Value (CLV) Metrics and the Category of customer (Gold, Silver, Bronze etc)
  • Any Risk factors – Likelihood of Churn, Customer Mood Alert, Ongoing issues etc
  • Next Best Action for Customer
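
A minimal sketch of what such a 360 degree snapshot might look like when assembled in code: a single record pulled together from the hypothetical demographic, product, CLV and risk sources listed above. The field names and sample values are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Customer360:
    customer_id: str
    name: str
    address: str
    relationship_years: float
    products: List[str] = field(default_factory=list)
    preferred_channel: str = "unknown"
    clv: float = 0.0
    tier: str = "Bronze"
    churn_risk: float = 0.0
    next_best_action: str = ""

# Assembled from several (hypothetical) source systems: CRM, product
# catalog, analytics models etc.
profile = Customer360(
    customer_id="C-1001",
    name="Jane Doe",
    address="221B Baker Street",
    relationship_years=6.5,
    products=["Checking", "Mortgage", "Credit Card"],
    preferred_channel="Mobile, evenings",
    clv=18_500.0,
    tier="Gold",
    churn_risk=0.12,
    next_best_action="Offer travel rewards upgrade",
)
print(profile)
```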

How Big Data technology can help..

Leveraging the ingestion and predictive capabilities of a Big Data based platform, banks can provide a user experience that rivals Facebook, Twitter or Google and gain a full picture of the customer across all touch points.

Big Data enhances the Customer 360 capability in the following ways (a minimal segmentation sketch follows the list) –

  1. Obtaining a realtime Single View of the Customer (typically a customer across multiple channels, product silos & geographies) across years of account history 
  2. Customer Segmentation by helping businesses understand customer segments down to the individual level as well as at a segment level
  3. Performing Customer sentiment analysis by combining internal organizational data, clickstream data and social media feeds with structured sales history to provide a clear view into consumer behavior.
  4. Product Recommendation engines which provide compelling personal product recommendations by mining realtime consumer sentiment and product affinity information along with historical data.
  5. Market Basket Analysis – observing consumer purchase history and enriching this data with social media, web activity, and community sentiment regarding past purchases and future buying trends.
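
As a sketch of the segmentation capability in item 2, the snippet below clusters customers on a few hypothetical behavioral features with scikit-learn’s KMeans; the features, scaling and number of segments are illustrative choices, not a prescribed model.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Hypothetical per-customer features: annual spend, visits per month,
# average basket size, days since last purchase.
X = np.column_stack([
    rng.gamma(2.0, 500.0, 1000),
    rng.poisson(3, 1000),
    rng.normal(60, 20, 1000),
    rng.integers(1, 365, 1000),
])

X_scaled = StandardScaler().fit_transform(X)
segments = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_scaled)

print("Customers per segment:", np.bincount(segments))
```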

Customer 360 can help improve the following operational metrics of a Retailer or a Bank or a Telecom immensely.

  1. Cost to Income ratio; Customers Acquired per FTE; Sales and service FTE’s (as percentage of total FTE’s), New Accounts Per Sales FTE etc
  2.  Sales conversion rates across channels, Decreased customer attrition rates etc.
  3. Improved Net Promoter Scores (NPS), referral based sales etc.

Customer 360 is thus a basic digital capability every organization needs to offer to their customers, partners & internal stakeholders. This implies a re-architecture of both data management and business process automation.

The next post will discuss the second critical component of Digital Transformation – the Customer Journey.