Why Platform as a Service (PaaS) Adoption will take off in 2017..

Ever since Steve Ballmer went ballistic professing his love for developers, it has been a virtual mantra in the technology industry that developer adoption is key to the success of a given platform. On the face of it, Platform as a Service (PaaS) is a boon to enterprise developers who are tired of the inefficiencies of old school application development environments & stacks. Further, a couple of years ago, PaaS seemed to be the flavor of the future given the industry focus on Cloud Computing. This blogpost focuses on the advantages of the generic PaaS approach while discussing its slow rate of adoption in the cloud computing market – as compared with its cloud cousins – IaaS (Infrastructure as a Service) and SaaS (Software as a Service).

Platform as a Service (PaaS) as the foundation for developing Digital, Cloud Native Applications…

Call them Digital or Cloud Native or Modern. The nature of applications in the industry is slowly changing. So are the cultural underpinnings of the development process itself – from waterfall to agile to DevOps. At the same time, Cloud Computing and Big Data are enabling the creation of smart data applications. Leading business organizations are cognizant of the need to attract and retain the best possible talent – often competing with the FANGs (Facebook, Amazon, Netflix & Google).

Couple all this with the immense industry and venture capital interest around container oriented & cloud native technologies like Docker – and you have a vendor arms race in the making. The prize is to be chosen as the standard for building industry applications.

Thus, infrastructure is the enabler but in the end, it is the applications that are Queen or King.

That is where PaaS comes in.

Enter Platform as a Service (PaaS)…

Platform as a Service (PaaS) is one of the three main cloud delivery models, the other two being IaaS (Infrastructure such as compute, network & storage services) and SaaS (Business applications delivered over a cloud). A collection of different cloud technologies, PaaS focuses exclusively on application development & delivery. PaaS advocates a new kind of development based on native support for concepts like agile development, unit testing, continuous integration and automatic scaling, while providing a range of middleware capabilities. Applications developed on a PaaS can be deployed as services & managed across thousands of application instances.

In short, PaaS is the ideal platform for creating & hosting digital applications. What can PaaS provide that older application development toolchains and paradigms cannot?

While the overall design approach and features vary across every PaaS vendor – there are five generic advantages from a high level –

  1. PaaS enables a range of Application, Data & Middleware components to be delivered as API based services to developers on any given Infrastructure as a Service (IaaS). These capabilities include Messaging as a service, Database as a service, Mobile capabilities as a service, Integration as a service, Workflow as a service, Analytics as a service for data driven applications etc. Some PaaS vendors also provide the ability to automate & manage APIs for the business applications deployed on them – API Management. (A minimal sketch of how an application consumes such platform-provisioned services follows this list.)
  2. PaaS provides easy & agile access to the entire suite of technologies used while creating complex business applications. These range from programming languages to application server (and lightweight) runtimes to CI/CD toolchains to source control repositories.
  3. PaaS provides the services that enable a seamless & highly automated way of managing the complete life cycle of building and delivering web applications and services on the internet. Industry players are infusing software delivery processes with practices such as continuous integration (CI) and continuous delivery (CD). For large scale applications such as those built in web scale shops, financial services, manufacturing, telecom etc – PaaS abstracts away the complexities of building, deploying & orchestrating infrastructure, thus enabling instantaneous developer productivity. This is a key point – with its focus on automation, PaaS can save application and system administrators precious time and resources in managing the lifecycle of elastic applications.
  4. PaaS enables your application to be ‘kind of cloud’ agnostic, i.e. it can run on any cloud platform, whether public or private. This means that a PaaS application developed on Amazon AWS can easily be ported to Microsoft Azure, VMWare vSphere, Red Hat RHEV etc.
  5. PaaS can help smoothen organizational culture and barriers – The adoption of a PaaS encourages an agile culture in your organization – one that pushes cross pollination among different business, dev and ops teams. Most organizations are just now beginning to go bimodal, and their greenfield applications can benefit immensely from choosing a PaaS as a platform standard.
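
How does a developer actually consume those platform-provisioned services (point 1 above)? Most PaaS platforms inject backing-service credentials into the application at deploy time, typically as environment variables in the spirit of twelve-factor apps. The sketch below is a minimal, hypothetical Python illustration; the variable names (DATABASE_URL, BROKER_URL) are assumptions and will differ by platform and bound service.

```python
import os

# Hypothetical service bindings injected by the PaaS at deploy time.
# The exact variable names depend on the platform and the services bound to the app.
DATABASE_URL = os.environ.get("DATABASE_URL", "postgresql://localhost/dev")
BROKER_URL = os.environ.get("BROKER_URL", "amqp://localhost")

def connect_to_backing_services():
    """Resolve connections from platform-injected configuration rather than
    hard-coded endpoints, so the same artifact runs unchanged on any cloud."""
    print(f"Connecting to database at {DATABASE_URL}")
    print(f"Connecting to message broker at {BROKER_URL}")

if __name__ == "__main__":
    connect_to_backing_services()
```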

The Barriers to PaaS Adoption Will Continue to Fall In 2017..

In general, PaaS market growth rates do not line up well when compared with the other broad segments of the cloud computing space, namely IaaS (Infrastructure as a Service) and SaaS (Software as a Service). 451 Research’s Market Monitor forecasts that the total market for cloud computing (including PaaS, IaaS and infrastructure software as a service – ITSM, backup, archiving) will hit $21.9B in 2016, more than doubling to $44.2B by 2020 [1]. Of that, some analyst estimates contend that PaaS will be a relatively small $8.1 billion.

451-research-paas_vs_saas_iaas

                      Illustration: PaaS vs. IaaS vs. SaaS market forecast (Source – 451 Research)

The very advantages that PaaS confers have, sadly, also contributed to its relatively low rate of adoption as compared to IaaS and SaaS.

The reasons for this anemic rate of adoption include, in my opinion –

  1. Poor Conception of the Business Value of PaaS – This is the biggest factor holding back explosive growth in this category. PaaS is a tremendously complicated technology & vendors have not helped by stressing the complex technology underpinnings (containers, supported programming languages, developer workflow, orchestration, scheduling etc) as opposed to helping clients understand the tangible business drivers & value that enterprise CIOs can derive from this technology. Common drivers include faster time to market for digital capabilities, man hours saved in maintaining complex applications, the ability to attract new talent etc. These factors will vary for every customer but it is up to frontline Sales teams to deliver this message in a manner that is appropriate to the client.
  2. Yes, you can do DevOps without PaaS, but PaaS goes a long way – Many Fortune 500 organizations are drawing up DevOps strategies which do not include a PaaS & are based on a simplified CI/CD pipeline. This is to the detriment of both the customer organization & the industry, as PaaS can vastly simplify a range of complex runtime & lifecycle services that would otherwise need to be cobbled together by the customer as the application moves from development to production. There is simply a lack of knowledge in the customer community about where a PaaS fits in a development & deployment toolchain.
  3. Smorgasbord of Complex Infrastructure Choices – The average leading PaaS includes a range of open source technologies ranging from containers to runtimes to datacenter orchestration to scheduling to cluster management tools. This makes it very complex from the perspective of Corporate IT – not just in terms of running POCs and initial deployments but also in managing a highly complex stack. It is incumbent on the open source projects to abstract away the complex inner workings to drive adoption – whether by design or by technology alliances.
  4. You don’t need Cloud for PaaS but not enough Technology Leaders get that – This one is about perception. The presence of an infrastructural cloud computing strategy is not a necessary precondition for adopting a PaaS.
  5. The false notion that PaaS is only fit for massively scalable, greenfield applications – Industry leading PaaS offerings (like Red Hat’s OpenShift) support a range of technology approaches that can help cut technical debt. They support deployment on application server platforms such as JBoss EAP, WebSphere or WebLogic, as well as on lightweight frameworks like Spring.
  6. PaaS will help increase automation, thus cutting costs – For developers of applications in greenfield/new age spheres such as IoT, PaaS can enable the creation of thousands of instances in a ‘Serverless’ fashion. PaaS based applications can be composed of microservices which are essentially self maintaining – i.e. self healing and able to scale up or down; these microservices are delivered (typically) by IT as Docker containers using automated toolchains. The biggest expense in large datacenters – human involvement – is drastically reduced when PaaS is used, while agility, business responsiveness and efficiency increase.

Conclusion…

My goal for this post was to share a few of my thoughts on the benefits of adopting a game changing technology. Done right, PaaS can provide a tremendous boost to building digital applications, thus boosting the bottom line. Beginning in 2017, we will witness PaaS satisfying critical industry use cases as leading organizations build end-to-end business solutions that cover many architectural layers.

References…

[1] http://www.forbes.com/sites/louiscolumbus/2016/03/13/roundup-of-cloud-computing-forecasts-and-market-estimates-2016/#3d75915274b0

The Three Core Competencies of Digital – Cloud, Big Data & Intelligent Middleware

“Ultimately, the cloud is the latest example of Schumpeterian creative destruction: creating wealth for those who exploit it; and leading to the demise of those that don’t.” – Joe Weinman, author of Cloudonomics: The Business Value of Cloud Computing

trifacta_digital

The Cloud As a Venue for Digital Workloads…

As 2016 draws to a close, it can safely be said that no industry leader questions the existence of the new Digital Economy and the fact that every firm out there needs a digital strategy. Myriad organizations are taking serious business steps to make their platforms highly customer-centric via a renewed focus on operational metrics. They are also working on creating new business models using their Analytics investments. Examples of these verticals include Banking, Insurance, Telecom, Healthcare, Energy etc.

As a general trend, the Digital Economy brings immense opportunities while exposing firms to risks as well. Customers are now demanding highly contextual products, services and experiences – all accessible via easy to use APIs (Application Programming Interfaces).

Big Data Analytics (BDA) software revenues will grow from nearly $122B in 2015 to more than $187B in 2019 – according to Forbes [1]. At the same time, it is clear that exploding data generation across the global economy has become a clear & present business phenomenon. Data volumes are rapidly expanding across industries. However, it is not just the production of data that has increased; the need for organizations to derive business value from it is growing just as fast. As IT leaders know well, digital capabilities need low cost yet massively scalable & agile information delivery platforms – which only Cloud Computing can provide.

For a more detailed technical overview, please visit the link below.

http://www.vamsitalkstech.com/?p=1833

Big Data & Big Data Analytics drive consumer interactions.. 

The onset of Digital Architectures in enterprise businesses implies the ability to drive continuous online interactions with global consumers/customers/clients or patients. The goal is not just to provide engaging visualizations but also to personalize services clients care about across multiple channels of interaction. The only way to attain digital success is to understand your customers at a micro level while constantly making strategic decisions on your offerings to the market. Big Data has become the catalyst in this massive disruption as it can help businesses in any vertical solve their need to understand their customers better & perceive trends before the competition does. Big Data thus provides the foundational platform for successful business platforms.

The three key areas where Big Data & Cloud Computing intersect are – 

  • Data Science and Exploration
  • ETL, Data Backups and Data Preparation
  • Analytics and Reporting

Big Data drives business use cases in Digital in myriad ways – key examples include –

  1. Obtaining a realtime Single View of an entity (typically a customer across multiple channels, product silos & geographies) – a minimal sketch of this follows the list
  2. Customer Segmentation by helping businesses understand their customers down to the individual micro level as well as at a segment level
  3. Customer sentiment analysis by combining internal organizational data, clickstream data and social media sentiment with structured sales history to provide a clear view into consumer behavior.
  4. Product Recommendation engines which provide compelling personal product recommendations by mining realtime consumer sentiment and product affinity information along with historical data.
  5. Market Basket Analysis, observing consumer purchase history and enriching this data with social media, web activity, and community sentiment regarding past purchase and future buying trends.
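
As a hedged illustration of the first use case above (the Single View of an entity), the PySpark sketch below unions two hypothetical per-channel transaction feeds and aggregates them into one consolidated view per customer. The file names and column names (customer_id, amount, txn_ts) are assumptions made purely for the example.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("SingleViewOfCustomer").getOrCreate()

# Hypothetical per-channel transaction feeds already landed in the data lake.
online = spark.read.csv("online_txns.csv", header=True, inferSchema=True)
branch = spark.read.csv("branch_txns.csv", header=True, inferSchema=True)

# Union the channels and aggregate per customer to build a single consolidated view.
single_view = (
    online.unionByName(branch)
    .groupBy("customer_id")
    .agg(
        F.count("*").alias("total_transactions"),
        F.sum("amount").alias("total_spend"),
        F.max("txn_ts").alias("last_seen"),
    )
)
single_view.show()
```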

Further, Digital implies the need for sophisticated, multifactor business analytics that need to be performed in near real time on gigantic data volumes. The only deployment paradigm capable of handling such needs is Cloud Computing – whether public or private. Cloud was initially touted as a platform to rapidly provision compute resources. Now, with the advent of Digital technologies, the Cloud & Big Data will combine to process & store all this information. According to IDC, by 2020 spending on Cloud based Big Data Analytics will outpace on-premise analytics by a factor of 4.5. [2]

Intelligent Middleware provides Digital Agility.. 

Digital Applications are modular, flexible and responsive to a variety of access methods – mobile & non mobile. These applications are also highly process driven and support the highest degree of automation. The need of the hour is to provide enterprise architecture capabilities for designing flexible digital platforms that are built around efficient use of data, speed, agility and a service oriented architecture. The choice of open source is key as it allows for a modular and flexible architecture that can be modified and adopted in a phased manner – as you will shortly see.

The intention in adopting a SOA (or even a microservices) architecture for Digital capabilities is to give lines of business the ability to incrementally plug in lightweight business services like customer on-boarding, electronic patient records, performance measurement, trade surveillance, risk analytics, claims management etc.
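
To make the idea of an incrementally pluggable business service concrete, here is a minimal sketch of a customer on-boarding microservice exposed over a REST API, written with Flask. The endpoint paths, fields and in-memory store are illustrative assumptions, not a reference implementation.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Illustrative in-memory store; a real service would use a platform-provisioned database.
customers = {}

@app.route("/customers", methods=["POST"])
def onboard_customer():
    """Accept a new customer record and return its assigned identifier."""
    payload = request.get_json(force=True)
    customer_id = len(customers) + 1
    customers[customer_id] = payload
    return jsonify({"id": customer_id, "status": "onboarded"}), 201

@app.route("/customers/<int:customer_id>", methods=["GET"])
def get_customer(customer_id):
    """Return a previously onboarded customer, or an empty object if unknown."""
    return jsonify(customers.get(customer_id, {}))

if __name__ == "__main__":
    app.run(port=8080)
```

A service like this can be deployed, versioned and scaled independently of the rest of the platform, which is precisely the flexibility the SOA/microservices approach is meant to buy.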

Intelligent Middleware adds significant value in six specific areas –

  1. Supports a high degree of Process Automation & Orchestration thus enabling the rapid conversion of paper based business processes to a true digital form in a manner that lends itself to continuous improvement & optimization
  2. Business Rules help by adding a high degree of business flexibility & responsiveness
  3. Native Mobile Applications enable platforms to support a range of devices & consumer behavior across those front ends
  4. Platforms As a Service engines which enable rapid application & business capability development across a range of runtimes and container paradigms
  5. Business Process Integration engines which enable the composition of business capabilities across existing applications & systems
  6. Middleware brings the notion of DevOps into the equation. Digital projects bring several technology & culture challenges which can be solved by a greater degree of collaboration, continuous development cycles & new toolchains without giving up proven integration with existing (or legacy) systems.

Intelligent Middleware not only enables Automation & Orchestration but also provides an assembly environment to string different (micro)services together. Finally, it also enables less technical analysts to drive the application lifecycle as much as possible.

Further, Digital business projects call out for mobile native applications – which a forward looking middleware stack will support. Middleware is a key component for driving innovation and improving operational efficiency.

Five Key Business Drivers for combining Big Data, Intelligent Middleware & the Cloud…

The key benefits of combining the above paradigms to create new Digital Applications are –

  • Enable Elastic Scalability Across the Digital Stack
    Cloud computing can handle the storage and processing of any amount and any kind of data. This calls for the collection & curation of data from dynamic and highly distributed sources such as consumer transactions, B2B interactions, machines such as ATMs & geo location devices, click streams, social media feeds, server & application log files and multimedia content such as videos. It needs to be noted that these data volumes consist of multi-varied formats, differing schemas, transport protocols and velocities. Cloud computing provides the underlying elastic foundation to analyze these datasets.
  • Support Polyglot Development, Data Science & Visualization
    Cloud technologies are polyglot in nature. Developers can choose from a range of programming languages (Java, Python, R, Scala and C# etc) and development frameworks (such as Spark and Storm). Cloud offerings also enable data visualization using a range of tools from Excel to BI Platforms.
  • Reduce Time to Market for Digital Business Capabilities
    Enterprises can avoid time consuming installation, setup & other upfront procedures – for example, they can deploy Hadoop in the cloud without buying new hardware or incurring other up-front costs. In the same vein, big data analytics should support self service across the lifecycle – from data acquisition and preparation to analysis & visualization.
  • Support a multitude of Deployment Options – Private/Public/Hybrid Cloud 
    A range of scenarios for product development, testing, deployment, backup or cloudbursting are efficiently supported in pursuit of cost & flexibility goals.
  • Fill the Talent Gap
    Open Source technology is the common thread across Cloud, Big Data and Middleware. The hope is that the ubiquity of open source will serve as a critical lever in closing the IT-Business skills gap.

As opposed to building standalone or one-off business applications, a ‘Digital Platform Mindset’ is a more holistic approach capable of producing higher rates of adoption & thus revenues. Platforms abound in the web-scale world at shops like Apple, Facebook & Google. Digital Applications are constructed like lego blocks and they reuse customer & interaction data to drive cross sell and up sell among different product lines. The key here is to ensure that one starts off with products that have high customer attachment & retention. While increasing brand value, it is also key to ensure that customers & partners can collaborate in improving the various applications hosted on top of the platform.

References

[1] Forbes Roundup of Big Data Analytics (BDA) Report

http://www.forbes.com/sites/louiscolumbus/2016/08/20/roundup-of-analytics-big-data-bi-forecasts-and-market-estimates-2016/#b49033b49c5f

[2] IDC FutureScape: Worldwide Big Data and Analytics 2016 Predictions

Why Software Defined Infrastructure & why now..(1/2)

The ongoing digital transformation in key verticals like financial services, manufacturing, healthcare and telco has incumbent enterprises fending off a host of new market entrants. Enterprise IT’s best answer is to increase the pace of innovation as a way of driving increased differentiation in business processes. Though data analytics & automation remain the lynchpin of this approach, software defined infrastructure (SDI) built on the notions of cloud computing has emerged as the main infrastructure differentiator – for a host of reasons which we will discuss in this two part blog.

Software Defined Infrastructure (SDI) is essentially an idea that brings together advances in a host of complementary areas spanning infrastructure software, data and development environments. It supports a new way of building business applications. The core idea in SDI is that massively scalable applications (in support of diverse customer needs) describe their behavior characteristics (via configuration & APIs) to the underlying datacenter infrastructure, which simply obeys those commands in an automated fashion while abstracting away the underlying complexities.
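
That declare-and-converge idea can be illustrated with a toy reconciliation loop. The sketch below is purely conceptual: the desired spec and the reconcile function stand in for the configuration and APIs a real SDI control plane would expose, and none of the names refer to an actual product.

```python
import time

# Declarative spec: the application states what it needs, not how to get it.
desired = {"service": "payments-api", "replicas": 4}

# Toy view of the datacenter's current state.
current = {"payments-api": 2}

def reconcile(desired_spec, current_state):
    """Converge the actual state toward the declared spec - the essence of SDI automation."""
    name, want = desired_spec["service"], desired_spec["replicas"]
    have = current_state.get(name, 0)
    if have < want:
        print(f"Scaling {name} up: {have} -> {want} instances")
    elif have > want:
        print(f"Scaling {name} down: {have} -> {want} instances")
    current_state[name] = want

if __name__ == "__main__":
    for _ in range(2):  # a real control loop would run continuously
        reconcile(desired, current)
        time.sleep(1)
```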

SDI as an architectural pattern was originally made popular by the web scale giants – the so-called FANG companies of tech (Facebook, Amazon, Netflix and Alphabet, the erstwhile Google) – but has gradually begun making its way into the enterprise world.

Common Business IT Challenges prior to SDI – 
  1. The cost of hardware infrastructure is typically growing at a high percentage every year as compared to growth in the total IT budget. Cost pressures are driving an overall relook at the different tiers across the IT landscape.
  2. Infrastructure is not yet completely under the control of the IT-Application development teams, even though business realities dictate rapid app development to meet changing business requirements.
  3. Even small, departmental level applications still need expensive proprietary stacks which are not only prohibitive in cost and deployment footprint but also take weeks to spin up in terms of provisioning cycles.
  4. Big box proprietary solutions are prompting a hard look at Open Source technologies which are lean, easy to use and have a lightweight deployment footprint. Apps need to dictate the footprint, not vendor provided containers.
  5. Concerns around acquiring developers who are tooled on cutting edge development frameworks & methodologies – there is near zero developer mindshare with Big Box technologies.

Key characteristics of an SDI

  1. Applications built on a SDI can detect business events in realtime and respond dynamically by allocating additional resources in three key areas – compute, storage & network – based on the type of workloads being run.
  2. Using an SDI, application developers can seamlessly deploy apps while accessing higher level programming abstractions that allow for the rapid creation of business services (web, application, messaging, SOA/ Microservices tiers), user interfaces and a whole host of application elements.
  3. From a management standpoint, business application workloads are dynamically and automatically assigned to the available infrastructure (spanning public & private cloud resources) on the basis of application requirements & required SLAs, in a way that provides continuous optimization across the technology life cycle.
  4. The SDI itself optimizes the entire application deployment via both externally provisioned APIs & internal interfaces between the five essential pieces – Application, Compute, Storage, Network & Management.

The SDI automates the technology lifecycle –

Consider the typical tasks needed to create and deploy enterprise applications. This list includes but is not limited to –

  • onboarding hardware infrastructure
  • setting up complicated network connectivity to firewalls, routers, switches etc
  • making the hardware stack available for consumption by applications
  • figuring out storage requirements and provisioning them
  • guaranteeing multi-tenancy
  • application development
  • deployment
  • monitoring
  • updates, failover & rollbacks
  • patching
  • security
  • compliance checking etc.

The promise of SDI is to automate all of this from a business, technology, developer & IT administrator standpoint.

SDI Reference Architecture –

The SDI encompasses SDC (Software Defined Compute), SDS (Software Defined Storage), SDN (Software Defined Networking), Software Defined Applications and Cloud Management Platforms (CMP) into one logical construct, as can be seen from the below picture.

FS_SDDC

                      Illustration: The different tiers of Software Defined Infrastructure

At the core of the software defined approach are APIs. APIs control the lifecycle of resources (request, approval, provisioning, orchestration & billing) as well as the applications deployed on them. The SDI implies commodity hardware (x86) & a cloud based approach to architecting the datacenter.
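
As a hedged sketch of what API-driven resource lifecycle management looks like from a developer's seat, the snippet below requests compute capacity from a hypothetical provisioning endpoint. The URL, payload fields and token are placeholders invented for the example, not any particular vendor's API.

```python
import requests

# Hypothetical SDI management endpoint and credentials -- placeholders only.
API_BASE = "https://sdi.example.internal/api/v1"
TOKEN = "replace-with-a-real-token"

def provision_instances(flavor="m.large", image="rhel7", count=1):
    """Ask the management plane for compute capacity instead of filing a ticket."""
    resp = requests.post(
        f"{API_BASE}/instances",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"flavor": flavor, "image": image, "count": count},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # e.g. instance IDs, addresses, billing metadata

if __name__ == "__main__":
    print(provision_instances(count=2))
```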

The ten fundamental technology tenets of the SDI –

1. Highly elastic – scale up or scale down the gamut of infrastructure (compute – VM/Baremetal/Containers, storage – SAN/NAS/DAS, network – switches/routers/Firewalls etc) in near real time

2. Highly Automated – Given the scale & multi-tenancy requirements, automation at all levels of the stack (development, deployment, monitoring and maintenance)

3. Low Cost – Oddly enough, the SDI operates at a lower CapEx and OpEx compared to the traditional datacenter due to reliance on open source technology & high degree of automation. Further workload consolidation only helps increase hardware utilization.

4. Standardization –  The SDI enforces standardization and homogenization of deployment runtimes, application stacks and development methodologies based on lines of business requirements. This solves a significant IT challenge that has hobbled innovation at large financial institutions.

5. Microservice based applications –  Applications developed for an SDI enabled infrastructure are developed as small, nimble processes that communicate via APIs and over infrastructure like messaging & service mediation components (e.g. Apache Kafka & Camel). This offers huge operational and development advantages over legacy applications. While one does not expect Core Banking applications to move over to a microservice model anytime soon, customer facing applications that need responsive digital UIs will definitely need to consider such approaches.

6. ‘Kind-of-Cloud’ Agnostic –  The SDI does not enforce the concept of private cloud, or rather it encompasses a range of deployment options – public, private and hybrid.

7. DevOps friendly –  The SDI enforces not just standardization and homogenization of deployment runtimes, application stacks and development methodologies but also enables a culture of continuous collaboration among developers, operations teams and business stakeholders i.e cross departmental innovation. The SDI is a natural container for workloads that are experimental in nature and can be updated/rolled-back/rolled forward incrementally based on changing business requirements. The SDI enables rapid deployment capabilities across the stack leading to faster time to market of business capabilities.

8. Data, Data & Data –  The heart of any successful technology implementation is Data. This includes customer data, transaction data, reference data, risk data, compliance data etc. The SDI provides a variety of tools that enable applications to process data in a batch, interactive or low latency manner depending on what the business requirements are.

9. Security –  The SDI shall provide robust perimeter defense as well as application level security with a strong focus on a Defense In Depth strategy.

10. Governance –  The SDI enforces strong governance requirements for capabilities ranging from ITSM requirements – workload orchestration, business policy enabled deployment, autosizing of workloads to change management, provisioning, billing, chargeback & application deployments.

The next & final blog in this series will look at current & specific technology choices – as of 2016 – in building out an SDI.

My take on Gartner’s Top 10 Strategic Technology Trends for 2016

Gartner_top_2016

“Dream no small dreams for they have no power to move the hearts of men.” — Goethe

It is that time of the year again when the mavens at Gartner make their annual predictions regarding the top Strategic trends for the upcoming year. Gartner defines ‘strategic’ as an emerging technology trend that will have a long term business impact, thus influencing plans & budgets. As before, I will be offering up my own take on these while grounding the discussion in terms of the Social, Mobile, Big Data Analytics & Cloud (SMAC) stack that is driving the ongoing industry revolution.
  1. The Digital Mesh
    The rise of the machines has been well documented but enterprises are waking up to the possibilities only recently. Massive data volumes are now being reliably generated from diverse sources of telemetry as well as endpoints at corporate offices (as a consequence of BYOD). The former devices include sensors used in manufacturing, personal fitness devices like FitBit, home and office energy management sensors, smart cars, geo-location devices etc. Couple these with the ever growing social media feeds, web clicks, server logs and more, and one sees a clear trend forming which Gartner terms the Digital Mesh. The Digital Mesh leads to an interconnected information deluge which encompasses classical IoT endpoints along with audio, video & social data streams. This leads to huge security challenges as well as business opportunities for forward looking enterprises (including Governments). Applications will need to combine these into one holistic picture of an entity – whether individual or institution.
  2. Information of Everything
    The IoT era brings an explosion of data that flows across organizational, system and application boundaries. Look for advances in technology, especially in Big Data and Visualization, to help consumers harness this information in the right form, enriched with the right contextual information. In the Information of Everything era, massive amounts of effort will thus be expended on data ingestion, quality and governance challenges.
  3. Ambient User Experiences
    Mobile applications first began forcing enterprises to support multiple channels of interaction with their consumers. For example, Banking now requires an ability to engage consumers in a seamless experience across an average of four to five channels – Mobile, eBanking, Call Center, Kiosk etc. The average enterprise user is familiar with BYOD in the age of self service. The Digital Mesh only exacerbates this gap in user experiences, as information consumers navigate applications and consume services across a mesh that is multi-channel and needs to provide Customer 360 across all these engagement points. Applications developed in 2016 and beyond must take an approach that ensures a smooth experience across the spectrum of endpoints and the platforms that span them, from a Data Visualization standpoint.
  4. Autonomous Agents and Things

    Smart machines like robots, personal assistants like Apple Siri and automated home equipment will rapidly evolve & become even smarter as their algorithms get more capable and gain a better understanding of their own environments. In addition, Big Data & Cloud computing will continue to mature and offer day to day capabilities around systems that employ machine learning to make predictions & decisions. We will see increased application of Smart Agents in diverse fields like financial services, healthcare, telecom and media.

  5. Advanced Machine Learning
    Most business problems are data challenges, and an approach centered around data analysis helps extract meaningful insights from data, thus helping the business. It is now common for many enterprises to possess the capability to acquire, store and process large volumes of data using a low cost approach leveraging Big Data and Cloud Computing. At the same time, the rapid maturation of scalable processing techniques allows us to extract richer insights from data. What we commonly refer to as Machine Learning – a combination of econometrics, statistics, visualization and computer science – extracts valuable business insights hiding in data and builds operational systems to deliver that value. Data Science has also evolved a branch built on “Deep Neural Nets” (DNNs). DNNs are what make it possible for smart machines and agents to learn from data flows and to make the products that use them even more automated & powerful. Deep Learning involves the art of discovering data insights in a human-like pattern. The web scale world (led by Google and Facebook) has been vocal about its use of Advanced Data Science techniques and the move of Data Science into Advanced Machine Learning.
  6. 3D Printing Materials

    3D printing continues to evolve and advance across a wide variety of industries. 2015 saw a wider range of materials used in the 3D printing process, including carbon fiber, glass, nickel alloys and electronics. More and more industries continue to incorporate the print and assembly of composite parts constructed using such materials – prominent examples include Tesla and SpaceX. We are at the beginning of a 20 year revolution which will lead to sea changes in industrial automation.

  7. Adaptive Security
    A cursory study of the top data breaches in 2015 reads like a “Who’s Who” of actors in society across Governments, Banks, Retail establishments etc. The enterprise world now understands that a comprehensive & strategic approach to Cybersecurity has progressed from being an IT challenge a few years ago to a business imperative. As Digital and IoT ecosystems evolve to loose federations of API accessible and cloud native applications, more and more assets are in danger of being targeted by extremely well funded and sophisticated adversaries. For instance, it is an obvious truth that data from millions of IoT endpoints requires data ingest & processing at scale. The challenge from a security perspective is multilayered and arises not just from malicious actors but also from a lack of a holistic approach that combines security with data governance, audit trails and quality attributes. Traditional solutions cannot handle this challenge, which is exacerbated by the expectation that in an IoT & Digital Mesh world, data flows will be multidirectional across a grid of application endpoints. Expect to find applications in 2016 and beyond incorporating Deep Learning and Real Time Analytics into their core security design with a view to analyzing large scale data at very low latency.
  8. Advanced System Architecture
    The advent of the digital mesh and ecosystem technologies like autonomous agents (powered by Deep Neural Nets) will make increasing demands on computing architectures from a power consumption, system intelligence as well as a form factor perspective. The key is to provide increased performance while mimicking neuro-biological architectures. The name given to this style of building electronic circuits is neuromorphic computing. Systems designers will have increased choice in terms of using field programmable gate arrays (FPGAs) or graphics processing units (GPUs). While both FPGAs and GPUs have their pros and cons, devices & computing architectures using these as a foundation are well suited to deep learning and other pattern matching algorithms leveraged by advanced machine learning. Look for further reductions in form factor at lower power consumption, while allowing advanced intelligence in the IoT endpoint ecosystem.
  9. Mesh App and Service Architecture
    The micro services architecture approach, which combines the notion of autonomous, cooperative yet loosely coupled applications built as a conglomeration of business focused services, is a natural fit for the Digital Mesh. The most important addition and consideration to micro services based architectures in the age of the Digital Mesh is what I’d like to term Analytics Everywhere. Applications in 2016 and beyond will need to recognize that Analytics are pervasive, relentless, realtime and thus embedded into our daily lives. Every interaction a user has with a micro services based application will need a predictive capability built into the application architecture itself. Thus, 2016 will be the year when Big Data techniques are no longer the preserve of classical Information Management teams but move to the umbrella Application development area, which encompasses the DevOps and Continuous Integration & Delivery (CI-CD) spheres.

  10. IoT Architecture and Platforms
    There is no doubt in anyone’s mind that IoT (Internet Of Things) is a technology megatrend that will reshape enterprises, government and citizens for years to come. IoT platforms will complement Mesh Apps and Service Architectures with a common set of platform capabilities built around open communication, security, scalability & performance requirements. These will form the basic components of IoT infrastructure including, but not limited to, machine to machine interfaces, location based technology, micro controllers, sensors, actuators and the communication protocols (based on an all IP standard).


The Final Word –

One feels strongly that Open Source will drive the various layers that make up the Digital Mesh stack (Big Data, Operating Systems, Middleware, Advanced Machine Learning & BPM). IoT will be a key part of Digital Transformation initiatives.

However, the challenge of developing Vertical capabilities on these IoT platforms is threefold, specifically in the areas of augmenting micro services based Digital Mesh applications – capabilities which are largely lacking at the time of writing:

  • Data Ingest in batch or near realtime (NRT) or realtime from dynamically changing, disparate and physically distributed sensors, machines, geo location devices, clickstreams, files, and social feeds via highly secure lightweight agents
  • Provide secure data transfer using point-to-point and bidirectional data flows in real time
  • Curate these flows with Simple Event Processing (SEP) capabilities via tracing, parsing, filtering, joining, transforming, forking or cloning of data flows while adding business context to them (a minimal sketch follows this list). As mobile clients, IoT applications, social media feeds etc are brought onboard into existing applications from an analytics perspective, traditional IT operations face pressure from both business and development teams to provide new and innovative services.
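
As a hedged illustration of that parse–filter–enrich flow, the Python sketch below runs a tiny simple-event-processing pipeline over raw JSON sensor readings. The event fields, threshold and business context tags are invented purely for the example.

```python
import json

def parse(raw_events):
    """Parse raw JSON strings into dictionaries, dropping malformed records."""
    for raw in raw_events:
        try:
            yield json.loads(raw)
        except json.JSONDecodeError:
            continue

def filter_and_enrich(events, threshold=75.0):
    """Keep only readings above a threshold and tag them with business context."""
    for event in events:
        if event.get("temperature", 0.0) > threshold:
            event["alert"] = "over_temperature"
            event["line_of_business"] = "plant_operations"  # assumed context tag
            yield event

if __name__ == "__main__":
    raw = [
        '{"sensor_id": "s1", "temperature": 80.2}',
        '{"sensor_id": "s2", "temperature": 40.1}',
        "not-json",
    ]
    for alert in filter_and_enrich(parse(raw)):
        print(alert)
```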

The creation of these smart services will further depend on the vertical industries that these products serve as well as the requirements of the platforms that host them – e.g. industrial automation, remote healthcare, public transportation, connected cars, home automation etc.

Finally, 2016 also throws up some interesting questions around Cyber Security, namely –

a. Can an efficient Cybersecurity posture be a lasting source of competitive advantage?
b. Given that most breaches are long running in nature, with systems slowly compromised over months, how does one leverage Big Data and Predictive Modeling to rewire and re-architect creaky defenses?
c. Most importantly, how can applications implement security in a manner that constantly adapts and learns?

If there were just a couple of sentences to sum up Gartner’s forecast for 2016 in a succinct manner, it would be: “The emergence of the Digital Mesh & the rapid maturation of IoT will serve to accelerate business transformation across industry verticals. The winning enterprises will begin to make smart technology investments in Big Data, DevOps & Cloud practices to harness these changes.”

Design & Architecture of a Next Gen Market Surveillance System..(2/2)

This article is the final installment in a two part series that covers one of the most critical issues facing the financial industry – Investor & Market Integrity Protection via Global Market Surveillance. While the first (and previous) post discussed the scope of the problem across multiple global jurisdictions, this post will discuss a candidate Big Data & Cloud Computing Architecture that can help market participants (especially the front line regulators – the Stock Exchanges themselves) & SROs (Self Regulatory Organizations) implement these capabilities in their applications & platforms.

Business Background –

The first article in this two part series laid out the five business trends that are causing a need to rethink existing Global & Cross Asset Surveillance based systems.

To recap them below –

  1. The rise of trade lifecycle automation across the Capital Markets value chain and the increasing use of technology across the lifecycle contribute to an environment where speeds and feeds result in a huge number of securities changing hands (in huge quantities) in milliseconds across 25+ global venues of trading; this automation leads to an increase in trading volumes, which adds substantially to the risk of fraud
  2. The presence of multiple avenues of trading (ATF – alternative trading facilities and MTF – multilateral trading facilities) creates opportunities for information and price arbitrage that were never a huge problem before – multiple markets and multiple products across multiple geographies with different regulatory requirements. This has been covered in a previous post on this blog at –
    http://www.vamsitalkstech.com/?p=412
  3. As a natural consequence of all of the above (the globalization of trading, where market participants are spread across multiple geographies), it becomes all the more difficult to provide a consolidated audit trail (CAT) that views all activity under a single source of truth, as well as traceability of orders across those venues; this is extremely key as fraud is becoming increasingly sophisticated, e.g. the rise of insider trading rings
  4. Existing application architectures (e.g. ticker plants, surveillance systems, DevOps) are becoming brittle and underperforming as data and transaction volumes continue to go up & data storage requirements keep rising every year. This leads to massive gaps in compliance data. Another significant gap is found while performing a range of post trade analytics – many of which are beyond the simple business rules being leveraged today and increasingly need to move into the machine learning & predictive domain. Surveillance now needs to include non traditional sources of data, e.g. trader email/chat/link analysis etc, that can point to under the radar rogue trading activity before it causes the financial system huge losses – e.g. the London Whale, the LIBOR fixing scandal etc
  5. Again as a consequence of increased automation, backtesting of data has become a challenge – as well as being able to replay data across historical intervals. This is key in mining for patterns of suspicious activity like bursty spikes in trading as well as certain patterns that could indicate illegal insider selling

The key issue becomes – how do antiquated surveillance systems move into the era of Cloud & Big Data enabled innovation as a way of overcoming these business challenges?

Technology Requirements –

An intelligent surveillance system needs to store trade data, reference data, order data, and market data, as well as all of the relevant communications from all the disparate systems, both internally and externally, and then match these things appropriately. The system needs to account for multiple levels of detection capabilities starting with a) configuring business rules (that describe a fraud pattern) as well as b) dynamic capabilities based on machine learning models (typically thought of as being more predictive). Such a system also needs to parallelize execution at scale to be able to meet demanding latency requirements for a market surveillance platform.

The most important technical essentials for such a system are –

  1. Support end to end monitoring across a variety of financial instruments across multiple venues of trading. Support a wide variety of analytics that enable the discovery of interrelationships between customers, traders & trades as the next major advance in surveillance technology.
  2. Provide a platform that can ingest from tens of millions to billions of market events (spanning a range of financial instruments – Equities, Bonds, Forex, Commodities and Derivatives etc) on a daily basis from thousands of institutional market participants
  3. The ability to add new business rules (via either a business rules engine and/or a model based system that supports machine learning) is a key requirement – a minimal illustration of a rules based check follows this list. As we can see from the first post, market manipulation is an activity that seems to constantly push the boundaries in new and unforeseen ways
  4. Provide advanced visualization techniques thus helping Compliance and Surveillance officers manage the information overload.
  5. The ability to perform deep cross-market analysis, i.e. to be able to look at financial instruments & securities trading across multiple geographies and exchanges
  6. The ability to create views and correlate data that are both wide and deep. A wide view will look at related securities across multiple venues; a deep view will look for a range of illegal behaviors that threaten market integrity such as market manipulation, insider trading, watch/restricted list trading and unusual pricing.
  7. The ability to provide in-memory caches of data  for rapid pre-trade compliance checks.
  8. Ability to create prebuilt analytical models and algorithms that pertain to trading strategy (pre-trade models, e.g. best execution and analysis). The most popular way to link R and Hadoop is to use HDFS as the long-term store for all data, and use MapReduce jobs (potentially submitted from Hive or Pig) to encode, enrich, and sample data sets from HDFS into R.
  9. Provide Data Scientists and Quants with development interfaces using tools like SAS and R.
  10. The results of the processing and queries need to be exported in various data formats, a simple CSV/txt format or more optimized binary formats, JSON formats, or even into custom formats.  The results will be in the form of standard relational DB data types (e.g. String, Date, Numeric, Boolean).
  11. Based on back testing and simulation, analysts should be able to tweak the model and also allow subscribers (typically compliance personnel) of the platform to customize their execution models.
  12. A wide range of Analytical tools need to be integrated that allow the best dashboards and visualizations.
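
As a hedged illustration of requirement 3 above, the pandas sketch below encodes a single, very simple business rule: flag traders whose cancelled orders make up more than half of their intra-day order events. The column names, sample data and threshold are assumptions for the example, not a production surveillance rule.

```python
import pandas as pd

# Hypothetical intra-day order events: one row per order lifecycle event.
orders = pd.DataFrame({
    "trader_id": ["T1", "T1", "T1", "T1", "T2", "T2"],
    "event":     ["NEW", "CANCEL", "CANCEL", "CANCEL", "NEW", "EXECUTE"],
})

# Count events per trader and per event type.
counts = pd.crosstab(orders["trader_id"], orders["event"])

# Simple rule: a cancel ratio above 50% of all order events raises an alert.
counts["cancel_ratio"] = counts.get("CANCEL", 0) / counts.sum(axis=1)
flagged = counts[counts["cancel_ratio"] > 0.5]

print(flagged[["cancel_ratio"]])
```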

Application & Data Architecture –

The dramatic technology advances in Big Data & Cloud Computing enable the realization of the above requirements.  Big Data is dramatically changing that approach with advanced analytic solutions that are powerful and fast enough to detect fraud in real time but also build models based on historical data (and deep learning) to proactively identify risks.

To enumerate the various advantages of using Big Data  –

a) Real time insights –  Generate insights at a latency of a few milliseconds
b) A Single View of Customer/Trade/Transaction 
c) Loosely coupled yet Cloud Ready Architecture
d) Highly Scalable yet Cost effective

The technology reasons why Hadoop is emerging as the best choice for fraud detection: from a component perspective, Hadoop supports multiple ways of running the models and algorithms that are used to find patterns of fraud and anomalies in the data and to predict customer behavior. Examples include Bayesian filters, Clustering, Regression Analysis, Neural Networks etc. Data Scientists & Business Analysts have a choice of MapReduce, Spark (via Java, Python, R), Storm and SAS, to name a few, to create these models. Fraud model development, testing and deployment on fresh & historical data become very straightforward to implement on Hadoop. The last few releases of enterprise Hadoop distributions (e.g. Hortonworks Data Platform) have seen huge advances from a Governance, Security and Monitoring perspective.
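
In that machine-learning vein, here is a hedged sketch of unsupervised anomaly detection over per-trade features using scikit-learn's IsolationForest. The features, sample values and contamination rate are illustrative assumptions; a real surveillance model would be trained on far richer data inside the cluster (e.g. via Spark) rather than in memory like this.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)

# Hypothetical per-trade features: [order size, price deviation from mid, cancel ratio]
normal_trades = rng.normal(loc=[100, 0.0, 0.1], scale=[20, 0.05, 0.05], size=(500, 3))
suspect_trades = np.array([[5000, 0.9, 0.95], [4000, -0.8, 0.90]])

# Fit on the bulk of (assumed normal) activity; outliers are scored at predict time.
model = IsolationForest(contamination=0.01, random_state=42)
model.fit(normal_trades)

# -1 marks an anomaly, 1 marks an inlier.
print(model.predict(suspect_trades))
```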

A shared data repository called a Data Lake is created that can capture every order creation, modification, cancelation and ultimate execution across all exchanges. This lake provides more visibility into all data related to intra-day trading activities. The trading risk group accesses this shared data lake to process more position, execution and balance data. This analysis can be performed on fresh data from the current workday or on historical data, and it is available for at least five years – much longer than before. Moreover, Hadoop enables ingest of data from recent acquisitions despite disparate data definitions and infrastructures. All the data that pertains to trade decisions and the trade lifecycle needs to be made resident in a general enterprise storage pool that is run on HDFS (the Hadoop Distributed Filesystem) or a similar Cloud based filesystem. This repository is augmented by incremental feeds of intra-day trading activity data that are streamed in using technologies like Sqoop, Kafka and Storm.

The above business requirements can be accomplished by leveraging the many different technology paradigms in the Hadoop Data Platform. These include technologies such as an enterprise grade message broker (Kafka), in-memory data processing via Spark, stream processing via Storm etc.
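
A hedged sketch of the streaming ingest tier follows: reading trade events from a Kafka topic with Spark Structured Streaming (the post pairs Kafka primarily with Storm; Spark is used here simply because it is also part of the stack and keeps the example short). It assumes the Spark–Kafka integration package is on the classpath, and the broker address and topic name are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("TradeIngest").getOrCreate()

# Subscribe to a (hypothetical) 'trades' topic; requires the spark-sql-kafka package.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "trades")
    .load()
)

# Kafka delivers key/value as binary; cast the value to string for downstream parsing.
events = raw.select(F.col("value").cast("string").alias("event_json"))

# Write to the console here for illustration; a real pipeline would land data in HDFS/HBase.
query = events.writeStream.format("console").outputMode("append").start()
query.awaitTermination()
```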

Market_Surveillance

                  Illustration :  Candidate Architecture  for a Market Surveillance Platform 

The overall logical flow in the system –

  • Information sources are depicted at the left. These encompass a variety of institutional, system and human actors potentially sending thousands of real time messages per second or sending over batch feeds.
  • A highly scalable messaging system helps bring these feeds into the architecture as well as normalize them and send them in for further processing. Apache Kafka is chosen for this tier. Realtime data is published by the source trading systems over Kafka topics. Each of the transactions has hundreds of attributes that can be analyzed in real time to detect patterns of usage. We leverage Kafka integration with Apache Storm to read one message at a time and perform storage operations such as persisting the data into an HBase cluster. In a modern data architecture built on Apache Hadoop, Kafka (a fast, scalable and durable message broker) works in combination with Storm, HBase (and Spark) for real-time analysis and rendering of streaming data.
  • Trade data is thus streamed into the platform (on a T+1 basis), which ingests, collects, transforms and analyzes core information in real time. The analysis can be both simple and complex event processing, based on pre-existing rules that can be defined in a rules engine which is invoked with Storm. A Complex Event Processing (CEP) tier can process these feeds at scale to understand relationships among them, where the relationships among these events are defined by business owners in a non technical language or by developers in a technical one. Apache Storm integrates with Kafka to process incoming data. Storm architecture is covered briefly in the below section.
  • HBase provides near real-time, random read and write access to tables (or ‘maps’) storing billions of rows and millions of columns. In this case, once we store this rapidly and continuously growing dataset from the information producers, we are able to perform super fast lookups for analytics irrespective of the data size (see the sketch after this list).
  • Data that has analytic relevance and needs to be kept for offline or batch processing can be handled using the storage platform based on the Hadoop Distributed Filesystem (HDFS) or Amazon S3. The idea is to deploy Hadoop oriented workloads (MapReduce or Machine Learning) to understand trading patterns as they occur over a period of time. Historical data can be fed into the Machine Learning models created above and commingled with streaming data as discussed in step 1.
  • Horizontal scale-out (read Cloud based IaaS) is preferred as a deployment approach as this helps the architecture scale linearly as the loads placed on the system increase over time. This approach enables the Market Surveillance engine to distribute the load dynamically across a cluster of cloud based servers based on trade data volumes.
  • To take an incremental approach to building the system, once all data resides in a general enterprise storage pool it becomes accessible to many analytical workloads including Trade Surveillance, Risk, Compliance, etc. A shared data repository across multiple lines of business provides more visibility into all intra-day trading activities. Data can also be fed into downstream systems in a seamless manner using technologies like Sqoop, Kafka and Storm. The results of the processing and queries can be exported in various data formats – a simple CSV/txt format, more optimized binary formats, JSON formats, or custom formats via a custom SerDe. Additionally, with Hive or HBase, data within HDFS can be queried via standard SQL using JDBC or ODBC. The results will be in the form of standard relational DB data types (e.g. String, Date, Numeric, Boolean). Finally, REST APIs in HDP natively support both JSON and XML output by default.
  • Operational data across a bunch of asset classes, risk types and geographies is thus available to risk analysts during the entire trading window when markets are still open, enabling them to reduce risk of that day’s trading activities. The specific advantages to this approach are two-fold: Existing architectures typically are only able to hold a limited set of asset classes within a given system. This means that the data is only assembled for risk processing at the end of the day. In addition, historical data is often not available in sufficient detail. HDP accelerates a firm’s speed-to-analytics and also extends its data retention timeline
  • Apache Atlas is used to provide governance capabilities in the platform that use both prescriptive and forensic models, which are enriched by a given business’s data taxonomy and metadata. This allows for tagging of trade data across the different business data views, which is a key requirement for good data governance and reporting. Atlas also provides audit trail management as data is processed in a pipeline in the lake
  • Another important capability that Hadoop can provide is the establishment and adoption of a lightweight entity ID service – which aids dramatically in the holistic viewing & audit tracking of trades. The service will consist of entity assignment for both institutional and individual traders. The goal here is to get each target institution to propagate the Entity ID back into their trade booking and execution systems, then transaction data will flow into the lake with this ID attached providing a way to do Customer & Trade 360.
  • Output data elements can be written out to HDFS and managed by HBase. From here, reports and visualizations can easily be constructed. One can optionally layer in search and/or workflow engines to present the right data to the right business user at the right time.
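
As a hedged sketch of the HBase read/write tier described above, the snippet below uses the happybase client, which talks to HBase through its Thrift gateway. The host, table name, column family and row-key scheme are assumptions made for the example.

```python
import happybase

# Assumes an HBase Thrift server is reachable at this (placeholder) host.
connection = happybase.Connection("hbase-thrift.example.internal")
table = connection.table("trade_events")

# A row key prefixed with the trader id keeps one trader's activity contiguous for fast scans.
table.put(b"T1#2016-11-14T09:30:01", {
    b"cf:symbol": b"ACME",
    b"cf:side": b"SELL",
    b"cf:qty": b"100000",
})

# Near real-time random read for a surveillance lookup.
row = table.row(b"T1#2016-11-14T09:30:01")
print(row)

connection.close()
```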

The Final Word [1] –

We have discussed FINRA as an example of a forward looking organization that has been quite vocal about their usage of Big Data. So how successful has this approach been for them?

The benefits Finra has seen from big data and cloud technologies prompted the independent regulator to use those technologies as the basis for its proposal to build the Consolidated Audit Trail, the massive database project intended to enable the SEC to monitor markets in a high-frequency world. Over the summer, the number of bids to build the CAT was narrowed down to six in a second round of cuts. (The first round of cuts brought the number to 10 from more than 30.) The proposal that Finra has submitted together with the Depository Trust and Clearing Corporation (DTCC) is still in contention. Most of the bids to build and run the CAT for five years are in the range of $250 million, and Finra’s use of AWS and Hadoop makes its proposal the most cost-effective, Randich says.

References –

[1] http://www.fiercefinanceit.com/story/finra-leverages-cloud-and-hadoop-its-consolidated-audit-trail-proposal/2014-10-16

Financial Services IT begins to converge towards Software Defined Datacenters..

Previous posts in this blog have commented on the financial services industry as increasingly undergoing a gradual makeover if not outright transformation – both from a business and IT perspective.  This is being witnessed across the spectrum that makes up this crucial vertical –  Retail & Consumer Banking, Stock Exchanges, Wealth Management/ Private Banking & Cards etc.

The regulatory deluge (Basel III, Dodd Frank, CAT Reporting, AML & KYC etc) and the increasing sophistication of cybersecurity threats have completely changed the landscape that IT finds itself in – compared to even five years ago.

Brett King writes in his inimitable style about the age of the hyper-connected consumer, i.e. younger segments of the population who expect to be able to bank from anywhere, be it from a mobile device or via the Internet from their personal computers, instead of just walking into a physical branch.

Further, multiple Fintechs (like WealthFront, Kabbage, Square, LendingClub, Mint.com, cryptocurrency-based startups etc.) are leading the way in pioneering a better customer experience. For an established institution that has a huge early-mover advantage, the ability to compete with these innovative players by using fresh technology approaches is critical to engaging customers.

All of these imperatives place a lot of pressure on enterprise FS IT to move from an antiquated command-and-control model to delivering on-demand services with the speed of Amazon Web Services.

These new services are composed of applications that encompass paradigms ranging from Smart Middleware, Big Data, Realtime Analytics, Data Science, DevOps and Mobility. The common business thread in deploying all of these applications is the ability to react quickly to customer expectations and requirements.

Enter the Software Defined Datacenter (SDDC). Various definitions exist for this term but I wager that it means – “a highly automated & self-healing datacenter infrastructure that can quickly deliver on-demand services to millions of end users and internal developers without imposing significant headcount requirements on the enterprise“.

Let’s parse this below.

The SDDC combines SDC (Software Defined Compute), SDS (Software Defined Storage), SDN (Software Defined Networking), Software Defined Applications and Cloud Management Platforms (CMP) into one logical construct, as can be seen in the picture below.

[Figure: FS_SDDC – the components of the Financial Services Software Defined Datacenter]

The core of the software defined approach is APIs. APIs control the lifecycle of resources (request, approval, provisioning, orchestration & billing) as well as the applications deployed on them. The SDDC implies commodity (x86) hardware & a cloud based approach to architecting the datacenter.
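
To illustrate the API-driven lifecycle, the snippet below is a minimal sketch of provisioning a compute instance with the openstacksdk client. It assumes an OpenStack based private cloud described by a clouds.yaml entry named "bank-private"; the image, flavor and network names are placeholders.

```python
# A minimal sketch of API-driven provisioning, assuming an OpenStack based
# private cloud configured in clouds.yaml under the name "bank-private".
import openstack  # pip install openstacksdk

conn = openstack.connect(cloud="bank-private")

# Look up placeholder image, flavor and network by name.
image = conn.compute.find_image("rhel-7-base")
flavor = conn.compute.find_flavor("m1.medium")
network = conn.network.find_network("lob-risk-analytics")

# Request a new instance; the same API surface covers resize, snapshot,
# teardown and metering - i.e. the full resource lifecycle.
server = conn.compute.create_server(
    name="risk-batch-worker-01",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(server.status)
```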

The ten fundamental technology differentiators of the SDDC –

1. Highly Elastic – scale up or scale down the entire gamut of infrastructure (compute – VMs/bare metal/containers; storage – SAN/NAS/DAS; network – switches/routers/firewalls etc.) in near real time

2. Highly Automated – Given the scale & multi-tenancy requirements, automation is needed at all levels of the stack (development, deployment, monitoring and maintenance)

3. Low Cost – Oddly enough, the SDDC operates at a lower CapEx and OpEx than the traditional datacenter due to its reliance on open source technology & a high degree of automation. Further, workload consolidation helps increase hardware utilization.

4. Standardization –  The SDDC enforces standardization and homogenization of deployment runtimes, application stacks and development methodologies based on line-of-business requirements. This solves a significant IT challenge that has hobbled innovation at large financial institutions.

5. Microservice Based Applications –  Applications developed for SDDC-enabled infrastructure are built as small, nimble processes that communicate via APIs and over infrastructure like service mediation components (e.g. Apache Camel). This offers huge operational and development advantages over legacy applications. While one does not expect Core Banking applications to move over to a microservice model anytime soon, customer-facing applications that need responsive digital UIs will definitely need to consider such approaches (a minimal microservice sketch follows this list).

6. ‘Kind-of-Cloud’ Agnostic –  The SDDC does not enforce the concept of a private cloud; rather, it encompasses a range of deployment options – public, private and hybrid.

7. DevOps Friendly –  The SDDC enforces not just standardization and homogenization of deployment runtimes, application stacks and development methodologies but also enables a culture of continuous collaboration among developers, operations teams and business stakeholders, i.e. cross-departmental innovation. The SDDC is a natural container for workloads that are experimental in nature and can be updated/rolled back/rolled forward incrementally based on changing business requirements. The SDDC enables rapid deployment capabilities across the stack, leading to faster time to market for business capabilities.

8. Data, Data & Data –  The heart of any successful technology implementation is data. This includes customer data, transaction data, reference data, risk data, compliance data etc. The SDDC provides a variety of tools that enable applications to process data in a batch, interactive or low latency manner depending on the business requirements (see the batch processing sketch after this list).

9. Security –  The SDDC shall provide robust perimeter defense as well as application level security, with a strong focus on a Defense In Depth strategy. Further, data at rest and in motion shall be encrypted.

10. Governance –  The SDDC enforces strong governance requirements for capabilities ranging from ITSM needs (workload orchestration, business-policy-enabled deployment, autosizing of workloads) to change management, provisioning, billing, chargeback & application deployments.
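
To make point 5 above a little more concrete, here is a minimal sketch of a single, narrowly scoped microservice exposing one capability over a REST API, written with Flask. The endpoint and the canned data are hypothetical; a real deployment would sit behind the SDDC’s service mediation and API management layers.

```python
# A minimal sketch of a narrowly scoped microservice (hypothetical endpoint and data).
from flask import Flask, jsonify

app = Flask(__name__)

# In a real service this would come from a backing data store, not a dict.
_BALANCES = {"ACC-001": {"currency": "USD", "available": 2500.75}}

@app.route("/accounts/<account_id>/balance")
def balance(account_id):
    record = _BALANCES.get(account_id)
    if record is None:
        return jsonify(error="unknown account"), 404
    return jsonify(account_id=account_id, **record)

if __name__ == "__main__":
    # Each microservice runs as its own small process and is reached via its API.
    app.run(port=8080)
```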
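And for point 8, the fragment below sketches how pooled data might be processed in batch with PySpark; the HDFS paths and column names are hypothetical placeholders.

```python
# A hypothetical batch aggregation over pooled transaction data using PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-risk-rollup").getOrCreate()

# Placeholder input path in the data lake.
trades = spark.read.parquet("hdfs:///data/lake/trades/2016-11-01")

# Roll up gross notional by desk and asset class.
exposure_by_desk = (
    trades.groupBy("desk_id", "asset_class")
          .agg(F.sum("notional").alias("gross_notional"))
)

# Placeholder output path for downstream risk and reporting consumers.
exposure_by_desk.write.mode("overwrite").parquet("hdfs:///data/lake/risk/rollups/2016-11-01")
```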

So who is doing the SDDC at the moment? Most major banks have initiatives in place to gradually evolve their infrastructures to an SDI paradigm. Bank of America (for one) has been vocal about its approach of using two stacks, one open source & OpenStack based and the other a proprietary stack [1].

To sum up the core benefit of the SDDC approach, it brings a large enterprise closer to web scale architectures and practices.

The business dividends of the latter include –

1. Digital Transformation – Every large Bank is under growing pressure to transform lines of business, or its entire enterprise, into a digital operation. I define digital in this context as being able to “adopt high levels of automation while enabling the business to support multiple channels by which products and services can be delivered to customers”.

Further, the culture of digital encourages constant innovation and agility, resulting in high levels of customer & employee satisfaction.

2. Smart Data & Analytics –  Techniques that ensure that the right data is in the hands of the right employee at the right time so that contextual services can be offered in real time to customers. This has the effect of optimizing existing workflows while also enabling the creation of new business models.

3. Cost Savings – Oddly enough, the move to web-scale only reduces business and IT costs. You not only end up doing more with fewer employees due to higher levels of automation, but are also able to constantly cut costs by adopting technologies like Cloud Computing, which reduce both CapEx and OpEx. Almost all webscale IT is dominated by open source technologies & APIs, which are much more cost effective than proprietary platforms.

4. A Culture of Collaboration – The most vibrant enterprises that have implemented web-scale practices not only offer “IT/Business As A Service” but have also instituted strong cultures of symbiotic relationships between customers (both current & prospective), employees, partners and developers.

5. Building for the Future – The core idea behind implementing web-scale architecture and data management practices is “Be disruptive in your business or be disrupted by competition”. Web-scale practices enable the building of business platforms around which ecosystems can be created and then sustained, driving increasing revenue.

To quote Wikipedia, a widespread transition to the SDDC will take years:

Enterprise IT will have to become truly business focused, automatically placing application workloads where they can be best processed. We anticipate that it will take about a decade until the SDDC becomes a reality. However, each step of the journey will lead to efficiency gains and make the IT organization more and more service oriented.

The virtuous loop encouraged by constant customer data & feedback enables business applications (and platforms) to behave like agile & growing organisms –  SDDC based architectures offer them the agility to get there.

References

[1] http://blogs.wsj.com/cio/2015/06/26/bank-of-america-adding-workloads-to-software-defined-infrastructure/