NSA PRISM – The Mother of all Big Data Projects
As a data engineer and scientist, I have been following the NSA PRISM raw intelligence mining program with great interest. The engineering complexity, breadth and scale are simply amazing compared to, say, credit card analytics (Fair Isaac) or marketing analytics firms like Acxiom.
Some background… PRISM – “Planning Tool for Resource Integration, Synchronization, and Management” – is a top-secret data-mining “connect-the-dots” program aimed at terrorism detection and other pattern extraction authorized by federal judges working under the Foreign Intelligence Surveillance Act (FISA). PRISM allows the U.S. intelligence community to look for patterns across multiple gateways across a wide range of digital data sources.
PRISM is an unstructured big data aggregation framework (audio and video chats, phone call records, photographs, e-mails, documents, financial transactions and transfers, internet searches, Facebook posts, smartphone logs and connection logs) combined with the analytics that enable analysts to extract patterns. The idea is to save and analyze all of the digital breadcrumbs people don’t even know they are creating.
The whole NSA program raises an interesting debate about “Sed quis custodiet ipsos custodes.” (“But who will watch the watchers.”)
Next Best Offer Design: Solution Architecture
Next best offer, next best action, interaction optimization, and experience optimization typically share a similar architecture. Machine learning and multivariate statistical analysis are at the heart of these cutting-edge behavioral analytics strategies. Typically firms use statistical tools for segmentation models, behavioral propensity modeling, and market basket analysis.
The bleeding edge in next best offer is increasingly around:
- Applying machine learning to find connections between product tastes and different affinity statements
- Developing low-latency algorithms that help show the right product at the right time to a customer
- Developing rich customer affinity profiles through a variety of feedback loops as well as third-party data sources (e.g. Facebook user demographics and taste graph)
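To make the market basket idea concrete, here is a minimal sketch of how pairwise support, confidence and lift can be turned into a naive "next best offer" pick. The baskets, the function names and the choice of lift as the ranking score are all illustrative assumptions, not a description of any particular firm's system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets; real inputs would come from transaction logs.
BASKETS = [
    {"diapers", "wipes", "formula"},
    {"diapers", "wipes"},
    {"diapers", "beer"},
    {"wipes", "formula"},
    {"beer", "chips"},
]


def pair_rules(baskets):
    """Return {(lhs, rhs): (support, confidence, lift)} for every item pair seen together."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (a, b), both in pair_counts.items():
        support = both / n  # share of baskets containing both items
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / item_counts[lhs]  # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # P(rhs | lhs) / P(rhs)
            rules[(lhs, rhs)] = (support, confidence, lift)
    return rules


def next_best_offer(rules, item):
    """Pick the co-purchased item with the highest lift, or None if there is none."""
    candidates = {rhs: stats[2] for (lhs, rhs), stats in rules.items() if lhs == item}
    return max(candidates, key=candidates.get) if candidates else None


RULES = pair_rules(BASKETS)
print(next_best_offer(RULES, "diapers"))
```

A production system would add propensity scores, business rules and low-latency serving on top of this; the sketch only shows the affinity calculation itself.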
Targeted Offer Solutions
Making Money on Predictive Analytics – Tools, Consulting and Content
Here are just a few examples of analytics at work:
- Target predicts customer pregnancy from shopping behavior, thus identifying prospects to contact with offers related to the needs of a newborn’s parents.
- Tesco (UK) annually issues 100 million personalized coupons at grocery cash registers across 13 countries. Predictive analytics increased redemption rates by a factor of 3.6.
- Netflix predicts which movies you will like based on what you watched.
- Life insurance companies can predict the likelihood that an elderly policy holder will die within 18 months in order to trigger end-of-life counseling.
- Con Edison predicts energy distribution cable failure, updating risk levels that are displayed on operators’ screens three times an hour in New York City.
Now you are interested. So what about your organization? Do you have the right toolset, dataset, skillset and mindset for analytics? Do you want to enable end users to get access to their data without having to go through intermediaries?
The challenge facing managers in every industry is not trivial… how do you effectively derive insights from the deluge of data? How do you structure and execute analytics programs (Infrastructure + Applications + Business Insights) with limited budgets?
Procter & Gamble – Business Sphere and Decision Cockpits
Data-driven DNA is about having the right toolset, mindset, skillset and dataset to evolve a major brand and seize today’s omni-channel opportunities. Whether it’s retooling and retraining for the multiscreen attention economy, or introducing digital innovations that transform both retail and healthcare, P&G is bringing data into every part of its core strategies to fight for the customer.
—————————
Striving for market leadership in consumer products is a non-stop managerial quest. In the struggle for survival, the fittest win out at the expense of their rivals because they succeed in adapting themselves best to their environment.
CMOs and CIOs everywhere agree that analytics is essential to sales & marketing and that its primary purpose is to gain access to customer insight and intelligence along the market funnel – awareness, consideration, preference, purchase and loyalty.
In this posting we illustrate a best-in-class “run-the-business” data and analytics case study at P&G. The case study demonstrates four key characteristics of data market leaders:
- A shared belief that data is a core asset that can be used to enhance operations, customer service, marketing and strategy
- More effective leverage of more data – corporate, product, channel, and customer – for faster results
- Technology is only a tool, it is not the answer..!
- Support for analytics by senior managers who embrace new ideas and are willing to shift power and resources to those who make data-driven decisions
This case study of a novel construct called Business Cockpit (also called LaunchTower in the Biotech and Pharmaceutical Industry) illustrates the way Business Analytics is becoming more central in retail and CPG decision making.
Here is a quick summary of the P&G analytics program:
- Primary focus on improving management decisions at scale: P&G analyzed the time gap between when information becomes available and when it is applied to decision making
- “Information and Decision Solutions” (IT) embeds over 300 analysts in leadership teams
- Over 50 “Business Suites” for executive information viewing and decision-making
- “Decision cockpits” on 50,000 desktops
- 35% of marketing budget on digital
- Real-time social media sentiment analysis for “Consumer Pulse”
- Focused on how to best apply and visualize information instead of discussion/debate about validity of data
P&G Overview
Data Scientist Infographic & Managed Analytics
The exploding demand for analytics professionals has exceeded all expectations, and is driven by the Big Data tidal wave. Big data is a term commonly applied to large data sets where volume, variety, velocity, or multi-structured data complexity are beyond the ability of commonly used software tools to efficiently capture, manage, and process.
To get value from big data, ‘quants’ or data scientists are becoming analytic innovators who create tremendous business value within an organization, quickly exploring and uncovering game-changing insights from vast volumes of data, as opposed to merely accessing transactional data for operational reporting.
This EMC infographic summarizing their Data Scientist study supports my hypothesis: data is becoming the new oil, and we need a new category of professionals to handle the downstream and upstream aspects of drilling, refining and distribution. Data is one of the most valuable assets within an organization. With business process automation, the amount of data being generated, stored and analyzed by organizations is exploding.
Following up on our previous blog post, “Are you one of these — Data Scientist, Analytics Guru, Math Geek or Quant Jock?”, I am convinced that future jobs are going to be centered around the “Raw Data -> Aggregate Data -> Intelligence -> Insight -> Decisions” data chain. We are simply industrializing the chain as machines and automation take over the lower end of the spectrum. Web 2.0 and social media are also creating an interesting data feedback loop: users contribute to the products they use via likes, comments, etc.
CIOs are faced with the daunting task of unlocking the value of their data efficiently in the time-frame required to make accurate decisions. To support CIOs, companies like IBM are attempting to become a one-stop shop through a rapid-fire, $14 billion-plus acquisition strategy: Cognos, Netezza, SPSS, ILog, Solid, CoreMetrics, Algorithmics, Unica, Datacap, OpenPages, Clarity Systems, Emptoris, DemandTec (for retail). IBM also has other information management assets like Ascential, Filenet, Watson, DB2 etc. They are building a formidable ecosystem around data. They see this as a $20 billion per year opportunity in managing the data, understanding the data and then acting on the data.
Big Data Investment Theme – Fidelity Investments
Fidelity Investments put out an interesting analysis on Big Data as a Macro Investment Theme for clients. Since everyone has an underperforming investment portfolio in this current market, I reproduced the article here to generate some ideas.
Key Takeaways
- New types of large data sets have emerged because of advances in technology, including mobile computing, and these data are being examined to generate new revenue streams.
- More traditional types of business data have also expanded exponentially, and companies increasingly want and need to analyze this information visually and in real time.
- Big data will be driven by providers of Internet media platforms, data amalgamation applications, and integrated business software and hardware systems.
Investment Theme – Big Data
The concept of “big data” generally refers to two concurrent developments. First, the pace of data accumulation has accelerated as a wider array of devices collect a variety of information about more activities: website clicks, online transactions, social media posts, and even high-definition surveillance videos.
A key driver of this flood of information has been the proliferation of mobile computing devices, such as smartphones and tablets. Mobile data alone are expected to grow at a cumulative annualized rate of 92% between 2010 and 2015 (see Exhibit 1, below).
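To get a feel for what a 92% annualized rate implies, the quick calculation below compounds it over the period. It assumes the rate compounds once per year for five annual periods (2010 to 2015); that reading, and the helper name `growth_multiple`, are my assumptions rather than anything stated in the Fidelity analysis.

```python
def growth_multiple(annual_rate, years):
    """Total growth multiple from compounding `annual_rate` once per year for `years` years."""
    return (1 + annual_rate) ** years


# Assumption: five annual compounding periods between 2010 and 2015.
multiple = growth_multiple(0.92, 5)
print(f"{multiple:.1f}x")  # prints "26.1x"
```

Under those assumptions, mobile data volume in 2015 would be roughly 26 times its 2010 level.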
Analytics-as-a-Service: Understanding how Amazon.com is changing the rules
“By 2014, 30% of analytic applications will use proactive, predictive and forecasting capabilities” Gartner Forecast
“More firms will adopt Amazon EC2 or EMR or Google App Engine platforms for data analytics. Put in a credit card, buy an hour’s or a month’s worth of compute and data storage. Charge for what you use. No sign-up period or fee. Ability to fire up complex analytic systems. Can be a small or large player” Ravi Kalakota’s forecast
—————————-
Big data Analytics = Technologies and techniques for working productively with data, at any scale.
Analytics-as-a-Service is cloud based: elastic and highly scalable, with no upfront capital expense. You pay only for what you use, and it is available on demand.
The combination of the two is the emerging new trend. Why? Many organizations are starting to think about “analytics-as-a-service” as they struggle to cope with the problem of analyzing massive amounts of data to find patterns, extract signals from background noise and make predictions. In our discussions with CIOs and others, we are increasingly talking about leveraging private or public cloud computing to build an analytics-as-a-service model.
Analytics-as-a-Service is an umbrella term I am using to encapsulate “Data-as-a-Service” and “Hadoop-as-a-Service” strategies. It is more sexy 🙂
The strategic goal is to harness data to drive insights and better decisions faster than competition as a core competency. Executing this goal requires developing state-of-the-art capabilities around three facets: algorithms, platform building blocks, and infrastructure.
Analytics is moving out of the IT function and into the business: marketing, research and development, and strategy. As a result of this shift, the focus is more on speed-to-insight than on common or low-cost platforms. In most IT organizations it takes anywhere from 6 weeks to 6 months to procure and configure servers, then another several months to load, configure and test software. That is not very fast for a business user who needs to churn data and test hypotheses. Hence the cloud as an analytics alternative is gaining traction with business users.
Are you one of these — Data Scientist, Analytics Guru, Math Geek or Quant Jock?
“The sexy job in the next ten years will be statisticians…” ‐ Hal Varian, Google
Analytics Challenge — California physicians group Heritage Provider Network Inc. is offering $3 million to any person or firm who develops the best model to predict how many days a patient is likely to spend in the hospital in a year’s time. Contestants will receive “anonymized” insurance-claims data to create their models. The goal is to reduce the number of hospital visits, by identifying patients who could benefit from services such as home nurse visits.
The need for analytics talent is growing everywhere. Analytics touches everyone in the modern world. It’s no longer on the sidelines in a support role, but instead is driving business performance and insights like never before.
Job posting analyses indicate that market demand for data scientists and analytics gurus capable of working with large real-time data sets or “big data” took a huge leap recently. The most common definition of “big data” is real-time insights drawn from large pools of data. These datasets tend to be so large that they become awkward to work with using on-hand relational database tools, or Excel.
It’s super trendy to be labeled “big data” right now, but that doesn’t mean the business trend’s not real. Take, for instance, the following scenario in B2B supply chains. The Coca-Cola Company is leveraging retailers’ POS data (e.g., Walmart) to build customer analytical snapshots, including mobile iPad reporting, and enable the CPFR (Collaborative Planning, Forecasting, and Replenishment) process in its supply chain. Walmart alone accounts for $4 billion of Coca-Cola Company sales.
Airlines, hotels, retail, financial services and e-commerce are industries that deal with big data. The trend is nothing new in financial services (low-latency trading, complex event processing, straight-through processing) but radical in traditional industries. In trading, the value of insights depends on the speed of analytics. Old data or slow analytics translate into losing money.
As data growth in business processes outpaces our ability to absorb, visualize or even process, new talent around Business Analytics will have to emerge. New roles such as Data Scientists, Analytics Savants, Quant Modelers are required in almost every corporation for converting the growing volumes of data into actionable insights.
Look at these data stats.







