BW Businessworld

How Big Is Big Data


Since Alan Mulally took over as CEO in 2006, he has made data collection and analysis an obsession at Ford Motor. “We have constant meetings to figure out how to use the petabytes of data generated in the organisation, from our cars, our dealers, from social media networks and the rest of the Internet,” says John Ginder, manager of system analytics at Ford and the man leading the data charge from his office in Troy, Michigan, US. The goal is to engage customers better. Ginder is collecting lots of data, including data he cannot analyse right now because the proper tools do not exist.

Meanwhile, two giants on opposite sides of the retail industry divide — Walmart and Amazon — are also harnessing Big Data to improve customer service, stock better inventory, gauge sales trends better and improve operational efficiencies.

Across the world, institutions are racing to figure out ways to harness the digital deluge that is dubbed Big Data. In California, a startup called Cardio DX analysed over 100 million genes to shortlist 23 primary predictive genes for coronary artery disease. Big Data projects are also on to improve traffic management in real time, send alerts for freak weather changes, increase milk production, improve power grid efficiencies and dozens of other things. A McKinsey Global study says Big Data can save billions or even trillions of dollars in the US and Europe in healthcare, retail, public administration and manufacturing.

There are a few things you need to know about Big Data before we go further. Big Data as we know it today did not even exist five years ago. It was less than a $100 million industry even in 2009, according to a study by Deloitte. But it is now being touted as the biggest opportunity for everyone in the tech industry: from the likes of IBM and Amazon to Indian software services blue chips such as TCS, Infosys, Wipro and HCL Tech, and from startups in the US (Cloudera, GoodData and Parcel, to name a few of the hundreds that have jumped in) to those in India such as Mu Sigma, Xurmo, Metaome, Vizury and MeshLabs, among others.

To get an idea of how fast the market is growing, consider this. IDC thinks the Big Data market is worth $5 billion or so. By 2015, it predicts Big Data revenues will touch $30 billion and cross $54 billion by 2017. Many feel the IDC numbers are a gross underestimation.
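
As a rough illustration of the growth those forecasts imply, the sketch below computes the compound annual growth rate between the figures quoted above. It assumes the $5 billion estimate refers to 2013, the year this story was published; the article does not state the base year, and the function name is ours.

```python
def implied_annual_growth(start_value, end_value, years):
    """Compound annual growth rate needed to go from start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures in billions of dollars, as quoted from IDC in the text.
# ASSUMPTION: the $5 billion estimate is for 2013 (not stated in the article).
growth_to_2015 = implied_annual_growth(5, 30, years=2)
growth_to_2017 = implied_annual_growth(30, 54, years=2)

print(f"2013 to 2015: {growth_to_2015:.0%} a year")
print(f"2015 to 2017: {growth_to_2017:.0%} a year")
```

Under that assumption, the forecast implies the market more than doubling each year to 2015, then growing by roughly a third a year to 2017.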

But just what is Big Data all about? To put it in extremely simple terms, Big Data is the trillions and quintillions of bytes being generated, mostly in unstructured form, by millions of digital sources. These include digital imaging instruments, social networks, emails, digital video uploads, traffic signal data, short messaging services and tweets. IBM estimates the world’s digital networks and storage systems now hold nearly 300 exabytes of data, almost 90 per cent of which was generated in the past two years alone.

Several forces came together to create the Big Data explosion. First, an increasing number of devices went digital, from cameras to phones and from power meters to traffic signals. Second, the rapid drop in storage prices and the rise of cloud computing allowed all manner of digital data to be stored and accessed online. (Today, storing all the music in the world will cost you only $600.) Third, as the world became increasingly networked, real-time data sharing and analysis became a reality. Finally, the rise of cellphones (there are 1 billion smartphones in the world) and social media networks (550 million Facebook posts and 250 million tweets a day) led to a data explosion.

Meanwhile, tech companies realised that most of what worked earlier in data collection and analysis, including relational databases, was useless in tracking and analysing Big Data. For one, there was a format problem: a lot of the data floating around the Web is unstructured. In the old days, disgruntled customers wrote to you or emailed, and you could feed that into a specific format to analyse problems. Now, people tweet, text, post on review websites or write a status update on Facebook, and give you feedback in hundreds of new ways. Gathering that data fast and reacting to it cannot happen with old tools. Then again, the earlier theory of data analysis was geared around collecting data over weeks, months and sometimes years and then analysing it. Now, marketers and scientists are trying to analyse data and react in real time, before the customer goes offline.

“The old business intelligence (BI) analytics was business-centric and did not predict future outcomes from a consumer standpoint. For a client, retaining his customer is the key and Big Data analysis is a business driver for IT companies,” says K.R. Sanjiv, senior vice-president, analytics, at Wipro Technologies. “You must understand that Big Data is misunderstood. Most firms look at this as parallel processing; it is not simple analytics,” says Sanchit Vir Gogia, principal analyst, emerging technologies, at IDC. He says companies must use Big Data to understand machine data and consumer data.

Put all that together, and you have two conclusions. One, old tech giants are as new to the Big Data game as any startup is (they have more cash though). And two, everyone is still figuring out the rules of the game. The opportunities broadly lie in storage, in software that can help give a structure to unstructured data, software and software services that can help in real-time analysis, parallel processing hardware, artificial intelligence and, finally, in training people to function in a Big Data world.

IBM was the first to point out that anyone who needs to grapple with Big Data needs expertise in volume, variety and velocity. That is also what Vishnu Bhat, head of cloud services at Infosys, is offering his clients: “Volume, velocity and variety is what we want to solve... this will allow data to be looked at very differently from plain reports.” His counterparts at TCS, HCL Tech and Wipro echo the sentiment. “Big Data can be as lucrative as the Y2K business for India,” says Naresh Nagarajan, senior vice-president, HCL Technologies.

There are plenty of software tools cropping up to instil discipline into Big Data. But these are pretty new for most traditional software services firms. Among the software tools available are some with exotic names such as Hadoop, Hive, PIG, Mahout, MapReduce and ZooKeeper. “They take data from the warehouse and convert it into multiple rows and columns on what is called distributed computing. This is converted into text format or raw data, which can be used for on-the-fly data crunching,” says Amandeep Kalsi, director, Protiviti Consulting.
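
To make the idea of distributed data crunching more concrete, here is a minimal single-machine sketch of the MapReduce pattern that Hadoop popularised, counting words in a handful of made-up customer posts. It is an illustration only: the function names and sample posts are hypothetical, and a real Hadoop job would run the map and reduce steps in parallel across many machines rather than in one Python process.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    for token in record.lower().split():
        word = token.strip(".,!?")
        if word:
            yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over an iterable of text lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical unstructured feedback, standing in for tweets or reviews.
posts = ["Great service, great price", "Service was slow"]
counts = word_count(posts)
print(counts)
```

The design point is that each map call looks at only one record and each reduce call at only one key, which is what lets a framework spread the work over many machines.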

“There are scattered objectives everywhere, but the opportunity is large,” says S. Sridhar, director, enterprise solutions business, Dell India.

Because of the proliferation of technologies that did not exist even a few years ago, many big companies find themselves at a slight disadvantage. They have thousands of people who have specialised in traditional software that is useless in dealing with Big Data. Startups working on a fresh slate can move faster. “These disruptive technologies are being built by startups, which can scale up the product quickly,” says Praveen Bhadada, director at Zinnov, a management consulting firm.

But one problem plaguing the industry is the paucity of trained people. McKinsey India estimates that the country needs at least 150,000 new data wizards to come to grips with Big Data. McKinsey Global estimated the US itself had a shortage of between 140,000 and 190,000 people with deep data analysis skills and another 1.5 million managers, which was hampering its Big Data projects.

But that is also an opportunity for some people. Bangalore-based startup JigSaw Academy is making a business out of training people aspiring to be data scientists. “It is important for the marketing executive of the future to understand data not from a sales and promotions perspective, but from a viewpoint where she can tell about customers’ habits and preferences to improve company margins,” says Gaurav Vohra, co-founder of JigSaw.

The Data Brigade

Late afternoon on a sunny Tuesday in Bangalore, Sridhar Gopalakrishnan, co-founder and CEO of Xurmo Technologies, is on a conference call with Jaime Carbonell and Ganesh Mani, both experts in artificial intelligence and language technology at Carnegie Mellon University. The conversation revolves around how to teach machines to understand queries in a human context. “If you talk about Big Data, the machine or tool needs to think of the right meaning to queries,” says Gopalakrishnan. “The machine must be taught thoroughly; like what does ‘I miss my little girl’ mean. A normal tool will throw up various results. But Big Data is more about intelligence,” he says. His three-year-old firm is preparing for a Series A venture capital funding after two angel rounds. “We were trying to convince clients on why intelligence is necessary to rethink business strategy,” says Gopalakrishnan. The firm has 20 people, all technologists, building Big Data tools. HCL Technologies and Cognizant Technologies have already roped in Xurmo as technology partner. “We will focus on making technology and filing multiple patents,” says Gopalakrishnan. The data analytics and consulting part will be handled by its partners. Xurmo’s business model lies in shipping the product to partners and working on various pricing arrangements, ranging from license-plus-AMC and user-based fees to monthly contracts.

COMPANY: JigSaw Academy
FOUNDERS: Sarita Digumarti and Gaurav Vohra
WHAT IT DOES: Trains data analytics providers on rapid data scanning and big data tools such as R and SAS; advises corporates on final delivery of big data knowledge

Meanwhile, in Mumbai, a group of tech wizards has sweated it out to get its technology on the road and has partnered with TCS. Iken Solutions, incubated out of IIT Bombay, has built its own Big Data architecture on an open-source platform, much like Xurmo. “Our algorithm can go through reams of data and ensures that clients can report on a daily basis. Top-down crunching is over, data needs to be personalised,” says Siddarth Goel, co-founder of Iken Solutions.

Iken CTO Rajendra Sonar began writing Big Data algorithms 10 years ago, but computing and storage costs were high and commercially piloting the algorithm remained a dream. After years of experimentation, Iken was born in 2009. It is seed-funded by India Innovation Fund, Society for Innovation and Entrepreneurship and Cellnet.

Meanwhile, another startup, Vizury, is trying to make advertising on the Net more efficient. Vizury, which boasts of two IIM grads and an IITian as its founders, is mining behaviour of consumers on the Net. The goal is to make sure that advertising display is put up in the right context. “A couple of years ago, there was no scale because no customer understood the potential of a targeted audience, but data explosion has happened and they can’t ignore us,” says Gourav Chindlur, founder and COO of Vizury. “Most of our clients are online retailers and travel portals.”

The next time you go to a travel site or a retailer and get discounts, you might find Vizury’s technology helping the discount programme. Their business model works on cost-per-click or cost-per-acquisition and, on top of that, there is a consulting layer built for statistical modelling. They have a team of 12 data scientists and a 53-member corporate and sales force that works with 20,000 websites. Vizury had raised two rounds of funding; the Series A round was from Inventus and Ojas Venture Partners. It raised a third round of funding worth $9 million from Nokia Growth Partners. “If Big Data achieves brand recall, we are only proving to the world that targeted messaging is only possible with personalised insights,” says Chindlur.

“A decade ago, you would not have an under 25-year-old talk about product-oriented businesses,” says Manu Rekhi, principal at Inventus Capital Partners. While this may be true, some have put in years as scientists before they could embark on an entrepreneurial journey. Ramkumar Nandakumar and Kalpana Krishnaswami worked for 15 years in the bioscience industry before they understood there was a business in making cumbersome databases simple. Medical and life sciences databases were neither intuitive nor easy to compile. In 2008, they embarked on creating an engine that could think intuitively for life sciences, health and pharma industries. They began by aggregating 15 databases and then building technology to understand them from a scientist’s viewpoint. “Drug discovery is expensive. The relation of proteins to diseases is crucial for a scientist and that is where our technology comes in. It sieves through all the meta data,” says Krishnaswami, CEO and cofounder of Metaome. She says she had to create a team of bio-informaticians from scratch.

“Biology has more exceptions than rules. For example, the word mercury has to have the right meaning; it can be a car, a Greek god, a chemical or a town,” says Nandakumar, CTO of Metaome. He says they had self-funded the firm by doing consulting work in the pharma industry to keep their entrepreneurial dreams alive. Their 15-member team is now scaling up their informatics platform by adding health as a segment, where they can give doctors on-the-fly data on what body types are allergic to different medicines. “The engine will help doctors prescribe the right medicine,” says Krishnaswami. She adds that the whole business of Big Data will be complete only when one understands semantics, technology and statistics. “The startups that can build such technology can be big business,” says Amit Singh, executive director of Avendus Capital.

Since Big Data is also about text, which is unstructured data, three friends started MeshLabs to solve this problem. “We realised there was business when there are 250 million tweets a day and double the number of messages on Facebook,” says Arijit Mitra, co-founder of MeshLabs.

He adds that the analytics tool his company is building will let brand managers understand the effectiveness of their campaigns. “Technology has to be programmed to understand the linguistic style used in social media, which can be a nightmare,” says Mitra. It is still a small firm, barely 15-strong, and is yet to raise outside funding. Its business model is to first convince clients to take a three-year license for using the platform and then opt for a pay-as-you-go service.

Another startup, Analytics Quotient, analyses all the raw data collected from within and outside an organisation and then advises on why marketing campaigns would not work in a particular region for the launch of a particular product. “The rule is, can you predict sentiment and segment campaigns based on that quickly?” asks Pritha Choudhuri, CEO of Analytics Quotient. Her four-year-old firm, founded with four others, is self-funded.

COMPANY: Metaome
FOUNDERS: Ramkumar Nandakumar and Kalpana Krishnaswami
WHAT IT DOES: Consolidates databases for pharma firms and helps scientists find related compounds in drugs. This can help companies reduce their R&D expenses

The Battle of Giants

Meanwhile, every tech giant is trying to carve out its space in Big Data services. And they have plenty of clients to target. The Future Group, Shoppers Stop, Airtel, Idea, Asian Paints and Croma have all been working with technology partners to increase their sales. “Every individual leaves a digital footprint; and we need to find him and tell him that this is why he has to shop with us,” says Kashyap Mehta, head of e-commerce at Infiniti Retail and Croma’s e-commerce platform, which belongs to the Tata Group. He says in six months he will be able to tell if the learning from Big Data has translated into revenues.

T. K. Hitesh, vice-president of IT at Idea Cellular, agrees with him. “I am today able to understand how a person browses, what he downloads and what he likes. I can correlate all these patterns and offer him services,” says Hitesh. He adds that all the data of Idea’s 22 mobile circles sits in one repository, which is then analysed to capture behaviour of individuals. And different patterns are created on behaviour every three hours.

Airtel, too, has applied Big Data analytics. “This is the fourth phase of data reporting, which is speedy and meaningful insights. I can manage my subscribers’ churn as we know who does what and who changes smartphones regularly. With this data, campaigns change on a daily basis,” says Amrita Gangotra, director for IT at Bharti Airtel.

Big clients are inevitably turning to big tech giants. To handle the volume and velocity of data, IBM, EMC, VMware, Oracle, Informatica and HP are building tools and persuading older clients to try out Big Data. “With Big Data, I want to change what is happening at an instant; it is all about the speed at which you can retain a customer,” says Kiran Bhatia, country manager, information management, IBM India.

“All this has been possible because storage and computing costs are falling every day,” says Rajesh Janey, president of EMC India. He says this is an era in which partnerships between firms will emerge, or system integrators will take best-in-class products and sell them to customers. The startup story is just the beginning; there are huge government projects embarking on Big Data technologies and all the tech giants are getting their sales teams to find business in making sense of government data. “Big Data will change India like it is changing western economies; one billion people will someday generate so much data that companies cannot ignore what it means,” says Richard Jones, managing director of Informatica, South Asia. Artificial intelligence of the kind that lets a robot dream, question its existence and make decisions like a human, as in Isaac Asimov’s Robot series or Ridley Scott’s Blade Runner, may still be science fiction, but with the way things are moving, we may get there faster than expected. And plenty of companies will get rich in the process.

COMPANY: Vizury
FOUNDERS: Chetan Kulkarni, Gourav Chindlur and Vikram Nayak
WHAT IT DOES: Its platform tracks customers when they go online and helps brands choose the best space to tap them, based on their personal choices, all through real-time research


(This story was published in Businessworld Issue Dated 25-03-2013)