IT Focus Area: Infrastructure Optimization
January 13, 2012
Big Data in the Cloud
Can big data really deliver big value?
Big data—as a potential business game-changer—is attracting a lot of attention. For example, a recent Google search of the term “big data cloud computing” yielded more than 1.45 million hits. A McKinsey Global Institute report touts big data’s potential for “making information transparent and usable at a much higher frequency; using sophisticated analytics to improve substantially decision making; and innovating the next generation of new business models, products and services.”
The potential of using big data—the petabytes and exabytes of unstructured and semistructured data that companies produce and collect—is promising. A sound cloud computing architecture makes it possible today to manage and manipulate hard-to-structure data for analysis in ways that were difficult to achieve with traditional relational databases, data warehouses and desktop business intelligence tools.
How big is big?
In terms of data and value, the answer is different for every enterprise. If you are helping marketers understand customer behavior on the Web, you may have to process and store data from tens of thousands of websites, amounting to an average of 8,000 events per second of input data. Financial services firms are mining unstructured data and incorporating contextual awareness about breaking news and weather to help make more informed trading decisions. In addition, by constantly monitoring massive amounts of trading activity in real time, management is better able to detect and swiftly take action against fraudulent trades.
Efficiently sifting through huge amounts of data helps pharmaceutical companies reduce the risks of introducing new drugs, accelerate the regulatory review process, and shorten the time required to turn research into revenue. One powerful example of unstructured data processing led to the development of a predictive test that could identify coronary artery disease at an early stage. By analyzing more than 100 million gene samples, an innovative genomic research company was able to identify the 23 primary predictive genes for coronary artery disease, an insight that was widely heralded as one of the top medical breakthroughs in 2010.
Consumer goods companies and retailers have tremendous potential to use big data. They collect vast amounts of information every day: at the cash register, through online transactions, in surveys, in demographic and psychographic data, and through social media interactions. Bringing all this information together can help these companies capture new insights into consumer behaviors and attitudes.
If you are a chief information officer (CIO) or work in the IT department at a company that generates a large amount of data about its customers, distributors or suppliers, you are most likely going to be asked about big data, if you have not been already. Business leaders will want to know how the company can take advantage of big data, what return on investment (ROI) to expect and how fast you can help.
With CIOs and IT organizations already passionate about efficiently managing massive amounts of data and the millions of dollars that go toward supporting it, what’s different now?
Cloud Computing Has Changed the Game
Cloud computing has changed the game as a result of improved network bandwidth that transports large amounts of data cheaply and quickly, the transient need for highly intensive resources over a short duration, and the ability to “pay as you go.” Cloud architectures, when properly deployed, can offer greater agility, cost savings, storage and scalability. In fact, many organizations we work with are already thinking about their cloud strategy as a means of harnessing the power of big data. Conversations about protecting and managing data are turning into deeper discussions about the demands CIOs are facing from the rest of the business for new business intelligence. But while business managers are looking for patterns and insights that are buried in big data as a means of making better-informed business decisions and beating the competition, many CIOs are still trying to get their arms around large-scale cloud computing. In the best case, the CIO will take the lead to determine how big data can be harnessed to help the business. Others may wait for the call from the chief executive officer (CEO), chief operating officer (COO) or chief marketing officer (CMO). In either case, there are five key challenges that IT organizations must overcome before generating big value from big data.
In many enterprise environments, big data consists of the unstructured or semi-structured data that accounts for 80 percent or more of a company’s digital information. Getting actionable information from those untapped sources is a new challenge for many CIOs. Storage costs remain a substantial budget line, and the projected growth of storage needs is stunning. International Data Corporation’s report, “Extracting Value from Chaos,” predicts that in 2011, the amount of information created and replicated will surpass 1.8 zettabytes, or 1.8 trillion gigabytes. That is growth by a factor of nine in just five short years. Liabilities and risks, particularly privacy and security, lurk in unlikely places.
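To make the storage figures above concrete, the short Python sketch below checks the unit conversion and works out the yearly growth rate implied by a ninefold increase over five years. It rests on two assumptions that are ours rather than the report's: decimal (SI) units, and a constant compound growth rate. The function names are illustrative only.

```python
# Back-of-envelope check of the storage growth figures.
# Assumptions (not from the IDC report): decimal SI units and a
# constant compound growth rate across the five years.

GIGABYTE = 10**9    # bytes
ZETTABYTE = 10**21  # bytes


def zettabytes_to_gigabytes(zettabytes: float) -> float:
    """Convert zettabytes to gigabytes using decimal units."""
    return zettabytes * ZETTABYTE / GIGABYTE


def implied_annual_growth(factor: float, years: int) -> float:
    """Yearly growth rate that compounds to `factor` over `years`."""
    return factor ** (1 / years) - 1


gb_2011 = zettabytes_to_gigabytes(1.8)
rate = implied_annual_growth(9, 5)

print(f"1.8 ZB is about {gb_2011:.2e} GB")
print(f"Implied growth is about {rate:.1%} per year")
```

Under these assumptions, ninefold growth in five years works out to roughly 55 percent a year, which is a useful sanity check when projecting storage budgets forward.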
Without knowing where to begin, a company risks making investments that will not pay off. Building a solid business case for that first big data project is challenging. Once the project sponsors start to brainstorm the “what ifs” of using big data, a project that at first required a few servers and storage assets can quickly spiral out of control. Use all your budgeting, negotiating, governance, and leadership skills to set the right course and manage scope.
The value of a big data project can be measured by the usefulness of the information it provides. What tools are required to generate actionable insights from big data? Will the business be satisfied scraping the surface, or will executives want to drill down into the data across several dimensions? Big data makes it possible for businesses to profile individual consumers in comprehensive ways: not just by their buying habits, but also by their activity on social networking sites and mobile devices. Your company’s engineers may need to analyze data from sensors embedded in their products. Or your merchandise designers may want to analyze streaming video from multiple locations to assess product placement alternatives in the retail environment. What data sources are available today, and what others could be valuable? Choosing the right analytical tools will make all the difference.
What sort of integration will be required to pull information from new sources for analysis through applications that may be cloud-based? No single vendor will be able to deliver all the features and functions the business will demand to consolidate and analyze huge amounts of unstructured market activity and information. That means juggling more IT infrastructure; understanding how to best use a distributed processing framework like Hadoop™; learning more about solutions like scale-out NAS; and bringing in other new technologies to manage unstructured data storage.
IT Skill Set
Do you have the skills and organizational framework in place to manage big data? More resources should be devoted to advanced content management, security, provisioning, procurement, and real-time processing. You may want to consider creating a distinct team with integrated IT skill sets (infrastructure, applications and governance) to work closely with big data end users. Finally, carefully managing how big data is used, shared and archived is new territory for most companies. Maintaining reliability, availability and resiliency will be particularly important in cases where big data is the fuel that powers business intelligence and becomes mission critical.
Security and Privacy
The CIO risks building a Pandora’s box of big data that, if recklessly unleashed, can put the entire enterprise at risk. How can the company mitigate the exposure of information? What dependencies have to be managed? What are the priorities for securing different kinds of information? In a world of limited resources, a risk management framework can help the CIO identify and address the greatest threats.
Big Data Provides Game-Changing Insights If Done Right
Are you ready to take on a big data project? It is a great opportunity to partner with a business unit or functional lead on a strategic initiative, and it can be a platform for building new skills in cloud computing and business intelligence that will help you create a truly data-driven organization. But don’t let the hype about big data’s potential entice you to take shortcuts. Scoping these cloud-based projects is not an exercise to be left to the inexperienced. Don’t underestimate the effort and skills required to deploy the analytical tools necessary to make sense of big data. Integration across internal data marts, private and public clouds, social media sources, and mobile devices may require new approaches, new skills and even new team structures.
Carefully managing how big data is used, shared, and archived is new territory for most companies. The scale of big data will challenge existing approaches to protecting that data with backups and replication. One-off or siloed approaches will not satisfy regulatory, legal, and customer privacy expectations in a cloud environment populated with big data. That’s a tall order. But the game-changing insights residing in big data may be too big to pass up.