Wednesday, February 1, 2012

EMC's Hadoop strategy cuts to the chase

This guest post comes courtesy of Tony Baer’s OnStrategies blog. Tony is a senior analyst at Ovum.

By Tony Baer

To date, Big Storage has been locked out of Big Data; it’s been all about direct attached storage, for several reasons. First, Advanced SQL players have typically optimized their architectures through data structure (columnar storage), unique compression algorithms, and liberal use of caching to juice response times over hundreds of terabytes. On the NoSQL side, it’s been about cheap, cheap, cheap along the Internet data center model: have lots of commodity stuff and scale it out. Hadoop was engineered exactly for such an architecture; rather than for speed, it was optimized for sheer linear scale.
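
To see why the columnar point matters, consider a toy sketch in Java of run-length encoding a column. This is purely illustrative, not any vendor's actual engine, but it shows how a low-cardinality column stored contiguously collapses to a few value-count pairs that are cheap to scan and cache:

    // Toy illustration of columnar compression: a low-cardinality column,
    // stored contiguously (as columnar engines do), run-length encodes to a
    // fraction of its raw size.
    public class ColumnarSketch {
        public static void main(String[] args) {
            // A "country" column from a 9-row table, stored as its own array
            // rather than interleaved with the other fields of each row.
            String[] country = {"US", "US", "US", "US", "DE", "DE", "JP", "JP", "JP"};

            // Emit (value, repeat-count) pairs for each run of equal values.
            StringBuilder encoded = new StringBuilder();
            int runStart = 0;
            for (int i = 1; i <= country.length; i++) {
                if (i == country.length || !country[i].equals(country[runStart])) {
                    encoded.append(country[runStart]).append('x').append(i - runStart).append(' ');
                    runStart = i;
                }
            }
            // Prints "USx4 DEx2 JPx3": three pairs instead of nine values.
            System.out.println(encoded.toString().trim());
        }
    }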

Over the past year, most of the major platform players have planted their table stakes with Hadoop. Not surprisingly, IT household names are seeking to somehow tame Hadoop and make it safe for the enterprise.

Up ’til now, anybody with the armies of top software engineers that Internet firms could buy could brute-force their way to scaling out humongous clusters and, if necessary, invent their own technology, then share and harvest from the open source community at will. That’s hardly a suitable scenario for the enterprise mainstream, so the common thread behind the diverse Hadoop strategies of IBM, EMC, Microsoft, and Oracle has been, not surprisingly, to make Hadoop more approachable.

What’s been conspicuously absent so far is a play from Big Optimized Storage. The conventional wisdom is that SAN and NAS are premium, architected systems whose costs might be prohibitive when you’re talking petabytes of data.

Similarly, the first-generation implementations from the NoSQL world have so far operated on a different philosophy: parts will fail, and five-nines service levels are overkill. And anyway, the design of Hadoop brute-forced the solution: replicate the data so that three copies are distributed around the cluster, because hardware is cheap.
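
For the curious, here is a minimal sketch of how that replication policy surfaces to a Hadoop developer. The file path and the override value are hypothetical, but dfs.replication and the FileSystem call are standard Apache APIs:

    // Minimal sketch: reading the cluster's default replication factor and
    // overriding it for one (hypothetical) file. Assumes a reachable HDFS
    // cluster configured on the classpath.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // dfs.replication defaults to 3: the "three copies" noted above.
            System.out.println("Default replication: " + conf.get("dfs.replication", "3"));

            FileSystem fs = FileSystem.get(conf);
            // Individual files can dial the factor up or down after the fact.
            fs.setReplication(new Path("/data/events.log"), (short) 2);
        }
    }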

As Big Data gains traction in the enterprise, some of it will certainly fit this pattern of something being better than nothing, as the result is unique insights that would not otherwise be possible. For instance, if your running analysis of Facebook or Twitter goes down, it probably won’t take the business with it. But as enterprises adopt Hadoop – and as pioneers stretch Hadoop to new operational use cases such as what Facebook is doing with its messaging system – those concepts of mission-criticality are being revisited.

And so, ever since EMC announced last spring that its Greenplum unit would start supporting and bundling different versions of Hadoop, we’ve been waiting for the other shoe to drop: When would EMC infuse its Big Data play with its core DNA, storage?

Today, EMC announced that its Isilon networked storage system was adding native support for Apache Hadoop’s HDFS file system. There were some interesting nuances to the rollout.
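
The nuance worth underscoring: if the storage array natively speaks the HDFS protocol, an ordinary Hadoop client shouldn’t care what answers on the other end. A minimal sketch, assuming a purely hypothetical Isilon hostname and the Hadoop 1.x-era fs.default.name setting:

    // Minimal sketch: a stock Hadoop client pointed at an HDFS-speaking
    // filer instead of a namenode. The hostname below is hypothetical.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class IsilonClient {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Same client API, different endpoint: only the URI changes.
            conf.set("fs.default.name", "hdfs://isilon.example.com:8020");

            FileSystem fs = FileSystem.get(conf);
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }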

Big vendors feeling their way

It’s interesting to see how IT household names are cautiously navigating their way into unfamiliar territory. EMC becomes the latest, after Oracle and Microsoft, to calibrate its Hadoop strategy in public.

Oracle announced its Big Data Appliance last fall before it had lined up its Hadoop distribution. Microsoft ditched its Dryad project, which was built around its HPC Server. Now EMC has recalibrated its Hadoop strategy: when it first unveiled that strategy last spring, the spotlight was on MapR’s proprietary alternative to the HDFS file system of Apache Hadoop. It’s telling that vendors’ initial announcements have either been vague or have been tweaked as they’ve waded into the market. More about EMC’s shift below.


For EMC, HDFS is the mainstream

MapR’s strategy (and IBM’s along with it, regarding GPFS) has prompted debate and concern in the Hadoop community about commercial vendors forking the technology. As we’ve ranted previously, Hadoop’s growth will be tied not only to the megaplatform vendors that support it, but also to the third-party tools and solutions ecosystem that grows around it.

For such an ecosystem to develop, ISVs and consulting firms need a common target to write against, and forked versions of Hadoop won’t exactly grow large partner communities.
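
That common target is, concretely, the stock Apache API. As an illustration, here is the canonical word-count job written against the circa-2012 org.apache.hadoop.mapreduce interfaces; code like this should run unchanged on any distribution that preserves the Apache APIs and HDFS semantics:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // The canonical MapReduce word count, written only against Apache
    // interfaces; no vendor-specific classes are involved.
    public class WordCount {

        public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                // Emit (token, 1) for every whitespace-separated token.
                for (String token : value.toString().split("\\s+")) {
                    if (token.isEmpty()) continue;
                    word.set(token);
                    ctx.write(word, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                // Sum the counts emitted for each token.
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Compile it into a jar and launch it with the standard hadoop jar command against input and output paths on HDFS; nothing about it assumes a particular vendor’s distribution.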

Regarding EMC, the original strategy was two Greenplum Hadoop editions: a Community Edition with a free Apache distro and an Enterprise Edition that bundled MapR, both under the Greenplum HD branding umbrella. At first blush, it looked like EMC was going to earn the bulk of its money from the proprietary side of the Hadoop business.

What’s significant is that the new announcement of Isilon support pertains only to the HDFS open source side. More to the point, EMC is rebranding and subtly repositioning its Greenplum Hadoop offerings: Greenplum HD is the Apache HDFS edition with optional Isilon support, and Greenplum MR is the MapR version, a niche offering targeted at advanced Hadoop use cases that demand higher performance.

Coming atop recent announcements from Oracle and Microsoft, which have come down clearly on the side of OEM’ing Apache rather than anything limited or proprietary, this amounts to an unqualified endorsement of Apache Hadoop/HDFS as not only the formal, but also the de facto, standard.

This reflects the emerging conventional wisdom that the enterprise mainstream is leery of lock-in to anything that smells proprietary for a technology where it is still on the learning curve. Other forks may emerge, but they will not be at the base file system layer. This leaves IBM and MapR pigeonholed; admittedly, there will be API compatibility, but clearly both are swimming upstream.

Central storage is the newest battleground

As noted earlier, Hadoop’s heritage is the classic Internet data center scale-out model. The advantage is that, leveraging Hadoop’s highly linear scalability, organizations can expand their clusters quite easily by adding more commodity servers and disks. Pioneers and purists would scoff at the notion of an appliance approach, because the model was always to simply scale out inexpensive, commodity hardware rather than pay premiums for big vendor boxes.

In blunt terms, the choice is whether you pay now or pay later. As mentioned before, do-it-yourself compute clusters require sweat equity: you need engineers who know how to design, deploy, and operate them. The flip side is that many, arguably most, corporate IT organizations lack either the skills or the capital. There are various ways out of what might otherwise appear to be a Hobson’s choice:

  • Go to a cloud service provider that has already created the infrastructure, such as what Microsoft is offering with its Hadoop-on-Azure services;
  • Look for a simpler happy medium, such as Amazon’s Elastic MapReduce, now integrated with its DynamoDB service;
  • Subscribe to SaaS providers that offer Hadoop-based applications (e.g., social network analysis, smart grid) as a service;
  • Get a platform and have a systems integrator put it together for you (key to IBM’s BigInsights offering, and applicable to any SI that has a Hadoop practice)
  • Go to an appliance or engineered systems approach that puts Hadoop and/or its subsystems in a box, such as with Oracle Big Data Appliance or EMC’s Greenplum DCA. The systems engineering is mostly done for you, but the increments for growing the system can be much larger than simply adding a few x86 servers here or there (Greenplum HD DCA can scale in groups of 4 server modules). Entry or expansion costs are not necessarily cheap, but then again, you have to balance capital cost against labor.
  • Surround Hadoop infrastructure with solutions. This is not a mutually exclusive strategy; unless you’re Cloudera or Hortonworks, which make their business bundling and supporting the core Apache Hadoop platform, most of the household names will bundle frameworks, algorithms, and eventually solutions that in effect place Hadoop under the hood. For EMC, the strategy is its recently announced Unified Analytics Platform (UAP), which provides collaborative development capabilities for Big Data applications. EMC is (or will be) hardly alone here.

With EMC’s new offering, the scale-up option tackles the next variable: storage. This is the natural progression of a market that will address many constituencies, and where there will be no single silver bullet that applies to all.

This guest post comes courtesy of Tony Baer’s OnStrategies blog. Tony is a senior analyst at Ovum.

Tuesday, January 31, 2012

Enterprise architects play key role in transformation, data analytics value -- but they need to act fast, say Open Group speakers

Good data management, analytics, and helping to shape the goals of the business are keys to transforming the enterprise through impactful enterprise architecture (EA). That was the theme, from different perspectives, presented by a series of plenary speakers this week at The Open Group Conference in San Francisco.

Jeanne Ross, Director and Principal Research Scientist at MIT's Center for Information System Research, opened Monday's plenary session, telling the attendees that the stakes are high for EA, which needs to show swift success in the new digital economy. Enterprise architects also now need to help their organizations better use new services and instill a "value cycle." [Disclosure: The Open Group is a sponsor of BriefingsDirect podcasts.]

Coming from the siloed past in IT, companies are now moving to business service-driven processes across various resources, Ross said. But they need to recognize the forces around consumption of such services, not just the implementation.

Making good data management a priority, with a "single source of truth," is also at the heart of making EA valuable, said Ross. Ensuring the quality of data and the speed of data refresh will do more to raise enterprise architects' standing than just about anything else, she said. Ross studies how firms develop competitive advantage through the implementation and reuse of digitized platforms.

She is also the co-author of three books: IT Governance: How Top Performers Manage IT Decision Rights for Superior Results, Enterprise Architecture As Strategy: Creating a Foundation for Business Execution, and IT Savvy: What Top Executives Must Know to Go from Pain to Gain.

I also interviewed Ross on enterprise transformation issues before the conference.

IT-enablement isn't enough, Ross said, because companies typically under-utilize new systems and applications. It's not that we can't build them, she said of systems, but that companies aren't using them to their potential. Architects need to consider this and then market and evangelize solutions.

And EAs need to be more involved in putting quality data center stage in their companies. "You don't get good analytics with bad data," Ross said. "The secret to good EA is to put information in every person's hands so they can use data better." And that in turn will help transform the business and spur added innovation using IT systems and good architecture principles.

Most senior executives aren't very good at combining business and technology strategies, Ross said, and she outlined the architect's elevated role in helping their bosses deliver increased business value:
  • Help senior execs clarify business goals
  • Identify architectural capabilities that can be readily exploited
  • Present options and their implications for business goals
  • Build capabilities incrementally
She closed out, getting applause from the audience, by predicting, "Some day CIOs are going to report to the enterprise architect, because that's the way it ought to be."

Impressive cost reduction

The second plenary speaker, Celso Guiotoko, Corporate Vice President and CIO of Nissan Motor Co., Ltd., explained that business value sits at the top of Nissan's IT principles, followed by treating information as an asset, and then by reducing complexity.

Using these principles, Nissan in 2005 developed "BEST" as an IT mid-term plan and significantly improved the efficiency of its information systems. BEST is an acronym for business alignment, EA, selective sourcing, and technology simplification.

This was followed in 2009 with the development of the "Change" program, which provided the basis for further advances by changing people, technology, and "process." And, in 2011, the next IT mid-term plan "VITESSE" was launched, designed to bring direct profit to the company. VITESSE encompasses value, innovation, technology, simplification, and service excellence. Through the various initiatives, Nissan has reduced IT cost by over 40 percent, going from a cost per user of $1.09 to $0.63.

The transformed enterprise

Andy Mulholland, Global Chief Technology Officer and Corporate Vice President at Capgemini, focused on the transformed enterprise and cloud trends, as well as the effect of new devices and social networking. Forty million tablets and 70 million smartphones are having a huge impact on how workers and consumers expect to work and shop.

The "bring your own device" phenomenon is forcing a change in thinking for enterprises, Mulholland said, as two environments are developing -- inside IT and outside IT. Typically back-end activities operate inside the firewall, while front-end people and activities operate outside the firewall, yet people nowadays want to be able to use smartphones and tablets for both personal and work tasks.

This has led to a situation in which workers are increasingly going outside IT to buy services. Mulholland quoted a Gartner prediction that up to 35 percent of IT expenditures will be outside the IT department by 2015. Other industry analysts like IDC have placed the figure higher.

Because of this, IT faces a huge “re-integration project” to bring together the inside and outside services in a rational way, Mulholland said, adding that the transformed enterprise needs to focus on the productivity of people and innovative business models.

I interviewed Mulholland a few weeks ago and we delved even deeper into the cloud duality issues now coming to the fore of enterprise technology issues and planning. I was also intrigued by a Wall Street Journal piece today on how the US faces a new tech boom. It was aligned with much of what Mulholland was saying.

The key to doing this “re-integration project,” according to Mulholland, is governance, and the industry really lacks a good cloud governance model, meaning that many businesses are already in trouble. However, enterprises shouldn't let that get in the way of progress. Mulholland advised, "If business wants something radically different from you, don't try to stop it. Try to understand it and take control of it."

Driving IT transformation

Lauren States, Vice President and Chief Technology Officer, Cloud Computing and Growth Initiatives, IBM, emphasized that transforming the enterprise requires a huge emphasis on analytics, and a successful integration of analytics and IT.

States drew on IBM's decades-long journey of constant transformation, relying on business process excellence, values-based culture, and IT-enablement. This has led to $1.5 billion in IT savings since 2005 as well as avoiding over $20 million in expenses over five years with a private analytics cloud, she said.

According to States, CMOs are overwhelmingly underprepared for the data explosion, yet they recognize the need to invest in and integrate technology and analytics, and to treat analytics as a business differentiator.

CEOs and CIOs are both highly focused on insights, clients, and people skills, States said, feeding into what she called the "new reality": the need to harvest and pass along insights and to build trusted relationships.

States' takeaway: We're at the beginning of a major change, much like the PC revolution three decades ago. The cloud's sweet spot now, she says, is in bringing new innovation and insights to marketing, sales and customer service.

No need to wait

Speaker Bill Rouse, executive director of the Tennenbaum Institute at Georgia Tech, said that many enterprises wait too long to change, with the decision to transform dragging on until the damage is beyond repair. As evidence, he noted that in the past 25 years, 1,000 companies have dropped off the Fortune 500 list, showing that enterprise transformation has a high failure rate and that waiting for the right time to change is a risky business plan.

Moreover, enterprises seeking transformation need to look at the full ecosystem in which a business operates in order to transform effectively, says Rouse. Business ecosystems are co-creating high-value services, expanding transformation across supply chains, and that is an important new dimension, he added.

Using analytics better to support evidence-based decision making is transformative and should be a priority, says Rouse. And architecture-oriented thinking can be transformative in itself, he said.

Cyber security threats

On the topic of cyber security, plenary speaker Joseph Menn, cyber security correspondent for the Financial Times and author of Fatal System Error: The Hunt for the New Crime Lords Who are Bringing Down the Internet, made it clear that business as usual won't do.

Joe has covered security since 1999, for the Financial Times and, before that, for the Los Angeles Times. Fatal System Error is his third book; he also wrote All the Rave: The Rise and Fall of Shawn Fanning's Napster. I also recently interviewed him.

"It's in no one's interest to tell us how bad it really is" when it comes to cyber crime and security, said Menn. And the Stuxnet affair is huge as a harbinger of things to come, he said.

As a result, more taxpayer money will be needed for effective government-level defenses against cyber attacks, he suggested. But government intervention won't do the job alone. Increasingly, corporations will need to play more than just defense on attacks, many of which come from Russia and China and from groups that blend state and criminal interests.

Counterattacks may be a strong defense when it comes to cyber risks, and the US government may "turn a blind eye," says Menn. We may even see cyber crime bounty hunters whom corporations hire on the QT to go after those who attack them, he said.

Meanwhile, IT groups and enterprise architects can play a bigger role. Knowing what you have helps you know when something has been taken, so improve the tracking of assets, Menn told them. He also suggested that companies keep their most critical data offline, and protect their intellectual property by burying it among fake data.

Allen Brown, President and CEO of The Open Group, said that more than 400 corporations are now members of The Open Group, showing strong growth in the 12 years since its founding. TOGAF 9 certification rates are growing rapidly worldwide, he said.

FACE standard

In other news from The Open Group on Monday, the Future Airborne Capability Environment (FACE) Consortium announced the official release of the FACE Technical Standard, which provides guidelines for creating a common operating environment to support applications across multiple Department of Defense avionics systems. See my interview on FACE as it was just getting under way.

The standard is designed to enhance the U.S. military aviation community’s ability to address issues of limited software reuse and accelerate and enhance warfighter capabilities, as well as enabling the community to take advantage of new technologies more rapidly and affordably.

It is our hope this standard will accelerate the open and secure development of products within the Department of Defense’s Airborne community by enabling industry-government collaboration.

The FACE technical standard will enable developers to create and deploy a wide catalog of applications for use across the spectrum of military aviation systems through a common operating environment. Product development efforts by industry and procurements by government customer organizations are already underway based on the FACE standard.

“The introduction of the FACE Technical Standard is an important milestone in extending interoperability among the armed forces and creating a common platform for avionics that enables systems to work together across each of the branches of the U.S. military,” said Brown.

And on Tuesday, The Open Group announced the arrival of ArchiMate 2.0, the latest version of the organization's open and independent modeling language for enterprise architecture. This version is more tightly aligned to TOGAF, so enterprise architects using the language can improve the way key business and IT stakeholders collaborate and adapt to change.

ArchiMate 2.0 improves collaboration through clearer understanding across multiple functions, including business executives, enterprise architects, systems analysts, software engineers, business process consultants and infrastructure engineers, according to the release. The new standard enables the creation of fully integrated models of an organization's Enterprise Architecture, the motivation behind it, and the programs, projects and migration paths to implement it.

"By combining TOGAF and ArchiMate, TOGAF becomes more easy to apply in any organization," said Harmen van den Berg, partner and co-founder at BiZZdesign. "Having a reference model makes them both easier to apply in any industry or vertical."

He added: "Architects like to make models, and this now helps them to use those models to create change in the organization, for something that means more to the business."

Making the EA function a chief weapon of enterprise transformation in a time of roiling change and complexity: that's the main message from the conference. No time to wait.
