Monday, December 1, 2014

Hortonworks accelerates the big data mashup between Hadoop and HP Haven

This latest BriefingsDirect deep-dive big data thought leadership interview examines how Hortonworks is working with HP on improved management of very large -- and very active -- datasets.

We'll explore how HP and Hortonworks are integrating Hadoop into more of the HP Haven family to make it easier for developers and data scientists to access business intelligence (BI) and analytics as a service.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. 

To learn how, BriefingsDirect sat down with Mitch Ferguson, Vice President of Business Development at Hortonworks at the recent HP Big Data 2014 Conference in Boston. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: We heard the news earlier this year about HP taking a $50-million stake in Hortonworks, and then about Hortonworks' IPO plans. Please fill us in a little bit about why Hortonworks and HP are coming together.

Ferguson: There are two core parts to that answer. One is that the majority of Hadoop came out of Yahoo. Hortonworks was formed by the major Hadoop engineers at Yahoo moving to Hortonworks. This was all in complete cooperation with Yahoo to help evolve the technology faster. We believe the ecosystem around Hadoop is critical to the success of Hadoop and critical to the success of how enterprises will take advantage of big data.

If you look at HP, it is a major provider of technology to enterprises, not only at the compute and storage level but also at the data management level, the analytics level, and the systems management level. The complementary nature of Hadoop as part of the modern data architecture, together with the HP hardware and software assets, provides a very strong foundation for enterprises to create the next-generation modern data architecture.

Gardner: I'm hearing a lot about the challenges of getting big data into a single set or managing the large datasets.

Users are also trying to figure out how to migrate from SQL or other data stores into Hadoop and into HP Vertica. It’s a challenge for them to understand a roadmap. How do you see these datasets as they grow larger, and we know they will, in terms of movement and integration? How is that path likely to unfold?

Machine data

Ferguson: Look at the enterprises that have been adopting Hadoop. Very early adopters like eBay, LinkedIn, Facebook, and Twitter are generating significant amounts of machine data. Then we started seeing large enterprises that are aggressive users of technology adopt it.

One of the core things is that the majority of data being created every day in an enterprise is not coming from traditional enterprise resource planning (ERP), customer relationship management (CRM), or financial management systems. It's coming from websites as clickstream data, from log data, or from sensor data. The reason there is so much interest in Hadoop is that it allows companies to cost-effectively capture very large amounts of data.

Then, they begin to understand patterns across semi-structured, structured, and unstructured data to begin to glean value from that data. Then, they leverage that data in other technologies like Vertica, analytics technologies, or even applications, or move the data back into the enterprise data warehouse.

As a major player in this Hadoop market, one of the core tenets of the company was that the ecosystem is critical to the success of Hadoop. So, from day one, we’ve worked very closely with vendors like Microsoft, HP, and others to optimize how their technologies work with Hadoop.

SQL has been around for a long time. Many people and enterprises understand SQL. That's a critical access mechanism to get data out of Hadoop. We’ve worked with both HP and Microsoft. Who knows SQL better than anyone? Microsoft. We're trying to optimize how SQL access to Hadoop can be leveraged by existing tools that enterprises know about, analytics tools, data management tools, whatever.

That's just one way that we're looking at leveraging existing integration points or access mechanisms that enterprises are used to, to help them more quickly adopt Hadoop.

Gardner: But isn’t it clear that what happens in many cases is that they run out of gas with a certain type of database and that they seek alternatives? Is that not what's driving the market for Hadoop?

Ferguson: It's not that they're running out of gas with an enterprise data warehouse (EDW) or relational database. As I said earlier, it's the sheer amount of data. By far, the majority of data is not coming from those traditional ERP, CRM, or transactional systems. As a result, a technology like Hadoop is optimized to allow an enterprise to capture very, very large amounts of that data.

Some of that data may be relevant today. Some of that data may be relevant three months or six months from now, but if I don't start capturing it, I won't know. That's why companies are looking at leveraging Hadoop.

Many of the earlier adopters are looking at leveraging Hadoop to drive a competitive advantage, whether they're providing a high level of customer service, doing things more cost-effectively than their competitors, or selling more to their existing customers.

The reason they're able to do that is that they're now able to leverage more of the data that their businesses are creating on a daily basis, understanding that data, and then using it for business value.

More than size

Gardner: So this is an alternative for an entirely new class of data problem for them in many cases, but there's more than just the size. We also heard that there's interest in moving from a batch approach to a streaming approach, something that HP Vertica is very popular around.

What's the path that you see for Hortonworks and for Hadoop in terms of allowing it to be used in more than a batch sense, perhaps more toward this streaming and real-time analytics approach?

Ferguson: That movement is under way. Hadoop 1.0 was very batch-oriented. We're now in 2.0 and it's not only batch, but interactive and also real-time, and there's a common layer within Hadoop. Hortonworks is very influential in evolving this technology. It's called YARN. Think of it as a data operating system that is part of Hadoop, and it sits on top of the file system.

Via YARN, applications or integration points, whether they're for batch-oriented applications, interactive integration, or real-time like streaming or Spark, are access mechanisms. Then, those payloads or applications, when they leverage Hadoop, will go through these various batch, interactive, or real-time integration points.

They don't need to worry about where the data resides within Hadoop. They'll get the data via their batch, real-time, or interactive access point, based on what they need. YARN will take advantage of moving that data in and out of those applications. Streaming is just one way of moving data into Hadoop. That's very common for sensor data. It’s also a way to move it out. SQL is a way, among others, to move data.

Gardner: So this is giving us choice about how to manage larger scales of data. We're seeing choice about the way in which we access that data. There's also choice around the type of the underlying infrastructure to reduce costs and increase performance. I am thinking about in-memory or columnar.

What is there about the Hadoop community and Hortonworks, in particular, that allows you to throw the right horsepower at the problem?

Ferguson: It was very important, from Hortonworks' perspective from day one, to evolve the Hadoop technology as fast as possible. We decided to do everything in open source to move the technology very quickly and leverage the community effect of open source, meaning lots of different individuals helping to evolve this technology fast.

The ability for the ecosystem to easily and optimally integrate with Hadoop is important. So there are very common integration points. For example, for systems management, there is the Ambari Hadoop services integration point.

Whether it's an HP OpenView or System Center in the Microsoft world, that allows it to leverage, manage, or monitor Hadoop along with other IT assets that those management technologies integrate with.

Access points

Then there's SQL access via Hive, an access point to allow any technology that integrates with or understands SQL to access Hadoop.

Storm and Spark are other access points. So, common open integration points well understood by the ecosystem are really designed to help optimize how various technologies at the virtualization layer, at the operating system layer, data movement, data management, access layer can optimally leverage Hadoop.

Gardner: One of the things that I hear a lot from folks who don't understand yet how things will unfold, is where data and analytics applications align with the creation of other applications or services, perhaps in a cloud setting like a platform as a service (PaaS).

It seems to me that, at some point, more and more application development will be done through PaaS with an associated or integrated cloud. We're also seeing a parallel trajectory here with the data, along the same lines of moving from traditional systems of record into relational, and now into big data and analytics in a cloud setting. It makes a lot of sense.

I talk to a lot of people about that. So the question, Mitch, is how do we see a commingling and even an intersection between the paths of PaaS in general application development and PaaS in BI services, or BI as a service, somehow relating?

Ferguson: I'll answer that question in two ways. One is about the companies that are using Hadoop today, and using it very aggressively. Their goal is to provide Hadoop as a service, irrespective of whether it's on premises or in the cloud.

Then we'll talk about what we see with HP, for example, with their whole cloud strategy, and how that will evolve into a very interesting hybrid opportunity and maybe pure cloud play.

When you think about PaaS in the cloud, the majority of enterprise data today is on premises. So there's a physics issue of trying to run all of my big data in the cloud. As a result, what a number of people are doing with this concept is called the data lake. They're provisioning large Hadoop clusters on premises, moving large amounts of data into this data lake.
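
To make the "physics issue" concrete, here is a rough back-of-the-envelope sketch of how long it would take to move a large dataset from on premises to the cloud over a network link. The dataset size (1 PB) and link speeds (1 and 10 Gbps) are illustrative assumptions, not figures from the interview, and the function name `transfer_days` is hypothetical.

```python
# Back-of-the-envelope estimate of bulk data transfer time.
# All sizes and link speeds below are assumed values for illustration only.

SECONDS_PER_DAY = 86_400
PETABYTE = 10**15            # bytes, decimal petabyte (assumption)
GIGABIT_PER_SECOND = 10**9   # bits per second


def transfer_days(data_bytes: float,
                  link_bits_per_second: float,
                  efficiency: float = 1.0) -> float:
    """Return the days needed to move data_bytes over a link.

    efficiency is the fraction of the nominal link rate that is
    actually sustained (1.0 means a perfectly utilized link).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    seconds = (data_bytes * 8) / (link_bits_per_second * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for gbps in (1, 10):
        days = transfer_days(PETABYTE, gbps * GIGABIT_PER_SECOND)
        print(f"1 PB over a {gbps} Gbps link: about {days:.1f} days")
```

Under these assumptions, a petabyte takes on the order of three months at 1 Gbps even with a perfectly utilized link, which is one way to read the argument for keeping the data lake close to where the data is created.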

That's providing data as a service to those business units that need data in Hadoop -- structured, semi-structured, unstructured for new applications, for existing analytics processes, for new analytics processes -- but they're providing effectively data as a service, capturing it all in this data lake that continues to evolve.

Think about how companies may then want to leverage a PaaS. It's the same thing on premises. If my data is on premises, because that's where the physics requires it to be, I can leverage various development tools or application frameworks on top of that data to create new business apps. About 60 percent of our initial sales at Hortonworks are new business applications by an enterprise. It’s business and IT being involved.

Leveraging datasets

Within the first five months, 20 percent of those customers begin to migrate to the data-lake concept, where now they are capturing more data and allowing other business entities within the company to leverage these datasets for additional applications or additional analytics processes. We're seeing Hadoop as a service on premises already. When we move to the cloud, we'll begin to see more of a hybrid model.

We are already starting to see this with one of Hortonworks' large partners, where you move archive data from on premises into low-cost storage in the cloud. I think HP will have that same opportunity with Hadoop and their cloud strategy.

Already, through an initiative at HP, they're providing Hadoop as a service in the cloud for those entities that would like to run Hadoop in a managed service environment.

That’s the first step of HP beginning to provide Hadoop in a managed service environment off premises. I believe you'll begin to see that migrate to on-prem/off-prem integration in a hybrid opportunity. In some companies, as their data moves off prem, they will just want to run all of their big-data services, or have Hadoop as a service, running completely in the HP cloud, for example.

Gardner: So, we're entering an era now where we're going to be rationalizing how we take our applications as workloads, and continue to use them either on premises, in the cloud, or hybrid. At the same time, over on the side, we're thinking along the same lines architecturally with our data, but they're interdependent.

You can’t necessarily do a lot with the data without applications, and the applications aren’t as valuable without access to the analytics and the data. So how do these start to come together? Do you have a vision on that yet? Does HP have a vision? How do you see it?

Ferguson: The Hadoop market is very young. The vision today is that companies are implementing Hadoop to capture data that they were just letting fall on the floor. Now, they're capturing it. The majority of that data is on premises. They're capturing that data and they're beginning to use it in new business applications or existing analytics processes.

As they begin to capture that data, as they begin to develop new applications, and as vendors like HP working in combination with Hortonworks provide the ability to effectively move data from on premises to off premises and provide the ability to govern where that data resides in a secure and organized fashion, you'll begin to see much tighter integration of new business or big-data applications being developed on prem, off prem, or an integration of the two. It won't matter.

Sponsor: HP.

Monday, November 24, 2014

HP simplifies Foundation Care Services to deliver just-in-time, pan-IT tech support

Much of the attention to coping with mega IT challenges such as cloud, bring your own device (BYOD), mobile applications, and big data focuses on adoption and implementation strategy. Yet the added complexity and requirements of how to support these technologies once they are in place has now also become top of mind.

So how do enterprises deliver improved user experiences, leverage new reactive support tools and diagnostics, and increasingly rely on self-help and automation to keep their far-flung systems and services fully functional?

BriefingsDirect recently sat down with an HP Technology Services Executive to chart a better path to simplified, just-in-time, and pan-IT support improvements -- despite dynamic and complex IT environments. Lou Berger, Vice President of Technology Services Enablement and Readiness in the HP Enterprise Group, took some questions from me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: What are some of the key trends and drivers that are impacting the reactive IT support services market?

Berger: Data center managers and CIOs are entrusted with managing the current legacy environment they have while transitioning to address all the new trends: cloud, mobile, big data, and BYOD. They're all asking the data center to change and transform to address these things.

These new workloads are different, requiring new infrastructure and new strategies that are unpredictable. They're changing quickly. And, by and large, hybrid cloud, with its increased complexity of hosting versus the legacy infrastructure, is impacting the decisions and the requirements CIOs have to manage in their environments. This is obviously giving them more choices, but also adding more complexity.

The current state of affairs for a CIO is a very complex mix of technologies, supporting the old, while developing the new. They have new solutions that they're building on their own. They're buying solutions, converging their infrastructure, and really being asked to make choices now that are going to lead them into the future.

Gardner: And is this the case, Lou, for both enterprises as well as SMBs? Is there any difference between those two markets when it comes to IT support?

Impactful decisions

Berger: Not at all. The decisions are the same. The SMBs are looking to the future to save and optimize their environment and making the same exact decisions that the enterprises are making. Perhaps they're moving at different speeds, some more agile and innovative than others, but everybody is being forced to make the same decision.

Gardner: It seems that expectations have changed. End-users, from their consumer devices or at-home systems, are used to getting rapid support and help. Have the expectations of the end-user shifted?

Berger: In the new world, always-on is really the keyword, and data center managers' service-level agreements (SLAs) with their customers, the end-user, are at a much higher level. Access is always expected to be there for the full community of users, from the developers, to the actual customers, and then the end-users on the outside. The world today is 24x7 and with all the changes happening at the same time, they need to support that environment.

Gardner: So we're all adapting, we're all changing, and HP has adapted and changed as well. Maybe you could fill us in a little bit at a high level of what has changed with Foundation Care Services, and then also how reactive support fits into a wider panoply of all support choices?

Berger: At HP, we took a hard look at our service portfolio, not only Foundation Care, but across the whole portfolio, and we looked at how we were addressing customers’ needs in the current environment, and how we needed to look forward as the world was changing to meet the needs we just discussed. We totally revamped our portfolio about two years ago to really enable this new style of IT.

The first thing we did was simplify. CIOs have to make very difficult choices based on meeting the SLAs of the customers, of the environment, of each solution and each component of that solution, and then balance that against the cost of those things. We took a look at our portfolio and simplified it.

The aim was to make the choices easier for the CIOs. We broke the portfolio into three basic portfolio items. One was Foundation Care, the base of all service, the reactive parts of the service, the first decision the CIO has to make.

Second was adding Proactive Care, the ability for CIOs to add and make the decision of how much proactive support they wanted to add for the specific environment and the solutions they were building.

Finally, built on that, Datacenter Care, which combines all the options that we can make available to a customer to tailor to their specific needs for either their solutions or environment.

Simplifying support

When we talk about Foundation Care, we looked at our portfolio and we realized it was extremely complex for something that seems as simple as reactive support. We had over 18 offerings that we were making available for customers, adding confusion to the decision-making process. Then finally we looked at our SLAs to them, and how they used those in combination to manage complex environments.

In our new Foundation Care portfolio, we've narrowed it down to five offerings only, with three response choices for the customer to decide. This way, a customer can make very easy choices to understand the more fundamental decisions they need to make on reactive support -- what response time they want, what coverage window they want, and the length of term they want for that service before they review for renewal. It's a very simple decision-making practice.

We then took a look at our Foundation Care, and what customers required to manage those environments and made those things available through our call centers and our portals. So customers can understand very easily on a component level or across their environment what’s available through the services they've already been provided and with the SLA we have with them, so they can manage their environments.

Customers can also use their mobility tools to assess and understand the exact state of their environments, or of the devices that they have connected to us and our support. That connection is something we highly recommend because of the value it brings.

So we removed complexity and we provided management and operational tools for customers to use on this foundational service.

Gardner: This sounds like it aligns very well to some of these trends we mentioned -- consumer behavior and expectations. Many people like the idea of self-help, of getting the right information that they can act on. Of course, they like to get it on a mobile device, which gives them flexibility and that 24x7 ability to track and manage.

Lou, one of the things that seems different nowadays is the ability for automation to play a larger role. How are your customers and HP adjusting to trying to automate some of these things, maybe through alerts and notifications, maybe through remote access in understanding systems regardless of where they are? What's the newest on that level, that automation capability?

Berger: As you know, HP has always invested heavily in the connected devices, the ability for us to securely connect to a customer’s environment, each one of those devices, and monitor. For those devices that we monitor this way, our time-to-repair is significantly faster than a straight call-in with no device.

That connectivity allows us to do much more than that. It allows us to communicate information that we're capturing for the customer to actually see, using mobility devices, on the health and the state of devices themselves and, in many cases, the configuration.

It allows us to understand failures, repair them quickly on behalf of customers, and notify customers of an issue so that we can work with them to repair.

Connected experience

These tools allow us to understand the state of their environment. So as we move up the proactive stack, we can help them understand and do preemptive maintenance. Based on the devices, we can make recommendations on upgrades to firmware and software, and on compatibility across the key parts of the environment and the whole solution, and so help keep their devices healthy.

The connected world is a key part of our strategy in helping customers manage through the complexity of new environments. Of course, the information we track becomes available to customers to help them manage their environments independently, including their SLAs, their contractual information, and the broader environment.

Gardner: And not only are we dealing with rapid change and complexity, but heterogeneity remains with us, as it has all along. When we talk about doing this support with updates, patches, and firmware, we are not just talking about one company or one vendor. We're talking about whatever your environment has and whatever you need. Is that not correct?

Berger: That’s very correct. When HP develops a compatibility matrix, these are the things we apply in helping customers be preemptive and make the decisions on the best way of managing their environment and staying up-to-date in the healthiest way.

Gardner: So you would be able to cover the entire fabric of your environment, not just parts and pieces, and that’s essential? You can’t have those cracks where things fall between or where patches don’t get made. That’s where these real problems can arise.

I have to also imagine, Lou, that this has the interest of the security and the governance, risk and compliance (GRC) people. This is another way for them to get assurance that things will continue not only performing, but performing securely. How do GRC and security fit into the services portfolio?

Berger: First -- and it's most relevant when we talk about security -- our connectivity is highly secure. It's been tested, agreed, and approved across every tier of business and every type of business, from the financial industries, to the government agencies. These have set the bar very high for security. So you can rest assured that our connectivity is a very secure and comfortable connection.

The compliance of environments in this new world is imperative. As CIOs decide, at the product, solution, or environment level, how they meet their SLAs, they also decide how many proactive elements they want to add to that support. Providing these types of reports, or an enhanced call experience, across the environment, rather than at the piece level, adds to our ability, and the customer’s ability, to manage the environment to those compliance levels.

Again, the goal of the new portfolio was to simplify and make clear what each level of the portfolio gave in deliverables, and how that translates to value and the ability for the CIO to make these decisions and then meet their compliance requirements.

We stage our portfolio in a way that allows CIOs to make the right decisions to meet their compliance and security needs at the optimum cost for them.

Information is key

Gardner: So a key to good support, of course, is getting the right information to the right people in the right time-frame. We've talked a bit about the timing being very rapid, and the means to get that information being somewhat automated, with more mobility. But the information is still key.

So how do we improve the information flow? I understand the HP Support Center has been revamped to a certain degree as well. So that part of the equation, the information, is also rich, up-to-date, and easily available?

Berger: At a Foundation Care-level support, a customer has the option to only call the call center on a problem to be fixed and get the full support experience that comes from there. They also have access to our product pages, where they get specific information, access to our drivers, software and firmware, and the ability to download software, firmware and drivers on their own, which often includes both fixes and new features and functionality.

They have the ability to search the HP Support Center, which has all the content repositories for answers to support questions, and guided troubleshooting, which provides a step-by-step way for our customers to self-heal. And the Support Community and our HP Forums allow our customers to interact with peers and learn how others dealt with issues and best practices.

You have the Support Case Manager, where a customer can call in at any time and understand exactly what the state of open cases is. So if a case is in progress of being fixed, they can call in. Or they can use the mobile app, which allows automated updates.

In addition, we have 24x7 chat from our HP Support Specialists, which is available either from the mobile app or from a PC. There is also the full suite of solutions and technical manuals that are available to a customer for support.

Gardner: Of course, HP being a global company means that these services are available around the world, with localization issues managed. What’s the breadth and depth in terms of that applicability to different markets and different languages?

Berger: HP’s greatest strength for a global customer is its 24x7 worldwide support. We have Support Centers in every region. We have local language support for all our customers in every country necessary. We have the full suite of access and the same customer experience in any place in the world. That is the strength of HP.

Gardner: Okay, we've talked a lot about what it does. I think it's always great to show in addition to tell. Do we have any examples where we can point to an organization, large or small, one market or another, and demonstrate how they're using the simplified Foundation Care Services, getting some benefits, making sure that all the systems are up and running, and if not, the fix is in right away?

Foundation Care services

Berger: Foundation Care, our reactive services, is the base of all services. I will stick to that as an example. The first is a UK-based IT service company, a holding company for a group of companies involved in the provision of real-time monitoring systems and data management services, specifically for the UK's leisure and forecourt petrol services.

The customer was looking to upgrade their IT infrastructure to handle growth in their customer demand. Our solution from the product side was to deploy converged infrastructure with HP Blades and Virtual Storage.

The customer's requirements were met with HP Foundation Care Support. This is a very stable environment. It’s a converged infrastructure, but there are times when an anomaly can arise.

For example, in a connected world, the customer’s storage device sent a message stating that a drive was about to fail. Under Foundation Care, the drive was sent to the customer, preventing an issue before it happened. It's a different experience from many of our competitors, because we monitor the converged infrastructure and we take proactive actions versus waiting for the problem to occur.

So we recognized an issue. We proactively notified the customer. We sent the fix or sent a CE to fix their problem. We helped this customer meet their SLA of 99.98 percent uptime. In this case, we gave them 100 percent uptime.

A quote from the customer, “HP Support was fantastic. We were protected all the way through the support processes.”
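
For readers who want to translate an uptime SLA such as the 99.98 percent figure above into a downtime budget, the arithmetic can be sketched as follows. This is a minimal illustration that assumes a 365-day measurement period; the actual measurement window in the customer's contract is not stated in the interview, and the function name `downtime_budget_hours` is hypothetical.

```python
# Convert an uptime percentage into the downtime it permits.
# Assumes a 365-day year; real SLAs may measure monthly or quarterly.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours


def downtime_budget_hours(uptime_percent: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime allowed by an uptime SLA over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * period_hours


if __name__ == "__main__":
    hours = downtime_budget_hours(99.98)
    print(f"99.98% uptime allows about {hours:.2f} hours "
          f"({hours * 60:.0f} minutes) of downtime per year")
```

Under that assumption, 99.98 percent uptime leaves roughly 1.75 hours of allowable downtime per year, so delivering 100 percent means none of that budget was used.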

Gardner: Any other examples?

Berger: Sure. In this case, we helped a customer shift their focus from maintenance to strategic activities. HP offered a differentiated support experience by providing proactive alerts to flag potential issues.

The customer in this case is an underwriting services firm that uses proprietary databases and algorithms to estimate people’s life expectancy based on their medical records. The customer had performance issues with the large amounts of data on different servers, with various hard drive configurations and several direct-attached devices for storage.

The resolution was to modernize their data center, where we worked closely with the customer, consolidating servers and storage using server virtualization and SAN technology. We installed ProLiant Server and 3PAR Storage and the customer purchased Foundation Care 24x7 support services.

The benefits were that centralized storage provided reliability and productivity. The customer's IT staff previously spent about 70 percent of their time dealing with infrastructure. Now, they spend only 20 percent. That's 50 percentage points of their time saved.

With Foundation Care support, they now manage availability better, with proactive support alerts on potential issues, and they focus on improving applications rather than on failures.

Managing costs better

Gardner: Lou, I've been tracking enterprise IT for quite some time now, and the question always comes up, "What do you get for your dollar or your peso or your Euro?" I have always had trouble coming up with a return on investment (ROI) or total cost of ownership (TCO) formula for some aspects of IT, for example, investing in modernization of a data center.

It's more the soft quality-assurance issues, but it seems to me, the economics of something like technology services and Foundation Care in particular is pretty straightforward.

What do you tell people when they ask you about the ROI here? It seems that if you catch one big issue in an always-on environment, that can really save you a great deal of money very rapidly.

Berger: In any industry, an outage translates to revenue and cost, besides the customer satisfaction issues and everything else. There are studies that go back and say that in some industries, an outage, a long outage can actually put a company out of business in a very short amount of time.

But this does play very closely into the decision the CIO must make when choosing support: understanding the impact on their environment and on the crux of the business if they don't meet those needs.

In the Foundation Care Services portfolio we have three response levels with most customers. So Call-to-Repair is the highest level of service. Very fast response time is critical to the business. We commit to a six-hour call-to-repair.

It's our broadest coverage. We have 24x7 coverage, and generally take four hours to fix a customer’s problem, with full access to our Support Centers and on-site service as part of the coverage.

Most economical would be Foundation Care Next Business Day, with coverage from 8 a.m. to 5 p.m., Monday to Friday. So a CIO can make decisions, based on the SLA they have and the impact to the business, whether critical or not, and apply these very simple service choices -- rather than the 18 we had before.

Gardner: So even though you've simplified, you still have the benefit of one size doesn't need to fit all. For example, I might have a set of applications or even a small rack or data center that doesn't require that higher level of oversight, and I might want to tier this. That gives me a lot more flexibility, and therefore I can manage my costs better. Is that the case?

Berger: That’s exactly the reason we did it. A data center manager or a CIO can make the right decisions, at the right cost profile, to meet business needs. Then, using the tools we provide, they can see exactly what is covered for each one of those devices at any time, so they can understand whether they are still meeting those needs as times change and customs change.

Looking to the future

Gardner: Looking to the future a little bit, Lou, as we mentioned at the beginning, we have a lot of change. You mentioned it earlier, but we're looking at a lot more converged-infrastructure capabilities, particularly for big data.

We're looking at more use of hybrid and more types of cloud, platform as a service (PaaS), software as a service (SaaS), moving workloads from cloud to cloud, if we can do that in the future; the Internet of Things; the scale of the data and the amount of data and streaming data rather than static or batch data.

How do these things come to bear? What is your vision for how technology services adjust, given what we're expecting to happen over the next several years?

Berger: I hope you can see from the way we developed our portfolio that our Foundation Care Services allow the customer to make the most basic decisions on these requirements.

We added Proactive Care service, which allows the customer to add further coverage based on the same parameters, adding preemptive support for those areas, environments, and solutions that require a greater uptime, a greater sense of security, and an enhanced call experience that includes solutions support.

Proactive Care service allows a customer to call in across a variety of products. We own that problem and solve the problem across the solution.

Then there is our Datacenter Care service, which builds on the Foundation Care and Proactive Care services and allows the customer to add elements specific to their requirements, many of them now being built specifically for the new style of IT.

We also have a Cloud Hybrid Support offering, specific to this new style of IT. There are also different opportunities for customers to translate CAPEX into OPEX through support offerings, because many of the customers who are building on-premise clouds and converged infrastructure want the same experience from a financial point of view as moving to a hosted service. We built that into our Datacenter Care service.

As we move forward, the new style of IT, and then DevOps, requires agility, velocity, innovation, and continuous service. We're tailoring new offerings specific to that audience and to those requirements, and we will partner closer and closer with customers on meeting those specific needs, as always built on Foundation Care support for their environment, too.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.

Friday, November 14, 2014

HP Analytics blazes new trails in examining business trends from myriad data

The next BriefingsDirect deep-dive big data thought leadership interview examines how HP analyzes its own vast data warehouses to derive new insights for its global operations, extensive supply chain, sales organization, global marketing groups, and customers.

We'll explore how the Analytics Group at HP, based in India, sifts through myriad internal data sources, as well as joins with other public data sets, to deliver entirely new intelligence value that helps make business more responsive and efficient.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn how, BriefingsDirect sat down with Pramod Singh, Director of Digital and Big Data Analytics at HP Analytics in Bangalore, India, at the recent HP Big Data 2014 Conference in Boston. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us a little bit about the Analytics Group at HP, what you do, and what’s the charter of your organization.

Singh: We have a big analytics organization in HP. It’s called Global Analytics, and it serves the analytics needs for most of HP. About 80 to 90 percent of the analytics happening inside of HP comes out of this ecosystem. We do analytics across the entire food chain at HP, which includes the supply chains, marketing, and sales.

What I personally lead is an organization called Digital Analytics, and we are responsible for doing analytics across all digital properties for HP. That includes eCommerce, social media, search, and campaign analytics. We also have a Center of Excellence for Big Data Analytics, where we're using HP’s big-data technologies, the framework called HAVEn, to help develop big-data solutions for HP customers, as well as for HP internally.

Gardner: Obviously, HP is a very large global company. What sort of datasets are we talking about here? What’s the volume that you're working with?

Data explosion

Singh: As you know, a data explosion is happening. On one end, HP has done a very good job over the last six to seven years of getting most of their enterprise data into something called an enterprise data warehouse. We're talking about close to two petabytes of data, which is structured data.

The great part of this journey is that we have taken data from 700-800 different data marts into one enterprise data warehouse over the last three to four years. A lot of data that is not part of the enterprise is also becoming an important part of making the business decisions.

A lot of the data I personally deal with in the digital space is what we call human-generated data, the social media data, which no enterprise owns. It’s open for anybody to use. What I've started to see is that, on one hand, we've done a really good job of getting data in the enterprise and getting value out of it.

We've also started to analyze and harvest the data that is out in the open space. It could be blogs, Twitter feeds, or Facebook data. Combining that is what’s bringing real business value.

The Global Analytics organization is more than 1,000 people spread through different parts of the world. A big chunk of that is in Bangalore, India, but we have folks in the US and the UK. We have a center in Guadalajara, Mexico and a couple of other locations in India. My particular organization is close to 100 people.

I have a PhD in pure mathematics, and before that I had an MBA in marketing. It's a little bit of an awkward mix. I got into the analytics space in the mid-'90s, working for Walmart.
I built out Walmart’s Assortment Planning System in the late '90s and then came to HP in 2000, leading an advanced data-mining center in Austin, Texas. From there I evolved into doing e-business analytics for a few years and then moved to customer knowledge management. I spent five years in IT developing an analytics platform.

About a year and a half ago, I got an opportunity to lead the big-data practice for this organization called Global Analytics. In five years, they had gone from five people to more than 1,000 people, and that intrigued me a lot. I was able to take the opportunity and move to India to go lead that team.

More insights

Gardner: Pramod, when we look back into this data, do you gain more insights knowing what you're looking for, or not knowing what you're looking for? What kind of insights were the unexpected consequences of your putting together this type of data infrastructure and then applying big-data analytics to it?

Singh: We deal with that day-in and day-out. I’ll give you a couple of examples there. This is something that happened about three or four years ago with HP. We were looking at a classic problem in marketing to US small and medium-sized businesses (SMBs). We had a fixed budget for marketing, and across the US, there are more than 20 million SMBs. The classic definition of an SMB is any business with 100-500 employees.

HP had an install base of a small part of that. We realized that this particular segment, SMBs, is squeezed between the classic consumer, where you can do mass marketing, such as TV advertising, and the enterprise, where you can actually put bodies, your people who have relationships. SMBs are squeezed in between those two extremes.

On one hand, you can't reach out to every single one of them. It’s just way too expensive to do that. On the other hand, if you try to go do the marketing, you don’t get the best out of it.

We were starting to work on something like that. I was approached by a vice president in marketing who said revenues are declining and they had a limited marketing budget. They didn’t know what to do.

This is where one of those unexpected things came in. I said, "Let’s see in that install base whether there are different segments of customers that are behaving differently." That led us on kind of a journey where we said, well, "How do we start to do that right? Let’s figure out what are the different attributes of data that I can capture."

On one hand, if you look at SMBs, you can capture who they are, what industry segment they're in, how many employees they have, where they are based, who the CEO is. It's what we call firmographics.

On the other hand, you have classes of data involving their interaction with HP. It could be things like how many PCs or servers they bought, how long ago they bought them, how much money they spent, the whole transactional aspect of it.

Then, there are some things that are derived attributes. You may be able to derive that in the last one year they came to us four times. What interaction did we have on the website? For example, did they come to us through a web channel? If they did, how many email offers were sent to them? How many of those were clicked? How many of those converted? Those are the classes of data that we could capture.

The question then became what do we do with that? Again, when you do data mining and analytics, you may not know where this will lead you.

Mathematical modeling

We thought that maybe there are different classes of customers. We pulled our data together and started to do mathematical modeling. There are techniques called clustering, analytical techniques called K-Means, and things like that. We started to get some results and to analyze them. In this type of situation, we have to be careful, because there are some things that may look mathematically correct, but may not have real business value behind them.

Once we started to look at those things, we went through multiple iterations. We realized that we were not getting segments or clusters that were very distinct. One day, I was driving home in Austin, and I said, "You know what? Who they are I don’t control, but as far as what they're doing with HP we have a reasonably good understanding."

So we started to do clustering based only on those attributes, and that’s where an "aha" moment came. We started to find these clusters, which we call segments, and we eventually found a cluster, about 7 to 8 percent of the population, that brought in 45 percent of revenue.
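
The segmentation step described here can be sketched roughly as follows. This is a minimal illustration only, not HP's actual model: the customer records, the two behavioral attributes (orders per year and annual spend in thousands of dollars), and the choice of two clusters are all made-up assumptions, and K-means is written in plain Python to keep the sketch self-contained. Real data would also need its attributes scaled to comparable ranges before clustering.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain K-means. Seeds centroids with the first k points, so results are deterministic."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

def segment_shares(points, labels, k):
    """For each segment, return (share of customers, share of total spend)."""
    total_spend = sum(p[1] for p in points)
    shares = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        shares.append((len(members) / len(points),
                       sum(p[1] for p in members) / total_spend))
    return shares

# Hypothetical customers: (orders per year, annual spend in $ thousands).
customers = [(1, 2), (12, 90), (2, 3), (1, 1), (2, 2), (11, 110),
             (3, 4), (1, 3), (2, 1), (3, 2), (1, 2), (2, 4)]

centroids, labels = kmeans(customers, k=2)
shares = segment_shares(customers, labels, k=2)
for c, (population, spend) in enumerate(shares):
    print(f"segment {c}: {population:.0%} of customers, {spend:.0%} of spend")
```

In this toy data set a small segment accounts for most of the spend, which is the kind of pattern the interview describes. The specific percentages here come from the invented numbers and say nothing about HP's real customer base.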

The marketers started to say that this was a gold mine. That’s what we never expected to happen. We put together a structure. Once we figured out these four or five clusters, we tried to figure out why they were clustered together. What’s common?

We set up a primary research effort, in which we took a random sample out of each one of those clusters, interviewed those customers, and were able to build a very good profile of what these segments were.

There are 20 million SMBs in the US, and we were able to build a model to predict which of these prospects are similar to the clusters we had. That’s where we were able to find customers that looked like our most profitable customers, which we ended up calling Vanguards. That resulted in a tremendous dollar increment for HP. It's a good example of what you talked about, finding unexpected things.

We just wanted to analyze data. It led us on a journey, and we ended up finding a customer group we weren't even aware of. Then, we could build a marketing strategy to actually go target those customers and get some value out of it.

Gardner: At the Big Data Conference, I've spoken to other organizations who are creating an analytics capability and then exposing that to as many of their employees as possible, hoping for this very sort of unexpected positive benefit. Is there a way that you're taking your analytics either through visualization or tools and then allowing a larger population within HP to experiment with it?

Singh: We're trying to democratize the analytics as much as we can. One thing we're realizing is that to get the full value, you don't want data to stay in silos. So there are a couple of things you have to do. In terms of building out an ecosystem where you have a good set of motivated people and where you can give them a career path, we have created this organization called Global Analytics. You get a critical mass of people who challenge each other, learn from each other, and do a lot of analytics.

But also it’s very important that on the consumption side of it, you have people who are analysts and understand analytics and get the best value out of it. So they try to create that ecosystem. We have seen both ends of it.

Good career path

If you just give them to one data miner or analytics person in one team, sometimes the person does not find an ecosystem to challenge himself or herself. We're trying to do it on both sides of the fence, so that we can provide people with a good career path.

Hiring these folks is not easy. Once you've hired them, retaining them is not easy. You want to make sure to create an ecosystem where it’s challenging enough for these people to work. It also has to be an ecosystem where you continually challenge them and keep training them.

The analytical techniques are evolving. When I started doing it, things were stable for years. Now, the newer class of data is coming in, newer techniques are coming in, and newer classes of business problems are coming in. It’s very important that we keep the ecosystem going. So we try to do it on both sides.

Gardner: Very interesting. HP, of course, has its own line of products for big-data analysis. You're such a large global enterprise that you're doing lots of analysis, as any good business should, but you're also being asked to show how this works. Are there some specific use cases that demonstrate for other enterprises what you've learned yourselves?

Singh: There are several that we can talk about. One is in the social media space. I briefly talked about that. My career evolved doing analytics on what I call "data inside the enterprise." But, over the last couple of years, we started to go look at data outside the enterprise.

Recently we went and looked at a bank. We were able to harvest data from the Internet, publicly available data like Glassdoor, for example. Glassdoor is a website where employees of a company can put their feedback, talk about the company, and rate things.

We were presenting it to the executives of this particular bank and we were able to get all the data and tell them the overall employee morale. We figured out that the work-life balance for the employees wasn't very good.

The main component that the employees weren't happy about was their leave policy and their vacation policy. We drilled down and figured out that the bankers seemed to be fairly happy, but the IT guys and analysts weren't very happy. Again, this is one example where we didn't ask for a line of data from the customer. This data is publicly available. You and I, or anybody else, can go get it. I can do that same analysis for HP or any other company.

That’s where I believe the classes of analytics we're doing are changing. A lot of times, your competitive differentiator is the ability to do things with that data. Data is a corporate asset and it will be, but this class of what we call user-generated data is changing analytics as a whole. The ability to go harvest it and, more importantly, get value out of it will be the competitive differentiator.

Gardner: Any other use cases that demonstrate the power of a particular type of platform, let’s say Vertica in HAVEn, where you've got the power of a columnar architecture and you've got the ability to bring in unstructured data from Autonomy? Maybe there are a couple of use cases that demonstrate the unique attributes of HAVEn when it comes to inclusivity and the comprehensive nature of information today?

Game changer

Singh: Let me talk about a couple of the things that happened in the HAVEn ecosystem. One of the main workhorses in HAVEn is our massively parallel database called Vertica. In addition to being a database where we can ingest large volumes of data very quickly and get strong query performance, the game-changer for me as an analytics practitioner has been the ability to do analytics in-database.

If I look at my career over the last 20-22 years, most of the time what happens in the analytics space is that you have data residing in a database or an enterprise data warehouse. When you want to build a model, you take the data out and use an analytics platform like SAS, R, or SPSS. You do something there and you either bring the data back into the environment or you run the models and publish them out.

What Vertica has done that's unique is give us a framework. Through the UDEF framework, we can build a data mining model, run it directly on the database engine, and take the output out.

An example we took to HP Discover a couple of months ago was trying to predict a failure of a machine before the actual failure happens. HP has these big machines and big printers, which are very expensive.

Like a lot of high-end devices these days, they send out a lot of data. They send out data about when you're using a machine. The sensors send out a lot of information, maybe the pressure of the valves, the temperature they're operating in, the kind of throughput they're giving you, or the number of pages you've printed.

Also, they give you data on the events when the machine was not performing optimally or actually failed. We were able to ingest all that data, put it into the Vertica platform, and build predictive models using the open source R language. We built a model that can predict the failure of a machine.

Looking at each component of failure, we could predict, with a certain probability, when the machine will fail, so our service reps can actually be proactive and not wait for the machine to fail. That's one example of doing in-database data mining using Vertica.
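
As a rough sketch of what scoring a machine "with a certain probability" might look like, the Python snippet below applies a logistic model to sensor readings and flags machines for a proactive service visit. It is an assumption-laden illustration: the interview says the real models were built in R and run inside Vertica, and the feature names, coefficients, threshold, and readings here are hypothetical rather than taken from HP.

```python
import math

# Hypothetical coefficients for illustration only; a real model would be
# fitted to historical sensor readings and recorded failure events.
INTERCEPT = -6.0
WEIGHTS = {
    "temperature_c": 0.05,
    "valve_pressure_bar": 0.4,
    "pages_printed_k": 0.01,
}

def failure_probability(reading):
    """Logistic model: map a weighted sum of sensor values to a 0-1 probability."""
    z = INTERCEPT + sum(WEIGHTS[name] * reading[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def machines_to_service(readings, threshold=0.5):
    """Return IDs of machines whose predicted failure probability meets the threshold."""
    return [machine_id for machine_id, reading in readings.items()
            if failure_probability(reading) >= threshold]

# Made-up readings for two presses.
readings = {
    "press-A": {"temperature_c": 40, "valve_pressure_bar": 2, "pages_printed_k": 50},
    "press-B": {"temperature_c": 85, "valve_pressure_bar": 6, "pages_printed_k": 120},
}

for machine_id, reading in readings.items():
    print(machine_id, round(failure_probability(reading), 3))
print("schedule proactive visit for:", machines_to_service(readings))
```

The design point is the one made in the interview: once each machine has a probability attached, service reps can act on a threshold instead of waiting for the failure itself.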

Another example used more components around the social-media space. One of the problems in the social-media space, and I think you guys are probably familiar with this, is finding influencers.

I gave a talk yesterday around figuring out how you do that. There are classical, one-dimensional ways, such as going by the number of followers or retweets you have. By that measure, Barack Obama or Lady Gaga would be big influencers, but Barack Obama may not be a very big influencer on cloud computing for HP.

So you build those classes of algorithms. My team has actually built out three patented algorithms to figure out how to identify influencers in the space. We've actually built out a framework where we can source that data from the social-media space and drop it into a Hadoop kind of environment.

We use Autonomy to enrich and put some sentiments to it and then drop the data into the Vertica environment. In that Vertica environment, you run the compressed algorithms and get an output. Then, you can score and predict who the influencer is for the topic you are looking for.

Influencers

I gave the example of Barack Obama, in general a big influencer, but he is not an influencer for all topics. Maybe in politics or the US government he's a big influencer, but not for cloud computing. Influence is also a function of time. Somebody like Diego Maradona probably was a big influencer in soccer in the ’90s, but in 2014, not that much.

You have to make sure that you can incorporate those as part of the logic of your algorithm. We've been able to use the multiple components of HAVEn and build out a complete framework where we can tell numerically who the main influencers are and how influential they are. For example, if you get a score of 93 and I get a score of 22, you are roughly four times as influential as I am.
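
The two ideas above, that influence depends on topic and fades with time, can be illustrated with a small Python sketch. This is not one of the patented algorithms mentioned in the interview, whose details are not given. The scoring formula, the two-year half-life, and the sample posts are all hypothetical.

```python
def influence_score(posts, topic, current_year, half_life_years=2.0):
    """Sum engagement on one topic, halving a post's weight every half_life_years."""
    score = 0.0
    for post in posts:
        if post["topic"] != topic:
            continue  # reach on other topics does not count toward this topic
        age = current_year - post["year"]
        score += post["engagement"] * 0.5 ** (age / half_life_years)
    return score

# Made-up accounts: a celebrity with huge political reach, and an engineer
# with modest but recent engagement on cloud computing.
celebrity = [
    {"topic": "politics", "year": 2014, "engagement": 1_000_000},
    {"topic": "cloud", "year": 2010, "engagement": 100},
]
engineer = [
    {"topic": "cloud", "year": 2014, "engagement": 500},
    {"topic": "cloud", "year": 2013, "engagement": 400},
]

for topic in ("cloud", "politics"):
    print(topic,
          "celebrity:", round(influence_score(celebrity, topic, 2014), 1),
          "engineer:", round(influence_score(engineer, topic, 2014), 1))
```

With these invented numbers the engineer outranks the celebrity on cloud computing while the celebrity dominates politics, mirroring the Barack Obama example. Scores on a common scale can then be compared as ratios, as with 93 versus 22.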

Gardner: For other organizations that are interested in learning more about how HP Analytics is operating and maybe learning from your example, are there any resources or websites we can go to, where you are providing more information about HP Analytics?

Singh: Definitely. We work through our partners in Enterprise Services. We have our own website as well. There are multiple ways that you can approach us. You can talk to the Vertica sales team and they can connect you to us. As I said, we do analytics for all of HP and for select customers. We do not have a direct sales arm of our own, so we work through our partners in Enterprise Services, as well as with the software team.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.

Tuesday, November 11, 2014

Vichara Technologies grows the market for advanced analytics after cutting its big data teeth on Wall Street

The next BriefingsDirect deep-dive big data benefits case study interview explores how Vichara Technologies in Hoboken, New Jersey is expanding its capabilities in big data from origins on Wall Street into other areas, and thereby demonstrating the growing marketplace for advanced big-data analytics services.

Using HP Vertica as a core big-data component has allowed Vichara to extend its easier-to-use financial modeling and tools, and then apply them to other industries such as insurance and healthcare.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn more about how advanced big data, cloud, and converged infrastructure implementations are expanding the impact and value of rapid and increasingly predictive analytics, BriefingsDirect sat down with Tim Meyer, Managing Director at Vichara Technologies at the recent HP Big Data 2014 Conference in Boston. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us how your organization evolved, and how big data has become such a large part of the marketplace for gaining insights into businesses.

Meyer: The company has its roots in analytics and risk modeling for all sorts of instruments that are used on Wall Street for predicting prices and valuation of instruments. As the IT infrastructure grew from Excel to databases and eventually to very fast databases, such as Vertica, we realized that there were many problems that couldn't be solved before, and that required way too long a time to answer.

Wall Street people measure time in seconds, not in hours. We've found that there's a great value in answering a lot of business intelligence (BI) questions -- especially around valuations and risk models, as well as portfolio management. These are very large portfolios and datasets that have to be analyzed. We think that this is a great use of big-data analytics.

Gardner: How long have you been using Vertica? How did it become a part of your portfolio of services?

Meyer: We've been using Vertica for at least two years now. It’s one of the early ones, and we recognized it as being one of the very fastest databases. We try to use as many of these components as possible. We really like Vertica for its capabilities.

Risk assessment

Gardner: Tim, this whole notion of risk assessment is of interest to me. I think it's coming to bear on more industries. People are also interested in extending from knowing what has happened to being able to predict, and then better prescribe new efforts and new insights.

Tell me about predictive risk assessment. How do you go about that, and what should other companies understand about that?

Meyer: Risk assessment comes about from starting to look at how prices fluctuate and how interest rates move, and thus create changes in derivatives. What has happened most recently is that a lot of the banks and hedge funds have recognized this. Not only is [predictive risk assessment] a business imperative for them to have that half-percent hedge, but there are also compliance reasons for which they need to predict what their business is going to look like.

There are now more and more demands on stress testing, as well as demands from international banking regulations, such as Basel III, that require that businesses such as hedge funds and banks not just look behind, but ahead at how their business is going to look in a year. So this becomes really very important for a host of reasons even more than just how your business is doing.

Gardner: If I were a business and wanted to start taking advantage of what's now available through big-data analytics -- and at a more compelling price and higher performance than in the past -- what are some of the first steps?

Do I need to think about the type of data or the type of risk? How do you go about recognizing that you can now get the technology to do this at an analytics level, but there is still the needed understanding of how to do it at the process and methodological level?

Meyer: We work very closely with our customers and try to separate algorithmic work from the development work. A lot of our customers have more than a few Caltech and MIT PhDs who do the algorithmic definitions. But all of them still need the engine, the machine with its scripting, and fast capability to build those queries right into the system as quickly as possible.

We usually work with these kinds of people, and it is a bit of a team-work effort. We find that that’s a way to figure out what is our value, and what is the value of our customer. Together, it has turned out to be very good teamwork.

Gardner: And you are a consultancy, as well as a services provider? Do you extend into any hosting or do you have a cloud approach? How do you manage the technology for the consulting and services you offer?

Broader questions

Meyer: We expand from the core products and tools into broader questions for people who want a proof of concept (POC) into this new technology. We build those on an ongoing basis. People, as well, want to look at options such as different performances of clouds. They do vary.

So we take on those kinds of consulting work as well, not to mention that sometimes it expands into back-office compliance and sometimes into billing issues. They all relate to the core business of managing portfolios, but yet they are linked.

Very often, we've done those kinds of projects and we see even more of these possibilities as we see compliance as a bigger issue, such as Dodd-Frank as well as Basel III, in the financial world. But they are really no different than many regulations coming on the healthcare side for paperwork management, for example.

Gardner: So that raises the question of the verticals that you expect first. Where is predictive risk assessment and the analytics requirements for that likely to appear first?

Meyer: One thing we have learned from our experience in financial modeling and tools is that there is always a need for people who are totally unskilled in SQL or other query languages to quickly get answers. Although many people have different takes on this, we think we've found some tools that are unique. And we think that these tools will apply to other industries, most particularly to healthcare.

These are big problems, but we think the way to approach them is to start small, with a POC or by defining a very small problem and solving it, and not trying to take a bite of the entire elephant, so to speak. We find that to be a much better approach to going into new segments, and we'll be looking at both insurance and healthcare as two examples.

Gardner: Back to the technology front. Are there any developments in the technology arena that give you more confidence that you can take on any number of data types, information types, and scale and velocity types?

I'm thinking of looking at either cloud or converged infrastructure support of in-memory or columnar architectures. Is there a sense of confidence that no matter what you go to bite off in the market, you have the technology, and the technology partner, to back you up?

Meyer: We're finding that there is much more maturity in a lot of database technologies that are now coming out.

There is always something new on the horizon, but there are, as you said, columnar architectures and so on. These are already here, and we're constantly experimenting with them.

To your point about cloud infrastructure and where that is going, it's the same thing. We see ParAccel, Amazon, and data warehouses such as Redshift showing us the way, with a lot of the technology becoming very prepackaged. The value-add is to talk to the customer and speed up that process of integration.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.
