Friday, February 19, 2016

Extreme Apps approach to analysis makes on-site retail experience king again

The next BriefingsDirect big-data use case discussion explores how technology providers have teamed as an ecosystem to deliver new dynamic and rapid analysis capabilities to the retail industry.

We’ll explore how the Extreme Apps for Retail initiative places new knowledge in the hands of on-site sellers -- to the customized benefit of shoppers at the very point of sale and in real time.

By leveraging the power of SAP HANA big-data software infrastructure, Hewlett Packard Enterprise (HPE) hardware, and Capgemini targeted analysis and intelligence, these Extreme Apps are designed to make the physical retail experience king by drawing on the best of online assets, all brought to enhance the user experience at the mobile edge.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

To learn more about how individual buyer information and group buying behavior inferences combine to customize the buying experience anywhere and anytime, please welcome Frank Wammes, Chief Technology Officer for Capgemini Continental Europe. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: We’ve had so much change over the past five years in retail. It’s a vertical industry that’s under lots of pressure with a need for innovation. What, in your mind, are the top trends driving this desire to use big data to better enhance users’ ability -- at the retail site -- to get customized buyer experiences and customized deals based on their individual needs and wants?

Wammes: Retail is indeed one of the industries most affected by the fallout of the 2007-2008 financial crisis, in which a lot of companies struggled. They ask: “Okay, how are we going to revive our business?” It's been an industry where you could see the winners and the losers very clearly. But there are a few things that everybody in the retail industry is now thinking about and needs to answer.

First of all, the big opportunity that retailers have is leveraging the whole big-data movement. There is so much data that retailers have about their customers and the consumers, structured and unstructured data, that they can benefit from. The only question is how they'll do that, and how they're going to make sure that all the data they process is turned into action in order to create better experiences in the store or on the website.

The second big trend is how to gain the loyalty of your buyers. We see that it’s very easy for consumers to switch between different brands, between different retailers, between different stores. Loyalty is something that came automatically in the past, but it will not come automatically now or in the future.

But if you give a client a real custom experience, and they know that every time they come to you they'll get the same experience and they’ll get the benefits because of their loyalty, they'll adapt their needs in real time and will keep their loyalty towards your brand. So the second question is how do you increase their loyalty.

And third, it’s really the combination of the online and physical retail experience, the omni-channel experience that people have. How do you make sure that during the buying journey of a customer, they continuously have the same experience?

Wowing the buyer

We always joke about this: think of the retailers in your own country. At how many of them, when you have bought something online and want to cancel it and buy something in the store instead, can you walk into that store, cancel the order, and take a physical good out with you? In 95 percent of cases, you can't. It’s very easy to surprise your customer if you can do it. So how can you wow your buyer and give them the real experience?

Those are the three big things: leveraging all the data to increase the loyalty in both online and offline worlds.
Learn More
On Extreme Applications for Retail
Built on SAP HANA
Gardner: It’s interesting that we're using big data and intelligence to, in effect, combine what happens online with what happens in real time and real space. Until fairly recently, people expected their online shopping experience to be the one where analysis was being derived from their actions, from their history, their clickstream, and so forth.

It's fascinating that we're able to now bring analysis to the physical site, and it seems that shopping is one of those things where so much more can be done when you're actually in touch with the goods, to be able to feel them, see them, try them on.

Why have we had a problem getting to this point where we can combine the best of online analysis capabilities and data gathering with the physical world? What have been some of the problems that needed to be solved in order to get to this point?

Wammes: Once you went online, people could capture where you came in from through your IP address. So if you consistently came through that IP address you didn’t even have to have a loyalty card. We knew that you were a returning client.

We probably knew that you bought something. That was the reason why, from an online perspective, we've been able to give you much more personalized offers or a better experience towards your needs using the intelligence and the big data.

The issue was that in the physical store, once you entered, we didn't know who you were. Probably at the counter, at the moment that you already made your purchase, you drew your loyalty card. That was the moment that we could do something for you, but that was already at the end of the purchase.

A lot of the technology has changed. One of the things is that you can have your sales agents in the stores, or your sales representatives in the stores, and have them use tablets.

So once people are shopping in the physical store, I can create a contact moment and I can probably ask them for their loyalty card or if they've bought something, yes or no.

Beacon technology

Even more important is what you can now do with beacon technology. Once you come in with your phone, and you already have a connection to the company because you're in some kind of loyalty program or you've downloaded an app from that specific store, then at the moment that you enter, we know that you entered.

We can upload a picture on the sales rep's mobile device, so that he can proactively approach you and say, "It’s so good that you came back again. How was the coat that you bought last time?"

The moments that we can have in these interactions with our customers within the retail store give us the possibility to leverage the insights and the big-data capabilities. That's something that we didn’t have in the past.

That is the thing that helps. Now, we have the capabilities and the technologies to crunch all that data in real time. It's good that I can recognize my customer, but more importantly, I now have the technology to instantly, in real time, crunch all the data so I can give him this personal experience.

Gardner: I can see why you are calling it “Extreme Apps.” It really is powerful and interesting that you can do all this now -- to have someone greet me and recognize my last purchase and follow up on that. Clearly, with my opt-in at a store, I'm giving them information, but I'm getting a lot back in return. It really is groundbreaking.

Is this something that we're seeing only in retail? Not to go too far afield from our discussion, but is it applicable to other vertical industries, and is that something you’ve considered?

Wammes: There is a little bit of retail in a lot of different industries. We initially focused on hardcore retail. The reason we did it is because we looked at the industries where so much transactional data is coming in that we can crunch the data and use the power of in-memory analytics. That was the starting point.

Then you can look at utilities, because with the utilities there are so many streams of information and so many transactions that you can crunch the data and get a personal experience, particularly now with the deregulation of the utility industries. This is something that we're also actively looking into, making sure that the retention of the consumer will be increased.

The banking industry, the insurance industry, all have this kind of retail perspective. On the other hand, we're also in talks with some oil companies that have their retail outlets, sometimes directly or sometimes indirectly.

We had some discussions with a very large beverage producer. We said they could perhaps offer analytics as a service towards retailers, so that the retailers themselves don’t have to buy the analytical capabilities. The company could offer this as a service so that it has more influence and insight on what’s happening with its product. Perhaps they could put that into the hands of an independent party so that the retailer doesn’t see all this insight.

We see the retailer as the starting point because of this experience, the customer experience, that you directly can enhance. But there is a lot of retail kind of experience in the banking and insurance industries. So the opportunities are more diverse, and it all leads to how can I optimize the personal experience the individual buyer has with our company.

New combinations

Gardner: It’s fascinating. We're really combining the physical world, the mobile tier data, across existing industries in really new ways.

Let’s explore how that “Extreme” experience benefit on the front-end is made possible by some extreme technology on the back-end, so to speak. We have several different players involved here: SAP, HPE, and Capgemini. Explain to me how these partners in this ecosystem have come together, and what each contributes to the ability to deliver these capabilities.
Explore How
HPE and Capgemini Collaborate
On Business Solutions
Wammes: Definitely, because it is extreme, but it also required some extreme engineering of new technologies in order to create it.

What we have done is a combination of some very strong partners of ours, where we try to leverage the new technology. First of all, it started with SAP HANA. It was also the question that SAP posed to us. We have HANA as the in-memory engine, but we have it just as an engine. Can you, with your creativity and your industry knowledge, create some solutions on top of it? That’s where this whole Extreme Apps started.

SAP delivers the HANA technology. What we also offer, and this is optional, is for companies that don’t have a proper back end to provide and capture all the data, including all the transactional data: Capgemini has built a retail template on top of the SAP Business Suite. So we have a preconfigured retail solution for those companies that don’t yet have a proper or state-of-the-art enterprise resource planning (ERP) system: Capgemini’s One-Path solution.

First of all, SAP comes in with their traditional Business Suite, but the Extreme Apps are explicitly built into the SAP HANA platform. HPE delivers the hardware and the services, the hybrid cloud expertise. Together with SAP and HPE we look at the architecture, because you always want to have all your data in memory mode. We also took in technologies like Hadoop to make a distinction between the hot data and the cold data that we work on in our analytics space.

What Capgemini added on top of it basically was the algorithm. We leveraged the algorithms that we had and put them into this Extreme environment. It runs now in the Capgemini data centers, on the HPE systems, but we can leverage this in multiple ways.

You can host it in the Capgemini data center, you can have it installed on-premise, we can have it in the cloud, and we can also deliver it as the normal traditional license and transaction price. But we also engineered it together with SAP and HPE so that you can also have it in a price-per-month scenario.

Gardner: So in addition to this flexible-deployment capability -- where you can bring these Extreme Apps anywhere, anytime, anyplace -- you also have a set of APIs available so that this can be customized and adapted to a mobile, web, point of sale, and so forth client. Tell me about the role of the API and the ability to customize apps and delivery.

Two big scenarios

Wammes: If you look at the Extreme Apps that we have right now, we have two big scenarios. We've already built other analytics around it, but there are two big scenarios. One is the Market Basket Analysis and the other is the Next Best Action.

The Market Basket Analysis is the tool for the merchandise agents. They want to know whether a promotion really adds to the margin of the company, or, when they look at a certain promotion, why it is performing better in location A than in location B.

You want to leverage a lot of the analytics and the visuals that are already on the web, but that’s through your normal web browser. It’s the professional user using it. We deliver the standard analytics and the standard visuals. There isn't a lot of tailoring around it.

The Next Best Action is a completely different ballgame, Dana. You want to provide the capability to offer something while the user is making his purchasing decision, and that can happen in all kinds of different ways. It can be when a client is shopping on your web site and you want this engine to immediately promote something that is relevant for the user.

It could also be that the client walks by your store, and this is an example that we had with a lingerie store. They said that when they have the telephone number of a client and they know through beacon technology that she is walking somewhere near the store, they give her a promotion: a 10 percent discount if she comes into the store and buys within an hour.

We said it’s good that you give the 10 percent promotion, but wouldn’t it be much better if you gave a very explicit promotion based on her buying pattern and on the buying patterns of others who bought similar kinds of products? Then, the promotion really becomes valuable. You want to have the promotion on your mobile. For your sales reps, you want to have it in a specific function, which gives them the opportunity to have a good conversation with the client while they are in the store.

We have developed some standard screens already, whether it's for mobile, tablet, the web. More importantly, all these companies already have their mobile apps, or already have a sales representative outlet. We need to create an API so they can embrace it and incorporate it within their own existing environment, so that they really can start quickly and don’t have to do a complete rebuild of their environment.

This is the way the API works. Through the API they can get the promotion data and can incorporate it in their existing applications.

Gardner: Frank, where are we on the roll-out or milestones for this Extreme Apps for Retail initiative? Tell us a little bit about that: when it started, where we are now, and when we should expect to see more of these apps in actual use.

Adding value

Wammes: It began about two years ago when we had the discussion with SAP, where they first started to build applications on top of HANA. How can you add value from an industry perspective towards this technology platform? That’s where we started. We built it.

We also crafted it together with a clothing retailer. It was not just created within the buildings of Capgemini and with the help of HPE and SAP with no client. We built it together with a client. So we immediately knew the issues that they had, rather than only leveraging our own industry expertise. That was basically the first client.

And then we went into a do-it-yourself retail chain, where we implemented it. We saw that when the users, and particularly the professional users of the application, saw what the potential is, what you can do with real-time, in-memory capabilities, immediately additional questions emerged.

We started with these two applications, but then the question was, "If you have my point-of-sale data, can I also create a report so that I can show my CFO what the daily sales are, but in a very advanced graphical way?" By the way, we leverage all the standard visualizations that you get with HTML5. So you can set up many libraries.

So quickly, we had four or five additional reports that were built, because we already worked on the past data. This is where we are right now. We have the two main scenarios. If a client installs one of the main scenarios, we already provide them with the reports that we have and that we have built for the other clients.

We have some additional algorithms and test environments where we continuously are in discussions with our clients. Which algorithm is most valuable to your business? We said, "If you're an SAP dominant client, and you're already leveraging the power of the Extreme Apps, we'll make sure that we extend the scenarios that we have with that algorithm that you have."

We can anticipate that, in the coming year, we'll build more based on the proofs of concept (POCs) that we will do with our clients, where first we'll test the algorithm and then we'll build it into the HANA platform, thereby enhancing the portfolio of the different Extreme Apps scenarios.

Gardner: Given that these services, these apps, are available and are proven in the field, if an interested organization wanted to start leveraging these capabilities, how long does it typically take, and what's involved in actually getting this implemented?

Wammes: That’s the cool thing. There are a lot of aspects, which you also already have recognized, that required Extreme Apps for Retail. Perhaps the most Extreme is the implementation time.

We leveraged the environment that we have within the organization, the combined Capgemini-HPE environment. If you deliver your data in the data structure that we ask, then we can store your data into Extreme Apps, and within two weeks, you can start experimenting to see whether the technology works for your company.

Improving promotions

Give us your data, and we'll load it into the Extreme Apps environment. For your Market Basket Analysis, you can already do the first analysis, where you can see where you can improve your promotions, whether you are making the wrong decisions and putting items in promotions which negatively affect each other.

We can already provide you with an environment where you can do your proof of value to showcase that, within a very short time, you can have a return on investment (ROI).

Because we have this API, we can also immediately integrate it within your existing environment, whether it’s your app, your web browser, or your Internet page. You can already start experimenting by giving a little bit more advanced, more analytical capabilities.

So it’s not only that you recommend this product because other people who bought product A also bought product B. Rather, because you bought this series of products, I compare you to people who also bought this series of products and then bought product B. So the analytics are much more advanced than the traditional, "If you buy A, then buy B, because others also bought B."
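The contrast Wammes draws here can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm used in Extreme Apps, which the discussion does not describe. The function name `next_best_product` and the sample purchase data are hypothetical.

```python
from collections import Counter

def next_best_product(history, other_customers):
    """Recommend the product bought most often by customers whose
    purchases include the whole series in `history` (hypothetical helper)."""
    series = set(history)
    votes = Counter()
    for purchases in other_customers:
        bought = set(purchases)
        # Only customers who bought the entire series get a vote.
        if series <= bought:
            votes.update(bought - series)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Made-up purchase histories for a do-it-yourself store.
others = [
    ["paint", "brush", "masking tape"],
    ["paint", "brush", "masking tape"],
    ["paint", "brush", "roller"],
    ["paint", "ladder"],
    ["paint", "ladder"],
    ["paint", "ladder"],
    ["paint", "ladder"],
]

# Matching on a single product suggests the overall favorite.
single = next_best_product(["paint"], others)
# Matching on the whole series suggests what similar buyers bought next.
series = next_best_product(["paint", "brush"], others)
```

With this sample data, matching on "paint" alone points to the ladder, while matching on the series "paint" plus "brush" points to masking tape, which is the difference between the two styles of recommendation.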

This is something that we can have installed very quickly, and once you want to go in production, it depends on whether you want to go on-premise, or whether you want to go hybrid. In the meantime, you can leverage the environment that both HPE and Capgemini set up in our own data centers.

Gardner: So you can integrate to a retail organization’s website capability, their online marketing or marketplace and selling, and any buying capability -- and also reach out to their point-of-sale retail outlets in as little as two weeks?

Wammes: Exactly. I think the combination is now the cool thing that we see, and that’s also, Dana, something that we learned. We started off with the traditional model and we built the scenario. If we went to a client, they needed to buy the HANA license, they needed to buy the hardware, and they needed to buy a scenario from us. Then, we built it into an offering where you pay on a monthly basis. What we're now seeing is that, together with other solutions, we can have it integrated as an engine.

So for instance, Capgemini has another piece of intellectual property (IP), which is called RM3. RM3 is a middleware solution where you can optimize your promotion, so you can mix and match the promotions to tailor it as much as possible to the individual need.

But now, we can put the Extreme Apps in it and make the promotion more advanced. We're in talks now with some other clients who have their own engines, where they give promotion capabilities through mobile apps, but they don't have a powerful analytics module behind it to make it personal. Now, they can have this Extreme Apps as the engine.

It has also been a journey for us learning that it is not only the capability of doing the analytics, but it has changed the way that you can do your business models. This applies both to the retailer, as well as to the conglomerate that we are working with.

Better analysis

Gardner: Of course we know from the benefits of data and analysis that the more data and the longer period of time, the better the analysis. So, you're able to give your individual retailers more insight into individual behavior. They're able to see their own processes, promotions, and enticements work better, but stepping back, you're also, at the Capgemini level, getting a lot of insight into an even larger set of data across multiple retailers, multiple types of shopping environments, and multiple types of buyers.

Does that mean that you're going to get better algorithms and better insights from this larger historical set of data that can then be applied back into this set of Extreme Apps?

Wammes: That is a very good suggestion for an additional business model, Dana, to be quite honest.

Now, we separate the different environments. So, at this time, the environment that we set up and the algorithm work for the specific individual client.

What we now do, which goes a little bit to your point, is that we learn from how the different clients that use our Extreme Apps leverage the Extreme Apps to optimize their promotions and their interactions with the client. That’s the first step.

At this time, we're not at the point that we say we can leverage the knowledge that we take from the multiple client sets. However, what you refer to is something that we've thought of already, but it comes back to the example that I gave on the beverage producer.

If you, as a consumer goods company, can provide an engine to a retailer or to a set of retailers, where you say, "We can help you in optimizing your promotions so that, in the end, you will sell more, and if you sell more, we will also sell more," that’s where the learning on the multiple clients and multiple different retail stores will kick in.

We've thought of that concept, but not so much as something offered to the users of our Extreme Apps solutions. It's more in the context of whether consumer product companies can offer this as some kind of analytical capability towards different retailers.

Gardner: Perhaps, Frank, in a year or two we will have another conversation where we will talk about how synergistic shopping works. When you buy one type of product, it might mean you will be buying another soon, and some coordination and intelligence can be brought to that.

Wammes: Yeah, definitely.

Gardner: In the meantime, do you have any examples of either named or unnamed organizations that have put the Extreme Apps for Retail to use? What business benefits do they get from it? Any measurements of success, such as, we were able to increase share of wallet, we were able to increase larger sets of purchases by certain buyers? Anything along the lines of proof points for how well this works?

Business cases

Wammes: Yeah. Well, I can mention some industries. We can't disclose specific types of retailers. When we looked at the business case that we got for the do-it-yourself company, their main business case was on the Next Best Action.

They saw the potential to do an increase of about 25 percent, because they could better target the promotions that they gave. It was also because we started to introduce the Next Best Action on the apps and on the website, which is a fast-growing business of course in that specific industry. So making sure that the up-sell and the cross-sell emerged was really the business case on the do-it-yourself side.

With the food retailer, it’s much more about the merchandise planning. What we saw particularly was the promotion. That was a business case where something like four percentage points of improvement could be achieved easily. So there wasn't much action.

The benefit really was that, through the analysis, we could see which products had affinity with each other, but also what the potential financial benefit between those two products was if you did not put them into promotions again, or if you put them explicitly into promotions.

As an example, and I think it’s the easiest example, but everybody understands it: if you sell crisps in your promotion, don't put your beer into the promotion, because there is such a high correlation between people who buy beer and will automatically buy crisps as well. So don't do that.

Through these kinds of correlations and affinities, we could have a four percentage point improvement on the revenue, making sure that people would not do the promotions again.
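One common way to put a number on such an affinity is the "lift" of a product pair: how much more often two products appear in the same basket than they would if they were bought independently. The sketch below uses that measure as an assumption. The discussion does not say which measure Extreme Apps uses, and the function name `pair_lift` and the sample baskets are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def pair_lift(baskets):
    """Return {(a, b): lift} for every product pair that shares a basket.

    lift = P(a and b) / (P(a) * P(b)); a value above 1 suggests affinity.
    """
    n = len(baskets)
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # sort so each pair has one canonical order
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    return {
        (a, b): (count / n) / ((item_count[a] / n) * (item_count[b] / n))
        for (a, b), count in pair_count.items()
    }

# Made-up baskets in the spirit of the beer-and-crisps example.
baskets = [
    ["beer", "crisps"],
    ["beer", "crisps", "bread"],
    ["beer", "crisps"],
    ["bread", "milk"],
    ["milk"],
    ["bread", "milk", "crisps"],
]
lift = pair_lift(baskets)
```

In this sample, beer and crisps come out with a lift of 1.5, so discounting both at once would give away margin on a product that buyers of the other were likely to pick up anyway.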
We were able to reach a couple of percentage points because we could sell products which were very slow selling and where you could have issues on your expiry date. We could identify what kind of products they would sell. So if I have this low turnover of goods, but I put a promotion on high turnover of goods, with a high probability that the low turnover of goods would sell as well, then I would get rid of the inventory.

We also saw that the three percentage point potential was really about a simple rule: don’t put something on promotion when a product with a high affinity to it is out of stock.
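That rule is simple enough to sketch as a check that runs before a promotion goes live. Everything here is hypothetical: the function name `risky_promotions`, the `min_lift` threshold of 1.2, and the sample affinity scores and stock levels are assumptions for illustration, not part of the product described in the interview.

```python
def risky_promotions(promoted, affinities, stock, min_lift=1.2):
    """Return (promoted product, companion) pairs where a high-affinity
    companion product is out of stock. `min_lift` is an assumed cutoff."""
    flagged = []
    for product in promoted:
        for (a, b), score in affinities.items():
            # Skip weak affinities and pairs that don't involve this product.
            if score < min_lift or product not in (a, b):
                continue
            companion = b if product == a else a
            if stock.get(companion, 0) <= 0:
                flagged.append((product, companion))
    return flagged

# Made-up affinity scores and stock levels.
affinities = {("beer", "crisps"): 1.5, ("bread", "milk"): 1.33, ("beer", "bread"): 0.67}
stock = {"beer": 0, "crisps": 40, "bread": 10, "milk": 25}

warnings = risky_promotions(["crisps", "milk"], affinities, stock)
```

Here the planned crisps promotion is flagged because beer, its high-affinity companion, is out of stock, while the milk promotion passes.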

These are the real examples that we have in the different industries.

Gardner: Now, these are benefits that are clearly significant to the seller. We're talking about retail, where it’s very competitive, and there are large revenue numbers involved. So a couple of percent is a lot.

But what about the buyer? Have you done any surveys or questionnaires to find out how the buyers are benefiting from this, and what it does in terms of developing their loyalty?

Personalized experience

Wammes: The research was really on the business case and the elements that I gave to you. So, I can’t give you the exact numbers on the loyalty.

However, what our clients said in our interactions with them was that it's the opportunity to give this personalized experience in a relatively inexpensive way. We always use some clients as the real best practice of how you can integrate your online and offline shopping experience.

So for instance, for us, Burberry is one of the stars in having a very integrated omni-channel experience. What we see, besides the business case effects that we just discussed, is that they said the fact that they can really have a personalized conversation when somebody enters the store, with data already up front, gives high value. However, we didn't measure it.

I have a very good case in The Netherlands. There was a very large retailer that went bankrupt on December 31. They had 10,000 employees, and on the eve of the new year they got to hear that they were unemployed. They were one of the oldest big retailers in The Netherlands, a big department store.

I've visited that department store a lot. The issue always was that when I came to the floor, there were no people that came to help me. They didn't come to advise me. They didn’t come to assist me. When I finally grabbed a product to buy, I had to stand in a big line, because there were only a few cash registers on this very big floor.

It’s a very bold statement, but I think their future would have been much brighter if somebody had approached me already knowing what I had bought in the past, because I have a loyalty card. They would have known what my interests were. And they could have greeted me and said, "Mr. Wammes, it's so good that you're here again. Can I show you around? I noticed that you marked an item as an interest in our online store, so let me show it to you."

I could buy it from that person as well, because they have this integrated credit card mechanism attached to their tablet. That would really be a complete transformation of doing business in that department store. If so, 10,000 employees wouldn't have had that bad message at the end of the year.

Gardner: You're basically saying that the personal touch in the retail environment is empowered now and can come back. We've all noticed in the past years, even decades, that the amount of personalization, personal touch, and human interaction in sales has gone down; it's very much self-service. If it remains self-service, what's the difference between online and bricks and mortar? Not very much. So with this capability, this Extreme set of Apps, you are really bringing the relevant nature of person-to-person sales and service interactions back into vogue -- and making it very economically powerful.

Wammes: Exactly. You've hit the nail, as we say in The Netherlands. One of my colleagues said it's all about relevant personal experience, and it should be relevant personal experience. Technology is not threatening that. If you apply it in the right way, you strengthen it. And I think that's really where we can have a great omni-channel, relevant personal experience delivered towards the consumer.

Gardner: We're just about out of time. I just want now, for a brief moment, to look to the future. Now that we've taken this significant step into Extreme Apps for Retail, what comes next? What might we consider the next chapters in being able to leverage these capabilities around real-time, vast data being brought to bear, fast APIs for implementation and delivery of the visualization and other data, and then this newfound empowerment of that salesperson, that personal advisory service, at the retail outlet? What might we expect in the next months and years?

More artificial intelligence

Wammes: Well, let me start far in the future and then bring it back a little bit. If I go far into the future, bringing in even more artificial intelligence (AI) will not only enhance the creation of strong algorithms that increase this relevant personal experience; AI will also give robotics a chance to interact with us.

There are already some examples, for instance, robots driving around in airports to help people along the journey. But the sales agent will be supported by AI to give more relevant personal experience towards our client. AI is definitely something that will kick in in the coming years.

The roll-out of beacon technology, so that we really can recognize the individual consumer, is something that will be more broadly explored in the coming years in the industry.

We've seen a lot of companies talk about big data, but a lot of retailers are still struggling a little bit with how to really apply it. What we've seen with the clients who implemented the Extreme Apps for Retail is that because people were exposed to the enormous power of what in-memory, big data solutions can bring, all of a sudden the imagination is awakened.

In some of the examples that I gave earlier people said, "If you have this data anyway, can you then give us some very nice visual analytics to use that?" The most powerful part of the solution is that we put a toolset into the hands of people who, in the past, were always limited by the IT department, because it was difficult to build, it cost a lot of money, and was very difficult to maintain.

With the new technologies, it's very easy to create stuff that is very visual and powerful. Therefore, the imagination becomes the limit of what we can do. That's perhaps the most surprising part, and that’s the thing I can't answer, because I don't know yet what kind of things people will come up with. But we're entering an area where imagination becomes a driving force of the things that we can do.
Gardner: For those who are reading or listening to our conversation today, if they want more information about how to learn about this to start the journey towards understanding how it might benefit their organization, where would you point them?

Wammes: First of all, they can always contact me at frank.wammes@capgemini.com or go to my Twitter account, @fwammes. If you go to the Capgemini site, there's a section called Ready2Series, and Ready2Series is the set of solutions where Capgemini owns its own IP. Under the Ready2Series, you'll find more information about the Extreme Apps, and you can learn more from the solutions that we have there (https://www.capgemini.com/sap/sap-hana/extreme-applications-for-retail).

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.


Tuesday, February 16, 2016

SAP Ariba's chief strategy officer on the digitization of business and future of technology

The next BriefingsDirect technology innovation thought leadership discussion focuses on advancements in business applications and the modern infrastructure that supports them, and what that combination portends for the future.

As we enter 2016, the power of business networks is combining with advanced platforms and mobile synergy to change the very nature of business and commerce. We’ll now explore the innovations that companies can expect -- and how that creates new abilities and instant insights -- and how companies can, in turn, create new business value and better ways to reach and serve their customers.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

To learn more about the future of technology and business networks, we’re joined by Chris Haydon, Chief Strategy Officer at SAP Ariba. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Now that we have cloud, big data, and mobile architectures aligned, how does that support where we can go with new business applications and processes -- to get entirely new levels of productivity?

Haydon: It's an exciting new age. The new platforms, as you say, and the applications coming together are a kind of creative destructivism. We can start all over again. The value chain is changing because of digitization, and technology needs to respond.

So what you hear are buzzwords like adaptivity, configurability, or whatever, but these are table stakes now for business applications and business networks. This digitization of value chains forces us to think about how we bring the notion of multiple constituents within the organization, in terms of the adoption, and then couple that with the agility they need to deal with this constant and increasing rate of change.

Gardner: People are talking more about “digital business.” It means looking at not just new technologies, but how you do business, of taking advantage of the ability to have insight into your business, sharing that insight across ecosystems with partners. Where do you see the real advantage in action now for a business-to-business (B2B) environment?

Outcome-based conversations

Haydon: We hear about the technology and it’s important, but what we really hear about is the outcomes. We have very outcome-based conversations with customers. So how does the platform with the business network give you these differential outcomes?

What's pretty evident is that you have to be closer to your end user. And it's also about the cloud paradigm adoption. You're only as good as your last transaction, your last logon, your last order, or your last report -- or whatever business process you're running in.

It's this merger of adoption and outcome, and how you string these two things together to be able to deliver the customer benefit.

From a technology perspective, it's no longer acceptable just to think about the four walls or your firewall; it's really about that extended value chain. So this is where we're seeing this merger of this business network concept, this commerce network concept in the context of these business applications.

We're really starting to emerge from B2B, and it's grown out of the business-to-consumer (B2C) world. With the Facebooks, the LinkedIns, or the Ubers, now you're seeing leading practice companies needing to embrace these larger value chains or commerce chains to give them the outcome and also to help drive differential adoption.

Gardner: For organizations that are really attracted to this and recognize that they have to compete with upstarts, if they get this right, it could be very disruptive.

When we think about having all of your data accessible, when we think about processes being automated, at some point you're able to gather more data and analysis and process refinement that you can then reapply to your business, creating perhaps algorithms and abilities to add intelligence in ways that you could never do manually.

How do we get companies to understand that feedback loop and get it instituted more rigorously into their organization?

Haydon: One thing we see is that with the technology we have today, we can hide that complexity from the users and embed it in the way that end users need to work. Let’s talk a little bit about an SAP Ariba example here. If you're going to create a new sourcing event, do you really want to have to think about the business you do with your current suppliers? Absolutely, but wouldn't it be great when that’s all managed by extra information presented right in front of you?

On top of that, wouldn’t it also be great to know about these three new suppliers in this category, in this geography, that you haven't thought about before? And wouldn't it also be great if they could be automatically invited, with no extra friction, to your process? So you get more supplier diversity. You're able to also let suppliers become more involved in the process, earlier in the process.

We're redistributing this value chain in terms of linking the insight and the community to the point of where work is being done -- and that’s part of that transformation that we're seeing, and that’s how we see it in the Ariba context. But we’re also seeing that in the larger business-network and business application context across SAP.

Knowing your needs

Gardner: So to borrow the B2C example, if Uber is our poster child example, instead of my standing outside of a hotel and having visibility of all of the cars that are driving around that neighborhood, I could be a business and get visibility into all of the suppliers that are available to me. And those suppliers will know what my needs are before they even get to the curb, so to speak.

What's the next step? When we gain that visibility, when we have supply chain and buyer and seller synergy, what comes next? Is there some way to bring that value extended to the end-user at some time?

Haydon: The next step is network resource planning. This is the awareness about your supply base, but also what other stakeholders in that process might mean, and this is what it could be for the end user. It's not just about the supplier, but also about the logistics provider. It’s about how you might have working capital and finance.

What if you could dynamically match or even have a conversation about differential service levels from a buyer or supplier? I'm okay to take it tomorrow if I can get it at 8 a.m., but it's $2 cheaper, or I am happy to take it today because of some other dependencies.

This is a type of dynamic “what if,” because we have the technology platform capability, in real-time, in-memory analytics, but also in the context of the business process. This is a next-generation capability that we'll be able to get to. Because the network is external to the application, because together we can understand the players in the network in the context of the business process, that's where that real next evolution is going to come.

Gardner: It sounds as if we're really starting to remove the margin of error from business. We're starting to remove excess capacity, become fit for purpose through that dynamic applicability of insight and analysis. How much can we squeeze out? Do we have a sense of how substantial these business network capabilities will become? What sort of payoff are we anticipating when we can remove that margin of error, with tighter integration across ecosystems? What’s the gold piece that we are going after here?

Haydon: Well it’s a big, big, big number. Even if we go back a couple of years -- and there’s some good work being done on just the inefficiencies and the first sort of magnitude on paper -- and that’s just moving something from a paper format and dematerializing that into an electronic format. Four years or five years ago when a study was done on that, that was conservatively somewhere between $600 billion and $1 trillion just in the Global 2000 corporations.

There is an order of magnitude more opportunities globally from just this compression of cycle times, in the broader sense, and responsiveness and adaptability throughout the whole world globally.

At SAP Ariba, we just passed a great threshold in 2015. We ran more than $1 trillion in commerce across our business network. If you just start doing a little bit of math around what a one percent or two percent improvement of that can be from better working capital management, or more flexible working capital management, just pure straight productivity and just competition, of leveling the playing field for the smallest micro-supplier through the largest international supplier, and just leveling that all out. There are stupendous gains on both sides of the balance sheet.

Adoption patterns

Gardner: When it comes to adoption patterns, some organizations may have been conservative and held back, perhaps resisted becoming cloud-based. What sorts of organizations are you seeing making a bold move, not just piecemeal, and how can they get a lot done in a relatively short amount of time?

Haydon: Industries that are traditionally conservative really do need to change their value chains, because that’s what their customers are demanding. Take financial services, where historically you would think of the old “big iron” approach. Those types of companies are embracing what they need to do on cloud just to be more adaptive, to be faster, and also to be more end-user-friendly, and the total cost of ownership approach from the cloud is really there.

But we're a long way away from on-premises applications being dead. I think what the cloud gives enterprises is that they can go largely to the cloud -- and we see companies doing that -- but the legacy-hybrid, on-premises model is really important. That’s what’s great about the cloud model. You can consume as you go. It doesn’t all have to be one big bang.

For pragmatic CEOs, CFOs, or CIOs, that blend of hybrid is the legitimate strategy -- where they can have the best of both worlds. With that said, the inexorable pull of cloud is there, but it can be a little bit more on their own terms, on what makes sense for their businesses.

Gardner: We have been at the 70,000- to 80,000-foot height on this discussion. Let’s bring it down a bit lower. Help our readers understand SAP Ariba as an entity now. What does it consist of in terms of the software-as-a-service (SaaS) services that have been acquired and built, and how does that then fit into a hybrid portfolio?

Haydon: Number one, we fundamentally believe at Ariba that, to give differential outcomes to our customers, linking cloud applications with the business-network construct will give you better outcomes for the things we just spoke about earlier in the conversation: visibility, supply chain, adaptability, compliance. It means building on networks of networks to be able to deliver different outcomes: linking to payment networks like we have done with Ariba and Discovery, linking to content networks like we have done with eBay. But bringing them into the context of the business process can only really be enabled through networks and applications.

From an Ariba perspective, we like to think of it in three pillars for everyone. We talk about our cloud applications, and we have leading-practice, broadly adopted source-to-pay cloud applications in a fully integrated suite.

From a cloud perspective as well, you can have the Lego-block approach, where we can take any one of our modules, from spend visibility all the way through the invoicing, and start your journey there, if that's your line-of-business requirement, or take the full suite approach.

Intrinsic to that offering, of course, is our business network. Why I bring that up is that our business network and our cloud applications are agnostic. We don't actually care, from a cloud perspective, which back-end system of record you wish to use.

Of course, we love and we believe that the best out there is S/4HANA from SAP, but there is also a larger market, whether it's the mid-market or whether there are other customers who are on other journeys on the enterprise resource planning (ERP) for legacy reasons. We can connect our cloud applications and our network to any one of those.

Three levels

So, there are three levels: network, our end-to-end cloud applications, and last but not least, and which is really relevant from the technology journey, a rock-solid platform. And so I am moving toward that platform that runs our cloud apps and our network in conjunction with SAP for the security, for the privacy, for the availability, for all of these really important things that enterprise customers need.

Also, you have to have the security to run these business processes, because they're entrusting those to us, and that's really what cloud means. It means entrusting your business processes to us to get a differential outcome for your business.

Gardner: As organizations try to become more of a digital business, they will be looking to bring these benefits to their ERP-level business applications, their supply chain and procurement, but increasingly, they're also looking to better manage their human resources and recognizing that that's a dynamic marketplace more than ever.

Haydon: Yes.

Gardner: So let's talk about how the business network effect and some of these synergistic benefits come to play in that human resources side of running a digital business?

Haydon: That's also one of the great parts from an SAP portfolio. I like to think about it two ways. There’s human capital management internal, and there’s human capital management external. Leading companies today want to have agility on how many full-time employees they can hire, and how to manage contingent or temporary labor aspects.

From an SAP perspective, what's great is that we have the leading cloud for Human Resource Management and Talent Management solutions with SuccessFactors, and we also have the market-leading Contingent Labor Management solution with Fieldglass.

Together with Ariba, you're able to, one, have a one-visibility view into your workforce in and out, and also, if you like, to orchestrate that procurement process to get sourcing, ordering, requisitioning and payment throughout.

From a company perspective, when you think about your spend profile, 30 percent to 70 percent of the spend is about services as we move to a service-based economy. And in conjunction with SAP Ariba and SAP Fieldglass, we have this broad, deep, end-to-end process, in a single context, and -- by the way -- integrated nicely to the ERP system to really again give those best outcomes.

Gardner: When people think about public clouds that are available to them for business, they often couple that with platform-as-a-service (PaaS), and one of the things that other clouds are very competitive about is portraying themselves as having a very good developer environment. But increasingly, development means mobile apps.

Haydon: Yes.

Mobile development

Gardner: What can we gain from your cloud vision as being hybrid, while also taking advantage of mobile development?

Haydon: From a platform perspective, you need to be “API First” because if you're actually able to expose important aspects within a business process, with an API layer, then you give that extensibility and that optionality to your customers to do more things.

Let’s talk about concrete examples. An end-to-end process could be as simple as taking an invoice from any third-party provider. Right now, Ariba has an open invoice format. If someone chooses to scan it themselves and digitize it themselves, or it's something that a customer wanted to do, we could take that straight feed in.

If you want to talk about a mobile API, it could be as simple as you want to expose a workflow. There's a large corporate mandate sometimes to have a workflow. If you travel, there's a workflow for your expenses, a workflow for your leave request, and a workflow for your purchase orders. If the end user has five systems and would rather come to one, you can have that API level there.

There is this whole balance of how you moleculize your offerings to enable customers to have that level of configuration that they need for their individual business requirements, but still get the leverage of not having to rebuild it all themselves.

That's certainly a fundamental part of our strategy. You'll see that SAP is leading in itself under our HANA Cloud Platform. SAP Ariba is building on that. I don’t want to flag too much, but you’ll see some interesting developments along that way as we open up our platform from both an end-to-end perspective and also from an individual mobile perspective throughout the course of this year.

Gardner: Now this concept of API First, it's very interesting, because it doesn't really matter which cloud it’s coming from, whether on a hybrid spectrum of some sort. It also allows you to look at business services and pull them down as needed and construct processes, rather than monolithic, complex, hairball applications.

Do you have any examples of organizations that have taken advantage of this API First approach? And how have they been able to customize their business processes using this hybrid cloud and visibility, reducing the complexity?

Haydon: I can certainly give you some examples. This just starts from simple processes, but they can actually add a lot of value. For example, you have a straightforward shipping process, an advanced shipping process. We know of an example where a customer took 90 percent of their time out of the receiving and made the matching of their receiving receipting process almost by 95 percent, because they can leverage an API to support their custom bar-coding standard.

So they leveraged the standard business-network bus, because that type of barcode that they need to have in their warehouse, and their part of the world, was there. Let’s wind the clock back three or four years. If we had asked for that specific feature, to be very candid, we wouldn’t make it. But once you start opening up the platform at that micro level, you can actually let customers get the job done.

But they can still leverage that larger framework, that platform, that business process, that cloud that we give them. But when you extend that out for what it could mean, again, full payment, or for risk, or for any of these other dimensions that are just typically organizational processes to the current -- whether it’s procurement or whether it’s HR recruiting or whatever it's like -- it gets pretty exciting.

Big data

Gardner: One of the other hallmarks of a digital business is having aspects of a business work in new ways together, in closeness, that they may not have in the past. And one of the things that’s been instrumental to business applications over the past decades is this notion of a system of record or systems of records, and also, we have had this burgeoning business intelligence (BI) now loosely called big data capability.

And they haven't always been that close, but it seems to me that with a platform like SAP HANA, combined with business-networks, that systems of record and the data and information in them, and the big data capabilities, as well as accessing other data sets outside the organization, make a tremendous amount of sense. How do you see the notion of more data, regardless of its origin, becoming a common value stream to an organization?

Haydon: This becomes the fundamental competency that an organization needs to harness. This notion of the data, and then the data in the context of the business process, and then again to your point, how that’s augmented in the right way, is really the true differentiation for where we’ll go.

Historically, we laid down the old railway tracks on business processes, but there is no such thing as railway tracks anymore. You rebuild them every single day. Inside that data, with the timeliness of it, is sentiment analysis so that from a business-network context, it enables you to make different and dynamic decisions.

Within SAP Ariba, we're fundamentally rethinking how we can have the data that’s actually in our environment and how we get that out -- not just to our account managers, not just to where our product manager is, but more importantly, out to our end users. They can then actually start to see patterns, play with it, and create some interesting innovations. We're working with our customers to identify and remove forced labor in the supply chain, or advance global risk management or even expedited delivery and logistics.

Gardner: Okay, we talked about business-networks in the context of applications working together for efficiency, we’ve talked about the role of hybrid cloud models helping to accelerate that, we've talked about the data issues and some of the development and customization on the mobile side of things. What have we missed? What is the whole-greater-than-the-sum-of-the-parts component that we’re not talking about that we should?

Haydon: There are probably two or three. There’s certainly the notion of the user experience and that's a function of mobile, but not mobile only. The notion of reinventing the old traditional flows and thinking that was prevalent even five years ago on what constituted one type of work channel no longer exists.

There's the new discipline of what a user experience is about, and that's not just the user interface; that’s also things like the tone or the content that’s presented to you. It’s also what it means on the different devices and the way you’re working. So I think that's an evolving piece, but it cannot be left behind.

That's where the B2C world is blazing, and that's now the expectation of all of us when we go to work and put our corporate hat on, as simple as that. Number two is security and privacy. That is top of mind for a number of reasons and it's really fair to say that it’s in a massive state of flux and change here in the United States, but certainly in Europe. It doesn’t matter which region you are in, APJ or Latin America as well.

Competitive advantage

That's another competitive advantage that enterprises and providers in this space like SAP and SAP Ariba, can, and will, and should, lead on. The last point, maybe a trend, is that you're really seeing very quickly the transition between the traditional service and material flows that exist, and then the financial flows.

We're seeing the digitalization of payments just exploding and banks and financial institutions having to rethink and look at what they're doing. With the technology and the platforms we have, that linking of that is physical flows, whether they be for services or materials and that crossing over to that payment and then the natural working capital because, at the end of that, commerce follows money.

It’s all about the commerce. So it's the whole space in that whole area and that technology is the trend as well. Security, UX, and the whole payment and working capital management, or the digitalization of that, are the three large things.

Gardner: And these are areas where scale really helps. Global scale, like a company like SAP has, when the issues of data sovereignty come up and you need to think about hybrid cloud, not just in its performance and technical capabilities but the actual location of data, certain data for certain periods of time and certain markets, is very difficult to do if you're not in those markets and understanding those markets. It's the same with the financial side. Because banking and finance are dynamic, it’s different having that breadth and scope, a key component of making that possible as well.

There's one last area we can close out on and it’s looking a bit to the future. Some competitors of yours are out there talking about artificial intelligence (AI) more than others, and when you do have network effects as we’ve described, big-data mesh across organization thinking of data as a life cycle for a digital business, not just data in different parts of your organization.

We can think about expertise in vertical industries being brought to bear on insights in the markets and ecosystems. When and how might we expect some sort of an AI, value, API, or set of APIs to come forward to start thinking things through in ways people probably haven’t been able to do up until now or in the near future?

Haydon: The full notion of something like a [2001: A Space Odyssey’s] HAL 9000, is probably a little way away. But then again, what you would see within the next 12 to 18 months is specific -- maybe you call them smart apps rather than intelligent or smart agents.

They already exist today in some areas. You will see them augmented because of feedback from a system that’s not your own, whether it’s moving average price of an inventory. Someone will bring the context of an updated price, or an updated inventory and that will trigger something, and that will be the smart agent going to do all that work for you, but ready for you to make the release.

There still is a notion of the competency, as well, within the organization, not as much a technology thing, but a competency on what Master Data Governance means, and what the quality of that data means, and being able to have a methodology to manage that to let these systems do it.

So you will probably see this in lower-risk spend categories, at least from a procurement perspective: indirect, or maybe some travel and these aspects, maybe a little bit of non-inventory materials, repair and operating supplies. You're probably a fair way away from fully releasing direct material supply chain in some of these pretty, pretty important value chains we manage.

Self-driving business process

Gardner: So maybe we should expect to see self-driving business processes before we see self-driving cars?

Haydon: I don't know, I'm lucky enough to live in Palo Alto, and I see a self-driving car three days a week. So we'll back out of that one.

But there is a really important piece, at least from an Ariba perspective and an SAP perspective. We fundamentally believe that these business-networks are the rivers of data.

It's not just what's inside your firewall. You will truly get the insight from the largest scale of these rivers of data from these business-networks; whether it be Ariba or our financial partners, or whether it be others. There will be networks of networks.

This notion of having a kind of bookend of the process, a registry to make sense of the actors in these business networks and the context of the business process, and then linking that to the financial and payment chain, that's where the real intelligence and some real money could be released, and that's some of the thinking that we have out there.

Gardner: So, a very bright, interesting future. But in order to get to that next level of value, you need to start doing those blocking-and-tackling elements around the rivers of information, as you say, and the network effects, putting yourself in a position to then really exploit these new capabilities when they come out.

Haydon: It's scale and adoption. From the scale and from the adoption, will come that true benefit from the networks, the business process, and the connectivity therein.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: SAP Ariba.


Thursday, February 11, 2016

How New York Genome Center manages the massive data generated from DNA sequencing

The next BriefingsDirect big-data use case discussion examines how the non-profit New York Genome Center manages and analyzes up to 12 terabytes of data generated each day from its genome sequence appliances. We’ll learn how a swift, cost efficient, and accessible big-data analytics platform supports the drive to better diagnose disease and develop more effective medical treatments.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

To hear how genome analysis pioneers exploit vast data outputs to speedily correlate for time-sensitive research, please join me in welcoming our guest, Toby Bloom, Deputy Scientific Director for Informatics at the New York Genome Center in New York City. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: First, tell us a little bit about your organization. It seems like it’s a unique institute, with a large variety of backers, consortium members. Tell us about it.

Bloom: New York Genome Center is about two-and-a-half years old. It was formed initially as a collaboration among 12 of the large medical institutions in New York: Cornell, Columbia, NewYork-Presbyterian Hospital, Mount Sinai, NYU, Einstein Montefiore, and Stony Brook University. All of the big hospitals in New York decided that it would be better to have one genome center than have to build 12 of them. So we were formed initially to be the center of genomics in New York.

Gardner: And what does one do at a center of genomics?

Bloom: We're a biomedical research facility that has a large capacity to sequence genomes and use the resulting data output to analyze the genomes, find the causes of disease, and hopefully treatments of disease, and have a big impact on healthcare and on how medicine works now.

Gardner: When it comes to doing this well, it sounds like you are generating an awesome amount of data. What sort of data is that and where does it come from?
Bloom: Right now, we have a number of genome sequencing instruments that produce about 12 terabytes of raw data per day. That raw data is basically lots of strings of As, Cs, Ts and Gs -- the DNA data from genomes from patients who we're sequencing. Those can be patients who are sick and for whom we are looking for a specific treatment. They can be patients in large research studies, where we're trying to use and correlate a large number of genomes to find the similarities that show us the cause of the disease.

Gardner: When we look at a typical big data environment such as in a corporation, it’s often transactional information. It might also be outputs from sensors or machines. How is this a different data problem when you are dealing with DNA sequences?

Lots of data

Bloom: Some of it’s the same problem, and some of it’s different. We're bringing in lots of data. The raw data, I said, is probably about 12 terabytes a day right now. That could easily double in the next year. But then we analyze the data, and I probably store three to four times that much data in a day.

In a lot of environments, you start with the raw data, you analyze it, and you cook it down to your answers. In our environment, it just gets bigger and bigger for a long time, before we get the answers and can make it smaller. So we're dealing with very large amounts of data.

We do have one research project now that is taking in streaming data from devices, and we think over time we'll likely be taking in data from things like cardiac monitors, glucose monitors, and other kinds of wearable medical devices. Right now, we are taking in data off apps on smartphones that are tracking movement for some patients in a rheumatoid arthritis study we're doing.

We have to analyze a bunch of different kinds of data together. We’d like to bring in full medical records for those patients and integrate it with the genomic data. So we do have a wide variety of data that we have to integrate, and a lot of it is quite large.

Gardner: When you were looking for the technological platforms and solutions to accommodate your specific needs, how did that pan out? What works? What doesn’t work? And where are you in terms of putting in place the needed infrastructure?

Bloom: The data that comes off the machines is in large files, and a lot of the complex analysis we do, we do initially on those large files. I am talking about files that are from 150 to 500 gigabytes or maybe a terabyte each, and we do a lot of machine-learning analysis on those. We do a bunch of Bayesian statistical analyses. There are a large number of methods we use to try to extract the information from that raw data.
When we've figured out the variants and mutations in the DNA that we think are correlated with the disease and that we were interested in looking at, we then want to load all of that into a database with all of the other data we have to make it easy for researchers to use in a number of different ways. We want to let them find more data like the data they have, so that they can get statistical validation of their hypotheses.

We want them to be able to find more patients for cohorts, so they can sequence more and get enough data. We need to be able to ask questions about how likely it is, if you have a given genomic variant, you get a given disease. Or, if you have the disease, how likely it is that you have this variant. You can only do that if it’s easy to find all of that data together in one place in an organized way.

So we really need to load that data into a database and connect it to the medical records or the symptoms and disease information we have about the patients and connect DNA data with RNA data with epigenetic data with microbiome data. We needed a database to do that.

We looked at a number of different databases, but we had some very hard requirements to solve. We were looking for one that could handle trillions of rows in a table, even tens of trillions of rows, without falling over, and that could answer queries fast across multiple tables with tens of trillions of rows. We need to be able to easily change and add new kinds of data to it, because we're always finding new kinds of data we want to correlate. So there are things like that.

Simple answer

We need to be able to load terabytes of data a day. But more than anything, I had a lot of conversations with statisticians about why they don’t like databases, about why they keep asking me for all of the data in comma-delimited files instead of databases. And the answer, when you boiled it down, was pretty simple.

When you have statisticians who are looking at data with huge numbers of attributes and huge numbers of patients, the kinds of statistical analysis they're doing means they want to look at some much smaller combinations of the attributes for all of the patients and see if they can find correlations, and then change that and look at different subsets. That absolutely requires a column-oriented database. A row-oriented relational database will bring in the whole database to get you that data. It takes forever, and it’s too slow for them.
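As an illustration only (this is not from the interview and not how HPE Vertica is implemented), the following minimal Python sketch contrasts the two layouts for the access pattern described above: a few attributes read across all patients. The attribute names and values are made up, and counting "values touched" is a deliberate simplification that ignores indexes, compression, and disk pages.

```python
# Hypothetical toy data: 4 patients, 5 attributes each (names are illustrative).
ATTRIBUTES = ["age", "variant_a", "variant_b", "glucose", "diagnosis"]

rows = [
    (54, 1, 0, 5.9, 1),
    (61, 0, 0, 5.1, 0),
    (47, 1, 1, 6.4, 1),
    (70, 0, 1, 5.3, 0),
]


def to_columns(rows, attributes):
    """Pivot row-oriented records into one list per attribute (columnar layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(attributes)}


def values_touched_row_store(rows):
    """Simplified row store: every field of every row is read, wanted or not."""
    return sum(len(row) for row in rows)


def values_touched_column_store(columns, wanted):
    """Simplified column store: only the requested columns are read."""
    return sum(len(columns[name]) for name in wanted)


columns = to_columns(rows, ATTRIBUTES)
wanted = ["variant_a", "diagnosis"]  # a small subset of attributes, all patients

row_cost = values_touched_row_store(rows)
col_cost = values_touched_column_store(columns, wanted)
print(f"row layout reads {row_cost} values; column layout reads {col_cost}")
```

With thousands of attributes per patient instead of five, the gap between the two counts grows in proportion to the number of attributes the analysis does not need.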

So, we started from that. We must have looked at four or five different databases. Hewlett Packard Enterprise (HPE) Vertica was the one that could handle the scale and the speed and was robust and reliable enough, and is our platform now. We're still loading in the first round of our data. We're still in the tens of billions of rows, as opposed to trillions of rows, but we'll get there.

Gardner: You’re also in the healthcare field. So there are considerations around privacy, governance, auditing, and, of course, price sensitivity, because you're a non-profit. How did that factor into your decision? Is the use of off-the-shelf hardware a consideration, or off-the-shelf storage? Are you looking at converged infrastructure? How did you manage some of those cost and regulatory issues?

Bloom: Regulatory issues are enormous. There are regulations on clinical data that we have to deal with. There are regulations on research data that overlap and are not fully consistent with the regulations on clinical data. We do have to be very careful about who has access to which sets of data, and we have all of this data in one database, but that doesn’t mean any one person can actually have access to all of that data.

We want it in one place, because over time, scientists integrate more and more data and get permission to integrate larger and larger datasets, and we need that. There are studies we're doing that are going to need over 100,000 patients in them to get statistical validity on the hypotheses. So we want it all in one place.

What we're doing right now is keeping all of the access-control information about who can access which datasets as data in the database, and we basically append clauses to every query to filter down the data to the data that any particular user can use. Then we'll tell them the answers for the datasets they have, how much data is there that they couldn’t look at, and, if they need that information, how to go try to get access to it.
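A rough sketch of that pattern, using Python's built-in `sqlite3` module rather than Vertica, might look like the following. The table names (`variants`, `dataset_access`), the column names, and the helper `query_with_access` are all hypothetical, chosen only to show access rules stored as data and a filter clause appended to every query.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up datasets and access rules."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE variants (dataset_id TEXT, patient_id INTEGER, variant TEXT);
        CREATE TABLE dataset_access (user_name TEXT, dataset_id TEXT);
        INSERT INTO variants VALUES
            ('study_a', 1, 'v1'), ('study_a', 2, 'v1'),
            ('study_b', 3, 'v1'), ('study_b', 4, 'v2');
        INSERT INTO dataset_access VALUES ('alice', 'study_a');
    """)
    return conn


# The clause appended to every query: keep only datasets this user may read.
ACCESS_CLAUSE = (
    "dataset_id IN (SELECT dataset_id FROM dataset_access WHERE user_name = ?)"
)


def query_with_access(conn, user, where_sql, params=()):
    """Query `variants` with the per-user access clause appended.

    `where_sql` is assumed to be generated by the application, never typed
    by the end user. Returns (visible_rows, hidden_count), where hidden_count
    is the number of matching rows in datasets the user cannot access.
    """
    visible = conn.execute(
        "SELECT dataset_id, patient_id, variant FROM variants "
        f"WHERE ({where_sql}) AND {ACCESS_CLAUSE} ORDER BY patient_id",
        (*params, user),
    ).fetchall()
    total = conn.execute(
        f"SELECT COUNT(*) FROM variants WHERE ({where_sql})", params
    ).fetchone()[0]
    return visible, total - len(visible)


conn = build_demo_db()
visible, hidden = query_with_access(conn, "alice", "variant = ?", ("v1",))
print(visible, f"({hidden} matching row(s) in datasets you cannot access)")
```

Returning the hidden count alongside the visible rows mirrors the idea of telling users how much matching data exists that they could not look at, so they know whether requesting access is worthwhile.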

Gardner: So you're able to manage some of those very stringent requirements around access control. How about that infrastructure cost equation?

Bloom: Infrastructure cost is a real issue, but essentially, what we're dealing with is, if we're going to do the work we need to do and deal with the data we have to deal with, there are two options. We spend it on capital equipment or we spend it on operating costs to build it ourselves.

In this case, not all cases, it seemed to make much more sense to take advantage of the equipment and software, rather than trying to reproduce it and use our time and our personnel's time on other things that we couldn’t as easily get.

A lot of work went into HPE Vertica. We're not going to reproduce it very easily. The open-source tools that are out there don’t match it yet. They may eventually, but they don’t now.

Getting it right

Gardner: When we think about the paybacks or determining return on investment (ROI) in a business setting, there’s a fairly simple, straightforward formula. For you, how do you know you’ve got this right? What are the equivalents of what we might refer to in the business world as service-level agreements (SLAs) or key performance indicators (KPIs)? What are you looking for when you know that you’ve got it right and when you’re getting the job done, based on all of its requirements and from all of these different constituencies?

Bloom: There’s a set of different things. The thing I am looking for first is whether the scientists who we work with most closely, who will use this first, will be able to frame the questions they want to ask in terms of the interface and infrastructure we’ve provided.

I want to know that we can answer the scientific questions that people have with the data we have and that we’ve made it accessible in the right way. That we’ve integrated, connected and aggregated the data in the right ways, so they can find what they are looking for. There's no easy metric for that. There’s going to be a lot of beta testing.

The second thing is, are we hitting the performance standards we want? How much data can I load how fast? How much data can I retrieve from a query? Those statisticians who don’t want to use relational databases still want to pull out all those columns, and they want to do their sophisticated analysis outside the database.

Eventually, I may convince them that they can leave the data in the database and run their R scripts there, but right now they want to pull it out. I need to know that I can pull it out fast for them, and that they're not going to object that this is organized so they can get their data out.

Gardner: Let's step back to the big picture of what we can accomplish in a health-level payback. When you’ve got the data managed, when you’ve got the input and output at a speed that’s acceptable, when you’re able to manage all these different level studies, what sort of paybacks do we get in terms of people’s health? How do we know we are succeeding when it comes to disease, treatment, and understanding more about people and their health?

Bloom: The place where this database is going to be the most useful, not by any means the only way it will be used, is in our investigations of common and complex diseases, and how we find the causes of them and how we can get from causes to treatments.

I'm talking about looking at diseases like Alzheimer’s, asthma, diabetes, Parkinson’s, and ALS, which is not so common, but certainly falls in the complex disease category. These are diseases that are caused by some combinations of genomic variants, not by a single gene gone wrong. There are a lot of complex questions we need to ask in finding those. It takes a lot of patients, and a lot of genomes, to answer those questions.
The payoff is that we can use this data to collect enough information about enough diseases to ask questions such as: It looks like this genomic variant is correlated with this disease. How many people in your database have this variant, and of those, how many actually have the disease? And of the ones who have the disease, how many have this variant? I need to ask both those questions, because a lot of these variants confer risk, but they don’t absolutely give you the disease.
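To make the two questions concrete, here is a minimal Python sketch (an illustration, not the center's actual analysis) that computes both proportions from per-patient flags. The function name `variant_disease_rates` and the ten-patient cohort are invented for the example.

```python
def variant_disease_rates(patients):
    """Compute both conditional proportions from (has_variant, has_disease) pairs.

    Returns (share of variant carriers who have the disease,
             share of disease cases who carry the variant).
    A value is None when its denominator is zero.
    """
    patients = list(patients)
    with_variant = sum(1 for v, d in patients if v)
    with_disease = sum(1 for v, d in patients if d)
    with_both = sum(1 for v, d in patients if v and d)

    disease_given_variant = with_both / with_variant if with_variant else None
    variant_given_disease = with_both / with_disease if with_disease else None
    return disease_given_variant, variant_given_disease


# Made-up cohort of 10 patients: (has_variant, has_disease)
cohort = (
    [(True, True)] * 2      # carry the variant and have the disease
    + [(True, False)] * 2   # carry the variant, no disease
    + [(False, True)] * 1   # disease without the variant
    + [(False, False)] * 5  # neither
)

p_disease_given_variant, p_variant_given_disease = variant_disease_rates(cohort)
print(p_disease_given_variant, p_variant_given_disease)
```

In this invented cohort the two numbers differ (one half versus two thirds), which is why both questions have to be asked: a variant can be common among people with a disease without most carriers ever developing it.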

If I am going to find the answers, I need to be able to ask those questions, and those are the things that are really hard to do with the raw data in files. If I can do just that, think about the impact on all of us. If we can find the molecular causes of Alzheimer’s, that could lead to treatments or prevention, and the same goes for all of those other diseases as well.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.
