Wednesday, July 1, 2020

How REI used automation to cloudify infrastructure and rapidly adjust its digital pandemic response


Like many retailers, Recreational Equipment, Inc. (REI) faced drastic and rapid change when the COVID-19 pandemic struck. REI’s marketing leaders wanted to make sure that its e-commerce capabilities would rise to the challenge. They expected a nearly overnight 150 percent jump in REI’s purely digital business.

Fortunately, REI’s IT leadership had already advanced its systems to heightened automation, which allowed the Seattle-based merchandiser to turn on a dime and devote much more of its private cloud to the new e-commerce workload demands.

The next BriefingsDirect Voice of Innovation interview uncovers how REI kept its digital customers and business leadership happy, even as the world around them was suddenly shifting.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To explore what works for making IT agile and responsive enough to re-factor a private cloud at breakneck speed, we’re joined by Bryan Sullins, Senior Cloud Systems Engineer at REI in Seattle. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.


Here are some excerpts:

Gardner: When the pandemic required you to hop-to, how did REI manage to have the IT infrastructure to actually move at the true pace of business? What put you in a position to be able to act as you did?

Digital retail demands rise 

Sullins: In addition to the pandemic stay-at-home orders a couple of months ago, we also had a large sale previously scheduled for the middle of May. It’s the largest sale of the year, our anniversary sale.

And ramping up to that, our marketing and sales department realized that we would have a huge uptick in online sales. People really wanted to get outside, because they could do so without breaking any of the social distancing rules.

For example, bicycle sales were up 310 percent compared to the same time last year. So in ramping up for that, we anticipated our online presence at rei.com was going to go up by 150 percent, but we wanted to scale up by 200 percent to be sure. In order to do that, we had to reallocate a bunch of ESXi hosts in VMware vSphere. We either had to stand up new ones or reallocate from other clusters and put them into what we call our digital retail presence.

As a result of our fully automated process, using Hewlett Packard Enterprise (HPE) OneView, Synergy, and Image Streamer, we were able to reallocate 6 of the 17 total hosts needed. We were able to do that in 18 minutes, all at once -- and with a single touch. Launching the automation pulled them from one cluster, decommissioned them, and placed them all the way into the digital retail clusters.
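
The single-touch flow described here -- evacuate a host from a donor cluster, decommission it, reimage it, and join it to the digital retail cluster -- can be sketched as a simple orchestration plan. This is a minimal illustration with made-up step names, host names, and cluster names; in REI’s case the actual work is performed by Ansible playbooks driving HPE OneView and Image Streamer.

```python
# Sketch of a single-touch host reallocation workflow (hypothetical step names).
# Each step mirrors one stage of the automated pipeline described above.

def reallocate_host(host, source, target):
    """Return the ordered steps taken to move one ESXi host between clusters."""
    return [
        f"enter-maintenance-mode {host}",        # evacuate VMs from the host
        f"remove-from-cluster {host} {source}",  # decommission from the donor cluster
        f"apply-gold-image {host}",              # reimage via an Image Streamer profile
        f"join-cluster {host} {target}",         # add to the digital retail cluster
        f"exit-maintenance-mode {host}",         # host begins taking workload
    ]

def reallocate_many(hosts, source, target):
    """Plan the workflow for each host; real playbooks run these concurrently."""
    return {h: reallocate_host(h, source, target) for h in hosts}

plan = reallocate_many(["esx01", "esx02"], "general-compute", "digital-retail")
```

Because every step is codified, launching the automation once covers all hosts; no human touches each host individually along the way.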

We also had to move some from our legacy platform -- they aren’t on HPE Synergy yet -- and those took an additional three days. But those are in transition; we are moving to that fully automated platform all around.

Gardner: That’s amazing because just a few years ago that sort of rapid and automated transition would have been unheard of. Even at a slow pace you weren’t guaranteed to have the performance and operations you wanted.

If you were not able to do this using automation – if the pandemic had hit, heaven forbid, five or seven years ago – what would have been the outcome?

Sullins: There were actually two outcomes from this. The first is the fairly obvious issue of not being able to handle the online traffic on our rei.com retail presence. It could have been that people weren’t able to put stuff into a shopping cart, or that inventory wouldn’t decrement, and so on. It could have been a very broad range of things. We needed to make sure we had the infrastructure capacity so that none of that failed under a heavy load. That was the first part.

Gardner: Right, and when you have people in the heat of a purchasing moment, if you’re not there and it’s not working, they have other options. Not only would you lose that sale, you might lose that customer, and your brand suffers as well.

Sullins: Oh, without a doubt, without a doubt.

The other issue, of course, would have been if we did not meet our deadline. We had just under a week to get this accomplished. And if we had to do this without a fully automated approach, we would have had to return to our managers and say, “Yeah, so like we can’t do it that quickly.” But with our approach, we were able to do it all in the time frame -- and be able to get some sleep in the interim. So it was a win-win.

Gardner: So digital transformation pays off after all?

Sullins: Without a doubt.

Gardner: Before we learn more about your journey to IT infrastructure automation, tell us about REI, your investments in advanced automation, and why you consider yourself a data-driven digital business?

Automation all the way 

Sullins: Well, a lot of that precedes me by quite a bit. Going back to the early 2000s, based on what my managers tell me, there was a huge push for REI to become an IT organization that just happens to do retail. The priority is on IT being a driving force behind everything we do, and that is something that, at the time, REI really needed to do. There are other competitors, which we won’t name, but you probably know who they are. REI needed to stay ahead of that curve.

So since then there have been constant sweeping and cyclical changes for that digital transformation. The most recent one is the push for automating all things. So that’s the priority we have. It’s our marching orders.

Gardner: In addition to your company, culture, and technology, tell us about yourself, Bryan. What is it about your background and personal development that led you to be in a position to act so forthrightly and swiftly?

Sullins: I got my start in IT back in 1999. I was a public school teacher before that, and then I made the transition to doing IT training. I did IT training from 1999 to about 2012. During those years, I got a lot of technology certifications, because in the IT training world you have to.

I began with what was, at the time, called the Microsoft Certified Systems Engineer (MCSE) certification. Then I also did the Linux Professional Institute. I really glommed on to Linux. I wanted to set myself apart from the rest of the field back then, so I went all-in on Linux.

And then, 2008-2009-ish, I jumped on the VMware train and went all-in on VMware and did the official VMware curriculum. I taught that for about three years. Then, in 2012, I made the transition from IT training into actually doing this for real as an engineer working at Dell. At the time, Dell had an infrastructure-as-a-service (IaaS) healthcare cloud that was fairly large – 1,200-plus ESXi hosts. We were also responsible for the storage and for the 90-plus storage area network (SAN) arrays as well.

In an environment that large, you really have to automate. I cut my teeth on automating through PowerCLI and Ansible. Since about 2015, that’s been the focus of my career. I’m not saying I’m a guru, by any means, but it’s been a focus of my career.

Then, in 2018, REI came calling. I jumped on that opportunity because they are a super-awesome company, and right off the bat I got free rein: if you want to automate it, then you automate it. And I have been doing that ever since August of 2018.

Gardner: What helped you make the transition from training to cloud engineer?

Sullins: I typically jump right into new technology. I don’t know if that comes from the training or if that’s just me as a person. But one of the positives I’ve gotten from the training world is that you learn 100 percent of the feature base that’s available with said technology. I was able to take what I learned and knew from VMware and then say, “Okay, well, now I am going to get the real-world experience to back that up as well.” So it was a good transition.

Gardner: Let’s look at how other organizations can anticipate the shift to automation. What are some of the challenges that organizations typically face when it comes to being agile with their infrastructure?

Manage resistance to cloud 

Sullins: The challenges that I have seen aren’t usually technical. Usually the technologies that people use to automate things are ready at hand. Many are free: Ansible, for example, is free. PowerCLI is free. Jenkins is free.

So, people can start doing that tomorrow. But the real challenge is in changing people’s mindset about a more automated approach. I think that it’s tough to overcome. It’s what I call provisioning by council. More traditional on-premises approaches have application owners who want to roll out x number of virtual machines (VMs), with all their particular specs and whatnot. And then a council of people typically looks at that and kind of scratches their chin and says, “Okay, we approve.” But if you need to scale up, that council approach becomes a sort of gate-keeping process.


With a more automated approach, like we have at REI, we use a cloud management platform to automate the processes. We use that to enable self-service VMs instead of having a roll-out by council, where some of the VMs can take days or weeks to roll out because you have a lot of human beings touching them along the way. We have a lot of that process pre-approved, so everybody has already said, “Okay, we are okay with the roll-out. We are okay with the way it’s done.” And then we can roll that out in 7 to 10 minutes rather than having a ticket-based model where somebody gets to it when they can. Self-service models are able to do that much better.
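
The difference between provisioning by council and pre-approved self-service can be sketched as a simple policy check. The catalog sizes and the per-request cap below are invented for illustration, not REI’s actual policy; the point is that anything matching a pre-approved spec is provisioned immediately, and only exceptions go to humans.

```python
# Hypothetical pre-approved catalog: requests matching these specs skip review.
PRE_APPROVED = {
    "small":  {"vcpus": 2, "ram_gb": 8},
    "medium": {"vcpus": 4, "ram_gb": 16},
    "large":  {"vcpus": 8, "ram_gb": 32},
}

def route_request(size, count):
    """Auto-provision pre-approved sizes; route anything else to human review."""
    if size in PRE_APPROVED and count <= 10:  # illustrative per-request cap
        spec = PRE_APPROVED[size]
        return ("auto-provision", [f"vm-{size}-{i}" for i in range(count)], spec)
    return ("needs-review", [], None)

decision, vms, spec = route_request("medium", 3)  # provisioned in minutes
escalated = route_request("custom-gpu", 1)        # falls back to the council
```

The approvals happen once, up front, when the catalog is agreed upon -- not per ticket.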

But that all takes a pretty big shift in psychology. A lot of people are used to being the gatekeeper. It can make them uncomfortable to change. Fortunately for me, a lot of the people at REI are on-board with this sort of approach. But I think that resistance can be something a lot of people run into.

Gardner: You can’t just buy automation in a box off of a shelf. You have to deal with an accumulation of manual processes and habits. Why is moving beyond the manual processes culture so important?

Sullins: I call it a private cloud because that means there is a healthy level of competition between what’s going in the public cloud and what we do in that data center.

The public cloud team has the capability of “selling” their solution side-by-side with ours. When you have application owners who are technically adept -- and pretty much all of them are at REI -- they can be tempted to say, “Well, I don’t want to wait a week or two to get a VM. I want to create one right now out on the public cloud.”

That’s a big challenge for us. So what we are trying to accomplish -- and we have had success so far through the transition -- is to offer our customers a spectrum of services. So that’s great.

The stakeholders consuming that now gain flexibility. They can say, “Okay, yeah, I have this application. I want to run it in the public cloud, but I can’t based on the needs for that application. We have to run it on-premises.” And now they can do that in an automated way. That’s a big win, and that’s what people expect now, quite honestly.

Gardner: They want the look and feel of a public cloud but with all the benefits of the private cloud. It’s up to you to provide that. Let’s find out how you did.

How did you overcome the challenges that we talked about and what are the investments that you made in tools, platforms, and an ecosystem of players that accomplished it?

Sullins: As I mentioned previously, a lot of our utilities are “free” -- the Ansibles of the world, PowerCLI, and whatnot. We also use Morpheus for self-service and for automating things on what I call the front end, the customer-facing side. The issue you have there is you don’t get control of scaling up before you provision the VM. You have to monitor for usage and then scale up on the backend, seamlessly. The end users aren’t supposed to know that you are scaling up. I don’t want them to know. It’s not their job to know. I want to remain out of their way.


In order to do that, we’ve used a combination of technologies. HPE actually has a GitHub link for a lot of Ansible playbooks that plug right in. And then the underlying hardware and adjacent management ecosystem is HPE OneView with HPE Synergy and Image Streamer. With a combination of all of those technologies, we were able to accomplish that 18-minute roll-out of our various hosts.

Gardner: Even though you have an integrated platform and solutions approach, it sounds like you have also made the leap from ushering pets through the process into herding cattle. If you understand my metaphor, what has allowed you to stop treating each instance as a pet and start herding this stuff through on an automated basis?

From brittle pets to agile cattle 

Sullins: There is a psychological challenge with that. In the more traditional approach – and the VMware shop listeners are going to be very well aware of this -- I may need to have a four-node cluster with a number of CPUs, a certain amount of RAM, and so on. And that four-node cluster is static. Yes, if I need to add a fifth down the line I can do that, but for that four-node cluster, that’s its home, sometimes for the entire lifecycle of that particular host.

With our approach, we treat our ESXi hosts as cattle. The HPE OneView-Synergy-Image Streamer technology allows us to do that in conjunction with those tools we mentioned previously, for the end point in particular.

So rather than have a cluster, and it’s static and it stays that way -- it might have a naming convention that indicates what cluster it’s in and where -- in reality we have cattle-based DNS names for ESXi hosts. At any time, the understanding throughout the organization, or at least for the people who need to know, is that any host can be pulled from one cluster automatically and placed into another, particularly when it comes to resource usage on that cluster. My dream is that the robots will do this automatically.

So if you had a cluster that goes into the yellow, with its capacity usage based on a threshold, the robot would interpret that and say, “Oh, well, I have another cluster over here with a host that is underutilized. I’m going to pull it into the cluster that’s in the yellow and then bring it back into the green again.” This would happen all while we sleep. When we wake up in the morning, we’d say, “Oh, hey, look at that. The robots moved that over.”
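
The decision those robots would make can be sketched as a threshold check: find a cluster over its yellow line, find a donor with headroom, and emit a move. The thresholds, cluster names, and host names below are invented for illustration.

```python
# Sketch of threshold-driven rebalancing: move a host from an underutilized
# cluster into one whose capacity usage has crossed the "yellow" threshold.

YELLOW = 0.80     # illustrative: above this, a cluster needs another host
DONOR_MAX = 0.50  # illustrative: below this, a cluster can spare a host

def plan_moves(clusters):
    """clusters: {name: {"usage": float, "hosts": [names]}} -> list of moves."""
    hot = [n for n, c in clusters.items() if c["usage"] > YELLOW]
    cool = [n for n, c in clusters.items()
            if c["usage"] < DONOR_MAX and len(c["hosts"]) > 1]
    moves = []
    for target in hot:
        if not cool:
            break
        donor = min(cool, key=lambda n: clusters[n]["usage"])  # most headroom
        moves.append({"host": clusters[donor]["hosts"][-1],
                      "from": donor, "to": target})
        cool.remove(donor)
    return moves

state = {
    "digital-retail":  {"usage": 0.87, "hosts": ["esx10", "esx11"]},
    "general-compute": {"usage": 0.35, "hosts": ["esx20", "esx21", "esx22"]},
}
moves = plan_moves(state)
```

Feed the resulting move plan into the same reallocation automation, run it on a schedule, and the clusters are back in the green by morning.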

Gardner: Algorithmic operations. It sounds very exciting.

Automation begets more automation 

Sullins: Yes, we have the push-button automation in place for that. It’s the next level of what that engine is that’s going to make those decisions and do all of those things.

Gardner: And that raises another issue. When you take the plunge into IT automation -- making your way down the Chisholm Trail with your cattle -- all of a sudden it becomes easier along the way. The automation begets more automation. As you learn and grow, does it become more automated along the way?

Sullins: Yes. Just to put an exclamation point on this topic, imagine the situation we opened the podcast with, which is, “Okay, we have to reallocate a bunch of hosts for rei.com.” If it’s fully automated, and we have robots making those decisions, the response is instantaneous. “Oh, hey, we want to scale up by 200 percent on rei.com.” We can say, “Okay, go ahead, roll out your VM. The system will react accordingly. It will add physical hosts as you see fit, and we don’t have to do anything, we have already done the work with the automation.” Right?

But to the automation begetting automation, which is a great way of putting it, by the way, there are always opportunities for more automation. And on a career side note, I want to dispel the myth that you automate your way out of a job. That is a complete and total myth. I’m not saying it never happens, where people get laid off as a result of automation, but that’s relatively rare, because when you automate something, that automation is going to need to be maintained as things change over time.

The other piece of that is, a lot of times you have different organizations at various states of automation. Once you get your head above water -- to where a process has become trivial because it’s been automated -- you can concentrate on automating more things, or new things that need to be automated. And whether that’s the process for rolling out VMs, a new feature base, monitoring, or auto-scaling -- whatever it is -- you have the capability from day one to further automate these processes.

Gardner: What was it specifically about the HPE OneView and Synergy that allowed you to move past the manual processes, firefighting, and culture of gatekeeping into more herding of cattle and being progressively automated?

Sullins: It was two things. The Image Streamer was number one. To date, we don’t run a PXE boot infrastructure; not that we can’t, it’s just not something that we have traditionally done. We needed a more standard process for doing that, and Image Streamer fit and solved that problem.

The second piece is the provided Ansible playbooks that HPE has to kick off the entire process. If you are somewhat versed in how HPE does things through OneView, you have a server profile that you can impose on a blade, and that can be fully automated through Ansible.

And, by the way, you don’t have to use Image Streamer to use Ansible automation. This is really more of an HPE OneView approach, whereby you can actually use it to do automated profiles and whatnot. But the Image Streamer is really what allows us to say, “Okay, we build a gold image. We can apply that gold image to any frame in the cluster.” That’s the first part of it, and the rest is configuring the other side.

Gardner: Bryan, it sounds like the HPE Composable Infrastructure approach works well with others. You are able to have it your way because you like Ansible, and you have a history of certain products and skills in your organization. Does the HPE Composable Infrastructure fit well into an ecosystem? Is it flexible enough to integrate with a variety of different approaches and partners?

Sullins: It has been so far, yes. We have anticipated leveraging HPE for our bare metal Linux infrastructure. One of the additional driving forces and big initiatives right now is Kubernetes. We are going all-in on Kubernetes in our private cloud, as well as in some of our worker nodes. We eventually plan on running those as bare metal. And HPE OneView, along with Image Streamer, is something that we can leverage for that as well. So there is flexibility, absolutely, yes.

Coordinating containers 

Gardner: It’s interesting, you have seen the transition from having VMware and other hypervisor sprawl to finding a way to manage and automate all of that. Do you see the same thing playing out for containers, with the powerful endgame of being able to automate containers, too?

Sullins: Right. We have been utilizing Rancher as our coordination tool for our Kubernetes infrastructure, and utilizing vSphere for that.

As far as the containerization approach, REI was doing containers before containers were a big thing. Our containerization platform has been around since at least 2015. So REI has been pretty cutting-edge as far as that is concerned.


And now that Kubernetes has won the orchestration wars, as it were, we are looking to standardize that for people who want to do things online, which is to say, going back to the digital transformation journey.

Basically, the industry has caught up with what our super-awesome developers have done with containerization. But we are looking to transition the heavy lifting of maintaining a platform away from the developers. Now that we have a standard approach with Kubernetes, they don’t have to worry so much about it. They can just develop what they need to develop. It will be a big win for us.

Gardner: As you look back at your automation journey, have you developed a philosophy about automation? How should this best work in the future?

Trust as foundation of automation 

Sullins: Right. Have you read Gene Kim’s The Unicorn Project? Well, there is also his The Phoenix Project. My take from that is the whole idea of trust, of trusting other people. And I think that is big.

I see that quite a bit in multiple organizations. At REI, we are going to work as a team and we trust each other. So we have a pretty good culture. But I would imagine that in some places that is still a big challenge.

And if you take a look at The Unicorn Project, a lot of the issues have to do with trusting other human beings. Something happened, somebody made a mistake, and it caused an outage. So they lock it up and lock it away and say only certain people can do that. And if you multiply that happening multiple times -- with different individuals locking things down -- it leads to not being able to automate processes without somebody approving them, right?

Gardner: I can't imagine you would have been capable, when you had to transition your private cloud for more online activity, if you didn’t have that trust built into your culture.

Sullins: Yes, and the big challenge that might still come up is the idea of trusting your end users, too. Once you go into the realm of self-service, you come up against the typical what-ifs. What if somebody adds a zero and they meant to only roll out 4 VMs but they roll out 40? That’s possible. How do you create guardrails that are seamless? If you can, then you can trust your users. You decrease the risk and can take that leap of faith that bad things won’t happen.
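
That extra-zero scenario is straightforward to guard against seamlessly. Here is a sketch of such guardrails -- a hard per-request cap plus a sanity check against a requester’s typical volume -- with all limits invented for illustration.

```python
# Hypothetical guardrails for self-service VM requests: an absolute cap plus
# an anomaly check that catches the "40 instead of 4" typo before provisioning.

HARD_CAP = 20       # illustrative absolute per-request limit
ANOMALY_FACTOR = 5  # illustrative: flag requests >5x the user's usual size

def check_request(count, usual):
    """Return (allowed, reason); deny or flag instead of silently provisioning."""
    if count <= 0:
        return (False, "count must be positive")
    if count > HARD_CAP:
        return (False, f"exceeds hard cap of {HARD_CAP}")
    if usual and count > ANOMALY_FACTOR * usual:
        return (False, "anomalous jump; confirm request")
    return (True, "ok")

ok = check_request(4, usual=4)     # a normal request passes untouched
typo = check_request(40, usual=4)  # the extra-zero case is caught
```

The user who asks for a sane number never notices the guardrail; only the outlier gets a prompt to confirm.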

Gardner: Tell us about your wish list for what comes next. What would you like HPE to be doing?

Small steps and teamwork rewards 

Sullins: My approach is to first automate one thing and then work out from there. You don’t have to boil the ocean. Start with something small and work your way up.

As far as next steps, we want auto-scaling at the physical layer, with the robots doing all of that. The robots will scale our resources up and down while we sleep.

We will continue to do application programming interface (API)-capable automation with anything that has a REST API. If we can connect to that and manipulate it, we can do pretty much whatever automation we want.
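
That REST pattern is the same everywhere: an endpoint, an auth header, a JSON body. A minimal Python sketch of it follows, using the standard library and a placeholder URL and token rather than any real service’s endpoints.

```python
import json
import urllib.request

# Generic pattern for driving any REST-capable service: build an authenticated
# JSON request. The endpoint and token here are placeholders, not real values.

def build_call(base_url, path, token, payload):
    """Construct (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

req = build_call("https://automation.example.internal", "/api/v1/hosts",
                 "example-token", {"action": "reallocate", "host": "esx22"})
# req is now ready to hand to urllib.request.urlopen(req).
```

Once you can connect to an API like this and manipulate it, any service that exposes a REST interface becomes scriptable in the same few lines.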


We are also containerizing all things. If an application can be containerized properly, we containerize it.

As far as what decision-making engine we have to do the auto-scaling on the physical layer, we haven’t really decided upon what that is. We have some ideas but we are still looking for that.

Gardner: How about more predictive analytics using artificial intelligence (AI) with the data that you have emanating from your data center? Maybe AIOps?

Sullins: Well, without a doubt. I, for one, haven’t done any sort of deep dive into that, but I know it’s all the rage right now. I would be open to pretty much anything that will encompass what I just talked about. If that’s HPE InfoSight, then that’s what it is. I don’t have a lot of experience quite honestly with InfoSight as of yet. We do have it installed in a proof of concept (POC) form, although a lot of the priorities for that have been shifted due to COVID-19. We hope to revisit that pretty soon, so absolutely.


Gardner: To close out, you were ahead of the curve on digital transformation. That allowed you to be very agile when it came time to react to the COVID-19 pandemic.  What did that get you? Do you have any results?

Sullins: Yes, as a matter of fact, our boss’s boss, his boss -- so three bosses up from me -- he actually sits in on our load testing. It was an all-hands-on-deck situation during that May online sale. He said that it was the most seamless one that he had ever seen. There were almost no issues with this one.

What I attribute that to is, yes, we had done what we needed on the infrastructure side to make sure that we met dynamic demands. Also, everybody worked as a team -- everybody, all the way up the stack, from our infrastructure contribution, to the hypervisor and hardware layer, all the way on up to the application layer and the containers, and all of our DevOps stuff. It was very successful. We went past the goals we had set for the sale, so it was a win-win all the way around.

Gardner: Even though you were going through this terrible period of adjustment, that’s very impressive.

Sullins: Yes.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.


Friday, June 19, 2020

How the right data and AI deliver insights and reassurance on the path to a new normal


The next BriefingsDirect Voice of AI Innovation podcast explores how businesses and IT strategists are planning their path to a new normal throughout the COVID-19 pandemic and recovery.

By leveraging the latest tools and gaining data-driven inferences, architects and analysts are effectively managing the pandemic response -- and giving more people better ways to improve their path to the new normal. Artificial intelligence (AI) and data science are proving increasingly impactful and indispensable.


Stay with us as we examine how AI forms the indispensable pandemic response team member for helping businesses reduce risk of failure and innovate with confidence. To learn more about the analytics, solutions, and methods that support advantageous reactivity -- amid unprecedented change -- we are joined by two experts.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

Please welcome Arti Garg, Head of Advanced AI Solutions and Technologies, at Hewlett Packard Enterprise (HPE), and Glyn Bowden, Chief Technologist for AI and Data, at HPE Pointnext Services. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: We’re in uncharted waters in dealing with the complexities of the novel coronavirus pandemic. Arti, why should we look to data science and AI to help when there’s not much of a historical record to rely on?  

Garg: Because we don’t have a historical record, I think data science and AI are proving to be particularly useful right now in understanding this new disease and how we might potentially better treat it, manage it, and find a vaccine for it. And that’s because at this moment in time, raw data that are being collected from medical offices and through research labs are the foundation of what we know about the pandemic.

This is an interesting time because, when you know a disease, medical studies and medical research are often conducted in a very controlled way. You try to control the environment in which you gather data, but unfortunately, right now, we can’t do that. We don’t have the time to wait.

And so instead, AI -- particularly some of the more advanced AI techniques -- can be helpful in dealing with unstructured data or data of multiple different formats. It’s therefore becoming very important in the medical research community to use AI to better understand the disease. It’s enabling some unexpected and very fruitful collaborations, from what I’ve seen.

Gardner: Glyn, do you also see AI delivering more, even though we’re in uncharted waters?

Bowden: Something like machine learning (ML), for example, which is a subset of AI, is very good at handling many, many features. With a human being approaching these projects, there are only so many things you can keep in your head at once in terms of the variables you need to consider when building a model to understand something.

But when you apply ML, you are able to cope with millions or billions of features simultaneously -- and then simulate models using that information. So it really does add the power of a million scientists to the same problem we were trying to face alone before.

Gardner: And is this AI benefit something that we can apply in many different avenues? Are we also modeling better planning around operations, or is this more research and development? Is it both?

Garg: There are two ways to answer the question of what’s happening with the use of AI in response to the pandemic. One relates to the practice of data science itself.

Right now, data scientists are collaborating directly with medical science researchers and learning how to incorporate subject matter expertise into data science models. This has been one of the challenges preventing businesses from adopting AI in more complex applications. But now we’re developing some of the best practices that will help us use AI in a lot of domains.

In addition, businesses are considering the use of AI to help them manage their businesses and operations going forward. That includes things such as using computer vision (CV) to ensure that social distancing happens with their workforce, or other types of compliance we might be asked to do in the future.

Gardner: Are the pressures of the current environment allowing AI and data science benefits to impact more people? We’ve been talking about the democratization of AI for some time. Is this happening more now?

More data, opinions, options

Bowden: Absolutely, and that’s both a positive and a negative. The data around the pandemic has been made available to the general public. Anyone looking at news sites or newspapers and consuming information from public channels -- accessing the disease incidence reports from Johns Hopkins University, for example -- we have a steady stream of it. But those data sources are all over the place and are being thrown to a public that is only just now becoming data-savvy and data-literate.

As they consume this information, add their context, and get a personal point of view, that is then pushed back into the community again -- because as you get data-centric you want to share it.

So we have a wide public feed -- not only from universities and scholars, but from the general public, who are now acting as public data scientists. I think that’s creating a huge movement.

Garg: I agree. Making such data available exposes pretty much anyone to these amazing data portals, like Johns Hopkins University has made available. This is great because it allows a lot of people to participate.

It can also be a challenge because, as I mentioned, when you’re dealing with complex problems you need to be able to incorporate subject matter expertise into the models you’re building and in how you interpret the data you are analyzing.

And so, unfortunately, we’ve already seen some cases -- blog posts or other types of analysis -- that get a lot of attention in social media but are later found to be not taking into account things that people who had spent their careers studying epidemiology, for example, might know and understand.

https://www.hpe.com/us/en/home.html
Gardner: Recently, I’ve seen articles where people now are calling this a misinformation pandemic. Yet businesses and governments need good, hard inference information and data to operate responsibly, to make the best decisions, and to reduce risk.

What obstacles should people overcome to make data science and AI useful and integral in a crisis situation?

Garg: One of the things that’s underappreciated is that a foundation, a data platform, makes data managed and accessible so you can contextualize and make stronger decisions based on it. That’s going to be critical. It’s always critical in leveraging data to make better decisions. And it can mean a larger investment than people might expect, but it really pays off if you want to be a data-driven organization.

Know where data comes from 

Bowden: There are a plethora of obstacles. The kind that Arti is referring to, and that is being made more obvious in the pandemic, is the way we don’t focus on the provenance of the data. So, where does the data come from? That doesn’t always get examined, and as we were talking about a second ago, the context might not be there.

All of that can be gleaned from knowing the source of the data. The source of the data tends to come from the metadata that surrounds it. So the metadata is the data that describes the data. It could be about when the data was generated, who generated it, what it was generated for, and who the intended consumer is. All of that could be part of the metadata.

Organizations need to look at these data sources because that’s ultimately how you determine the trustworthiness and value of that data.
We don't focus on the provenance of the data. Where does the data come from? That doesn't always get examined, and the context might not be there.

Now it could be that you are taking data from external sources to aggregate with internal sources. And so the data platform piece that Arti was referring to applies to properly bringing those data pieces together. It shouldn’t just be you running data silos and treating them as you always treated them. It’s about aggregation of those data pieces. But you need to be able to trust those sources in order to be able to bring them together in a meaningful way.

So understanding the provenance of the data, understanding where it came from or where it was produced -- that’s key to knowing how to bring it together in that data platform.
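The metadata Bowden describes -- who generated the data, when, for what, and for whom -- can be made concrete. Here is a minimal sketch in Python of a provenance record used as a trust gate before aggregation; the field names, source names, and the trust rule are all invented for illustration, not part of any standard:

```python
from dataclasses import dataclass
from datetime import date

# Illustrative list of sources this (hypothetical) organization trusts.
TRUSTED_SOURCES = {"Johns Hopkins University", "internal-warehouse"}

@dataclass
class Provenance:
    source: str              # who generated the data
    generated_on: date       # when it was generated
    purpose: str             # what it was generated for
    intended_consumer: str   # who was meant to consume it

def is_trusted(p: Provenance) -> bool:
    """Trustworthiness check based purely on the data's provenance."""
    return p.source in TRUSTED_SOURCES

# An external dataset with known provenance, and one without.
external = Provenance("Johns Hopkins University", date(2020, 5, 1),
                      "disease incidence reporting", "general public")
unknown = Provenance("random-blog-export", date(2020, 5, 1),
                     "unknown", "unknown")

# Only datasets with trusted provenance make it into the aggregation.
usable = [d for d in (external, unknown) if is_trusted(d)]
```

The point is not the specific rule but the discipline: the metadata travels with the dataset, so the decision to aggregate can be made explicitly rather than assumed.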

Gardner: Along the lines of necessity being the mother of invention, it seems to me that a crisis is also an opportunity to change culture in ways that are difficult otherwise. Are we seeing accelerants given the current environment to the use of AI and data?

AI adoption on the rise 

Garg: I will answer that question from two different perspectives. One is certainly the research community. Many medical researchers, for example, are doing a lot of work that is becoming more prominent in people’s eyes right now.

I can tell you from working with researchers in this community and knowing many of them, that the medical research community has been interested and excited to adopt advanced AI techniques, big data techniques, into their research.

https://www.hpe.com/us/en/solutions/artificial-intelligence.html

It’s not that they are doing it for the first time, but definitely I see an acceleration of the desire and necessity to make use of non-traditional techniques for analyzing their data. I think it’s unlikely that they are going to go back to not using those for other types of studies as well.

In addition, you are definitely going to see AI utilized and become part of our new normal in the future, if you will. We are already hearing from customers and vendors about wanting to use things such as CV to monitor social distancing in places like airports where thermal scanning might already be used. We’re also seeing more interest in using that in retail.

So some AI solutions will become a common part of our day-to-day lives.

Gardner: Glyn, a more receptive environment to AI now?

Bowden: I think so, yes. The general public are particularly becoming used to AI playing a huge role. The mystery around it is beginning to fade and it is becoming far more accepted that AI is something that can be trusted.

It does have its limitations. It’s not going to turn into Terminator and take over the world.

The fact that we are seeing AI more in our day-to-day lives means people are beginning to depend on the results of AI, at least in understanding the pandemic, and that drives acceptance.
The general public are particularly becoming used to AI playing a huge role. The mystery around it is beginning to fade and it is becoming far more accepted that AI is something that can be trusted.

When you start looking at how it will enable people to get back to somewhat of a normal existence -- to go to the store more often, to be able to start traveling again, and to be able to return to the office -- there is that dependency that Arti mentioned around video analytics to ensure social distancing or temperatures of people using thermal detection. All of that will allow people to move on with their lives and so AI will become more accepted.

I think AI softens the blow of what some people might see as an erosion of civil liberties. It says, in effect, “This is the benefit, and this is as far as it goes.” So it at least informs those discussions in ways it hadn’t before.

Garg: One of the really valuable things happening right now is how major news publications have been publishing amazing infographics -- very informative, both in terms of the analysis they provide of the data and very specific things, like how restaurants are recovering in areas that have stay-in-place orders.

In addition to providing nice visualizations of the data, some of the major news publications have been very responsible by providing captions and context. It’s very heartening in some cases to look at the comments sections associated with some of these infographics as the general public really starts to grapple with the benefits and limitations of AI, how to contextualize it and use it to make informed decisions while also recognizing that you can go too far and over-interpret the information.

Gardner: Speaking of informed decisions, to what degree are you seeing the C-suite -- the top executives in many businesses -- look to their dashboards and query datasets in new ways? Are we seeing data-driven innovation at the top of decision-making as well?

Data inspires C-suite innovation 

Bowden: The C-suite is definitely taking a lot of notice of what’s happening in the sense that they are seeing how valuable the aggregation of data is and how it’s forwarding responses to things like this.

So they are beginning to look internally at what data sources are available within their own organizations. They are thinking about how to bring those sources together to get a better view of not only the tactical decisions they have to make but also, using the macro environmental data, how to start making strategic decisions. The value is being demonstrated for them in plain sight.


So rather than having to experiment to see if there is going to be value, there is a full expectation that value will be delivered, and the experiment is now about how much they can draw from this data.

Garg: It’s a little early to see how much this is going to change their decision-making, especially because, frankly, we are in a moment when a lot of the C-suite was already exploring AI and opening up to its possibilities in a way they hadn’t even a year ago.

And so there is an issue of timing here. It’s hard to know which is the cause and which is just a coincidence. But, for sure, to Glyn’s point, they are dealing with more change.


Gardner: For IT organizations, many of them are going to be facing some decisions about where to put their resources. They are going to be facing budget pressures. For IT to rise and provide the foundation needed to enable what we have been talking about in terms of AI in different sectors and in different ways, what should they be thinking about?

How can IT make sure they are accelerating the benefits of data science at a time when they need to be even more choosy about how they spend their dollars?

IT wields the sword to deliver DX 

Bowden: IT in particular has never had so much focus as right now, and budgets are probably responding in a similar way. This is because everyone now has to look at their digital strategy and their digital presence -- and move as much as they can online to be resilient to pandemics and similar at-risk situations.

So IT has to carry the sword, if you like, in that battle. They have to fix the digital strategy. They have to deliver on that digital promise. And there is an immediate expectation from customers that things just will be available online.
With the pandemic, there is now an AI movement that will get driven purely from the fact that so much more commerce and business are going to be digitized. We need to enable that digital strategy.

If you look at students in universities, for example, they assume it will be a very quick fix to start joining Zoom calls and meet that need right away. In reality, there is a much bigger infrastructure that has to sit behind those services to enable that digital strategy.

So, there is now an AI movement that will get driven purely from the fact that so much more commerce and business is going to be digitized.

Gardner: Let’s look to some more examples and associated metrics. Where do you see AI and data science really shining? Are there some poster children, if you will, of how organizations -- either named or unnamed -- are putting AI and data science to use in the pandemic to mitigate the crisis or foster a new normal?

Garg: It’s hard to say how the different types of video analytics and CV techniques are going to facilitate reopening in a safe manner. But that’s what I have heard about the most at this time in terms of customers adopting AI.

In general, we are at a very early stage in how organizations decide to adopt AI. The research community is scrambling to take advantage of this, but for organizations it’s going to take time. If you do it right, AI can be transformational. Yet transformational usually means that a lot of things need to change -- not just the solution you have deployed.

Bowden: There’s a plethora of examples from the medical side, such as how we have been able to do gene analysis, and those sorts of things, to understand the virus very quickly. That’s well-known and well-covered.

The bit that’s less well covered is AI supporting decision-making by governments, councils, and civil bodies. They are taking not only the data on how many people are getting sick and how many are in hospital, which is very important for understanding where the disease is, but augmenting it with socioeconomic data. That means you can understand, for example, where an aging population might live, or where a poorer population might live because there is less employment in that area.

The impact of what will happen to their jobs, what will happen if they lose transport links, and the impact if they lose access to healthcare -- all of that is being better understood by the AI models.

As we focus on not just the health data but also the economic data and social data, we have a much better understanding of how society will react, which has been guiding the principles that the governments have been using to respond.
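The augmentation Bowden describes amounts to joining health data with socioeconomic data keyed by region. A toy sketch in Python -- the region names, figures, and risk rule are invented purely for illustration:

```python
# Health data: case counts per region (illustrative figures).
health = {
    "Northfield": {"cases": 120, "hospitalized": 14},
    "Southport":  {"cases": 45,  "hospitalized": 3},
}

# Socioeconomic data for the same regions (also illustrative).
socio = {
    "Northfield": {"median_age": 52, "unemployment_pct": 9.1},
    "Southport":  {"median_age": 34, "unemployment_pct": 4.2},
}

# Join the two datasets on region so each record carries both views.
augmented = {
    region: {**health[region], **socio.get(region, {})}
    for region in health
}

# A crude proxy for elevated risk: high case counts coinciding with
# an older population. Real models weigh far more factors.
high_risk = [r for r, d in augmented.items()
             if d["cases"] > 100 and d["median_age"] > 50]
```

Even this trivial join shows why the aggregation matters: neither dataset alone can say which communities combine high incidence with high vulnerability.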

So when people look at the government and say, “Well, they have come out with one thing and now they are changing their minds,” that’s normally a data-driven decision and people aren’t necessarily seeing it that way.

So AI is playing a massive role in getting society to understand the impact of the virus -- not just from a medical perspective, but from everything else and to help the people.

Gardner: Glyn, this might be more apparent to the Pointnext organization, but how is AI benefiting the operational services side? Service and support providers have been put under tremendous additional strain and demand, and enterprises are looking for efficiency and adaptability.

Are they pointing the AI focus at their IT systems? How does the data they use for running their own operations come to their aid? Is there an AIOps part to this story?

AI needs people, processes 

Bowden: Absolutely, and there has definitely become a drive toward AIOps.

When you look at an operational organization within an IT group today, it’s surprising how much of it is still human-based. It’s a person eyeballing a graph and determining a trend from it. Or it’s the gut feeling a storage administrator has when they know their system is getting full, combined with an idea in the back of their head that something similar happened seasonally last year. Organizations are still making decisions that way.

We are therefore seeing systems such as HPE’s InfoSight become more prominent in the way people make those decisions. That allows plugging into an ecosystem where you can see the trend of your systems over a long period, use AI modeling as well as advanced analytics to understand the behavior of a system over time, and see what events -- like everybody suddenly starting to work remotely -- do to those systems from a data perspective.

So the models need to catch up in that sense as well. But absolutely, AIOps is desirable. If it’s not there today, it’s certainly something that people are pursuing a lot more aggressively than they were before the pandemic.
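The kind of trend analysis that replaces the administrator’s gut feeling can be sketched very simply. Real systems such as HPE InfoSight use far richer models; the least-squares fit and the sample utilization figures below are purely illustrative:

```python
def fit_trend(values):
    """Ordinary least-squares slope and intercept for evenly spaced samples."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def weeks_until_full(usage_pct, capacity_pct=90.0):
    """Forecast how many weeks until usage crosses the capacity threshold."""
    slope, intercept = fit_trend(usage_pct)
    if slope <= 0:
        return None  # usage flat or shrinking; no exhaustion forecast
    crossing_week = (capacity_pct - intercept) / slope
    return max(0.0, crossing_week - (len(usage_pct) - 1))

# Weekly storage utilization (%) while everyone works remotely.
usage = [60.0, 62.0, 64.0, 66.0, 68.0, 70.0]
forecast_weeks = weeks_until_full(usage)
```

The forecast turns “it feels like we’re filling up” into a concrete number that can trigger an alert or a procurement workflow.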

Gardner: As we look to the future, for those organizations that want to be more data-driven and do it quickly, any words of wisdom with 20/20 hindsight? How do you encourage enterprises -- and small businesses as well -- to better prepare themselves to use AI and data science?

Garg: Whenever I think about an organization adopting AI, it’s not just the AI solution itself but all of the organizational processes -- and most importantly the people in an organization and preparing them for the adoption of AI.

I advise organizations that want to use AI and incorporate data-driven decision-making to, first of all, make sure they are solving a really important problem for the organization. Sometimes the goal of adopting AI becomes more important than the goal of solving an actual problem. So I always encourage any AI initiative to be focused on really high-value efforts.


Use your AI initiative to do something really valuable to your organization and spend a lot of time thinking about how to make it fit into the way your organization currently works. Make it enhance the day-to-day experience of your employees because, at the end of the day, your people are your most valuable assets.

Those are important non-technical considerations, separate from the AI solution itself, that organizations should think about if they want the shift to being AI-driven and data-driven to be successful.

For the AI itself, I suggest using the simplest-possible model, solution, and method of analyzing your data that you can. I cannot tell you the number of times where I have heard an organization come in saying that they want to use a very complex AI technique to solve a problem that if you look at it sideways you realize could be solved with a checklist or a simple spreadsheet. So the other rule of thumb with AI is to keep it as simple as possible. That will prevent you from incurring a lot of overhead.
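Garg’s rule of thumb -- start with the simplest possible method -- often means measuring a trivial baseline before building anything complex. A sketch of that habit; the task and figures are invented for illustration:

```python
def majority_baseline(labels):
    """Predict the most common historical label for every case."""
    majority = max(set(labels), key=labels.count)
    return [majority] * len(labels)

def accuracy(predicted, actual):
    """Fraction of predictions that match the actual outcomes."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Historical outcomes: did an order ship on time? (True = on time)
history = [True, True, True, False, True, True,
           True, True, False, True]

baseline_preds = majority_baseline(history)
baseline_acc = accuracy(baseline_preds, history)
# If always predicting "on time" already scores 80 percent, a complex
# model must clearly beat that margin to justify its overhead.
```

This is the spreadsheet-or-checklist test in code form: the complex technique earns its place only if it meaningfully outperforms the simplest answer.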

Gardner: Glyn, how should organizations prepare to integrate data science and AI into more parts of their overall planning, management, and operations?

Bowden: You have to have a use case with an outcome in mind. It’s very important that you have a metric to determine whether it’s successful or not, and for the amount of value you add by bringing in AI. Because, as Arti said, a lot of these problems can be solved in multiple ways; AI isn’t the only way and often isn’t the best way. Just because it exists in that domain doesn’t necessarily mean it should be used.
AI isn't an on/off switch; it's an iteration. You can start with something small and then build into bigger and bigger components that bring more data to bear on the problem, and then add new features that lead to new functions and outcomes.

The second part is AI isn’t an on/off switch; it’s an iteration. You can start with something small and then build into bigger and bigger components that bring more and more data to bear on the problem, as well as then adding new features that lead to new functions and outcomes.

The other part of it is: AI is part of an ecosystem; it never exists in isolation. You don’t just drop in an AI system on its own and it solves a problem. You have to plug it into other existing systems around the business. It has data sources that feed it so that it can come to some decision.

Unless you think about what happens beyond that -- whether it’s visualizing something for a human being who will make a decision or automating a decision -- you might as well just hire the smartest person you can find and lock them in a room.

Pandemic’s positive impact

Gardner: I would like to close out our discussion with a riff on the adage of, “You can bring a horse to water but you can’t make them drink.” And that means trust in the data outcomes and people who are thirsty for more analytics and who want to use it.

How can we look with reassurance at the pandemic as having a positive impact on AI in that people want more data-driven analytics and will trust it? How do we encourage the perception to use AI? How is this current environment impacting that?

Garg: So many people are checking the trackers of how the pandemic is spreading, and they are learning as major news publications do a great job of explaining it. Through that tracking they see how stay-in-place orders affect the spread of the disease in their communities. You are seeing that already.

We are seeing growing trust in how analyzing data can help make better decisions. As I mentioned earlier, this leads to a better understanding of the limitations of data and a willingness to engage with data outputs as something more than black-or-white answers.

As Glyn mentioned, it’s an iterative process, understanding how to make sense of data and how to build models to interpret the information that’s locked in the data. And I think we are seeing that.

We are seeing a growing desire to not view this as some kind of black box sitting in a data center somewhere -- programmed by someone you don’t know, producing a result that will affect you. For some people that might be a positive thing, but for others it might be scary.

People are now much more willing to engage with the complexities of data science. I think that’s generally a positive thing for people wanting to incorporate it in their lives more because it becomes familiar and less other, if you will.

Gardner: Glyn, perceptions of trust as an accelerant to the use of yet more analytics and more AI?

Bowden: The trust comes from the fact that so many different data sources are out there. So many different organizations have made the data available that there is a consistent view of where the data works and where it doesn’t. And that’s built up the capability of people to accept that not all models work the first time, that experimentation does happen, and it is an iterative approach that gets to the end goal.


I have worked with customers who, when a first experiment fell flat because it didn’t quite hit the accuracy targets they were looking for, ended the experiment. Whereas now, I think, we are seeing in real time and on a massive scale that it’s all about iteration. It doesn’t necessarily work the first time. You need to recalibrate, refine, and move on. You bring in new data sources to get the extra value.

What we are seeing throughout this pandemic is that the more expertise and data science you throw at a problem, the better the outcome at the end. It’s not about the first result. It’s about the direction of the results, and the upward trend of success.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.

You may also be interested in: