Wednesday, April 20, 2016

ITSM automation and intelligence gains deliver self-service help to more users

The next BriefingsDirect IT support thought leadership discussion highlights how automation, self-service and big data analytics are combining to allow IT help desks to do more for less.

We'll learn how automation and ITSM-driven insights endow help desk personnel with more knowledge and provide a single point of support for end users, regardless of their needs while still catering to their preferred method of help.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

Here to share the latest on how IT support is advancing in the era of bring your own device (BYOD), cloud, and tight budgets, are three experts, David Blackeby, Program Solution Owner for Cloud Services at Sopra Steria, based in the UK; Diana Wosik, Group Program Manager at Sopra Steria, based in Poland, and Mark Laird, Group Technical Architect at Sopra Steria, based in the UK. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Let’s start at a high level and talk about how support has changed, and why enabling self-service is so important nowadays. Mark, why is self-service such an important issue when it comes to IT help desk?

Laird: For us, there are probably a number of issues. We have a range across our customer base, from millennials, who are used to dealing with websites, mobile, tablets, who really don’t want to call a call center, and don’t want to end up talking to somebody on the phone, through to the legacy users who are much more used to picking up the phone, asking for help, and talking through a problem.

So they're looking for a more human approach, human interaction, versus the millennials who want to fix it themselves, want to do it quickly, and really don’t want to talk to somebody about it. That’s introducing a range of problems and challenges.

Gardner: It sounds as if you need to deliver support in a spectrum of ways, but perhaps with a common core to that support function.

Underlying answer

Laird: The underlying answer to the problem, whatever the problem is, is likely to be the same. If you have a log-on issue, it will be a password reset or an account issue. It’s how you get that information out to the person who has the challenge.

If it’s a person on the phone, it's easy enough to talk them through it. But if you have somebody who is coming through a self-service portal, you have to provide them with that same information. So yes, at times, you connect a single call, a single database, and send your knowledge environment to a range of callers.

Gardner: David, we're being called on here to deliver support across the spectrum of modalities, methods, or even latency, but at the same time, many of the world governments are asking for austerity and savings in their budgets for IT. How are we able to reconcile this need for more variety and the delivery of help desk services, but cutting costs at the same time? Is there any way to reconcile them?

Blackeby: It’s part of the core challenge in the current world with austerity, where both our public and private customers are looking at how they can do more for less money.

IT has continuing pressure to reduce the cost and overhead of providing IT. At the same time, we talk about new methods of self-service, different types of platforms, different types of devices, and this multi-channel effect, all of which cost time, effort, and money to invest in.

That’s the underlying driver for how it comes down to the service provider to do that. The only way we can do that is looking at industrializing that service delivery and automating processes, moving activities that may have previously been done by Level 2 and Level 3 resources. We're looking at how we can move those to cheaper or lower-cost resources, such as a service desk, or in an ideal world, remove them entirely from the cost chain and drive the automation. So the activity increases the speed and the agility while reducing the cost of delivering the service.

Gardner: Diana, another variable in the mix here is the increased use of mobile devices, of fluidity of the user in terms of their geography, their location, even the time of day that they might be working, and of course there is a plethora of devices, if you want to bring your own device organization. How is mobility affecting this equation for a more complex approach to help desk?

Wosik: Mobility is very important nowadays, because everybody uses mobile devices, every single day. We need to ensure a single point of contact, so they all can approach their help desk at any time they need, and they need the availability 24×7 for that.

Gardner: So, we've established that we have a need for more variability, addressing more types of help from more types of users. Tell me a bit more, Mark, about automation and self-service and how they support one another? What is it about automating processes that endows the user with more access to help, but then maybe that same feedback loop between the user and the support infrastructure can be brought to bear on future issues?

Laird: Automation is doing the same thing in a repeated, controlled fashion. Whether it’s a password reset or the delivery of a service or a server, what you're doing is scripting. You're putting into a workflow a process that a user can call on. Whether that user is an end user, an end customer, or in fact one of the operations team, it allows them to do that fairly standard process in a repeated quality controlled fashion.

And that can allow lower cost, potentially, as David said, bringing the tasks from maybe a qualified Level 3 expensive support person into an operations center, or in fact, maybe on to the self-service portal, where you're not having to give access to systems to end users, but you are allowing them to run a script.

Double benefit

Gardner: David, perhaps you could help me understand why self-service is a benefit to both the receiver of the help, the end user, as well as the organization. What is it about self-service that refines process and benefits the deliverer of the help, but at the same time, gives more speed or perhaps options to the receiver of the help?

Blackeby: Essentially it supports both sides of the equation. From an end user perspective, it’s that instant gratification, I can go into a centralized portal. I can do my search or raise my request and I can be instantly satisfied with the response. I could be presented with a knowledge article that tells me how to fix my particular issue.

If I'm requesting a new service to be delivered through orchestration in the back end, I can make my request, and the orchestration comes in and drives the automated delivery of that service to me. So it increases the agility for the user and it reduces delays.

From the other side of the equation, looking at it from a service provider’s perspective, the more work the user can do themselves takes cost away from us as a service provider.

Historically, a user would have called the service desk, so as a part of that conversation you need to understand who the user is to provide them the service. Make sure it’s a service that they are potentially allowed to have and sort of help through the process. That means that we need a body to answer the phone, and the amount of time that we spend on a typical call from the user drives the cost from a support center perspective.

Even if you have a scenario where a user is using the portal today and still ultimately needs a human interaction to deliver that service, we already know who they are and will have asked the relevant questions upfront, which means we don’t have to ask them later on down the line when we try to deliver the service. That reduces the handling time by our agents and by the people who are delivering them the service.

Gardner: Before we dig into the how you do this, now that we have established why it's an important new aspect of helpdesk, Diana, perhaps you can tell us a little bit about Sopra Steria, the organization, and to what degree they are supporting help desks in your markets?

Wosik: I can give you a good example of how it works in Poland and how the automation helps us out regarding the functionality of help desk.

We apply quite a few solutions, like virtual machine (VM) provisioning that automatically provisions the machines aligned to customer needs. There is a monitoring tool that is automated. So not only do we monitor whatever is going on, but we're also able to answer the needs very quickly, thanks to our automation services.

And then there's the thing regarding the automatic deployment of our releases. Whenever there's a new release of the system, we don’t need a bunch of people who are going to work on it. We can also deploy it very quickly in production, and that helps us to bring the solution as quickly as possible to our customer.

Higher-level view

Gardner: Could you give us a higher-level view of Sopra Steria, the organization, and to what degree help desk support is part of a larger portfolio of services?

Laird: We're a European IT company. We run IT for a wide range of European customers. We deliver services. We write software. We do business process outsourcing. Essentially, if there's a computer involved in there somewhere, that’s what we do.

We have a presence in 27 countries across Europe, in India, and then smaller offices in Singapore, Hong Kong, and China. We have 36,500 staff, and an annual turnover of about 3.5 billion euros. So, we're a reasonably large company, one of the top 10 European IT companies.

For us, the service desk is the single point of contact. For all of our customers, that is their point of contact with us, whether it’s through the Global Delivery Center in Poland, where we're offering French, German, English, small amounts of Spanish and Italian, or through some of the in-country service desks, such as the ones we have in France and the UK. So that is our single point of contact and it’s of key importance to us.

Blackeby: Just to follow on from that, the key piece of that is that it’s an intelligent service desk as opposed to a help desk. It’s really about having the phones manned by intelligent people who are able to both try and fix or resolve issues straight away, as opposed to just logging a call, creating a ticket, and passing it off to someone else.

Gardner: How is it that we're providing those individuals on the front line with better knowledge? Are they getting more tools? Are they getting more data? Is this really just correlating a single point of access to the existing data? Is it all of the above? How do we empower those people to do this difficult help desk job better?

Blackeby: In the same way that we try to have a single point of entry for users, for a portal, it’s really the same piece for our support staff as well.

While there are many systems that underpin our service delivery, the key element we strive for is that the operators have a single place to work. It’s very much through the integration of various systems and data sources into a centralized repository, so that the person who’s trying to act on a ticket, request, or other activity has everything they need in one place. They can immediately see what the issue is, see what the request is, and then deliver the service to that end user.

Gardner: It strikes me that whether it’s a help desk’s person or the end user, the more they use this, the more the data can be collected, the more knowledge can be harnessed from the interactions, and therefore brought back through a feedback loop into the next level of support.

Are the cost savings ultimately about being better able to understand the market because of the self-service, because of these portal approaches? Is that a big part of it?

Key items

Blackeby: It feeds into that. If you're looking at industrializing or automating, you're really looking for repeatable activities that are done time and time again. The data helps to support that. It identifies suitable candidates that are high volume, high throughput transactions that are really the key things that you want to focus on in terms of introducing automation into the environment, or automation into task elements in a given process. So, over time, it’s pretty much what we are doing.

As Mark mentioned, we're a managed service provider (MSP), providing the services across many customers. So, a lot of the economies of scale we get are best practices that we apply in one account or particular scenarios or issues that we see in one, we can see correlations in other customer accounts as well. So we can bring those efficiencies and bring that investment we make and automation through our back office processes to benefit multiple customers.

Wosik: What is very well known right now is big data and smart analytics that will help us to gather all the information from our customers, so the more tickets and the more incidents are logged, the more information you can gather as well. This is gathered and analyzed. This is when we can provide more accurate and quicker answers to our customers. It’s something that has really impacted our quality of service.

Gardner: Let’s look also back to the systems, when we think about gathering information, more and more big data gathered from logs and other output data from the systems themselves, from the platforms. How are you at Sopra Steria managing the knowledge gathering from your systems and then applying that into this other knowledge base about the activities on your help desk and from the self-help portal?

Laird: We're looking at some of the new technologies around smart analytics and big data, but we're starting with some of the simpler approaches, which as David alluded to and as Diana mentioned earlier, are just the simple high-volume transactions, the things that we do on a regular basis that are maybe quality issues or maybe they are just time consuming, but those are the key ones we're after.

Then, over the next three to six months, as we move into some of the newer technologies around smart analytics, for example, we'll be taking some of the incidents and things coming into service desk, into the service management system, and looking at those and doing problem management on them.

Have we suddenly got an influx of incidents around our exchange platform? Is that actually indicating that there is an underlying problem or an underlying system error that we need to fix?

It’s starting to link all the various systems, whether it’s the business service monitoring system to the back end that the operations teams are using, or the service management platforms at the front that our service desk people are using, pulling all those together, tying them in with, for example, the configuration management platform, so that people are seeing the same information, both from a front-end user impacting view, or from a back-end infrastructure and service view.

Gardner: And I should think that would also help in more agility to do root-cause analysis and making it faster to time for resolution.

Automate and fix

Laird: Exactly. That goes back to when we fix problems and close incidents, and if there's a resolution in there, doing the analysis on them to identify common fixes. If a particular type of incident comes in and we always do the same thing to it, we can automate that. We can either give the service desk or help desk people access to that quick fix or just automate it right at the start, so when that issue occurs, we automate and fix.

In some cases, that’s moving out of the customer’s view completely. We're fixing it almost before there's an impact.
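The "always do the same thing" analysis Laird describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the incident categories, resolution names, thresholds, and the `automation_candidates` function are all hypothetical and are not taken from Sopra Steria's systems or from any HPE product.

```python
from collections import Counter, defaultdict


def automation_candidates(incidents, min_count=3, min_share=0.8):
    """Return {category: resolution} for categories where one fix dominates.

    incidents: iterable of (category, resolution) pairs from closed tickets.
    min_count and min_share are assumed thresholds, chosen for illustration.
    """
    by_category = defaultdict(Counter)
    for category, resolution in incidents:
        by_category[category][resolution] += 1

    candidates = {}
    for category, resolutions in by_category.items():
        total = sum(resolutions.values())
        top_resolution, top_count = resolutions.most_common(1)[0]
        # Enough history, and almost always resolved the same way
        if total >= min_count and top_count / total >= min_share:
            candidates[category] = top_resolution
    return candidates


# Hypothetical closed incidents
closed = [
    ("password_locked", "reset_password"),
    ("password_locked", "reset_password"),
    ("password_locked", "reset_password"),
    ("printer_offline", "restart_spooler"),
    ("printer_offline", "replace_cable"),
    ("vpn_failure", "reissue_certificate"),
]

print(automation_candidates(closed))
```

With this sample data only the password lockout category qualifies, because it has enough history and a single dominant fix. The other categories either vary in their resolution or have too few closed incidents to judge.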

Gardner: We've talked a bit about making these help desk approaches better from the end-user perspective, empowering the personnel in the help desk organization itself, and finding some new technologies and analysis benefits to propel that forward, but I would like to go back to the issue of cost.

How are we wringing more cost out of this process, perhaps through things like identifying automation and what’s called shift left, moving work earlier in the process? Where are we targeting to get the most results when it comes to cost reduction in all of this?

Blackeby: It really talks about how people do transactions, what things are continually occurring that have a high amount of touch points to them. Some of that comes out through time.

One of the challenges we have when we take on a new customer is that you don’t have the excellent benefit of hindsight around how the organization works and what their common problems are. So, as we take on a new customer or a new contract, we have the ability to go and talk to their existing service provider or their in-house person. A lot of that comes out over time.

There are some standard things that we can recognize, because we have similar customers in similar marketplaces or industries and things that we would expect to get from the outset, and by looking at things like password reset tools and things like that are common and applicable across all types of clients.

Then, it’s a case of looking at your volumetrics over time, your repeatable activities, incidents and requests, identifying how can we drive the agility and improve the service levels that we're delivering, and at the same time, reduce cost.

Take a simple thing like software deployment to users' machines. Historically, that might have been a call to the service desk. They might have dispatched a desk-side engineer or used remote control to connect to a user’s device and install the software.

These days, more and more commonly, we can use software distribution, or automated software push tools, that don’t require human interaction at all. We can automatically deploy software to the user.

Zero-touch environment

That moves into that zero-touch type of environment. Through a portal request, we can manage the workflow around any approval activities. Then, once the request is fully approved, through the orchestration at the back end, we can interface with the software deployment solution to automate the delivery of that software to that endpoint device.

And we support many different types of devices now. We've seen more and more cases where not only are we talking about physical desktops or laptops, but also around how we manage mobile devices and tablet type devices as well, using mobility and mobile device management solutions.

Gardner: Let’s look at some of these solutions in practice. Sopra Steria has been doing this for some time and across a large marketplace. Do you have any examples that demonstrate when you can do this well that you get those benefits of self-help, common core data, more knowledgeable help desk, reduce costs, all at the same time?

Laird: One of the solutions we looked at in Poland, certainly around automation, was a really simple challenge that the operations team had as part of our Polish operation. Every morning, it was taking them in the region of one hour to produce a backup report for a particular customer, look at the backups that had failed, re-run backups as appropriate, and then, if backups had failed maybe consistently for a couple of days, escalate that out to the support team.

We automated the whole thing. It’s all automated using HPE Operations Orchestration. The whole process now takes one of the team about five minutes in the morning, and it’s really a case of checking the output from the system.

So, we've saved somewhere in the region of just under an hour every day for one person. It probably took two or three days to code the solution, but we're saving a significant amount of time every day. We're getting a much better quality report, and we're able to pass that information out to our second-line and third-line teams earlier in the day, which gives them much more time to fix things.

One of the things that we've looked at now is automating the re-run of backups overnight. Rather than letting them go to maybe two or three days, they're fixed overnight, and we run them within the backup window. It's improving quality to the customer and having a significant impact on savings to the operations team.
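The morning backup check described here (report the failures, re-run them, escalate anything that has failed for a couple of days running) can be illustrated with a short Python sketch. It is not the HPE Operations Orchestration flow itself, which the discussion does not detail. The job names, the history format, the two-day threshold, and the `triage_backups` function are all assumptions made for the example.

```python
def triage_backups(history, escalate_after=2):
    """Split failed backup jobs into a re-run list and an escalation list.

    history: {job_name: [bool, ...]}, oldest run first, True meaning success.
    escalate_after: assumed number of consecutive failures before escalating.
    Returns (rerun, escalate), both sorted by job name.
    """
    rerun, escalate = [], []
    for job, results in sorted(history.items()):
        # Count consecutive failures ending at the most recent run
        streak = 0
        for succeeded in reversed(results):
            if succeeded:
                break
            streak += 1
        if streak == 0:
            continue  # last run succeeded, nothing to do
        rerun.append(job)
        if streak >= escalate_after:
            escalate.append(job)
    return rerun, escalate


# Hypothetical three-day history for three backup jobs
history = {
    "db01": [True, True, False],
    "file02": [True, False, False],
    "mail03": [True, True, True],
}

rerun, escalate = triage_backups(history)
print("Re-run:", rerun)
print("Escalate:", escalate)
```

In this sample, both failed jobs are queued for a re-run, and only the job that has failed two nights in a row is passed to the support team.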

Gardner: You mentioned the use of the HPE tools. Are there any other HPE platforms or approaches that are helping you bring in this common data? We talked about the analysis earlier that also helps in this equation of doing more with less.

HPE partner

Laird: We're an HPE partner. We have been for over 10 years now, and we have quite a range of HPE tools across the portfolio, whether that’s from things like the Application Lifecycle Manager, through to HPE Service Manager.

We also have solutions like OMi doing things like event correlation, where we have events coming in from the monitoring solutions, whether that’s from HPE SiteScope or Operations Manager or from third party tools, like SCCM and some of the Nagios tools.

OMi is correlating those events and passing through to the service desk and the operations center the ones that actually need to be looked at. We're filtering out more than 50 percent, 60 percent of the alerts. It reduces our cost. We're filtering those alerts out at a much earlier point in the chain, and with that, we're only raising incidents for ones that actually need to be escalated up to the teams.

We're using tools and technology to keep costs down and reduce them as far as we can.
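To make the alert filtering idea concrete, here is a minimal Python sketch of one simple rule: drop low-severity events and collapse duplicates for the same host and check, so that only the remainder become incidents. This is an assumed rule for illustration, not how OMi correlates events. The `Event` fields, the severity scale, and the `events_to_incidents` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str    # monitoring tool that raised the event
    host: str
    check: str
    severity: int  # assumed scale: 1 = informational ... 5 = critical


def events_to_incidents(events, min_severity=3):
    """Keep one event per (host, check) at or above min_severity.

    When several tools report the same host and check, the most severe
    report is kept, in the position where that pair was first seen.
    """
    kept = {}
    for event in events:
        if event.severity < min_severity:
            continue  # filtered out before it reaches the service desk
        key = (event.host, event.check)
        if key not in kept or event.severity > kept[key].severity:
            kept[key] = event
    return list(kept.values())


# Hypothetical events from several monitoring sources
events = [
    Event("sitescope", "web01", "http", 4),
    Event("nagios", "web01", "http", 5),
    Event("nagios", "web01", "disk", 1),
    Event("sccm", "pc17", "patch", 2),
    Event("nagios", "db01", "disk", 3),
]

incidents = events_to_incidents(events)
print(f"{len(incidents)} of {len(events)} events raised as incidents")
```

Here two low-severity events are dropped and two reports of the same web server problem are merged, so fewer incidents reach the teams than events arrived.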

Gardner: So as we think about being able to future-proof the support services, and by that I mean being able to adapt to a millennial audience, more distribution points, more types of help desk and automation, and that single portal, we also need to be thinking about being backwards compatible. Some organizations do want more of that human touch, the interactions, and perhaps some of the government organizations are interested in that as well.

What is it about the future direction of your services at Sopra Steria, some of the tools and technologies that you are employing from HPE, that allows you to feel confident about being both future proof and backwards compatible for your support?

Blackeby: One of the challenges that are coming more to the forefront these days is probably the adoption of cloud services. It’s a disruptive influence on traditional IT and how IT is delivered.

It’s a challenge for us as service providers to adapt to these. You're talking about environments that can be built in minutes, bringing a whole new way of working, and very fluid environments with auto-scaling, where the number of resources that we are supporting and managing grows and shrinks dynamically over time. So that’s really had a big impact on how we deliver service.

We've recognized this and are looking at how we transform the service delivery. We're becoming more reliant on the data that supports the service. So it’s very much around how we manage what’s out there, with a heavy reliance on things like configuration management systems, and discovery of IT resources.

As Mark said, there are things like event correlation, looking at patterns, trends and events so that we can increase the agility and really manage much higher volumes of applications, of servers and of users with a smaller number of people or with the same number of people.

Gardner: It is very exciting; a lot is going on.

Tools and technologies

Blackeby: As a ratio, you might go from a scenario of a support person looking after an average of 40 servers to now having to manage 100-plus servers, but it’s only through the deployment of the tools and technologies that we can do that.

But at the same time, we still have a large legacy estate and legacy clients that we still need to support. So it’s really looking at how we engineer our processes so that, irrespective of whether we are talking about legacy physical server workloads, on-premises virtualized workloads, or things that might be spun up inside Amazon Web Services or Microsoft Azure public cloud environments, we provide that consistent level of service and service delivery, irrespective of where the service is located or in which format it is delivered back to the customer or users.

Gardner: When I speak to developer organizations and IT production organizations operations, they're seeing a compression and a large degree of collaboration between development and operations. Thus, the DevOps trend.

But when I listen to you, I'm also hearing a compression between operations and help desk in a way that benefits the entire IT process. The more automated and software-defined it becomes, and the more data that’s made available, the tighter that compression seems to get. Am I right in seeing help desk, support, and operations becoming more collaborative, more tightly aligned?

Laird: The whole concept of the operations team being hidden away in a back room and the service desk being the public face is changing. They're becoming much more tightly aligned. Things that the operations team is doing have an almost immediate impact on what the service desk is looking at, and the service desk needs to have access to really all the information the operations team has got.

When the user is on the phone and has a problem with a service, it’s good if the service desk can actually say, "Yes, we know there's a problem and we know what the problem is. We have an estimated fix time of 15 minutes." That gives the user the warm feeling that you're in control and you know what you're doing.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.


Tuesday, April 19, 2016

Panel explores how the IT4IT Reference Architecture acts as a digital business enabler

The next BriefingsDirect expert panel discussion examines the value and direction of The Open Group IT4IT initiative, a new reference architecture for managing IT as a business.

IT4IT was a hot topic at The Open Group San Francisco 2016 conference in January, and the enterprise architect and IT leader attendees examined it from a variety of different angles. This panel, conducted live at the event, elevates the IT4IT discussion to the level of enabling digital business value.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

And so to learn more about how IT4IT aids businesses, we are joined by Chris Davis, Professor of Information Systems at the University of South Florida and also Chairman of The Open Group IT4IT Forum; Lars Rossen, a Distinguished Technologist at Hewlett Packard Enterprise (HPE) and a chief architect for the IT4IT program; Ryan Schmierer, Business and Enterprise Architect for IT at Microsoft, and David Wright, Chief Strategy Officer at ServiceNow. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: I hear IT4IT described as a standard, a framework, a methodology, and a business-enabler. Chris, is it all of those, is it more, is this a whole greater than the sum of the parts? Help us understand the IT4IT potential.

Davis: It could be seen as all of those. I have been academically in this space for 20 to 25 years, and the thing that is different, the thing that adds potential to this is the value-chain orientation.

As well as being a really potent technical standard, we've abstracted this to levels that can be immediately appreciated in the C Suite. People like Kathleen come along, they see it and get it, and that provides some traction. That is a very positive thing, and will enable us to pick up speed as people like Toine invite real penetration down to the CMDB level and so on.

We have this multilayer view. Lars and I articulated it as levels of abstraction, but I think the integration of Mike Porter’s stuff really adds some perspective to this technical standard that maybe isn’t present or hasn’t been present in other frameworks and tools.

Gardner: And as we explain this up the value chain into the organization, do you expect that IT4IT is something you would take to a board setting environment and have them understand this concept of a value stream and consolidating around that?

Davis: Yeah, I do. Some of the observations that were made yesterday about the persistence of models like value chain, value stream, and so on, still make enormous sense to people at the CIO level. That enables the conversation to begin and also provides the ability to see whereabouts, how much of the standard, which particular value streams, where in the organization (the various parts and perspectives) fit.

As well as being very potent and very prescriptive, we have that conceptual agility that the standard provides. I find it exciting and quite refreshing. 

Organic development

Gardner: Lars, one thing that’s also interesting to me about IT4IT is that this was an organic development within IT organizations, for and by them. Tell us how, at HPE, you developed this, and why it was a good fit for The Open Group as a standardization process? 

Rossen: A couple of things made us kick this off, together with Shell initially and then a lot of members came over the years. For us in HPE, it was around consumption of our toolsets. That’s where I came from.

I was sitting on the portfolio group and I said, well, we're all drawing all of these diagrams around how it could fit together and we have these endless discussions with customers about whether this was right or this was wrong. I was completely disagreeing with all our friendly partners, as well as not so friendly competitors, about what was the right diagram.

Putting this into the open -- and we chose Open Group for that particular reason; they have shown in the past that they can create these kinds of things -- allowed us to have that common framework for defining the To-Be architecture for our customers. That simply made it much easier for us to sell our product suite. So it made a lot of business value for us.

And it also made it much easier for our consultancy service. We didn’t have to argue about the To-Be architecture; it was a given. Then, we can talk about how to actually implement it, which is much more interesting. 

Gardner: And while we are speaking about HPE and your experience there, do you have any tangible metrics of success as to how this improved? You went through a large business separation of IT departments; that must have been a difficult process. Was there anything that the IT4IT approach brought to that particular activity that you can point to as a business driver or business benefit?

Rossen: I can. A very large organization is compartmentalized in many different ways, and you could say, well, how do all of these units interchange and work with each other, because it goes both ways; it’s not only the split, but it’s also all the acquisitions we've been doing over the years.

And then we have the framework that we can use and plot things in to, and we have a standardized toolset we can use and reuse over and over again.

Before we had IT4IT, we counted how many integrations we had between our various IT management products, and it ran to about 500. With IT4IT, we can drill down and see that there are only about 50 that are really interesting. Then, we can double down on those. We can now measure how much those are being consumed moving forward, both internally within our service practice and with our customer base.

Gardner: Ryan, at Microsoft, I’m wondering about Bimodal IT and Shadow IT. Because you perhaps have a more concentrated view on IT and you can control your organization, you don’t have that problem -- or maybe you do. Is there any degree of Bimodal IT or Shadow IT within your IT organization at Microsoft, have you addressed that, and has IT4IT been of use in that direction?

Consistency and repeatability

Schmierer: First, starting with the idea of Bimodal IT, we go back to some of the research and the thoughts coming from Gartner over the last couple of years about different parts of IT needing to work at different paces. Some need to be more agile and work faster; others need to be the foundational stalwarts of the organization, providing that consistency and that repeatability that we need.

At Microsoft, we tend to look at it a little bit differently. When you think about agile versus waterfall, it’s not a matter of one versus the other. Should we do one or the other? There's a place for both of these. They are tools within our toolbox. Within IT, there are places where we want to move in a more agile way -- where we want to move faster. There are also certain activities where waterfall is still an excellent methodology to drive the consistency and predictability that we need.

A good example of that comes with large releases. We may develop changes or features in a very agile way, but as we move towards making large changes to the business that impact large business functions, we need to roll those changes out in a very controlled, scripted way. So, we take a little bit different look at Bimodal than some companies do.

Your other question was on Shadow IT. One of the things that we have challenged a lot over the last year or so is this concept of the role of the IT organization relative to the rest of the enterprise. As we think about that, we're not thinking about IT as a service provider to the enterprise, but as a supporting function to the enterprise.

What does that mean? It means Shadow IT doesn’t exist. It just happens to be someone else within the organization providing that function. And so it becomes less of a question of controlling and preventing Shadow IT and more of embracing that outside-in approach and being able to assimilate those changes and coordinate them in a more structured way to manage things like risk and security.

Gardner: Well, we have heard that there’s a silo-bridging benefit to IT4IT with either Bimodal or Shadow IT. Can you relay a way in which IT4IT helped you bridge silos and consolidate, culturally and otherwise, your IT efforts?

Schmierer: Absolutely. Very similar to some of the experiences that Lars explained at HPE, at Microsoft we've had a number of different product groups focusing on different products and solutions and service suites over the last few years.

As we've moved to more of a One Microsoft approach, we're looking at how to bring the organization and the enterprise together in a cohesive way.

IT plays a role in enabling that as a supportive function to the company and the IT4IT standard has been a great tool for us to have a common talking point, a common framework, to bridge those discussions about not only what we do internally within IT, but how the things that we do internally relate to the products and services that we sell out into the marketplace as well. Having that common framework, that common taxonomy, is not just about talking with customers; it’s about talking internally and getting the entire enterprise aligned.

Business service management

Gardner: Dave, as organizations are working at different paces toward being digital businesses, they might look to their IT organizations for leadership. We might, as a business, want to behave more like our IT organizations.

At ServiceNow I have heard you describe IT service management (ITSM) as one step toward business service management (BSM), rather than just ITSM. How do you see the evolution from ITSM to business service management and a digital business benefit? And how do you foresee IT4IT aiding and accelerating that?

Wright: The interesting thing about IT4IT is the fact that it conceptualizes the whole four stages that people go through on the journey. I suppose you could say the gift that ITIL gave IT was to give it an operational framework to work with.

Most other parts of the business haven’t got an operational framework. If you want to request something off most parts of the business, you will send them an email. If you want something off legal, you want something off marketing, send them an email. They haven’t got a system where they can request something.

If we take some of the processes described in IT4IT and publish that in a business-service catalog, you effectively allow everyone to have a single system of engagement. They might have their own back-end systems, they might have their own human capital management system, their own enterprise resource planning (ERP) system, but how do you engage and link all those companies together?

The other thing that IT has learned over a number of different implementations is how important the experience becomes, because if you can generate an experience where people want to use it, that’s what’s going to drive adoption of it as a function.

Let’s take this room as a whole. If we all sat together and built Uber, it would be crap. It would be really good for the taxi drivers, but it would be terrible for the people who actually wanted to request the service, and that’s because we tend to build everything from the inside out.

The fact we have now got a way to elevate that position and look at it from above, and understand all those components, and be able to track all those components from start to finish, and give people visibility in where you are in that process, that’s not just a benefit to IT; that’s a benefit to anyone who provides a service.

Gardner: As we also explore ways that we can evangelize and advocate for this in our organizations, it’s helpful to have places where it works first, the crawl-walk-run approach. Chris, can you help us understand areas where applying IT4IT early and often as a beachhead works?

Need and competence

Davis: Where you have the need and the competence. Back to my earlier point about how the standard can be envisioned, and the point that David just made, what we offer in IT4IT is something that’s not only prescriptive and ready to hand, but it’s also ready to mind, so people get it very quickly.

The quick wins are the important ones, not necessarily the low-hanging fruit, but the parts of the business where opportunities like the ones that David just suggested -- if we were to try to do something like Uber -- that would be too much.

If somewhere in an organization like Microsoft -- where Kathleen is in charge -- there is a group that can gain rapid traction, that would be most effective. Then there is the telling of the early success stories; the work by Toine, which shows how the architecture was useful at Rabobank from the early stages of its development, adds momentum.

Gardner: Lars, same question, where did you see this as getting traction best? Maybe it’s new efforts, greenfield application development, mobile-first type development, or maybe it’s some other area. Where might you point to as a great starting point to build this into an organization?

Rossen: It’s pretty simple actually. We've done more than 50, maybe 100, engagements now using the IT4IT model with our customer base. Very often, it's the central IT. It comes out of saying, "We're too inconsistent." It’s the automation story that comes first, and then typically you end up in a discussion around Detect to Correct. It’s a familiar area and people understand the various components that are involved in that.

But back to what you mentioned before: it's the layered approach that allows us to go in with a single slide. We can put it up in large format on the wall, and you can start to put Post-It notes on it. You don’t need to understand architecture. That implies that we can have decision makers coming in, and we break down a lot of silos in the operations area, just with Detect to Correct. That’s where 99 percent of our engagements have been starting.

Then, the Request to Fulfill with the experience is where people want to go. That’s the Holy Grail, or one of the Holy Grails. There are actually two Holy Grails, and that’s just one of them. The other one is to be able to do Strategy to Portfolio, and no longer just say, "I have this application and I need to move it to the next version or whatever." It's understanding what are the services, not the applications, but the services I'm delivering to the business.

It isn't until you have the value streams more in order that you can start building up that service backbone that is so crucial to IT4IT.

Gardner: Is there an element of educating the consumer of IT in an enterprise to anticipate services differently? Ryan, when you mentioned earlier the Request to Fulfill value stream, I can understand how that makes a great deal of sense from IT out to the organization. But do people have to make an adjustment in order to receive things as a value stream, to consume them, to think of asking things through the lens of your being a broker organization? What must we do to educate and help the consumer of IT understand that it might be a different ballgame? 

Reducing friction

Schmierer: We need to start with the goal of reducing friction within the organization. Consumers of IT are operating in a changing landscape. I talked earlier about the network effect and how the environment is constantly evolving, constantly changing. As it does, the needs and desires of the people consuming technology and information will continue to change.

Request to Fulfill helps provide the mechanics for a corporate IT organization to become that broker of services. But if we look at that from a consumption perspective (from the users of services) it's all about enabling them to change their mind, change their needs, change their business processes faster, and removing the friction that exists within the process of provisioning today.

If something is a new technology that they want to bring into their organization, because they see potential in it, how do we get that in there faster? The whole Request to Fulfill value stream is about accelerating the time to value for new technology coming into the organization and reducing the friction of the request process.
When you look at how people consume things now, there is definitely a trend going on, where people are becoming more service-aware.

Gardner: Dave, anything to offer on that same side, the consumption side, rather than the delivery perspective? 

Wright: We're getting this breakdown now, where people are saying that it’s not about the CIs [configuration items]; it’s about the service that those CIs support. It's about how you can have not a CI-centric CMDB [configuration management database], but a service-centric CMDB, and how people can map those relationships. The whole consumption side of it is flipping now, as people’s expectations come in line.

The other thing I found specifically with the IT4IT concept is that people start to put together a kind of business logic very quickly around things. So they'll look at the whole process. I had someone say to me a few weeks ago, "If I understand the cost elements of each of those, I truly know what that service costs. Could I move and actually be able to manage my system based on what it’s costing the business, not the fact it’s a server on problem or it’s a red light? It’s costing me x-amount of dollars a minute for this to be down and I’ve spent this much money actually building it and getting it out." But you have to have all those elements tied in, all the way from the portfolio element right the way through to the run element.
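[Editor's illustration: the idea of managing by what an outage costs the business, rather than by which server shows a red light, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not anything described by the panel: the `Service` class, the service names, the CI names, and every dollar figure are hypothetical.]

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A business-facing service and the configuration items (CIs) behind it."""
    name: str
    downtime_cost_per_minute: float  # hypothetical cost to the business, in dollars
    build_cost: float                # hypothetical spend to build and release it
    cis: set = field(default_factory=set)


def services_affected_by(ci, services):
    """Service-centric lookup: which services depend on this CI?"""
    return [s for s in services if ci in s.cis]


def outage_cost(ci, minutes, services):
    """Business cost of a CI being down, summed over every service it supports."""
    affected = services_affected_by(ci, services)
    return sum(s.downtime_cost_per_minute * minutes for s in affected)


# Hypothetical data: one shared database server supports two services.
services = [
    Service("online-ordering", 500.0, 250_000.0, {"web-01", "db-01"}),
    Service("internal-reporting", 20.0, 40_000.0, {"db-01"}),
]

# A 30-minute outage of db-01 hits both services: (500 + 20) * 30
print(outage_cost("db-01", 30, services))  # 15600.0
```

[The point of the sketch is the direction of the lookup: the alert arrives as a CI, but the number that gets reported is the cost to the services that CI supports.]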

Gardner: So it really seems as if it also offers a value of rationalization, prioritization, but in business terms rather than IT terms. Is that correct?

Rossen: Correct.

Gardner: As I try to factor where this will work best, early, and often, not only would we look at specific parts of IT within an organization, but we might look at specific companies as a culture, as a type of company, but also vertical industries. I'll go back to you, Dave, because ServiceNow has a fairly horizontal view of many different companies. Are there particular companies that you think would be, as a culture or a type of company, better suited for adoption of IT4IT, or are there vertical industries where this makes sense first?

Holistic process

Wright: The people I have seen who would be most disciplined about wanting to be able to look at things holistically right across the whole gamut have been the pharmaceutical companies. Pharmaceutical companies have come along and they're obviously very regimented in the same way finances are. They're the people who seem to be the early adopters of looking at this holistic process.

If I look at customers, the people who are adopting it first, at a low level, tend to be the financial institutions, but after that, the conversation tends to go through pharmaceuticals. I don’t think any one business has really nailed it, but this is a challenge of every company. Every company has an IT division, and they run IT, but their business isn’t to run IT; their business is inherently to provide financial services or develop drugs.

Looking at what processes people do to drive their core business, the people who are very regimented and disciplined tend to be the people who are saying there has to be a way we can gain more visibility into what we're doing from an IT perspective.

Gardner: Ryan, thoughts on the similar question about where this is applicable either as a type of company or a vertical industry?

Schmierer: I'd look at who is most threatened by the changes going on in the world today. Where are cost pressures to drive efficiencies most prevalent because they're going to have the most motivation to change quickly? I'd also look at companies that were early adopters of IT who, through their early adoption, have ended up with a lot of legacy debt that they're trying to manage and they now need to rationalize that in order to get their total IT cost profile down.

In terms of specific verticals, there are pockets within each vertical or each industry where there are opportunities. I'd look at it from a scale perspective. If you go back to the scale model that I shared this morning about the different sizes of organizations, a lot of small organizations don’t need this, and a lot of start-ups can build it into their DNA. Some of the companies that have more legacy (more mature enterprises) have more of a fundamental need for this type of structure and are going to be able to reap some benefits more quickly or with only a few pieces of it.

It’s a scale question and it’s a risk question. Who is under the most pressure to improve their cost performance?

Gardner: So if I do IT4IT correctly, how might I know a few months later -- six months, a quarter or two down the road -- that I can attribute improvement to that particular activity?

Rossen: There are a couple of different things that I believe can be done. Within IT4IT, we're actually trying to make more concrete key performance indicator (KPI) assessments of what would make sense in terms of measuring it. More abstractly, are you really embracing the multi-supplier options that reside in IT4IT? That’s one of the reasons we kicked it off. Shell has some good examples of what it costs to integrate a supplier. And that’s typically a tremendously high cost, because you have to design how to exchange an incident every time, over and over again. With IT4IT, it becomes much more reusable.

That's a place where you see that the cost of working with your partner should go down, and you can become a service broker. That's a particular area where we would see benefits very quickly. But it's also coming back to the original question or questions. The typical companies that want to pick it up are the companies that really are having that pain that it's not a centralized IT any longer. It's lines of business IT, it's central, it’s suppliers, and you yourself are supplying to others. If you have that problem, then IT4IT is really good for you and you can quickly see benefits.

Gardner: Chris, thoughts on this notion of how do I attribute benefits in my IT organization at the business level to IT4IT?

Holy Grail for academics

Davis: This has been another Holy Grail for academics. We go all the way back to the 1970s constructive cost model and things like that. Lars hit the nail on the head. The other thing is what Cathleen said this morning. It will be less easily measured, more easily sensed, there will be changes in mindsets and so on. So it's very difficult to articulate and measure, but we're working on ways to make it much more tractable.

Wright: I've been implementing ITSM systems since the mid-90s, but we still do one thing in the same way that’s truly weird, and you are kind of hitting on this question. Can we define the outcomes?

Whenever anyone undertakes a project like this, they decide they're going to completely redefine the way that IT manages itself as a business. You probably should design the outcomes in the metrics that you want before you put the system in. Almost everyone I can ever remember implements a system and goes "Cool, let's write some reports." And then you take the reports you can get and say, "We'd like a report that shows this," and the consultant who put it in says, "Oh, you can't get that."

If only you stepped back and said, "Let's think what we want and build a system that delivers that data," it would provide a lot more value to the business.

Gardner: Well, I've had a chance to ask lots of questions. Let's go now to our architects, the people in the trenches. Dave Lounsbury, CTO at The Open Group, help us out with some practical approaches to implementing IT4IT.

Lounsbury: First off, I want to mention that it's really gratifying to see that new participants like Ryan and David come in and adopt this technology, and give us their insights. So thank you very much for participating, as well as our legacy folks. IT always has a legacy, right?

Each speaker mentioned the need for better data management as part of this process, and so this is a governance issue. And who in these evolving organizations should be responsible for data governance; is it the business, is it IT, is it a third entity that should be doing that? Any thoughts on that?

Schmierer: Let me take that one. We need to start by rethinking the idea of data governance. We're trying to govern the data because we're trying to create too much data. We're spending far too much time adding overhead tasks to people who need to do their day jobs, people who are trying to execute on the value stream, in order to generate the data needed for decision-making. When we don't get the data that we're looking for to drive decisions, we apply governance and we apply more overhead on top of it.

As we think about IT4IT and the fact that we have a value stream and a separate set of supporting functions, it gives us an opportunity to ask "How can we reduce the amount of data required to be generated within the value stream itself?"

The extra data points that someone collects as a part of a request or the status updates that are created as a part of a project or an agile release, how do we get to the point that we can derive that from the operational systems themselves and let people just do their jobs? If we're not asking people to manually create data, there's no need to create governance processes for it. That's why IT4IT has a lot of value here. We're going to get greater [quality] data by making people’s jobs easier.

Service backbone

Rossen: I'd like to answer that, very much in line with what you are saying. One of the purposes of the service backbone is that everything relates back to that. If you really follow it, everything would be available. You don’t need to do anything further in terms of data skews, any log message, any incident, or any report or set of data from the development. It can all be related back to the conceptual service and then you can have fun with creating the reports you want to do, but you don’t add any overhead to the individuals in the value chain.

Lounsbury: Can you elaborate on how best to address the people and mindset shifts you need to make as you transition to this kind of a model?

Schmierer: From a Microsoft perspective, it starts with valuing the individuals, the contributions they’ve made to the organization, and the opportunity for them to be a part of the future where the company is going. We need to make sure that we talk with individuals and reinforce that they are valuable and appreciated.

Change is always difficult. When you talk about changing skill sets, asking people to learn new skills, adopting new ways of working, it’s uncomfortable. We're moving people out of their comfort zone and asking them to do something new. But I don’t think this one is difficult at all; it’s basic. Appreciate your people and tell them thank you.

Lounsbury: So given a complex service request demand by a business user, how will IT4IT assist me in designing a service with say, five different vendors?

Rossen: Well, the first thing is that it sits within S2P [Strategy to Portfolio], which is really where such a thing comes in; it’s a new service that needs to be introduced. We now have the framework for working on the conceptual service that will make up whatever is requested. But everybody in the room here should probably appreciate the fact that we're not throwing away all the good stuff that goes around TOGAF and architecture in general for the business. If it's a very complex thing, you need to have an enterprise architecture worked out for that.

But it feeds into the pipeline of that, executing it. You can split it up into projects. You can still track them as being part of the bigger thing, but it does lead to that. A very important thing in IT4IT and in the industry in general is that you have to design small things that have dependencies on each other, so one service depends on another service and so on. It’s not just an app on top of the infrastructure or platform infrastructure. It becomes much more complex with respect to that, but it’s the way the industry goes.

Lounsbury: What are the most important steps a small-to-medium sized enterprise (SME) could take to move to this service broker model that’s been advocated in IT4IT?

Wright: If it’s an SME, typically they're going to be using multiple systems coupled together. There won’t be any real formality around it. But the first thing for them is to get a common place where they can go and request these services. So that catalog is going to be structured in a way that’s easy to use.

I have a funny story. We were looking at how we designed UI/UX for our customers to interact with software, and we hired a group of people who were 23 or 24 years old to build the UI. We were showing one of them a standard service-management type of process that you go through, and he said it was very complex, and I said it was. He asked how people learn to use it. I said, "What typically happens is you roll the system out and then you send all your users on a training course." He was horrified. He said, "You're allowed to write software that’s so bad, you have to train people how to use it?" I said, "Yes, I’ve made a good living for 25 years doing that."

Service catalog

To be able to get a catalog, especially in a smaller business where you’ve perhaps got a younger workforce, more rapid turnover, or a potential to expand, it's about developing systems where you don’t have to train people how to use them, where it’s very intuitive.

I go onto this, I request something, and then suddenly something pops up. I've got a task I need to do. It’s not like going in and sorting through records, wondering what it all means and why I've got 300 fields on the form and a couple of tabs to go through. It’s making work as simple as possible; that’s what’s going to drive the adoption of this.

But at a high level, what really drives the adoption is the visibility of the end result that you get from this, having that clarity of information. I imagine everyone in this room is used to seeing incidents by category, so you can see a percentage of where you're spending your time: on hardware issues, on software upgrades. No other part of the business, especially in this consolidated business model, can see that.

If you go to human resources and ask for a breakdown of percentages, how much you spend on each different type of task, you'll get some tribal knowledge ballpark figures. Same for legal, same for finance. Everyone who has been there for a while knows it, but there are no metrics. If you can provide those metrics at a top level, that just drives it further and further into the organization.
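[Editor's illustration: the by-category breakdown that Wright says IT takes for granted is a simple calculation once requests are recorded in one place. The Python sketch below is a minimal example under assumptions; the record layout and the category names are hypothetical, and the same function would apply to HR, legal, or finance requests if they were captured in a shared catalog.]

```python
from collections import Counter


def breakdown_by_category(records):
    """Return the percentage of records in each category, largest first."""
    counts = Counter(record["category"] for record in records)
    total = sum(counts.values())
    return {
        category: round(100 * count / total, 1)
        for category, count in counts.most_common()
    }


# Hypothetical incident records; only the "category" field is used.
incidents = [
    {"id": 1, "category": "hardware"},
    {"id": 2, "category": "software upgrade"},
    {"id": 3, "category": "hardware"},
    {"id": 4, "category": "access request"},
]

print(breakdown_by_category(incidents))
```

[With the sample data above, half of the incidents are hardware issues and a quarter each fall under software upgrades and access requests.]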

Lounsbury: One more, okay, so which one to choose? And of course people will be able to interact with these folks at the breaks and at our evening reception if I don’t get to your question. So how does IT4IT help in a situation where a company is trying to eliminate a data center and move to the public cloud? As a broker of services who owns the system integration and process services, how does that flow in the IT4IT model?

Rossen: I'll take the first crack. Again it’s a classical scenario around saying where can you rationalize your portfolio? So do I outsource it, do I move the infrastructure to the cloud, do I still maintain the actual application, etc. You can’t make these decisions without having assistance of insight around what you're actually running, how it’s being consumed, what business value does it bring, which goes back to strategy to portfolio, what conceptual services do you have, how are they currently implemented, how are they running, what is the quality, how many consumers are there on it?

If you have that data, it’s actually fairly easy to make these decisions, but typically, in most organizations, this exercise requires 60 spreadsheets and half a calendar year with 60 people trying to figure that out, and in the meantime it’s not really correct, right? And that’s again because you don’t have a service backbone, you don’t really have connected information, so implementing IT4IT will allow you to make these decisions much easier.

Schmierer: Let me add onto that a little bit. As we talk about the question, "If I want to move something into the cloud, how can IT4IT help me?" we have to remember that this is an area where the industry is evolving. We haven’t got it all figured out yet. IT4IT is a great starting point for having the conversation with those folks helping you in system integration and your cloud service provider to step through the questions about how things need to change and what needs to be done differently: "What are the things that the consuming IT organization no longer needs to do because the cloud service provider is doing them?"

For now, start by using IT4IT as a checklist, use it as a starting point for brokering the conversation to ask if we've thought about everything. Over time, this will get repeatable -- it will become a common pattern, and we'll just know and won’t need to have that conversation. But for now, IT4IT is a great reference model to help us have that conversation.

Gardner: Would it not make sense for you as a consumer of cloud services to wonder whether your cloud provider is using IT4IT and wouldn’t that give you a common denominator by which to pursue some of these benefits?

Tool certification

Rossen: That would certainly be in the future, when we come to tool certification within The Open Group. A cloud provider would also need to be certified, saying, "Well, if you find my service, I can actually provide you with an incident interface according to the standards, so it's easy for you to hand over and go back and forth if there are issues," just to take one example, right?

Gardner: Any more to offer from anyone?

Schmierer: One thing I can offer is this: since the IT4IT standard launched in Edinburgh three months ago, I can’t tell you how many emails I've received from our account teams and from customers who are asking us this exact question.

Customers are asking the question about IT4IT, how it plays into the service provider landscape and how they can use it to drive the conversation. So the word is getting out, and the best thing you can do as a consumer of this stuff, as you go work with different service providers is to ask the questions, and ask their opinion and their thoughts on it.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: The Open Group.

Tuesday, April 12, 2016

How Etsy uses big data for ecommerce to put buyers and sellers in the best light

The next BriefingsDirect big data case study discussion explores how Etsy, a global e-commerce site focused on handmade and vintage items, uses data science to improve buyers' and sellers’ discovery and shopping experiences.

We'll learn how mining big data at speed and volume helps Etsy define and distribute top trends, and allows those with specific interests to find items that will best appeal to them.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

To learn more about leveraging big data in the e-commerce space, please join Chris Bohn aka “CB,” a Senior Data Engineer at Etsy, based in Brooklyn, New York. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us about Etsy for those that aren’t familiar with it. I've heard it described as being like going through your grandmother's basement. Is that fair?

CB: Well, I hope it’s not as musty and dusty as my grandmother’s basement. The best way to describe it is that Etsy is a marketplace. We create a marketplace for sellers of handcrafted goods and the people who want to buy those goods.

We've been around for 10 years. We're the leader in this space and we went public in 2015. Just some quick little metrics. The total value of the merchandise sold on Etsy in 2014 was about $1.93 billion. We have about 1.5 million sellers and about 22 million buyers.

Gardner: That's an awful lot of stuff that’s being moved around. What does the big data and analytics role bring to the table?

CB: It’s all about understanding more about our customers, both buyers and sellers. We want to know more about them and make the buying experience easier for them. We want them to be able to find products more easily. Too much choice sometimes is no choice. You want to get them to the product they want to buy as quickly as possible.

We also want to know how people are different in their shopping habits across the geography of the world. There are some people in different countries that transact differently than we do here in the States, and big data lets us get some insight into that.

Gardner: Is this insight derived primarily from what they do via their clickstreams, what they're doing online? Or are there other ways that you can determine insights that you can then share among yourselves and also back to your users?

Data architecture

CB: I'll describe our data architecture a little bit. When Etsy started out, we had a monolithic Postgres database and we threw everything in there. We had listings, users, sellers, buyers, conversations, and forums. It was all in there, but we outgrew that really quickly, and so the solution to that was to shard horizontally.
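As a rough illustration of what horizontal sharding means in practice, here is a minimal sketch that routes each user's rows to one of several databases by hashing the user ID. It is not Etsy's actual scheme: the shard count, the choice of hash, and the function names are all assumptions made for the example.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical; the interview mentions "many hundreds" of servers


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def group_by_shard(user_ids, num_shards: int = NUM_SHARDS):
    """Group IDs by the shard that would own their rows."""
    groups = {}
    for uid in user_ids:
        groups.setdefault(shard_for(uid, num_shards), []).append(uid)
    return groups
```

Because every lookup for a given user lands on the same shard, single-record reads stay fast, but a query that aggregates across all users has to visit every shard, which is the analytics problem described next.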

Now we have many hundreds of horizontally sharded MySQL servers. Then we decided that we needed to do some analytics on this stuff. So we scratched our heads. This was about five years ago. We said, "Let’s just set up a Postgres server and we'll copy all the data from these shards into the Postgres server that we call the BI server." And we got that done.

Then, we kind of scratched our heads and said, "Wait a minute. We just came full circle. We started with a monolithic database, then we went sharded, and now all the data is back monolithic."

It didn't perform well, because it's hard to get the volume of big data into that database. A relational database like Postgres just isn’t designed to do analytic-type queries. Those are big aggregations, and Postgres, even though it is a great relational database, is really tailored for single-record lookup.

So we decided to get something else going on. About three-and-a-half years ago, we set about searching for the replacement to our monolithic business-intelligence (BI) database and looked at what the landscape was. There were a number of very worthy products out there, but we eventually settled on HPE Vertica for a number of reasons.

One of those is that it derives, in large part, from Postgres. Postgres has a Berkeley license. So companies could take it private. They can take that code and they don’t have to republish it out to the community, unlike other types of open source copyright agreements.

So we found out that the parser was right out of Postgres, and all the date handling and typecasting stuff that is usually different from database to database was exactly the same between Vertica and Postgres. Also, data ingestion via the COPY command, which is the best way to bulk-load data, is exactly the same in both, and it’s the same format.

We said, "This looks good, because we can get the data in quickly, and queries will probably not have to be edited much." So that's where we went. We experimented with it and we found exactly that. Queries would run unchanged, except they ran a lot faster and we were able to get the data in easily.

We built some data replication tools to get data from the shards, and also from some legacy Postgres databases that we had lying around for billing, and got all that data into HPE Vertica.

Then, we built some tools that allowed our analysts to bring over custom tables they had created on that old BI machine. We were able to get up to speed really quickly with Vertica, and boom, we had an analytics database that we were able to hit the ground running with.

Gardner: And is the challenge for you about the variety of that data? Is it about the velocity that you need to move it in and out? Is it about simply volume that you just have so much of it, or a little of some of those?

All of the above

CB: It’s really all of those problems. Velocity-wise, we want our replication system to be eventually consistent, and we want it to be as near real-time as possible. There is a challenge in that, because you really start to get into micro-batching data in.
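To make the micro-batching idea concrete, here is a minimal sketch that cuts a stream of changed rows into small batches and hands each one to a loader. The batch size and the `load_batch` callable are hypothetical placeholders, not part of Etsy's replication tools.

```python
from itertools import islice


def micro_batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def replicate(rows, load_batch, batch_size=500):
    """Send rows to the warehouse loader one small batch at a time."""
    loaded = 0
    for batch in micro_batches(rows, batch_size):
        load_batch(batch)  # hypothetical loader, e.g. a bulk-load call
        loaded += len(batch)
    return loaded
```

Smaller batches bring the copy closer to real time at the cost of more load operations; larger batches do the opposite.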

This is where we ended up having to pay off some technical debt, because years ago, disk storage was fairly pricey, and databases were designed to minimize storage. Practices grew up around that fact. So data would get deleted and updated. That's the policy that the early originators of Etsy followed when they designed the first database for it.
What we eventually ended up with is lossy data. If someone changes the description or the tags that are associated with a listing, the old ones go away. They are lost forever. And that's too bad, because if we had kept those, we could do analytics on a product that wasn’t selling for a long time and then all of a sudden started selling. What changed? We would love to do analytics on that, but we can't because of the loss of data. That's one thing that we learned in this whole process.
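One common way to avoid this kind of loss is to append a new version on every change instead of updating in place. The sketch below illustrates that idea only; the class and field names are hypothetical and do not describe Etsy's schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ListingHistory:
    """Append-only history of a listing's description and tags."""
    listing_id: int
    versions: list = field(default_factory=list)

    def update(self, description, tags, changed_at=None):
        # Append a new version instead of overwriting the old one.
        # Versions are assumed to be added in chronological order.
        changed_at = changed_at or datetime.now(timezone.utc)
        self.versions.append(
            {"description": description, "tags": tuple(tags), "changed_at": changed_at}
        )

    def current(self):
        """Return the latest version, or None if nothing was recorded."""
        return self.versions[-1] if self.versions else None

    def as_of(self, moment):
        """Return the version that was live at the given moment, if any."""
        live = [v for v in self.versions if v["changed_at"] <= moment]
        return live[-1] if live else None
```

With the old versions retained, the question "what changed before this listing started selling?" becomes a lookup with `as_of` on either side of the date in question.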

But getting back to your question here about velocity and then also the volume of data, we have a lot of data from our production databases. We need to get it all into Vertica. We also have a lot of clickstream data. Etsy is a top 50 website, I believe, for traffic, and that generates a lot of clicks and that all gets put into Vertica.

We run big batch jobs every night to load that. It's important that we have that, because one of the biggest things that our analysts like to do is correlate clickstream data with our production data. Clickstream data doesn't have a lot of information about the user who is doing those clicks. It’s just information about their path through the site at that time.

To really get a value-add on that, you want to be able to join on your user details tables, so that you can know where this person lives, how old they are, or their buying history in the past. You need to be able to join those, too, and we do that in HPE Vertica.
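The kind of join described here can be sketched with a tiny in-memory example. SQLite stands in for the analytics database purely so the example is self-contained, and the table names, columns, and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE clicks (user_id INTEGER, page TEXT);
    INSERT INTO users VALUES (1, 'US'), (2, 'FR'), (3, 'US');
    INSERT INTO clicks VALUES
        (1, '/listing/10'), (1, '/cart'), (2, '/listing/10'), (3, '/listing/11');
    """
)

# Join the anonymous click path to user attributes, then aggregate.
clicks_by_country = conn.execute(
    """
    SELECT u.country, COUNT(*) AS clicks
    FROM clicks AS c
    JOIN users AS u ON u.user_id = c.user_id
    GROUP BY u.country
    ORDER BY u.country
    """
).fetchall()
```

The clickstream rows carry only a user ID and a page; the country comes entirely from the joined production table, which is the value-add being described.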

Gardner: CB, give us a sense about the paybacks, when you do this well, when you've architected, and when you've paid your technical debts, as you put it. How are your analysts able to leverage this in order to make your business better and make the experience of your users better?

CB: When we first installed Vertica, it was just a small group of analysts that were using it. Our analytics program was fairly new, but it just exploded. Everybody started to jump in on it, because all of a sudden, there was a database with which you could write good SQL, with a rich SQL engine, and get fantastic results quickly.

The results weren’t that different from what we were getting in the past, but they were just coming to us so fast, the cycle of getting information was greatly shortened. Getting result sets was so much better that it was like a whole different world. It’s like the Pony Express versus email. That’s the kind of difference it was. So everybody started jumping in on it.

More dashboards

Engineers who were adding new facets of the product wanted to have dashboards, more or less real time, so they could monitor what the thing was doing. For example, we added postage to Etsy, so that our sellers can have preprinted labels. We'd like to monitor that in real time to see how it's going. Is it going well or what?

That was something that took a long time to analyze before we got into big-data analytics. All of a sudden, we had Vertica and we could do that for them, and that pattern has repeated with other groups in the company.

We're doing different aspects of the site. All of a sudden, you have your marketing people, your finance people, saying, "Wow, I can run these financial reports that used to take days in literally seconds." There was a lot of demand. Etsy has about 750 employees and we have way more than 200 Vertica accounts. That shows you how popular it is.
One anecdotal story. I've been wanting to update Vertica for the past couple of months. The woman who runs our analytics team said, "Don't you dare. I have to run Q2 numbers. Everybody is working on this stuff. You have to wait until this certain week to be able to do that." It’s not just HPE Vertica, but big data is now relied on for so many things in the company.

Gardner: So the technology led to the culture. Many times we think it's the other way around, but having that ability to do those easy SQL queries and get information opened up people's imagination, but it sounds like it has gone beyond that. You have a data-driven company now.

CB: That's an astute observation. You're right. This is technology that has driven the culture. It's really changed the way people do their job at Etsy. And I hear that elsewhere also, just talking to other companies and stuff. It really has been impactful.

Gardner: Just for the sake of those of our readers who are on the operations side, how do you support your data infrastructure? Are you thinking about cloud? Are you on-prem? Are you split between different data centers? How does that work?

CB: I have some interesting data points there for you. Five-plus years ago, we started doing Hadoop stuff, and we started out spinning up Hadoop in Amazon Web Services (AWS).

We would run nightly jobs. We collected all of the search terms that were used and buying patterns and we fed these into MapReduce jobs. The output from that then went into MATLAB, and we would get a set of rules out of that, that then would drive our search engine, basically improving search.
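For readers unfamiliar with the MapReduce pattern, the following single-process sketch counts search terms in three phases. It only illustrates the programming model; it is not Hadoop code, and the function names and sample queries are assumptions.

```python
from collections import defaultdict


def map_phase(search_log):
    """Emit (term, 1) for every term in every logged query."""
    for query in search_log:
        for term in query.lower().split():
            yield term, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts for each term."""
    return {term: sum(values) for term, values in grouped.items()}


def term_counts(search_log):
    """Run the three phases end to end on a list of query strings."""
    return reduce_phase(shuffle(map_phase(search_log)))
```

In a real cluster the map and reduce functions run in parallel across many nodes, and the shuffle moves data between them over the network.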

Commodity hardware

We did that for a while and then realized we were spending a lot of money in AWS. It was many thousands of dollars a month. We said, "Wait a minute. This is crazy. We could actually buy our own servers. This is commodity hardware that this can run on, and we can run this in our own data center. We will get the data in faster, because there are bigger pipes." So that's what we did.

We created what we call Etsydoop, which has got 200+ nodes and we actually save a lot of money doing it that way. That's how we got into it.

We really have a bifurcated data analytics, big-data system. On the one hand, we have Vertica for doing ad hoc queries, because the analysts and the people out there understand SQL and they demand it. But for batch jobs, Hadoop rocks, and it's really, really good for that.

But the tradeoff is that those are hard jobs to write. Even a good engineer is not going to get it right every time, and for most analysts, it's probably a little bit beyond their reach to get down, roll up their sleeves, and get into actual coding and that kind of stuff.

But they're great at SQL, and we want to encourage exploration and discovering new things. We've discovered things about our business just by some of these analysts wildcatting in the database, finding interesting stuff, and then exploring it, and we want to encourage that. That's really important.

Gardner: CB, in getting to understand Etsy a little bit more, I saw that you have something called Top Trends and Etsy Finds, ways that you can help people with affinity for a product or a craft or some interest to pursue that. Did that come about as a result of these technologies that you have put in place, or did they have a set of requirements that they wanted to be able to do this and then went after you to try to accommodate it? How do you pull off that Etsy Finds capability?

CB: A lot of that is cross-architecture. Some of our production data is used to find that. Then, a lot of the hard crunching is done in Vertica to find that. Some of it is MapReduce. There's a whole mix of things that go into that.

I couldn't claim for Etsy Finds, for example, that it’s all big data. There are other things that go in there, but definitely HPE Vertica plays a role in that stuff.

I'll give you another example, fraud. We fingerprint a lot of our users digitally, because we have problems with resellers. These are people who are selling resold mass-produced stuff on Etsy. It's not huge, but it's an annoyance. Those products compete against really quality handmade products that our regular sellers sell in their shops.

Sometimes it’s like a game of Whack-a-Mole. You knock one of these guys down -- sometimes they're from the Far East or other parts of the world -- and as soon as you knock one down, another one pops up. Being able to capture them quickly is really important, and we use Vertica for that. We have a team that works just on that problem.

What's next?

Gardner: Thinking about the future, with this great architecture, with your ability to do things like fraud detection and affinity correlations, what's next? What can you do that will help make Etsy more impactful in its market and make your users more engaged?

CB: The whole idea behind databases and computing in general is just making things faster. When the first punch-card machines came out in the 1930s or whatever, the phone companies could do faster billing, because billing was just getting out of control. That’s where the roots of IBM lie.

As time went by, punch cards were slow and they wanted to go faster. So they developed magnetic tape, and then spinning rust disks. Now, we're into SSDs, the flash drives. And it’s the same way with databases and getting answers. You always want to get answers faster.

We do a lot of A/B testing. We have the ability to set the site so that maybe a small percentage of users get an A path through the site, and the others a B path, and there's control stuff on that. We analyze those results. This is how we test to see if this kind of button works better than this other one. Is the placement right? If we just skip this page, is it easier for someone to buy something?

So we do A/B testing. In the past, we've done it where we had to run the test, gather the data, and then comb through it manually. But now with Vertica, the turnaround time to iterate over each cycle of an A/B test has shrunk dramatically. We get our data from the clickstreams, which go into Vertica, and then the next day, we can run the A/B test results on that.
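One standard way to read an A/B result is a two-proportion z-test on the conversion rates of the two paths. The interview does not say which statistics Etsy uses, so the sketch below is a generic example with made-up function names.

```python
import math


def conversion_rate(conversions, visitors):
    """Fraction of visitors who converted."""
    return conversions / visitors


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates.

    Assumes both groups are non-empty and the pooled rate is strictly
    between 0 and 1, so the standard error is non-zero.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / std_err


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))
```

A call such as `two_sided_p_value(two_proportion_z(100, 1000, 130, 1000))` compares a 10 percent conversion rate on path A against 13 percent on path B; a small p-value suggests the difference is unlikely to be chance alone.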

The next step is shrinking that even more. One of the themes that’s out there at the various big data conferences is streaming analytics. That's a really big thing. There is a new database out there called PipelineDB, a fork of Postgres. It allows you to create an event stream into Postgres.

You can then create a view and a window on top of that stream. Then you can pump your event data, like your clickstream data, and you can join the data in that window to your regular Postgres tables, which is really great, because we could get A/B information in real time. You set up a one minute turnaround as opposed to one day. I think that’s where a lot of things are going.
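The windowed-view idea can be sketched without any particular database: keep only the events from the last minute and maintain running counts per A/B variant. This is a plain Python illustration, not PipelineDB syntax, and the class name and one-minute default are assumptions.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events per variant over the last window_seconds."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, variant), oldest first
        self.counts = {}

    def add(self, timestamp, variant):
        # Timestamps are assumed to arrive in non-decreasing order.
        self.events.append((timestamp, variant))
        self.counts[variant] = self.counts.get(variant, 0) + 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_variant = self.events.popleft()
            self.counts[old_variant] -= 1

    def snapshot(self):
        """Return the current per-variant counts inside the window."""
        return {v: c for v, c in self.counts.items() if c > 0}
```

Each new event updates the counts immediately, so the A/B numbers are at most one window old rather than one day old.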

If you just look at the history of big data, MapReduce started about 10 years ago at Google, and that was batch jobs, overnight runs. Then, we started getting into the columnar stores to make databases like Vertica possible, and it’s really great for aggregation. That kicked it up to the next level.

Another thing is real-time analytics. It’s not going to replace any of these things, just like Vertica didn't replace Hadoop. They're complementary. Real-time streaming analytics will be complementary. So we're continuing to add these tools to our big data toolbox.

Gardner: It has compressed those feedback loops. If we provide that capability to an innovative, creative organization, the technology might drive the culture, and who knows what sort of benefits they will derive from that.

All plugged in

CB: That's very true. You touched earlier on how we do our infrastructure. I'm in data engineering, and we're responsible for making sure that our big databases are healthy and running right. But we also have our operations department. They're working on the actual pipes and hardware and making sure it’s all plugged in. It's tough to get all this stuff working right, but if you have the right people, it can happen.

I mentioned AWS earlier. The reason we were able to move off of that and save money is because we have the people who can do it. When you start using AWS extensively, what you're doing is you are paying for a very high priced but good IT staff at Amazon. If you have got a good IT staff of your own, you're probably going to be able to realize some efficiencies there, and that's really why we moved over. We do it all ourselves.

Gardner: Having it as a core competency might be an important thing moving forward.

CB: Absolutely. You have to stay on top of all this stuff. A lot is made of the word disruption, and you don't go knocking on disruption’s door; it usually knocks on yours. And you had better be agile enough to respond to it.

I'll give you an example that ties back into big data. One of the most disruptive things that has happened to Etsy is the rise of the smartphone. When Etsy started back in 2005, the iPhone wasn't around yet; it was still two years out. Then, it came on the scene, and people realized that this was a suitable device for commerce.
It’s very easy to just be complacent and oblivious to new technologies sneaking up on you. But we started seeing that there was more and more commerce being done on smartphones. We actually fell a little bit behind, as a lot of companies did five years ago. But our management made decisions to invest in mobile, and now 60 percent of our traffic is on mobile. That's turned around in the past two years and it has been pretty amazing.

Big data helps us with that, because we do a lot of crunching of what these mobile devices are doing. Mobile is not the best device maybe for buying stuff because of the form factor, but it is a really good device for managing your store, paying your Etsy bill, and doing that kind of stuff. So we analyzed all that and crunched it in big data.

Gardner: And big data allowed you to know when to make that strategic move and then take advantage of it?

CB: Exactly. There are all sorts of crossover points that happen with technology, and you have to monitor it. You have to understand your business really well to see when certain vectors are happening. If you can pick up on those, you're going to be okay.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.
