Tuesday, March 31, 2015

Novel consumer retail behavior analysis from InfoScout relies on HP Vertica big data chops

The next BriefingsDirect big data innovation case study interview highlights how InfoScout in San Francisco gleans new levels of accurate insights into retail buyer behavior by collecting data directly from consumers’ sales receipts.

In order to better analyze actual retail behaviors and patterns, InfoScout provides incentives for buyers to share their receipts, but InfoScout is then faced with the daunting task of managing and cleansing that essential data to provide actionable and understandable insights.

Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android. Read a full transcript or download a copy.

To learn more about how big -- and even messy -- data can be harnessed for near real time business analysis benefits, please join me in welcoming our guests, Tibor Mozes, Senior Vice President of Data Engineering, and Jared Schrieber, the Co-founder and CEO, both at InfoScout, based in San Francisco. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: In your business you've been able to uniquely capture strong data, but you need to treat it a lot to use it and you also need a lot of that data in order to get good trend analysis. So the payback is that you get far better information on essential buyer behaviors, but you need a lot of technology to accomplish that.

Tell us why you wanted to get to this specific kind of data and then your novel way of acquiring it.

Schrieber: A quick history lesson is in order. In the market research industry, consumer purchase panels have been around for about 50 years. They started with diaries in people’s homes, where they had to write down exactly every single product that they bought, day-in day-out, in this paper diary and mail it in once a month.

About 20 years ago, with the advent of modems in people’s homes, leading research firms like Nielsen would send a custom barcode scanner into people’s homes and ask them to scan each product they bought and then thumb into the custom scanner the regular price, the sales price, any coupons or deals that they got, and details about the overall shopping trip, and then transfer that electronically. That approach has not changed in the last 20 years.

With the advent of smartphones and mobile apps, we saw a totally new way to capture this information from consumers that would revolutionize how and why somebody would be willing to share their purchase information with a market research company.

Gardner: Interesting. What is it about mobile that is so different from the past, and why does that provide more quality data for your purposes?

Schrieber: There are two reasons in particular. The first is, instead of having consumers scan the barcode of each and every item they purchase and thumb in the pricing details, we're able to simply have them snap a picture of their shopping receipt. So instead of spending 20 minutes after a grocery shopping trip scanning every item and thumbing in the details, it now takes 15 seconds to simply open the app, snap a picture of the shopping receipt, and be done.

The second reason is why somebody would be willing to participate. Using smartphone apps we can create different experiences for different kinds of people with different reward structures that will incentivize them to do this activity.

For example, our Shoparoo app is a next-generation school fundraiser akin to Box Tops for Education. It allows people to shop anywhere, buy anything, take a picture of their receipt, and then we make an instant donation to their kid’s school every time.

Another app is more of a Tamagotchi game called Receipt Hog, where if you download the app, you have adopted a virtual runt. You feed it pictures of your receipt and it levels-up into a fat and happy hog, earning coins in a piggy bank along the way that you can then cash-out from at the end of the day.

These kinds of experiences are a lot more intrinsically and extrinsically rewarding to the panelists and have allowed us to grow a panel that’s many times larger than the next largest panel ever seen in the world, tracking consumer purchases on a day-in day-out basis.

Gardner: What is it that you can get from these new input approaches and incentivization through an app interface? Can you provide me some sort of measurement of an improved or increased amount of participation rates? How has this worked out?

Leaps and bounds

Schrieber: It's been phenomenal. In fact, our panel is still growing by leaps and bounds. We now have 200,000 people sharing with us their purchases on a day-in day-out basis. We capture 150,000 shopping trips a day. The next largest panel in America captures just 10,000 shopping trips a day.

In addition to the shopping trip data, we're capturing geolocation information, Facebook likes and interests from these people, demographic information, and more and more data associated with their mobile device and the email accounts that are connected to it.

Gardner: So yet another unanticipated consequence of the mobility trend that’s so important today.

Tibor, let’s go to you. The good news is that Jared has acquired this trove of information for you. The bad news is that now you have to make sense of it. It’s coming in, in some interesting ways, as almost a picture or an image in some cases, and at a great volume. So you have velocity, variability, and volume. So what does that mean for you as the Vice President of Data Engineering?

Mozes: Obviously this is a growing panel. It’s creating a growing volume of data that has created a massive data pipeline challenge for us over the years, and we had to engineer the pipeline so that it is capable of processing this incoming data as quickly as possible.

As you can imagine, our data pipeline has gone through an evolution. We started out with a simple solution at the beginning with MySQL and then we evolved it using Elastic Map Reduce and Hive.

But we felt that we wanted to create a data pipeline that’s much faster, so we can bring data to our customers much faster. That’s how we arrived at Vertica. We looked at different solutions and found Vertica a very suitable product for us, and that’s what we're using today.

Gardner: Walk me through the process, Tibor. How does this information come in, how do you gather it, and where does the data go? I understand you're using the HP Vertica platform as a cloud solution in the Amazon Web Services Cloud. Walk me through the process for the data lifecycle, if you will.

Mozes: We use AWS for all of our production infrastructure. Our users, as Jared mentioned, typically download one of our several apps, and after they complete a receipt scan from their grocery purchases, that receipt is immediately uploaded to our back-end infrastructure.

We try to OCR that image of the receipt, and if we can’t, we use Amazon Mechanical Turk to try to make sense of the image and turn that image into text. At the end of the day, when an image is processed, we have a fairly clean version of that receipt in a text format.

In the next phase, we have to process the text and try to attribute various items on the receipt and make the data available in our Vertica data warehouse.
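The first stage described above, automated OCR with a human fallback, can be sketched roughly as follows. This is only an illustration under stated assumptions, not InfoScout's actual code: the `transcribe` function, the `ReceiptText` type, and the `fake_ocr` and `fake_human` stand-ins are all hypothetical names, and a real system would call an OCR engine and a crowdsourcing service in their place.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ReceiptText:
    receipt_id: str
    lines: List[str]
    source: str  # "ocr" or "human"


def transcribe(
    receipt_id: str,
    image: bytes,
    ocr: Callable[[bytes], Optional[List[str]]],
    human: Callable[[bytes], List[str]],
) -> ReceiptText:
    """Try automated OCR first; fall back to human transcription if it fails."""
    lines = ocr(image)
    if lines:  # OCR returned usable text
        return ReceiptText(receipt_id, lines, "ocr")
    return ReceiptText(receipt_id, human(image), "human")


# Hypothetical stand-ins for illustration only.
def fake_ocr(image: bytes) -> Optional[List[str]]:
    # Pretend that "blurry" images cannot be read automatically.
    return None if image.startswith(b"blurry") else ["MILK 2% GAL 3.49"]


def fake_human(image: bytes) -> List[str]:
    return ["ICE CRM VAN 4.99"]


if __name__ == "__main__":
    print(transcribe("r1", b"clear-image", fake_ocr, fake_human))
    print(transcribe("r2", b"blurry-image", fake_ocr, fake_human))
```

Passing the two transcription steps in as functions keeps the fallback logic separate from whichever OCR or crowdsourcing service sits behind it.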

Then, our customers, using a business intelligence (BI) platform that we built especially for them, can analyze the data. The BI platform connects to Vertica, so our customers can analyze various metrics of our users and their shopping behavior.

Gardner: Jared, back to you. There's an awful lot of information on a receipt. It’s supposed to be very complex, given not just the date and the place and the type of retail organization, but all the different SKUs, every item that’s possibly being bought. How do you attack that sort of a data problem from a schema and cleansing and extract, transform, load (ETL) and then making it therefore useful?

Schrieber: It’s actually a huge challenge for us. It's quite complex, because every retailer’s receipt is different: the way that they structure the receipt, the level of specificity about the items on the receipt, and the existence of product codes, whether they are public product codes like the kind you see on a barcode for a soda product, versus an internal product code that retailers use as a stock keeping unit internally, versus just a short description on the receipt.

One of our challenges as a company is to figure out the algorithmic methods that allow us to identify what each one of those codes and short descriptions actually represent in terms of a real world product or category, so that we can make sense of that data on behalf of our client. That’s one of the real challenges associated with taking this receipt-based approach and turning that into useful data for our clients on a daily basis.

Gardner: I imagine this would be of interest to a lot of different types of information and data gathering. Not only are pure data formats and text formats being brought into the mix, as has been the case for many years, but this image-based approach, the non-structured approach.

Any lessons learned here in the retail space that you think will extend to other industries? Are we going to be seeing more and more of this image-based approach to analysis gathering?

Schrieber: We certainly are. As an example, just take Google Maps and Google Street View, where they're driving around in cars, capturing images of house and building numbers, and then associating that to the actual map data. That’s a very simple example.

A lot of the techniques that we're trying to apply in terms of making sense of short descriptions for products on receipts are akin to those being used to understand and perform social-media analytics. When somebody makes a tweet, you try to figure out what that tweet is actually about and means, with those abbreviated words and shortened character sets. It’s very, very similar types of natural language processing and regular expression algorithms that help us understand what these short descriptions for products actually mean on a receipt.
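To give a feel for the regular-expression side of this, here is a minimal sketch that expands abbreviations in a receipt line and maps the result to a category. It is an illustration only: the abbreviation table, the category rules, and the assumed line layout (description followed by a price) are all hypothetical, and a production system would need far richer, retailer-specific rules.

```python
import re
from typing import Optional, Tuple

# Hypothetical abbreviation table (illustrative, not a real dictionary).
ABBREVIATIONS = {
    "CRM": "CREAM",
    "VAN": "VANILLA",
    "CHOC": "CHOCOLATE",
    "GAL": "GALLON",
}

# Hypothetical keyword-to-category rules, checked in order.
CATEGORY_RULES = [
    (re.compile(r"\bICE CREAM\b"), "Ice Cream"),
    (re.compile(r"\bMILK\b"), "Milk"),
    (re.compile(r"\bCAT (FOOD|FD)\b"), "Cat Food"),
]

# Assumed line layout: free-text description, whitespace, price like 4.99.
LINE_PATTERN = re.compile(r"^(?P<desc>.+?)\s+(?P<price>\d+\.\d{2})$")


def parse_line(line: str) -> Optional[Tuple[str, str, float]]:
    """Return (expanded description, category, price), or None if no price is found."""
    match = LINE_PATTERN.match(line.strip().upper())
    if not match:
        return None
    words = [ABBREVIATIONS.get(w, w) for w in match.group("desc").split()]
    desc = " ".join(words)
    category = next(
        (name for pattern, name in CATEGORY_RULES if pattern.search(desc)),
        "Unknown",
    )
    return desc, category, float(match.group("price"))


if __name__ == "__main__":
    for raw in ["ICE CRM VAN 4.99", "MILK 2% GAL 3.49", "THANK YOU"]:
        print(raw, "->", parse_line(raw))
```

Lines that do not end in a price, such as a footer message, are simply skipped rather than guessed at.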

Gardner: So we've had some very substantial data complexity hurdles to overcome. Now we have also the basic blocking and tackling of data transport, warehouse, and processing platform.

Going back to Tibor, once you've applied your algorithms, sliced and diced this information, and made it into something you can apply to a typical data warehouse and BI environment, how did you overcome these issues about the volume and the complexity, especially now that we're dealing with a cloud infrastructure?

Compression algorithms

Mozes: One of the benefits of Vertica, as we went into the discovery process, was the compression algorithms that Vertica is using. Since we have a large volume of data to deal with and build analytics from, it has turned out to be beneficial for us that Vertica is capable of compressing data extremely well. As a result of that, some of our core queries that require a BI solution can be optimized to run super fast.

You also talked about the cloud solution, why we went into the cloud and what is the benefit of doing that. We really like running our entire data pipeline in AWS because it’s super easy to scale it up and down.

It’s easy for us to build a new Vertica cluster, if we need to evaluate something that’s not in production yet, and if the idea doesn’t work, then we can just pull it down. We can scale Vertica up, if we need to, in the cloud without having to deal with any sort of contractual issues.

Schrieber: To put this in context, now we're capturing three times as much data every day as we were six months ago. The queries that we're running against this have probably gone up 50X to 100X in that time period as well. So when we talk about needing to scale this up quickly, that’s a prime example as to why.

Gardner: What has happened in just the last six months that’s required that ramp up? Is it just because of the popularity of your model, the impactfulness and effectiveness of the mobile app acquisition model, or is it something else at work here?

Schrieber: It’s twofold. Our mobile apps have gotten more and more popular and we've had more and more consumers adopt them as a way to raise money for their kid’s school or earn money for themselves in a gamified way by submitting pictures of their receipts. So that’s driven massive growth in terms of the data we capture.

Also, our client base has more than tripled in that time period as well. These additional clients have greater demands of how to use and leverage this data. As those increase, our efforts to answer their business questions multiply the number of queries that we are running against this data.

Gardner: That, to me, is a real proof point of this whole architectural approach. You've been able to grow by a factor of three in your client base in six months, but you haven’t gone back to them and said, "You'll have to wait for six months while we put in a warehouse, test it, and debug it." You've been able to just take that volume and ramp up. That’s very impressive.

Schrieber: I was just going to say, this is a core differentiator for us in the marketplace. The market research industry has to keep up with the pace of marketing, and that pace of marketing has shifted from months of lead time for TV and print advertising down to literally hours of lead time to be able to make a change to a digital advertising campaign, a social media campaign, or a search engine campaign.

So the pace of marketing has changed and the pace of market research has to keep up. Clients aren’t willing to wait for weeks, or even a week, for a data update anymore. They want to know today what happened yesterday in order to make changes on-the-fly.

Reports and visualization

Gardner: We've spoken about your novel approach to acquiring this data. We've talked about the importance of having the right platform and the right cloud architecture to both handle the volume as well as scale to a dynamic rapidly growing marketplace.

Let’s talk now about what you're able to do for your clients in terms of reports, visualization, frequency, and customization. What can you now do with this cloud-based Vertica engine and this incredibly valuable retail data in a near real-time environment for your clients?

Schrieber: A few things on the client side. Traditional market research providers of panel data have to put very tight guardrails on how clients can access and run reports against the data. These queries are very complex. The numerators and denominators for every single record of the reports are different and can be changed on-the-fly.

If, all of a sudden, I want to look at anyone who shopped at Walmart in the last 12 months that has bought cat food in the last month and did so at a store other than Walmart, and I want to see their purchase behavior and how they shop across multiple retailers and categories, and I want to do that on-the-fly, that gets really complex. Traditional data warehousing and BI technologies don't support allowing general business-analyst users to be able to run those kinds of queries and reports on-demand, yet that’s exactly what they want.
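The logic of that example question can be written down as a small sketch. It is illustrative only: the `Purchase` record, the `cat_food_leakage` function, and the 365-day and 30-day windows are assumptions made for the example, and a real system would run the equivalent query inside the data warehouse rather than in application code.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Set


@dataclass(frozen=True)
class Purchase:
    shopper_id: str
    retailer: str
    category: str
    day: date


def cat_food_leakage(
    purchases: Iterable[Purchase],
    today: date,
    retailer: str = "Walmart",
) -> Set[str]:
    """Shoppers who visited `retailer` in the last 12 months and bought
    cat food somewhere else in the last month."""
    records = list(purchases)
    year_ago = today - timedelta(days=365)   # assumed 12-month window
    month_ago = today - timedelta(days=30)   # assumed 1-month window

    shopped_at_retailer = {
        p.shopper_id
        for p in records
        if p.retailer == retailer and year_ago <= p.day <= today
    }
    bought_elsewhere = {
        p.shopper_id
        for p in records
        if p.category == "Cat Food"
        and p.retailer != retailer
        and month_ago <= p.day <= today
    }
    # Both conditions must hold for the same shopper.
    return shopped_at_retailer & bought_elsewhere


if __name__ == "__main__":
    sample = [
        Purchase("a", "Walmart", "Milk", date(2014, 12, 1)),
        Purchase("a", "Target", "Cat Food", date(2015, 3, 20)),
        Purchase("b", "Walmart", "Cat Food", date(2015, 3, 25)),
    ]
    print(cat_food_leakage(sample, date(2015, 3, 31)))
```

The resulting set of shoppers is only the starting point; the reports described here would then look at everything those shoppers bought across retailers and categories.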

They want to be able to ask those business questions and get answers. That’s been key to our strategy, which is to allow them to do so themselves, as opposed to coming back to them and saying, "That’s going to be a pretty big project. It will require a few of our engineers. We'll come back to you in a few weeks and see what we can do." Instead, we can hand them the tools directly in a guided workflow to allow them to do that literally on-the-fly and have answers in minutes versus weeks.

Gardner: Tibor, how does that translate into the platform underneath? If you're allowing for a business analyst type of skill set to come in and apply their tools, rather than deep SQL queries or other more complex querying tools, what is it that you need from your platform in order to accommodate that type of report, that type of visualization, and the ability to bring a larger set of individuals into this analysis capability?

Mozes: Imagine that our BI platform can throw out very complex SQL queries. Our BI platform essentially is using, under the hood, a query engine that's going to run queries against Vertica. Because, as Jared mentioned, the questions are so complex, some of the queries that we run against Vertica are very different than your typical BI use cases. They're very specialized and very specific.

One of the reasons we went with Vertica is its ability to compute very complex queries at a very high speed. We look at Vertica not as simply another SQL database that scales very well and that’s very fast, but we also look at it as a compute engine.

So as part of our query engine, we are running certain queries and certain data transformations that would be very complicated to run outside Vertica.

We take advantage of the fact that you can create and run custom user-defined functions (UDFs) that are not part of ANSI SQL-99. We also take advantage of some of the special functions that are built into Vertica, allowing data to be sessionized very easily.

Analyzing behavior

Jared can talk about some of the use cases where we like to analyze users’ entire shopping trips. In order to do that, we have to stitch together different points in time that the user has gone through and shopped at various locations. And using some of the built-in functions in Vertica that are not standard SQL, we can look at shopping journeys, which we call trip circuits, and analyze user behavior along the trip.
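As a rough illustration of what sessionizing store visits into trip circuits means, the sketch below groups each shopper's stops into a new circuit whenever the gap since the previous stop exceeds a threshold. This is not the Vertica functionality referred to above, only a plain Python stand-in; the `trip_circuits` function, the `Stop` tuple, and the four-hour gap are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from itertools import groupby
from typing import Dict, List, Tuple

# (shopper_id, timestamp of the receipt, retailer)
Stop = Tuple[str, datetime, str]


def trip_circuits(
    stops: List[Stop],
    max_gap: timedelta = timedelta(hours=4),  # assumed session timeout
) -> Dict[str, List[List[str]]]:
    """Group each shopper's stops into ordered circuits, starting a new
    circuit when the time since the previous stop exceeds max_gap."""
    circuits: Dict[str, List[List[str]]] = {}
    ordered = sorted(stops, key=lambda s: (s[0], s[1]))
    for shopper, group in groupby(ordered, key=lambda s: s[0]):
        sessions: List[List[str]] = []
        previous = None
        for _, timestamp, retailer in group:
            if previous is None or timestamp - previous > max_gap:
                sessions.append([])  # gap too long: start a new circuit
            sessions[-1].append(retailer)
            previous = timestamp
        circuits[shopper] = sessions
    return circuits


if __name__ == "__main__":
    day = datetime(2014, 11, 28)
    sample = [
        ("a", day.replace(hour=10, minute=30), "Walmart"),
        ("a", day.replace(hour=9), "Target"),
        ("a", day.replace(hour=20), "Kroger"),
    ]
    print(trip_circuits(sample))
```

With circuits in this form, the position of a retailer within a circuit (first stop, second stop, last stop) becomes a simple list index.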

Gardner: Tibor, what other ways can you be using and exploiting the Vertica capabilities in the deliverables for your clients?

Mozes: Another reason we decided to go with Vertica is its ability to optimize very complex queries. As I mentioned, our BI platform is using a query engine under the hood. So if a user asks a very complicated business question, our BI platform turns that question into a very complicated query.

One of the big benefits of using Vertica is being able to optimize these queries on the fly. It’s easy to do this by running the database optimizer to build custom projections, making queries run much faster than we could before.

Gardner: I always think it's more impactful for us to learn through an example rather than just hear you describe this. Do you have any specific InfoScout retail client use cases where you can describe how they've leveraged your solution and how some of these both technical and feature attributes have benefited them -- an example of someone using InfoScout and what it's done for them?

Schrieber: We worked with a major retailer this holiday season to track in real time what was happening for them on Thanksgiving Day and Black Friday. They wanted to understand their core shoppers, versus less loyal shoppers, versus non-core shoppers, how these people were shopping across retailers on Thanksgiving Day and Black Friday, so that the retailer could try to respond in more real time to the dynamics happening in the marketplace.

You have to look at what it takes to do that, for us to be able to get those receipts, process them, get them transcribed, get that data in, get the algorithms run to be able to map it to the brands and categories and then to calculate all kinds of metrics. The simplest ones are market share; the most complex ones have to do with what Tibor had mentioned: the shopper journey or the trip circuit.

We tried to understand, when this retailer was the shopper's first stop, what were they most likely to buy at that retailer, how much were they likely to spend, and how is that different than what they ended up buying and spending at other retailers that followed? How does that contrast to situations where that retailer was the second stop or the last stop of the day in that pivotal shopping day that is Black Friday?

For them to be able to understand where they were winning and losing among what kinds of shoppers who were looking for what kinds of products and deals was an immense advantage to them -- the likes of which they never had before.

Decision point

Gardner: This must be a very sizable decision point for them, right? This is going to help you decide where to build new retail outlets, for example, or how to structure the experience of the consumer walking through that particular brick-and-mortar environment.

When we bring this sort of analysis to bear, this isn’t refining at a modest level. This could be a major benefit to them in terms of how they strategize and grow. This could be something that really deeply impacts their bottom line. Is that not the case?

Schrieber: It has implications as to what kinds of categories they feature in their television, display advertising campaigns, and their circulars. It can influence how much space they give in their store to each one of the departments. It has enormous strategic implications, not just tactical day-to-day pricing decisions.

Gardner: Now, that was a retail example. I understand you also have clients that are interested in seeing how a brand works across a variety of outlets or channels. Is there another example you can provide on somebody who is looking to understand a brand impact at a wider level across a geography for example?

Schrieber: I'll give you another example that relates to this. A retailer and a brand were working together to understand why the brand sales were down at this particular retailer during the summer time. To make it clear for you, this is a brand of ice-cream. Ice cream sales should go up during the summer, during the warmer months, and the retailer couldn’t understand why their sales were underperforming for this brand during the summer.

To figure this out, we had to piece-together, along the shopper journey over time, not only in the weeks during the summer months, but year round to understand this dynamic of how they were shopping. What we were able to help the client quickly discover was that during the summer months people eat more ice-cream. If they eat more ice-cream, they're going to want larger pack sizes when they go and buy that ice-cream. This particular retailer tended to carry smaller pack sizes.

So when the summer months came around, even though people had been buying their ice-cream at this retailer in the winter and spring, they now wanted larger pack sizes and they were finding them at other retailers, and switching their spend over to these other retailers.

So for the brand, the opportunity was a selling story to the retailer to give the brand more freezer space and to carry an additional assortment of products to help drive greater sales for that brand, but also to help the retailer grow their ice cream category sales as well.

Idea of architecture

Gardner: So just that insight could really help them figure that out. They probably wouldn’t have been able to do it any other way.

We've seen some examples of how impactful this can be and how much a business can benefit from it. But let’s go back to the idea of the architecture. For me, one of my favorite truths in IT is that architecture is destiny. That seems to be the case with you, using the combination of AWS and HP Vertica.

It seems to me that you don’t have to suffer the costs of a large capital outlay of having your own data center and facilities. You're able to acquire these very advanced capabilities at a price point that's significantly less from a capital outlay and perhaps predictable and adjustable to the demand.

Is that something you then can pass along? Tell me a little bit about the economics of how this architectural approach works for you?

Mozes: One of the benefits of using AWS is that it’s very easy for us to adjust our infrastructure on demand, as we see fit. Jared has referred to some of the examples that we had before. We did a major analysis for a large retailer on Black Friday, and we had some special promotions to our mobile app users going on at that point. Imagine that our data volume would grow tremendously from one day to the next couple of days, and then, after the promotion is over and the big shopping season is over, our volume would come down somewhat.

When you run an infrastructure in the cloud in combination with online data storage and data engine, it's very easy to scale it up and down. It’s very cost efficient to run an operation where you can just add additional computing power as you need, and then when you don’t need that anymore, you can scale it down.

We did this during a time period when we had to bring a lot of fresh data online quickly. We could just add additional nodes, and we saw very close to linear scalability by increasing our cluster size.

Schrieber: On the business side, the other advantage is we can manage our cash flows quite nicely. If you think about running a startup, cash is king, and not having to do large capital outlays in advance, but being able to adjust up and down with the fluctuations in our businesses, is also valuable.

Gardner: We're getting close to the end of our time. I wonder if you have any other insights into the business benefits from an analytics perspective of doing it this way. That is to say, incentivizing consumers, getting better data, being able to move that data and then analyze it at an on-demand infrastructure basis, and then deliver queries in whole new ways to a wider audience within your client-base.

I guess I'm looking for how this stands up both to the competitive landscape, but also to the past. How new and how innovative is this in marketing? Then we'll talk about where we go next? Let’s try to get a level set as to how new and how refreshing this is, given what the technology enables both at cloud basis and the mobility basis and then the core stuff, the underlying analytics platform basis.

Product launch

Schrieber: We have an example that's going on right now around a major new product launch for a very large consumer goods company. They chose us to help monitor this launch, because they were tired of waiting for six months for any insight in terms of who is buying it, how they were discovering it, how they came about choosing it over the competition, how their experience was with the product, and what it meant for their business.

So they chose to work with us for this major new brand launch, because we could offer them visibility within days or weeks of launching that new product in the market to help them understand who were the people who were buying, was it the target audience that they thought it was going to be, or was it a different demographic or lifestyle profile than they were expecting. If so, they might need to change their positioning or marketing tactics and targeting accordingly.

How are these people discovering the products? We're able to trigger surveys to them in the moment, right after they've made that purchase, and then flow that data back through to our clients to help them understand how these people are discovering it. Was it a TV advertisement? Was it discovered on the shelf or display in the store? Did a friend tell them about it? Was their social media marketing campaign working?

We're also able to figure out what these people were buying before. Were they new to this category of product? Or did they not use this kind of product before and were just giving it a try? Were they buying a different brand and have now switched over from that competitor? And, if so, how did they like it by comparison, and will they repeat purchase? Is this brand going to be successful? Is this meeting needs?

These are enormous decisions. Often, hundreds of millions of dollars are spent by major consumer goods companies on new brand launches. Getting this quick feedback in terms of what’s working and what’s not, who to target with what kind of messaging, and what it’s doing to the marketplace in terms of stealing share from competitors or driving new people to the product category can influence major investment decisions along the lines of whether we need to build the new manufacturing facility, do we need to change our marketing campaigns, or should we go ahead and invest in that TV Super Bowl ad, because this really has a chance to go big?

These are massive decisions that these companies can now make in a timely manner, based on this new approach of capturing and making use of the data, instead of waiting six months on a new product launch. They're now waiting just weeks and are able to make the same kinds of decisions as a result.

Gardner: So, in a word it’s unprecedented. You really just haven’t been able to do this before.

Schrieber: It’s not been possible before at all, and I think that’s really what’s fueling the growth in our business.

Look to the future

Gardner: Let’s look to the future quickly. We hear a lot about the Internet of Things. We know that mobile is only partially through its evolution. We're going to see more smart phones in more hands doing more types of transactions around the globe. People will be using their phones for more of what we have thought of as traditional business in commerce. So that opens up a lot more information that’s generated and that we therefore need to gather and then analyze.

So where do we go next? How does this generate additional novel capabilities, and then where do we go perhaps in terms of verticals? We haven’t even talked about food or groceries, hospitality, or even health care.

So without going too far -- this could be another hour conversation in itself -- maybe we could just tease the listener and the reader with where the potential for this going forward is.

Schrieber: If you think about Internet of Things as it relates to our business, there are a couple of exciting developments. One is the use of things like beacons inside of stores. Now we can know exactly which aisle people have walked down and what shelf they’ve stood in front of, and what product they've interacted with. That beacon is communicating with their smartphone and that smartphone is tied to our user account in a way that we're surveying these individuals or triggering surveys to them, in-the-moment, as they shop.

That’s not something that’s been doable before. It’s something that the Internet of Things, and very specifically beacons linking with smartphones, will allow us to do going forward. That will open up entirely new fields of research and consumer understanding about how people shop and make decisions at the shelf.

The same is true inside the home. We talk about the Internet of Things as it relates to smart refrigerators or smart laundry machines, etc. Understanding daily lifestyle activities and how people make the choice of which product to use and how to use them inside their home is a field of research that is under-served today, and one that the Internet of Things is really going to open up in the years to come.

Gardner: Just quickly, what are other retail sectors or vertical industries where this would make a great deal of sense?
Schrieber: I have a friend who runs an amazing business called Wavemark, which is basically an Internet of Things for medical devices and medical consumables inside of hospitals and care facilities, with the ability to track inventory in real time, tying it to patients and procedures, tying it back to billing and consumption.
Making all of that data available to the medical device manufacturers, so that they can understand how and when their products are being used in the real world in practice, is revolutionizing that industry. We're seeing it in healthcare, and I think we're going to see it across every industry.

Engineering perspective

Gardner: Last word to you, Tibor. Given what Jared just told us about the greater applicability, the model and the architecture come back to mind for me: the cloud, the mobile device, the data, the engine, and the ability to deal with that velocity, volume, and variability at a cost point that is doable and scales up and down. Are there any thoughts about this from an engineering perspective and where we go next?

Mozes: We see that with all these opportunities bubbling up, the amount of data that we have to process on a daily basis is just going to continually grow at an exponential rate. We continue to get additional information on shopping behavior and more data from external data sources. Our data is just going to grow. We will need to engineer everything to be as scalable as possible.

Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android. Read a full transcript or download a copy. Sponsor: HP.


Wednesday, March 25, 2015

IT operations modernization helps energy powerhouse Exelon acquire businesses

This next BriefingsDirect IT innovation discussion examines how Exelon Corporation, based in Chicago, employs technology and process improvements to not only optimize their IT operations but also to both help manage a merger and acquisition transition, and to bring outsourced IT operations back in-house.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn more about how this leading energy provider in the US, with a family of companies having $23.5 billion in annual revenue, accomplishes these goals, we're joined by Jason Thomas, Manager of Service, Asset and Release Management at Exelon. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: I gave a brief overview of Exelon, but tell us a little bit more. It's quite a large organization that you're involved with.

Thomas: We are vast and expansive. We have a large nuclear fleet, around 40-odd nuclear power plants, and three utilities: ComEd in Chicago, in the Illinois space; PECO out of Philadelphia; and BGE in Baltimore.

So we have large urban utility centers. We also have a large retail presence with the Constellation brand and the sale of power both to corporations and to users. So there's a lot that we do, obviously, in the utility space, and there is some element of the trade, the commodity trading side, as well, in trading power in these markets.

Gardner: I imagine it must be quite a large IT organization to support all that?

Thomas: There are 1,200 to 1,300 IT employees across the company.
Gardner: Tell us about some of the challenges that you've been facing in managing your IT operations and making them more efficient. And, of course, we'd like to hear more about the merger between Constellation and Exelon back in 2012.

Merger is a challenge

Thomas: The biggest challenge is the merger. Obviously, our scale and the number of, for lack of a better word, things that we had to monitor, be aware of, and know about vastly increased. So we had to address that.

A lot of our efforts around the merger and post-merger were around bringing everything into one standard monitoring platform, extending that monitoring out, leveraging the Business Service Management (BSM) suite of products, leveraging Universal Configuration Management Database (UCMDB).

Then there was a lot around consolidating asset management. In early 2013, we moved to Asset Manager as our asset management platform of choice, consolidating data from Exelon from their tool, the Cergus CA Argis tool, into Asset Manager in support of moving to new IT billing that would be driven out of the data in Asset Manager, and leveraging some of the executive scorecard and financial manager pieces to make that happen.

There was also a large effort through 2013 to move the company to a standardized platform to support our service desk, incident management, and also our service catalog for end-users. But a lot of this was driven last year around the in-sourcing of our relationship with Computer Sciences Corporation for our IT operations.

This was to basically realize a savings to the company of $12 to $15 million annually from the management of that contract, and also to move both the management and the expertise in house and leverage a lot of the processes that we built up and that had grown through the company as a whole.

Gardner: So knowing yourself well in terms of your IT infrastructure and all the elements of that is super important, and then bringing an in-sourcing transition into the picture involves quite a bit of complexity.

What do you get when you do this well? Is there a sense of better control, better security, or culture? What is it that rises to the top of your mind when you know that you have your IT service management (ITSM) in order, when you have your assets and configuration management data in order? Is it sleeping better at night? Is it a sense of destiny you have fulfilled -- or what?

Thomas: Sleeping better at night. There is an element of that, but there's also sometimes the aspect of, "Now what's next?" So, part of it is that there's an evolutionary aspect too. We've gotten everything in one place. We're leveraging some of the integrations, but then what’s next?

It's more restful. It's now deciding how we better position ourselves to show the value of these platforms. Obviously, there's a clear monetary value of what we did to in-source, but now how do we show the business the value of what we have done? Moving to a common set of tools helps to get there. You've leveled the playing field and you have that common set of tools that you're going to drive to take you to the next level.

Gardner: What might that next level be? Is it a cloud transition? Is it more of a hybrid sourcing for IT? Is this enabling you to take advantage of the different devices in terms of mobile? Where does it go?

Automation and cloud

Thomas: A lot of it is really around automation, the intermediate step around cloud. We've looked at cloud. We do have areas where the company has leveraged it. IT is still trying to wrap their heads around how we do it, and then also how we expose that to the rest of the organization.

But the steps we’ve done around automation are very key in making leaner operations, IT operations, but also being able to do things in an automated fashion, as opposed to requiring the manual elements that, in some cases, we had never done prior to the merger.

Gardner: Any examples? You mentioned $15 million in savings, but are there any other metrics of success or key performance indicator (KPI)-level paybacks that you can point to in terms of having all this in place for managing and understanding your IT?

Thomas: We're still going through what it is we're going to measure and present. There's been a standard set of things that we've measured around our availability and our incidents and whether these incidents are caused by IT, by infrastructure.

We've done a lot better operationally. Now it's taking some of those operational aspects and making them a little bit more business-centric. So for the KPIs, we're going through that process of determining what we're going to measure ourselves against.

Gardner: Jason, having gone through quite a big and complex undertaking in getting your ITSM and Application Lifecycle Management (ALM) activities in order, what comes next? Maybe a merger and acquisition is going to push you in a new direction.

Thomas: We recently announced the intent to acquire Pepco Holdings, which is the regional utility in the Washington, DC area, and that further widens our footprint in the mid-Atlantic area. So yeah, we get to do it all over again with a new partner, bringing Pepco in and doing some elements of this again.

Gardner: Having gone through this and anticipating yet another wave, what words of wisdom might you provide in hindsight for those who are embarking on a more automated, streamlined, and modern approach to IT operations?
Thomas: One of the key things is how you're changing and how you do IT operations. Moving towards automation, tools aside, there's a lot of organizational change if you're changing how people do what they do or changing people's jobs or the perception of that.

You need to be clear. You need to clearly communicate, but you also need to make sure that you have the appropriate support and backing from leadership and that the top-down communication is the same message. We certainly had that, and it was great, but there's always going to be that challenge of making sure everybody is getting that communication, getting the message, and getting constant reinforcement of that.

Organizational changes resulting from a large merger or acquisition are huge. It's key to show the benefits, even to the people who are obviously going to reap some of these immediate benefits, those in IT. You know the business is going to see some. It's couching that value in the means or method appropriate for those actors, all of those stakeholders.

Full circle

Gardner: Of course, you have mentioned working through a KPI definition and working the executive scorecard. That makes it full circle, doesn’t it?

Thomas: Defining those KPIs, but also having one place where those KPIs can be viewed, seen easily, and drilled into is big. To date, it's been a challenge to provide some of that historiography around that data. Now, you have something where you can even more readily drill into it to see that data -- and that’s huge.

Presenting that, being able to show it, and being able to show it in a way that people can see it easily, is huge, as opposed to just saying, "Well, here's the spreadsheet with some graphs" or "Here’s a whiz-bang PowerPoint doc."

Gardner: And, Jason, I suppose this points to the fact that IT is really maturing. Compared to other business services and functions in corporations, things that had been evolving for 80 or 100 years, IT is, in a sense, catching up.

Thomas: It's catching up, but I also think it's more of a reflection. It's a reflection of a lot of the themes of the new style of IT. A lot of that is that consumerization aspect. In fact, if you look at the last 10 years, the wide presence of all of these smart devices and smartphones is huge.

We have brought to most people something that was never easily accessible. And having to take that same aspect and make it part of how you present what you do in IT is huge. You see it in how you're manifesting it in your various service catalogs and some of the efforts that we're undertaking to refine and better the processes that underlie our technical service catalog to have a better presentation layer.

That technical service catalog will refer to what we've seen with Propel. It's an easier, nicer, friendlier way to interact, and people expect that. Why can’t this be more like my app store, or why can't this be more like X.

Is IT catching up or has IT become more reachable, has become more warm and fuzzy as opposed to something that’s cold, hard, and stored away somewhere? You kind of know about it, and perhaps the guys in the basement are the ones who are doing all the heavy lifting, and it's more tangible.

Gardner: Humanization of IT, perhaps.

Thomas: Absolutely.

Gardner: All right, one last area I want to get into before we sign off. We've heard quite a bit about The Machine, HP unveiling more detail from its labs activities. It’s not necessarily on a product roadmap yet, but it’s described as having a lower footprint and a much more rapid ability to join compute and memory, and as reducing the size of the data center down to the size of a refrigerator.

I know that it's on the horizon, but how does that strike you, and how interesting is that for you?

Ramp up/ramp down

Thomas: It's interesting, because it allows you to get a bit more ability to ramp up or ramp down, based on what you need, as opposed to you having x amount of servers and x amount of storage that's always somewhere. It gives you a lot more flexibility and, to some extent, gives you a bit more tenability. It's directly applicable to certain aspects of the business, where you need that capability to ramp up and ramp down much more easily.
I had a conversation with one of my peers about that. We were talking about both that and the Moonshot aspect, the ability to have that for a lot of the customer-facing websites, and the ability to tie them in, in particular the utility customer-facing websites, whose utilization tends to spike during weather events.

While they don't spike all at the same time, there is the potential opportunity in the Mid-Atlantic of all the utilities spiking at the same time around a hurricane or Sandy-esque event. There's obviously a need to be able to respond to that kind of demand, and that technology positions you with the flexibility to do that rather quickly and easily.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.



Thursday, March 19, 2015

Axeda's machine cloud produces on-demand IoT analysis services

This BriefingsDirect big data innovation discussion examines how Axeda, based in Foxboro, Mass., has created a machine-to-machine (M2M) capability for analysis -- in other words, an Axeda Machine Cloud for the Internet of Things (IoT).

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn more about how Axeda produces streams of massive data to multiple consumer dashboards that analyze business issues in near-real-time, we're joined by Kevin Holbrook, Senior Director of Advance Development at Axeda. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: We have the whole Internet of Things (IoT) phenomenon. People are accepting more and more devices, end points, sensors, even things within the human body, delivering data out to applications and data pools. What do you do in terms of helping organizations start to come to grips with this M2M and IoT data demand?
Holbrook: It starts with the connectivity space. Our focus has largely been in OEMs, equipment manufacturers. These are people who have the "M" in the M2M or the "T" in the Internet of Things. They are manufacturing things.

The initial drivers to have a handle on those things are basic questions, such as, "Is this device on?" There are multi-million dollar machines that are currently deployed in the world where that question can’t be answered without a phone call.

Initial driver

That was the initial driver, the seed, if you will. We entered into that space from the remote-service angle. We deployed small-agent software to the edge to get the first measurements from those systems and get them pushed up to the cloud, so that users can interact with it.

That grew into remote access -- telnet sessions or remote desktop -- being able to physically get down there, debug, tweak, and look at the devices that are operating. From there, we grew into software distribution, or content distribution. That could be anything from firmware updates to physically distributing configuration and calibration files for the instrument. We're recently seeing an uptake in content distribution for things like digital signage or in-situ ads being displayed on consumer goods.

From there, we started aggregating data. We have about 1.5 million assets connected to our cloud now globally, and there are all kinds of data coming in. Some of it's very, very basic from a resource standpoint, looking at CPU consumption, disk space, available memory, things of that nature.

It goes all the way through to usage and diagnostics, so that you can get a very granular impression of how this machine is operating. As you begin to aggregate this data, all sorts of challenges come out of it. HP has proven to be a great partner for starting to extract value.

We can certainly get to the data, we can connect the device, and we can aggregate that data to our partners or to the customer directly. Getting value from that data is a completely different proposition. Data for data’s sake is not high value.

Gardner: What is it that you're using Vertica for to do that? Are we creating applications, are we giving analysis as a service? How is this going to market for you?

Holbrook: From our perspective, Vertica represents an endpoint. We've carried the data, cared for the data, and made sure that the device was online, generating the right information and getting it into Vertica.

When we approach customers, we're approaching it from a joint-sale perspective. We're the connectivity layer, the instrumentation, the business automation layer there, and we're getting it into Vertica, so that it can be the seed for applications for business intelligence (BI) and for analytics.

So, we are the lowest component in the stack when we walk into one of these engagements with Vertica. Then, it's up to them, on a customer-by-customer basis, to determine what applications to bring to the table. A lot of that is defined by the group within the organization that actually manages connectivity.

We find that there's a big difference between a service organization, which is focused primarily on keeping things up and running, versus a business unit that’s driving utilization metrics, trying to determine not only how things are used, but how it can influence their billing.

Business use

We've found that that's a place where Vertica has actually been quite a pop for us in talking to customers. They want to know not just the simple metrics of the machines' operation, but how that reflects the business use of it.

The entire market has shifted and continues to shift. I was somewhat taken aback only a couple of weeks ago, when I found out that you can no longer buy a jet engine. I thought this was a piece of hardware you purchased, as opposed to something that you may have rented and paid per use. And so [the model changes to leasing] as the machines get bigger and bigger. We have GE and the Bureau of Engraving and Printing as customers.

We certainly have some very large machines connected to our cloud and we're finding that these folks are shifting away from the notion that one owns a machine and consumes it until it breaks or dies. Instead, one engages in an ongoing service model, in which you're paying for the use of that machine.

While we can generate that data and provide some degree of visibility and insight into that data, it takes a massive analytics platform to really get the granular patterns that would drive business decisions.

Gardner: It sounds like many of your customers have used this for some basic blocking and tackling about inventory and access and control, then moved up to business metrics of how it is being used, how we're billing, audit trails, and that sort of thing. Now, we're starting to look at a whole new type of economy. It's a services economy, based on cloud interactivity, where we can give granular insights, and they can manage their business very, very tightly.

Any thoughts about what's going to be required of your organization to maintain scale? The more use cases and the more success, of course, the more demand for larger data and even better analytics. How do you make sure that you don't run out of runway on this?

Holbrook: There are a couple of strategies we've taken, but before I dive into that, I'll say that the problem is further complicated by the issue of data homing. There's not only a ton of data being generated, but also regulatory and compliance requirements that dictate where you can even leave that data at rest. Just moving it around is one problem, and where it sits on a disk is a totally different problem. So we're trying to tackle all of these.

The first way to address the scale for us from an architectural perspective was to try to distribute the connectivity. In order for you to know that something's running, you need to hear from it. You might be able to reach out, what we call contactability, to say, "Tell me if you're still running." But, by and large, you know of a machine's existence and its operation by virtue of it telling you something. So even if a message is nothing more than "Hello, I'm here," you need to hear from this device.
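As a rough illustration of the idea that a device is known to be running only because it has recently said something, the following minimal Python sketch tracks the last message time per device. The class name, the method names, and the 60-second timeout used below are hypothetical choices for illustration, not details of the Axeda platform.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks when each device was last heard from (hypothetical helper)."""

    timeout_seconds: float  # assumed silence threshold, not from the interview
    last_seen: dict = field(default_factory=dict)

    def record(self, device_id: str, timestamp: float) -> None:
        # Any inbound message, even "Hello, I'm here", counts as proof of life.
        self.last_seen[device_id] = timestamp

    def is_running(self, device_id: str, now: float) -> bool:
        # A device that has never reported is not known to be running.
        seen = self.last_seen.get(device_id)
        return seen is not None and (now - seen) <= self.timeout_seconds

    def silent_devices(self, now: float) -> list:
        # Devices that have gone quiet are candidates for an outbound
        # "contactability" check ("Tell me if you're still running").
        return sorted(d for d in self.last_seen if not self.is_running(d, now))
```

For example, with a 60-second timeout, a device last heard from 100 seconds ago would show up in `silent_devices`, while one heard from 50 seconds ago would not.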

From the connectivity standpoint, our goal is not to try to funnel all of this into a single pipe, but rather to find where to get a point of presence that is closest and that is reasonable. We’ve been doing this on our remote-access technology for years, trying to find the appropriate geographically distributed location to route data through, to provide as easy and seamless an experience as possible.

So that’s the first strategy: as opposed to just ruthlessly federating all incoming data, we distribute the connectivity infrastructure and try to get that data routed to its end consumer as quickly as possible.

We break down data from our perspective into three basic temporal categories. There's the current data, which is the value you would see reading a dial on the machine. There's recent data, which would tell you whether something is trending in a negative direction, say pressure going up. Then, there is the longer-term historical data. While we focus on the first two, we deliberately, to handle the scale problem, don't focus on the long-term historical data.

Recent data

I'll treat recent data as being anywhere from 7 to 120 days and beyond, depending on the data aggregation rates. We focus primarily on that. When you start to scale beyond that, where the real long tail of this is, we try to make sure that we have our partner in place to receive the data.

We don't want to be diving into two years of data to determine seasonal trending when we're attempting to collect data from 1.5 million assets and acting as quickly as possible to respond to error conditions at the edge.
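The three temporal categories described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the fixed 120-day cutoff is an arbitrary pick from the "7 to 120 days and beyond" range mentioned, which in practice would depend on the data aggregation rate.

```python
from datetime import datetime, timedelta

# Assumed cutoff for "recent" data; the interview gives a range, not a value.
RECENT_WINDOW = timedelta(days=120)


def classify_reading(reading_time, now, is_latest, recent_window=RECENT_WINDOW):
    """Return 'current', 'recent', or 'historical' for one reading."""
    if is_latest:
        # The newest value is what you would see reading a dial on the machine.
        return "current"
    if now - reading_time <= recent_window:
        # Recent data is what reveals short-term trends, such as rising pressure.
        return "recent"
    return "historical"


def split_for_handoff(readings, now, recent_window=RECENT_WINDOW):
    """Split (timestamp, value) pairs into (kept, handed_off) lists.

    Historical readings are handed off to a downstream analytics partner;
    current and recent readings are kept close to the devices.
    """
    if not readings:
        return [], []
    latest_time = max(t for t, _ in readings)
    kept, handed_off = [], []
    for t, value in readings:
        tier = classify_reading(t, now, t == latest_time, recent_window)
        (handed_off if tier == "historical" else kept).append((t, value))
    return kept, handed_off
```

The design point is the hand-off itself: the connectivity layer answers "what is happening now and lately" and leaves questions like two-year seasonal trending to the analytics store.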

Gardner: Kevin, what about the issue of latency? I imagine some of your customers have a very dire need to get analysis very rapidly on an ongoing streamed basis. Others might be more willing to wait and do it in a batch approach in terms of their analytics. How do you manage that, and what are some of the speeds and feeds about the best latency outcomes?

Holbrook: That’s a fantastic question. Everybody comes in and says we need a zero-latency solution. Of course, it took them about two-and-a-half seconds to say that.
There's no such thing as real-time, certainly on the Internet. Just negotiating up the TCP stack and tearing it down to send one byte is going to take you time. Then, we send it over wires under the ocean, bounce it off a satellite, you name it. That's going to take time.

There are two components to it. One is accepting that near-real-time, which is effectively the transport latency, is the smallest amount of time it can take to physically go from point A to point B, absent having a dedicated fiber line from one location to the other. We can assume that on the Internet that's domestically somewhere in the one- to two-second range. Internationally, it's in the two- to three-second or beyond range, depending on the connectivity of the destination.

What we provide is an ability to produce real-time streams of data outbound. You could take from one asset, break up the information it generates, and stream it to multiple consumers in near-real-time in order to get the dashboard in the control center to properly reflect the state of the business. Or you can push it to a data warehouse in the back end, where it then can be chunked and ETLd into some other analytics tool.

For us, we try not to do the batch ETLing. We'd rather make sure that we handle what we're good at. We're fantastic at remote service, at automating responses, at connectivity and at expanding what we do. But we're never going to be a massive ETL, transforming and converting into somebody’s data model or trying to get deep analytics as a result of that.

Gardner: Was it part of this need for latency, familiarity, and agility that led into Vertica? What were some of the decisions that led to picking Vertica as a partner?

Several reasons

Holbrook: There were a few reasons. That was one of them. Also the fact that there's a massive set of offerings already on top of it. A lot of the other people when we considered this -- and I won't mention competitors that we looked at -- were more just a piece of the stack, as opposed to a place where solutions grew out of.

It wasn't just Vertica, but the ecosystem built on top of Vertica. Some of the vendors we looked at are currently in the partner zone, because they're now building their solutions on top of Vertica.

We looked at it as an entry point into an ecosystem and certainly the in-memory component, the fact that you're getting no disk reads for massive datasets was very attractive for us. We don’t want to go through that process. We've dealt with the struggles internally of trying to have a relational data model scale. That’s something that Vertica has absolutely solved.

Gardner: Now your platform includes application services, integration framework, and data management. Let’s hone in on the application services. How are developers interested in getting access to this? What are their demands in terms of being able to use analysis outcomes, outputs, and then bring that into an application environment that they need to fulfill their requirements to their users?

Holbrook: It breaks down into two basic categories. The first is the aggregation and the collection of data, and the second is physical interaction with the device. So we focus on both about equally. When we look at what developers are doing, almost always it’s transforming the data coming in and reaching out to things like a customer relationship management (CRM) system. It's opening a ticket when a device has thrown a certain error code or integrating with a backend drop-ship distribution system in the event that some consumable has begun to run low.

In terms of interaction, it's been significant. On the data side, we primarily see that they're extracting subsets of data for deeper analysis. Sometimes, this comes up in discrete data points. Frequently, this comes up in the transfer of files. So there is a certain granularity that you can survive. Coming down the fire-hose are discrete data points that you can react to, and there's a whole other order of magnitude of data that you can handle when it's shipped up in a bulk chunk.

A good example is one of the use cases we have with GE in their oil and gas division, where they have a certain flow of data that's always ongoing and giving key performance indicators (KPIs). But this is nowhere near the level of data that they're actually collecting. They have database servers that are co-resident with these massive gas pipeline generators.

So we provide them the vehicle for that granular data. Then, when a problem is detected automatically, they can say, "Give me far more granular data for the problem area." It could be five minutes before or five minutes since. This is then uploaded, and we hand off to somewhere else.

So when we find developers doing integration around the data in particular, it's usually when they're diving in more deeply based on some sort of threshold or trigger that has been encountered in the field.
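The threshold-then-drill-down pattern described here can be sketched as follows. This is a hypothetical Python illustration, not GE's or Axeda's implementation: the function name and the sample format are assumptions, and the default window of 300 seconds mirrors the "five minutes before or five minutes since" mentioned above.

```python
def granular_window(samples, threshold, window_seconds=300.0):
    """Return detailed samples surrounding the first threshold crossing.

    samples: list of (timestamp_seconds, value) pairs held near the device.
    Returns every sample within window_seconds before or after the first
    value that exceeds the threshold, or an empty list if nothing triggers.
    """
    for trigger_time, value in samples:
        if value > threshold:
            # Only the slice around the problem area is uploaded for analysis.
            return [
                (t, v)
                for t, v in samples
                if abs(t - trigger_time) <= window_seconds
            ]
    return []
```

In this arrangement the always-on KPI stream stays small, and the bulk upload of detailed data happens only when a trigger fires in the field.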
Gardner: And lastly, Kevin, for other organizations that are looking to create data services and something like your Axeda Machine Cloud, are there any lessons learned that you could share when it comes to managing such complexity, scale, and the need for speed? What have you learned at a high level that you could share?

All about strategy

Holbrook: It’s all going to be about the data-collection strategy. You're going to walk into a customer or potential customer, and their default response is going to be, "Collect everything." That’s not inherently valuable. Just because you've collected it, doesn’t mean that you are going to get value from it. We find that, oftentimes, 90-95 percent of the data collected in the initial deployment is not used in any constructive way.

I would say focus on the data collection strategy. Scale of bad data is scale for scale’s sake. It doesn’t drive business value. Make sure that the folks who are actually going to be doing the analytics are in the room when you are doing your data collection strategy definition, when you're talking to the folks who are going to wire up sensors, and when you're talking to the folks who are building the device.

Unfortunately, within a larger business in particular, these are frequently completely different groups of people who might report to completely different vice presidents. So you go to one group, and they have the connectivity guys. You talk about it and you wire everything up.

Then, six to eight months later, you walk into another room. They’ll say "What the heck is this? I can’t do anything with this. All I ever needed to know was the following metric." It wasn’t collected because the two hadn't stayed in touch. The success of deployed solutions and the reaction to scale challenges is going to be driven directly by that data-collection strategy. Invest the time upfront and then you'll have a much better experience in the back.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.


Tuesday, March 17, 2015

Health Shared Services BC harnesses a healthcare ecosystem using IT asset management

The next BriefingsDirect innovation panel discussion examines how Health Shared Services BC in Vancouver improves process efficiency and standardization through better integration across health authorities in British Columbia, Canada.

We'll explore how HSSBC has successfully implemented one of the healthcare industry’s first Service Asset and Configuration Management Systems to help them optimize performance of their IT systems and applications.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn more about how HSSBC gains up-to-date single views of IT assets across a shared-services environment, please join me in welcoming our guests, Daniel Lamb, Project Manager for the ITSM Program, and Cam Haley, Program Manager for the ITSM Program, both at HSSBC. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Gentlemen, tell me first about the context of your challenge. You're an organization that's trying to bring efficiency and process improvements across health authorities in British Columbia. What is it about that task that made better IT service management (ITSM) an imperative?

Haley: If you look at the healthcare space, where it is right now within British Columbia, we have the opportunity to look at using our healthcare funding more efficiently and specifically focus on delivering more clinical outcomes for consumers of the services.

That was one of the main drivers behind the formation of HSSBC, to consolidate some of the key supporting and enabling services into an organization that could deliver a standardized set of service offerings across our health authority clients, so that they can focus on clinical delivery.

That was the key business driver around why we're here and why we are doing some of those things. For us to effectively deliver on that mandate, we need the tools and the process capabilities to be able to effectively deliver more consistent service outcomes, all those things that we want to deliver there, and to look at reducing costs a little over the long term so that those costs can again be shifted into clinical delivery and really enable those outcomes.

Necessary system

Gardner: Daniel, why was a Service Asset and Configuration Management System something that was important to accomplish this?
Lamb: We have been in the process of a large data center migration project over the past three years, moving a lot of the assets out of Vancouver and into a new data center. We standardized on HP infrastructure up in Kamloops and we have -- when we put in all our Health Authorities assets, it's going to be upwards of around probably 6,500-7,000 servers to manage.
As we merged to the super organization, the manual processes just don’t exist anymore. To keep those assets up-to-date we needed an automated system. The reason we went for those products, which included the asset side and the configuration service management, is that’s really our business. We're going to be managing all these assets for the organization and all the configuration items, and we are providing these services. So this is where the toolset really fitted our goals.

Gardner: So other than scale, size, and the migration, were there any other requirements or problems that you needed to solve that moving into this more modern ITSM capability delivered?

Haley: Just to build on what Daniel said, one of the key drivers in terms of identifying the toolset and the capabilities was to support the migration of infrastructure into the data center.

But along with that, we provide a set of services that go beyond the data center. The tool capability that has been delivered in supporting that outcome enables us to focus on optimizing our processes and getting a better view into what's happening in our own environment. That means having the configuration items (CIs) in the configuration management database (CMDB) and having the relationships developed not only at the infrastructure level, but all the way up to the application or the business service level.

Now we have a view up and down the stack of what's going on. We get better analytics and better data, and we can make some better decisions as well around where we want to focus. What are the pain points that we need to target? We're able to mine that stuff and really look at opportunities to optimize.

The tool allows us to standardize our processes and roll out the capabilities. Automation is built into the tool, which is fantastic for us in terms of taking that manual overhead out of that and really just allowing us to focus on other things. So it's been great.
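
To make the "view up and down the stack" concrete, here is a minimal sketch of CIs and their relationships held as a dependency graph, with a query for everything affected when one item fails. It is an illustration only, not HSSBC's data model or the HP toolset's API; the `Cmdb` class and every CI name are hypothetical.

```python
from collections import defaultdict, deque


class Cmdb:
    """Toy CMDB: configuration items (CIs) plus 'depends on' relationships."""

    def __init__(self):
        # Maps a CI to the set of CIs that depend directly on it.
        self._dependents = defaultdict(set)

    def add_dependency(self, ci, depends_on):
        """Record that `ci` depends on `depends_on`."""
        self._dependents[depends_on].add(ci)

    def impacted_by(self, ci):
        """Return every CI that directly or indirectly depends on `ci`."""
        seen = set()
        queue = deque([ci])
        while queue:  # breadth-first walk up the stack
            current = queue.popleft()
            for dependent in self._dependents.get(current, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
        return sorted(seen)


# Hypothetical stack: server -> database -> application -> business service.
cmdb = Cmdb()
cmdb.add_dependency("db-cluster-a", depends_on="srv-kam-001")
cmdb.add_dependency("app-lab-results", depends_on="db-cluster-a")
cmdb.add_dependency("svc-lab-reporting", depends_on="app-lab-results")

print(cmdb.impacted_by("srv-kam-001"))
```

Storing the relationships from the dependency's side makes the impact question (what breaks if this server goes down?) a simple traversal from infrastructure up to the business service.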

Gardner: Any unexpected benefits, ancillary benefits, that come from the standardization with this visibility, knowing your organization better that maybe you didn't anticipate?

Up-to-date information

Lamb: We've been able to track down everything that’s out there. That’s one thing. We just didn’t know where everything was or what we had. So in terms of being able to forecast to the health authorities, "This is how much you need to part with for maintenance, that sort of thing," that was always a guess in the past. We now have that up-to-date information available.

This has also laid the platform for us to better take advantage of the new technologies that are coming in. So what HP is talking about at the moment, we can’t really take advantage of that, but they have this base platform. It’s going to allow us to take advantage of a lot of the new stuff that’s coming out.

Gardner: So in order to get the efficiency and cost benefits of new infrastructure and converged systems and data center efficiencies, having your ducks lined up and understood is a crucial first step.
Lamb: Definitely.

Gardner: Looking down the road, what’s piquing your interest in terms of what HP is doing or new developments, or does this now allow you to then progress into other areas that you are interested in?

Lamb: Personally, I'm looking at obviously the new versions of the product sets we have at the moment. We've also been speaking to other customers on the success that we've had and giving them some lessons learned on how things worked.

Then, we're looking at some of the other products we could build on to this -- the PPM, which is the Project Management toolset and the BSM, which is unified monitoring and that sort of thing. Being able to put those products on is where we'll start seeing even more value, like in terms of being able to reduce the amount of tickets and support cost and that sort of thing. So we're looking at that.

Then, just ad-hoc interest are the things around the big data and that sort of thing, just trying to get my head around how that works for us, because we have a lot of data. So some of those new technologies are coming out as well.

Gardner: Cam, given what you've already done, what has it gotten for you? What are some of the benefits and results that you have seen? Are there any metrics of success that you can share with us?

Haley: The first thing is that we're still pretty early in our journey out of the gate, if I just talk about what we've already achieved. One of the things that we have been able to do is enable our staff to be more effective at what they're doing.

We've implemented change management in particular within the toolset, and that’s giving us a more robust set of controls around what's actually happening and what’s actually going into the environment. That's been really important, not only for the staff, although there is a bit of a learning curve around that, but in terms of the outcomes for our clients.

Comfort level

They have a higher comfort level that we have more insight or oversight into what’s actually happening in space and we are actually protecting the services that they need to deliver by putting those kinds of capabilities in. So from the process perspective, we've certainly been able to get some benefits in that area in particular.

From a client perspective, it's putting the toolset in it. It helps us develop that level of trust that we really need in order to have an effective partnering relationship with our clients. That’s something that hasn’t always been there in the past.

I'm not saying that we're all the way there yet, but we're starting to show that we can deliver the services that the health authorities expect us to deliver, and we are using the toolset to help enable that. That’s also an important aspect.

The other thing is that through the work we've done in terms of consolidating some of our contracts, maintenance agreements, and so on into our asset management system, we have a better view of what we're paying for. We've already realized some opportunities to consolidate some contracts and show some savings as well.

That's just a number of areas where we're already seeing some benefits. As we start to roll out more of the capabilities of the tool in the coming year and beyond that, we expect that we will get some of those standard metrics that you would typically get out of it. Of course, we'll continue to drive out the ROI value as well. So we're already a good way down that path, and we'll just continue to do that.

Gardner: Any words of wisdom, based on your journey so far, for other organizations that might be struggling with spreadsheets and tracking all of their assets and all of their devices and even the processes around IT support? What have you learned? What could you share with someone who is just starting out?
Lamb: We had a few key lessons that we spoke about. One was the guiding principles that you are going to do the implementation by. We were very much of the approach that we would try to keep things as out-of-the-box as possible. HP, as they are doing the new releases, would pick up the functionality that we are looking for. So we didn’t do a lot of tailoring.
And we did the project in a short cycle. These projects can go on for years sometimes, and a lot of money can get sunk and there isn’t value gained sometimes. We said, "Let’s do these in more short sprint projects. We'll get something in, we'll start showing value to the organization, then we'll get into another thing." That’s the cycle that we're working in, and that's worked really well.

The other thing is that we had a great consultant partner that we worked with, and that was key. We were feeling a little lost when we came here last year, and that was one of the things we did. We went to a good consultant partner, Effectual Systems from San Francisco, and that helped us.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.


Thursday, March 12, 2015

Hackathon model plus big data equals big innovation for Thomson Reuters

The next BriefingsDirect innovation interview explores the use of a hackathon approach to unlock creativity in the search for better use of big data for analytics. We will hear how Thomson Reuters in London sought to foster innovation and derive more value from its vast trove of business and market information.

The result: A worldwide virtual hackathon that brought together developers and data scientists to uncover new applications, visualizations, and services to make all data actionable and impactful.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn more about getting developers on board the big-data analysis train, BriefingsDirect sat down with Chris Blatchford, Director of Platform Technology in the IT organization at Thomson Reuters in London. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Blatchford: Thomson Reuters is the world's leading source of intelligent information. We provide data across the finance, legal, news, IP, science, tax, and accounting industries through product and service offerings, combining industry expertise with innovative technology.

Gardner: It’s hard to think of an organization where data and analysis is more important. It’s so core to your very mission.

Blatchford: Absolutely. We take data from a variety of sources. We have our own original data, third-party sources, open-data sources, and augmented information, as well as all of the original content we generate on a daily basis. For example, our journalists in the field provide original news content to us directly from all over the globe. We also have third-party licensed data that we further enrich and distribute to our clients through a variety of tools and services.

Gardner: And therein lies the next trick, what to do with the data once you have it. About this hackathon, how did you come upon that as an idea to foster innovation?

Big, Open, Linked Data

Blatchford: One of our big projects or programs of work currently is, as everyone else is doing, big data. We have an initiative called BOLD, which is Big, Open, Linked Data, headed up by Dan Bennett. The idea behind the project is to take all of the data that we ingest and host within Thomson Reuters, all of those various sources that I just explained, stream all of that into a central repository, cleanse the data, centralize it, extract meaningful information, and subsequently expose it to the rest of the businesses for use in their specific industry applications.

As well as creating a central data lake of content, we also needed to provide the tools and services that allow businesses to access the content; here we have both developed our own software and licensed existing tools.

So, we could demonstrate that we could build big-data tools using our internal expertise, and we could demonstrate that we could plug in third-party specific applications that could perform analysis on that data. What we hadn’t proved was that we could plug in third-party technology enterprise platforms in order to leverage our data and to innovate across that data, and that’s where HP came in.

HP was already engaged with us in a number of areas, and I got to speaking with their Big Data Group around their big data solutions. IDOL OnDemand came up. This is now part of the Haven OnDemand platform. We saw some synergies there between what we were doing with the big-data platform and what they could offer us in terms of their IDOL OnDemand APIs. That’s where the good stuff started.
Gardner: Software developers, from the very beginning, have had a challenge of knowing their craft, but not knowing necessarily what their end users want them to do with that craft. So the challenge -- whether it’s in a data environment, a transactional environment or interface, or gaming -- has often been how to get the requirements of what you're up to into the minds of the developers in a way that they can work with. How did the hackathon contribute to solving that?

Blatchford: That’s a really good question. That’s actually one of the biggest challenges big data has in general. We approach big data in one of two ways. You have very specific use cases. For example, consider a lawyer working on a particular case for a client; it would be useful for them to analyze prior cases with similar elements. If they are able to extract entities and relevant attributes, they may be able to understand the case’s final decision, or perhaps glean information that is relevant to their current case.

Then you have the other approach, which is much more about exploration, discovering new insights, trends, and patterns. That’s similar to the approach we wanted to take with the hackathon -- provide the data and the tools to our developers for them just to go and play with the data.

We didn’t necessarily want to give them any requirements around specific products or services. It was just, "Look, here is a cool platform with some really cool APIs and some capabilities. Here is some nice juicy data. Tell us what we should be doing? What can we come up with from your perspective on the world?"

A lot of the time, these engineers are overlooked. They're not necessarily the most extroverted of people by the nature of what they do and so they miss chances, they miss opportunities, and that’s something we really wanted to change.

Gardner: It’s fascinating that the way to get developers to do what you want them to do is to give them no requirements.

Interesting end products

Blatchford: Indeed. That can result in some interesting end-products. But, by and large, our engineers are more commercially savvy than most, hence we can generally rely on them to produce something that will be compelling to the business. Many of our developers have side projects and personal development projects they work on outside of the realms of their job requirement. We should be encouraging this sort of behavior.

Gardner: So what did you get when you gave them no requirements? What happened?

Blatchford: We had 25 teams that submitted their ideas. We boiled that down to 7 finalists based upon a set of preliminary criteria, and out of those 7, we decided upon our first-, second-, and third-place winners. Those three end results were actually taken, or are currently going through a product review, to potentially be implemented into our product lines.

The overall winner was an innovative UI design for mobile devices, allowing users to better navigate our content on tablets and phones. There was a sentiment analysis tool that allowed users to paste in news stories or any news content source on the web and extract sentiment from that news story.

And the other was more of an internally focused, administrative exploration tool that allowed us to more intuitively navigate our own data, which perhaps doesn’t initially seem as exciting as the other two, but is actually a hugely useful application for us.
Gardner: Now, how does IDOL OnDemand come to play in this? IDOL is the ability to take any kind of information, for the most part, apply a variety of different services to it, and then create analysis as a service. How did that play into the hackathon? How did the developers use that?

Blatchford: Initially the developers looked at the original 50-plus APIs that IDOL OnDemand provides, and you have everything in there from facial recognition, to OCR, to text analytics, to indexing, all sorts of cool stuff. Those, in themselves, provided sufficient capabilities to produce some compelling applications, but our developers also utilized Thomson Reuters APIs and resources to further augment the IDOL platform.

This was very important, as it demonstrated that not only could we plug an enterprise analytics tool into our data, but also that it would fit well with our own capabilities.

Gardner: And HP Big Data also had a role in this. How did that provide value?

Five-day effort

Blatchford: The expertise. We should remember we stood this hackathon up from inception to completion in a little over one month, and that’s I think pretty impressive by any measure.

The actual hackathon lasted for five days. We gave the participants a week to get familiar with the APIs, but they really didn’t need that long because the documentation behind the APIs on IDOL OnDemand and the kind of "try it now" functionality it has was amazing. This is what the engineers and the developers were telling me. That’s not my own words.

The Big Data Group was able to stand this whole thing up within a month, a huge amount of effort on HP’s side that we never really saw. That ultimately resulted in a hugely successful virtual global hackathon. This wasn’t a physical hackathon. This was a purely virtual hackathon the world over.

Gardner: HP has been very close to developers for many years, with many tools, leading tools in the market for developers. They're familiar with the hackathon approach. It sounds like HP might have a business in hackathons as a service. You're proving the point here.

For the benefit of our listeners, if someone else out there was interested in applying the same approach, a hackathon as a way of creating innovation, of sparking new thoughts, light bulbs going off in people's heads, or bringing together cultures that perhaps hadn't meshed well in the past, what would you advise them?

Blatchford: That’s a big one. First and foremost, the reason we were successful is because we had a motivated, willing partner in HP. They were able to put the full might of their resources and technology capabilities behind this event, and that, alongside our own efforts, ultimately resulted in the event’s success.

That aside, you absolutely need to get the buy-in of the senior executives within an organization, get them to invest into the idea of something as open as a hackathon. A lot of hackathons are quite focused on a specific requirement. We took the opposite approach. We said, "Look, developers, engineers, go out there and do whatever you want. Try to be as innovative in your approach as possible."

Typically, that approach is not seen as cost-effective. Businesses like to have defined use cases, but sometimes that can strangle innovation. Sometimes we need to loosen the reins a little.

There are also a lot of logistical checks that can help. Ensure you have clear criteria around hackathon team size and members, event objectives, rules, time frames and so on. Having these defined up front makes the whole event run much smoother.

We ran the organization of the event a little like an Agile project, with regular stand-ups and check-ins. We also stood up a dedicated internal intranet site with all the information above. Finally, we set up user accounts on the IDOL platform early on, so the participants could familiarize themselves with the technology.

Winning combination

Gardner: Yeah, it really sounds like a winning combination: the hackathon model, big data as the resource to innovate on, and then IDOL OnDemand with 50 tools to apply to that. It’s a very rich combination.

Blatchford: That’s exactly right. The richness in the data was definitely a big part of this. You don’t need millions of rows of data. We provided 60,000 records of legal documents and we had about the same in patents and news content. You don’t need vast amounts of data, but you need quality data.

Then you also need a quality platform as well. In this case, IDOL OnDemand. The third piece is what’s in their heads. That really was the successful formula.
Gardner: I have to ask. Of course, the pride in doing a good job goes a long way, but were there any other incentives; a new car, for example, for the winning hackathon application of the day?

Blatchford: Yeah, we offered a 1960s Mini Cooper to the winners. No, we didn't. We did offer other incentives. There were three main incentives. The first one, and the most important one in my view, and I think in everyone’s view, was exposure to senior executives within the organization. Not just face time, but promotion of the individual within the organization. We wanted this to be about personal growth as much as it was about producing new applications.

Going back to trying to leverage your resources and give them opportunities to shine, that’s really important. That’s one of the things the hackathon really fostered -- exposing our talented engineers and product managers, ensuring they are appreciated for the work they do.

We also provided an Amazon voucher incentive, and HP offered some of their tablets to the winners. So it was quite a strong winning set.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.
