Gaining control over far-flung
and disparate data has been a decades-old struggle, but now as hybrid and
public clouds join the mix of legacy and distributed digital architectures, new
ways of thinking are demanded if comprehensive analysis of relevant data is
going to become practical.
Stay with us now as we
examine the latest strategies for making the best use of data integration, data
catalogs and indices, as well as highly portable data analytics platforms.
To learn more about closing the analysis gap between data and multiple -- and probably changeable -- cloud models, we are joined by Mike Potter, Chief Technology Officer (CTO) at Qlik. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.
Here are some excerpts:
Gardner: Mike, businesses are adopting cloud computing
for very good reasons. The growth over the past decade has been strong and
accelerating. What have been some of the -- if not unintentional -- complicating
factors for gaining a comprehensive data analysis strategy amid this cloud
computing complexity?
Potter: The biggest thing is recognizing that
it’s all about where data lives and where it's being created. Obviously,
historically most data has been generated on-premises. So, there is a strong
pull there, but you are seeing more and more cases now where data is born in
the cloud and spends its whole lifetime in the cloud.
And so now the use cases are
different because you have a combination of those two worlds, on-premises and
cloud. To add further complexity, data is now being born in different cloud
providers. Not only are you dealing with having some data and legacy systems on-premises,
but you may have to reconcile that you have data in Amazon, Google, or Microsoft.
Our whole strategy around multicloud
and hybrid cloud architectures is being able to deploy Qlik where the data lives. It allows you to leave the data where
it is, but gives you options so that if you need to move the data, we can
support the use cases on-premises to cloud or across cloud providers.
Gardner: And you haven’t just put on the patina
of cloud-first or software as a service (SaaS)-first. You have rearchitected
and repositioned a lot of what your products and technologies do. Tell us about
being “SaaS-first” as a strategy.
Scaling the clouds
Potter: We began our journey about 2.5 years
ago, when we started converting our monolith architecture into a microservices-based architecture. That journey struck to the core of the
whole product.
Qlik’s heritage was a
Windows Server architecture. We had to rethink a lot of things. As part of that
we made a big bet 1.5 years ago on containerization, using Docker and Kubernetes. And that’s really paid off for us. It
has put us ahead of the technology curve in many respects. When we did our
initial release of our multicloud product in June 2018, I had conversations with customers who
didn’t know what Kubernetes was.
One enterprise customer had
an infrastructure team who had set up an environment to provision Kubernetes cluster
environments, but we were only the second vendor that required one, so we were ahead
of the game quite a bit.
Gardner: How does using a managed container platform
like Kubernetes help you in a multicloud world?
Potter: The single biggest thing is it allows
you to scale and manage workloads at a much finer grain of detail through auto-scaling
capabilities provided by orchestration environments such as Kubernetes.
More importantly it allows
you to manage your costs. One of the biggest advantages of a microservice-based
architecture is that you can scale up and scale down to a much finer grain. For
most on-premises, server-based, monolith architectures, customers have to buy
infrastructure for peak levels of workload. We can scale up and scale down
those workloads -- basically on the fly -- and give them a lot more control
over their infrastructure budget. It allows them to meet the needs of their
customers when they need it.
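
As an editorial illustration of the cost argument above, here is a minimal sketch that compares provisioning for peak load with scaling replicas up and down hour by hour. The load figures, the per-replica capacity, and the function name `replicas_needed` are all assumptions made for the example; this is not Qlik's scaling logic or the Kubernetes autoscaler itself.

```python
import math

def replicas_needed(load, capacity_per_replica, min_replicas=1):
    """Smallest replica count that covers the load, never below the floor."""
    return max(min_replicas, math.ceil(load / capacity_per_replica))

# Hypothetical requests per second for six consecutive hours.
hourly_load = [120, 80, 40, 300, 950, 200]
capacity = 100  # assumed requests per second that one replica can serve

# Replica count chosen independently for each hour.
scaled = [replicas_needed(load, capacity) for load in hourly_load]

# A fixed, monolith-style deployment must be sized for the busiest hour.
peak = max(scaled)

scaled_replica_hours = sum(scaled)
peak_replica_hours = peak * len(hourly_load)

print(scaled, scaled_replica_hours, peak_replica_hours)
```

With these made-up numbers, the scaled deployment consumes far fewer replica-hours than one sized permanently for the peak, which is the budget control Potter describes.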
Gardner: Another aspect of the cloud evolution over
the past decade is that no one enterprise is like any other. They have usually adopted
cloud in different ways.
Has Qlik’s multicloud analytics
approach come with the advantage of being able to deal with any of those
different topologies, enterprise by enterprise, to help them each uniquely attain
more of a total data strategy?
Potter: Yes, I think so. The thing we want to
focus on is, rather than dictate the cloud strategy -- often the choice of our
competitors -- we want to support your cloud strategy as you need it. We
recognize that a customer may not want to be on just one cloud provider. They
don’t want to lock themselves in. And so we need to accommodate that.
There may be very valid
reasons why they are regionalized, from a data sovereignty perspective, and we
want to accommodate that.
There will always be
on-premises requirements, and we want to accommodate that.
The reality is that, for quite
a while, you are not going to see as much convergence around cloud providers as
you are going to see around microservices architectures, containers, and the
way they are managed and orchestrated.
Gardner: And there is another variable in the
mix over the next few years -- and that’s the edge. We have an uncharted, immature
environment at the edge. But already we are hearing that a private cloud at the
edge is entirely feasible. Perhaps containers will be working there.
At Qlik, how are you anticipating
edge computing, and how will that jibe with the multicloud approach?
Running at the edge
Potter: One of the key features of our platform
architecture is not only can we run on-premises or in any cloud at scale, we
can run on an edge device. We can take our core analytics engine and deploy it
on a device or machine running at the edge. This enables a new opportunity,
which is taking analytics itself to the edge.
A lot of Internet of Things (IoT) implementations are geared toward
collecting data at the sensor, transferring it to a central location to be
processed, and then analyzing it all there. What we want to do is push the
analytics problem out to the edge so that the analytic data feeds can be processed
at the edge. Then only the analytics events are transmitted back for central
processing, which obviously has a huge impact from a data-scale perspective.
But more importantly, it
creates a new opportunity to have the analytic context be very immediate in the
field, where the point of occurrence is. So if you are sitting there on a
sensor and you are doing analytics on the sensor, not only can you benefit at
the sensor, you can send the analytics data back to the central point, where it
can be analyzed as well.
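
The edge pattern described above can be sketched roughly as follows: the device analyzes its own readings and forwards only the resulting analytic events. This is an illustrative sketch under stated assumptions. The `Event` type, the `detect_events` function, the threshold rule, and the sample readings are hypothetical and do not represent Qlik's engine or API.

```python
from dataclasses import dataclass

@dataclass
class Event:
    sensor_id: str
    index: int    # position of the reading in the local stream
    value: float  # the reading that triggered the event

def detect_events(sensor_id, readings, threshold):
    """Run on the edge device: keep only readings above the threshold."""
    return [
        Event(sensor_id, i, value)
        for i, value in enumerate(readings)
        if value > threshold
    ]

# Hypothetical temperature readings collected at one sensor.
readings = [20.1, 20.3, 19.9, 35.2, 20.0, 41.7, 20.2]

# Analysis happens locally; only these events would be sent upstream.
events = detect_events("sensor-1", readings, threshold=30.0)

print(len(readings), "readings reduced to", len(events), "events")
```

The device can act on `events` immediately, and the central system receives a much smaller stream than the raw readings.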
Gardner: It’s auspicious that Qlik’s approach of cataloging, indexing,
and abstracting out the information about where data lives can
now be used really well in an edge environment.
Potter: Most definitely. Our entire data strategy is intricately linked with our architectural strategy in that respect, yes.
Gardner: Analytics and being data-driven across an
organization is the way of the future. It makes sense to not cede that core
competency of being good at analytics to a cloud provider or to a vendor. The
people, process, and tribal knowledge about analytics seem essential.
Do you agree with that, and
how does Qlik’s strategy align with keeping the core competency of analytics of,
by, and for each and every enterprise?
Potter: Analytics is a specialization
organizationally within all of our customers, and that’s not going to go away. What
we want to do is parlay that into a broader discussion. So our focus is
enabling three key strategies now.
It's about enabling the
analytics strategy, as we always have, but broadening the conversation to
enabling the data strategy. More importantly, we want to close the
organizational, technological, and priority gaps to foster creating an
integrated data and analytics strategy.
By doing that, we can create
what I describe as a raw-to-ready analytics platform based on trust, because we own the
process of the data from source to analysis, and that not only makes the
analytics better, it promotes the third part of our strategy, which is around
data literacy. That’s about creating a trusted environment in which people can
interact with their data and do the analysis that they want to do without
having to be data scientists or data experts.
So owning that whole
end-to-end architecture is what we are striving to reach.
Gardner: As we have seen in other technology
maturation trend curves, applying automation to the problem frees up the larger
democratization process. More people can consume these services. How does
automation work in the next few years when it comes to analytics? Are we going
to start to see more artificial intelligence (AI) applied to the problem?
Automated, intelligent analytics
Potter: Automating those environments is an
inevitability, not only from the standpoint of how the data is collected, but in
how the data is pushed through a data operations process. More importantly, automation
also enables the other end, by embedding artificial intelligence and machine learning
(ML) techniques all the way along that value chain -- from the point of source to
the point of consumption.
Gardner: How does AI play a role in the automation and the capability to leverage data across the entire organization?
Potter: How we perform analytics within an
analytic system is going to evolve. It’s going to be more conversational in
nature, and less about just consuming a dashboard and looking for an insight
into a visualization.
The analytics system itself
will be an active member of that process, where the conversation is not only
with the analytics system but the analytics system itself can initiate the
conversation by identifying insights based on context and on other feeds. Those
can come from the collective intelligence of the people you work with, or even
from people not involved in the process.
Gardner: I have been at some events where robotic process automation (RPA) has been a key topic. It seems to me
that there is this welling opportunity to use AI with RPA, but it’s a separate
track from what's going on with BI, analytics, and the traditional data
warehouse approach.
Do you see an opportunity
for what’s going on with AI and use of RPA? Can what Qlik is doing with the
analytics and data assimilation problem come together with RPA? Would a process
be able to leverage analytic information, and vice versa?
Potter: It gets back to the idea of pushing
analytics to the edge, because an edge isn’t just a device-level integration. It
can be the edge of a process. It can be the edge of not only a human process,
but an automated business process. The notion of being able to embed analytics
deep into those processes is already being done. Process analytics is an
important field.
But the newer idea is that
analytics is in service of the process, as opposed to the other way around. The
world is getting away from analytics being a separate activity, done by a
separate group, and as a separate act. It is as commonplace as getting a text
message, right?
Gardner: For the organization to get to that
nirvana of total analytics as a common strategy, this needs to be part of what
the IT organization is doing, with full stack architecture and evolution. So AIOps and DataOps are also getting closer over time.
How does DataOps in your
thinking relate to what the larger IT enterprise architects are doing, and why
should they be thinking about data more?
Optimizing data pipelines
Potter: That’s a really good question. From my
perspective, when I get a chance to talk to data teams, I ask a simple question:
“You have this data lake. Is it meeting the analytic requirements of your
organization?”
And often I don’t get very
good answers. And a big reason is that what motivates and prioritizes
the data team is the storage and management of data, not necessarily the analytics.
And often those priorities conflict with the priorities of the analytics team.
What we are trying to do
with the Qlik integrated data and analytic strategy is to create data pipelines
optimized for analytics, and data operations optimized for analytics. And our
investments and our acquisitions in Attunity and Podium are about taking that process and focusing on the raw-to-ready
part of the data operations.
Gardner: Mike, we have been talking at a fairly
abstract level, but can you share any use cases where leading-edge
organizations recognize the intrinsic relationship between DataOps and enterprise architecture? Can you
describe some examples or use cases where they get it, and what it gets for
them?
Potter: One of our very large enterprise
customers deals in medical devices and related products and services. They realized
an essential need to have an integrated strategy. And one of the challenges
they have, like most organizations, is how to overcome not only the technology
part but also the organizational, cultural, and change-management aspects.
They recognized the business
has a need for data, and IT has data. If you intersect that, how much of that
data is actually a good fit? How much data does IT have that isn't needed? How
much of the remaining need is unfulfilled by IT? That's the problem we need to
close in on.
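
The three questions above map onto simple set operations. Here is a minimal sketch, assuming hypothetical dataset names on both sides; a real inventory would come from a data catalog rather than hard-coded sets.

```python
# Hypothetical datasets the business says it needs.
business_needs = {"orders", "returns", "device_telemetry", "service_calls"}

# Hypothetical datasets that IT currently stores and manages.
it_holdings = {"orders", "returns", "web_logs", "hr_records"}

good_fit = business_needs & it_holdings     # needed and available
unneeded = it_holdings - business_needs     # stored but not needed
unfulfilled = business_needs - it_holdings  # needed but not available

print(sorted(good_fit), sorted(unneeded), sorted(unfulfilled))
```

The `unfulfilled` set is the gap to close, and `unneeded` is data that costs effort to manage without serving an analytic need.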
Gardner: Businesses need to be thinking at the C-suite
level about outcomes. Are there some examples where you can tie together such
strategic business outcomes back to the total data approach, to using enterprise
architecture and DataOps?
Data decision-making, democratized
Potter: The biggest ones center on end-to-end
governance of data for analytics, the ability to understand where the data comes
from, and building trust in the data inside the organization so that decisions
can be made, and those decisions have traceability back to results.
The other aspect of building
such an integrated system is a total cost of ownership (TCO) opportunity,
because you are no longer expending energy managing data that isn't relevant to
adding value to the organization. You can make a lot more intelligent choices
about how you use data and how you actually measure the impact that the data
can have.
Gardner: On the topic of data literacy, how do
you see the behavior of an organization -- the culture of an organization -- shifting?
How do we get the chicken-and-egg relationship going between the data services
that provide analytics and the consumers to start a virtuous positive adoption
pattern?
Potter: One of the biggest puzzles a lot of IT
organizations face is around adoption and utilization. They build a data lake
and they don't know why people aren’t using it.
For me, there are a couple of
elements to the problem. One is what I call data elitism. When you think
about data literacy and you compare it to literacy in the pre-industrial age,
the people who had the books were the people who were rich and had power. So church
and state, that kind of thing. It wasn't until technology created, through the
printing press, a democratization of literacy that you started to see
interesting behavior. Those with the books, those with the power, tried to
subvert reading in the general population. They made it illegal. Some argue
that the French Revolution was, in part, caused by rising rates of
literacy.
If you flash-forward this
analogy to today in data literacy, you have the same notion of elitism. Data is
only allowed to be accessed by the senior levels of the organization. It can
only be controlled by IT.
Ironically, the most data-enabled
organizations are typically oriented to the Millennials or younger users. But
they are in the wrong part of the organizational chart to actually take
advantage of that. They are not allowed to see the data they could use to do
their jobs.
The opportunity from a
democratization-of-data perspective is understanding the value of data for
every individual and allowing that data to be made available in a trusted
environment. That’s where this end-to-end process becomes so important.
Gardner: How do we make the economics of
analytics an accelerant to that adoption and the democratization of data? I’ll
use another historical analogy, the Model T and assembly line. They didn't sell
Model Ts nearly to the degree they thought until they paid their own people enough
to afford one.
Is there a way of looking at
that and saying, “Okay, we need to create an economic environment where analytics
is paid for on demand, it's fit-for-purpose, it's consumption-oriented.” Wouldn’t
that market effect help accelerate the adoption of analytics as a total
enterprise cultural activity?
Think positive data culture
Potter: That’s a really interesting thought. The
consumerization of analytics is a product of accessibility and of cost. When
you build a positive data culture in an organization, data needs to be as
readily accessible as email. From that perspective, turning it into a cost
model might be a way to accomplish it. It's about a combination of leadership, of
just going there and making it occur at the grassroots level, where the value it
presents is clear.
And, again, I reemphasize
this idea of needing a positive data culture.
Gardner: Any added practical advice for
organizations? We have been looking at what will be happening and what to
anticipate. But what should an enterprise do now to be in an advantageous
position to execute a “positive data culture”?
Potter: The simplest advice is to know that
technology is not the biggest hurdle; it's change management, culture, and
leadership. When you think about the data strategy integrated with the analytics
strategy, that means looking at how you are organized and prioritized around
that combined strategy.
Finally, when it comes to a data literacy strategy, define how you are going to enable your organization to see data as a positive asset to doing their jobs. The leadership should understand that data translates into value and results. It's a tool, not a weapon.