Friday, June 7, 2019

How real-time data streaming and integration set the stage for AI-driven DataOps


The next BriefingsDirect business intelligence (BI) trends discussion explores the growing role of data integration in a multi-cloud world.

Just as enterprises seek to gain more insights and value from their copious data, they're also finding their applications, services, and raw data spread across a continuum of hybrid and public clouds. Raw data is also piling up closer to the edge -- on factory floors, in hospital rooms, and anywhere digital business and consumer activities exist.


Stay with us now as we examine the latest strategies for uniting and governing data wherever it resides. By doing so, businesses are enabling rapid and actionable analysis -- as well as entirely new levels of human-to-augmented-intelligence collaboration.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn more about the foundational capabilities that lead to total data access and exploitation, we're now joined by Dan Potter, Vice President of Product Marketing at Attunity, a Division of Qlik. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Dan, what are the business trends forcing a new approach to data integration?

Potter: It’s all being driven by analytics. The analytics world has gone through some very interesting phases of late: Internet of Things (IoT), streaming data from operational systems, artificial intelligence (AI) and machine learning (ML), predictive and preventative kinds of analytics, and real-time streaming analytics.

So, it's analytics driving data integration requirements. Analytics has changed the way in which data is being stored and managed. Things like cloud data warehouses, data lakes, and streaming infrastructure like Kafka -- these are all a response to the business demand for a new style of analytics.

As analytics drives data management changes, the way in which the data is being integrated and moved needs to change as well. Traditional approaches to data integration -- batch processes, ETL, and script-oriented integration -- are no longer good enough. All of that is changing. It's all moving to a much more agile, real-time style of integration, driven by things like the movement to the cloud and the need to move more data -- in greater volume and greater variety -- into data lakes, and then to shape that data and make it analytics-ready.
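To make that contrast concrete, here is a minimal sketch of the two styles in Python. The table and column names (orders, orders_changelog) are hypothetical, and real change data capture tools such as Attunity's read the database transaction log rather than polling a changelog table -- this only illustrates the batch-versus-streaming difference.

```python
import sqlite3
import time

conn = sqlite3.connect("orders.db")

def batch_extract():
    """Traditional batch ETL: re-read the entire table on a schedule."""
    return conn.execute("SELECT * FROM orders").fetchall()

def stream_changes(last_seen_id=0, poll_seconds=1.0):
    """Real-time style: emit only the rows that changed since the last pass."""
    while True:
        rows = conn.execute(
            "SELECT change_id, op, payload FROM orders_changelog "
            "WHERE change_id > ? ORDER BY change_id",
            (last_seen_id,),
        ).fetchall()
        for change_id, op, payload in rows:
            last_seen_id = change_id
            yield op, payload  # hand each change downstream as it happens
        time.sleep(poll_seconds)
```

The batch function gets more expensive as the table grows; the streaming generator does work proportional only to what changed, which is why the real-time style scales to the volume and variety Potter describes.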

With all of these movements, there have been new challenges and new technologies. The pace of innovation is accelerating, and the challenges are growing. The demand for digital transformation and the move to the cloud has changed the landscape dramatically. With that came great opportunities for us as a modern data integration vendor, but also great challenges for companies that are going through this transition.

Gardner: Companies have been doing data integration since the original relational database (RDB) was kicked around. But it seems the core competency of managing the integration of data is more important than ever.

Innovation transforms data integration

Potter: I totally agree, and if done right, in the future, you won't have to focus on data integration. The goal is to automate as much as possible because the data sources are changing. You have a proliferation of NoSQL databases and graph databases; it's no longer just an Oracle database or an RDB. You have all kinds of different data. You have different technologies being used to transform that data. Things like Spark have emerged along with other transformation technologies that are real-time-oriented. And there are different targets to which this data is being transformed and moved.

It's difficult for organizations to maintain the skill sets -- and you don't want them to have to. We want to move to an automated process of data integration. The more we can achieve that, the more valuable all of this becomes. You don't spend time on mundane data integration; you spend time on the analytics -- and that's where the value comes from.

Gardner: Now that Attunity is part of Qlik, you are an essential component of a larger undertaking, of moving toward DataOps. Tell me why automated data migration and integration translates into a larger strategic value when you combine it with Qlik?

Potter: DataOps resonates well for the pain we’re setting out to address. DataOps is about bringing the same discipline that DevOps has brought to software development. Only now we’re bringing that to data and data integration for analytics.

How do we accelerate and remove the gap between IT, which is charged with providing analytics-ready data to the business, and all of the various business and analytics requirements? That’s where DataOps comes in. DataOps is technology, but that’s just a part of it. It’s as much or more about people and process -- along with enabling technology and modern integration technology like Attunity.

We’re trying to solve a problem that’s been persistent since the first bit of data hit a hard drive. Data integration challenges will always be there, but we’re getting smarter about the technology that you apply and gaining the discipline to not boil the ocean with every initiative.

The new goal is more collaboration around what business users need and automating the delivery of analytics-ready data, knowing full well that the requirements are going to change often. You can be much more responsive to those business changes, bring in additional datasets, and prepare that data in different ways and in different formats so it can be consumed with different analytics technologies.

That's the big problem we're trying to solve. And now, being part of Qlik gives us a much broader perspective on these pains as they relate to the analytics world. It gives us a much broader portfolio of data integration technologies. The Qlik Data Catalyst product is a perfect complement to what Attunity does.

Our role in data integration has been to help organizations move data in real time as that data changes on source systems. We capture those changes and move that data to where it's needed -- like a cloud, data lake, or data warehouse. We prepare and shape that data for analytics.

Qlik Data Catalyst then comes in to catalog all of this data and make it available to business users so they can discover and govern that data. And it easily allows for that data to be further prepared or enriched, or used to create derivative datasets.

So, it’s a perfect marriage in that the data integration world brings together the strength of Attunity with Qlik Data Catalyst. We have the most purpose-fit, modern data integration technology to solve these analytics challenges. And we’re doing it in a way that fits well with a DataOps discipline.

Gardner: We not only have the different data types, we have another level of heterogeneity to contend with and that’s cloud, hybrid cloud, multi-cloud, and edge. We don’t even know what more is going to be coming in two or three years. How does an organization stay agile given that level of dynamic complexity?

Real-time analytics deliver agility 

Potter: You need a different approach -- a different style of integration technology -- to support these topologies, which are themselves very different. And what the ecosystem looks like today is going to be radically different two years from now.

The pace of innovation just within the cloud platform technologies is very rapid. The new databases, transformation engines, and orchestration engines just proliferate. And now you have multiple cloud vendors. There are great reasons for organizations to use multiple clouds -- to use the best of the technologies or approaches that work for your organization, your workgroup, your division. So you need that. You need to prepare yourself for that, and modern integration approaches definitely help.


One of the interesting technologies to help organizations provide ongoing agility is Apache Kafka. Kafka is a way to move data in real time and make the data easy to consume even as it's flowing. We see that as an important piece of the evolving data infrastructure fabric.

At Attunity we create data streams from systems like mainframes, SAP applications, and RDBs. These systems weren’t built to stream data, but we stream-enable that data. We publish it into a Kafka stream and that provides great flexibility for organizations to, for example, process that data in real time for real-time analytics such as fraud detection. It’s an efficient way to publish that data to multiple systems. But it also provides the agility to be able to deliver that data widely and have people find and consume that data easily.
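As an illustration of the publish side, here is a minimal sketch using the open-source kafka-python client. The broker address, topic name, and event shape are assumptions made for the example; Attunity's Kafka integration is configured in the product rather than hand-coded like this.

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# One captured change from a source system, shaped as a CDC-style event.
change_event = {
    "table": "ACCOUNTS",
    "op": "UPDATE",
    "before": {"account_id": 42, "balance": 100.0},
    "after": {"account_id": 42, "balance": 250.0},
}

# Keying by table (or by primary key) keeps related changes ordered
# within a partition, which downstream consumers rely on.
producer.send("cdc.accounts", key=b"ACCOUNTS", value=change_event)
producer.flush()
```

The point of the stream is that many consumers -- a fraud-detection job, a data lake loader, a cloud data warehouse -- can each read the same events independently, which is the flexibility Potter describes.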

Such new, evolving approaches enable a mentality that says, “I need to make sure that whatever decision I make today is going to future-proof me.” So, setting yourself up right and thinking about that agility and building for agility on day one is absolutely essential.

Gardner: What are the top challenges companies have for becoming masterful at this ongoing challenge -- of getting control of data so that they can then always analyze it properly and get the big business outcomes payoff?

Potter: The most important competency is at the enterprise architecture (EA) level, more than with the people who traditionally build ETL scripts and integration routines. I think those are the pieces you want to automate.

The real core competency is to define a modern data architecture and build it for agility so you can embrace the changing technologies and requirements landscape. It may be that you have all of your eggs in one cloud vendor today. But you certainly want to set yourself up so you can evolve and push processing to the most efficient place, and to attain the best technology for the kinds of analytics or operational workloads you want.

That’s the top competency that organizations should be focused on. As an integration vendor, we are trying to reduce the reliance on technical people to do all of this integration work in a manual way. It’s time-consuming, error-prone, and costly. Let’s automate as much as we can and help companies build the right data architecture for the future.

Gardner: What’s fascinating to me, Dan, in this era of AI, ML, and augmented intelligence is that we’re not just creating systems that will get you to that analytic opportunity for intelligence. We are employing that intelligence to get there. It’s tactical and strategic. It’s a process, and it’s a result.

How do AI tools help automate and streamline the process of getting your data lined up properly?

Automated analytics advance automation 

Potter: This is an emerging area for integration technology. Our focus initially has been on preparing data to make it available for ML initiatives. We work with vendors such as Databricks, at the forefront of processing data for data science, ML, and AI initiatives using a high-performance Spark engine.

We need to ask, "How do we bring cognitive engines, things like Qlik's, to the fore within our own technology and get smarter about the patterns of integration that organizations are deploying, so we can further automate?" That's really the next wave for us.

Gardner: You’re not just the president, you’re a client.

Potter: Yeah, that’s a great way to put it.

Gardner: How should people prepare for such use of intelligence?

Potter: If it’s done right -- and we plan on doing it right -- it should be transparent to the users. This is all about automation done right. It should just be intuitive. Going back 15 years when we first brought out replication technology at Attunity, the idea was to automate and abstract away all of the complexity. You could literally drag your source, your target, and make it happen. The technology does the mapping, the routing, and handles all the errors for me. It’s that same elegance. That’s where the intelligence comes in, to make it so intuitive that you are not seeing all the magic that’s happening under the covers.

We follow that same design principle in our product. As the technologies get more complex, it's harder for us to do that, so applying ML and AI becomes even more important to us. That's really the future for us: as we automate more of these processes, you'll see less and less of what is happening under the covers.

Gardner: Dan, are there any examples of organizations on the bleeding edge? They understand the data integration requirements and core competencies. They see this through the lens of architecture.

Automation ensures insights into data 

Potter: Zurich Insurance is one of the early innovators in applying automation to their data warehouse initiatives. Zurich had been moving to a modern data warehouse to better meet the analytics requirements, but they realized they needed a better way to do it than in the past.

Traditional enterprise data warehousing employs a lot of people, building a lot of ETL scripts. It tends to be very brittle. When source systems change you don’t know about it until the scripts break or until the business users complain about holes in their graphs. Zurich turned to Attunity to automate the process of integrating, moving it to real-time, and automatically structuring their data warehouse.

The time they take to respond to business users is now a fraction of what it was. They reduced 45-day cycles to two-day cycles for updating and building out new data marts for users. Their agility is off the charts compared to the traditional way of doing it. They can now better meet the needs of the business users through automation.

As organizations move to the cloud to automate processes, a lot of customers are embracing data lakes. It’s easy to put data into a data lake, but it’s really hard to derive value from the data lake and reconstruct the data to make it analytics-ready.

For example, you can take transactions from a mainframe and dump all of those things into a data lake, which is wonderful. But how do I create any analytic insights? How do I ensure all those frequently updated files I'm dumping into the lake can be reconstructed into a queryable dataset? The way people have done it in the past is manually, with hand-built scripts in Pig and other languages that try to reconstruct it. We fully automate that process. For companies using Attunity technology, our big investments in data lakes have had a tremendous impact on demonstrating value.
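To show what that reconstruction involves, here is a minimal sketch in Python. The file layout (newline-delimited JSON change files in a landing folder) and field names are assumptions for the example; at data lake scale this step would typically run on an engine like Spark, and Attunity automates it rather than requiring code like this.

```python
import json
from pathlib import Path

def rebuild_table(change_files):
    """Apply INSERT/UPDATE/DELETE events in order to produce a queryable snapshot."""
    state = {}
    for path in sorted(change_files):  # apply files in order; ordering matters for CDC
        for line in path.read_text().splitlines():
            event = json.loads(line)
            key = event["primary_key"]
            if event["op"] in ("INSERT", "UPDATE"):
                state[key] = event["row"]  # last write wins
            elif event["op"] == "DELETE":
                state.pop(key, None)
    return list(state.values())  # analytics-ready rows

rows = rebuild_table(Path("landing/accounts").glob("*.jsonl"))
```

Every detail here -- ordering the files, choosing the key, deciding how deletes are represented -- is a place where hand-written scripts historically broke, which is the case for automating the process.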

Gardner: Attunity recently became part of Qlik. Are there any clients that demonstrate the two-plus-two-equals-five effect when it comes to Attunity and the Qlik Data Catalyst catalog?

DataOps delivers the magic 

Potter: It's still early days for us. As we look at our installed base -- and there is a lot of overlap in who we sell to -- the BI teams and the data integration teams in many cases are separate and distinct. DataOps brings them together.

In the future, as we take the Qlik Data Catalyst and make that the nexus of where the business side and the IT side come together, the DataOps approach leverages that catalog and extends it with collaboration. That’s where the magic happens.


So business users can more easily find the data, and they can send requirements back to the data engineering team as they need them. Applying AI and ML to the patterns we see on the analytics side will then help us better match the data that's required and automate the delivery and preparation of that data for different business users.

That's the future, and it's going to be very interesting. A year from now, after being part of the Qlik family, we'll bring together the BI and data integration sides for our joint customers. We are going to see some really interesting results.

Gardner: As this next, third generation of BI kicks in, what should organizations be doing to get prepared? What should the data architect, who is starting to think about DataOps, do to put them in an advantageous position to exploit this when the market matures?


Potter: First, they should be talking to Attunity. We get engaged early and often in many of these organizations. The hardest job in IT right now is [to be an] enterprise architect, because there are so many moving parts. But we have wonderful conversations because at Attunity we've been doing this for a long time, we speak the same language, and we bring a lot of knowledge and experience from other organizations to bear. It's one of the reasons we have deep strategic relationships with many of these enterprise architects and others on the IT side of the house.

They should be thinking about what the next wave is and how best to prepare for it. Foundationally, moving to more real-time streaming integration is an absolute requirement. You can take our word for it, or you can go talk to analysts and other peers, about the need for real-time data and streaming architectures and how important they are going to be in the next wave.

So, prepare for that now, and think about the agility and the automation that will get you the desired results. Organizations that aren't preparing now are going to be left behind, and if IT is left behind, the business is left behind. It is a very competitive world, and organizations are competing on data and analytics. The faster you can deliver the right data and make it analytics-ready, the faster and better decisions you can make, and the more successful you'll be.

So it really is a do-or-die proposition. That's why data integration is strategic: it unlocks the value of this data, and if you do it right, you're going to set yourself up for long-term success.
