Craig Stewart, Vice President of Product Management at SnapLogic, exclusively discusses the business challenges of managing big data, while touching upon the difficulties of staying GDPR compliant.
Can you tell me about yourself and your job role?
Craig Stewart: I’m VP of Product Management at SnapLogic, a provider of self-service application and data integration software. My key responsibility is leading the company’s product strategy and roadmap. One of my main areas of focus is looking at the future state of IT and deciding what role SnapLogic can play and where we can drive impact for our customers. I’ve been working in the integration space for more than a decade. Before SnapLogic, I led product teams at Oracle, Cognos, Powersoft, and Sybase, among others.
What challenges have you faced when managing big data?
Craig Stewart: One of the main pain points we’ve seen with big data management, particularly when on-premises, is the complexity of the Hadoop infrastructure, and the investment in time, manpower and money required to get results from it. You need a bunch of people with specialised skills to manage, monitor and maintain it, which is why we’ve typically seen this kind of deployment in larger enterprises with the resources to do so. Even then, large enterprises have struggled to make the most of their on-premises big data initiatives, with many now looking to the cloud.
There’s also the issue that a lot of the data being generated by businesses is unstructured, in different formats, or locked up in disparate, siloed systems. Actually getting that data together into a form that can be useful for driving business strategy, or whatever else you need it for, is something a lot of businesses are still struggling with.
How are big data analytics used to define strategies?
Craig Stewart: A lot of businesses see data analytics as the core tenet for defining their strategy and plans. How much they’re actually doing that really depends on the data maturity of the company we’re talking about. Many companies have a ton of data, they know it holds real value, and they want to be data-driven, but they don’t have confidence that all the data is complete, reliable, or accurate, and they want that trust in the data before they fully invest in analysing and using it. Those are the kinds of companies that we’re trying to help. That’s the market SnapLogic’s here to serve.
The companies that are doing it the right way are using data analytics tools to get real-time, accurate insights that inform their decisions. It’s essentially about making a business smarter, more agile, and ready to pounce on new opportunities when they arise.
Does big data in the cloud allow you to boost the performance and reliability of a platform? If so, how?
Craig Stewart: Absolutely. There’s little point having mounds of data if it’s flawed, unmanageable, or can’t be used. If businesses create or migrate their big data projects in the cloud, they’ll get to insights much faster, with more precision, and at less cost. Hand-coding is incredibly time-consuming and error-prone. Code decays over time and must be updated, and if the developer who wrote the code leaves the company, it becomes very difficult for IT to understand the data pipeline at the code level. But with cloud tools and platforms, the best of which have easy-to-use, self-service UIs driven by AI and machine learning technologies, businesses can process, integrate and analyse their data quickly, without coding, and with proven and reliable results.
What are the pros and cons when setting up a cloud-based big data system?
Craig Stewart: There are significant benefits to managing big data via the cloud: lower costs, better use of resources, faster time to value, to name a few. When you move the infrastructure management to the cloud provider, rather than putting this on the business itself, IT teams can spend more time on strategic initiatives and stay focused on business outcomes. Additionally, with the cloud, you only pay for the capacity and services that you consume, and it’s easy to scale up or down as needed.
That said, one of the biggest challenges of running big data in the cloud is actually getting it there in the first place. Connecting on-premises data lakes to cloud-based big data environments, with complex data sets and diverse data sources, and creating Apache Spark pipelines to transform that data is a scale of operation that has historically required highly technical knowledge and continuous coding resources from data engineers and core IT groups.
It’s this initial hurdle, particularly the time and resource drain on critical IT staff, that is one of the biggest issues organisations have had to overcome in moving to cloud-based big data projects. That’s why we introduced SnapLogic eXtreme, our new solution supporting complex data processes on cloud big data services like Amazon EMR and Microsoft Azure. Data integrators and big data engineers can use SnapLogic eXtreme to build powerful Apache Spark pipelines and manage cloud data architectures without having to write complex code. This removes the prohibitive cost and resource requirements many companies face when attempting to build and operate big data architectures in the cloud.
How do you ensure data is GDPR compliant?
Craig Stewart: The protection of data is essential to our service. SnapLogic’s Enterprise Integration Cloud is an entirely cloud-based platform running on Amazon Web Services. The security behind it is a combination of policy, procedure, and technology spanning physical and virtual platforms, networks, and data, ensuring that the data being transferred through the platform is always secure. This means that we don’t observe, store, or directly interact with any sensitive data as customers move it through the platform.
As customers connect their data with the SnapLogic Enterprise Integration Cloud, our Snaps leverage the endpoint security of whatever they’re connected to, such as an application, database, file, etc. Additionally, if the endpoint supports data encryption, Snaps can be configured to send and receive encrypted data. This means the security of data as it moves through the platform is in the customer’s control at all times, as it is always under the same security parameters the customer has defined within their business.
What effect has the new GDPR had on SnapLogic?
Craig Stewart: With the new regulation in place, we, of course, had to update our own policies to make sure that we are fully compliant. But it has been business as usual for us. We’re helping our customers meet GDPR obligations, and reducing their risk of a data breach. We’ve noticed quite a few of our customers proactively assessing whether they really need a person’s Personally Identifiable Information (PII). If PII is not necessary, they use SnapLogic to help locate the source of the data and then transform it with one of our many intelligent connectors, so the individual is not recognisable.
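One common way to transform PII so the individual is no longer directly recognisable is pseudonymisation, for example replacing identifying fields with a salted one-way hash. The sketch below is a minimal illustration of that general technique, not SnapLogic’s actual connector logic, and the salt value is a placeholder; note that under GDPR, pseudonymised data can still count as personal data.

```python
# Minimal pseudonymisation sketch: replace identifying fields with a
# salted SHA-256 digest while leaving non-PII fields usable for analytics.
import hashlib

SALT = b"example-salt"  # placeholder; in practice this is a managed secret

def pseudonymise(value: str) -> str:
    """Return an irreversible salted SHA-256 digest of a PII value."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()

record = {"name": "Jane Doe", "email": "jane@example.com", "order_total": 42.0}

# Hash only the identifying fields; keep the rest unchanged.
masked = {k: (pseudonymise(v) if k in ("name", "email") else v)
          for k, v in record.items()}
```

Because the hash is deterministic, pseudonymised records can still be joined and aggregated on the masked field without revealing who the individual is.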
How does SnapLogic open new ways for businesses to improve the overall use of data via AI?
Craig Stewart: We’re very proud of being the first in our industry to apply machine learning to enterprise integration. The technology, called Iris, uses advanced algorithms to learn from metadata elements and data flows through our Enterprise Integration Cloud. It’s a recommendation engine providing step-by-step guidance as our users assemble integration pipelines, improving the speed and quality of integrations across data, applications, and business processes.
Essentially, it takes a lot of the burden off programmers by automating many of those highly repetitive development tasks and eliminating the integration backlogs that can stifle new deployments. It also helps new users learn our platform and become productive quickly.
How does AI assist big data analytics?
Craig Stewart: It’s enabling companies to do more with their big data analytics projects, and a lot quicker. So whether it’s using AI for the data analysis itself or to integrate, manage and maintain data sets, there are numerous applications for it.
Something we see as really critical within this is the use of AI to free up teams from those rote, repetitive, time-intensive data tasks. By using AI to automate those tasks, teams can spend more time focusing on the output from the data analytics activity and its value in the decision-making process.
Anything else you would like to add?
Craig Stewart: Data is only getting bigger, faster, and more complex, so getting a handle on how to best wrangle, manage and use this data is critical to success. With the increasing volume and variety of data, along with the rise of cloud apps, AI and IoT, it is more important than ever that businesses are collecting, managing and analysing all of this data smartly and efficiently.
I would add, as businesses look to migrate their big data to the cloud – and they should absolutely be doing this – it would be wise to investigate a multi-cloud strategy. The major cloud providers have distinct strengths, and capabilities are being added every day, so you want to find the right cloud platform for the right job while retaining flexibility and control. Multi-cloud is definitely the way forward – businesses should experiment using different platforms to see what combinations work best for them.
Written by Leah Alger