Xccelerated


Fully automated deployment processes for Data & AI solutions @ Beerwulf

In this blog, a data engineer describes the project he has been working on at Beerwulf: the challenge, the solution the team applied, and the result.


Challenge

Incorporate CI/CD into the development lifecycle.

Beerwulf unites beer drinkers with beer brewers through a user-centric, data-driven storytelling platform dedicated to selling, sharing, and exploring the world of beer. To achieve its mission, the company needed a dedicated team of data scientists and data engineers to incorporate continuous integration and continuous delivery (CI/CD) into the development lifecycle. Beerwulf called on Xccelerated for assistance.

Solution

Azure Databricks workspace or Azure Data Factory

Although technologies like Azure Data Factory and Azure Databricks help data engineers develop data pipelines quickly and easily, delivering code changes more frequently is still a challenge. These changes have to be validated and tested automatically. This process ultimately results in an artefact that should be deployed to a target environment: in Beerwulf’s case, an Azure Databricks workspace or Azure Data Factory.

To integrate the deployment of an entire cloud environment (data factories, storage accounts, and databases for different types of pipeline architectures), fully provisioned environments had to be prepared for any kind of source. This process required the management of template files, parameter template files, and pre- and post-deployment scripts.
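To make the idea of parameter template files concrete, here is a minimal sketch of how shared defaults and environment-specific overrides could be combined into a parameter file in the Azure Resource Manager layout (a `parameters` object whose entries each wrap a `value`). This is an illustration, not Beerwulf's actual setup: the function name `build_parameter_file` and the parameter names and values (`factoryName`, `storageSku`, `adf-example`) are hypothetical.

```python
import json

# Schema identifier commonly used for ARM deployment parameter files.
ARM_PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)


def build_parameter_file(defaults: dict, overrides: dict) -> dict:
    """Merge shared defaults with per-environment overrides.

    Overrides win on conflicts. Each merged entry is wrapped as
    {"value": ...}, which is the shape an ARM parameter file expects.
    """
    merged = {**defaults, **overrides}
    return {
        "$schema": ARM_PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in merged.items()},
    }


# Hypothetical values, for illustration only.
defaults = {"factoryName": "adf-example", "storageSku": "Standard_LRS"}
staging = build_parameter_file(defaults, {"factoryName": "adf-example-staging"})

print(json.dumps(staging, indent=2))
```

Keeping the defaults in one place and only the differences per environment is one way to keep staging and production parameter files from drifting apart.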

Building libraries and non-notebook Apache Spark code, running automated tests and sanity checks, and scheduling data pipelines and machine learning workflows all had to be implemented to incorporate CI/CD into the development process of the Data & AI solutions.
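A sanity check of the kind mentioned above could be as simple as the following sketch. It is written in plain Python so it can run in any CI job without a Spark cluster. The function name `check_rows` and the column names are hypothetical and are not taken from the Beerwulf project.

```python
def check_rows(rows, required_columns):
    """Return a list of human-readable problems found in a batch of rows.

    An empty list means the batch passed. A batch fails when it is empty
    or when any row lacks a value (missing key or None) for a required column.
    """
    problems = []
    if not rows:
        problems.append("batch is empty")
    for index, row in enumerate(rows):
        for column in required_columns:
            if row.get(column) is None:
                problems.append(f"row {index}: missing value for '{column}'")
    return problems


# Hypothetical sample batch: the second row has no product name.
sample_batch = [
    {"order_id": 1, "product": "IPA"},
    {"order_id": 2, "product": None},
]

issues = check_rows(sample_batch, ["order_id", "product"])
for issue in issues:
    print(issue)
```

In a pipeline, a non-empty result could be used to fail the build before the artefact is deployed to the next environment.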

Result

Orchestrate deployments, provisioning, and staging of entire environments.

Beerwulf can now orchestrate deployments, provisioning, and staging of entire environments, including compiled code, Azure Databricks, and other cloud-native tools. This makes it easy for the team to manage deployments, run tests, and adapt data pipelines to different kinds of sources or environments. The project resulted in three different, fully automated deployment processes, each with its own monitoring and approval system, more than 20 test scenarios, and a centralized secret management system.
Our next step? We’ll be teaching more data scientists and platform engineers how to use the new tools. Meanwhile, the happy collaboration between ING and our data engineers has led to permanent offers.

"Besides the excellent work Xccelerated delivered, they also made us confident to hire a consultant from another unit within the Xebia Group, which led to quick and long-term Data & AI solutions. We were thrilled with the knowledge and results delivered by Xccelerated."

Roel Hermens - CIO, Beerwulf

Are you a software engineer interested in becoming a data or cloud engineer?

Xccelerated is an initiative within Xebia Group, accelerating careers of young professionals with 1 to 3 years of relevant work experience in the fields of data & AI. Our 13-month advanced training programs integrate hands-on learning and skill development with working at one of our partner companies, like Heineken, KLM, ING, Vattenfall, Randstad Group or FedEx.

To learn more about our projects and partner organizations, see our other case studies.