This course helps learners create a study plan for the PDE (Professional Data Engineer) certification exam. Learners explore the breadth and scope of the domains covered in the exam. Learners assess their exam readiness and create their individual study plan.
Professional Data Engineer Certification Learning Path
A Data Engineer designs and builds systems that collect and transform the data used to inform business decisions.
This learning path guides you through a curated collection of on-demand courses, labs, and skill badges that provide you with real-world, hands-on experience using Google Cloud technologies essential to the Data Engineer role. Once you complete the path, check out the Google Cloud Data Engineer certification to take the next steps in your professional journey.
You can follow this learning path at your own pace but other learning options are available which give you the opportunity to earn a no-cost certification exam voucher. If you’d like to continue on-demand visit Partner Certification Kickstart. If you’re interested in a hybrid learning approach visit Partner Certification Academy.
In this course, you learn about data engineering on Google Cloud, the roles and responsibilities of data engineers, and how those map to offerings provided by Google Cloud. You also learn about ways to address data engineering challenges.

While the traditional approaches of using data lakes and data warehouses can be effective, they have shortcomings, particularly in large enterprise environments. This course introduces the concept of a data lakehouse and the Google Cloud products used to create one....

In this intermediate course, you will learn to design, build, and optimize robust batch data pipelines on Google Cloud. Moving beyond fundamental data handling, you will explore large-scale data transformations and efficient workflow orchestration, essential for timely business intelligence and...

In this course you will get hands-on in order to work through real-world challenges faced when building streaming data pipelines. The primary focus is on managing continuous, unbounded data with Google Cloud products.
Incorporating machine learning into data pipelines increases the ability to extract insights from data. This course covers ways machine learning can be included in data pipelines on Google Cloud. For little to no customization, this course covers AutoML. For more...
This course is part 1 of a 3-course series on Serverless Data Processing with Dataflow. In this first course, we start with a refresher of what Apache Beam is and its relationship with Dataflow. Next, we talk about the Apache...
In this second installment of the Dataflow course series, we are going to be diving deeper on developing pipelines using the Beam SDK. We start with a review of Apache Beam concepts. Next, we discuss processing streaming data using windows,...
In the last installment of the Dataflow course series, we will introduce the components of the Dataflow operational model. We will examine tools and techniques for troubleshooting and optimizing pipeline performance. We will then review testing, deployment, and reliability best...

Complete the introductory Prepare Data for ML APIs on Google Cloud skill badge to demonstrate skills in the following: cleaning data with Dataprep by Trifacta, running data pipelines in Dataflow, creating clusters and running Apache Spark jobs in Dataproc, and...

Complete the intermediate Build a Data Warehouse with BigQuery skill badge course to demonstrate skills in the following: joining data to create new tables, troubleshooting joins, appending data with unions, creating date-partitioned tables, and working with JSON, arrays, and structs...

Complete the intermediate Engineer Data for Predictive Modeling with BigQuery ML skill badge to demonstrate skills in the following: building data transformation pipelines to BigQuery using Dataprep by Trifacta; using Cloud Storage, Dataflow, and BigQuery to build extract, transform, and...

Complete the introductory Build a Data Mesh with Dataplex skill badge to demonstrate skills in the following: building a data mesh with Dataplex to facilitate data security, governance, and discovery on Google Cloud. You practice and test your skills in...
This course explores Gemini in BigQuery, a suite of AI-driven features to assist data-to-AI workflow. These features include data exploration and preparation, code generation and troubleshooting, and workflow discovery and visualization. Through conceptual explanations, a practical use case, and hands-on...
This course demonstrates how to use AI/ML models for generative AI tasks in BigQuery. Through a practical use case involving customer relationship management, you learn the workflow of solving a business problem with Gemini models. To facilitate comprehension, the course...