GCP Series Part 1: How to become a Google Cloud certified Professional Data Engineer (updated exam 2019)?

Anurag Bhatia
5 min read · Nov 20, 2019

If you have been thinking about taking the exam but don’t know whether your preparation is good enough to pass, or are looking for the right (preferably online, self-paced) courses to get you ready, this article is meant precisely for you. The learning path I have laid out here comes from my own experience of preparing for the exam, including the mistakes I could have avoided, seen with the benefit of hindsight.

Context: Let’s start with the basic question. Do I need to have a certain background to pass this exam? Short answer: Not necessarily. Let me explain.

The official website mentions the following:

I fell short on both these counts. In fact, after having spent many years in banking, I had switched to machine learning rather recently, at the ripe young age of 38. Still, I passed the exam on my first attempt. So, while it’s certainly good to have the relevant experience, its absence need not deter you from pursuing your goal. But be ready to slog it out that much more in that case, of course.

Challenges: It is not an easy exam, for sure. And here is why:

  1. A quick glance at the exam guide will suffice to give you a sense of the wide range of topics on which you’ll be assessed.
  2. About 10–20% of the questions are likely to be directly or indirectly related to machine learning. [This turned out to be my strong area, but more on that later.]
  3. If you are new to GCP, just like I was at that stage, it can be slightly intimidating, especially early on, to keep track of the many GCP services with similar and confusing names: Dataflow, Dataproc, Dataprep, Data Studio, not to mention BigQuery and Bigtable. :) It may take a few weeks to get used to them, so just hang in there.
  4. There is no single source that can thoroughly prepare you for the exam on its own. The exam format was revised early this year (e.g. no more case studies), but not every online training portal has updated its content accordingly. Having said that, I’ll mention the ones I found useful for my preparation. (I took the exam in August 2019.) If possible, please stick to the order in which I mention them.

Coursera specialization: A set of 6 courses offered by Google Cloud itself. Its USP is the many hours of Qwiklabs access it gives you, so make the most of it. Repeat the lab exercises, if required. I took it in June 2019, and it had not been updated for the new exam format by then. I am still not sure whether it has been updated since. Nonetheless, if working on GCP is not part of your daily work, then your preparation begins right here. It requires a fair amount of time: it took me about a month to complete. Very good for getting started. Necessary, but not sufficient.

Udemy course: As with any other course on Udemy, wait a few days and you should get it for $10 or even less. Good value for money. Relatively short and quite underrated. Doesn’t cover every exam topic, though.

Linux Academy’s course: Arguably the best online course out there, though not the one to begin your preparation with. Two big USPs: 1. It has been updated for the new exam. (I waited more than a month for the update, but it was certainly worth it.) 2. Unlike most others, it offers a few quizzes and a mock test that is quite representative of the type of questions you’ll come across in the actual exam. Attempt it at least twice, even if a few questions get repeated.

Dmitri Lerko’s blog: Not a course or even a tutorial. Instead, this is a short collection of topics which are important from the exam perspective. Though I can go on and on about how relevant it is, let me just say this: It is indispensable if you don’t want to take any chances. Even if you have gone through all the courses mentioned above, ignore this blog at your own peril. Period.

Official practice exam: Take it at least thrice. I was skeptical about taking it more than once, since I assumed that the questions would remain the same. Thankfully, better sense prevailed. A few questions were indeed the same, but many others were not.

Machine Learning (ML): Since the exam uses a multiple-choice format, you are unlikely to be asked to write code. Still, you are expected to have a decent understanding of two aspects of ML:

  1. Basic jargon (e.g. the difference between supervised and unsupervised learning, labels and features, overfitting etc.) and the industry landscape (ML use cases across industries, issues of ‘black-box’ models and fairness in ML, AutoML etc.). If some of this currently intimidates you a bit, your understanding is probably not yet good enough. Work on it accordingly. AI for Everyone is a very good MOOC for getting a quick grasp of most of these topics.
  2. Machine learning related GCP products/APIs: Personally, since I had hands-on experience with some of these (Dialogflow, Cloud Vision API, AutoML Natural Language API) as part of my ML projects, I could answer such questions easily. But the Data Loss Prevention API, AI Platform and AutoML Tables are also there, just to name a few. Given a situation, you should be able to judge a) whether or not it is a good fit for any of these products and b) why (or why not).

One of the most fascinating things in ML is that a lot of excellent content is being shared online, absolutely free. As Károly Zsolnai-Fehér very aptly put it: “What a time to be alive!” Even if you are an absolute beginner in a topic, help is just a mouse click away. Here are some amazing YouTube channels you’ll find very interesting and useful (for ML):

a. Luis Serrano

b. Arxiv Insights

c. Brandon Rohrer

d. 3Blue1Brown

Of course, there is also the official GCP channel where they share their latest updates in the form of short videos: This Week in Cloud.

Bottom line: At the risk of repeating myself, the exam is challenging, especially if you are new to GCP and/or machine learning. Having said that, if I could pass it by creating and following this learning path, there is no reason why you can’t. So, GO FOR IT.

Looking forward to seeing you in that hoodie with the Google Cloud logo soon. Good luck. :)

Feel free to contact me if you have any queries or feedback.

In Part 2, I’ll write about another, more generic GCP certification: Professional Cloud Architect.