
To understand what it takes to be a data engineer, start here

Among data scientists and analysts, think of data engineers as the architects. Here's a roundup of free courses and other resources to get you learning about the role.

Start learning now. (Photo by Prateek Katyal from Pexels)

Skill-Based Learning is a series brought to you by Women in Data — Philadelphia that highlights the skills required to succeed in data and tech and offers resources for you to start learning. This week, we are focusing on understanding data engineering.


To understand the basic skill sets required to be a data engineer, let’s start with what a data engineer’s role involves.

This role is ever growing and ever changing. A data engineer is responsible for collecting, storing, querying, cleaning and manipulating data efficiently. A typical data engineer works closely with data scientists and data analysts. Think of a data engineer as the architect who builds the data tables that scientists and analysts analyze.
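To make those responsibilities concrete, here is a minimal sketch of a collect, clean, store and query flow in Python. It is only an illustration under stated assumptions: the records, the `purchases` table and every function name are hypothetical, and an in-memory SQLite database stands in for whatever storage a real team would use.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from an upstream source.
RAW_ROWS = [
    {"customer": " Alice ", "amount": "19.99"},
    {"customer": "bob", "amount": "5"},
    {"customer": "", "amount": "12.50"},              # missing name: dropped
    {"customer": "Alice", "amount": "not-a-number"},  # unparseable amount: dropped
    {"customer": "BOB", "amount": "7.25"},
]


def clean(rows):
    """Normalize customer names and drop rows that cannot be parsed."""
    cleaned = []
    for row in rows:
        customer = row["customer"].strip().lower()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not a number
        if customer:
            cleaned.append((customer, amount))
    return cleaned


def load(conn, rows):
    """Store cleaned rows in a table (the table name is an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchases (customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    conn.commit()


def totals_by_customer(conn):
    """Query the stored data the way an analyst might."""
    cursor = conn.execute(
        "SELECT customer, ROUND(SUM(amount), 2) "
        "FROM purchases GROUP BY customer ORDER BY customer"
    )
    return cursor.fetchall()


def run_pipeline(raw_rows):
    """Clean the raw rows, load them into SQLite, and return totals."""
    conn = sqlite3.connect(":memory:")
    try:
        load(conn, clean(raw_rows))
        return totals_by_customer(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_pipeline(RAW_ROWS))
```

The split into small functions mirrors how the work is often divided in practice: the cleaning rules can change without touching the storage or query steps.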

A strong requirement for any data engineer role is broad knowledge of database technologies, both SQL (relational) and NoSQL. A data engineer also needs to be proficient in Python, as well as big data tools like Hadoop, MapReduce, Hive and Apache Spark.
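For readers who have not met MapReduce before, the idea can be sketched in a few lines of plain Python. This is not Hadoop's API: it is a single-process illustration of the map, shuffle and reduce steps, and all function names here are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in order on a small list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big pipelines", "data data"]))
```

In a real cluster the map and reduce steps run in parallel across many machines, which is where the framework earns its keep. The logic of each step stays about this simple.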

This well-known blog post by Robert Chang, who worked at Airbnb, explains the day-to-day of a data engineer. It is especially useful if you want to understand the difference between the roles of a data scientist and a data engineer.

Free (or nearly free) courses

If you have decided to pursue data engineering, these three courses are a good starting point:

  1. Big Data in AWS (Amazon Web Services) Cloud — Learning about AWS is crucial for an aspiring data engineer. This Udemy course is a basic introduction to the AWS offerings. It expects learners to have a basic understanding of data-related concepts like data streaming, databases and data warehousing.
  2. Data Engineering, Big Data, and Machine Learning on GCP Specialization — This online course provides participants with an introduction to designing and building data pipelines on Google Cloud Platform. At the end of this course, you will be able to work on a data engineering project.
  3. Microsoft Certified: Azure Data Engineer Associate — If you are looking for a more advanced course to build on your data engineering knowledge, this one is worth looking into. Each module trains the user to work as a data engineer on the Azure platform.

Books

If you are interested in learning about new and enhanced star schema dimensional modeling patterns, along with case studies that illustrate business use cases, read “The Data Engineering Cookbook.” The author, Andreas Kretz, shares data engineering knowledge drawn from his data science workflow.

And the book “DW 2.0: The Architecture for the Next Generation of Data Warehousing” describes the future of data warehousing at both the architecture and technology levels.

###

If you work as a data engineer or have experience in this field, please let me know what other resources we could add to this data engineering series: rajvi.mehta@womenindata.org.

This is a guest post by Rajvi Mehta, Women in Data — Philadelphia's regional sponsorship lead.