
To understand what it takes to be a data engineer, start here

Among data scientists and analysts, think of data engineers as the architects. Here's a roundup of free courses and other resources to get you learning about the role.

Start learning now. (Photo by Prateek Katyal from Pexels)

Skill-Based Learning is a series brought to you by Women in Data — Philadelphia that highlights the skills needed to succeed in data and tech and offers resources to help you start learning. This week, we are focusing on understanding data engineering.


To understand the basic skill sets required to be a data engineer, let’s start with what a data engineer’s role involves.

This role is always growing and changing. A data engineer is responsible for collecting, storing, querying, cleaning and manipulating data efficiently. A typical data engineer works closely with data scientists and data analysts. Think of a data engineer as the architect who builds the data tables that scientists and analysts then analyze.
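To make the collect, clean, store and query steps concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The sample records, the purchases table and the function names are all hypothetical and chosen only for illustration. A real pipeline would read from actual sources and usually write to a production database or warehouse.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from an upstream source.
RAW_ROWS = [
    {"customer": " Ana ", "city": "philadelphia", "spend": "12.50"},
    {"customer": "Ben", "city": "Philadelphia", "spend": "7"},
    {"customer": "", "city": "Camden", "spend": "3.00"},     # missing customer
    {"customer": "Cara", "city": "camden", "spend": "n/a"},  # unparseable spend
]


def clean(row):
    """Return a tidy (customer, city, spend) tuple, or None if the row is unusable."""
    customer = row["customer"].strip()
    try:
        spend = float(row["spend"])
    except ValueError:
        return None
    if not customer:
        return None
    return (customer, row["city"].strip().title(), spend)


def build_table(rows):
    """Clean the rows and store them in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (customer TEXT, city TEXT, spend REAL)")
    cleaned = [r for r in map(clean, rows) if r is not None]
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", cleaned)
    conn.commit()
    return conn


def spend_by_city(conn):
    """Query the table the way an analyst might: total spend per city."""
    return conn.execute(
        "SELECT city, SUM(spend) FROM purchases GROUP BY city ORDER BY city"
    ).fetchall()


conn = build_table(RAW_ROWS)
print(spend_by_city(conn))  # [('Philadelphia', 19.5)]
```

The two Camden rows are dropped during cleaning, so only Philadelphia appears in the result. Deciding what to do with bad rows (drop, repair or flag them) is a design choice that this sketch makes in the simplest possible way.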

A strong requirement for any data engineer role is a broad knowledge of database technologies, including SQL and NoSQL databases. A data engineer also needs to be proficient in Python and in big data tools such as Hadoop, MapReduce, Hive and Apache Spark.
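For readers who have not met MapReduce before, the idea can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce steps, not Hadoop's actual API, and the function names are made up for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["data engineers build data tables", "Data tables"]))
# {'data': 3, 'engineers': 1, 'build': 1, 'tables': 2}
```

In a real cluster, the map and reduce steps run in parallel on many machines and the framework handles the shuffle. The sketch only shows how the work is divided.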

This well-known blog post by Robert Chang, who worked at Airbnb, explains the day-to-day work of a data engineer. It is especially useful if you want to understand the difference between the roles of a data scientist and a data engineer.

Free (or nearly free) courses

If you have decided to become a data engineer, these three courses are a good starting point:

  1. Big Data in AWS (Amazon Web Services) Cloud: Learning about AWS is crucial for an aspiring data engineer. This Udemy course is a basic introduction to the AWS offerings. It expects learners to have a basic understanding of data-related concepts like data streaming, databases and data warehousing.
  2. Data Engineering, Big Data, and Machine Learning on GCP Specialization: This online course introduces participants to designing and building data pipelines on Google Cloud Platform. By the end of the course, you will be able to work on a data engineering project.
  3. Microsoft Certified: Azure Data Engineer Associate: If you are looking for more advanced training to build on your data engineering knowledge, this one is worth a look. Each module trains you to be a successful data engineer on the Azure platform.

Books

If you are interested in new and enhanced star schema dimensional modeling patterns, along with case studies that can help you understand specific business use cases, read “The Data Engineering Cookbook.” The author, Andreas Kretz, shares what he knows about data engineering, drawing on his data science workflow.

And the book “DW 2.0: The Architecture for the Next Generation of Data Warehousing” describes the future of data warehousing at both the architecture and technology levels.

###

If you work as a data engineer or have experience in this field, please let me know what other resources we could add to this data engineering series: rajvi.mehta@womenindata.org.

This is a guest post by Rajvi Mehta, Women in Data — Philadelphia's regional sponsorship lead.
