Data quality is crucial for making informed decisions in today’s data-driven world. ETL (Extract, Transform, Load) processes play a vital role in ensuring data quality by integrating data from disparate sources into a unified format. However, ETL testing can be cumbersome and time-consuming. In this talk, we will explore how the open-source Python library Great Expectations can revolutionize ETL testing by simplifying data validation and improving data quality.
Takeaways from the talk:
- Attendees will learn about the challenges of traditional ETL testing methods and the advantages of employing Great Expectations. The session will provide a quick look at the key features of the library, such as data profiling, automated testing, and data documentation. We will also walk through a practical example to demonstrate how Great Expectations can be seamlessly integrated into ETL pipelines to streamline data validation and enhance data reliability.
- By the end of the talk, participants will have a solid understanding of how Great Expectations can optimize their ETL testing processes and boost confidence in the quality of their data assets. This session is perfect for data engineers, data analysts, and data scientists who want to enhance their ETL testing workflows and ensure the integrity of their data pipelines.
May 19, 10:45–11:30 (45′)
Patricio Miner