Featured Categories
Databricks - Data Engineer Associate Certification Prep
Databricks is one of the most popular emerging platforms in the data industry. It enables teams to collaborate effectively on big data and machine learning projects. With its in...
Apache Iceberg vs Delta Lake: What are the differences?
The cloud data lakehouse is gaining momentum, driven by the evolution of table formats like Apache Iceberg, Delta Lake, and Hudi. With improved transactional support, ACID compl...
SQL - Order of Execution of a Query
Each query begins with finding the data we need in a database and then filtering that data down into something that can be processed and understood as quickly as possible. Becau...
PySpark - Basics
1. What is PySpark, and how does it relate to Apache Spark? PySpark is the Python API for Apache Spark, an open-source distributed computing system designed for large-scale dat...
SQL - The QUALIFY Clause
Teradata introduced the QUALIFY clause for SQL SELECT queries years ago. It’s been followed over the years by Oracle, Snowflake, Google BigQuery, Databricks, and other relational...