Automate your Java project’s dependency resolution, testing, and more with Gradle.
Use the SQLAlchemy ORM to build data models with meaningful relationships.
Perform SQL-like joins and aggregations on your PySpark DataFrames.
Get the most out of Redshift by tuning your cluster's performance and learning how to query your data optimally.
Manage files in your Google Cloud Storage bucket using the google-cloud-storage Python library.
Work with Spark’s original data structure API: Resilient Distributed Datasets.
Use Apache Airflow to build and monitor better data pipelines.
Use pandas’ MultiIndex to make your data work harder for you.