Apache Flink Relational Programming using Table API and SQL

Learn the Apache Flink Table API and SQL interface with Python to process batch and streaming data workloads at scale

Apache Flink is rapidly growing in popularity for its ability to perform advanced stateful computations at a scale that meets the demands of both high-throughput and high-performance use cases. Beyond being scalable and performant, Apache Flink integrates with a wide variety of source and sink data systems, such as flat files (CSV, TXT, TSV), databases, and message queues (Kafka, AWS Kinesis, GCP Pub/Sub, RabbitMQ).

What you’ll learn

  • Apache Flink Table API.
  • Apache Flink SQL Interface.
  • Apache Flink with Python (PyFlink).
  • Batch Data Processing.
  • Stream Data Processing.

Course Content

  • Introduction: 5 lectures • 4min.
  • Introduction to Apache Flink Table API and SQL Interface: 7 lectures • 13min.
  • TableEnvironment, Table Sources and Table Sinks: 9 lectures • 1hr 1min.
  • Operations on the Table Object using Table API and SQL: 21 lectures • 2hr 54min.

Requirements

  • Previous experience with Python programming.
  • Basic understanding of operating systems and Docker.
  • Basic understanding of distributed computing.

In this course, students will learn to harness the power of Apache Flink, a modern distributed computing framework that provides a unified approach to batch and streaming data processing workloads. The course focuses specifically on the relational programming paradigm exposed through Apache Flink’s Table API and SQL interface (with examples in Python), which offer intuitive yet powerful abstractions for processing vast amounts of data from either bounded (batch) or unbounded (streaming) sources.

  • Students learn batch processing with Flink through many examples that read CSV data from the filesystem, process it, and write the results back to the filesystem in CSV format.
  • Students also learn stream processing with Flink through several examples that consume data from, process it, and produce results to Apache Kafka running in a local Dockerized Kafka cluster.

Apache Flink supports developing applications with the Table API and SQL interface in Java, Scala, and Python. This course, however, focuses on the Python bindings for Apache Flink. Python was chosen because of its popularity, particularly in the big data engineering ecosystem, and because it is underrepresented in existing Apache Flink courses, which primarily cover Flink's Java and Scala APIs.