Query Tableflow Tables with Flink in Confluent Cloud for Apache Flink®

Confluent Cloud for Apache Flink® supports snapshot queries that read data from a Tableflow-enabled topic at a specific point in time.

Querying a Tableflow-enabled topic is similar to querying a Flink table.

  • If Tableflow is enabled on a topic with Confluent Managed Storage, the query reads from both Kafka and Parquet files.
  • If Tableflow is enabled on a topic with custom storage, the query reads from your S3 bucket.

This guide shows how to run a snapshot query on a Tableflow-enabled topic.

Note

Snapshot query is an Early Access Program feature in Confluent Cloud for Apache Flink.

An Early Access feature is a component of Confluent Cloud introduced to gain feedback. This feature should be used only for evaluation and non-production testing purposes or to provide feedback to Confluent, particularly as it becomes more widely available in follow-on preview editions.

Early Access Program features are intended for evaluation use in development and testing environments only, and not for production use. Early Access Program features are provided: (a) without support; (b) “AS IS”; and (c) without indemnification, warranty, or condition of any kind. No service level commitment will apply to Early Access Program features. Early Access Program features are considered to be a Proof of Concept as defined in the Confluent Cloud Terms of Service. Confluent may discontinue providing preview releases of the Early Access Program features at any time in Confluent’s sole discretion.

Prerequisites

  • Access to Confluent Cloud.
  • The OrganizationAdmin, EnvironmentAdmin, or FlinkAdmin role for creating compute pools, or the FlinkDeveloper role if you already have a compute pool. If you don’t have the appropriate role, contact your OrganizationAdmin or EnvironmentAdmin. For more information, see Grant Role-Based Access in Confluent Cloud for Apache Flink.
  • A provisioned Flink compute pool.

Step 1: Enable Tableflow on your topic

  • If you want to try querying a table with mock data, complete the steps in Run a Snapshot Query, then proceed to the next step.

  • If you want to query a table with mock data or data from your Kafka topic, and you want to use Confluent Managed Storage, complete the following steps, then proceed to Step 2: Run a snapshot query with Tableflow.

    1. In Confluent Cloud Console, navigate to your cluster.

    2. In the navigation menu, click Topics.

    3. In the topics list, find your topic and click it to open the details page.

    4. Click Enable Tableflow.

    5. In the Enable Tableflow dialog, select Iceberg and click Use Confluent storage.

      The topic status updates to Tableflow Syncing.

  • If you want to query a table with data from your Kafka topic, and you want to use custom storage, complete steps 1-4 in Tableflow Quick Start Using Your Storage and AWS Glue, then proceed to Step 2: Run a snapshot query with Tableflow.

Step 2: Run a snapshot query with Tableflow

After Tableflow is enabled on your topic, you can run a snapshot query on the table by using the same statements that you use for Flink tables.

In a Flink workspace or the Flink SQL shell, run the following SET statement before your query:

SET 'sql.snapshot.mode' = 'now';

Alternatively, in a Flink workspace, you can set the Mode dropdown to Snapshot before you run your query.
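
As a minimal sketch of how the SET statement and a query fit together, the following statements enable snapshot mode and then count the rows in a table. The table name `orders` is a hypothetical placeholder that is not defined in this guide; substitute the table that corresponds to your Tableflow-enabled topic.

```sql
-- Enable snapshot mode: read the data as of the current point in time.
SET 'sql.snapshot.mode' = 'now';

-- `orders` is a hypothetical table name; replace it with your own table.
SELECT COUNT(*) AS row_count
FROM `orders`;
```

Because a snapshot query reads data at a specific point in time, a statement like this is expected to return its result and finish rather than run continuously.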

For more information, see Run a Snapshot Query.