
Apache Kafka

DataSQRL supports Apache Kafka as a data source and data sink to ingest or export data streams.

Kafka Connector

system.discovery.table.json
{
  "type": "source",
  "connector": {
    "name": "kafka",
    "bootstrap.servers": "111.0.0.100"
  }
}

The Kafka connector has the following configuration options:

Field Name | Description | Required?
bootstrap.servers | List of server IPs for Kafka instances to connect to | Yes
prefix | Limits data discovery to topics with the given prefix | No

You can add additional Kafka consumer or producer configuration options to the Kafka connector configuration.

Data Format

The Kafka connector supports all streaming data formats.

The Kafka connector supports automatic data format discovery based on the topic name extension. For example, a topic name that ends in .json is assumed to have the JSON data format. Unless your Kafka topics use this particular naming convention, you have to configure a data format in the data system configuration.

Data Discovery

Data discovery retrieves all topics in the Apache Kafka cluster. If a prefix is configured, it filters out all topics that don't match the prefix. Data discovery maps each topic to a table source or sink. If the topic name has an extension that matches a data format, data discovery configures that data format for the table. The data format configured in the data system configuration takes precedence. The table name is the name of the topic (without extension).
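The mapping from topics to tables described above can be illustrated with a minimal Python sketch. This is not DataSQRL's implementation: the function discover_tables, its parameters, and the KNOWN_FORMATS set are hypothetical names chosen for illustration, and the set of recognized extensions is an assumption.

```python
# Assumption: a hypothetical set of recognized format extensions.
# The real list depends on the data formats DataSQRL supports.
KNOWN_FORMATS = {"json", "csv"}


def discover_tables(topics, prefix=None, configured_format=None):
    """Map topic names to {table name: data format} (illustrative only)."""
    tables = {}
    for topic in topics:
        # Skip topics that do not start with the configured prefix.
        if prefix is not None and not topic.startswith(prefix):
            continue
        name, dot, extension = topic.rpartition(".")
        if dot and name and extension.lower() in KNOWN_FORMATS:
            # The extension names a data format: strip it from the table name.
            table_name, detected = name, extension.lower()
        else:
            table_name, detected = topic, None
        # An explicitly configured format takes precedence over the extension.
        tables[table_name] = configured_format or detected
    return tables
```

A table whose format comes back as None corresponds to a topic without a recognized extension, which is the case where a data format has to be configured explicitly.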

Data discovery reads part of the data stream for each topic to determine the schema of the table, unless a schema is configured.
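How a schema is derived from sampled data is not specified here. As a rough illustration only, assuming JSON records and a flat field layout, schema inference from a sample could look like the sketch below. The function infer_schema and the "mixed" type label are hypothetical.

```python
import json


def infer_schema(sample_lines):
    """Infer a flat {field: type name} mapping from sampled JSON records.

    Illustrative sketch only; it does not reflect DataSQRL's actual logic.
    """
    schema = {}
    for line in sample_lines:
        record = json.loads(line)
        for field, value in record.items():
            type_name = type(value).__name__
            seen = schema.setdefault(field, type_name)
            # Widen to "mixed" when sampled records disagree on a field's type.
            if seen != type_name:
                schema[field] = "mixed"
    return schema
```

Because only part of the stream is sampled, a field that first appears later in the stream would be missing from the inferred schema, which is one reason to configure a schema explicitly.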

Data Sink

Kafka can be configured as a data sink by setting the type to sink or source_and_sink. format is a required field when using Kafka as a sink. The exported stream table is published to the topic with the name of the table sink.
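The requirements stated on this page can be checked before a configuration is used. The following is a minimal sketch, not part of DataSQRL: validate_kafka_config is a hypothetical helper, and it assumes the data format is stored under a top-level "format" key, which this page does not confirm.

```python
VALID_TYPES = {"source", "sink", "source_and_sink"}


def validate_kafka_config(config):
    """Return a list of problems found in a Kafka table configuration."""
    problems = []
    table_type = config.get("type")
    if table_type not in VALID_TYPES:
        problems.append(f"unknown type: {table_type!r}")
    connector = config.get("connector", {})
    if "bootstrap.servers" not in connector:
        problems.append("connector is missing bootstrap.servers")
    # Assumption: the data format lives under a top-level "format" key.
    if table_type in {"sink", "source_and_sink"} and "format" not in config:
        problems.append("format is required when Kafka is used as a sink")
    return problems
```

Applied to the source configuration shown at the top of this page, the helper returns an empty list. Changing the type to sink without adding a format yields one problem.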