Kafka Streams
This tutorial shows how to integrate Upstash Kafka with Kafka Streams.
Kafka Streams is a client library for building stream processing applications whose input and output data are stored in Kafka topics, for example streaming data from one topic to another.
Upstash Kafka Setup
Create a Kafka cluster using Upstash Console or Upstash CLI by following Getting Started.
Create two topics by following the creating topic steps. Name the first topic "input", since we are going to stream from it to the second one, which we can name "output".
Project Setup
If you already have a project and want to use Kafka Streams with Upstash Kafka in it, you can skip this section and continue with Add Kafka Streams into the Project.
Install Maven on your machine by following the Maven Installation Guide.
Run `mvn --version` in a terminal or command prompt to make sure you have Maven installed. It should print out the version of Maven you have:
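The exact lines depend on your installation, but the output looks roughly like this:

```shell
Apache Maven 3.x.x
Maven home: /path/to/maven
Java version: ..., vendor: ...
```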
To create the Maven project:
In your terminal or command prompt, go into the folder where you want to create the project by running cd <folder path>
Run the following command:
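A minimal quickstart command, assuming an example group ID and artifact ID (rename them to match your project):

```shell
mvn archetype:generate \
  -DgroupId=com.example.kafkastreams \
  -DartifactId=kafkastreams-tutorial \
  -DarchetypeArtifactId=maven-archetype-quickstart \
  -DarchetypeVersion=1.4 \
  -DinteractiveMode=false
```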
Add Kafka Streams into the Project
Open the project folder in an IDE that has a Maven plugin, such as IntelliJ IDEA, Visual Studio Code, or Eclipse. Add the following dependencies to the dependencies tag in the pom.xml file:
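A sketch of the dependency entry, assuming the kafka-streams artifact; adjust the version to the latest release:

```xml
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <version>3.6.0</version>
</dependency>
```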
Streaming From One Topic to Another Topic
Import the following packages first:
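Assuming the word-count example used in the rest of this section, the imports look like this:

```java
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Arrays;
import java.util.Properties;
```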
Define the names of the topics you are going to work on:
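For example, using the topics created earlier:

```java
String inputTopic = "input";
String outputTopic = "output";
```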
Create the following properties for Kafka Streams and replace UPSTASH-KAFKA-* placeholders with your cluster information.
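A sketch of the configuration, assuming SASL_SSL with SCRAM-SHA-256 authentication and an arbitrary application ID:

```java
Properties props = new Properties();
// Identifies this Streams application; also used as a prefix for its internal topics.
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "upstash-kafka-streams-wordcount");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "UPSTASH-KAFKA-ENDPOINT:9092");
props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
props.put(SaslConfigs.SASL_MECHANISM, "SCRAM-SHA-256");
props.put(SaslConfigs.SASL_JAAS_CONFIG,
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        + "username=\"UPSTASH-KAFKA-USERNAME\" password=\"UPSTASH-KAFKA-PASSWORD\";");
// Default serdes for keys and values read from and written to Kafka.
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
```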
Start the builder for streaming and assign input topic as the source:
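For example:

```java
StreamsBuilder builder = new StreamsBuilder();
// Read records from the input topic as a stream of String key/value pairs.
KStream<String, String> source = builder.stream(inputTopic);
```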
Apply the following steps to count the words in the sentence sent to input topic and stream the results to the output topic:
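A minimal sketch of such a word-count pipeline:

```java
source
    // Split each sentence into lowercase words.
    .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
    // Re-key each record by the word itself; this triggers a repartition (see the note below).
    .groupBy((key, word) -> word)
    // Count occurrences of each word.
    .count()
    // Convert the resulting KTable back to a stream and the counts to strings.
    .toStream()
    .mapValues(count -> count.toString())
    // Write the results to the output topic.
    .to(outputTopic, Produced.with(Serdes.String(), Serdes.String()));
```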
Since the groupBy function causes repartitioning and creates a new internal topic to store the intermediate groups, make sure there is enough partition capacity in your Upstash Kafka cluster. For detailed information about the maximum partition capacity of a Kafka cluster, check the plans.
To be sure, you can check the topics section on the console to verify that the internal repartition topic was created successfully once you run your application and send data to the input topic. For reference, the naming convention for internal repartition topics is:
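They generally follow this pattern, where the middle part is generated from the operation that required the repartition:

```
<application.id>-<generated-operation-name>-repartition
```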
Next, finalize and build the streams builder to create the topology of your processing pipeline. The topology can be viewed by printing it.
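For example:

```java
Topology topology = builder.build();
// Print a human-readable description of the processing topology.
System.out.println(topology.describe());
```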
Here is the example topology for this scenario:
Finally, create the Kafka Streams instance from the topology that was built and start it.
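A sketch of this final step, with a shutdown hook added so the application closes cleanly:

```java
KafkaStreams streams = new KafkaStreams(topology, props);
// Close the streams client cleanly when the JVM shuts down.
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
streams.start();
```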