Sample dataset generator for Aiven for Apache Kafka®
Learning to work with streaming data is much more fun when you have data to stream. To get you started on your Apache Kafka® journey, this guide shows how to produce fake streaming data to a topic.
Note
The following example is based on Docker images, which require Docker or Podman to run.
The following example assumes you have an Aiven for Apache Kafka® service running. You can create one following the dedicated instructions.
Fake data generator on Docker
To learn data streaming, you need a continuous flow of data and for that you can use the Dockerized fake data producer for Aiven for Apache Kafka®. To start using the generator:
Clone the repository:
git clone https://github.com/aiven/fake-data-producer-for-apache-kafka-docker
Copy the file conf/env.conf.sample to conf/env.conf
Create a new access token via the Aiven Console or the following command in the Aiven CLI, changing max-age-seconds appropriately for the duration of your test:
avn user access-token create \
  --description "Token used by Fake data generator" \
  --max-age-seconds 3600 \
  --json | jq -r '.[].full_token'
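If jq is not available, the token can also be pulled out of the JSON output with standard tools. The following is only a minimal sketch and not part of the Aiven CLI: the helper name extract_full_token is hypothetical, and it assumes the output contains a single "full_token" string field (the field selected by the jq filter in the command above) whose value has no escaped double quotes.

```shell
# Hypothetical helper: print the value of the "full_token" field read from stdin.
# Assumes the value is a plain string without escaped double quotes.
extract_full_token() {
  sed -n 's/.*"full_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (requires the Aiven CLI):
#   avn user access-token create \
#     --description "Token used by Fake data generator" \
#     --max-age-seconds 3600 \
#     --json | extract_full_token
```

A proper JSON parser such as jq remains the more robust choice; this pattern match is only a fallback for simple output.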
Tip
The above command uses jq (https://stedolan.github.io/jq/) to parse the result of the Aiven CLI command. If you don’t have jq installed, you can remove the | jq -r '.[].full_token' section from the above command and parse the JSON result manually to extract the access token.
Edit the conf/env.conf file, filling in the following placeholders:
my_project_name: the name of your Aiven project
my_kafka_service_name: the name of your Aiven for Apache Kafka® instance
my_topic_name: the name of the target topic; this can be any name
my_aiven_email: the email address used as username to log in to Aiven services
my_aiven_token: the access token generated during the previous step
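The copy and edit steps can also be scripted. The sketch below assumes that the placeholder strings listed above appear literally in conf/env.conf.sample. The function name fill_env_conf and its argument order are hypothetical, and the values passed in must not contain the characters |, & or \, which sed treats specially in this form.

```shell
# Hypothetical helper: write a filled-in copy of the sample configuration.
# Arguments: sample file, output file, project, service, topic, email, token.
# Assumes the values contain no "|", "&" or "\" characters (special to sed here).
fill_env_conf() {
  sed \
    -e "s|my_project_name|$3|g" \
    -e "s|my_kafka_service_name|$4|g" \
    -e "s|my_topic_name|$5|g" \
    -e "s|my_aiven_email|$6|g" \
    -e "s|my_aiven_token|$7|g" \
    "$1" > "$2"
}
```

With placeholder values of your own, a call run from the repository root could look like fill_env_conf conf/env.conf.sample conf/env.conf my-project my-kafka my-topic me@example.com "$TOKEN", where all five values are illustrative.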
Build the Docker image with:
docker build -t fake-data-producer-for-apache-kafka-docker .
Tip
Every time you change any parameters in the conf/env.conf file, you need to rebuild the Docker image to start using them.
Start the streaming data flow with:
docker run fake-data-producer-for-apache-kafka-docker
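Because configuration changes only take effect after a rebuild, it can be convenient to run the build and run steps together. This is a minimal sketch rather than part of the repository: rebuild_and_run and the CONTAINER_CLI variable are hypothetical names, CONTAINER_CLI defaults to docker and can be set to podman, and the function is assumed to be called from the repository root.

```shell
# Image tag used in the build and run steps above.
IMAGE="fake-data-producer-for-apache-kafka-docker"

# Hypothetical helper: rebuild the image, then start the producer.
# CONTAINER_CLI defaults to docker; set it to podman if preferred.
rebuild_and_run() {
  cli="${CONTAINER_CLI:-docker}"
  # Only start the container if the build succeeded.
  "$cli" build -t "$IMAGE" . && "$cli" run "$IMAGE"
}
```

Calling rebuild_and_run after each edit to conf/env.conf ensures the running producer picks up the new parameters.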
Once the Docker container is running, check in the target Aiven for Apache Kafka® service that the topic is being populated. You can do this in the Topics tab of the Aiven Console, if the Kafka REST option is enabled. Alternatively, you can use a tool like kcat to achieve the same.
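A check with kcat might look like the sketch below. It assumes the service accepts SSL client-certificate authentication and that the certificate files are in the current directory. The file names ca.pem, service.cert and service.key, the function name peek_topic, the KCAT_BIN variable, and the five-message limit are all assumptions made for illustration.

```shell
# Hypothetical helper: read the first few messages from a topic over SSL.
# Arguments: broker address (host:port), topic name.
# KCAT_BIN defaults to kcat.
peek_topic() {
  "${KCAT_BIN:-kcat}" \
    -b "$1" \
    -X security.protocol=ssl \
    -X ssl.ca.location=ca.pem \
    -X ssl.certificate.location=service.cert \
    -X ssl.key.location=service.key \
    -C -t "$2" -o beginning -c 5
}
```

Here -C selects consumer mode, -o beginning starts from the earliest offset, and -c 5 exits after five messages, which is enough to confirm that the producer is writing to the topic.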