Bartosz Bąbol

Software engineering

Apache Kafka Example With Sbt-docker

Introduction

TL;DR: GitHub repository

If you have been working on, for example, a distributed microservice system, you know that managing it, or even simply running it, is not simple anymore. To make your app run on your local machine you need to set up a number of things: each service, probably some message queues, databases and so on. So you need to prepare the environment for your system at least twice: once for your local machine and once for the production environment, which sounds like a perfect case for Docker. In this example we will dockerize a simple ‘microservice’ application. Diagram:

[Diagram: producer → Kafka 0.10 broker → two consumers in separate consumer groups]

The producer produces messages and there are 2 consumers, each in a different consumer group. These three are separate projects built with sbt. As you can see, Kafka 0.10 acts as the message broker, but each service may use a different version of the client API. This is fine since Kafka is backward compatible, and because we run 3 separate applications on 3 different JVMs there are no dependency conflicts.

Prerequisites

– Docker installed

Running the example

Clone the repository:

git clone git@github.com:BBartosz/kafka-sbt-docker.git

and inside the repository directory run:

sbt docker

And then:

docker-compose up

which boots up the whole application inside Docker containers. The logs should look similar to this:

[Screenshot: logs from docker-compose up]

sbt-docker

If you are familiar with Docker (if not, check this quick intro) you know that in order to build a Docker image you need a Dockerfile. Depending on what image you need to build, you write a different Dockerfile, which is a set of instructions for Docker to create an image that can later be run in a container. In the previous post we created a small Dockerfile which set up the environment required by Go to run a very simple HTTP server. In this Kafka example we need to build at least 5 images: 2 for the consumers, 1 for the producer, 1 for Kafka and 1 for ZooKeeper. So what steps should the Dockerfile specify for, say, the producer?

  1. First of all we need some Unix OS.
  2. Then we need to have a JVM installed.
  3. And finally we need a command to run our compiled sbt project.

We could of course write this Dockerfile manually, but in this post we will use the existing sbt-docker library, which will help us with this task.

Installing sbt-docker

sbt-docker is an sbt plugin, so you add it like every other plugin; instructions are in the readme.
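For reference, adding the plugin usually boils down to a single line in project/plugins.sbt. This example also builds a fat JAR with sbt-assembly, so both plugins are needed (version numbers below are only illustrative; check the plugins' readmes for current ones):

project/plugins.sbt

// sbt-docker: generates the Dockerfile and builds the image from sbt
addSbtPlugin("se.marcuslonnberg" % "sbt-docker" % "1.8.2")
// sbt-assembly: builds the fat JAR that the Dockerfile adds to the image
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")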

Using sbt-docker

Go to build.sbt to see some of the Dockerfile settings:

build.sbt
    dockerfile in docker := {
      // The assembly task generates a fat JAR file
      val baseDir = baseDirectory.value
      val artifact: File = assembly.value
      val artifactTargetPath = s"/app/${artifact.name}"
      val dockerResourcesDir = baseDir / "docker-scripts"
      val dockerResourcesTargetPath = "/app/"

      new Dockerfile {
        from("java")
        add(artifact, artifactTargetPath)
        copy(dockerResourcesDir, dockerResourcesTargetPath)
        entryPoint(s"/app/entrypoint.sh")
        debugPort match {
          case Some(port) => cmd(s"${name.value}", s"${version.value}", s"$port")
          case None => cmd(s"${name.value}", s"${version.value}")
        }
      }
    }

These settings build the Dockerfile for us. The main idea is to add the JAR file with our sbt application (in this example we use sbt-assembly to create a fat JAR) to the base Docker image “java”. As the entrypoint I specified a script which boots up our JVM application. You can preview the content of entrypoint.sh to get an idea of why it's done this way; I will explain more in the paragraph about docker-compose. The entrypoint script takes 3 arguments:

  • the name of the application
  • the version of the application
  • an optional debug port, which will be used to expose a port for debugging.

We can additionally specify the names of the Docker image:

build.sbt
  imageNames in docker := Seq(
    // Sets the latest tag
    ImageName(s"${name.value}:latest"),

    // Sets a name with a tag that contains the project version
    ImageName(
      namespace = Some(organization.value),
      repository = name.value,
      tag = Some("v" + version.value)
    )
  )

As you can see, these settings are used by all the projects in this example, i.e. by the 2 consumers and the producer.
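Later in the post (see “Debugging with IntelliJ”) these settings are referred to as a dockerSettings method that takes an optional debug port. A sketch of that shape, reusing the two blocks shown above, could look like this (the repository's actual helper and directory names may differ):

build.sbt

// Bundle the Dockerfile and image-name settings into one reusable method.
def dockerSettings(debugPort: Option[Int] = None) = Seq(
  dockerfile in docker := {
    // the Dockerfile definition shown above, parameterized by debugPort
    ...
  },
  imageNames in docker := Seq(
    // the image names shown above
    ...
  )
)

// each module then just mixes the settings in, e.g.:
lazy val exampleProducer = (project in file("producer"))
  .enablePlugins(sbtdocker.DockerPlugin)
  .settings(dockerSettings())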

By typing:

sbt docker

you create 3 images with the specified options.
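A single sbt docker invocation builds all three images because the root project aggregates the three modules. A minimal sketch of that wiring (the project value names for the producer and the new consumer are assumed; check the repository's build.sbt) could look like:

build.sbt

// Aggregation makes `sbt docker` run the docker task in every module.
lazy val root = (project in file("."))
  .aggregate(exampleProducer, exampleConsumerNew, exampleConsumerOld)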

IMPORTANT NOTE: I've encountered some issues when running this command, related to insufficient permissions on the bash scripts in /docker-scripts. To solve it, set the correct permissions on those scripts:

sudo chmod +x ./consumerNew/docker-scripts/*

And do this for every application.

docker-compose

Now we need to somehow glue together those 3 created images (plus 2 more with Kafka and ZooKeeper), because we want our application to be multi-container. Docker Compose is a tool for achieving this goal: it lets us create many containers from images in one place, make one container depend on another, and so on. Our example will hopefully be simple enough to understand without going to the docs. Let's look at docker-compose.yml:

docker-compose.yml
version: '3'

services:
  zookeeper:
    image: wurstmeister/zookeeper:latest
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka:0.10.0.0
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper
    environment:
      KAFKA_ADVERTISED_HOST_NAME: kafka
      KAFKA_ADVERTISED_PORT: "9092"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_CREATE_TOPICS: "kafka_docker_topic:1:1"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
  example-producer:
    image: example-producer
    ports:
      - "8080:8080"
    tty: true
    depends_on:
      - kafka
  example-consumer-new:
    image: example-consumer-new
    ports:
      - "8081:8081"
    tty: true
    depends_on:
      - kafka
  example-consumer-old:
    image: example-consumer-old
    ports:
      - "8082:8082"
    tty: true
    depends_on:
      - kafka

As you can see, the compose file is in YAML key-value format. When specifying a service we need to give the name of the image we want to run. If Docker doesn't find an image with this name

docker-compose.yml
   image: example-consumer-new

on our local machine, it will go and search Docker Hub.
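Note that these image names have to match what the imageNames setting shown earlier produces. Compose defaults to the :latest tag, so it is enough that the sbt project name matches, e.g. (assumed; check the consumer's settings in build.sbt):

build.sbt

// Assumed project name: combined with ImageName(s"${name.value}:latest")
// this yields a local image example-consumer-new:latest, which is what
// `image: example-consumer-new` in docker-compose.yml resolves to.
name := "example-consumer-new"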

We need to expose some ports, as in the consumer example:

docker-compose.yml
    ports:
      - "8082:8082"

which says exactly: map port 8082 on my machine to port 8082 inside the Docker container network.

We also want to link it to the Kafka image defined at the top of docker-compose.yml and make the consumer container dependent on it:

docker-compose.yml
    depends_on:
      - kafka

The depends_on option expresses dependencies between services: it specifies which services need to be started first. In our example Kafka will be started before the producer and consumer services, and ZooKeeper will be started before Kafka.

There is an important caveat here: depends_on doesn't guarantee that the service another service depends on is “ready”. It only ensures that the dependency has been started. Which brings us to the next topic…

Controlling startup order in docker-compose

We need to make sure that Kafka and ZooKeeper are running before our consumer starts, and to do this we will use a script which waits for Kafka and ZooKeeper to come up.

In this example I've used the wait-for-it script, which pings the specified port and waits until the service is “ready”. In our simple case it works fine, but you may want to write your own script to wait for a specific service to become ready.
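As an illustration of the same idea, here is a minimal Scala sketch (not part of this repository) that polls a host and port until it accepts TCP connections; an application could call something like this on startup instead of relying on a shell script:

import java.net.Socket
import scala.util.Try

// Retry a TCP connection to host:port, sleeping one second between attempts.
// Returns true as soon as the port accepts a connection, false after `retries` failures.
def waitForPort(host: String, port: Int, retries: Int = 30): Boolean =
  (1 to retries).exists { _ =>
    val up = Try { new Socket(host, port).close(); true }.getOrElse(false)
    if (!up) Thread.sleep(1000)
    up
  }

// e.g. waitForPort("kafka", 9092) before creating the Kafka consumer or producer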

The important thing is that we want to run this script, and get acknowledgement that the ‘parent’ container is up, before we run our application. To see how, look at one of the entrypoint.sh files:

entrypoint.sh
    #!/bin/bash

    set -e
    set -x

    jar_name=$1
    version=$2
    debug_port=$3

    source /app/start-kafka-and-zk.sh

    if [ -n "$debug_port" ];
        then java -jar -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address="$debug_port" /app/"$jar_name"-assembly-"$version".jar;
    else
        java -jar /app/"$jar_name"-assembly-"$version".jar;
    fi

This script is the first thing our container invokes during startup. It waits for Kafka and ZooKeeper to start and then runs the application. There is a condition on the debug port because we only want to run the app in debug mode when the user specifies a port.

Digression

You might ask why I did it this way. Why not specify a command in the docker settings in build.sbt, instead of an entrypoint, like this:

build.sbt
  new Dockerfile {
    ...
    cmd("java", "-jar", "other arguments")
    ...
  }

and then specify the entrypoint in docker-compose.yml as the waiting script, so that after the waiting script finishes, the command from the Dockerfile gets executed? The reason is that if you do it this way, the entrypoint from docker-compose “wipes out” the cmd parameters specified in the Dockerfile. To read more about this behaviour, check this discussion.

The official docs about startup order and these waiting scripts are here.

Debugging with IntelliJ

As you can see, dockerSettings is a method which accepts a debug port that will be exposed for IntelliJ IDEA.

Let’s say you want to debug exampleConsumerOld on port 5005:

build.sbt
lazy val exampleConsumerOld = (project in file("consumerOld"))
  .enablePlugins(sbtdocker.DockerPlugin)
  .settings(
    libraryDependencies += "org.apache.kafka" % "kafka_2.11" % "0.9.0.0",
    dockerSettings(Some(5005))
  )

Next you need to expose this 5005 port on the container with exampleConsumerOld:

docker-compose.yml
  example-consumer-old:
    image: example-consumer-old
    ports:
      - "8082:8082"
      - "5005:5005"
    tty: true
    depends_on:
      - kafka

Then you can set up a remote debug configuration in IntelliJ. Put a few breakpoints in to check that it works.

Summary

This simple example shows how to use docker-compose and how to build a multi-container Docker environment using an sbt plugin for Docker. Docker principles are relatively easy to understand, so it's worth using it even for quick prototyping like in the example above. Also, starting many completely separate modules with one simple command, with the guarantee that they will behave the same no matter which environment they run in, is a paradise for a developer. I hope you've found something useful here, and don't hesitate to leave a comment if you found something wrong or unclear in this post. Thanks for reading.
