Issue #16003: Add Kafka to no-error testing #18263

Merged: romani merged 1 commit into checkstyle:master from stoyanK7:16003 on Dec 11, 2025

Conversation

stoyanK7 (Contributor) commented Dec 9, 2025

Resolves issue

Based on the work of:


Added GitHub Actions Workflow for regression testing with Apache Kafka: no-error-kafka. The job is identical to the other no-error-* jobs. For instance:

no-error-hazelcast)
CS_POM_VERSION="$(getCheckstylePomVersion)"
echo "CS_version: ${CS_POM_VERSION}"
./mvnw -e --no-transfer-progress clean install -Pno-validations
echo "Checkout Hazelcast sources..."
checkout_from "https://github.com/hazelcast/hazelcast.git"
cd .ci-temp/hazelcast
mvn -e --no-transfer-progress checkstyle:check \
-Dcheckstyle.version="${CS_POM_VERSION}"
cd ..
removeFolderWithProtectedFiles hazelcast
;;

Usually, such jobs use Semaphore and CircleCI, but this one must use GitHub Actions due to high memory usage.

stoyanK7 changed the title from "Issue #16003: Add no-error-kafka CI job" to "Issue #16003: Add Kafka to no-error testing" on Dec 9, 2025
stoyanK7 (Contributor, Author) commented Dec 9, 2025

The daemon indeed runs out of memory on CircleCI runners:

https://app.circleci.com/pipelines/github/checkstyle/checkstyle/38501/workflows/3fb84575-1dd8-4229-8687-5bedd16ebe49/jobs/1164442?invite=true#step-103-31513_90

The message received from the daemon indicates that the daemon has disappeared.
Build request sent: Build{id=01b4c81f-6240-4469-87d6-adeb5c33548d, currentDir=/home/circleci/project/.ci-temp/kafka}
Attempting to read last messages from the daemon log...
Daemon pid: 638
  log file: /home/circleci/.gradle/daemon/9.2.1/daemon-638.out.log
----- Last 20 lines from daemon log file - daemon-638.out.log -----
2025-12-09T10:00:10.006+0000 [DEBUG] [org.gradle.launcher.daemon.server.DaemonStateCoordinator] resetting idle timer
2025-12-09T10:00:10.007+0000 [DEBUG] [org.gradle.launcher.daemon.server.DaemonStateCoordinator] daemon is running. Sleeping until state changes.
2025-12-09T10:00:10.008+0000 [INFO] [org.gradle.launcher.daemon.server.exec.StartBuildOrRespondWithBusy] Daemon is about to start building Build{id=01b4c81f-6240-4469-87d6-adeb5c33548d, currentDir=/home/circleci/project/.ci-temp/kafka}. Dispatching build started information...
2025-12-09T10:00:10.008+0000 [DEBUG] [org.gradle.launcher.daemon.server.SynchronizedDispatchConnection] thread 27: dispatching org.gradle.launcher.daemon.protocol.BuildStarted@738e25fe
2025-12-09T10:00:10.010+0000 [DEBUG] [org.gradle.launcher.daemon.server.exec.EstablishBuildEnvironment] Configuring env variables: [PATH, container, MILL_VERSION, NO_PROXY, CIRCLE_WORKFLOW_WORKSPACE_ID, SBT_VERSION, CIRCLE_PR_USERNAME, CIRCLE_PULL_REQUEST, CIRCLE_PROJECT_REPONAME, CIRCLE_OIDC_TOKEN_V2, CIRCLE_WORKING_DIRECTORY, CIRCLE_INTERNAL_TASK_DATA, PWD, LANGUAGE, GRADLE_VERSION, BASH_ENV, CIRCLE_BUILD_NUM, PAGER, CI_PULL_REQUEST, MAVEN_VERSION, COMPOSE_VER, CIRCLE_SHA1, OLDPWD, CIRCLE_NODE_INDEX, CIRCLE_NODE_TOTAL, DEBIAN_FRONTEND, LC_ALL, CIRCLE_SHELL_ENV, SHLVL, CIRCLE_PIPELINE_ID, COMPOSE_SWITCH_VERSION, DOCKER_VERSION, CIRCLE_PR_REPONAME, JAVA_HOME, TERM, CIRCLE_PROJECT_USERNAME, LANG, CIRCLE_BUILD_URL, CIRCLE_INTERNAL_SCRATCH, CIRCLE_BRANCH, CIRCLE_PULL_REQUESTS, CIRCLE_ORGANIZATION_ID, JAVA_VERSION, _, CIRCLECI, CIRCLE_PR_NUMBER, CIRCLE_USERNAME, CIRCLE_OIDC_TOKEN, CIRCLE_REPOSITORY_URL, CI, CIRCLE_JOB, CIRCLE_WORKFLOW_ID, SSH_AUTH_SOCK, HOSTNAME, CIRCLE_PROJECT_ID, CIRCLE_WORKFLOW_JOB_ID, HOME]
2025-12-09T10:00:10.014+0000 [DEBUG] [org.gradle.launcher.daemon.server.exec.LogToClient] About to start relaying all logs to the client via the connection.
2025-12-09T10:00:10.015+0000 [INFO] [org.gradle.launcher.daemon.server.exec.LogToClient] The client will now receive all logging from the daemon (pid: 638). The daemon log file: /home/circleci/.gradle/daemon/9.2.1/daemon-638.out.log
2025-12-09T10:00:10.037+0000 [INFO] [org.gradle.launcher.daemon.server.exec.LogAndCheckHealth] Starting build in new daemon [memory: 3.5 GiB]
2025-12-09T10:00:10.044+0000 [DEBUG] [org.gradle.launcher.daemon.server.exec.ExecuteBuild] The daemon has started executing the build.
2025-12-09T10:00:10.045+0000 [DEBUG] [org.gradle.launcher.daemon.server.exec.ExecuteBuild] Executing build with daemon context: DefaultDaemonContext[uid=0ee2360c-3c97-45b4-bdd2-6cc4fe0573cf,javaHome=/usr/local/jdk-25.0,javaVersion=25,javaVendor=Eclipse Adoptium,daemonRegistryDir=/home/circleci/.gradle/daemon,pid=638,idleTimeout=10800000,priority=NORMAL,applyInstrumentationAgent=true,nativeServicesMode=ENABLED,daemonOpts=-Xss4m,-XX:+UseParallelGC,-Xmx4g,-Dfile.encoding=UTF-8,-Duser.country=US,-Duser.language=en,-Duser.variant]
Starting build with version 4.3.0-SNAPSHOT (commit id 8ff7addc) using Gradle 9.2.1, Java 25 and Scala 2.13.17
Build properties: ignoreFailures=false, maxParallelForks=2, maxScalacThreads=2, maxTestRetries=0
MessageGenerator: processed 26 Kafka message JSON file(s).
MessageGenerator: processed 197 Kafka message JSON file(s).
MessageGenerator: processed 1 Kafka message JSON file(s).
MessageGenerator: processed 5 Kafka message JSON file(s).
MessageGenerator: processed 44 Kafka message JSON file(s).
MessageGenerator: processed 4 Kafka message JSON file(s).
MessageGenerator: processed 2 Kafka message JSON file(s).
MessageGenerator: processed 1 Kafka message JSON file(s).----- End of the daemon log -----


FAILURE: Build failed with an exception.

* What went wrong:
Gradle build daemon disappeared unexpectedly (it may have been killed or may have crashed)

I executed the same CircleCI job successfully locally and noticed that the memory usage gradually climbs to ~10 GB (sometimes more):
Frame 1
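
A local measurement of this kind could be taken by periodically summing the resident memory of all Java processes (Gradle client, daemon, and any forked workers). This is a minimal sketch, assuming a Linux machine with procps `ps`; the function names `sum_java_rss_mib` and `sample_java_memory` are hypothetical, and this is not necessarily how the graph in this comment was produced.

```shell
# Read lines of "<rss_kib> <command>" on stdin and print the total RSS
# of all processes named "java", in whole MiB.
sum_java_rss_mib() {
  awk '$2 == "java" { total += $1 } END { printf "%d\n", total / 1024 }'
}

# Print a timestamped total every N seconds (default 5) until interrupted.
# Separate -o options are used so that each column gets an empty header.
sample_java_memory() {
  interval="${1:-5}"
  while true; do
    printf '%s %s MiB\n' "$(date +%T)" "$(ps -A -o rss= -o comm= | sum_java_rss_mib)"
    sleep "$interval"
  done
}
```

Running `sample_java_memory 5` in a second terminal while the Gradle build is in progress gives a rough memory-over-time trace. RSS totals can overstate usage slightly when processes share pages.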

stoyanK7 (Contributor, Author) commented Dec 9, 2025

I notice a lot of compileJava, compileScala and classes tasks running. Checkstyle does not use compiled bytecode (right?), so I am going to exclude those:

> Task :generator:classes
> Task :raft:compileJava
> Task :metadata:compileJava
> Task :storage:storage-api:compileJava
> Task :storage:compileJava
> Task :coordinator-common:compileJava
> Task :core:compileScala

stoyanK7 (Contributor, Author) commented Dec 9, 2025

Looking at the task dependency graph via

./gradlew checkstyleMain checkstyleTest --task-graph

we can see that checkstyleMain and checkstyleTest require compileJava, classes, processResources, etc. to run first:

Tasks graph for: checkstyleMain checkstyleTest
+--- :clients:checkstyleMain (org.gradle.api.plugins.quality.Checkstyle)
|    +--- :clients:classes (org.gradle.api.DefaultTask)
|    |    +--- :clients:compileJava (org.gradle.api.tasks.compile.JavaCompile)
|    |    |    \--- :clients:processMessages (org.gradle.api.tasks.JavaExec)
|    |    |         \--- :generator:jar (org.gradle.api.tasks.bundling.Jar)
|    |    |              +--- :generator:classes (org.gradle.api.DefaultTask)
|    |    |              |    +--- :generator:compileJava (org.gradle.api.tasks.compile.JavaCompile)
|    |    |              |    \--- :generator:processResources (org.gradle.language.jvm.tasks.ProcessResources)
|    |    |              \--- :generator:compileJava (*)
|    |    \--- :clients:processResources (org.gradle.language.jvm.tasks.ProcessResources)
|    \--- :clients:compileJava (*)
+--- :connect:checkstyleMain (org.gradle.api.plugins.quality.Checkstyle)
|    +--- :connect:classes (org.gradle.api.DefaultTask)
|    |    +--- :connect:compileJava (org.gradle.api.tasks.compile.JavaCompile)
|    |    \--- :connect:processResources (org.gradle.language.jvm.tasks.ProcessResources)
|    \--- :connect:compileJava (*)
+--- :coordinator-common:checkstyleMain (org.gradle.api.plugins.quality.Checkstyle)
|    +--- :clients:compileJava (*)
|    +--- :coordinator-common:classes (org.gradle.api.DefaultTask)
|    |    +--- :coordinator-common:compileJava (org.gradle.api.tasks.compile.JavaCompile)

Ideally, this task graph should be flat and contain only checkstyleMain and checkstyleTest tasks (no compileJava, no classes, etc.).

stoyanK7 (Contributor, Author) commented Dec 9, 2025

I find it strange that the Checkstyle tasks depend on the compileJava tasks. Checkstyle doesn't need compiled code.

stoyanK7 (Contributor, Author) commented Dec 9, 2025

It seems like the Gradle Checkstyle plugin creates an implicit dependency on the compiled files via this

task.setClasspath(sourceSet.getOutput().plus(sourceSet.getCompileClasspath()));

Related issues:

To get rid of the dependency on the compilation tasks, we should use this (as described here)

tasks.withType<Checkstyle>().configureEach {
    classpath = files()
}

stoyanK7 (Contributor, Author) commented Dec 9, 2025

Well, removing the compilation tasks doesn't solve the memory issues:
image

stoyanK7 (Contributor, Author) commented Dec 9, 2025

Side note for later

stoyanK7 (Contributor, Author) commented Dec 9, 2025

I tried playing with the maxHeapSize and minHeapSize properties of the Checkstyle Gradle Plugin. From the docs:

Checkstyle analysis is performed in a separate process. By default, the Checkstyle process is given a max heap of 512MB. When analyzing many source files, you may need to provide additional memory to this process. You can change the amount of memory for Checkstyle by configuring the [Checkstyle.maxHeapSize](https://docs.gradle.org/current/dsl/org.gradle.api.plugins.quality.Checkstyle.html#org.gradle.api.plugins.quality.Checkstyle:maxHeapSize).

Kafka increased their maxHeapSize from the default of 512m to 1g a year ago due to memory issues:

I tried setting lower max values such as 64m, 128m and 512m, without success: 64m and 128m crash, while 512m succeeds, but the job peaks at ~12GB memory usage.
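
The heap-size experiment could be scripted roughly as follows. This is a minimal sketch, assuming a Gradle init script passed with `-I` and the `tasks.withType(Checkstyle)` pattern that appears elsewhere in this thread; the helper `write_checkstyle_init_script`, the temporary directory, and the `heap-*.gradle` file names are hypothetical.

```shell
# Write a Gradle init script that sets maxHeapSize on every Checkstyle task.
# $1: heap size (for example "128m"), $2: output file.
write_checkstyle_init_script() {
  heap="$1"
  out="$2"
  cat > "$out" <<EOF
allprojects {
    gradle.projectsEvaluated {
        tasks.withType(Checkstyle) {
            maxHeapSize = "${heap}"
        }
    }
}
EOF
}

# Generate one init script per candidate heap size in a scratch directory.
workdir="$(mktemp -d)"
for heap in 64m 128m 512m; do
  write_checkstyle_init_script "$heap" "${workdir}/heap-${heap}.gradle"
  echo "Generated ${workdir}/heap-${heap}.gradle"
done
```

Each generated file could then be tried in the Kafka checkout with `./gradlew checkstyleMain checkstyleTest -I <file>`, observing memory usage for each run.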

stoyanK7 (Contributor, Author) commented Dec 9, 2025

Side note: Apart from GitHub Actions, another option is to configure a more powerful CircleCI container. That depends, though, on the pricing plan that Checkstyle has with CircleCI: https://circleci.com/docs/reference/configuration-reference/#docker-execution-environment

image

stoyanK7 (Contributor, Author) commented Dec 9, 2025

The following approach of running all checkstyle tasks sequentially "works" on CircleCI but comes with the tradeoff of increased runtime (14 minutes 56 seconds on CI) and the daemon respawning 2 times due to lack of memory:
https://app.circleci.com/pipelines/github/checkstyle/checkstyle/38503/workflows/3829fb54-382c-45ad-b5af-448917f31d9c/jobs/1164494

no-error-kafka)
  CS_POM_VERSION="$(getCheckstylePomVersion)"
  echo "CS_version: ${CS_POM_VERSION}"
  ./mvnw -e --no-transfer-progress clean install -Pno-validations
  echo "Checkout target sources ..."
  checkout_from "https://github.com/apache/kafka.git"
  cd .ci-temp/kafka/
  cat >> customConfig.gradle<< EOF
allprojects {
    repositories {
        mavenLocal()
    }
    gradle.projectsEvaluated {
        tasks.withType(Checkstyle) {
            classpath = files()
            maxHeapSize = "512m"
        }
    }
}
EOF
  mapfile -t tasks < <(
    ./gradlew checkstyleMain checkstyleTest \
      --task-graph \
      -PcheckstyleVersion="${CS_POM_VERSION}" \
      -I customConfig.gradle \
      | grep -E 'checkstyle(Main|Test)' \
      | grep -Eo ':(.+:)+(checkstyleMain|checkstyleTest)'
  )
  for task in "${tasks[@]}"
  do
    ./gradlew "${task}" \
      -PcheckstyleVersion="${CS_POM_VERSION}" \
      -I customConfig.gradle || true
  done
  cd ..
  removeFolderWithProtectedFiles kafka
  ;;
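
To illustrate what the two `grep` stages in the script above pull out of the `--task-graph` output, here is a small self-contained sketch. The sample lines are abbreviated from the task graph posted earlier in this thread; the function name `extract_checkstyle_tasks` and the added `sort -u` (to drop repeated task paths) are assumptions, not part of the actual CI script.

```shell
# Keep only fully qualified checkstyleMain/checkstyleTest task paths,
# for example ":clients:checkstyleMain", one per line, de-duplicated.
extract_checkstyle_tasks() {
  grep -E 'checkstyle(Main|Test)' \
    | grep -Eo ':(.+:)+(checkstyleMain|checkstyleTest)' \
    | sort -u
}

# Abbreviated sample of "./gradlew checkstyleMain checkstyleTest --task-graph" output.
sample_graph="$(cat <<'EOF'
Tasks graph for: checkstyleMain checkstyleTest
+--- :clients:checkstyleMain (org.gradle.api.plugins.quality.Checkstyle)
|    +--- :clients:classes (org.gradle.api.DefaultTask)
|    \--- :clients:compileJava (*)
+--- :connect:checkstyleMain (org.gradle.api.plugins.quality.Checkstyle)
|    \--- :connect:compileJava (*)
EOF
)"

printf '%s\n' "$sample_graph" | extract_checkstyle_tasks
# prints:
# :clients:checkstyleMain
# :connect:checkstyleMain
```

The header line passes the first filter but not the second, because the second pattern requires at least two colons before the task name. Note also that the loop in the script above ends each Gradle invocation with `|| true`, so a failing Checkstyle task does not fail the job by itself.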

Here's a graph from a local run:
image

romani (Member) commented Dec 9, 2025

Checkstyle does not use compiled bytecode (right?),

yes

we can see that checkstyleMain and checkstyleTest require compileJava, classes, processResources, etc. to run first:

Checkstyle should always run after compilation, just to be sure that all target source files compile.

Side note for later

No problem, CI just needs to be configured to cache the folder.

stoyanK7 (Contributor, Author) commented Dec 9, 2025

I'm out of ideas here. Disabling parallelization (--no-parallel) doesn't help; disabling the daemon is not as straightforward as using --no-daemon. It seems Checkstyle is just memory hungry, and Kafka is a big project to analyze.

I'll go with GitHub Actions. It should (hopefully) work since the Kafka project already does that. The GitHub runners have 16 GiB of RAM, which should be plenty.

romani (Member) commented Dec 9, 2025

but the job peaks at ~12GB memory usage.

Do we have a leak in Checkstyle? It might be possible.
We use openjdk as a target project for regression testing (without compiling that project), so big projects are OK for us.
We can check whether we can reduce the set of target files (exclude some folders), or we can simply presume that Kafka sources are always compilable and skip compilation of the sources to reduce memory. If Kafka lets non-compilable code through, the outcome is the same for us: CI will be red, either due to compilation in the target folder or due to a bad Checkstyle execution. So we can skip compilation as a hack.
If a natural run through the build system is this complicated, we can run it like our usual Maven plugin regression diff report. All we care about is the same config and the same scope of files.

Apart from GitHub Actions, another option is to configure a more powerful CircleCI container.

We are on the free plan.

all checkstyle tasks sequentially "works" on CircleCI but comes with the tradeoff of increased runtime (14 minutes 56 seconds on CI)

We can increase the wait time for this special CI job, just to make it work.

romani (Member) commented Dec 9, 2025

@dejan2609, can you suggest something here?

Levski (0) CSKA (1)

stoyanK7 (Contributor, Author) commented Dec 9, 2025

@romani @dejan2609

I moved the job to GitHub Actions, and it works 🚀

However, there are some violations:

> Task :metadata:checkstyleMain
Error: eckstyle] [ERROR] /home/runner/work/checkstyle/checkstyle/.ci-temp/kafka/metadata/src/main/java/org/apache/kafka/controller/DelegationTokenControlManager.java:103:15: 'new' child has incorrect indentation level 14, expected level should be 16. [Indentation]
Error: eckstyle] [ERROR] /home/runner/work/checkstyle/checkstyle/.ci-temp/kafka/metadata/src/main/java/org/apache/kafka/controller/DelegationTokenControlManager.java:104:15: 'new' child has incorrect indentation level 14, expected level should be 16. [Indentation]
> Task :transaction-coordinator:compileJava
Error: eckstyle] [ERROR] /home/runner/work/checkstyle/checkstyle/.ci-temp/kafka/metadata/src/main/java/org/apache/kafka/controller/DelegationTokenControlManager.java:105:15: 'new' child has incorrect indentation level 14, expected level should be 16. [Indentation]

> Task :metadata:checkstyleMain
Error: eckstyle] [ERROR] /home/runner/work/checkstyle/checkstyle/.ci-temp/kafka/metadata/src/main/java/org/apache/kafka/controller/DelegationTokenControlManager.java:106:15: 'new' child has incorrect indentation level 14, expected level should be 16. [Indentation]
Error: eckstyle] [ERROR] /home/runner/work/checkstyle/checkstyle/.ci-temp/kafka/metadata/src/main/java/org/apache/kafka/controller/DelegationTokenControlManager.java:107:15: 'new' child has incorrect indentation level 14, expected level should be 16. [Indentation]

How do we proceed with that?

dejan2609 commented Dec 9, 2025

I moved the job to GitHub Actions, and it works 🚀

🎉
Mental note to all of us: CircleCI is not the way to go 🙂

@stoyanK7 Don't worry about these Checkstyle violations: you are probably compiling against the most recent Checkstyle version, while Kafka is still using Checkstyle 10.20.2 (I have started this version upgrade PR and I'm waiting for a review from the Kafka maintainers).

Hint: if you want to be 💯 percent sure just pass CS 10.20.2 version to your GA workflow 😉

dejan2609 commented

@dejan2609, can you suggest something here?

@romani @stoyanK7 I would suggest a celebration 😃
(typing with my left and reaching for a bottle of plum brandy with my right ☺️)

romani (Member) commented Dec 10, 2025

@stoyanK7, please use that branch to make sure everything works. We can even merge it; eventually that PR will be merged and we can remove the extra checkout line in the script.

stoyanK7 (Contributor, Author) commented Dec 10, 2025

@dejan2609 @romani PR is ready for review. Please poke at it 😃


Actionlint CI failure is unrelated.

dejan2609 commented

Very good 👌 (said in Russian, Bulgarian, and Serbian)

One non-binding LGTM ✅

romani (Member) left a comment

Minor naming alignment

Comment thread .github/workflows/no-error-kafka.yml Outdated
@@ -0,0 +1,27 @@
name: 'no-error-kafka'
Member

Let's name the workflow "no-error testing".
Rename the file to match our existing no-exception workflow.
To match:

Image

stoyanK7 (Contributor, Author) commented Dec 11, 2025

Done.

image

Side note: We must also move no-error-trino under this no-error testing workflow. I will do that in a follow-up PR (after this one gets merged).

romani (Member) left a comment

thanks a lot !!!!!!!

We did it!!!
It took a while, but a new partnership is signed.

romani merged commit 4cca2b7 into checkstyle:master on Dec 11, 2025
125 of 128 checks passed