Messing around with Babashka

Ian Muge
10 min read · Nov 29, 2022


(Image source: babashka.org)

Premise

Over the years, CircleCI (Circle) has done an exceptional job with running tests: parallelization, visualization and execution in general. But as GitHub Actions (GA) becomes more prevalent, we decided to play around with it a bit, and noticed that some things might not translate across so easily; a typical example is test parallelization and splitting. Parallelization in CircleCI is enabled with a single directive, parallelism: {int}, where {int} is the number of parallel workers to spawn, while in GA you need to implement it as a matrix job, for example:

test:
  runs-on: ubuntu-22.04
  name: "Runner #${{ matrix.ci_node_index }}: Run test suite in parallel"
  strategy:
    fail-fast: false
    matrix:
      ci_node_total: [3]
      ci_node_index: [0, 1, 2]
  steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Test
      run: |
        echo "a test task";

So now that we have multiple workers, how do we split the tests, assign them to the right worker and collate the results afterwards for use in subsequent executions? Let’s dive in and break it down, Barney style.

Dive-in

As a baseline, in Circle this is more or less straightforward, as they have a range of built-in test helper functions to ease test discovery and splitting. We use:

circleci tests glob "test/**/*.clj"

for test discovery and simply use:

circleci tests split --split-by=timings --timings-type=classname

to split the tests based on previous JUnit classname-based test timings, and they perform all the underlying magic for you. But what about GA? Well, we have to perform some of this magic ourselves. There are some well-documented test-splitting GA actions on the marketplace and a few methods, but only one stood out: https://github.com/chaosaffe/split-tests, and even that is based on a Circle splitting script. So, given all that, we decided to build our own.

At Ardoq, we are very much a Clojure shop, and I hadn’t had any exposure to Clojure prior to joining; all props go to our backend team that spurred this on. As time goes on it seems I just might be becoming a convert, but I shall not accept that, just not yet. So it only seemed rational to do it in Clojure, and in came Babashka. Babashka describes itself as a Clojure scripting runtime: no JVM to boot up and nothing as heavy. Perfect for this use case.

Our goals at the end of all this were to:

  • Read and parse previous test results
  • Parse and determine which files should be considered as containing tests and which are just utility files
  • Split the tests based on previous test run times and pack them into their respective workers, in a reasonable and efficient manner

What the ancients called a clever fighter is one who not only wins, but excels in winning with ease — Sun Tzu

That encompasses most of our approach to the problem: we were aiming for ease of implementation, speed and reduced complexity. So how did that look? Well…

Babashka needs a bb.edn file describing our tasks. Of note, we define the project root as the Babashka search path; this ensures we can require the actual script and define it as a valid Clojure namespace, despite it not living in a project source directory.

./bb.edn

{:paths          ["."]
 :min-bb-version "1.0.165"
 :tasks
 {:requires ([ci.test-helper :as cth])
  split {:doc  "Split tests based on timings"
         :task (cth/run-split *command-line-args*)}}}

We define what would be considered valid CLI options, and we also define some parse functions to ensure we get the options in a format we can immediately use, such as parsing to strings or numbers:

(def cli-options
  ;; Options with a required argument
  [["-w" "--workers WORKERS" "Number of Workers/Runners"
    :default 30
    :parse-fn parse-long]
   ["-i" "--index INDEX" "Runner Index"
    :parse-fn parse-long]
   ["-b" "--test-results-base-path TEST_RESULTS_BASE_PATH" "Test results base path"
    :default "./target/test2junit/"
    :parse-fn str]
   ["-p" "--test-results-pattern TEST_RESULTS_PATTERN" "Test results pattern"
    :default "**/*.xml"
    :parse-fn str]
   ["-t" "--tests-path TESTS_PATH" "Tests path"
    :default "test/"
    :parse-fn str]
   ["-h" "--help"]
   ;; no short flag here, since "-p" is already taken by --test-results-pattern
   [nil "--plan" "Execution plan"
    :default false]])
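
As a quick illustration (the argument values here are made up), clojure.tools.cli folds these specs and the raw arguments into an options map with the defaults filled in, roughly like this:

;; Hypothetical invocation, sketching the shape of the parsed result:
(cli/parse-opts ["--workers" "3" "--index" "1"] cli-options)
;; => {:options   {:workers 3
;;                 :index 1
;;                 :test-results-base-path "./target/test2junit/"
;;                 :test-results-pattern "**/*.xml"
;;                 :tests-path "test/"
;;                 :plan false}
;;     :arguments []
;;     :summary   "..."
;;     :errors    nil}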

We add some helper functions to read files, parse the previous test result files and identify actual test files. In the case of the test files, since this is Clojure, we look for any occurrence of deftest. This is a rudimentary check: what happens when deftest appears in a comment in a utility file? The beauty of it is that this doesn’t add any significant overhead to the test execution time calculation; we treat the file as testable, but its execution is in the milliseconds range, so it does not skew the calculations.

;; This is to ensure we parse the test times correctly in spite of execution locale
(defn parse-to-double
  "Decimal separator varies depending on system locale; string replacement standardizes this"
  [string]
  (-> string
      (str/replace #"," ".")
      parse-double))

;; The glob function hands us java.nio.file.Path objects, so we stringify before slurping
(defn read-file [file]
  (-> file
      str
      slurp))

;; Parse a test file and determine whether it actually contains tests
(defn parse-test-files [file test-results]
  (let [file-content (read-file file)]
    (when (str/includes? file-content "deftest")
      {:filename   (str file)
       :line-count (count (str/split-lines file-content))
       :test-time  ((keyword (filename-to-classname file)) test-results 1)})))
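
For illustration, with a hypothetical test file and a previous timing of 21.3709 seconds recorded for its namespace, the parsed entry would look something like this:

;; Hypothetical file and timings, sketching the shape of the result
;; (assuming the file exists and contains a deftest):
(parse-test-files "test/org/class_name.clj" {:org.class-name 21.3709})
;; => {:filename "test/org/class_name.clj"
;;     :line-count 120        ;; whatever the file's actual line count is
;;     :test-time 21.3709}
;; With no previous results at all, :test-time falls back to the default of 1.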

A typical JUnit result file is similar to what we have below; we parse it and extract just the class name and execution time:

<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="org.class-name" errors="0" failures="0" tests="2" time="21.3709" timestamp="2022-11-15_18:31:22+0000">
  <testcase name="test-1" classname="org.class-name" time="0.0132">
  </testcase>
  <testcase name="test-2" classname="org.class-name" time="21.3577">
  </testcase>
  <system-err>
  </system-err>
  <system-out>
  </system-out>
  <properties>
    ... way too many properties ...
  </properties>
</testsuite>

Seems like overkill, but that is all that’s needed for our use case. We add some sugar to accommodate instances where the test results are incomplete, malformed or simply do not exist. We default the timing to 1 so that, when we do not have any previous tests or points of reference, we spread the tests out equally across all the runners to set a baseline. You may ask: why not approximate the runtime based on some other metric instead of using defaults? Well, we did try to find an accurate way to do that, but determined that line count isn’t foolproof, though it is a good starting point. It is not easy to properly approximate the complexity and runtime of a test.

;; If the result file was not properly constructed, populate the attributes with default values
(defn try-parse-xml [xml-content filename]
  (try
    (parse-str xml-content)
    (catch Exception _
      {:tag     :testsuite
       :attrs   {:name      (filename-to-classname (str filename))
                 :errors    1
                 :failures  1
                 :tests     0
                 ;; keep the default time as a string so it survives parse-to-double
                 :time      "1"
                 :timestamp 0}
       :content []})))

;; Parse the result file and extract the test class name and the timing
(defn parse-test-results [file]
  (-> (read-file file)
      (try-parse-xml file)
      :attrs
      ((juxt :name :time))))
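
Running that against the sample JUnit file above yields a name/time pair, and run-split (further down) folds all such pairs into the lookup map that parse-test-files consumes; roughly:

;; For the sample result file shown earlier (path is illustrative):
(parse-test-results "target/test2junit/xml/org.class-name.xml")
;; => ["org.class-name" "21.3709"]

;; run-split then builds the lookup map used by parse-test-files:
(-> {"org.class-name" "21.3709"}
    (update-vals parse-to-double)
    (update-keys keyword))
;; => {:org.class-name 21.3709}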

Now for the main act: assigning a number of timed tests to a limited number of runners is a “multiway number partitioning” problem. Wikipedia describes it as the “problem of partitioning a multiset of numbers into a fixed number of subsets, such that the sums of the subsets are as similar as possible”, or, simply put, trying to assign the tests/tasks to runners so as to attain roughly similar total test timings per runner. There are several objectives commonly used when attempting to solve the problem:

  • Minimizing the largest sum
  • Maximizing the smallest sum
  • Maximizing the sum of products

As set out before, we were aiming for ease of execution, so we opted for greedy number partitioning, which is a subset of the “minimizing the largest sum” solutions. We sort the tests in descending order of their previous timings and repeatedly assign the current test to the runner (bin) with the smallest total size: look for the smallest bin, assign a task to it, find the next smallest bin and repeat until all tasks have been assigned. Because the tests are ordered by timing, the significantly longer tests land in the first couple of runners and the rest are spread out across the others.

;; Appending an item to a bin
(defn add-to-bin [bin [_ size :as item]]
  {:size  (+ (:size bin) size)
   :items (conj (:items bin) item)})

;; Finding the index of the bin with the smallest size
(defn select-smallest-bin [bins]
  (first (apply min-key
                (fn [[_ bin]] (:size bin))
                (map-indexed vector bins))))

(defn pack
  "Simple first fit packing algorithm."
  [items no-fit-fn]
  (let [init-bins [{:size 0 :items []}]]
    (reduce no-fit-fn init-bins items)))

(defn grow-bins [bins item]
  (conj bins (add-to-bin {:size 0 :items []} item)))

;; Searching for and adding to the smallest bin
(defn add-to-smallest-bin [n bins item]
  (if (< (count bins) n)
    (grow-bins bins item)
    (update-in bins [(select-smallest-bin bins)]
               #(add-to-bin % item))))

(defn pack-n-bins
  [items n]
  (pack items (partial add-to-smallest-bin n)))
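
As a small worked example (classnames and timings are made up), packing four timed tests into two runners gives the long-running test its own bin and groups the shorter ones together:

;; Hypothetical, pre-sorted [classname timing] pairs, split across 2 workers:
(pack-n-bins [["org.slow-test" 21.4]
              ["org.db-test"    5.0]
              ["org.api-test"   3.0]
              ["org.util-test"  1.0]]
             2)
;; => [{:size 9.0,  :items [["org.db-test" 5.0] ["org.api-test" 3.0] ["org.util-test" 1.0]]}
;;     {:size 21.4, :items [["org.slow-test" 21.4]]}]

run-split, further down, then picks each runner’s own bin by its matrix index.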

For a bit more reading on this and closely related problems, see: https://en.wikipedia.org/wiki/Multiway_number_partitioning

And finally, the full collated code:

./ci/test_helper.clj

#!/usr/bin/env bb

(ns ci.test-helper
  (:require
   [babashka.fs :as fs]
   [clojure.data.xml :refer :all]
   [clojure.java.shell :refer :all]
   [clojure.string :as str]
   [clojure.tools.cli :as cli]
   [clojure.repl :as repl]))

(def cli-options
  ;; Options with a required argument
  [["-w" "--workers WORKERS" "Number of Workers/Runners"
    :default 30
    :parse-fn parse-long]
   ["-i" "--index INDEX" "Runner Index"
    :parse-fn parse-long]
   ["-b" "--test-results-base-path TEST_RESULTS_BASE_PATH" "Test results base path"
    :default "./target/test2junit/"
    :parse-fn str]
   ["-p" "--test-results-pattern TEST_RESULTS_PATTERN" "Test results pattern"
    :default "**/*.xml"
    :parse-fn str]
   ["-t" "--tests-path TESTS_PATH" "Tests path"
    :default "test/"
    :parse-fn str]
   ["-h" "--help"]
   ;; no short flag here, since "-p" is already taken by --test-results-pattern
   [nil "--plan" "Execution plan"
    :default false]])

(defn filename-to-classname [filename]
  (-> filename
      (str/replace #"^target/test2junit/xml/(.+?)\.xml$" "$1")
      (str/replace #"^test/(.+?)\.clj$" "$1")
      (str/replace #"/" ".")
      repl/demunge))
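;; A couple of hypothetical examples of the transformation above:
;;   (filename-to-classname "test/org/class_name.clj")
;;   ;; => "org.class-name"
;;   (filename-to-classname "target/test2junit/xml/org.class-name.xml")
;;   ;; => "org.class-name"
;; i.e. both a test source file and its JUnit result file map to the same classname.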

(defn add-to-bin [bin [_ size :as item]]
  {:size  (+ (:size bin) size)
   :items (conj (:items bin) item)})

(defn select-smallest-bin [bins]
  (first (apply min-key
                (fn [[_ bin]] (:size bin))
                (map-indexed vector bins))))

(defn pack
  "Simple first fit packing algorithm."
  [items no-fit-fn]
  (let [init-bins [{:size 0 :items []}]]
    (reduce no-fit-fn init-bins items)))

(defn grow-bins [bins item]
  (conj bins (add-to-bin {:size 0 :items []} item)))

(defn add-to-smallest-bin [n bins item]
  (if (< (count bins) n)
    (grow-bins bins item)
    (update-in bins [(select-smallest-bin bins)]
               #(add-to-bin % item))))

(defn pack-n-bins
  [items n]
  (pack items (partial add-to-smallest-bin n)))

(defn parse-to-double
  "Decimal separator varies depending on system locale; string replacement standardizes this"
  [string]
  (-> string
      (str/replace #"," ".")
      parse-double))

(defn try-parse-xml [xml-content filename]
  (try
    (parse-str xml-content)
    (catch Exception _
      {:tag     :testsuite
       :attrs   {:name      (filename-to-classname (str filename))
                 :errors    1
                 :failures  1
                 :tests     0
                 ;; keep the default time as a string so it survives parse-to-double
                 :time      "1"
                 :timestamp 0}
       :content []})))

(defn read-file [file]
  (-> file
      str
      slurp))

(defn parse-test-results [file]
  (-> (read-file file)
      (try-parse-xml file)
      :attrs
      ((juxt :name :time))))

(defn parse-test-files [file test-results]
  (let [file-content (read-file file)]
    (when (str/includes? file-content "deftest")
      {:filename   (str file)
       :line-count (count (str/split-lines file-content))
       :test-time  ((keyword (filename-to-classname file)) test-results 1)})))

(defn run-split [arglist]
  (if-let [errors (:errors (cli/parse-opts arglist cli-options))]
    (print errors)
    (let [options (:options (cli/parse-opts arglist cli-options))
          {:keys [workers index tests-path test-results-base-path test-results-pattern]} options
          ;; Build the classname -> timing lookup from any previous JUnit result files
          test-results (when (fs/exists? test-results-base-path)
                         (->> (fs/glob test-results-base-path test-results-pattern {:recursive true})
                              (map parse-test-results)
                              (into {})
                              (#(update-vals % parse-to-double))
                              (#(update-keys % keyword))))
          ;; Discover test files and sort them by previous timing, longest first
          aggregated-results (->> (fs/glob tests-path "**/*.clj" {:recursive true})
                                  (keep #(parse-test-files % test-results))
                                  (sort-by :test-time #(compare %2 %1)))
          ;; Pack [classname timing] pairs into one bin per worker
          items (pack-n-bins
                 (map vector
                      (map (comp filename-to-classname :filename) aggregated-results)
                      (map :test-time aggregated-results))
                 workers)
          ;; Pick the bin belonging to this runner
          runner-spec (-> items
                          (get index))
          runner-items (->> runner-spec
                            (:items)
                            (into {})
                            keys)]
      (if (:plan options)
        (do
          (println (str "Number of workers: " workers))
          (println (str "Runner Index: " index))
          (println (str "Estimated runtime: " (:size runner-spec)))
          (println (str "Planned classes: " runner-items)))
        ;; Default output: space-separated classnames for this runner
        (doseq [item runner-items]
          (print (str item " ")))))))

And after all that, within our GA matrix job:

bb split --workers ${{ matrix.ci_node_total }} --index ${{ matrix.ci_node_index }} --plan
bb split --workers ${{ matrix.ci_node_total }} --index ${{ matrix.ci_node_index }} | xargs lein test2junit

TL;DR

Clojure isn’t half bad: maybe a few too many parentheses, but the more you use it, the more you appreciate it. Babashka is pretty good and efficient at what it does, too. As time goes by, it seems we are embracing the whole “we wish to be lazy and use a pre-made solution that someone else has built, as long as it’s secure” mindset; but as soon as we see something lacking or amiss, we raise a PR to improve it if we can, and otherwise we build our own, “inspired” by a previous or original solution (always with credit, where credit is due).


That being said, this is a seemingly simple problem for a human to solve intuitively, but that approach doesn’t exactly scale. Admittedly, ours is not the optimal solution either, but at the scale of a few thousand tasks and hundreds of bins, it performs pretty well.

Next challenges and improvements for this:

  • Approximate proper timing defaults based on test file complexity and line count

But at the end of the day, this is what we went with: it performed pretty well and there was some learning to be had.

All code available here: https://gist.github.com/ianmuge/bfc1e036de7294ad7cd55329268b4ce8

References

https://github.com/jsofra/bin-packing

https://en.wikipedia.org/wiki/Multiway_number_partitioning

https://github.com/chaosaffe/split-tests

https://babashka.org/

#StayLazy
