How we build and operate the Keboola data platform
Continuous Integration Martin Vaško 8 min read

In-Depth Look at Go Build Cache

How the Go build cache works and how you can troubleshoot when tests aren’t being cached as expected

When I started my journey with Keboola, my first task was to speed up the build and testing process for our Go-based project in CI. Initially, I assumed I could simply search for a solution online, copy it, and then be done. I found an article that provided a condensed overview of how to integrate build caching with CI. While helpful, it didn't address all our challenges, especially since our codebase is large and complex. About 10% of our tests weren’t being cached, and we needed to figure out why.

This article aims to explain in-depth how the Go build cache works and how you can troubleshoot when tests aren’t being cached as expected 😀.

Understanding Go Build Cache

The Go build cache not only stores build files, but also caches other elements like fuzz output and test results. To understand how the Go test cache works, I reviewed the official documentation. Here are two key points to know:

  1. Not all flags allow caching: For example, the -coverprofile flag prevents tests from being cached, so check the flags you're using.
  2. Debugging Cache Misses: Three GODEBUG settings are key for debugging cache issues:
    1. GODEBUG=gocachetest=1: Prints the go command's test caching decisions, including the reason for a cache miss.
    2. GODEBUG=gocacheverify=1: Bypasses existing cache entries, rebuilds everything, and checks that the results match what is stored in the cache.
    3. GODEBUG=gocachehash=1: Prints the inputs of every content hash used to construct cache lookup keys.

In this article, we'll focus mainly on understanding gocachehash=1, which is often overlooked in forums, blogs, documentation, or pull requests in the Go GitHub repository.

Debugging Go Build Cache in CI (Naive Approach)

In our CI workflows (using GitHub Actions), we use various operating systems. Knowing where the build folder is stored by default is essential when compiling/testing our Go program.

  1. Linux: ~/.cache/go-build
  2. macOS: ~/Library/Caches/go-build
  3. Windows: ~\AppData\Local\go-build

You can customize the cache location using the GOCACHE environment variable. In our case, we moved the cache to a faster drive on Windows runners to speed up the process. 

We discovered that the cache is typically stored on the C: drive, which is slower than the D: drive where the checkout code is usually located. Moving the cache to the D: drive resulted in a speedup of about 10–15%.

Here is an example of how we set up caching in our CI workflow:

- name: Checkout code
  uses: actions/checkout@v4
- name: Install Go
  uses: actions/setup-go@v5
  with:
    go-version: ${{ env.GO_VERSION }}
    cache: false
- name: Load Go cache
  id: load-go-cache
  uses: actions/cache@v4
  with:
    path: |
      ~/.cache/go-build
      ~/Library/Caches/go-build
      ~\AppData\Local\go-build
    key:  ${{ runner.os }}-go-1.22.2-v1-unit-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      ${{ runner.os }}-go-1.22.2-v1-unit-
- name: Run tests
  run: go test ./...

We list multiple paths because our runners use different operating systems; actions/cache@v4 only caches folders that already exist, so the paths that do not apply to a runner are skipped. We store the cache under a specific runner.os to restore it from a matching cache entry. Including hashFiles('**/go.sum') in the key keeps us from reusing a single cache entry indefinitely: when you change the libraries, dependencies, or the Go version in the key, a new cache entry is created. If no entry matches the full key, the action attempts to restore an entry that matches a prefix listed under the restore-keys section. This way, we can still restore a cache built with older dependencies when one is available.

With this approach, we achieved a cache hit rate of about 90%, which covers most of the project. But someone might ask, “Why isn't it 100%? It works perfectly on the local machine!” So, you dig deeper into the CI setup to figure out why it's not caching at 100%, even though it works that way in a local environment.

Investigating Cache Misses

To understand why the hit ratio wasn't 100%, we needed to address two things: 1) how the build cache stores test results and 2) how the CI process works.

Build Cache Inspection on Local

When running go test, if test results aren't cached and it's not due to flag mismatches (as mentioned earlier), you can inspect why by running:

GODEBUG=gocachehash=1 go test ./... 2> test-output

This command prints the inputs of every hash computed during the test run, so the output is large. To reduce it, use the -run option to target specific tests.

I will focus only on the key parts. Go conveniently compiles the something_test.go files first, followed by the compilation of the test and its dependencies into a test binary, which is then executed with the specified flags. In the output, the first section highlights the compilation process and the packages used. Next, the section labeled [testInputs] lists the various sources of input required to run the test.

Important Sections to Check:

  1. Compilation: Files needed to compile the tests (not part of the debug information).
  2. Test Inputs: Includes syscall details such as:
    1. Files not changed during the test (stat)
    2. Directory changes (chdir)
    3. Environment variables used (getenv)
    4. Files opened during test execution (open)

A test typically imports other packages, and go test checks the modification time (modtime) of the package files. The go test command compares a hash of the file path and modification time with the cached test entries. Cached entries are usually stored in ~/.cache/go-build, which lets go test determine whether a result can be reused from the cache or not. Here's a simple example (files: a.go, b.go, a_test.go, myfile.txt):

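package a

import (
	"os"
	"testing"

	"github.com/Matovidlo/test/b"
)

func TestA(t *testing.T) {
	b.Something()         // depends on package b
	os.Open("myfile.txt") // recorded as an open test input
	os.Getenv("TEST_ENV") // recorded as a getenv test input
}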

If you modify the b package, the tests are no longer cached. The b package must be recompiled, a_test.go relinked, and the test executed. Similarly, if you change the modification time of myfile.txt (indicating the file was edited), the same workflow applies. Finally, changing the TEST_ENV variable to a different value will also invalidate the cache (as the environment variable is used in the test). To identify the dependencies of your test file, check the build/test cache.

To investigate why a specific test wasn't cached, locate the relevant a_test.go test hash. It can typically be found on the line where the HASH subkey is generated in the output of the GODEBUG=gocachehash=1 go test ./... 2> test-output command.

HASH[testInputs]: 69f1fb999e7878ab0fc1683496d49a5fdb2c1e189d0c42e3f6baa0c891d348d8
HASH subkey fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c "inputs:69f1fb999e7878ab0fc1683496d49a5fdb2c1e189d0c42e3f6baa0c891d348d8" = 0de1f5d74d957332923a83e863949214b23a3d2ff52a3fd61b15678e963a4551
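Because the output file is large, it can help to pull these lines out with a small program. The following is a minimal sketch that assumes the line format is exactly the one shown above; findInputSubkeys and the field names Parent, Inputs, and Result are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// subkeyRe matches lines of the form shown above:
// HASH subkey <parent> "inputs:<inputs>" = <result>
var subkeyRe = regexp.MustCompile(`^HASH subkey ([0-9a-f]+) "inputs:([0-9a-f]+)" = ([0-9a-f]+)$`)

// inputSubkey holds the three hashes found on one "HASH subkey" line.
type inputSubkey struct {
	Parent string // first hash on the line
	Inputs string // hash after "inputs:", same value as HASH[testInputs]
	Result string // hash after "="
}

// findInputSubkeys scans gocachehash output and returns every inputs subkey line.
func findInputSubkeys(output string) []inputSubkey {
	var found []inputSubkey
	for _, line := range strings.Split(output, "\n") {
		m := subkeyRe.FindStringSubmatch(strings.TrimSpace(line))
		if m != nil {
			found = append(found, inputSubkey{Parent: m[1], Inputs: m[2], Result: m[3]})
		}
	}
	return found
}

func main() {
	sample := `HASH[testInputs]: 69f1fb999e7878ab0fc1683496d49a5fdb2c1e189d0c42e3f6baa0c891d348d8
HASH subkey fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c "inputs:69f1fb999e7878ab0fc1683496d49a5fdb2c1e189d0c42e3f6baa0c891d348d8" = 0de1f5d74d957332923a83e863949214b23a3d2ff52a3fd61b15678e963a4551`
	for _, k := range findInputSubkeys(sample) {
		fmt.Printf("parent=%s\ninputs=%s\nresult=%s\n", k.Parent, k.Inputs, k.Result)
	}
}
```

In practice, the sample string would be replaced with the contents of the test-output file.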

This test hash appears multiple times within the file; however, the first occurrence in the [testResult] section indicates where the entry is located in the build cache. Take the hash ID and use it to inspect the corresponding file: less ~/.cache/go-build/fa/fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c-a.

The file path is constructed from the first two characters of the hash fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c, which name a subfolder, followed by the hash itself. Organizing entries into two-character subfolders makes a given hash easier to find. The -a suffix marks an action entry, a small text record that maps the hash to the hash of the stored output. The output of the command looks like this:

v1 
fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c 
e7180aa367193fcc1d8f7e85d20c7ccca04856897a4c87b3632e315695dabfd5 
43  1717321489418793374

Next, take the hash on the third line (the one that follows our input hash) and look it up in the build cache with the following command:

less ~/.cache/go-build/e7/e7180aa367193fcc1d8f7e85d20c7ccca04856897a4c87b3632e315695dabfd5-d

Here, the -d suffix marks the data file, which holds the stored output for that entry. For our test entry, this is the log of inputs the test used.

If the output is in binary format, it indicates that you retrieved the hash of the compiled test file or dependencies, rather than the test hash.

Here we can easily identify the test dependencies. With each execution, the test checks the open syscall of myfile.txt. The purpose of this check is to ensure that the file has not changed since the previous run. A hash is created from the modtime to capture this state. The environment variable is simpler to understand—it's verified to remain the same. If either the file or the environment variable changes, the cache is invalidated, and the test is re-executed. This is how you can debug the test cache, which is likely what our CI is missing. Some of these dependencies can change with every execution.

How CI Works

To understand why the cache hit ratio isn't 100%, we need to first look at how CI operates:

  1. Each CI run is executed in a fresh Docker image with new volumes, etc. Prerequisites must be injected before execution.
  2. When pulling or checking out files from the repository, we use git clone, which sets the modtime of each file to the time of the clone, not the last commit. This leads CI to treat the files as new.
  3. After the CI run, the files are discarded. When the CI is executed again, it starts from step 1, repeating the process.

How to Incorporate CI/CD to Cache Various Dependencies?

To achieve a 100% cache hit ratio, we need to ensure that files and paths used within tests—particularly those cloned as test inputs—have a static modtime so the test cache assumes that they have not changed.

  • This is feasible only when we know that the files used in functions like os.Open or other file operations are static and not generated dynamically during test execution, as dynamically generated files typically serve as dependencies for other tests.
  • When copying files from one location to another using filepath.Walk, the destination folder will always cause a cache miss, because the modtime of the copied files changes with each execution. To prevent this, the modtime must be set programmatically, as this cannot be handled through a standard step in GitHub Actions. See the next section for further details.
  • Additionally, changing the directory to a dynamic value will always result in a cache miss (due to the directory change).

Temporary Files

When files are copied or created during go test execution, one rule applies: the file must be at least two seconds old, or the test result will not be cached. This issue can be detected using GODEBUG=gocachetest=1, where you'll see a message indicating that the file used as input is too new.

To resolve this, simply adjust the file's modtime:

  // Use a fixed, old modtime to prevent the "file used as input is too new" error in go test caching.
  tm := time.Unix(1, 0).Local()
  err = os.Chtimes(t.workingDirFS.BasePath()+"/actual-state.json", tm, tm)

You can find further details about this behavior in the Go source code.

Let’s Write the Full Workflow

- name: Checkout code
  uses: actions/checkout@v4
- name: Install Go
  uses: actions/setup-go@v5
  with:
    go-version: ${{ env.GO_VERSION }}
    cache: false
- name: Load Go cache
  id: load-go-cache
  uses: actions/cache@v4
  with:
    path: |
      ~/.cache/go-build
      ~/Library/Caches/go-build
      ~\AppData\Local\go-build
    key: ${{ runner.os }}-go-1.22.2-v1-unit-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      ${{ runner.os }}-go-1.22.2-v1-unit-
- name: Change modtime of files to 1 unix timestamp
  shell: bash
  run: |
    # .out is our directory where we store test outputs that are copied with each execution.
    find . -not \( -path "**/.out*" -or -path "**/.git/**" \) \( -type f -or -type d \) -print0 | xargs -0 touch -d '1970-01-01T00:00:01'
- name: Run tests
  run: go test ./...

The workflow begins by checking out the branch and installing Go. It then attempts to load the cache: on a cache hit the stored data is restored; otherwise, the run starts with an empty cache. We then set the modtime of all files to Unix timestamp 1, so that the hashes of the files used by the tests stay the same across runs. When the tests are executed again, Go sees the same modtime for those files and treats them as unchanged, as long as the other test inputs stay the same.

The actions/cache step also runs a post step at the end of the job, where the cache is saved. In subsequent CI executions, this should result in a cache hit, providing the cached results for your go test ./... command.

Is it as simple as this example? Yes, it is—but not every test and test case will work perfectly. There are many factors that can prevent a test from being cached. It's up to you to dig through the test cache using the guidance from this article. Look for files, environment variables, or other elements that could be causing cache misses. Remember, this is an evolving process, so expect things to change over time.

Additionally, there are common issues others have faced with the test cache, and there's an open issue regarding cache misses that you may find useful.

Summary

In this article, I demonstrated what the Go test cache looks like and how to inspect files to help debug caching issues in CI. We also updated the CI pipeline to use the test cache more effectively for improved performance. While this isn't a 100% solution for everyone, it has worked well for our tests, speeding up around 60–70% of our cached workflows.

The information needed to create this article was originally found in this helpful video, and I believe it remains useful, even after three years of ongoing development in Go testing and caching.
