In-Depth Look at Go Build Cache
How the Go build cache works and how you can troubleshoot when tests aren’t being cached as expected
When I started my journey with Keboola, my first task was to speed up the build and testing process for our Go-based project in CI. Initially, I assumed I could simply search for a solution online, copy it, and then be done. I found an article that provided a condensed overview of how to integrate build caching with CI. While helpful, it didn't address all our challenges, especially since our codebase is large and complex. About 10% of our tests weren’t being cached, and we needed to figure out why.
This article aims to explain in-depth how the Go build cache works and how you can troubleshoot when tests aren’t being cached as expected 😀.
Understanding Go Build Cache
The Go build cache not only stores build files, but also caches other elements like fuzz output and test results. To understand how the Go test cache works, I reviewed the official documentation. Here are two key points to know:
- Not all flags allow caching: For example, the -coverprofile flag prevents tests from being cached, so check the flags you're using.
- Debugging Cache Misses: Three GODEBUG settings are key for debugging cache issues:
  - GODEBUG=gocachetest=1: Prints the go command's decisions about whether to reuse a cached test result, which shows why there was a cache miss.
  - GODEBUG=gocacheverify=1: Bypasses existing cache entries, rebuilds everything, and checks that the results match what is already in the cache.
  - GODEBUG=gocachehash=1: Prints the inputs for every content hash used to construct cache lookup keys, which lets you trace a cache miss back to the input that changed.
In this article, we'll focus mainly on understanding gocachehash=1, which is often overlooked in forums, blogs, documentation, or pull requests in the Go GitHub repository.
Debugging Go Build Cache in CI (Naive Approach)
In our CI workflows (using GitHub Actions), we use various operating systems. Knowing where the build folder is stored by default is essential when compiling/testing our Go program.
- Linux: ~/.cache/go-build
- macOS: ~/Library/Caches/go-build
- Windows: ~\AppData\Local\go-build
You can customize the cache location using the GOCACHE environment variable. In our case, we moved the cache to a faster drive on Windows runners to speed up the process.
We discovered that the cache is typically stored on the C: drive, which is slower than the D: drive where the checkout code is usually located. Moving the cache to the D: drive resulted in a speedup of about 10–15%.
Here is an example of how we set up caching in our CI workflow:
- name: Checkout code
  uses: actions/checkout@v4
- name: Install Go
  uses: actions/setup-go@v5
  with:
    go-version: ${{ env.GO_VERSION }}
    cache: false
- name: Load Go cache
  id: load-go-cache
  uses: actions/cache@v4
  with:
    path: |
      ~/.cache/go-build
      ~/Library/Caches/go-build
      ~\AppData\Local\go-build
    key: ${{ runner.os }}-go-1.22.2-v1-unit-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      ${{ runner.os }}-go-1.22.2-v1-unit-
- name: Run tests
  run: go test ./...
We list the cache locations for all runner operating systems in one step; actions/cache@v4 only caches the folders that actually exist on the current runner. The key includes runner.os so that a cache is restored only from an entry created on the same operating system. It also includes hashFiles('**/go.sum'), so changing the libraries or dependencies creates a new cache entry instead of letting one entry grow indefinitely; the same happens when you change the Go version in the key. If no entry matches the key exactly, the action restores the most recent entry whose key starts with a prefix listed under restore-keys. This way, we can still reuse a cache built for older dependencies when one is available.
With this approach, we achieved a cache hit rate of about 90%, which covers most of the project. But someone might ask, “Why isn't it 100%? It works perfectly on the local machine!” So, you dig deeper into the CI setup to figure out why it's not caching at 100%, even though it works that way in a local environment.
Investigating Cache Misses
To understand why the hit ratio wasn't 100%, we needed to address two things: 1) how the build cache stores test results and 2) how the CI process works.
Build Cache Inspection on Local
When running go test, if test results aren't cached and it's not due to flag mismatches (as mentioned earlier), you can inspect why by running:
GODEBUG=gocachehash=1 go test ./... 2> test-output
This command writes a large amount of debug output to test-output, listing the inputs behind every hash computed during the test run. To reduce the output, use the -run option to target specific tests.
The output is quite large, so I will focus only on the key parts. Go conveniently compiles the something_test.go files first, followed by the compilation of the test and its dependencies into a test binary, which is then executed with the specified flags. In the output, the first section highlights the compilation process and the packages used. Next, the section labeled [testInputs] lists the various sources of input required to run the test.
Important Sections to Check:
- Compilation: Files needed to compile the tests (not part of the debug information).
- Test Inputs: Includes syscall details such as:
  - Files not changed during the test (stat)
  - Directory changes (chdir)
  - Environment variables used (getenv)
  - Files opened during test execution (open)
A test typically imports other packages, and the go command checks the modification time (modtime) of the package files. The go test command compares a hash of the file path and modification time with cached test entries. Cached entries are usually stored in ~/.cache/go-build, which lets go test determine whether a result can be reported as (cached) or must be recomputed. Here's a simple example (files: a.go, b.go, a_test.go, myfile.txt):
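package a

import (
	"os"
	"testing"

	"github.com/Matovidlo/test/b"
)

func TestA(t *testing.T) {
	b.Something()
	// Both calls below are recorded as test inputs by the go command.
	if f, err := os.Open("myfile.txt"); err == nil {
		f.Close()
	}
	_ = os.Getenv("TEST_ENV")
}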
If you modify the b package, the tests are no longer cached. The b package must be recompiled, a_test.go relinked, and the test executed. Similarly, if you change the modification time of myfile.txt (indicating the file was edited), the same workflow applies. Finally, changing the TEST_ENV variable to a different value will also invalidate the cache (as the environment variable is used in the test). To identify the dependencies of your test file, check the build/test cache.
To investigate why a specific test wasn't cached, locate the relevant a_test.go test hash. It can typically be found on the line where the HASH subkey is generated in the output of the GODEBUG=gocachehash=1 go test ./... 2> test-output command.
HASH[testInputs]: 69f1fb999e7878ab0fc1683496d49a5fdb2c1e189d0c42e3f6baa0c891d348d8
HASH subkey fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c "inputs:69f1fb999e7878ab0fc1683496d49a5fdb2c1e189d0c42e3f6baa0c891d348d8" = 0de1f5d74d957332923a83e863949214b23a3d2ff52a3fd61b15678e963a4551
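Because the debug output is large, it can help to pull out only the subkey lines. The following sketch reads the debug output on standard input and prints the three hashes from every line shaped like the sample above. The regular expression is inferred from that single sample, so treat the exact line format as an assumption; the type and function names are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"strings"
)

// subkeyRe matches lines shaped like the sample above:
//   HASH subkey <key> "inputs:<inputs>" = <result>
var subkeyRe = regexp.MustCompile(`^HASH subkey ([0-9a-f]+) "inputs:([0-9a-f]+)" = ([0-9a-f]+)$`)

// subkey holds the three hashes found on one "HASH subkey" line.
type subkey struct {
	Key    string // hash right after "HASH subkey"
	Inputs string // hash of the test inputs
	Result string // hash after the equals sign
}

// parseSubkey extracts the hashes from a line, or reports false when the
// line has a different shape.
func parseSubkey(line string) (subkey, bool) {
	m := subkeyRe.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return subkey{}, false
	}
	return subkey{Key: m[1], Inputs: m[2], Result: m[3]}, true
}

func main() {
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // debug lines can be long
	for sc.Scan() {
		if k, ok := parseSubkey(sc.Text()); ok {
			fmt.Printf("key=%s inputs=%s result=%s\n", k.Key, k.Inputs, k.Result)
		}
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "scan:", err)
		os.Exit(1)
	}
}
```

Assuming the file is saved as subkeys.go, `go run subkeys.go < test-output` would list the candidates to look up in the cache directory.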
The hash that follows HASH subkey (fa5782ee... in this example) appears multiple times within the file; however, the first occurrence in the [testResult] section indicates where the build cache entry is located. Take that hash and use it to inspect the corresponding file:
less ~/.cache/go-build/fa/fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c-a
The file path is constructed from the first two characters of the hash fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c as a subfolder name, followed by the hash itself and a suffix. Grouping entries into two-character subfolders makes a given hash easy to find. The -a suffix marks an action entry: a small index record that holds the action ID, the ID of the output it produced, the output size, and a timestamp. The output of the command looks like this:
v1
fa5782ee6b29e70a851571c28d4080b44f287b6eeb10df18e5a9146bf228478c
e7180aa367193fcc1d8f7e85d20c7ccca04856897a4c87b3632e315695dabfd5
43 1717321489418793374
Next, you take the third line (the hash next to our input) and go through the build cache again using the following command:
less ~/.cache/go-build/e7/e7180aa367193fcc1d8f7e85d20c7ccca04856897a4c87b3632e315695dabfd5-d
Here, the -d suffix marks a data file, which holds the output recorded for the entry; in this case, that is the log of inputs the test used.
If the output is in binary format, it indicates that you retrieved the hash of the compiled test file or dependencies, rather than the test hash.
The contents of this file make the test dependencies easy to identify. With each execution, go test checks the file behind the open call on myfile.txt to make sure it has not changed since the previous run; a hash is created from the modtime to capture this state. The environment variable is simpler to understand: its value must remain the same. If either the file or the environment variable changes, the cache is invalidated, and the test is re-executed. This is how you can debug the test cache, and it points to what our CI is likely missing: some of these dependencies can change with every execution.
How CI Works
To understand why the cache hit ratio isn't 100%, we need to first look at how CI operates:
- Each CI run is executed in a fresh Docker image with new volumes, etc. Prerequisites must be injected before execution.
- When pulling or checking out files from the repository, we use git clone, which sets the modtime of each file to the time of the clone, not the last commit. This leads CI to treat the files as new.
- After the CI run, the files are discarded. When the CI is executed again, it starts from step 1, repeating the process.
How to Incorporate CI/CD to Cache Various Dependencies?
To achieve a 100% cache hit ratio, we need to ensure that files and paths used within tests, particularly those cloned as test inputs, have a static modtime so the test cache assumes that they have not changed.
- This is feasible only when we know that the files used in functions like os.Open or other file operations are static and not generated dynamically during test execution, as dynamically generated files typically serve as dependencies for other tests.
- When copying files from one location to another using filepath.Walk, the folder where the file is copied will always cause a cache miss. This happens because the file's modtime changes with each execution. To prevent caching issues, the modtime must be set programmatically, as this cannot be handled through a standard step in GitHub Actions. See the next section for further details.
- Additionally, changing the directory to a dynamic value will always result in a cache miss (due to the directory change).
Temporary Files
When files are copied or created during go test execution, one rule applies: a file used as a test input must be at least two seconds old, or the test result will not be cached. This issue can be detected using GODEBUG=gocachetest=1, where you'll see a message indicating that the file used as input is too new.
To resolve this, simply adjust the file's modtime:
// Backdate the file to prevent the "file too new" error when go test caches results.
tm := time.Unix(1, 0).Local()
err = os.Chtimes(t.workingDirFS.BasePath()+"/actual-state.json", tm, tm)
You can find further details about this behavior in the Go source code.
Let’s Write the Full Workflow
- name: Checkout code
  uses: actions/checkout@v4
- name: Install Go
  uses: actions/setup-go@v5
  with:
    go-version: ${{ env.GO_VERSION }}
    cache: false
- name: Load Go cache
  id: load-go-cache
  uses: actions/cache@v4
  with:
    path: |
      ~/.cache/go-build
      ~/Library/Caches/go-build
      ~\AppData\Local\go-build
    key: ${{ runner.os }}-go-1.22.2-v1-unit-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      ${{ runner.os }}-go-1.22.2-v1-unit-
- name: Change modtime of files to 1 unix timestamp
  shell: bash
  run: |
    # .out is our directory where we store test outputs that are copied with each execution.
    # The patterns are passed to find directly; keeping them in a quoted variable would hand find the quote characters literally.
    find . -not \( -path "**/.out*" -or -path "**/.git/**" \) \( -type f -or -type d \) -print0 | xargs -0 touch -d '1970-01-01T00:00:01'
- name: Run tests
  run: go test ./...
The workflow begins by checking out the branch and installing Go. It then attempts to load the cache: on a cache hit the stored data is restored; otherwise, the first execution proceeds without it. We then set the modtime of all files to Unix timestamp 1, so the hashes of the files used by tests stay the same from one run to the next. When the tests are executed again, the modtime of the input files matches what was recorded in the cache, while the other dependencies stay static.
The actions/cache step also registers a post step in which the cache is saved at the end of the job. In subsequent CI executions, this should result in a cache hit, providing the cached results in your go test ./... command.
Is it as simple as this example? Yes, it is, but not every test and test case will work perfectly. There are many factors that can prevent a test from being cached. It's up to you to dig through the test cache using the guidance from this article. Look for files, environment variables, or other elements that could be causing cache misses. Remember, this is an evolving process, so expect things to change over time.
Additionally, there are common issues others have faced with the test cache, and there's an open issue regarding cache misses that you may find useful.
Summary
In this article, I demonstrated what the Go test cache looks like and how to inspect files to help debug caching issues in CI. We also updated the CI pipeline to use the test cache more effectively for improved performance. While this isn't a 100% solution for everyone, it has worked well for our tests, speeding up around 60–70% of our cached workflows.
The information needed to create this article was originally found in this helpful video, and I believe it remains useful, even after three years of ongoing development in Go testing and caching.