Normal view

There are new articles available, click to refresh the page.
Yesterday — 21 December 2024The Go

Go Developer Survey 2024 H2 Results

20 December 2024 at 07:00

The Go Blog

Go Developer Survey 2024 H2 Results

Alice Merrick
20 December 2024

Background

Go was designed with a focus on developer experience, and we deeply value the feedback we receive through proposals, issues, and community interactions. However, these channels often represent the voices of our most experienced or engaged users, a small subset of the broader Go community. To ensure we’re serving developers of all skill levels, including those who may not have strong opinions on language design, we conduct this survey once or twice a year to gather systematic feedback and quantitative evidence. This inclusive approach allows us to hear from a wider range of Go developers, providing valuable insights into how Go is used across different contexts and experience levels. Your participation is critical in informing our decisions about language changes and resource allocation, ultimately shaping the future of Go. Thank you to everyone who contributed, and we strongly encourage your continued participation in future surveys. Your experience matters to us.

This post shares the results of our most recent Go Developer Survey, conducted from September 9–23, 2024. We recruited participants from the Go blog and through randomized prompts in the VS Code Go plug-in and GoLand IDE, allowing us to recruit a more representative sample of Go developers. We received a total of 4,156 responses. A huge thank you to all those who contributed to making this possible.

Along with capturing sentiments and challenges around using Go and Go tooling, our primary focus areas for this survey were on uncovering sources of toil, challenges to performing best practices, and how developers are using AI assistance.

Highlights

  • Developer sentiment towards Go remains extremely positive, with 93% of survey respondents saying they felt satisfied while working with Go during the prior year.
  • Ease of deployment and an easy to use API/SDK were respondents’ favorite things about using Go on the top three cloud providers. First-class Go support is critical to keeping up with developer expectations.
  • 70% of respondents were using AI assistants when developing with Go. The most common uses were LLM-based code completion, writing tests, generating Go code from natural language descriptions, and brainstorming. There was a significant discrepancy between what respondents said they wanted to use AI for last year, and what they currently use AI for.
  • The biggest challenge for teams using Go was maintaining consistent coding standards across their codebase. This was often due to team members having different levels of Go experience and coming from different programming backgrounds, leading to inconsistencies in coding style and adoption of non-idiomatic patterns.

Contents

Overall satisfaction

Overall satisfaction remains high in the survey with 93% of respondents saying they were somewhat or very satisfied with Go during the last year. Although the exact percentages fluctuate slightly from cycle to cycle, we do not see any statistically significant differences from our 2023 H2 or 2024 H1 Surveys when the satisfaction rate was 90% and 93%, respectively.

Chart of developer satisfaction with Go

The open comments we received on the survey continue to highlight what developers like most about using Go, for example, its simplicity, the Go toolchain, and its promise of backwards compatibility:

“I am a programming languages enjoyer (C-like) and I always come back to Go for its simplicity, fast compilation and robust toolchain. Keep it up!”

“Thank you for creating Go! It is my favorite language, because it is pretty minimal, the development cycle has rapid build-test cycles, and when using a random open source project written in Go, there is a good chance that it will work, even 10 years after. I love the 1.0 compatibility guarantee.”

Development environments and tools

Developer OS

Consistent with previous years, most survey respondents develop with Go on Linux (61%) and macOS (59%) systems. Historically, the proportion of Linux and macOS users has been very close, and we didn’t see any significant changes from the last survey. The randomly sampled groups from JetBrains and VS Code were more likely (33% and 36%, respectively) to develop on Windows than the self-selected group (16%).

Chart of operating systems respondents
use when developing Go software Chart of operating systems respondents
use when developing Go software, split by difference sample sources

Deployment environments

Given the prevalence of Go for cloud development and containerized workloads, it’s no surprise that Go developers primarily deploy to Linux environments (96%).

Chart of operating systems
respondents deploy to when developing Go software

We included several questions to understand what architectures respondents are deploying to when deploying to Linux, Windows or WebAssembly. The x86-64 / AMD64 architecture was by far the most popular choice for those deploying to both Linux (92%) and Windows (97%). ARM64 was second at 49% for Linux and 21% for Windows.

Linux architecture usage Windows architecture
usage

Not many respondents deployed to Web Assembly (only about 4% of overall respondents), but 73% that do said they deploy to JS and 48% to WASI Preview 1.

Web assembly architecture usage

Editor awareness and preferences

We introduced a new question on this survey to assess awareness and usage of popular editors for Go. When interpreting these results, keep in mind that 34% of respondents came to the survey from VS Code and 9% of respondents came from GoLand, so it is more likely for them to use those editors regularly.

VS Code was the most widely used editor, with 66% of respondents using it regularly, and GoLand was the second most used at 35%. Almost all respondents had heard of both VS Code and GoLand, but respondents were much more likely to have at least tried VS Code. Interestingly, 33% of respondents said they regularly use 2 or more editors. They may use different editors for different tasks or environments, such as using Emacs or Vim via SSH, where IDEs aren’t available.

Level of familiarity with each
editor

We also asked a question about editor preference, the same as we have asked on previous surveys. Because our randomly sampled populations were recruited from within VS Code or GoLand, they are strongly biased towards preferring those editors. To avoid skewing the results, we show the data for the most preferred editor here from the self-selected group only. 38% preferred VS Code and 35% preferred GoLand. This is a notable difference from the last survey in H1, when 43% preferred VS Code and 33% preferred GoLand. A possible explanation could be in how respondents were recruited this year. This year the VS Code notification began inviting developers to take the survey before the Go blog entry was posted, so a larger proportion of respondents came from the VS Code prompt this year who might have otherwise come from the blog post. Because we only show the self-selected respondents in this chart, data from respondents from the VS Code prompt data are not represented here. Another contributing factor could be the slight increase in those who prefer “Other” (4%). The write-in responses suggest there is increased interest in editors like Zed, which made up 43% of the write-in responses.

Code editors respondents most prefer to
use with Go

Code analysis tools

The most popular code analysis tool was gopls, which was knowingly used by 65% of respondents. Because gopls is used under-the-hood by default in VS Code, this is likely an undercount. Following closely behind, golangci-lint was used by 57% of respondents, and staticcheck was used by 34% of respondents. A much smaller proportion used custom or other tools, which suggests that most respondents prefer common established tools over custom solutions. Only 10% of respondents indicated they don’t use any code analysis tools.

Code analysis tools respondents
use with Go

Go in the Clouds

Go is a popular language for modern cloud-based development, so we typically include survey questions to help us understand which cloud platforms and services Go developers are using. In this cycle, we sought to learn about preferences and experiences of Go developers across cloud providers, with a particular focus on the largest cloud providers: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud. We also included an additional option for “Bare Metal Servers” for those who deploy to servers without virtualization.

Similar to previous years, almost half of respondents (50%) deploy Go programs to Amazon Web Services. AWS is followed by self-owned or company-owned servers (37%), and Google Cloud (30%). Respondents who work at large organizations are a little more likely to deploy to self-owned or company-owned servers (48%) than those who work at small-to-medium organizations (34%). They‘re also a little more likely to deploy to Microsoft Azure (25%) than small-to-medium organizations (12%).

Cloud providers where
respondents deploy Go software

The most commonly used cloud services were AWS Elastic Kubernetes Service (41%), AWS EC2 (39%), and Google Cloud GKE (29%). Although we’ve seen Kubernetes usage increase over time, this is the first time we’ve seen EKS become more widely used than EC2. Overall, Kubernetes offerings were the most popular services for AWS, Google Cloud, and Azure, followed by VMs and then Serverless offerings. Go’s strengths in containerization and microservices development naturally align with the rising popularity of Kubernetes, as it provides an efficient and scalable platform for deploying and managing these types of applications.

Cloud platforms where respondents
deploy Go software

We asked a followup question to respondents who deployed Go code to the top three cloud providers, Amazon Web Services, Google Cloud, and Microsoft Azure on what they like most about deploying Go code to each cloud. The most popular response across different providers was actually Go’s performance and language features rather than something about the cloud provider.

Other common reasons were:

  • Familiarity with the given cloud provider compared to other clouds
  • Ease of deployment of Go applications on the given cloud provider
  • The cloud provider’s API/SDK for Go is easy to use
  • The API/SDK is well documented

Other than familiarity, the top favorite things highlight the importance of having first class support for Go to keep up with developer expectations.

It was also fairly common for respondents to say they don’t have a favorite thing about their cloud provider. From a previous version of the survey that involved write-in responses, this often meant that they did not interact directly with the Cloud. In particular, respondents who use Microsoft Azure were much more likely to say that “Nothing” was their favorite thing (51%) compared to AWS (27%) or Google Cloud (30%).

What respondents liked most
about each of the top 3 Clouds

AI assistance

The Go team hypothesizes that AI assistance has the potential to alleviate developers from tedious and repetitive tasks, allowing them to focus on more creative and fulfilling aspects of their work. To gain insights into areas where AI assistance could be most beneficial, we included a section in our survey to identify common developer toil.

The majority of respondents (70%) are using AI assistants when developing with Go. The most common usage of AI assistants was in LLM-based code completion (35%). Other common responses were writing tests (29%), generating Go code from a natural language description (27%), and brainstorming ideas (25%). There was also a sizable minority (30%) of respondents who had not used any LLM for assistance in the last month.

Most common tasks used with AI
assistance

Some of these results stood out when compared to findings from our 2023 H2 survey, where we asked respondents for the top 5 use cases they would like to see AI/ML support Go developers. Although a couple new responses were introduced in the current survey, we can still do a rough comparison between what respondents said they wanted AI support for, and what their actual usage was like. In that previous survey, writing tests was the most desired use case (49%). In our latest 2024 H2 survey, about 29% of respondents had used AI assistants for this in the last month. This suggests that current offerings are not meeting developer needs for writing tests. Similarly, in 2023, 47% respondents said they would like suggestions for best practices while coding, while only 14% a year later said they are using AI assistance for this use case. 46% said they wanted help catching common mistakes while coding, and only 13% said they were using AI assistance for this. This could indicate that current AI assistants are not well-equipped for these kinds of tasks, or they’re not well integrated into developer workflows or tooling.

It was also surprising to see such high usage of AI for generating Go code from natural language and brainstorming, since the previous survey didn’t indicate these as highly desired use cases. There could be a number of explanations for these differences. While previous respondents might not have explicitly wanted AI for code generation or brainstorming initially, they might be gravitating towards these uses because they align with the current strengths of generative AI—natural language processing and creative text generation. We should also keep in mind that people are not necessarily the best predictors of their own behavior.

Tasks used AI assistance
in 2024 compared to those wanted in 2023

We also saw some notable differences in how different groups responded to this question. Respondents at small to medium sized organizations were a little more likely to have used LLMs (75%) compared to those at large organizations (66%). There could be a number of reasons why, for example, larger organizations may have stricter security and compliance requirements and concerns about the security of LLM coding assistants, the potential for data leakage, or compliance with industry-specific regulations. They also may have already invested in other developer tools and practices that already provide similar benefits to developer productivity.

Most common tasks used
with AI assistance by org size

Go developers with less than 2 years of experience were more likely to use AI assistants (75%) compared to Go developers with 5+ years of experience (67%). Less experienced Go developers were also more likely to use them for more tasks, on average 3.50. Although all experience levels tended to use LLM-based code completion, less experienced Go developers were more likely to use Go for more tasks related to learning and debugging, such as explaining what a piece of Go code does, resolving compiler errors and debugging failures in their Go code. This suggests that AI assistants are currently providing the greatest utility to those who are less familiar with Go. We don’t know how AI assistants affect learning or getting started on a new Go project, something we want to investigate in the future. However, all experience levels had similar rates of satisfaction with their AI assistants, around 73%, so new Go developers are not more satisfied with AI assistants, despite using them more often.

Most common tasks used
with AI assistance by experience with Go

To respondents who reported using AI assistance for at least one task related to writing Go code, we asked some follow up questions to learn more about their AI assistant usage. The most commonly used AI assistants were ChatGPT (68%) and GitHub Copilot (50%). When asked which AI assistant they used most in the last month, ChatGPT and Copilot were about even at 36% each, so although more respondents used ChatGPT, it wasn’t necessarily their primary assistant. Participants were similarly satisfied with both tools (73% satisfied with ChatGPT, vs. 78% with GitHub CoPilot). The highest satisfaction rate for any AI assistant was Anthropic Claude, at 87%.

Most common AI
assistants used Most common primary AI assistants used

Challenges for teams using Go

In this section of the survey, we wanted to understand which best practices or tools should be better integrated into developer workflows. Our approach was to identify common problems for teams using Go. We then asked respondents which challenges would bring them the most benefit if they were “magically” solved for them. (This was so that respondents would not focus on particular solutions.) Common problems that would provide the most benefit if they were solved would be considered candidates for improvement.

The most commonly reported challenges for teams were maintaining consistent coding standards across our Go codebase (58%), identifying performance issues in a running Go program (58%) and identifying resource usage inefficiencies in a running Go program (57%).

Most common challenges for
teams

21% of respondents said their team would benefit most from maintaining consistent coding standards across their Go codebase. This was the most common response, making it a good candidate to address. In a follow-up question, we got more details as to why specifically this was so challenging.

Most benefit to
solve

According to the write-in responses, many teams face challenges maintaining consistent coding standards because their members have varying levels of experience with Go and come from different programming backgrounds. This led to inconsistencies in coding style and the adoption of non-idiomatic patterns.

“There’s lots of polyglot engineers where I work. So the Go written is not consistent. I do consider myself a Gopher and spend time trying to convince my teammates what is idiomatic in Go”—Go developer with 2–4 years of experience.

“Most of the team members are learning Go from scratch. Coming from the dynamically typed languages, it takes them a while to get used to the new language. They seem to struggle maintaining the code consistency following the Go guidelines.”—Go developer with 2–4 years of experience.

This echoes some feedback we’ve heard before about teammates who write “Gava” or “Guby” due to their previous language experiences. Although static analysis was a class of tool we had in mind to address this issue when we came up with this question, we are currently exploring different ways we might address this.

Single Instruction, Multiple Data (SIMD)

SIMD, or Single Instruction, Multiple Data, is a type of parallel processing that allows a single CPU instruction to operate on multiple data points simultaneously. This facilitates tasks involving large datasets and repetitive operations, and is often used to optimize performance in fields like game development, data processing, and scientific computing. In this section of the survey we wanted to assess respondents’ needs for native SIMD support in Go.

The majority of respondents (89%) say that work on projects where performance optimizations are crucial at least some of the time. 40% said they work on such projects at least half the time. This held true across different organization sizes and experience levels, suggesting that performance is an important issue for most developers.

How often respondents work on
performance critical software

About half of respondents (54%), said they are at least a little familiar with the concept of SIMD. Working with SIMD often requires a deeper understanding of computer architecture and low-level programming concepts, so unsurprisingly we find that less experienced developers were less likely to be familiar with SIMD. Respondents with more experience and who worked on performance-crucial applications at least half the time were the most likely to be familiar with SIMD.

Familiarity with SIMD

For those who were at least slightly familiar with SIMD, we asked some follow -up questions to understand how respondents were affected by the absence of native SIMD support in Go. Over a third, about 37%, said they had been impacted. 17% of respondents said they had been limited in the performance they could achieve in their projects, 15% said they had to use another language instead of Go to achieve their goals, and 13% said they had to use non-Go libraries when they would have preferred to use Go libraries. Interestingly, respondents who were negatively impacted by the absence of native SIMD support were a little more likely to use Go for data processing and AI/ML. This suggests that adding SIMD support could make Go a better option for these domains.

Impacts of lack of native Go support
for SIMD What impacted respondents build with Go

Demographics

We ask similar demographic questions during each cycle of this survey so we can understand how comparable the year-over-year results may be. For example, if we saw changes in who responded to the survey in terms of Go experience, it’d be very likely that other differences in results from prior cycles were due to this demographic shift. We also use these questions to provide comparisons between groups, such as satisfaction according to how long respondents have been using Go.

We didn’t see any significant changes in levels of experience among respondents during this cycle.

Experience levels of respondents

There are differences in the demographics of respondents according to whether they came from The Go Blog, the VS Code extension, or GoLand. The population who responded to survey notifications in VS Code skews toward less experience with Go; we suspect this a reflection of VS Code’s popularity with new Go developers, who may not be ready to invest in an IDE license while they’re still learning. With respect to years of Go experience, the respondents randomly selected from GoLand are more similar to our self-selected population who found the survey through the Go Blog. Seeing consistencies between samples allows us to more confidently generalize findings to the rest of the community.

Experience with Go by survey
source

In addition to years of experience with Go, we also measured years of professional coding experience. Our audience tends to be a pretty experienced bunch, with 26% of respondents having 16 or more years of professional coding experience.

Overall levels of professional
developer experience

The self-selected group was even more experienced than the randomly selected groups, with 29% having 16 or more years of professional experience. This suggests that our self-selected group is generally more experienced than our randomly selected groups and can help explain some of the differences we see in this group.

Levels of professional developer
experience by survey source

We found that 81% of respondents were fully employed. When we look at our individual samples, we see a small but significant difference within our respondents from VS Code, who are slightly more likely to be students. This makes sense given that VS Code is free.

Employment status Employment status by survey
source

Similar to previous years, the most common use cases for Go were API/RPC services (75%) and command line tools (62%). More experienced Go developers reported building a wider variety of applications in Go. This trend was consistent across every category of app or service. We did not find any notable differences in what respondents are building based on their organization size. Respondents from the random VS Code and GoLand samples did not display significant differences either.

What respondents build with Go

Firmographics

We heard from respondents at a variety of different organizations. About 29% worked at large organizations with 1,001 or more employees, 25% were from midsize organizations of 101–1,000 employees, and 43% worked at smaller organizations with fewer than 100 employees. As in previous years, the most common industry people work in was technology (43%) while the second most common was financial services (13%).

Organization sizes where respondents
work Industries
respondents work in

As in previous surveys, the most common location for survey respondents was the United States (19%). This year we saw a significant shift in the proportion of respondents coming from Ukraine, from 1% to 6%, making it the third most common location for survey respondents. Because we only saw this difference among our self-selected respondents, and not in the randomly sampled groups, this suggests that something affected who discovered the survey, rather than a widespread increase in Go adoption across all developers in Ukraine. Perhaps there was increased visibility or awareness of the survey or the Go Blog among developers in Ukraine.

Where respondents are located

Methodology

We announce the survey primarily through the Go Blog, where it is often picked up on various social channels like Reddit, or Hacker News. We also recruit respondents by using the VS Code Go plugin to randomly select users to whom we show a prompt asking if they’d like to participate in the survey. With some help from our friends at JetBrains, we also have an additional random sample from prompting a random subset of GoLand users to take the survey. This gave us two sources we used to compare the self-selected respondents from our traditional channels and help identify potential effects of self-selection bias.

57% of survey respondents “self-selected” to take the survey, meaning they found it on the Go blog or other social Go channels. People who don’t follow these channels are less likely to learn about the survey from them, and in some cases, they respond differently than people who do closely follow them. For example, they might be new to the Go community and not yet aware of the Go blog. About 43% of respondents were randomly sampled, meaning they responded to the survey after seeing a prompt in VS Code (25%) or GoLand (11%). Over the period of September 9–23, 2024, there was roughly a 10% chance users of the VS Code plugin would have seen this prompt. The prompt in GoLand was similarly active between September 9–20. By examining how the randomly sampled groups differ from the self-selected responses, as well as from each other, we’re able to more confidently generalize findings to the larger community of Go developers.

Chart of different sources of survey
respondents

How to read these results

Throughout this report we use charts of survey responses to provide supporting evidence for our findings. All of these charts use a similar format. The title is the exact question that survey respondents saw. Unless otherwise noted, questions were multiple choice and participants could only select a single response choice; each chart’s subtitle will tell the reader if the question allowed multiple response choices or was an open-ended text box instead of a multiple choice question. For charts of open-ended text responses, a Go team member read and manually categorized all of the responses. Many open-ended questions elicited a wide variety of responses; to keep the chart sizes reasonable, we condensed them to a maximum of the top 10-12 themes, with additional themes all grouped under “Other”. The percentage labels shown in charts are rounded to the nearest integer (e.g., 1.4% and 0.8% will both be displayed as 1%), but the length of each bar and row ordering are based on the unrounded values.

To help readers understand the weight of evidence underlying each finding, we included error bars showing the 95% confidence interval for responses; narrower bars indicate increased confidence. Sometimes two or more responses have overlapping error bars, which means the relative order of those responses is not statistically meaningful (i.e., the responses are effectively tied). The lower right of each chart shows the number of people whose responses are included in the chart, in the form “n = [number of respondents]”. In cases where we found interesting differences in responses between groups, (e.g., years of experience, organization size, or sample source) we showed a color-coded breakdown of the differences.

Closing

Thanks for reviewing our semi-annual Go Developer Survey! And many thanks to everyone who shared their thoughts on Go and everyone who contributed to making this survey happen. It means the world to us and truly helps us improve Go.

— Alice (on behalf of the Go team at Google)

Before yesterdayThe Go

Go Protobuf: The new Opaque API

16 December 2024 at 07:00

The Go Blog

Go Protobuf: The new Opaque API

Michael Stapelberg
16 December 2024

[Protocol Buffers (Protobuf) is Google’s language-neutral data interchange format. See protobuf.dev.]

Back in March 2020, we released the google.golang.org/protobuf module, a major overhaul of the Go Protobuf API. This package introduced first-class support for reflection, a dynamicpb implementation and the protocmp package for easier testing.

That release introduced a new protobuf module with a new API. Today, we are releasing an additional API for generated code, meaning the Go code in the .pb.go files created by the protocol compiler (protoc). This blog post explains our motivation for creating a new API and shows you how to use it in your projects.

To be clear: We are not removing anything. We will continue to support the existing API for generated code, just like we still support the older protobuf module (by wrapping the google.golang.org/protobuf implementation). Go is committed to backwards compatibility and this applies to Go Protobuf, too!

Background: the (existing) Open Struct API

We now call the existing API the Open Struct API, because generated struct types are open to direct access. In the next section, we will see how it differs from the new Opaque API.

To work with protocol buffers, you first create a .proto definition file like this one:

edition = "2023";  // successor to proto2 and proto3

package log;

message LogEntry {
  string backend_server = 1;
  uint32 request_size = 2;
  string ip_address = 3;
}

Then, you run the protocol compiler (protoc) to generate code like the following (in a .pb.go file):

package logpb

type LogEntry struct {
  BackendServer *string
  RequestSize   *uint32
  IPAddress     *string
  // …internal fields elided…
}

func (l *LogEntry) GetBackendServer() string { … }
func (l *LogEntry) GetRequestSize() uint32   { … }
func (l *LogEntry) GetIPAddress() string     { … }

Now you can import the generated logpb package from your Go code and call functions like proto.Marshal to encode logpb.LogEntry messages into protobuf wire format.

You can find more details in the Generated Code API documentation.

(Existing) Open Struct API: Field Presence

An important aspect of this generated code is how field presence (whether a field is set or not) is modeled. For instance, the above example models presence using pointers, so you could set the BackendServer field to:

  1. proto.String("zrh01.prod"): the field is set and contains “zrh01.prod”
  2. proto.String(""): the field is set (non-nil pointer) but contains an empty value
  3. nil pointer: the field is not set

If you are used to generated code not having pointers, you are probably using .proto files that start with syntax = "proto3". The field presence behavior changed over the years:

The new Opaque API

We created the new Opaque API to uncouple the Generated Code API from the underlying in-memory representation. The (existing) Open Struct API has no such separation: it allows programs direct access to the protobuf message memory. For example, one could use the flag package to parse command-line flag values into protobuf message fields:

var req logpb.LogEntry
flag.StringVar(&req.BackendServer, "backend", os.Getenv("HOST"), "…")
flag.Parse() // fills the BackendServer field from -backend flag

The problem with such a tight coupling is that we can never change how we lay out protobuf messages in memory. Lifting this restriction enables many implementation improvements, which we’ll see below.

What changes with the new Opaque API? Here is how the generated code from the above example would change:

package logpb

type LogEntry struct {
  xxx_hidden_BackendServer *string // no longer exported
  xxx_hidden_RequestSize   uint32  // no longer exported
  xxx_hidden_IPAddress     *string // no longer exported
  // …internal fields elided…
}

func (l *LogEntry) GetBackendServer() string { … }
func (l *LogEntry) HasBackendServer() bool   { … }
func (l *LogEntry) SetBackendServer(string)  { … }
func (l *LogEntry) ClearBackendServer()      { … }
// …

With the Opaque API, the struct fields are hidden and can no longer be directly accessed. Instead, the new accessor methods allow for getting, setting, or clearing a field.

Opaque structs use less memory

One change we made to the memory layout is to model field presence for elementary fields more efficiently:

  • The (existing) Open Struct API uses pointers, which adds a 64-bit word to the space cost of the field.
  • The Opaque API uses bit fields, which require one bit per field (ignoring padding overhead).

Using fewer variables and pointers also lowers load on the allocator and on the garbage collector.

The performance improvement depends heavily on the shapes of your protocol messages: The change only affects elementary fields like integers, bools, enums, and floats, but not strings, repeated fields, or submessages (because it is less profitable for those types).

Our benchmark results show that messages with few elementary fields exhibit performance that is as good as before, whereas messages with more elementary fields are decoded with significantly fewer allocations:

             │ Open Struct API │             Opaque API             │
             │    allocs/op    │  allocs/op   vs base               │
Prod#1          360.3k ± 0%       360.3k ± 0%  +0.00% (p=0.002 n=6)
Search#1       1413.7k ± 0%       762.3k ± 0%  -46.08% (p=0.002 n=6)
Search#2        314.8k ± 0%       132.4k ± 0%  -57.95% (p=0.002 n=6)

Reducing allocations also makes decoding protobuf messages more efficient:

             │ Open Struct API │             Opaque API            │
             │   user-sec/op   │ user-sec/op  vs base              │
Prod#1         55.55m ± 6%        55.28m ± 4%  ~ (p=0.180 n=6)
Search#1       324.3m ± 22%       292.0m ± 6%  -9.97% (p=0.015 n=6)
Search#2       67.53m ± 10%       45.04m ± 8%  -33.29% (p=0.002 n=6)

(All measurements done on an AMD Castle Peak Zen 2. Results on ARM and Intel CPUs are similar.)

Note: proto3 with implicit presence similarly does not use pointers, so you will not see a performance improvement if you are coming from proto3. If you were using implicit presence for performance reasons, forgoing the convenience of being able to distinguish empty fields from unset ones, then the Opaque API now makes it possible to use explicit presence without a performance penalty.

Motivation: Lazy Decoding

Lazy decoding is a performance optimization where the contents of a submessage are decoded when first accessed instead of during proto.Unmarshal. Lazy decoding can improve performance by avoiding unnecessarily decoding fields which are never accessed.

Lazy decoding can’t be supported safely by the (existing) Open Struct API. While the Open Struct API provides getters, leaving the (un-decoded) struct fields exposed would be extremely error-prone. To ensure that the decoding logic runs immediately before the field is first accessed, we must make the field private and mediate all accesses to it through getter and setter functions.

This approach made it possible to implement lazy decoding with the Opaque API. Of course, not every workload will benefit from this optimization, but for those that do benefit, the results can be spectacular: We have seen logs analysis pipelines that discard messages based on a top-level message condition (e.g. whether backend_server is one of the machines running a new Linux kernel version) and can skip decoding deeply nested subtrees of messages.

As an example, here are the results of the micro-benchmark we included, demonstrating how lazy decoding saves over 50% of the work and over 87% of allocations!

                  │   nolazy    │                lazy                │
                  │   sec/op    │   sec/op     vs base               │
Unmarshal/lazy-24   6.742µ ± 0%   2.816µ ± 0%  -58.23% (p=0.002 n=6)

                  │    nolazy    │                lazy                 │
                  │     B/op     │     B/op      vs base               │
Unmarshal/lazy-24   3.666Ki ± 0%   1.814Ki ± 0%  -50.51% (p=0.002 n=6)

                  │   nolazy    │               lazy                │
                  │  allocs/op  │ allocs/op   vs base               │
Unmarshal/lazy-24   64.000 ± 0%   8.000 ± 0%  -87.50% (p=0.002 n=6)

Motivation: reduce pointer comparison mistakes

Modeling field presence with pointers invites pointer-related bugs.

Consider an enum, declared within the LogEntry message:

message LogEntry {
  enum DeviceType {
    DESKTOP = 0;
    MOBILE = 1;
    VR = 2;
  };
  DeviceType device_type = 1;
}

A simple mistake is to compare the device_type enum field like so:

if cv.DeviceType == logpb.LogEntry_DESKTOP.Enum() { // incorrect!

Did you spot the bug? The condition compares the memory address instead of the value. Because the Enum() accessor allocates a new variable on each call, the condition can never be true. The check should have read:

if cv.GetDeviceType() == logpb.LogEntry_DESKTOP {

The new Opaque API prevents this mistake: Because fields are hidden, all access must go through the getter.

Motivation: reduce accidental sharing mistakes

Let’s consider a slightly more involved pointer-related bug. Assume you are trying to stabilize an RPC service that fails under high load. The following part of the request middleware looks correct, but still the entire service goes down whenever just one customer sends a high volume of requests:

logEntry.IPAddress = req.IPAddress
logEntry.BackendServer = proto.String(hostname)
// The redactIP() function redacts IPAddress to 127.0.0.1,
// unexpectedly not just in logEntry *but also* in req!
go auditlog(redactIP(logEntry))
if quotaExceeded(req) {
    // BUG: All requests end up here, regardless of their source.
    return fmt.Errorf("server overloaded")
}

Did you spot the bug? The first line accidentally copied the pointer (thereby sharing the pointed-to variable between the logEntry and req messages) instead of its value. It should have read:

logEntry.IPAddress = proto.String(req.GetIPAddress())

The new Opaque API prevents this problem as the setter takes a value (string) instead of a pointer:

logEntry.SetIPAddress(req.GetIPAddress())

Motivation: Fix Sharp Edges: reflection

To write code that works not only with a specific message type (e.g. logpb.LogEntry), but with any message type, one needs some kind of reflection. The previous example used a function to redact IP addresses. To work with any type of message, it could have been defined as func redactIP(proto.Message) proto.Message { … }.

Many years ago, your only option to implement a function like redactIP was to reach for Go’s reflect package, which resulted in very tight coupling: you had only the generator output and had to reverse-engineer what the input protobuf message definition might have looked like. The google.golang.org/protobuf module release (from March 2020) introduced Protobuf reflection, which should always be preferred: Go’s reflect package traverses the data structure’s representation, which should be an implementation detail. Protobuf reflection traverses the logical tree of protocol messages without regard to its representation.

Unfortunately, merely providing protobuf reflection is not sufficient and still leaves some sharp edges exposed: In some cases, users might accidentally use Go reflection instead of protobuf reflection.

For example, encoding a protobuf message with the encoding/json package (which uses Go reflection) was technically possible, but the result is not canonical Protobuf JSON encoding. Use the protojson package instead.

The new Opaque API prevents this problem because the message struct fields are hidden: accidental usage of Go reflection will see an empty message. This is clear enough to steer developers towards protobuf reflection.

Motivation: Making the ideal memory layout possible

The benchmark results from the More Efficient Memory Representation section have already shown that protobuf performance heavily depends on the specific usage: How are the messages defined? Which fields are set?

To keep Go Protobuf as fast as possible for everyone, we cannot implement optimizations that help only one program, but hurt the performance of other programs.

The Go compiler used to be in a similar situation, up until Go 1.20 introduced Profile-Guided Optimization (PGO). By recording the production behavior (through profiling) and feeding that profile back to the compiler, we allow the compiler to make better trade-offs for a specific program or workload.

We think using profiles to optimize for specific workloads is a promising approach for further Go Protobuf optimizations. The Opaque API makes those possible: Program code uses accessors and does not need to be updated when the memory representation changes, so we could, for example, move rarely set fields into an overflow struct.

Migration

You can migrate on your own schedule, or even not at all—the (existing) Open Struct API will not be removed. But, if you’re not on the new Opaque API, you won’t benefit from its improved performance, or future optimizations that target it.

We recommend you select the Opaque API for new development. Protobuf Edition 2024 (see Protobuf Editions Overview if you are not yet familiar) will make the Opaque API the default.

The Hybrid API

Aside from the Open Struct API and Opaque API, there is also the Hybrid API, which keeps existing code working by keeping struct fields exported, but also enabling migration to the Opaque API by adding the new accessor methods.

With the Hybrid API, the protobuf compiler will generate code on two API levels: the .pb.go is on the Hybrid API, whereas the _protoopaque.pb.go version is on the Opaque API and can be selected by building with the protoopaque build tag.

Rewriting Code to the Opaque API

See the migration guide for detailed instructions. The high-level steps are:

  1. Enable the Hybrid API.
  2. Update existing code using the open2opaque migration tool.
  3. Switch to the Opaque API.

Advice for published generated code: Use Hybrid API

Small usages of protobuf can live entirely within the same repository, but usually, .proto files are shared between different projects that are owned by different teams. An obvious example is when different companies are involved: To call Google APIs (with protobuf), use the Google Cloud Client Libraries for Go from your project. Switching the Cloud Client Libraries to the Opaque API is not an option, as that would be a breaking API change, but switching to the Hybrid API is safe.

Our advice for such packages that publish generated code (.pb.go files) is to switch to the Hybrid API please! Publish both the .pb.go and the _protoopaque.pb.go files, please. The protoopaque version allows your consumers to migrate on their own schedule.

Enabling Lazy Decoding

Lazy decoding is available (but not enabled) once you migrate to the Opaque API! 🎉

To enable: in your .proto file, annotate your message-typed fields with the [lazy = true] annotation.

To opt out of lazy decoding (despite .proto annotations), the protolazy package documentation describes the available opt-outs, which affect either an individual Unmarshal operation or the entire program.

Next Steps

By using the open2opaque tool in an automated fashion over the last few years, we have converted the vast majority of Google’s .proto files and Go code to the Opaque API. We continuously improved the Opaque API implementation as we moved more and more production workloads to it.

Therefore, we expect you should not encounter problems when trying the Opaque API. In case you do encounter any issues after all, please let us know on the Go Protobuf issue tracker.

Reference documentation for Go Protobuf can be found on protobuf.dev → Go Reference.

Go Turns 15

11 November 2024 at 07:00

The Go Blog

Go Turns 15

Austin Clements, for the Go team
11 November 2024


Thanks to Renee French for drawing and animating the gopher doing the “15 puzzle”.

Happy birthday, Go!

On Sunday, we celebrated the 15th anniversary of the Go open source release!

So much has changed since Go’s 10 year anniversary, both in Go and in the world. In other ways, so much has stayed the same: Go remains committed to stability, safety, and supporting software engineering and production at scale.

And Go is going strong! Go’s user base has more than tripled in the past five years, making it one of the fastest growing languages. From its beginnings just fifteen years ago, Go has become a top 10 language and the language of the modern cloud.

With the releases of Go 1.22 in February and Go 1.23 in August, it’s been the year of for loops. Go 1.22 made variables introduced by for loops scoped per iteration, rather than per loop, addressing a long-standing language “gotcha”. Over ten years ago, leading up to the release of Go 1, the Go team made decisions about several language details; among them whether for loops should create a new loop variable on each iteration. Amusingly, the discussion was quite brief and distinctly unopinionated. Rob Pike closed it out in true Rob Pike fashion with a single word: “stet” (leave it be). And so it was. While seemingly insignificant at the time, years of production experience highlighted the implications of this decision. But in that time, we also built robust tools for understanding the effects of changes to Go—notably, ecosystem-wide analysis and testing across the entire Google codebase—and established processes for working with the community and getting feedback. Following extensive testing, analysis, and community discussion, we rolled out the change, accompanied by a hash bisection tool to assist developers in pinpointing code affected by the change at scale.

The change to for loops was part of a five year trajectory of measured changes. It would not have been possible without forward language compatibility introduced in Go 1.21. This, in turn, built upon the foundation laid by Go modules, which were introduced in Go 1.14 four and a half years ago.

Go 1.23 further built on this change to introduce iterators and user-defined for-range loops. Combined with generics—introduced in Go 1.18, just two and a half years ago!—this creates a powerful and ergonomic foundation for custom collections and many other programming patterns.

These releases have also brought many improvements in production readiness, including much-anticipated enhancements to the standard library’s HTTP router, a total overhaul of execution traces, and stronger randomness for all Go applications. Additionally, the introduction of our first v2 standard library package establishes a template for future library evolution and modernization.

Over the past year we’ve also been cautiously rolling out opt-in telemetry for Go tools. This system will give Go’s developers data to make better decisions, while remaining completely open and anonymous. Go telemetry first appeared in gopls, the Go language server, where it has already led to a litany of improvements. This effort paves the way to make programming in Go an even better experience for everyone.

Looking forward, we’re evolving Go to better leverage the capabilities of current and future hardware. Hardware has changed a lot in the past 15 years. In order to ensure Go continues to support high-performance, large-scale production workloads for the next 15 years, we need to adapt to large multicores, advanced instruction sets, and the growing importance of locality in increasingly non-uniform memory hierarchies. Some of these improvements will be transparent. Go 1.24 will have a totally new map implementation under the hood that’s more efficient on modern CPUs. And we’re prototyping new garbage collection algorithms designed around the capabilities and constraints of modern hardware. Some improvements will be in the form of new APIs and tools so Go developers can better leverage modern hardware. We’re looking at how to support the latest vector and matrix hardware instructions, and multiple ways that applications can build in CPU and memory locality. A core principle guiding our efforts is composable optimization: the impact of an optimization on a codebase should be as localized as possible, ensuring that the ease of development across the rest of the codebase is not compromised.

We’re continuing to ensure Go’s standard library is safe by default and safe by design. This includes ongoing efforts to incorporate built-in, native support for FIPS-certified cryptography, so that FIPS crypto will be just a flag flip away for applications that need it. Furthermore, we’re evolving Go’s standard library packages where we can and, following the example of math/rand/v2, considering where new APIs can significantly enhance the ease of writing safe and secure Go code.

We’re working on making Go better for AI—and AI better for Go—by enhancing Go’s capabilities in AI infrastructure, applications, and developer assistance. Go is a great language for building production systems, and we want it to be a great language for building production AI systems, too. Go’s dependability as a language for Cloud infrastructure has made it a natural choice for LLM infrastructure as well. For AI applications, we will continue building out first-class support for Go in popular AI SDKs, including LangChainGo and Genkit. And from its very beginning, Go aimed to improve the end-to-end software engineering process, so naturally we’re looking at bringing the latest tools and techniques from AI to bear on reducing developer toil, leaving more time for the fun stuff—like actually programming!

Thank you

All of this is only possible because of Go’s incredible contributors and thriving community. Fifteen years ago we could only dream of the success that Go has become and the community that has developed around Go. Thank you to everyone who has played a part, large and small. We wish you all the best in the coming year.

What's in an (Alias) Name?

17 September 2024 at 07:00

The Go Blog

What's in an (Alias) Name?

Robert Griesemer
17 September 2024

This post is about generic alias types, what they are, and why we need them.

Background

Go was designed for programming at scale. Programming at scale means dealing with large amounts of data, but also large codebases, with many engineers working on those codebases over long periods of time.

Go’s organization of code into packages enables programming at scale by splitting up large codebases into smaller, more manageable pieces, often written by different people, and connected through public APIs. In Go, these APIs consist of the identifiers exported by a package: the exported constants, types, variables, and functions. This includes the exported fields of structs and methods of types.

As software projects evolve over time or requirements change, the original organization of the code into packages may turn out to be inadequate and require refactoring. Refactoring may involve moving exported identifiers and their respective declarations from an old package to a new package. This also requires that any references to the moved declarations must be updated so that they refer to the new location. In large codebases it may be unpractical or infeasible to make such a change atomically; or in other words, to do the move and update all clients in a single change. Instead, the change must happen incrementally: for instance, to “move” a function F, we add its declaration in a new package without deleting the original declaration in the old package. This way, clients can be updated incrementally, over time. Once all callers refer to F in the new package, the original declaration of F may be safely deleted (unless it must be retained indefinitely, for backward compatibility). Russ Cox describes refactoring in detail in his 2016 article on Codebase Refactoring (with help from Go).

Moving a function F from one package to another while also retaining it in the original package is easy: a wrapper function is all that’s needed. To move F from pkg1 to pkg2, pkg2 declares a new function F (the wrapper function) with the same signature as pkg1.F, and pkg2.F calls pkg1.F. New callers may call pkg2.F, old callers may call pkg1.F, yet in both cases the function eventually called is the same.

Moving constants is similarly straightforward. Variables take a bit more work: one may have to introduce a pointer to the original variable in the new package or perhaps use accessor functions. This is less ideal, but at least it is workable. The point here is that for constants, variables, and functions, existing language features exist that permit incremental refactoring as described above.

But what about moving a type?

In Go, the (qualified) identifier, or just name for short, determines the identity of types: a type T defined and exported by a package pkg1 is different from an otherwise identical type definition of a type T exported by a package pkg2. This property complicates a move of T from one package to another while retaining a copy of it in the original package. For instance, a value of type pkg2.T is not assignable to a variable of type pkg1.T because their type names and thus their type identities are different. During an incremental update phase, clients may have values and variables of both types, even though the programmer’s intent is for them to have the same type.

To solve this problem, Go 1.9 introduced the notion of a type alias. A type alias provides a new name for an existing type without introducing a new type that has a different identity.

In contrast to a regular type definition

type T T0

which declares a new type that is never identical to the type on the right-hand side of the declaration, an alias declaration

type A = T  // the "=" indicates an alias declaration

declares only a new name A for the type on the right-hand side: here, A and T denote the same and thus identical type T.

Alias declarations make it possible to provide a new name (in a new package!) for a given type while retaining type identity:

package pkg2

import "path/to/pkg1"

type T = pkg1.T

The type name has changed from pkg1.T to pkg2.T but values of type pkg2.T have the same type as variables of type pkg1.T.

Generic alias types

Go 1.18 introduced generics. Since that release, type definitions and function declarations can be customized through type parameters. For technical reasons, alias types didn’t gain the same ability at that time. Obviously, there were also no large codebases exporting generic types and requiring refactoring.

Today, generics have been around for a couple of years, and large codebases are making use of generic features. Eventually the need will arise to refactor these codebases, and with that the need to migrate generic types from one package to another.

To support incremental refactorings involving generic types, the future Go 1.24 release, planned for early February 2025, will fully support type parameters on alias types in accordance with proposal #46477. The new syntax follows the same pattern as it does for type definitions and function declarations, with an optional type parameter list following the identifier (the alias name) on the left-hand side. Before this change one could only write:

type Alias = someType

but now we can also declare type parameters with the alias declaration:

type Alias[P1 C1, P2 C2] = someType

Consider the previous example, now with generic types. The original package pkg1 declared and exported a generic type G with a type parameter P that is suitably constrained:

package pkg1

type Constraint      someConstraint
type G[P Constraint] someType

If the need arises to provide access to the same type G from a new package pkg2, a generic alias type is just the ticket (playground):

package pkg2

import "path/to/pkg1"

type Constraint      = pkg1.Constraint  // pkg1.Constraint could also be used directly in G
type G[P Constraint] = pkg1.G[P]

Note that one cannot simply write

type G = pkg1.G

for a couple of reasons:

  1. Per existing spec rules, generic types must be instantiated when they are used. The right-hand side of the alias declaration uses the type pkg1.G and therefore type arguments must be provided. Not doing so would require an exception for this case, making the spec more complicated. It is not obvious that the minor convenience is worth the complication.

  2. If the alias declaration doesn’t need to declare its own type parameters and instead simply “inherits” them from the aliased type pkg1.G, the declaration of G provides no indication that it is a generic type. Its type parameters and constraints would have to be retrieved from the declaration of pkg1.G (which itself might be an alias). Readability will suffer, yet readable code is one of the primary aims of the Go project.

Writing down an explicit type parameter list may seem like an unnecessary burden at first, but it also provides additional flexibility. For one, the number of type parameters declared by the alias type doesn’t have to match the number of type parameters of the aliased type. Consider a generic map type:

type Map[K comparable, V any] mapImplementation

If uses of Map as sets are common, the alias

type Set[K comparable] = Map[K, bool]

might be useful (playground). Because it is an alias, types such as Set[int] and Map[int, bool] are identical. This would not be the case if Set were a defined (non-alias) type.

Furthermore, the type constraints of a generic alias type don’t have to match the constraints of the aliased type, they only have to satisfy them. For instance, reusing the set example above, one could define an IntSet as follows:

type integers interface{ ~int | ~int8 | ~int16 | ~int32 | ~int64 }
type IntSet[K integers] = Set[K]

This map can be instantiated with any key type that satisfies the integers constraint (playground). Because integers satisfies comparable, the type parameter K may be used as type argument for the K parameter of Set, following the usual instantiation rules.

Finally, because an alias may also denote a type literal, parameterized aliases make it possible to create generic type literals (playground):

type Point3D[E any] = struct{ x, y, z E }

To be clear, none of these examples are “special cases” or somehow require additional rules in the spec. They follow directly from the application of the existing rules put in place for generics. The only thing that changed in the spec is the ability to declare type parameters in an alias declaration.

An interlude about type names

Before the introduction of alias types, Go had only one form of type declarations:

type TypeName existingType

This declaration creates a new and different type from an existing type and gives that new type a name. It was natural to call such types named types as they have a type name in contrast to unnamed type literals such as struct{ x, y int }.

With the introduction of alias types in Go 1.9 it became possible to give a name (an alias) to type literals, too. For instance, consider:

type Point2D = struct{ x, y int }

Suddenly, the notion of a named type describing something that is different from a type literal didn’t make that much sense anymore, since an alias name clearly is a name for a type, and thus the denoted type (which might be a type literal, not a type name!) arguably could be called a “named type”.

Because (proper) named types have special properties (one can bind methods to them, they follow different assignment rules, etc.), it seemed prudent to use a new term in order to avoid confusions. Thus, since Go 1.9, the spec calls the types formerly called named types defined types: only defined types have properties (methods, assignability restrictions, etc) that are tied to their names. Defined types are introduced through type definitions, and alias types are introduced through alias declarations. In both cases, names are given to types.

The introduction of generics in Go 1.18 made things more complicated. Type parameters are types, too, they have a name, and they share rules with defined types. For instance, like defined types, two differently named type parameters denote different types. In other words, type parameters are named types, and furthermore, they behave similarly to Go’s original named types in some ways.

To top things off, Go’s predeclared types (int, string and so on) can only be accessed through their names, and like defined types and type parameters, are different if their names are different (ignoring for a moment the byte and rune alias types). The predeclared types truly are named types.

Therefore, with Go 1.18, the spec came full circle and formally re-introduced the notion of a named type which now comprises “predeclared types, defined types, and type parameters”. To correct for alias types denoting type literals the spec says: “An alias denotes a named type if the type given in the alias declaration is a named type.”

Stepping back and outside the box of Go nomenclature for a moment, the correct technical term for a named type in Go is probably nominal type. A nominal type’s identity is explicitly tied to its name which is exactly what Go’s named types (now using the 1.18 terminology) are all about. A nominal type’s behavior is in contrast to a structural type which has behavior that only depends on its structure and not its name (if it has one in the first place). Putting it all together, Go’s predeclared, defined, and type parameter types are all nominal types, while Go’s type literals and aliases denoting type literals are structural types. Both nominal and structural types can have names, but having a name doesn’t mean the type is nominal, it just means it is named.

None of this matters for day-to-day use of Go and in practice the details can safely be ignored. But precise terminology matters in the spec because it makes it easier to describe the rules governing the language. So should the spec change its terminology one more time? It is probably not worth the churn: it is not just the spec that would need to be updated, but also a lot of supporting documentation. A fair number of books written on Go might become inaccurate. Furthermore, “named”, while less precise, is probably intuitively clearer than “nominal” for most people. It also matches the original terminology used in the spec, even if it now requires an exception for alias types denoting type literals.

Availability

Implementing generic type aliases has taken longer than expected: the necessary changes required adding a new exported Alias type to go/types and then adding the ability to record type parameters with that type. On the compiler side, the analogous changes also required modifications to the export data format, the file format that describes a package’s exports, which now needs to be able to describe type parameters for aliases. The impact of these changes is not confined to the compiler, but affects clients of go/types and thus many third-party packages. This was very much a change affecting a large code base; to avoid breaking things, an incremental roll-out over several releases was necessary.

After all this work, generic alias types will finally be available by default in Go 1.24.

To allow third-party clients to get their code ready, starting with Go 1.23, support for generic type aliases can be enabled by setting GOEXPERIMENT=aliastypeparams when invoking the go tool. However, be aware that support for exported generic aliases is still missing for that version.

Full support (including export) is implemented at tip, and the default setting for GOEXPERIMENT will soon be switched so that generic type aliases are enabled by default. Thus, another option is to experiement with the latest version of Go at tip.

As always, please let us know if you encounter any problems by filing an issue; the better we test a new feature, the smoother the general roll-out.

Thanks and happy refactoring!

❌
❌