Concurrency in Go is often praised for its simplicity, but building systems that are both scalable and correct requires more than sprinkling go keywords. Teams frequently encounter deadlocks, resource leaks, and baffling race conditions that erode velocity and trust. This guide is for developers who understand goroutine basics but want to move from working code to production-grade concurrent systems. We will examine proven patterns, dissect their trade-offs, and show you how to choose the right tool for each scenario. Along the way, we share anonymized lessons from real projects—so you can avoid the same costly mistakes.
Why Concurrency Patterns Matter: From Goroutines to Production
At first glance, Go’s concurrency primitives seem straightforward: launch a goroutine with go f(), communicate via channels. But naive usage quickly leads to subtle bugs. For instance, a team building a log aggregator once launched a goroutine per incoming log line—thousands per second—without any backpressure. The system became unresponsive, and the root cause was not CPU saturation but excessive goroutine scheduling overhead and memory pressure.
Patterns are not just academic; they encode hard-won experience. The worker pool pattern, for example, limits the number of concurrent goroutines, providing predictable resource usage. The fan-out/fan-in pattern distributes work across multiple goroutines and collects results, but requires careful synchronization to avoid races. Without these patterns, teams often reinvent the wheel—and introduce bugs.
Another common pitfall is ignoring goroutine lifecycle. A goroutine that blocks forever on a channel read, with no way to signal shutdown, is never reclaimed: its stack and everything it references stay in memory, and enough such leaks can eventually exhaust memory and crash the process. Patterns like context cancellation and graceful shutdown are essential for long-running services. We will cover these in depth, along with testing strategies that catch concurrency bugs before they reach production.
What This Guide Covers
We will walk through the most impactful patterns: worker pools for rate-limiting, fan-out/fan-in for parallel processing, pipelines for streaming data, and errgroups for error propagation. Each pattern includes a composite scenario, a code sketch, and a discussion of when to use—or avoid—it. We also dedicate sections to common mistakes and a decision checklist to help you choose the right pattern for your next project.
The Core Patterns: Worker Pools, Fan-Out/Fan-In, and Pipelines
These three patterns form the backbone of most concurrent Go systems. Each addresses a different need: bounding concurrency, parallelizing independent tasks, and chaining stages of processing.
Worker Pool: Bounding Concurrency
The worker pool pattern creates a fixed number of goroutines (workers) that read jobs from a shared channel and write results to another. This prevents unbounded goroutine creation and provides backpressure. A typical implementation uses a sync.WaitGroup to wait for all workers to finish.
Scenario: A team building a thumbnail generator needed to process images uploaded by users. Naively launching a goroutine per image caused memory spikes when users uploaded batches. By using a worker pool of 10 goroutines, they kept memory usage flat and throughput stable, even under load.
When to use: When you have many independent tasks that are CPU-bound or I/O-bound, and you need to limit resource consumption. When to avoid: When tasks have dependencies or require ordering; a pipeline pattern may be better.
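A minimal sketch of the worker pool described above, assuming at least one worker. The function name sumSquares and the squaring "job" are illustrative placeholders, not part of any real project; the point is the shape: a fixed set of workers, a jobs channel closed by the producer, and a results channel closed once the sync.WaitGroup reports that every worker has returned.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares squares every input using a fixed pool of workers and returns
// the sum of the results. It assumes workers >= 1 (hypothetical example).
func sumSquares(inputs []int, workers int) int {
	jobs := make(chan int)    // unbuffered: producers block while all workers are busy
	results := make(chan int) // workers write here, the caller reads

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs { // loop ends when jobs is closed and drained
				results <- n * n
			}
		}()
	}

	// Producer: feed all jobs, then close the channel so workers can exit.
	go func() {
		for _, n := range inputs {
			jobs <- n
		}
		close(jobs)
	}()

	// Closer: close results exactly once, after every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	sum := 0
	for r := range results {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}, 2)) // prints 30
}
```

Because the jobs channel is unbuffered, the producer can never get more than one job ahead of the workers, which is where the backpressure comes from.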
Fan-Out/Fan-In: Parallel Independent Tasks
Fan-out distributes a set of tasks across multiple goroutines, and fan-in merges their results into a single channel. This pattern is ideal for embarrassingly parallel problems, like processing a batch of files or making multiple API calls.
Scenario: A data pipeline needed to fetch weather data from 1000 stations. Fan-out dispatched one goroutine per station, each writing its result to a shared channel. Fan-in collected all results into a slice for aggregation. The team used a sync.WaitGroup to ensure all fetchers completed before closing the results channel.
Key consideration: The fan-in channel must be closed exactly once, typically after all producers finish. Using sync.WaitGroup with a dedicated goroutine for closing is the standard approach. Failure to close correctly leads to deadlocks in the consumer.
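The closing rule above can be sketched as follows. The reading type and the fetch function are hypothetical stand-ins for the weather-station scenario (a real fetch would make a network call); the part that matters is the dedicated goroutine that waits on the sync.WaitGroup and then closes the results channel exactly once.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// reading is a hypothetical result type for one station.
type reading struct {
	Station int
	TempC   float64
}

// fetch is a placeholder for a real network call.
func fetch(station int) reading {
	return reading{Station: station, TempC: 10 + float64(station)}
}

// collectReadings fans out one goroutine per station and fans the results
// back in through a single channel.
func collectReadings(stations []int) []reading {
	results := make(chan reading)

	var wg sync.WaitGroup
	for _, s := range stations {
		wg.Add(1)
		go func(s int) { // pass s explicitly so each goroutine gets its own copy
			defer wg.Done()
			results <- fetch(s)
		}(s)
	}

	// Only this goroutine closes results, and only after all producers finish.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []reading
	for r := range results { // terminates because results is eventually closed
		out = append(out, r)
	}
	// Arrival order is nondeterministic, so sort before returning.
	sort.Slice(out, func(i, j int) bool { return out[i].Station < out[j].Station })
	return out
}

func main() {
	for _, r := range collectReadings([]int{3, 1, 2}) {
		fmt.Printf("station %d: %.1f C\n", r.Station, r.TempC)
	}
}
```

If the closer goroutine were omitted, the range loop in collectReadings would block forever after the last result, which is the consumer deadlock the text warns about.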
Pipeline: Chaining Stages
Pipelines connect stages via channels, where each stage performs a transformation and sends its output to the next. This pattern is natural for streaming data, like processing log lines or transforming records.
Scenario: A log analysis system ingested raw logs, parsed them into structured events, enriched them with geolocation data, and stored them. Each stage was a goroutine reading from an input channel and writing to an output channel. The pipeline allowed easy scaling: if parsing was the bottleneck, they could fan-out that stage across multiple goroutines.
When to use: When data flows through multiple processing steps, especially if each step can be independently scaled. When to avoid: When stages have tight coupling or when the overhead of channel communication dominates (e.g., for very fast operations).
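A small sketch of a three-stage pipeline in the spirit of the log-analysis scenario. The event type, the "LEVEL message" line format, and the fixed Region value are assumptions made for illustration; a real enrich stage would perform a geolocation lookup. Each stage owns its output channel and closes it when its input is exhausted, so shutdown propagates from the source to the sink.

```go
package main

import (
	"fmt"
	"strings"
)

// event is a hypothetical structured log record.
type event struct {
	Level   string
	Message string
	Region  string
}

// source emits raw lines and closes its output when done.
func source(lines []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, l := range lines {
			out <- l
		}
	}()
	return out
}

// parse turns "LEVEL message" lines into events, skipping malformed lines.
func parse(in <-chan string) <-chan event {
	out := make(chan event)
	go func() {
		defer close(out)
		for line := range in {
			parts := strings.SplitN(line, " ", 2)
			if len(parts) != 2 {
				continue
			}
			out <- event{Level: parts[0], Message: parts[1]}
		}
	}()
	return out
}

// enrich fills in Region; "unknown" is a placeholder for a real lookup.
func enrich(in <-chan event) <-chan event {
	out := make(chan event)
	go func() {
		defer close(out)
		for e := range in {
			e.Region = "unknown"
			out <- e
		}
	}()
	return out
}

// run wires the stages together and drains the final channel.
func run(lines []string) []event {
	var events []event
	for e := range enrich(parse(source(lines))) {
		events = append(events, e)
	}
	return events
}

func main() {
	for _, e := range run([]string{"INFO started", "ERROR disk full"}) {
		fmt.Printf("%s %q region=%s\n", e.Level, e.Message, e.Region)
	}
}
```

With one goroutine per stage, order is preserved end to end. Fanning out a slow stage across several goroutines trades that ordering guarantee for throughput.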
Putting Patterns into Practice: Step-by-Step Implementation
Let's implement a concrete example: a service that fetches URLs, scrapes their titles, and stores the results. We'll use a worker pool for the I/O-bound fetch, then a pipeline for processing.
Step 1: Define the Job and Result Types
We start with structs for jobs (URLs) and results (URL + title). This makes the code self-documenting and easy to extend.
Step 2: Create the Worker Pool
We create a fixed number of workers, each reading from a jobs channel. Workers call fetchTitle and send results to a results channel. We use a sync.WaitGroup to wait for all workers to finish, then close the results channel.
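Steps 1 and 2 might look like the sketch below. The Job and Result field names are assumptions, and fetchTitle here is a stub that fabricates a title without touching the network; a real version would issue an HTTP GET and extract the title element. The helper fetchAll exists only to make the example runnable end to end.

```go
package main

import (
	"fmt"
	"sync"
)

// Job and Result are the types from Step 1 (field names are assumptions).
type Job struct{ URL string }

type Result struct {
	URL   string
	Title string
	Err   error
}

// fetchTitle is a stub standing in for a real HTTP fetch and HTML parse.
func fetchTitle(url string) (string, error) {
	if url == "" {
		return "", fmt.Errorf("empty URL")
	}
	return "title of " + url, nil
}

// startWorkers launches a fixed number of workers (assumed >= 1) that read
// jobs and emit results. The returned channel is closed after all workers exit.
func startWorkers(jobs <-chan Job, workers int) <-chan Result {
	results := make(chan Result)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				title, err := fetchTitle(j.URL)
				results <- Result{URL: j.URL, Title: title, Err: err}
			}
		}()
	}
	go func() {
		wg.Wait()
		close(results)
	}()
	return results
}

// fetchAll is a demo helper: it feeds URLs to the pool and gathers results by URL.
func fetchAll(urls []string, workers int) map[string]Result {
	jobs := make(chan Job)
	results := startWorkers(jobs, workers)
	go func() {
		for _, u := range urls {
			jobs <- Job{URL: u}
		}
		close(jobs)
	}()
	out := make(map[string]Result)
	for r := range results {
		out[r.URL] = r
	}
	return out
}

func main() {
	for url, r := range fetchAll([]string{"https://example.com", ""}, 2) {
		fmt.Printf("%q -> %q (err: %v)\n", url, r.Title, r.Err)
	}
}
```

Carrying the error inside Result, rather than on a separate channel, keeps each outcome paired with the URL that produced it.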
Step 3: Fan-Out the Fetches
If the list of URLs is large, we can fan-out: launch multiple workers (e.g., 20) to fetch concurrently. The jobs channel acts as a buffer; workers block when the channel is empty.
Step 4: Fan-In Results with a Collector
We launch a collector goroutine that reads from the results channel and appends to a slice. After all workers are done (the results channel is closed), the collector signals completion via a done channel.
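One way to sketch the collector is shown below. The collector struct and its Wait method are hypothetical names; the Result type is reduced to two fields to keep the example short. Closing the done channel is what makes it safe for the caller to read the slice: everything the collector goroutine wrote happens before the close is observed.

```go
package main

import "fmt"

// Result is a trimmed-down version of the Step 1 result type.
type Result struct{ URL, Title string }

// collector gathers results in the background and signals completion on done.
type collector struct {
	done    chan struct{}
	results []Result
}

// startCollector begins draining in. The slice must not be read until done
// is closed, which Wait takes care of.
func startCollector(in <-chan Result) *collector {
	c := &collector{done: make(chan struct{})}
	go func() {
		defer close(c.done) // signals that no more appends will happen
		for r := range in {
			c.results = append(c.results, r)
		}
	}()
	return c
}

// Wait blocks until the input channel has been closed and drained.
func (c *collector) Wait() []Result {
	<-c.done
	return c.results
}

func main() {
	in := make(chan Result)
	c := startCollector(in)
	in <- Result{"https://example.com", "Example Domain"}
	in <- Result{"https://example.org", "Example Domain"}
	close(in)
	fmt.Println(len(c.Wait()), "results collected") // prints "2 results collected"
}
```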
Step 5: Combine with Pipeline for Post-Processing
After collection, we might want to filter results (e.g., remove empty titles) and store them. We can add a pipeline stage: a filter goroutine that reads from the collector's output channel and writes to a store channel. This separation keeps each stage testable.
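A possible shape for the post-processing stage, kept deliberately small. The emit helper and the map-backed store function are placeholders introduced for this sketch (a real store would write to a database or file); filterEmpty is the stage the paragraph above describes.

```go
package main

import "fmt"

// Result is a trimmed-down version of the Step 1 result type.
type Result struct{ URL, Title string }

// emit replays an already collected slice onto a channel.
func emit(rs []Result) <-chan Result {
	out := make(chan Result)
	go func() {
		defer close(out)
		for _, r := range rs {
			out <- r
		}
	}()
	return out
}

// filterEmpty forwards only results that have a non-empty title.
func filterEmpty(in <-chan Result) <-chan Result {
	out := make(chan Result)
	go func() {
		defer close(out)
		for r := range in {
			if r.Title != "" {
				out <- r
			}
		}
	}()
	return out
}

// store is a stand-in for persistence: it drains the channel into a map.
func store(in <-chan Result) map[string]string {
	saved := make(map[string]string)
	for r := range in {
		saved[r.URL] = r.Title
	}
	return saved
}

func main() {
	collected := []Result{
		{"https://example.com", "Example Domain"},
		{"https://example.org", ""},
	}
	saved := store(filterEmpty(emit(collected)))
	fmt.Println(len(saved), "result stored") // prints "1 result stored"
}
```

Because filterEmpty only depends on its input channel, it can be tested in isolation by feeding it a hand-built channel, with no worker pool involved.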
Tooling and Maintenance: Testing, Profiling, and Debugging
Concurrent code is notoriously hard to test and debug. Go provides built-in tools, but teams must adopt disciplined practices.
Testing with the Race Detector
Go's race detector (enabled with the -race flag, for example go test -race ./...) is invaluable. It instruments memory accesses and reports unsynchronized concurrent reads and writes. Run it regularly in CI. However, it only catches races on code paths that actually execute during the run, so it is not a substitute for careful design.
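As an illustration of what the detector looks for, here is a small counter example. The function name countTo is hypothetical. The version shown is the corrected one; the comment marks the line that would be reported as a data race if the mutex were removed and the program run with -race.

```go
package main

import (
	"fmt"
	"sync"
)

// countTo increments a shared counter from n goroutines and returns the total.
func countTo(n int) int {
	var (
		mu    sync.Mutex
		count int
		wg    sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++ // without mu, this read-modify-write is a data race
			mu.Unlock()
		}()
	}
	wg.Wait() // all increments happen before this returns
	return count
}

func main() {
	fmt.Println(countTo(1000)) // prints 1000
}
```

Without the mutex the program may still print 1000 on many runs, which is exactly why the race detector is worth running: it flags the unsynchronized access even when the result happens to look right.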
Profiling Goroutines and Memory
Use pprof to profile goroutine count and heap usage. A goroutine profile shows the stack traces of all live goroutines, helping you spot leaks. For example, a goroutine stuck receiving from a channel that nobody sends to will show up blocked at the same place in successive profiles. Set up HTTP profiling endpoints in development; importing net/http/pprof registers handlers under /debug/pprof/ on the default HTTP mux.
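For a quick look without an HTTP server, the same goroutine profile can be written from inside the process using runtime/pprof. This is a sketch: goroutineDump is a hypothetical helper, and the stuck receiver in main exists only to give the profile something to show.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// goroutineDump returns a text goroutine profile. With debug=1, goroutines
// that share an identical stack are grouped together with a count.
func goroutineDump() string {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return ""
	}
	return buf.String()
}

func main() {
	block := make(chan struct{})
	go func() { <-block }() // deliberately stuck receiver, for demonstration only

	fmt.Println("live goroutines:", runtime.NumGoroutine())
	fmt.Print(goroutineDump())

	close(block) // release the demo goroutine
}
```

Logging runtime.NumGoroutine() periodically is a cheap first signal: a count that only ever grows under steady load is worth a closer look with a full profile.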
Graceful Shutdown with Context
Use context.Context to propagate cancellation signals. Pass a context to every long-running goroutine. When your application receives a SIGTERM, cancel the context, and goroutines should respond by cleaning up and returning. This prevents half-written data and resource leaks.
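A minimal sketch of this shutdown flow, assuming Go 1.16 or later for signal.NotifyContext. The worker, run, and runFor names are illustrative, and the 200 ms timeout in main is there only so the example exits on its own when no signal arrives; a real service would wait for the signal alone.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// worker does periodic work until ctx is cancelled, then reports its id.
func worker(ctx context.Context, id int, exited chan<- int, wg *sync.WaitGroup) {
	defer wg.Done()
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			exited <- id // cleanup (flush buffers, close files) would go here
			return
		case <-ticker.C:
			// one unit of work
		}
	}
}

// run starts n workers, blocks until ctx is cancelled and every worker has
// returned, and reports how many exited cleanly.
func run(ctx context.Context, n int) int {
	exited := make(chan int, n) // buffered so workers never block while exiting
	var wg sync.WaitGroup
	for id := 1; id <= n; id++ {
		wg.Add(1)
		go worker(ctx, id, exited, &wg)
	}
	wg.Wait()
	return len(exited)
}

// runFor is a demo helper that cancels the workers after ms milliseconds.
func runFor(ms, n int) int {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(ms)*time.Millisecond)
	defer cancel()
	return run(ctx, n)
}

func main() {
	// Cancel the context on Ctrl-C or SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Demo only: also stop after 200 ms so the example terminates unattended.
	ctx, cancel := context.WithTimeout(ctx, 200*time.Millisecond)
	defer cancel()

	fmt.Println("workers stopped cleanly:", run(ctx, 3))
}
```

The key property is that main does not return until wg.Wait() has confirmed every worker finished its cleanup, so the process never exits with work half done.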
Logging and Tracing
Add structured logging with correlation IDs to trace requests across goroutines. Distributed tracing (e.g., OpenTelemetry) can help visualize the flow through a pipeline. This is especially useful when debugging latency spikes caused by a single slow stage.
Growth Mechanics: Scaling from Prototype to Production
As your system grows, patterns that worked for a prototype may break down. Anticipate scaling challenges early.
Backpressure and Rate Limiting
Worker pools provide backpressure only when the queue in front of them is bounded: once the jobs channel is full, producers block until a worker frees up. If pending jobs are instead held in an unbounded queue (a growing slice, or one goroutine per pending send), memory pressure increases under load. Use a buffered channel with a bounded capacity, or implement a drop policy for overloaded systems. For example, a team building a notification service used a channel of size 100; when full, they logged a warning and dropped the job, rather than crashing.
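A drop policy like the one in the notification example can be expressed with a non-blocking send. This is a sketch: tryEnqueue is a hypothetical helper, and the capacity of 2 is chosen only to make the overflow easy to see.

```go
package main

import "fmt"

// tryEnqueue attempts a non-blocking send. It returns false, without
// blocking, when the queue is already full.
func tryEnqueue(queue chan<- string, job string) bool {
	select {
	case queue <- job:
		return true
	default:
		return false
	}
}

func main() {
	queue := make(chan string, 2) // small capacity for demonstration
	for _, job := range []string{"a", "b", "c"} {
		if !tryEnqueue(queue, job) {
			fmt.Println("queue full, dropping job", job)
		}
	}
	fmt.Println("queued:", len(queue))
	// Output:
	// queue full, dropping job c
	// queued: 2
}
```

Whether to drop, block, or reject with an error is a product decision as much as a technical one; dropping suits work that is cheap to lose, such as best-effort notifications.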
Dynamic Scaling
Fixed-size worker pools are simple but may underutilize resources during low load and be insufficient during spikes. Consider the golang.org/x/sync/semaphore package: a semaphore.Weighted has a fixed capacity, but one way to vary the effective concurrency at runtime is to acquire and hold spare permits based on metrics like CPU usage or queue depth. However, dynamic scaling adds complexity; start with a fixed pool and only add dynamism if needed.
Error Handling and Retries
Errors in concurrent systems are tricky: a single failed goroutine should not bring down the whole pipeline. Use the errgroup package (golang.org/x/sync/errgroup) to wait for a group of goroutines: Wait returns the first non-nil error, and a group created with errgroup.WithContext also cancels its context on that first failure so the remaining goroutines can stop early. For transient errors, implement retries with exponential backoff, but be careful not to overload downstream services.
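To show the mechanics of "first error cancels the rest" without an external dependency, here is a standard-library sketch. It is not errgroup and not its API; runAll and demo are hypothetical names, and production code would normally reach for errgroup instead.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// runAll runs every task concurrently. The first task to fail records its
// error and cancels the shared context; runAll returns that first error
// after all tasks have returned.
func runAll(ctx context.Context, tasks []func(context.Context) error) error {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, task := range tasks {
		wg.Add(1)
		go func(task func(context.Context) error) {
			defer wg.Done()
			if err := task(ctx); err != nil {
				once.Do(func() { // only the first failure is kept
					firstErr = err
					cancel()
				})
			}
		}(task)
	}
	wg.Wait()
	return firstErr
}

// demo runs one failing task and one task that waits for cancellation.
func demo() (string, bool) {
	sawCancel := false
	tasks := []func(context.Context) error{
		func(ctx context.Context) error { return errors.New("fetch failed") },
		func(ctx context.Context) error {
			<-ctx.Done() // a well-behaved task stops when cancelled
			sawCancel = true
			return ctx.Err()
		},
	}
	err := runAll(context.Background(), tasks)
	return err.Error(), sawCancel
}

func main() {
	msg, sawCancel := demo()
	fmt.Println(msg, sawCancel) // prints "fetch failed true"
}
```

Cancellation is cooperative: the second task stops only because it watches ctx.Done(). A task that ignores its context will run to completion regardless of what its siblings do.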
Risks, Pitfalls, and Mistakes: What Can Go Wrong
Even experienced teams fall into these traps. Recognizing them early saves debugging time.
Goroutine Leaks
A goroutine that blocks forever on a channel read or write, with no way to unblock, will leak. Common causes: forgetting to close a channel, or a goroutine waiting on a channel that no other goroutine writes to. A context.Context helps only if the goroutine is watching it, so wrap blocking channel operations in a select that also listens on ctx.Done(), and give the context a timeout or an explicit cancel. Profile goroutine counts in production to detect leaks early.
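The sketch below shows a producer that cannot leak when its consumer walks away early. The names produce and firstN are illustrative, and firstN assumes n is at least 1. Without the ctx.Done() case, the producer would block forever on its next send once the consumer stopped reading.

```go
package main

import (
	"context"
	"fmt"
)

// produce emits 0, 1, 2, ... until ctx is cancelled. The select lets the
// goroutine exit even if nobody is receiving from out anymore.
func produce(ctx context.Context) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 0; ; i++ {
			select {
			case out <- i:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// firstN reads the first n values (n >= 1) and then abandons the producer.
func firstN(n int) []int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // unblocks the producer after we stop reading

	var got []int
	for v := range produce(ctx) {
		got = append(got, v)
		if len(got) >= n {
			break
		}
	}
	return got
}

func main() {
	fmt.Println(firstN(3)) // prints [0 1 2]
}
```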
Deadlocks
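Deadlocks occur when goroutines wait on each other in a cycle. For example, goroutine A blocks sending on unbuffered channel 1 while goroutine B blocks sending on unbuffered channel 2, and each would only receive from the other's channel after its own send completes. Use a consistent locking order for mutexes, and prefer channels over shared memory where possible. The race detector does not catch deadlocks, and go vet does not detect them in general. The runtime reports "all goroutines are asleep - deadlock!" only when every goroutine is blocked, so a deadlock among a subset of goroutines goes unreported; use runtime stack dumps (SIGQUIT) or a goroutine profile to diagnose those.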
Channel Overuse
Not every concurrent interaction needs a channel. For simple coordination, such as waiting for a group of goroutines to finish, a sync.WaitGroup (or occasionally a sync.Cond) may be simpler. Channels add overhead and can obscure intent. A rule of thumb: if you are passing data or ownership between goroutines, use a channel; if you are guarding shared state such as a counter or a cache, consider a sync.Mutex or an atomic operation.
Ignoring the Cost of Goroutine Creation
Although goroutines are lightweight (a few KB), creating millions per second can overwhelm the scheduler. Always bound concurrency with a worker pool or semaphore. In one incident, a team created a goroutine per HTTP request without a pool; under a DDoS-like surge, the goroutine count hit millions, causing memory exhaustion and a crash.
Decision Checklist: Choosing the Right Pattern
Use this checklist when designing a concurrent component. It helps you match the pattern to the problem.
When to Use Each Pattern
Worker Pool: Use when you have many independent tasks and need to limit concurrency for resource management. Avoid when tasks have ordering constraints or when you need dynamic scaling.
Fan-Out/Fan-In: Use when you can parallelize independent work and need to collect all results. Avoid when tasks are not independent (e.g., shared state) or when the number of tasks is small (overhead may not be worth it).
Pipeline: Use when data flows through sequential stages, and you want to parallelize each stage independently. Avoid when stages are tightly coupled or when the processing per item is very fast (channel overhead dominates).
Decision Table
| Pattern | Best For | Key Risk |
|---|---|---|
| Worker Pool | Rate-limited processing, I/O tasks | Queue overflow if not bounded |
| Fan-Out/Fan-In | Parallel independent tasks, batch processing | Channel close coordination |
| Pipeline | Streaming data, multi-stage transforms | Stage imbalance causing backpressure |
Quick Checklist
- Are tasks independent? → Fan-Out/Fan-In or Worker Pool
- Is data streaming? → Pipeline
- Do you need to limit resource usage? → Worker Pool
- Do you need to cancel on first error? → Errgroup
- Is graceful shutdown required? → Context cancellation
Bringing It All Together: Building a Concurrency-Ready Mindset
Mastering concurrency in Go is not about memorizing patterns; it is about developing judgment. Start by identifying the nature of your tasks: independent or dependent, CPU-bound or I/O-bound, bounded or unbounded. Then apply the simplest pattern that fits. Resist the urge to over-engineer; a mutex around shared state, or even a single goroutine, can be the right solution.
We have covered the core patterns, implementation steps, testing strategies, and common pitfalls. The next time you face a concurrency challenge, refer back to this guide. Use the decision checklist, profile your code, and test with the race detector. Remember that concurrency bugs are often non-deterministic; invest in thorough testing and monitoring.
As you build larger systems, consider composing patterns: a pipeline that uses worker pools for each stage, or fan-out within a pipeline stage. The Go ecosystem also offers higher-level abstractions like the conc library for structured concurrency, but understand the fundamentals before adopting them.
Finally, share your experiences with the community. Concurrency patterns evolve with practice, and what works for one team may not work for another. By staying curious and disciplined, you can build systems that are both fast and reliable.