Errors, Errors Everywhere: How We Centralized and Structured Error Handling
Handling errors in Go is simple and flexible – yet no structure!
It’s supposed to be simple, right? Just return an error, wrapped with a message, and move on. Well, that simplicity quickly turns into chaos as the codebase grows with more packages, more developers, and more “quick fixes” that stay there forever. Over time, the logs fill up with “failed to do this” and “unexpected that”, and nobody knows whether it’s the user’s fault, the server’s fault, buggy code, or just a misalignment of the stars!
Errors are created with inconsistent messages. Each package has its own set of styles, constants, or custom error types. Error codes are added arbitrarily. There is no easy way to tell which errors a function may return without digging into its implementation!
So, I took the challenge of creating a new error framework. We decided to go with a structured, centralized system using namespace codes to make errors meaningful, traceable, and – most importantly – give us peace of mind!
This is the story of how we started with a simple error handling approach, got thoroughly frustrated as the problems grew, and eventually built our own error framework. It covers the design decisions, the implementation, the lessons learned, and why it transformed our approach to managing errors. I hope it gives you some ideas too!
Go errors are just values
Go has a straightforward way to handle errors: errors are just values. An error is any value that implements the error interface, which has a single method, Error() string. Instead of throwing an exception and disrupting the current execution flow, Go functions return an error value alongside other results. The caller can then decide how to handle it: check its value to make a decision, wrap it with a new message and context, or simply return it and leave the handling to the parent callers.
We can make any type an error by adding the Error() string method on it. This flexibility allows each package to define its own error-handling strategy, and choose whatever works best for them. This also integrates well with Go’s philosophy of composability, making it easy to wrap, extend, or customize errors as required.
Every package needs to deal with errors
The common practice is to return an error value that implements the error interface and lets the caller decide what to do next. Here’s a typical example:
func loadCredentials() (*Credentials, error) {
    data, err := os.ReadFile("cred.json")
    if errors.Is(err, os.ErrNotExist) {
        return nil, fmt.Errorf("file not found: %w", err)
    }
    if err != nil {
        return nil, fmt.Errorf("failed to read file: %w", err)
    }
    // verifyCredentials is assumed to parse and validate the raw file content
    cred, err := verifyCredentials(data)
    if err != nil {
        return nil, fmt.Errorf("invalid credentials: %w", err)
    }
    return cred, nil
}
Go provides a handful of utilities for working with errors:
- Creating errors: errors.New() and fmt.Errorf() for generating simple errors.
- Wrapping errors: Wrap errors with additional context using fmt.Errorf() and the %w verb.
- Combining errors: errors.Join() merges multiple errors into a single one.
- Checking and handling errors: errors.Is() matches an error with a specific value, errors.As() matches an error to a specific type, and errors.Unwrap() retrieves the underlying error.
In practice, we usually see these patterns:
- Using standard packages: Returning simple errors with errors.New() or fmt.Errorf().
- Exporting constants or variables: For instance, go-redis and gorm.io define reusable error variables.
- Custom error types: Libraries like lib/pq or grpc/status.Error create specialized error types, often with associated codes for additional context.
- Error interfaces with implementations: The aws-sdk-go uses an interface-based approach to define error types with various implementations.
- Or multiple interfaces: Like Docker’s errdefs, which defines multiple interfaces to classify and manage errors.
We started with a common approach
In the early days, like many Go developers, we followed Go’s common practices and kept error handling minimal yet functional. It worked well enough for a couple of years.
- Include stacktrace using pkg/errors, a popular package at that time.
- Export constants or variables for package-specific errors.
- Use errors.Is() to check for specific errors.
- Wrap errors with new messages and context.
- For API errors, we define error types and codes with Protobuf enum.
Including stacktrace with pkg/errors
We used pkg/errors, a popular error-handling package at the time, to include stacktrace in our errors. This was particularly helpful for debugging, as it allowed us to trace the origin of errors across different parts of the application.
To create, wrap, and propagate errors with stacktrace, we implemented functions like Newf(), NewValuef(), and Wrapf(). Here’s an example of our early implementation:
type xError struct {
    msg   string
    err   error  // wrapped cause, nil for new errors
    stack *stack // call stack captured by callers()
}
func Newf(msg string, args ...any) error {
    return &xError{
        msg:   fmt.Sprintf(msg, args...),
        stack: callers(), // 👈 stacktrace
    }
}
func NewValuef(msg string, args ...any) error {
    return fmt.Errorf(msg, args...) // 👈 no stacktrace
}
func Wrapf(err error, msg string, args ...any) error {
    if err == nil {
        return nil
    }
    // reuse the stacktrace of the wrapped error when it already has one
    stack := getStack(err)
    if stack == nil {
        stack = callers()
    }
    return &xError{
        msg:   fmt.Sprintf(msg, args...),
        err:   err,
        stack: stack,
    }
}
Exporting error variables
Each package in our codebase defined its own error variables, often with inconsistent styles.
package database
var ErrNotFound = errors.NewValue("record not found")
var ErrMultipleFound = errors.NewValue("multiple records found")
var ErrTimeout = errors.NewValue("request timeout")
package profile
var ErrUserNotFound = errors.NewValue("user not found")
var ErrBusinessNotFound = errors.NewValue("business not found")
var ErrContextCancel = errors.NewValue("context canceled")
Checking errors with errors.Is() and wrapping with additional context
res, err := repo.QueryUser(ctx, req)
switch {
case err == nil:
    // continue
case errors.Is(err, database.ErrNotFound):
    return nil, errors.Wrapf(ErrUserNotFound, "user not found (id=%v)", req.UserID)
default:
    return nil, errors.Wrapf(err, "failed to query user (id=%v)", req.UserID)
}
This helped propagate errors with more detail but often resulted in verbosity, duplication, and less clarity in logs:
internal server error: failed to query user: user not found (id=52a0a433-3922-48bd-a7ac-35dd8972dfe5): record not found: not found
Defining external errors with Protobuf
For external-facing APIs, we adopted a Protobuf-based error model inspired by Meta’s Graph API:
message Error {
string message = 1;
ErrorType type = 2;
ErrorCode code = 3;
string user_title = 4;
string user_message = 5;
string trace_id = 6;
map<string, string> details = 7;
}
enum ErrorType {
ERROR_TYPE_UNSPECIFIED = 0; // proto3 requires the first enum value to be zero
ERROR_TYPE_AUTHENTICATION = 2;
ERROR_TYPE_INVALID_REQUEST = 3;
ERROR_TYPE_RATE_LIMIT = 4;
ERROR_TYPE_BUSINESS_LIMIT = 5;
ERROR_TYPE_WEBHOOK_DELIVERY = 6;
}
enum ErrorCode {
ERROR_CODE_UNSPECIFIED = 0 [(error_type = UNSPECIFIED)]; // proto3 requires the first enum value to be zero
ERROR_CODE_UNAUTHENTICATED = 2 [(error_type = AUTHENTICATION)];
ERROR_CODE_CAMPAIGN_NOT_FOUND = 3 [(error_type = NOT_FOUND)];
ERROR_CODE_META_CHOSE_NOT_TO_DELIVER = 4 /* ... */;
ERROR_CODE_MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS = 5;
}
This approach helped structure errors, but over time, error types and codes were added without a clear plan, leading to inconsistencies and duplication.
And problems grew over time
Errors were declared everywhere
- Each package defined its own error constants with no centralized system.
- Constants and messages were scattered across the codebase, making it unclear which errors a function might return. Ugh, is it gorm.ErrRecordNotFound or user.ErrNotFound or both?
Random error wrapping led to inconsistent and arbitrary logs
- Many functions wrapped errors with arbitrary, inconsistent messages without declaring their own error types.
- Logs were verbose, redundant, and difficult to search or monitor.
- Error messages were generic and often didn’t explain what went wrong or how it happened. They were also brittle and prone to unnoticed changes.
unexpected gorm error: failed to find business channel: error received when invoking API: unexpected: context canceled
No standardization led to improper error handling
- Each package handled errors differently, making it hard to know if a function returned, wrapped, or transformed errors.
- Context was often lost as errors propagated.
- Upper layers received vague 500 Internal Server Errors without clear root causes.
No categorization made monitoring impossible
- Errors weren’t classified by severity or behavior: A context.Canceled error may be normal behavior when the user closes the browser tab, but it’s important if the request is canceled because that query is randomly slow.
- Important issues were buried under noisy logs, making them hard to identify.
- Without categorization, it was impossible to monitor error frequency, severity, or impact effectively.
It’s time to centralize error handling
Back to the drawing board
To address the growing challenges, we decided to build a better error strategy around the core idea of centralized and structured error codes.
- Errors are declared everywhere → Centralize error declaration in a single place for better organization and traceability.
- Inconsistent and arbitrary logs → Structured error codes with clear and consistent formatting.
- Improper error handling → Standardize error creation and checking on the new Error type with a comprehensive set of helpers.
- No categorization → Categorize error codes with tags for effective monitoring through logs and metrics.
Design decisions
All error codes are defined at a centralized place with namespace structure.
Use namespaces to create clear, meaningful, and extendable error codes. Example:
- PRFL.USR.NOT_FOUND for “User not found.”
- FLD.NOT_FOUND for “Flow document not found.”
- Both can share an underlying base code DEPS.PG.NOT_FOUND, meaning “Record not found in PostgreSQL.”
Each layer of service or library must only return its own namespace codes.
- Each layer of service, repository, or library declares its own set of error codes.
- When a layer receives an error from a dependency, it must wrap it with its own namespace code before returning it.
- For example: When receiving an error gorm.ErrRecordNotFound from a dependency, the “database” package must wrap it as DEPS.PG.NOT_FOUND. Later, the “profile/user” service must wrap it again as PRFL.USR.NOT_FOUND.
All errors must implement the Error interface.
- This creates a clear boundary between errors from third-party libraries (error) and our internal Errors.
- This also helps during migration, by separating migrated packages from not-yet-migrated ones.
An error can wrap one or multiple errors. Together, they form a tree.
[FLD.INVALID_ARGUMENT] invalid argument
→ [TPL.INVALID_PARAMS] invalid input params
1. [TPL.PARAM.EMPTY] name can not be empty
2. [TPL.PARAM.MALFORM] invalid format for param[2]
Always require context.Context, so that context can be attached to the error.
- Many times we saw logs with standalone errors that had no context and no trace_id, and we had no idea where they came from.
- Additional key/value pairs can be attached to errors and used in logs or monitoring.
When errors are sent across service boundary, only the top-level error code is exposed.
- The callers do not need to see the internal implementation details of that service.
For external errors, keep using the current Protobuf ErrorCode and ErrorType.
- This ensures backward compatibility, so our clients don’t need to rewrite their code.
Automap namespace error codes to Protobuf codes, HTTP status codes, and tags.
- Engineers define the mapping in the centralized place, and the framework will map each error code to the corresponding Protobuf ErrorCode, ErrorType, gRPC status, HTTP status, and tags for logging/metrics.
- This ensures consistency and reduces duplication.
The namespace error framework
Core packages and types
There are a few core packages that form the foundation of our new error-handling framework.
connectly.ai/go/pkgs/
- errors: The main package that defines the Error type and codes.
- errors/api: For sending errors to the front-end or external API.
- errors/E: Helper package intended to be used with dot import.
- testing: Testing utilities for working with namespace errors.
Error and Code
The Error interface is an extension of the standard error interface, with an additional method that returns a Code. A Code is implemented as a uint16.
package errors // import "connectly.ai/go/pkgs/errors"
type Error interface {
error
Code() Code
}
type Code struct {
code uint16
}
type CodeI interface {
CodeDesc() CodeDesc
}
type GroupI interface { /* ... */ }
type CodeDesc struct { /* ... */ }
Package errors/E exports all error codes and common types
package E // import "connectly.ai/go/pkgs/errors/E"
import (
    "context"

    "connectly.ai/go/pkgs/errors"
)
type Error = errors.Error
var (
    DEPS = errors.DEPS
    PRFL = errors.PRFL
)
func MapError(ctx context.Context, err error) errors.Mapper { /* ... */ }
// both helpers report a match, so they return bool
func IsErrorCode(err error, codes ...errors.CodeI) bool { /* ... */ }
func IsErrorGroup(err error, groups ...errors.GroupI) bool { /* ... */ }
Example usage
Example error codes:
// dependencies → postgres
DEPS.PG.NOT_FOUND
DEPS.PG.UNEXPECTED
// sdk → hash
SDK.HASH.UNEXPECTED
// profile → user
PRFL.USR.NOT_FOUND
PRFL.USR.UNKNOWN
// profile → user → repository
PRFL.USR.REPO.NOT_FOUND
PRFL.USR.REPO.UNKNOWN
// profile → auth
PRFL.AUTH.UNAUTHENTICATED
PRFL.AUTH.UNKNOWN
PRFL.AUTH.UNEXPECTED
Package database:
package database // import "connectly.ai/go/pkgs/database"
import "context"
import "gorm.io/gorm"
import . "connectly.ai/go/pkgs/errors/E"
// DB wraps *gorm.DB so that every error leaves this package as an 'Error'
type DB struct { gorm *gorm.DB }
func (x *DB) Exec(ctx context.Context, sql string, params ...any) *DB {
    tx := x.gorm.WithContext(ctx).Exec(sql, params...)
    return wrapTx(tx)
}
func (x *DB) Error(msgArgs ...any) Error {
    return wrapError(x.gorm.Error) // 👈 convert gorm error to 'Error'
}
func (x *DB) SingleRowError(msgArgs ...any) Error {
    if err := x.Error(); err != nil { return err }
    switch {
    case x.gorm.RowsAffected == 1: return nil
    case x.gorm.RowsAffected == 0:
        return DEPS.PG.NOT_FOUND.CallerSkip(1).
            New(x.Context(), formatMsgArgs(msgArgs))
    default:
        return DEPS.PG.UNEXPECTED.CallerSkip(1).
            New(x.Context(), formatMsgArgs(msgArgs))
    }
}
Package pb/services/profile:
package profile // import "connectly.ai/pb/services/profile"
// these types are generated from services/profile.proto
type QueryUserRequest struct {
BusinessId string
UserId string
}
type LoginRequest struct {
Username string
Password string
}
Package service/profile:
package profile
import "context"
import uuid "github.com/google/uuid"
import . "connectly.ai/go/pkgs/errors/E"
import api "connectly.ai/go/pkgs/errors/api"
import l "connectly.ai/go/pkgs/logging/l"
import profilepb "connectly.ai/pb/services/profile"
// repository requests
type QueryUserByUsernameRequest struct {
    Username string
}
// repository layer → query user
func (r *UserRepository) QueryUserByUsername(
    ctx context.Context, req *QueryUserByUsernameRequest,
) (*User, Error) {
    if req.Username == "" {
        return nil, PRFL.USR.REPO.INVALID_ARGUMENT.New(ctx, "empty request")
    }
    var user User
    sqlQuery := `SELECT * FROM "user" WHERE username = ? LIMIT 1`
    tx := r.db.Exec(ctx, sqlQuery, req.Username).Scan(&user)
    err := tx.SingleRowError()
    switch {
    case err == nil:
        return &user, nil
    case IsErrorCode(err, DEPS.PG.NOT_FOUND):
        // translate the dependency code into this layer's own code
        return nil, PRFL.USR.REPO.NOT_FOUND.
            With(l.String("username", req.Username)).
            Wrap(ctx, err, "user not found")
    default:
        return nil, PRFL.USR.REPO.UNKNOWN.
            Wrap(ctx, err, "failed to query user")
    }
}
// user service layer → query user
func (u *UserService) QueryUser(
    ctx context.Context, req *profilepb.QueryUserRequest,
) (*profilepb.QueryUserResponse, Error) {
    // ...
    rr := QueryUserByUsernameRequest{ Username: req.Username }
    user, err := u.repo.QueryUserByUsername(ctx, &rr)
    if err != nil {
        return nil, MapError(ctx, err).
            Map(PRFL.USR.REPO.NOT_FOUND, PRFL.USR.NOT_FOUND,
                "the user %q cannot be found", req.Username,
                api.UserTitle("User Not Found"),
                api.UserMsg("The requested user id %q can not be found", req.UserId)).
            KeepGroup(PRFL.USR).
            Default(PRFL.USR.UNKNOWN, "failed to query user")
    }
    // ... build resp from user
    return resp, nil
}
// auth service layer → login user
func (a *AuthService) Login(
    ctx context.Context, req *profilepb.LoginRequest,
) (*profilepb.LoginResponse, Error) {
    // collect all validation failures before returning
    vl := PRFL.AUTH.INVALID_ARGUMENT.WithMsg("invalid request")
    vl.Vl(req.Username != "", "no username", api.Detail("username is required"))
    vl.Vl(req.Password != "", "no password", api.Detail("password is required"))
    if err := vl.ToError(ctx); err != nil {
        return nil, err
    }
    hashpwd, err := hash.Hash(req.Password)
    if err != nil {
        return nil, PRFL.AUTH.UNEXPECTED.Wrap(ctx, err, "failed to calc hash")
    }
    usrReq := profilepb.QueryUserByUsernameRequest{/*...*/}
    usrRes, err := a.userServiceClient.QueryUserByUsername(ctx, usrReq)
    if err != nil {
        return nil, MapError(ctx, err).
            Map(PRFL.USR.NOT_FOUND, PRFL.AUTH.UNAUTHENTICATED, "unauthenticated").
            Default(PRFL.AUTH.UNKNOWN, "failed to query by username")
    }
    // ...
}
Well, there are a lot of new functions and concepts in the above code. Let’s go through them step by step.
Creating and wrapping errors
First, import package errors/E using dot import
This will allow you to directly use common types like Error instead of errors.Error and access codes as PRFL.USR.NOT_FOUND instead of errors.PRFL.USR.NOT_FOUND.
import . "connectly.ai/go/pkgs/errors/E"
Create new errors using CODE.New()
Suppose you get an invalid request, you can create a new error by:
err := PRFL.USR.INVALID_ARGUMENT.New(ctx, "invalid request")
- PRFL.USR.INVALID_ARGUMENT is a Code.
- A Code exposes methods like New() or Wrap() for creating a new error.
- The New() function receives context.Context as the first argument, followed by message and optional arguments.
Print it with fmt.Print(err):
[PRFL.USR.INVALID_ARGUMENT] invalid request
or with fmt.Printf("%+v", err) to see more details:
[PRFL.USR.INVALID_ARGUMENT] invalid request
connectly.ai/go/services/profile.(*UserService).QueryUser
/usr/i/src/go/services/profile/user.go:1234
connectly.ai/go/services/profile.(*UserRepository).QueryUser
/usr/i/src/go/services/profile/repo/user.go:2341
Wrap an error within a new error using CODE.Wrap()
dbErr := DEPS.PG.NOT_FOUND.Wrap(ctx, gorm.ErrRecordNotFound, "not found")
usrErr := PRFL.USR.NOT_FOUND.Wrap(ctx, dbErr, "user not found")
will produce this output with fmt.Print(usrErr):
[PRFL.USR.NOT_FOUND] user not found → [DEPS.PG.NOT_FOUND] not found → record not found
or with fmt.Printf("%+v", usrErr)
[PRFL.USR.NOT_FOUND] user not found
→ [DEPS.PG.NOT_FOUND] not found
→ record not found
connectly.ai/go/services/profile.(*UserService).QueryUser
/usr/i/src/go/services/profile/user.go:1234
The stacktrace will come from the innermost Error. If you are writing a helper function, you can use CallerSkip(skip) to skip frames:
func mapUserError(ctx context.Context, err error) Error {
switch {
case IsErrorCode(err, DEPS.PG.NOT_FOUND):
return PRFL.USR.NOT_FOUND.CallerSkip(1).Wrap(ctx, err, "...")
default:
return PRFL.USR.UNKNOWN.CallerSkip(1).Wrap(ctx, err, "...")
}
}
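To make the effect of skipping frames concrete, here is a small sketch built on the standard runtime package. It is not the framework's implementation: recordCaller, helperNoSkip, helperSkip, and handler are hypothetical names, and the only point is to show how a skip of 1 moves the recorded location from the helper to the helper's caller.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// recordCaller returns the short name of the function that is `skip` frames
// above the function calling recordCaller.
//
//go:noinline
func recordCaller(skip int) string {
	pcs := make([]uintptr, 1)
	// Frame 0 is runtime.Callers, frame 1 is recordCaller itself.
	if runtime.Callers(skip+2, pcs) == 0 {
		return "unknown"
	}
	frame, _ := runtime.CallersFrames(pcs).Next()
	name := frame.Function
	return name[strings.LastIndex(name, ".")+1:]
}

// helperNoSkip records itself as the origin, which is rarely useful.
//
//go:noinline
func helperNoSkip() string { return recordCaller(0) }

// helperSkip skips one frame, so the origin is whoever called the helper.
//
//go:noinline
func helperSkip() string { return recordCaller(1) }

//go:noinline
func handler() (string, string) { return helperNoSkip(), helperSkip() }

func main() {
	a, b := handler()
	fmt.Println(a) // helperNoSkip
	fmt.Println(b) // handler
}
```

The //go:noinline directives are only there to keep the demonstration deterministic.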
Adding context to errors
Add context to an error using With()
- You can add additional key/value pairs to errors by .With(l.String(...)).
- logging/l is a helper package that exports sugar functions for logging.
- l.String("flag", flag) returns a Tag{String: flag} and l.UUID("user_id", userID) returns a Tag{Stringer: userID}.
import l "connectly.ai/go/pkgs/logging/l"
usrErr := PRFL.USR.NOT_FOUND.
With(l.UUID("user_id", req.UserID), l.String("flag", flag)).
Wrap(ctx, dbErr, "user not found")
The tags can be output with fmt.Printf("%+v", usrErr):
[PRFL.USR.NOT_FOUND] user not found
{"user_id": "81febc07-5c06-4e01-8f9d-995bdc2e0a9a", "flag": "ABRW"}
→ [DEPS.PG.NOT_FOUND] not found
{"a number": 42}
→ record not found
Add context to errors directly inside New(), Wrap(), or MapError():
By leveraging the l.String() function and its family, New() and similar functions can smartly detect tags among formatting arguments. There is no need to introduce different functions.
err := INF.HEALTH.NOT_READY.New(ctx,
    "service %q is not ready (retried %v times)",
    req.ServiceName,
    l.String("flag", flag), // tags may appear anywhere among the arguments
    countRetries,
    l.Number("count", countRetries),
)
will output:
[INF.HEALTH.NOT_READY] service "magic" is not ready (retried 2 times)
{"flag": "ABRW", "count": 2}
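One way such tag detection could work is a type switch over the variadic arguments. The sketch below is an assumption about the mechanism, not the framework's code: Tag, String, Number, splitArgs, and newMessage are hypothetical stand-ins for the l package and New().

```go
package main

import "fmt"

// Tag is a hypothetical key/value pair, standing in for what l.String() returns.
type Tag struct {
	Key   string
	Value any
}

func String(key, value string) Tag    { return Tag{Key: key, Value: value} }
func Number(key string, value int) Tag { return Tag{Key: key, Value: value} }

// splitArgs separates Tag values from ordinary formatting arguments,
// preserving the relative order of each group.
func splitArgs(args []any) (fmtArgs []any, tags []Tag) {
	for _, arg := range args {
		if tag, ok := arg.(Tag); ok {
			tags = append(tags, tag)
			continue
		}
		fmtArgs = append(fmtArgs, arg)
	}
	return fmtArgs, tags
}

// newMessage formats the message with the non-tag arguments only.
func newMessage(format string, args ...any) (string, []Tag) {
	fmtArgs, tags := splitArgs(args)
	return fmt.Sprintf(format, fmtArgs...), tags
}

func main() {
	msg, tags := newMessage("service %q is not ready (retried %v times)",
		"magic", String("flag", "ABRW"), 2, Number("count", 2))
	fmt.Println(msg)
	for _, t := range tags {
		fmt.Printf("%s=%v\n", t.Key, t.Value)
	}
}
```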
Different types: Error0, VlError, ApiError
Currently, there are 3 types that implement the Error interface. You can add more types if necessary. Each one can have a different structure, with custom methods for specific needs.
Error is an extension of Go’s standard error interface
type Error interface {
    error
    Code() Code
    Message() string
    Fields() []tags.Field
    StackTrace() stacktrace.StackTrace
    _base() *base // a private method
}
It contains a private method to ensure that we don’t accidentally implement new Error types outside of the errors package. We may (or may not) lift that restriction in the future as we gain experience with more usage patterns.
Why don’t we just use the standard error interface and use type assertion?
Because we want to separate third-party errors from our internal errors. All layers and packages in our internal code must always return Error. This way we can safely know when we have to convert third-party errors, and when we only need to deal with our internal error codes.
It also creates a boundary between migrated packages and not-yet-migrated packages. Back to reality, we cannot just declare a new type, wave a magic wand, whisper a spell prompt, and then all the millions of lines of code are magically converted and work seamlessly with no bugs! No, that future is not here yet. It may come someday, but for now, we still have to migrate our packages one by one.
Error0 is the default Error type
Most error codes will produce an Error0 value. It contains a base and an optional sub-error. You can use NewX() to return a concrete *Error0 struct instead of an Error interface, but you need to be careful.
type Error0 struct {
    base
    err error
}
var errA Error = DEPS.PG.NOT_FOUND.New(ctx, "not found")
var errB *Error0 = DEPS.PG.NOT_FOUND.NewX(ctx, "not found")
base is the common structure shared by all Error implementations to provide common functionality: Code(), Message(), StackTrace(), Fields(), and more.
type base struct {
code Code
msg string
kv []tags.Field
stack stacktrace.StackTrace
}
VlError is for validation errors
It can contain multiple sub-errors, and provides nice methods to work with validation helpers.
type VlError struct {
base
errs []error
}
You can create a VlError similar to other Errors:
err := PRFL.USR.INVALID_ARGUMENT.New(ctx, "invalid request")
Or make a VlBuilder, add errors to it, then convert it to a VlError:
userID, err0 := parseUUID(req.UserId)
err1 := validatePassword(req.Password)
vl := PRFL.USR.INVALID_ARGUMENT.WithMsg("invalid request")
vl.Add(err0, err1)
vlErr := vl.ToError(ctx)
And include key/value pairs as usual:
vl := PRFL.USR.INVALID_ARGUMENT.
With(l.Bool("testingenv", true)).
WithMsg("invalid request")
userID, err0 := parseUUID(req.UserId)
err1 := validatePassword(req.Password)
vl.Add(err0, err1)
vlErr := vl.ToError(ctx, l.String("user_id", req.UserId))
Using fmt.Printf("%+v", vlErr) will output:
[PRFL.USR.INVALID_ARGUMENT] invalid request
{"testingenv": true, "user_id": "A1234567890"}
ApiError is an adapter for migrating API errors
Previously, we used a separate api.Error struct for returning API errors to the front-end and external clients. It includes ErrorType and ErrorCode as mentioned before.
package api
import errorpb "connectly.ai/pb/models/error"
// Deprecated
type Error struct {
pbType errorpb.ErrorType
pbCode errorpb.ErrorCode
cause error
msg string
usrMsg string
usrTitle string
// ...
}
This type is now deprecated. Instead, we declare all the mappings (ErrorType, ErrorCode, gRPC code, HTTP code) in a centralized place, and convert them at the corresponding boundaries. I will discuss code declaration in the next section.
To migrate to the new namespace error framework, we added a temporary namespace ZZZ.API_TODO. Every ErrorCode becomes a ZZZ.API_TODO code.
ZZZ.API_TODO.UNEXPECTED
ZZZ.API_TODO.INVALID_REQUEST
ZZZ.API_TODO.USERNAME_
ZZZ.API_TODO.META_CHOSE_NOT_TO_DELIVER
ZZZ.API_TODO.MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS
And ApiError is created as an adapter. All functions that previously return *api.Error were changed to return Error (implemented by *ApiError) instead.
package api
import . "connectly.ai/go/pkgs/errors/E"
// previous
func FailPreconditionf(err error, msg string, args ...any) *Error {
    return &Error{
        pbType: ERROR_TYPE_FAILED_PRECONDITION,
        pbCode: ERROR_CODE_MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS,
        cause:  err,
        msg:    fmt.Sprintf(msg, args...),
    }
}
// current: this is deprecated, and serves as an adapter
func FailPreconditionf(err error, msg string, args ...any) Error {
    ctx := context.TODO()
    return ZZZ.API_TODO.MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS.
        CallerSkip(1). // correct the stacktrace by 1 frame
        Wrap(ctx, err, msg, args...)
}
When all the migration is done, the previous usage:
wabaErr := verifyWabaTemplateStatus(tpl)
apiErr := api.FailPreconditionf(wabaErr, "template cannot be edited").
WithErrorCode(ERROR_CODE_MESSAGE_WABA_TEMPLATE_CAN_ONLY_EDIT_ONCE_IN_24_HOURS).
WithUserMsg("According to WhatsApp, the message template can be only edited once in 24 hours. Consider creating a new message template instead.").
ErrorOrNil()
should become:
CPG.TPL.EDIT_ONCE_IN_24_HOURS.Wrap(
    ctx, wabaErr, "template cannot be edited",
    api.UserMsg("According to WhatsApp, the message template can be only edited once in 24 hours. Consider creating a new message template instead."))
Notice that the ErrorCode is implicitly derived from the internal namespace code. No need to explicitly assign it every time. But how to declare the relationship between codes? It will be explained in the next section.
Declaring new error codes
At this point, you already know how to create new errors from existing codes. It’s time to explain codes and how to add a new one.
A Code is implemented as a uint16 value, which has a corresponding string representation.
type Code struct { code uint16 }
fmt.Printf("%q", DEPS.PG.NOT_FOUND)
// "DEPS.PG.NOT_FOUND"
To store those strings, there is an array of all available CodeDesc:
const MaxCode = 321 // 👈 this value is generated
var allCodes [MaxCode]CodeDesc
type CodeDesc struct {
    c    int      // 42
    code string   // DEPS.PG.NOT_FOUND
    tags []string // filled from the "tag:" comments on the declarations
    api  APICodeDesc
}
type APICodeDesc struct {
    ErrorType   errorpb.ErrorType
    ErrorCode   errorpb.ErrorCode
    HTTPCode    int
    DefMessage  string
    UserMessage string
    UserTitle   string
}
Here’s how codes are declared:
var DEPS deps // dependencies
var PRFL prfl // profile
var FLD fld // flow document
type deps struct {
PG pg // postgres
RD rd // redis
}
// tag:postgres
type pg struct {
NOT_FOUND Code0 // record not found
CONFLICT Code0 // record already exist
MALFORM_SQL Code0
}
// tag:profile
type prfl struct {
REPO prfl_repo
USR usr
AUTH auth
}
// tag:profile
type prfl_repo struct {
NOT_FOUND Code0 // internal error code
INVALID_ARGUMENT VlCode // internal error code
}
// tag:usr
type usr struct {
NOT_FOUND Code0 `api-code:"USER_NOT_FOUND"`
INVALID_ARGUMENT VlCode `api-code:"INVALID_ARGUMENT"`
DISABLED_ACCOUNT Code0 `api-code:"DISABLED_ACCOUNT"`
}
// tag:auth
type auth struct {
UNAUTHENTICATED Code0 `api-code:"UNAUTHENTICATED"`
PERMISSION_DENIED Code0 `api-code:"PERMISSION_DENIED"`
}
After declaring new codes, you need to run the generation script:
The generated code will look like this:
// Code generated by error-codes. DO NOT EDIT.
func init() {
// ...
PRFL.AUTH.UNAUTHENTICATED = Code0{Code{code: 143}}
PRFL.AUTH.PERMISSION_DENIED = Code0{Code{code: 144}}
// ...
allCodes[143] = CodeDesc{
c: 143, code: "PRFL.AUTH.UNAUTHENTICATED",
tags: []string{"auth", "profile"},
api: APICodeDesc{
ErrorType: ERROR_TYPE_UNAUTHENTICATED,
ErrorCode: ERROR_CODE_UNAUTHENTICATED,
HTTPCode: 401,
DefMessage: "Unauthenticated error",
UserMessage: "You are not authenticated.",
UserTitle: "Unauthenticated error",
},
}
}
Each Error type has a corresponding Code type
Ever wonder how PRFL.USR.NOT_FOUND.New() creates an *Error0 while PRFL.USR.INVALID_ARGUMENT.New() creates a *VlError? It’s because they use different code types.
And each Code type returns a different Error type, each of which can have its own extra methods:
type Code0 struct { Code }
type VlCode struct { Code }
func (c Code0) New(/*...*/) Error {
return &Error0{/*...*/}
}
func (c VlCode) New(/*...*/) Error {
return &VlError{/*...*/}
}
// extra methods on VlCode to create VlBuilder
func (c VlCode) WithMsg(msg string, args ...any) *VlBuilder {/*...*/}
type VlBuilder struct {
code VlCode
msg string
args []any
}
func (b *VlBuilder) ToError(/*...*/) Error {
    return &VlError{base: base{code: b.code.Code /*...*/}}
}
Use api-code to mark the codes available for external API
- The namespace error code should be used internally.
- To make a code available for returning in external HTTP API, you need to mark it with api-code. The value is the corresponding errorpb.ErrorCode.
- If an error code is not marked with api-code, it’s internal code and will be shown as a generic Internal Server Error.
- Notice that PRFL.USR.NOT_FOUND is external code, while PRFL.USR.REPO.NOT_FOUND is internal code.
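A generator could discover which codes are external by reading the api-code struct tag. The sketch below uses the standard reflect package for brevity and is only an assumption about how such a tool might work; apiCodes and the REPO_NOT_FOUND field are hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
)

type Code0 struct{ code uint16 }

// usr mirrors the declaration style from the article.
type usr struct {
	NOT_FOUND      Code0 `api-code:"USER_NOT_FOUND"`
	REPO_NOT_FOUND Code0 // no tag: internal only
}

// apiCodes returns field name → api-code for every tagged field of a struct.
func apiCodes(group any) map[string]string {
	out := map[string]string{}
	t := reflect.TypeOf(group)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if v, ok := f.Tag.Lookup("api-code"); ok {
			out[f.Name] = v
		}
	}
	return out
}

func main() {
	fmt.Println(apiCodes(usr{})) // map[NOT_FOUND:USER_NOT_FOUND]
}
```

Untagged fields simply do not appear in the result, which matches the rule that unmarked codes stay internal.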
Declare mapping between ErrorCode, ErrorType, and gRPC/HTTP codes in protobuf using enum option:
// error/type.proto
ERROR_TYPE_PERMISSION_DENIED = 707 [(error_type_detail_option) = {
type: "PermissionDeniedError",
grpc_code: PERMISSION_DENIED,
http_code: 403, // Forbidden
message: "permission denied",
user_title: "Permission denied",
user_message: "The caller does not have permission to execute the specified operation.",
}];
// error/code.proto
ERROR_CODE_DISABLED_ACCOUNT = 70020 [(error_code_detail_option) = {
error_type: ERROR_TYPE_DISABLED_ACCOUNT,
grpc_code: PERMISSION_DENIED,
http_code: 403, // Forbidden
message: "account is disabled",
user_title: "Account is disabled",
user_message: "Your account is disabled. Please contact support for more information.",
}];
UNEXPECTED and UNKNOWN codes
Each layer usually has 2 generic codes UNEXPECTED and UNKNOWN. They serve slightly different purposes:
- UNEXPECTED code is used for errors that should never happen.
- UNKNOWN code is used for errors that are not explicitly handled.
Mapping errors to new code
When receiving an error returned from a function, you need to handle it: convert third-party errors to internal namespace errors and map error codes from inner layers to outer layers.
Convert third-party errors to internal namespace errors
How you handle errors depends on: what the third-party package returns and what your application needs. For example, when handling database or external API errors:
switch {
case errors.Is(err, sql.ErrNoRows):
// map a database "no rows" error to an internal "not found" error
return nil, PRFL.USR.NOT_FOUND.Wrap(ctx, err, "user not found")
case errors.Is(err, context.DeadlineExceeded):
// map a context deadline exceeded error to a timeout error
return nil, PRFL.USR.TIMEOUT.Wrap(ctx, err, "query timeout")
default:
// wrap any other error as unknown
return nil, PRFL.USR.UNKNOWN.Wrap(ctx, err, "unexpected error")
}
Using helpers for internal namespace errors
- IsErrorCode(err, CODES...): Returns true if the error contains any of the specified codes.
- IsErrorGroup(err, GROUP): Returns true if the error belongs to the input group.
Typical usage pattern:
user, err := queryUser(ctx, userReq)
switch {
case err == nil:
    // continue
case IsErrorCode(err, PRFL.USR.REPO.NOT_FOUND):
    // check for specific error code and convert to external code
    // and return as HTTP 404 Not Found
    return nil, PRFL.USR.NOT_FOUND.Wrap(ctx, err, "user not found")
case IsErrorGroup(err, PRFL.USR):
    // errors that belong to the PRFL.USR group are returned as is
    return nil, err
default:
    return nil, PRFL.USR.UNKNOWN.Wrap(ctx, err, "failed to query user")
}
MapError() for writing mapping code more easily:
Since mapping error codes is a common pattern, there is a MapError() helper to make writing this code faster. The above code can be rewritten as:
user, err := queryUser(ctx, userReq)
if err != nil {
    return nil, MapError(ctx, err).
        Map(PRFL.USR.REPO.NOT_FOUND, PRFL.USR.NOT_FOUND, "user not found").
        KeepGroup(PRFL.USR).
        Default(PRFL.USR.UNKNOWN, "failed to query user")
}
You can format arguments and add key/value pairs as usual:
return nil, MapError(ctx, err).
    Map(PRFL.USR.REPO.NOT_FOUND, PRFL.USR.NOT_FOUND,
        "user %v not found", username,
        l.String("flag", flag)).
    KeepGroup(PRFL.USR).
    Default(PRFL.USR.UNKNOWN, "failed to query user",
        l.Any("retries", retryCount))
Testing with namespace Errors
Testing is critical for any serious code base. The framework provides specialized helpers like ΩxError() to make writing and asserting error conditions in tests easier and more expressive.
// 👉 return true if the error contains the message
ΩxError(err).Contains("not found")
// 👉 return true if the error does not contain the message
ΩxError(err).NOT().Contains("not found")
There are many more methods, and you can chain them too:
ΩxError(err).
    MatchCode(DEPS.PG.NOT_FOUND).                    // match any code in top or wrapped errors
    TopErrorMatchCode(PRFL.TPL.NOT_FOUND).           // only match code from the top error
    MatchAPICode(API_CODE.WABA_TEMPLATE_NOTE_FOUND). // match errorpb.ErrorCode
    MatchExact("exact message to match")
Why use methods instead of Ω(err).To(testing.MatchCode())?
Because methods are more discoverable. When you’re faced with dozens of functions like testing.MatchValues(), it’s hard to know which ones will work with Errors and which will not. With methods, you can simply type a dot (.), and your IDE will list all available methods specifically designed for asserting Errors.
Migration
The framework is just half of the story. Writing the code? That’s the easy part. The real challenge starts when you have to bring it into a massive, living codebase where dozens of engineers are pushing changes daily, customers expect everything to work perfectly, and the system just can’t stop running.
Migration comes with responsibility. It’s about carefully splitting the work into tiny pieces, making small changes one at a time, and breaking a ton of tests in the process. Then it’s manually inspecting and fixing them one by one, merging into the main branch, deploying to production, and watching the logs and alerts. And repeating it over and over…
Here are some tips for migration that we learned along the way:
Start with search and replace: Begin by replacing old patterns with the new framework. Fix any compilation issues that arise from this process.
For example, replace all error in this package with Error.
type ProfileController interface {
LoginUser(req *LoginRequest) (*LoginResponse, error)
QueryUser(req *QueryUserRequest) (*QueryUserResponse, error)
}
The new code will look like this:
import . "connectly.ai/go/pkgs/errors"
type ProfileController interface {
LoginUser(req *LoginRequest) (*LoginResponse, Error)
QueryUser(req *QueryUserRequest) (*QueryUserResponse, Error)
}
Migrate one package at a time: Start with the lowest-level packages and work your way up. This way, you can ensure that the lower-level packages are fully migrated before moving on to the higher-level ones.
Add missing unit tests: If parts of the codebase lack tests, add them. If you are not confident in your changes, add more tests. They are helpful to make sure that your changes don’t break existing functionality.
If your package depends on calling higher-level packages: Consider marking the related functions as DEPRECATED, then add new functions with the new Error type.
Assume that you are migrating the database package, which has the Transaction() method:
package database
func (db *DB) Transaction(ctx context.Context,
fn func(tx *gorm.DB) error) error {
return db.gorm.Transaction(func(tx *gorm.DB) error {
return fn(tx)
})
}
And it is used in the user service package:
err = s.DB(ctx).Transaction(ctx, func(tx *gorm.DB) error {
    user, usrErr := s.repo.CreateUser(ctx, tx, user)
    if usrErr != nil {
        return usrErr
    }
    // ... continue working with user inside the transaction
    return nil
})
Since you are migrating the database package first, the user package and dozens of other packages are left as is. The s.repo.CreateUser() call still returns the old error type while the Transaction() method needs to return the new Error type. You can mark the Transaction() method as DEPRECATED and add a new TransactionV2() method:
package database
// DEPRECATED: use TransactionV2 instead
func (db *DB) Transaction_DEPRECATED(ctx context.Context,
fn func(tx *gorm.DB) error) error {
return db.gorm.Transaction(func(tx *gorm.DB) error {
return fn(tx)
})
}
func (db *DB) TransactionV2(ctx context.Context,
fn func(tx *gorm.DB) error) Error {
err := db.gorm.Transaction(func(tx *gorm.DB) error {
return fn(tx)
})
return adaptToErrorV2(err)
}
Add new error codes as you go: When you encounter an error that doesn’t fit into the existing ones, add a new code. This will help you build a comprehensive set of error codes over time. Codes from other packages are always available as references.
Conclusion
Error handling in Go can feel simple at first: just return an error and move on. But as our codebase grew, that simplicity turned into a tangled mess of vague logs, inconsistent handling, and endless debugging sessions.
By stepping back and rethinking how we handle errors, we’ve built a system that works for us, not against us. Centralized and structured namespace codes give us clarity, while tools for mapping, wrapping, and testing errors make our lives easier. Instead of swimming through a sea of logs, we now have meaningful, traceable errors that tell us what’s wrong and where to look.
This framework isn’t just about making our code cleaner; it’s about saving time, reducing frustration, and helping us prepare for the unknown. It’s just the beginning of a journey, and we are still discovering more patterns, but the result is a system that brings some peace of mind to error handling. Hopefully, it can spark some ideas for your projects too! 😊
Let’s stay connected!
Author
I’m Oliver Nguyen. A software maker working mostly in Go and JavaScript. I enjoy learning and seeing a better version of myself each day. Occasionally spin off new open source projects. Share knowledge and thoughts during my journey. Connect with me or subscribe to my posts.