Introduction
"With great power comes great responsibility" - Spiderman (or Voltaire, for the so culturally inclined)
This text is an ever evolving collection of conventions, idioms and tricks that reflects the experience of developing a production-grade application in Nim with a small team of developers.
The guide is a living document to help manage the complexities of using an off-the-beaten-track language and environment to produce a stable product ready for an adversarial internet.
Each guideline starts with a general recommendation to use or not use a particular feature; this recommendation represents a safe "default" choice. It is followed by a rationale to help you decide when and how to apply the guideline with nuance - it will not be right for every situation out there but all other things being equal, following the guideline will make life easier for others, your future self included.
Following the principles and defaults set out here helps newcomers to familiarise themselves with the codebase more quickly, while experienced developers will appreciate the consistency when deciphering the intent behind a specific passage of code -- above all when trying to debug production issues under pressure.
The pros and cons sections are based on bugs, confusions and security issues that have been found in real-life code and that could easily have been avoided with.. a bit of style. The objective of this section is to pass the experience on to you, dear reader!
In particular when coming from a different language, experience with features like exception handling, generics and compile-time guarantees may not carry over due to subtle, and sometimes surprising, differences in semantics.
Much Nim code "out there" hails from past times when certain language features were not yet developed and best practices not yet established - this also applies to this guide, which will change over time as the practice and language evolve.
When in doubt:
- Read your code
- Deal with errors
- Favour simplicity
- Default to safety
- Consider the adversary
- Pay back your debt regularly
- Correct, readable, elegant, efficient, in that order
The latest version of this book can be found online or on GitHub.
This guide currently targets Nim v2.0.
At the time of writing, v2.0 has been released but its new garbage collector is not yet stable enough for production use. It is advisable to test new code with both --mm:refc and --mm:orc (the default) in the transition period.
Practical notes
- When deviating from the guide, document the rationale in the module, allowing the next developer to understand the motivation behind the deviation
- When encountering code that does not follow this guide, follow its local conventions or refactor it
- When refactoring code, ensure good test coverage first to avoid regressions
- Strive towards the guidelines where practical
- Consider backwards compatibility when changing code
- Good code usually happens after several rewrites: on the first pass, the focus is on the problem, not the code - when the problem is well understood, the code can be rewritten
Updates to this guide
Updates to this guide go through review as usual for code - ultimately, some choices in style guides come down to personal preference and contributions of that nature may end up being rejected.
In general, the guide will aim to prioritise:
- safe defaults - avoid footguns and code that is easily abused
- secure practices - assume code is run in an untrusted environment
- compile-time strictness - get the most out of the compiler and language before it hits the user
- readers over writers - only others can judge the quality of your code
Useful resources
While this book covers Nim at Status in general, there are other resources available that may partially overlap with this guide:
- Nim language manual - the authoritative source for understanding the features of the language
- Nim documentation - other official Nim documentation, including its standard library and toolchain
- Nim by Example - Nim tutorials to start with
- Chronos guides
- nim-libp2p docs
Workflow
Pull requests
- One PR, one feature or fix
- Avoid mixing refactoring with features and bugfixes
- Post refactoring PRs early, while working on the feature that benefits from them
- Rebase on top of target branch
- Squash-merge the PR branch for easy rollback
- Since branches contain only one logical change, there's usually no need for more than one target branch commit
- Revert work that causes breakage and investigate in new PR
Contributing
We welcome code contributions and welcome our code being used in other projects.
Generally, all significant code changes are reviewed by at least one team member and must pass CI.
- For style and other trivial fixes, no review is needed (passing CI is sufficient)
- For small ideas, use a PR
- For big ideas, use an RFC issue
Formatting
Style [formatting.style]
We strive to follow NEP-1 for style matters - naming, capitalization, 80-character limit etc. Common places where deviations happen include:
- Code based on external projects
- Wrappers / FFI
- Implementations of specs that have their own naming convention
- Ports from other languages
- Small differences due to manual formatting
- Aligned indents - we prefer Python-style hanging indent in multiline code
  - This is to avoid realignments when changes occur on the first line. The extra level of indentation is there to clearly distinguish itself as a continuation line.
func someLongFunctionName(
    alsoLongVariableName: int) = # Double-indent
  discard # back to normal indent

if someLongCondition and
    moreLongConditions: # Double-indent
  discard # back to normal indent
Practical notes
- We do not use nimpretty - as of writing (Nim 2.0), it is not stable enough for daily use:
  - Can break working code
  - Naive formatting algorithm
- We do not make use of Nim's "flexible" identifier names - all uses of an identifier should match the declaration in capitalization and underscores
  - Enable --styleCheck:usages and, where feasible, --styleCheck:error
Naming conventions [formatting.naming]
Always use the same identifier style (case, underscores) as the declaration.
Enable --styleCheck:usages, and, where feasible, --styleCheck:error.
- Ref for ref object types, which have surprising semantics
  - type XxxRef = ref Xxx
  - type XxxRef = ref object ...
- func init(T: type Xxx, params...): T for "constructors"
  - func init(T: type ref Xxx, params...): T when T is a ref
- func new(T: type Xxx, params...): ref T for "constructors" that return a ref T
  - new introduces ref to a non-ref type
- XxxError for exceptions inheriting from CatchableError
- XxxDefect for exceptions inheriting from Defect
Language features
Nim is a language that organically has grown to include many advanced features and constructs. These features allow you to express your intent with great creativity, but often come with significant stability, simplicity and correctness caveats when combined.
Before stepping off the well-trodden path, consider the maintenance and compatibility costs.
Import, export [language.import]
import a minimal set of modules using explicit paths.
export all modules whose types appear in public symbols of the current module.
Prefer specific imports. Avoid include.
# Group by std, external then internal imports
import
  # Standard library imports are prefixed with `std/`
  std/[options, sets],
  # use full name for "external" dependencies (those from other packages)
  package/[a, b],
  # use relative path for "local" dependencies
  ./c, ../d

# export modules whose types are used in public symbols in the current module
export options
Practical notes
Modules in Nim share a global namespace, both for the module name itself and for all symbols contained therein - because of this, your code might break because a dependency introduces a module or symbol with the same name - using prefixed imports (relative or package) helps mitigate some of these conflicts.
Because of overloading and generic catch-alls, the same code can behave differently depending on which modules have been imported and in which order - reexporting modules that are used in public symbols helps avoid some of these differences.
See also: sandwich problem
Macros [language.macros]
Be judicious in macro usage - prefer simpler constructs. Avoid generating public API functions with macros.
Pros
- Concise domain-specific languages precisely convey the central idea while hiding underlying details
- Suitable for cross-cutting libraries such as logging and serialization, that have a simple public API
- Prevent repetition, sometimes
- Encode domain-specific knowledge that otherwise would be hard to express
Cons
- Easy to write, hard to understand
  - Require extensive knowledge of the Nim AST
  - Code-about-code requires tooling to turn macro into final execution form, for audit and debugging
  - Unintended macro expansion costs can surprise even experienced developers
- Unsuitable for public API
  - Nowhere to put per-function documentation
  - Tooling needed to discover API - return types, parameters, error handling
- Obfuscated data and control flow
- Poor debugging support
- Surprising scope effects on identifier names
Practical notes
- Consider a more specific, non-macro version first
- Use a difficulty multiplier to weigh introduction of macros:
- Templates are 10x harder to understand than plain code
- Macros are 10x harder than templates, thus 100x harder than plain code
- Write as much code as possible in templates, and glue together using macros
See also: macro defense
Object construction [language.objconstr]
Use Xxx(x: 42, y: Yyy(z: 54)) style, or if type has an init function, Type.init(a, b, c).
Prefer that the default 0-initialization is a valid state for the type.
# `init` functions are a convention for constructors - they are not enforced by the language
func init(T: type Xxx, a, b: int): T = T(
  x: a,
  y: OtherType(s: b) # Prefer Type(field: value)-style initialization
)

let m = Xxx.init(1, 2)

# `new` returns a reference to the given type:
func new(T: type Xxx, a, b: int): ref T = ...

# ... or `init` when used with a `ref Xxx`:
func init(T: type (ref Xxx), a, b: int): T = ...
Pros
- Correct order of initialization enforced by compiler / code structure
- Dedicated syntax constructs a clean instance resetting all fields
- Possible to build static analysis tools to detect uninitialized fields
- Works for both ref and non-ref types
Cons
- Sometimes inefficient compared to updating an existing var instance, since all fields must be re-initialized
- Compared to func newXxx(), func new(T: type Xxx) will be a generic procedure, which can cause issues. See Import, export
Practical notes
- The default, 0-initialized state of the object often gets constructed in the language - avoiding a requirement that a magic init function be called makes the type more ergonomic to use
- Avoid using result or var instance: Type which disable several compiler diagnostics
- When using inheritance, func new(T: type Xxx) will also bind to any type inheriting from Xxx
ref object types [language.refobject]
Avoid ref object types, except:
- for "handle" types that manage a resource and thus break under value semantics
- where shared ownership is intended
- in reference-based data structures (trees, linked lists)
- where a stable pointer is needed for 3rd-party compatibility
Prefer explicit ref MyType where reference semantics are needed, allowing the caller to choose where possible.
# prefer explicit ref modifiers at usage site
func f(v: ref Xxx) = discard
let x: ref Xxx = new Xxx

# Consider using Hungarian naming convention with `ref object` - this makes it clear at usage sites that the type follows the unusual `ref` semantics
type XxxRef = ref object
  # ...
Pros
- ref object types useful to prevent unintended copies
- Limits risk of accidental stack allocation for large types
  - This commonly may lead to stack overflow, especially when RVO is missed
- Garbage collector simplifies some algorithms
Cons
- ref object types have surprising semantics - the meaning of basic operations like = changes
- Shared ownership leads to resource leaks and data races
- nil references cause runtime crashes
- Semantic differences not visible at usage site
- Always mutable - no way to express immutability
- Cannot be stack-allocated
- Hard to emulate value semantics
Notes
XxxRef = ref object is a syntactic shortcut that hides the more explicit ref Xxx where the type is used - by explicitly spelling out ref, readers of the code become aware of the alternative reference / shared ownership semantics, which generally allows a deeper understanding of the code without having to look up the type declaration.
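The difference in the meaning of assignment can be seen in a small sketch. The `Point` type and the `bump` helper are hypothetical names used only for illustration:

```nim
type Point = object
  x: int

# Value semantics: assignment copies, so the original is unaffected
var a = Point(x: 1)
var b = a
b.x = 2
doAssert a.x == 1

# Reference semantics: assignment aliases, so both names see the change,
# even though both are declared with `let`
let r = (ref Point)(x: 1)
let s = r
s.x = 2
doAssert r.x == 2

# The explicit `ref` in the signature tells the reader that the argument
# is shared, mutable state
proc bump(p: ref Point) =
  p.x += 1
```

Spelling out `ref Point` in `bump` makes the mutation through a shared instance visible without having to look up the type declaration.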
Memory allocation [language.memory]
Prefer to use stack-based and statically sized data types in core/low-level libraries. Use heap allocation in glue layers.
Avoid alloca.
func init(T: type Yyy, a, b: int): T = ...

# Heap allocation as a local decision
let x = (ref Xxx)(
  field: Yyy.init(a, b) # In-place initialization using RVO
)
Pros
- RVO can be used for "in-place" initialization of value types
- Better chance of reuse on embedded systems
  - https://barrgroup.com/Embedded-Systems/How-To/Malloc-Free-Dynamic-Memory-Allocation
  - http://www.drdobbs.com/embedded-systems/embedded-memory-allocation/240169150
  - https://www.quora.com/Why-is-malloc-harmful-in-embedded-systems
- Allows consumer of library to decide on memory handling strategy
  - It's always possible to turn plain type into ref, but not the other way around
Cons
- Stack space limited - large types on stack cause hard-to-diagnose crashes
- Hard to deal with variable-sized data correctly
Practical notes
alloca has confusing semantics that easily cause stack overflows - in particular, memory is released when the function ends, which means that in a loop, each iteration will add to the stack usage. Several C compilers implement alloca incorrectly, especially when inlining.
Variable declaration [language.vardecl]
Use the most restrictive of const, let and var that the situation allows.
# Group related variables
const
  a = 10
  b = 20
Practical notes
const and let each introduce compile-time constraints that help limit the scope of bugs that must be considered when reading and debugging code.
Variable initialization [language.varinit]
Prefer expressions to initialize variables and return values
let x =
  if a > 4: 5
  else: 6

func f(b: bool): int =
  if b: 1
  else: 2

# Avoid - `x` is not guaranteed to be initialized by all branches and in correct order (for composite types)
var x: int
if a > 4: x = 5
else: x = 6
Pros
- Stronger compile-time checks
- Lower risk of uninitialized variables even after refactoring
Cons
- Becomes hard to read when deeply nested
Functions and procedures [language.proc]
Prefer func - use proc when side effects cannot conveniently be avoided.
Avoid public functions and variables (*) that don't make up an intended part of public API.
Practical notes
- Public functions are not covered by dead-code warnings and contribute to overload resolution in the global namespace
- Prefer openArray as argument type over seq for traversals
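As a minimal sketch of the openArray recommendation, the hypothetical `total` function below accepts a seq, an array and a sub-range of a seq through the same signature:

```nim
# `openArray` accepts seqs, arrays and slices of either without copying
func total(values: openArray[int]): int =
  var sum = 0
  for v in values:
    sum += v
  sum

let xs = @[1, 2, 3, 4]

let fromSeq = total(@[1, 2, 3])
let fromArray = total([4, 5, 6])
# `toOpenArray` passes a view of elements 1..2 (inclusive) of `xs`
let fromSlice = total(xs.toOpenArray(1, 2))
```

Had `total` been declared with a `seq[int]` parameter, the array and the sub-range would each have required an intermediate copy.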
Methods [language.methods]
Use method sparingly - consider a "manual" vtable with proc closures instead.
Pros
- Compiler-implemented way of doing dynamic dispatch
Cons
- Poor implementation
  - Implemented using if tree
  - Require full program view to "find" all implementations
- Poor discoverability - hard to tell which methods belong together and form a virtual interface for a type
  - All implementations must be public (*)!
Practical notes
- Does not work with generics
- No longer does multi-dispatch
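One possible shape for a "manual" vtable is an object whose fields are annotated closures. This is only a sketch: `Shape`, `square` and `rectangle` are hypothetical names, not part of any library.

```nim
type
  Shape = object
    # "Manual" vtable: one closure field per virtual operation
    name: proc(): string {.raises: [], gcsafe.}
    area: proc(): float {.raises: [], gcsafe.}

proc square(side: float): Shape =
  proc name(): string {.raises: [], gcsafe.} = "square"
  # Captures `side`, which makes this nested proc a closure
  proc area(): float {.raises: [], gcsafe.} = side * side
  Shape(name: name, area: area)

proc rectangle(width, height: float): Shape =
  proc name(): string {.raises: [], gcsafe.} = "rectangle"
  proc area(): float {.raises: [], gcsafe.} = width * height
  Shape(name: name, area: area)

# Dispatch happens through the fields - no `method` involved
for shape in [square(2.0), rectangle(2.0, 3.0)]:
  echo shape.name(), ": ", shape.area()
```

All operations that form the interface are listed in one type declaration, and the implementations stay private to the constructing function.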
Callbacks, closures and forward declarations [language.proctypes]
Annotate proc type definitions and forward declarations with {.raises: [], gcsafe.} or specific exception types.
# By default, Nim assumes closures may raise any exception and are not gcsafe
# By annotating the callback with raises and gcsafe, the compiler ensures that
# any functions assigned to the closure fit the given constraints
type Callback = proc(...) {.raises: [], gcsafe.}
Practical notes
- Without annotations, {.raises: [Exception].} and no GC-safety is assumed by the compiler, infecting deduction in the whole call stack
- Annotations constrain the functions being assigned to the callback to follow its declaration, simplifying calling the callback safely
  - In particular, callbacks are difficult to reason about when they raise exceptions - what should the caller of the callback do?
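A small sketch of how an annotated callback type constrains both sides of the call. `Callback`, `isEven` and `countMatching` are hypothetical names chosen for this example:

```nim
type Callback = proc(value: int): bool {.raises: [], gcsafe.}

proc isEven(value: int): bool {.raises: [].} =
  value mod 2 == 0

# Because `Callback` cannot raise, this function can itself be `raises: []`
# without wrapping the call in try/except
proc countMatching(
    values: openArray[int], cb: Callback): int {.raises: [].} =
  var count = 0
  for v in values:
    if cb(v):
      inc count
  count

let evens = countMatching([1, 2, 3, 4], isEven)

# A proc whose body may raise a tracked exception (for example one that
# calls `strutils.parseInt`) is rejected at compile time when passed as
# `cb`, since its effects do not fit the `Callback` declaration.
```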
result return [language.result]
Avoid using result for returning values.
Use expression-based return or explicit return keyword with a value.
Pros
- Recommended by NEP-1
- Used in standard library
- Saves a line of code avoiding an explicit var declaration
- Accumulation-style functions that gradually build up a return value gain consistency
Cons
- No visual (or compiler) help when a branch is missing a value, or overwrites a previous value
- Disables compiler diagnostics for code branches that forget to set result
- Risk of using partially initialized instances due to result being default-initialized
  - For ref types, result starts out as nil which accidentally might be returned
  - Helpers may accidentally use result before it was fully initialized
  - Async/await using result prematurely due to out-of-order execution
- Partially initialized instances lead to exception-unsafe code where resource leaks happen
  - RVO causes observable stores in the left-hand side of assignments when exceptions are raised after partially modifying result
- Confusing to people coming from other languages
- Confusing semantics in templates and macros
Practical notes
Nim has 3 ways to assign a return value to a function: result, return and "expressions".
Of the three:
- "expression" returns guarantee that all code branches produce one (and only one) value to be returned
  - Used mainly when exit points are balanced and not deeply nested
- Explicit return with a value makes explicit what value is being returned in each branch
  - Used to avoid deep nesting and early exit, above all when returning early due to errors
- result is used to accumulate / build up return value, allowing it to take on invalid values in the interim
Multiple security issues, nil reference crashes and wrong-init-order issues have been linked to the use of result and lack of assignment in branches.
In general, the use of accumulation-style initialization is discouraged unless made necessary by the data type - see Variable initialization
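The two recommended styles can be sketched as follows. `classify` and `firstNegative` are hypothetical functions, and the `-1` "not found" marker is an illustrative choice only:

```nim
func classify(n: int): string =
  # Expression-based: the compiler rejects the function if any branch
  # fails to produce a value
  if n < 0: "negative"
  elif n == 0: "zero"
  else: "positive"

func firstNegative(values: openArray[int]): int =
  # Explicit `return` with a value for the early exit ...
  for i, v in values:
    if v < 0:
      return i
  # ... and an expression for the fall-through case
  -1
```

In both functions every exit point names the value being returned, so a forgotten branch shows up as a compile error instead of a default-initialized result.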
Inline functions [language.inline]
Avoid using explicit {.inline.} functions.
Pros
- Sometimes give performance advantages
Cons
- Adds clutter to function definitions
- Larger code size, longer compile times
- Prevent certain LTO optimizations
Practical notes
- {.inline.} does not inline code - rather it copies the function definition into every C module making it available for the C compiler to inline
- Compilers can use contextual information to balance inlining
- LTO achieves a similar end result without the cons
Converters [language.converters]
Avoid using converters.
Pros
- Implicit conversions lead to low visual overhead of converting types
Cons
- Surprising conversions lead to ambiguous calls:
converter toInt256*(a: int{lit}): Int256 = a.i256
if stringValue.len > 32: ...

Error: ambiguous call; both constants.>(a: Int256, b: int)[declared in constants.nim(76, 5)] and constants.>(a: UInt256, b: int)[declared in constants.nim(82, 5)] match for: (int, int literal(32))
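An explicit conversion function is one alternative to a converter. The sketch below uses a hypothetical distinct type `Meters` rather than the `Int256` type from the example above:

```nim
type Meters = distinct float

# Explicit conversion: visible at the call site, never applied implicitly
func meters(v: float): Meters = Meters(v)

func `+`(a, b: Meters): Meters = Meters(float(a) + float(b))

let distance = meters(1.5) + meters(2.5)

# `meters(1.5) + 2.5` does not compile: without a converter there is no
# silent path from `float` to `Meters`, so overload resolution stays
# unambiguous
```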
Finalizers [language.finalizers]
Don't use finalizers.
Pros
- Alleviates the need for manual cleanup
Cons
- Buggy, cause random GC crashes
- Calling new with finalizer for one instance infects all instances with same finalizer
- Newer Nim versions are migrating to a new implementation of finalizers that are sometimes deterministic (aka destructors)
Binary data [language.binary]
Use byte to denote binary data. Use seq[byte] for dynamic byte arrays.
Avoid string for binary data. If stdlib returns strings, convert to seq[byte] as early as possible.
Pros
- Explicit type for binary data helps convey intent
Cons
- char and uint8 are common choices often seen in Nim
- hidden assumption that 1 byte == 8 bits
- language still being developed to handle this properly - many legacy functions return string for binary data
Practical notes
- stew contains helpers for dealing with bytes and strings
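For cases where only the standard library is available, converting early can be sketched as below. `rawBytes` is a hypothetical helper name; it relies only on `toOpenArrayByte` from the system module:

```nim
func rawBytes(s: string): seq[byte] =
  # Copies the bytes of `s` as-is, without interpreting them as text
  @(s.toOpenArrayByte(0, s.high))

# Typical use: convert at the boundary where a legacy API hands back a string
let payload = rawBytes("AB")
```

From this point on, the type of `payload` documents that the data is binary and not text in some encoding.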
Integers [language.integers]
Prefer signed integers for counting, lengths, array indexing etc.
Prefer unsigned integers of specified size for interfacing with binary data, bit manipulation, low-level hardware access and similar contexts.
Don't cast pointers to int.
Practical notes
- Signed integers are overflow-checked and raise an untracked Defect on overflow, unsigned integers wrap
- int and uint vary depending on platform pointer size - use judiciously
- Perform range checks before converting to int, or convert to larger type
  - Conversion to signed integer raises untracked Defect on overflow
  - When comparing lengths to unsigned integers, convert the length to unsigned
- Pointers may overflow int when used for arithmetic
- Avoid Natural - implicit conversion from int to Natural can raise a Defect
  - see range
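The range-check advice can be sketched with a hypothetical `takeAtMost` helper that receives an untrusted unsigned count, for example one decoded from a wire format:

```nim
func takeAtMost(data: openArray[byte], n: uint64): seq[byte] =
  # A length is never negative, so converting it to unsigned is lossless.
  # The comparison is done in unsigned space, and `n` is converted to
  # `int` only once it is known to fit.
  let count =
    if n > uint64(data.len): data.len
    else: int(n)
  data[0 ..< count]
```

Converting `n` to `int` before the comparison would instead raise an untracked Defect for values above `high(int)`.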
range [language.range]
Avoid range types.
Pros
- Range-checking done by compiler
- More accurate bounds than intXX
- Communicates intent
Cons
- Implicit conversions to "smaller" ranges may raise Defect
- Language feature has several fundamental design and implementation issues
  - https://github.com/nim-lang/Nim/issues/16839
  - https://github.com/nim-lang/Nim/issues/16744
  - https://github.com/nim-lang/Nim/issues/13618
  - https://github.com/nim-lang/Nim/issues/12780
  - https://github.com/nim-lang/Nim/issues/10027
  - https://github.com/nim-lang/Nim/issues?page=1&q=is%3Aissue+is%3Aopen+range
string [language.string]
The string type in Nim represents text in an unspecified encoding, typically UTF-8 on modern systems.
Avoid string for binary data (see language.binary)
Practical notes
- The text encoding is undefined for string types and is instead determined by the source of the data (usually UTF-8 for terminals and text files)
  - When dealing with passwords, differences in encoding between platforms may lead to key loss
Error handling
Error handling in Nim is a subject under constant re-evaluation - similar to C++, several paradigms are supported leading to confusion as to which one to choose.
In part, the confusion stems from the various contexts in which Nim can be used: when executed as small, one-off scripts that can easily be restarted, exceptions allow low visual overhead and ease of use.
When faced with more complex and long-running programs where errors must be dealt with as part of control flow, the use of exceptions can directly be linked to issues like resource leaks, security bugs and crashes.
Likewise, when preparing code for refactoring, the compiler offers little help in exception-based code: although raising a new exception breaks ABI, there is no corresponding change in the API: this means that changes deep inside dependencies silently break dependent code until the issue becomes apparent at runtime (often under exceptional circumstances).
A final note is that although exceptions may have been used successfully in some languages, these languages typically offer complementary features that help manage the complexities introduced by exceptions - RAII, mandatory checking of exceptions, static analysis etc - these have yet to be developed for Nim.
Because of the controversies and changing landscape, the preference for Status projects is to avoid the use of exceptions unless specially motivated, if only to maintain consistency and simplicity.
Porting legacy code
When dealing with legacy code, there are several common issues, most often linked to abstraction and effect leaks. In Nim, exception effects are part of the function signature but deduced based on code. Sometimes the deduction must make a conservative estimate, and these estimates infect the entire call tree until neutralised with a try/except.
When porting code, there are two approaches:
- Bottom up - fix the underlying library / code
- Top down - isolate the legacy code with try/except
  - In this case, we note where the Exception effect is coming from, should it be fixed in the future
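The top-down approach can be sketched with the standard library's `parseInt` standing in for the legacy code. `isValidPort` is a hypothetical function name:

```nim
{.push raises: [].}

import std/strutils

func isValidPort(s: string): bool =
  # `parseInt` raises `ValueError` on malformed input - the effect is
  # neutralised here so that it does not spread to callers
  let value =
    try: parseInt(s)
    except ValueError: return false
  value in 1 .. 65535
```

Callers of `isValidPort` stay within `raises: []`, and the comment records where the exception effect came from should the underlying code be fixed later.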
Result [errors.result]
Prefer bool, Opt or Result to signal failure outcomes explicitly. Avoid using the result identifier.
Prefer the use of Result when multiple failure paths exist and the calling code might need to differentiate between them.
Raise Defect to signal panics such as logic errors or preconditions being violated.
Make error handling explicit and visible at call site using explicit control flow (if, try, results.?).
Handle errors locally at each abstraction level, avoiding spurious abstraction leakage.
Isolate legacy code with explicit exception handling, converting the errors to Result or handling them locally, as appropriate.
# Enable exception tracking for all functions in this module
{.push raises: [].} # Always at start of module

import results
export results # Re-export modules used in public symbols

# Use `Result` to propagate additional information about expected errors
# See `Result` section for specific guidelines for error type
func f*(): Result[void, cstring]

# In special cases that warrant the use of exceptions, list these explicitly using the `raises` pragma.
func parse(): Type {.raises: [ParseError].}
See also Result for more recommendations about Result.
See also Error handling helpers in stew that may change some of these guidelines.
Exceptions [errors.exceptions]
In general, prefer explicit error handling mechanisms.
Annotate each module at top-level (before imports):
{.push raises: [].}
Use explicit {.raises.} annotation for each public (*) function.
Raise Defect to signal panics and undefined behavior that the code is not prepared to handle.
# Enable exception tracking for all functions in this module
{.push raises: [].} # Always at start of module

# Inherit from CatchableError and name XxxError
type MyLibraryError = object of CatchableError

# Raise Defect when panicking - this crashes the application (in different ways
# depending on Nim version and compiler flags) - name `XxxDefect`
type SomeDefect = object of Defect

# Use hierarchy for more specific errors
type MySpecificError = object of MyLibraryError

# Explicitly annotate functions with raises - this replaces the more strict
# module-level push declaration on top
func f() {.raises: [MySpecificError].} = discard

# Isolate code that may generate exceptions using expression-based try:
let x =
  try: ...
  except MyError as exc: ... # use the most specific error kind possible

# Be careful to catch exceptions inside loops, to avoid partial loop evaluations:
for x in y:
  try: ..
  except MyError: ..

# Provide contextual data when raising specific errors
raise (ref MyError)(msg: "description", data: value)
Pros
- Used by Nim standard library
- Good for quick prototyping without error handling
- Good performance on happy path without try
  - Compatible with RVO
Cons
- Poor readability - exceptions not part of API / signatures by default
  - Have to assume every line may fail
- Poor maintenance / refactoring support - compiler can't help detect affected code because they're not part of API
- Nim exception hierarchy unclear and changes between versions
  - The distinction between Exception, CatchableError and Defect is inconsistently implemented
  - Defect is not tracked
- Without translation, exceptions leak information between abstraction layers
- Writing exception-safe code in Nim impractical due to missing critical features present in C++
  - No RAII - resources often leak in the presence of exceptions
  - Destructors incomplete / unstable and thus not usable for safe EH
    - No constructors, thus no way to force particular object states at construction
    - ref types incompatible with destructors, even if they worked
- Poor performance of error path
  - Several heap allocations for each Exception (exception, stack trace, message)
  - Expensive stack trace
- Poor performance on happy path
  - Every try and defer has significant performance overhead due to setjmp exception handling implementation
Practical notes
The use of exceptions in Nim has significantly contributed to resource leaks, deadlocks and other difficult bugs. The various exception handling proposals aim to alleviate some of the issues but have not found sufficient grounding in the Nim community to warrant the language changes necessary to proceed.
Defect
Defect does not cause a raises effect - code must be manually verified - common sources of Defect include:
- Over/underflows in signed arithmetic
- [] operator for indexing arrays/seqs/etc (but not tables!)
- accidental/implicit conversions to range types
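Since the compiler does not track these, the checks have to be written by hand. The sketch below shows two hypothetical helpers, `getOr` and `addClamped`; clamping on overflow is an illustrative policy, not a recommendation from this guide:

```nim
{.push raises: [].}

func getOr[T](values: openArray[T], index: int, default: T): T =
  # Checking bounds up front avoids the untracked Defect that `[]`
  # raises for an out-of-range index
  if index >= 0 and index < values.len: values[index]
  else: default

func addClamped(a, b: int): int =
  # Checking before adding avoids the untracked Defect raised by
  # signed overflow
  if b > 0 and a > high(int) - b: high(int)
  elif b < 0 and a < low(int) - b: low(int)
  else: a + b
```

Both functions compile under `raises: []` with or without the checks, which is exactly why the checks need manual review.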
CatchableError
Catching CatchableError implies that all errors are funnelled through the same exception handler. When called code starts raising new exceptions, it becomes difficult to find affected code - catching more specific errors avoids this maintenance problem.
Frameworks may catch CatchableError to forward exceptions through layers. Doing so leads to type erasure of the actual raised exception type in raises tracking.
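Catching the most specific type keeps the contextual data available and lets raises tracking flag newly introduced exceptions. A minimal sketch, where `MyParseError`, `countDigits` and `describe` are hypothetical names:

```nim
type
  MyLibraryError = object of CatchableError
  MyParseError = object of MyLibraryError
    position: int

func countDigits(s: string): int {.raises: [MyParseError].} =
  for i, c in s:
    if c notin {'0' .. '9'}:
      # Provide contextual data when raising
      raise (ref MyParseError)(msg: "unexpected character", position: i)
  s.len

func describe(s: string): string {.raises: [].} =
  # Expression-based try with the most specific error type
  try:
    "ok: " & $countDigits(s)
  except MyParseError as exc:
    exc.msg & " at " & $exc.position
```

If `countDigits` later starts raising a different exception type, `describe` stops compiling under `raises: []`, pointing directly at the affected handler. With `except CatchableError` the change would go unnoticed.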
Open questions
- Should a hierarchy be used?
  - Why? It's rare that calling code differentiates between errors
  - What to start the hierarchy with? Unclear whether it should be a global type (like CatchableError or ValueError), or a module-local type
- Should exceptions be translated?
  - Leaking exception types between layers means no isolation, joining all modules in one big spaghetti bowl
  - Translating exceptions has high visual overhead, especially when a hierarchy is used - not practical, all advantages lost
- Should raises be used?
  - Equivalent to Result[T, SomeError] but lacks generics
  - Additive - asymptotically tends towards raises: [CatchableError], losing value unless exceptions are translated locally
  - No way to transport accurate raises type information across Future/async/generic code boundaries - no raisesof equivalent of typeof
Background
- Stew EH helpers - Helpers that make working with checked exceptions easier
- Nim Exception RFC - seeks to differentiate between recoverable and unrecoverable errors
- Zahary's handling proposal - seeks to handle any kind of error-generating API
- C++ proposal - After 25 years of encouragement, half the polled C++ developers continue avoiding exceptions and Herb Sutter argues about the consequences of doing so
- Google and llvm style guides on exceptions
Status codes [errors.status]
Avoid status codes.
type StatusCode = enum
  Success
  Error1
  ...

func f(output: var Type): StatusCode
Pros
- Interop with C
Cons
- output undefined in case of error
- Verbose to use, must first declare mutable variable then call function and check result - mutable variable remains in scope even in "error" branch leading to bugs
Practical notes
Unlike "Error Enums" used with Result, status codes mix "success" and "error" returns in a single enum, making it hard to detect "successful" completion of a function in a generic way.
Callbacks
See language section on callbacks.
Libraries
The libraries section contains guidelines for libraries and modules frequently used in the codebase.
Standard library usage [libraries.std]
Use the Nim standard library judiciously. Prefer smaller, separate packages that implement similar functionality, where available.
Pros
- Using components from the standard library increases compatibility with other Nim projects
- Fewer dependencies in general
Cons
- Large, monolithic releases make upgrading difficult - bugs, fixes and improvements are released together causing upgrade churn
- Many modules in the standard library are unmaintained and don't use state-of-the-art features of Nim
- Long lead times for getting fixes and improvements to market
- Often not tailored for specific use cases
- Stability and backwards compatibility requirements prevent fixing poor and unsafe API
Practical notes
Use the following stdlib replacements that offer safer API (allowing more issues to be detected at compile time):
- async -> chronos
- bitops -> stew/bitops2
- endians -> stew/endians2
- exceptions -> results
- io -> stew/io2
- sqlite -> nim-sqlite3-abi
- streams -> nim-faststreams
Results [libraries.results]
Use `Result` to document all outcomes of functions.
Use `cstring` errors to provide diagnostics without expectation of error differentiation.
Use `enum` errors when error kind matters.
Use complex types when additional error information needs to be included.
Use `Opt` (`Result`-based `Option`) for simple functions that fail only in trivial ways.
# Stringly errors - the cstring is just for information and
# should not be used for comparisons! The expectation is that
# the caller doesn't have to differentiate between different
# kinds of errors and uses the string as a print-only diagnostic.
func f(): Result[int, cstring] = ...
# Calling code acts on error specifics - use an enum
func f2(): Result[int, SomeEnum] = ...
let r2 = f2() # call once and inspect the returned value
if r2.isErr and r2.error == SomeEnum.value: ...
# Transport exceptions - Result has special support for this case
func f3(): Result[int, ref SomeError] = ...
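The declarations above only show signatures. The following usage sketch assumes the `results` package is installed and importable as `results`; `ParseError`, `parseDigit`, `firstDigit`, `firstDigitOr` and `lastChar` are hypothetical names used for illustration. It shows `?` for passing an error on to the caller and `valueOr` for supplying a fallback:

```nim
import results

type ParseError = enum
  Empty
  NotADigit

func parseDigit(c: char): Result[int, ParseError] =
  if c in {'0'..'9'}:
    ok(ord(c) - ord('0'))
  else:
    err(NotADigit)

func firstDigit(s: string): Result[int, ParseError] =
  if s.len == 0:
    return err(Empty)
  # `?` unwraps the value or returns the error to our caller
  let digit = ? parseDigit(s[0])
  ok(digit)

func firstDigitOr(s: string, fallback: int): int =
  # The `valueOr` body is evaluated only in the error case
  firstDigit(s).valueOr:
    fallback

func lastChar(s: string): Opt[char] =
  # Nothing interesting to report on failure - `Opt` is enough
  if s.len > 0:
    Opt.some(s[^1])
  else:
    Opt.none(char)
```

Helpers of this kind keep the error path visible in the signature while reducing the number of explicit `if` branches at call sites.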
Pros
- Give equal consideration to normal and error case
- Easier control flow vulnerability analysis
- Good for "binary" cases that either fail or not
- No heap allocations for simple errors
Cons
- Visual overhead and poor language integration in `Nim` - ugly `if` trees grow
- Nim compiler generates inefficient code for complex types due to how return values are 0-initialized
- Lack of pattern matching makes for inconvenient code
- Standard library raises many exceptions, hard to use cleanly
Practical notes
- When converting modules, isolate errors from legacy code with `try/except`
  - Common helpers may be added at some point to deal with third-party dependencies that are hard to change - see `stew/shims`
Hex output [libraries.hex]
Print hex output in lowercase. Accept upper and lower case.
Pros
- Single case helps tooling
- Arbitrary choice, aim for consistency
Cons
- No community consensus - some examples in the wild use upper case
Practical notes
byteutils contains a convenient hex printer.
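As a dependency-free sketch of the rule, the following uses only `std/strutils` rather than the `byteutils` helper mentioned above; `toLowerHex` and `fromHex` are hypothetical names chosen for the example:

```nim
import std/strutils

# Print: std `toHex` emits upper case, so normalize to lower case on output
func toLowerHex(bytes: string): string =
  toHex(bytes).toLowerAscii()

# Accept: `parseHexStr` is case-insensitive and raises ValueError on bad input
func fromHex(hex: string): string {.raises: [ValueError].} =
  parseHexStr(hex)
```

A round trip such as `fromHex(toLowerHex(data))` returns the original bytes, and mixed-case input like `"DEad"` is accepted on the way in.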
Wrappers [libraries.wrappers]
Prefer native `Nim` code when available.
`C` libraries and libraries that expose a `C` API may be used (including `rust`, `C++`, `go`).
Avoid `C++` libraries.
Prefer building the library on-the-fly from source using `{.compile.}`. Pin the library code using a submodule or amalgamation.
The interop guide contains more information about foreign language interoperability.
Pros
- Wrapping existing code can improve time-to-market for certain features
- Maintenance is shared with upstream
- Build simplicity is maintained when `{.compile.}` is used
Cons
- Often leads to unnatural API for `Nim`
- Constrains platform support
- Nim and `nimble` tooling poorly supports 3rd-party build systems making installation difficult
- Nim `C++` support incomplete
  - Less test suite coverage - most of `Nim` test suite uses `C` backend
  - Many core `C++` features like `const`, `&` and `&&` difficult to express - in particular post-`C++11` code has a large semantic gap compared to Nim
  - Different semantics for exceptions and temporaries compared to `C` backend
  - All-or-nothing - can't use `C++` backend selectively for `C++` libraries
- Using `{.compile.}` increases build times, especially for multi-binary projects - use judiciously for large dependencies
Practical notes
- Consider tooling like `c2nim` and `nimterop` to create initial wrapper
- Generate a `.nim` file corresponding to the `.h` file of the C project
  - preferably avoid the dependency on the `.h` file (avoid `{.header.}` directives unless necessary)
- Write a separate "raw" interface that only imports `C` names and types as they're declared in `C`, then do convenience accessors on the Nim side
  - Name it `xxx_abi.nim`
- To use a `C++` library, write a `C` wrapper first
  - See `llvm` for example
- When wrapping a `C` library, consider ABI, struct layout etc
Examples
stew [libraries.stew]
`stew` contains small utilities and replacements for `std` libraries.
If similar libraries exist in `std` and `stew`, prefer `stew`.
Pros
- `stew` solves bugs and practical API design issues in stdlib without having to wait for nim release
- Fast development cycle
- Allows battle-testing API before stdlib consideration (think boost)
- Encourages not growing nim stdlib further, which helps upstream maintenance
Cons
- Less code reuse across community
- More dependencies that are not part of nim standard distribution
Practical notes
`nim-stew` exists as a staging area for code that could be considered for future inclusion in the standard library or, preferably, a separate package, but that has not yet been fully fleshed out as a separate and complete library.
Tooling
Nim version
We support a single Nim version that is upgraded between release cycles of our own projects. Individual projects and libraries may choose to support multiple Nim versions, though this involves significant overhead.
Pros
- Nim `devel` branch, as well as feature and bugfix releases often break the codebase due to subtle changes in the language and code generation which are hard to diagnose - each upgrade requires extensive testing
- Easier for community to understand exact set of dependencies
- Balance between cutting edge and stability
- Own branch enables escape hatch for critical issues
Cons
- Work-arounds in our code for `Nim` issues add technical debt
- Compiler is rebuilt in every clone
Practical notes
- Following Nim `devel`, from experience, leads to frequent disruptions as "mysterious" issues appear
- To support multiple Nim versions in a project, the project should be set up to run CI with all supported versions
Build system [tooling.build]
We use a build system with `make` and `git` submodules. The long term plan is to move to a dedicated package and build manager once one becomes available.
Pros
- Reproducible build environment
- Fewer disruptions due to mismatching versions of compiler and dependencies
Cons
- Increased build system complexity with tools that may not be familiar to `nim` developers
- Build system dependencies hard to use on Windows and constrained environments
nimble
We do not use `nimble`, due to the lack of build reproducibility and other team-oriented features. We sometimes provide `.nimble` packages but these may be out of date and/or incomplete.
Dependency management [tooling.deps]
We track dependencies using `git` submodules to ensure a consistent build environment for all development. This includes the Nim compiler, which is treated like just another dependency - when checking out a top-level project, it comes with an `env.sh` file that allows you to enter the build environment, similar to python `venv`.
When working with upstream projects, it's sometimes convenient to fork the project and submodule the fork, in case urgent fixes / patches are needed. These patches should be passed on to the relevant upstream.
Pros
- Reproducible build environment ensures that developers and users talk about the same code
- dependencies must be audited for security issues
- Easier for community to understand exact set of dependencies
- Fork enables escape hatch for critical issues
Cons
- Forking incurs overhead when upgrading
- Transitive dependencies are difficult to coordinate
- Cross-project commits hard to orchestrate
Practical notes
- All continuous integration tools build using the same Nim compiler and dependencies
- When a `Nim` or other upstream issue is encountered, consider project priorities:
  - Use a work-around, report issue upstream and leave a note in code so that the work-around can be removed when a fix is available
  - Patch our branch after achieving team consensus
Editors [tooling.editors]
vscode
Most `nim` developers use `vscode`.
- Nim Extension gets you syntax highlighting, goto definition and other modernities
- The older, but less maintained Nim plugin is an alternative
- To start `vscode` with the correct Nim compiler, run it with `./env.sh code`
- Run nim files with `F6`
- Suggestions, goto and similar features mostly work, but sometimes hang
  - You might need to `killall nimsuggest` occasionally
Other editors with Nim integration
- Sublime text
- `vim`
Debugging [tooling.debugging]
- Debugging can be done with `gdb` just as if `C` was being debugged
  - Follow the C/C++ guide for setting it up in `vscode`
  - Pass `--opt:none --debugger:native` to disable optimizations and enable debug symbols
Profiling
- Linux: `perf`
- Anywhere: vtune
Code tricks [tooling.tricks]
- Find out where a function is used: temporarily mark it `{.deprecated.}`
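A small sketch of the trick, with `oldHelper` and `caller` as hypothetical names: once the pragma is added, the compiler reports a deprecation warning at every call site, which gives a list of usages without relying on editor tooling.

```nim
# Hypothetical helper whose call sites we want to locate
proc oldHelper(x: int): int {.deprecated: "temporary marker - remove before committing".} =
  x + 1

proc caller(): int =
  oldHelper(41) # the compiler reports a deprecation warning for this line

echo caller()
```

Remove the pragma again once the call sites have been found.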
Interop with other languages (FFI)
Nim comes with powerful interoperability options, both when integrating Nim code in other languages and vice versa.
Acting as a complement to the manual, this section of the book covers interoperability / FFI: how to integrate Nim into other languages and how to use libraries from other languages in Nim.
While it is possible to automate many things related to FFI, this guide focuses on core functionality - while tooling, macros and helpers can simplify the process, they remain a cosmetic layer on top of the fundamentals presented here.
The focus of this guide is on pragmatic solutions available for the currently supported versions of Nim - 1.6 at the time of writing - the recommendations may change as new libraries and Nim versions become available.
For examples, head to the interop folder in the style guide repository.
Basics
In interop, we rely on a lowest common denominator of features between languages - for compiled languages, this is typically the mutually overlapping part of the ABI.
Nim is unique in that it also allows interoperability at the API level with C/C++ - however, this guide focuses on interoperability via ABI since this is more general and broadly useful.
Most languages define their FFI in terms of a simplified version of the C ABI - thus, the process of using code from one language in another typically consists of two steps:
- exporting the source library functions and types as "simple C"
- importing the "simple C" functions and types in the target language
We'll refer to this part of the process as ABI wrapping.
Since libraries tend to use the full feature set of their native language, we can see two additional steps:
- exposing the native library code in a "simple C" variant via a wrapper
- adding a wrapper around the "simple C" variant to make the foreign library feel "native"
We'll call this API wrapping - the API wrapper takes care of:
- conversions to/from Nim integer types
- introducing Nim idioms such as generics
- adapting the error handling model
The C ABI serves as the "lingua franca" of interop - the C guide in particular can be studied for topics not covered in the other language-specific sections.
Calling Nim code from other languages
Nim code can be compiled both as shared and static libraries and thus used from other languages.
Exporting Nim functions to other languages
To export functions to other languages, the function must be marked as `exportc, dynlib` - in addition, the function should not raise exceptions and should typically use the `cdecl` calling convention.
We can declare a helper `pragma` to set all the options at once:
{.pragma: exported, exportc, cdecl, raises: [].}
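A minimal sketch of the helper pragma in use, assuming the `exported` definition above; `mylib_max` is a hypothetical function name. With `exportc` and no explicit name, the Nim identifier becomes the C symbol:

```nim
{.pragma: exported, exportc, cdecl, raises: [].}

# Hypothetical library function - visible to C as `int mylib_max(int a, int b)`
proc mylib_max(a, b: cint): cint {.exported.} =
  max(a, b)
```

Only C-compatible types (`cint` here) appear in the signature, so the corresponding declaration in the host language can be written by hand.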
Importing other language functions to Nim
Similar to when exporting functions, imported functions need to be annotated with several pragmas to ensure they are imported correctly. Since imported functions don't interact with Nim exceptions or the garbage collector, they should be marked with `raises: [], gcsafe`.
{.pragma: imported, importc, cdecl, raises: [], gcsafe.}
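The following sketch puts the `imported` pragma to work and layers a small API wrapper on top of the raw ABI import. To keep it self-contained, the C function is placed in the generated code with `{.emit.}`; this is an assumption made for brevity, since a real project would build or link the library instead. `mylib_checked_add` and `checkedAdd` are hypothetical names, and `std/options` stands in for the `Opt` type recommended elsewhere in this guide:

```nim
import std/options

{.pragma: imported, importc, cdecl, raises: [], gcsafe.}

# Stand-in for a real C library (assumption: emitted inline for the example)
{.emit: """
int mylib_checked_add(int a, int b, int *out) {
  long long sum = (long long)a + (long long)b;
  if (sum > 2147483647LL || sum < -2147483648LL) return -1;
  *out = (int)sum;
  return 0;
}
""".}

# ABI layer: C names and types exactly as declared in C
proc mylib_checked_add(a, b: cint, output: ptr cint): cint {.imported.}

# API layer: Nim integer types and Nim-style error reporting
proc checkedAdd(a, b: int32): Option[int32] =
  var sum: cint
  if mylib_checked_add(cint(a), cint(b), addr sum) == 0:
    some(int32(sum))
  else:
    none(int32)
```

Callers of `checkedAdd` never see the C status code or the output pointer.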
Runtime library initialization
When calling Nim from other languages, the Nim runtime must first be initialized. Additionally, if using garbage collected types, the garbage collector must also be initialized once per thread.
Runtime initialization is done by calling the `NimMain` function. It can be called either separately from the host language or guarded by a boolean from every exported function.
Garbage collector initialization is a two-step process:
- the garbage collector itself must be initialized with a call to `setupForeignThreadGc`
- `nimGC_setStackBottom` must be called to establish the starting point of the stack
  - this function must be called in all places where it is possible that the exported function is being called from a "shorter" stack frame
Typically, this is solved with a "library initialization" call that users of the library should call near the beginning of every thread (ie in their `main` or thread entry point function):
import std/atomics
proc NimMain() {.importc.} # This function is generated by the Nim compiler
var initialized: Atomic[bool]
proc initializeMyLibrary() {.exported.} =
  if not initialized.exchange(true):
    NimMain() # Every Nim library needs to call `NimMain` exactly once
  # The garbage collector is set up once per thread
  when declared(setupForeignThreadGc): setupForeignThreadGc()
  when declared(nimGC_setStackBottom):
    var locals {.volatile, noinit.}: pointer
    locals = addr(locals)
    nimGC_setStackBottom(locals)
proc exportedFunction {.exported.} =
  assert initialized.load(), "You forgot to call `initializeMyLibrary`"
  echo "Hello from Nim"
In languages such as Go, it is hard to anticipate which thread code will be called from - in such cases, you can safely initialize the garbage collector in every exported function instead:
proc exportedFunction {.exported.} =
  initializeMyLibrary() # Initialize the library on behalf of the user - this is usually more convenient
  echo "Hello from Nim"
The garbage collector can be avoided using manual memory management techniques, thus removing the requirement to initialize it in each thread - the runtime must always be initialized.
See also the Nim documentation on this topic.
Globals and top-level code
Code written outside of a `proc` / `func` is executed as part of importing the module, or, in the case of the "main" module of the program, as part of executing the module itself, similar to the `main` function in C.
This code will be run as part of calling `NimMain` as noted above!
Exceptions
You must ensure that no exceptions pass to the foreign language - instead, catch all exceptions and convert them to a different error handling mechanism, annotating the exported function with `{.raises: [].}`.
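A sketch of converting an exception into an error code at the boundary. `mylib_parse_int` and its negative return codes are hypothetical and chosen for the example. Because of `raises: []`, the compiler rejects the function if any tracked exception could escape:

```nim
import std/strutils

{.pragma: exported, exportc, cdecl, raises: [].}

# Hypothetical exported function: returns 0 on success, a negative code on failure
proc mylib_parse_int(s: cstring, output: ptr cint): cint {.exported.} =
  if s.isNil or output.isNil:
    return -1 # invalid arguments
  try:
    let v = parseInt($s) # may raise ValueError
    if v < int(cint.low) or v > int(cint.high):
      return -2 # does not fit in a C int
    output[] = cint(v)
    return 0
  except ValueError:
    return -3 # not a number
```

The result is written through `output` only on success, so the caller should check the return code before reading it.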
Memory
Nim is generally a GC-first language meaning that memory is typically managed via a thread-local garbage collector.
Nim also supports manual memory management - this is most commonly used for threading and FFI.
Garbage-collected types
Garbage-collection applies to the following types which are allocated from a thread-local heap:
- `string` and `seq` - these are value types that underneath use the GC heap for the payload
  - the `string` uses a dedicated length field but also ensures NULL-termination which makes it easy to pass to C
  - `seq` uses a similar in-memory layout without the NULL termination
  - addresses to elements are stable as long as elements are not added
- `ref` types
  - types that are declared as `ref object`
  - non-ref types that are allocated on the heap with `new` (and thus become `ref T`)
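The element-address rule for `seq` can be sketched as follows. The C function is emitted inline to keep the example self-contained (an assumption for brevity, not how a real library is built); `mylib_sum` and `sumViaC` are hypothetical names, and `unsafeAddr` is used because the parameter is immutable in the Nim version this guide targets:

```nim
# Stand-in for a C library function (assumption: emitted inline for the example)
{.emit: """
int mylib_sum(int *data, int len) {
  int total = 0;
  int i;
  for (i = 0; i < len; ++i) total += data[i];
  return total;
}
""".}

proc mylib_sum(data: ptr cint, len: cint): cint {.importc, cdecl, raises: [], gcsafe.}

proc sumViaC(values: seq[cint]): cint =
  # An empty seq has no elements, so there is no element address to take
  if values.len == 0:
    return 0
  # The address of the first element stays valid for the duration of the call
  # because `values` is not resized in the meantime
  mylib_sum(unsafeAddr values[0], cint(values.len))
```

The pointer is only used during the call; a library that holds on to it beyond the call needs the lifetime handling described in the following section.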
ref types and pointers
The lifetime of garbage-collected types is undefined - the garbage collector generally runs during memory allocation but this should not be relied upon - instead, lifetime can be extended by calling `GC_ref` and `GC_unref`.
`ref` types have a stable memory address - to pass the address of a `ref` instance via FFI, care must be taken to extend the lifetime of the instance so that it is not garbage-collected:
proc register(v: ptr cint) {.importc.}
proc unregister(v: ptr cint) {.importc.}
# Allocate a `ref cint` instance
let number = new cint
# Let the garbage collector know we'll be creating a long-lived pointer for FFI
GC_ref(number)
# Pass the address of the instance to the FFI function
register(addr number[])
# ... later, in reverse order:
# Stop using the instance in FFI - address is guaranteed to be stable
unregister(addr number[])
# Let the garbage collector know we're done
GC_unref(number)
Manual memory management
Manual memory management is done with `create` (by type), `alloc` (by size) and `dealloc`:
proc register(v: ptr cint) {.importc.}
proc unregister(v: ptr cint) {.importc.}
# Allocate a `ptr cint` instance
let number = create cint
# Pass the address of the instance to the FFI function
register(number)
# ... later, in reverse order:
# Stop using the instance in FFI - address is guaranteed to be stable
unregister(number)
# Free the instance
dealloc(number)
To allocate memory for cross-thread usage, ie allocating in one thread and deallocating in the other, use `createShared` / `allocShared` and `deallocShared` instead.
Threads
Threads in Nim are created with `createThread` which creates the thread and prepares the garbage collector for use on that thread.
See above for how to initialize the garbage collector when calling Nim from threads created in other languages.
Passing data between threads
The primary method of passing data between threads is to encode the data into a shared memory section then transfer ownership of the memory section to the receiving thread either via a thread-safe queue, channel, socket or pipe.
The queue itself can be passed to the thread either at creation or via a global variable, though we generally seek to avoid global variables.
# TODO pick a queue - `Queue` stands in for any thread-safe queue
type ReadStatus = enum
  Empty
  Ok
  Done
proc read(queue: ptr Queue[pointer], data: var seq[byte]): ReadStatus =
  var p: pointer
  if queue[].read(p):
    if isNil(p):
      ReadStatus.Done
    else:
      # The buffer starts with its length, followed by the payload
      var size: int
      copyMem(addr size, p, sizeof(size))
      data = newSeqUninitialized[byte](size)
      if size > 0:
        copyMem(addr data[0], cast[pointer](cast[uint](p) + uint(sizeof(size))), size)
      # Ownership was transferred to the reader, which frees the buffer
      deallocShared(p)
      ReadStatus.Ok
  else:
    ReadStatus.Empty
proc write(queue: ptr Queue[pointer], data: openArray[byte]) =
  # Copy data to a shared length-prefixed buffer
  var size = data.len
  let copy = allocShared(size + sizeof(size))
  copyMem(copy, addr size, sizeof(size))
  if size > 0:
    copyMem(cast[pointer](cast[uint](copy) + uint(sizeof(size))), unsafeAddr data[0], size)
  # Put the data on a thread-safe queue / list
  queue[].add(copy)
proc reader(queue: ptr Queue[pointer]) =
  var data: seq[byte]
  while true:
    case queue.read(data)
    of Done: return
    of Ok: process(data)
    of Empty:
      # Polling should usually be replaced with an appropriate "wake-up" mechanism
      sleep(100)
async / await
When `chronos` is used, execution is typically controlled by the `chronos` per-thread dispatcher - passing data to `chronos` is done either via a pipe / socket or by polling a thread-safe queue.
See the async example.
Resources
C / General wrapping
ABI wrapping is the process of describing the low-level interface of a library in an interop-friendly way using the lowest common denominator between the languages. For interop, we typically separate the "raw" ABI wrapper from higher-level code that adds native-language convenience.
When importing foreign libraries in Nim, the ABI wrapper can be thought of as a C "header" file: it describes to the compiler what code and data types are available in the library and how to encode them.
Conversely, exporting Nim code typically consists of creating special functions in Nim using the C-compatible subset of the language then creating a corresponding ABI description in the target language.
Typical of the ABI wrapper is the use of the FFI pragmas (`importc`, `exportc` etc) and, depending on the library, C types such as `cint`, `csize_t` as well as manual memory management directives such as `pointer`, `ptr`.
In some cases, it may be necessary to write an "export wrapper" in C - this happens in particular when the library was not written with interoperability in mind, for example when there is heavy C pre-processor use or function implementations are defined in the C header file.
Exporting
Exporting Nim code is done by creating an export module that presents the Nim code as a simplified C interface:
import mylibrary
# either `c`-prefixed types (`cint` etc) or explicitly sized types (int64 etc) work well
proc function(arg: int64): cint {.exportc: "function", cdecl, raises: [].} =
  # Validate incoming arguments before converting them to Nim equivalents
  if arg > int64(int.high) or arg < int64(int.low):
    return 0 # Expose error handling
  mylibrary.function(int(arg))
Importing
Build process
To import a library into Nim, it must first be built by its native compiler - depending on the complexity of the library, this can be done in several ways.
The preferred way of compiling a native library is to include it in the Nim build process via `{.compile.}` directives:
{.compile: "somesource.c".}
This ensures that the library is built together with the Nim code using the same C compiler as the rest of the build, automatically passing compilation flags and using the expected version of the library.
Alternatives include:
- build the library as a static or shared library, then make it part of the Nim compilation via `{.passL.}`
  - difficult to ensure version compatibility
  - shared library requires updating dynamic library lookup path when running the binary
- build the library as a shared library, then make it part of the Nim compilation via `{.dynlib.}`
  - nim will load the library via `dlopen` (or similar)
  - easy to run into ABI / version mismatches
  - no record in binary about the linked library - tools like `ldd` will not display the dependencies correctly
Naming
ABI wrappers are identified by `abi` in their name, either as a suffix or as the module name itself:
Functions and types
Having created a separate module for the type, create definitions for each function and type that is meant to be used from Nim:
# Create a helper pragma that describes the ABI of typical C functions:
# * No Nim exceptions
# * No GC interaction
{.pragma: imported, importc, cdecl, raises: [], gcsafe.}
proc function(arg: int64): cint {.imported.}
Callbacks
Callbacks are functions in the Nim code that are registered with the imported library and called from the library:
# The "callback" helper pragma:
#
# * sets an explicit calling convention to match C
# * ensures no exceptions leak from Nim to the caller of the callback
{.pragma: callback, cdecl, raises: [], gcsafe.}
import strutils
proc mycallback(arg: cstring) {.callback.} =
  # Write nim code as usual
  echo "hello from nim: ", arg
  # Don't let exceptions escape the callback
  try:
    echo "parsed: ", parseInt($arg)
  except ValueError:
    echo "couldn't parse"
proc registerCallback(callback: proc(arg: cstring) {.callback.}) {.imported.}
registerCallback(mycallback)
Care must be taken that the callback is called from a Nim thread - if the callback is called from a thread controlled by the library, the thread might need to be initialized first.
Memory allocation
Nim supports garbage-collected, stack-based and manually managed memory allocation.
When using garbage-collected types, care must be taken to extend the lifetime of objects passed to C code whose lifetime extends beyond the function call:
# Register a long-lived instance with C library
proc register(arg: ptr cint) {.imported.}
# Unregister a previously registered instance
proc unregister(arg: ptr cint) {.imported.}
proc setup(): ref cint =
  let arg = new cint
  # When passing garbage-collected types whose lifetime extends beyond the
  # function call, we must first protect them from collection:
  GC_ref(arg)
  register(addr arg[])
  arg
proc teardown(arg: ref cint) =
  # ... later
  unregister(addr arg[])
  GC_unref(arg)
C wrappers
Sometimes, C headers contain not only declarations but also definitions and / or macros. Such code, when exported to Nim, can cause build problems, symbol duplication and other related issues.
The easiest way to expose such code to Nim is to create a plain C file that re-exports the functionality as a normal function:
#include <inlined_code.h>
/* Reexport `function` using a name less likely to conflict with other "global" symbols */
int library_function() {
  /* function() is either a macro or an inline function defined in the header */
  return function();
}
Tooling
- `c2nim` - translate C header files to Nim, providing a starting place for wrappers
References
Go interop
Nim and Go are both statically typed, compiled languages capable of interop via a simplified C ABI.
On the Go side, interop is handled via cgo.
Threads
Go includes a native `M:N` scheduler for running Go tasks - because of this, care must be taken when calling Nim code from Go: the thread from which the call will happen is controlled by Go and we must initialise the Nim garbage collector in every function exposed to Go, as documented in the main guide.
As an alternative, we can pass the work to a dedicated thread instead - this works well for asynchronous code that reports the result via a callback mechanism:
{.pragma: callback, cdecl, raises: [], gcsafe.}
type
  MyAPI = object
    queue: ThreadSafeQueue[ExportedFunctionData] # TODO document where to find a thread safe queue
    thread: Thread[ptr MyAPI] # Kept alive together with the API instance
  ExportedFunctionCallback = proc(result: cint) {.callback.}
  ExportedFunctionData = object
    v: cint
    callback: ExportedFunctionCallback
proc runner(api: ptr MyAPI) {.thread.} =
  while true:
    processQueue(api[].queue)
proc initMyAPI(): ptr MyAPI {.exportc, cdecl, raises: [].} =
  let api = createShared(MyAPI)
  # Shutdown / cleanup omitted for brevity
  try:
    createThread(api[].thread, runner, api)
  except ResourceExhaustedError:
    # Report failure to the caller as a nil handle
    deallocShared(api)
    return nil
  api
proc exportedFunction(api: ptr MyAPI, v: cint, callback: ExportedFunctionCallback) {.exportc, cdecl, raises: [].} =
  # By not allocating any garbage-collected data, we avoid the need to initialize the garbage collector
  api[].queue.add(ExportedFunctionData(v: v, callback: callback))
The `go` thread scheduler can detect blocking functions and start new threads as appropriate - thus, blocking the C API function is a good alternative to callbacks - for example, results can be posted onto a queue that is read from by a blocking call.
Variables
When calling Nim code from Go, care must be taken that instances of garbage-collected types don't pass between threads - this means process-wide globals and other forms of shared-memory approaches of GC types must be avoided.
`LockOSThread` can be used to constrain the thread from which a particular `goroutine` calls Nim.
go interop resources
- cgo wiki
- cockroachdb experience - general cgo costs
Rust interop
Nim and Rust are both statically typed, compiled languages capable of "systems programming".
Because of these similarities, interop between Nim and `rust` is generally straightforward and handled the same way as C interop in both languages: Rust code is exported to C then imported in Nim as C code and vice versa.
Memory
While Nim is a GC-first language, `rust` in general uses lifetime tracking (via `Box`) and / or reference counting (via `Rc`/`Arc`) outside of "simple" memory usage.
When used with Nim, care must be taken to extend the lifetimes of Nim objects via `GC_ref` / `GC_unref`.
Tooling
- `nbindgen` - create Nim "ABI headers" from exported `rust` code