Improved C Variadics in Rust and C2Rust
By Andrei Homescu
Introduction to C Variadics
The C language provides a special class of functions called variadic functions that can be called with a variable number of arguments. The declaration of a variadic function ends with an ellipsis, e.g.:
void variadic_function(int x, ...);
Variadic functions can be called with any number of arguments in place of the ellipsis (including none at all).
The C runtime provides a set of helper macros that developers use to retrieve the values of the variadic arguments. All the macros, along with the va_list type, are defined in the stdarg.h header and most commonly implemented as compiler builtins:
- va_start(ap, last) initializes the variable ap of type va_list to the next argument of the current function following last, where last is the name of one of the non-variadic arguments of the function. last most commonly refers to the argument immediately before the ellipsis, so ap is effectively initialized with the address of the first variadic argument.
- va_arg(ap, type) returns the value of the next argument from ap, under the assumption that the type of the argument at the variadic function’s call site is type.
- va_copy(aq, ap) copies the current state of ap into aq. Both arguments have type va_list.
- va_end(ap) frees all resources used by ap. Each va_start and va_copy call must be matched by exactly one va_end call in the same function.
For example, this is a very common implementation of the printf function:
int printf(const char *fmt, ...) {
    int res;
    va_list ap;
    va_start(ap, fmt);
    res = vprintf(fmt, ap);
    va_end(ap);
    return res;
}
The signature of the vprintf function is
int vprintf(const char *fmt, va_list ap);
and it internally calls va_arg to retrieve the values of the arguments passed to printf.
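To make the split between a variadic function and its va_list-taking counterpart more concrete, here is a minimal sketch in the same style as printf and vprintf. The names sum_ints and vsum_ints are hypothetical and exist only for illustration, and the sketch assumes that every variadic argument is an int.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: consumes `count` int arguments from an
   already-initialized va_list and returns their sum. */
int vsum_ints(int count, va_list ap) {
    assert(count >= 0);
    int total = 0;
    for (int i = 0; i < count; i++) {
        /* va_arg trusts us that the caller really passed an int here. */
        total += va_arg(ap, int);
    }
    return total;
}

/* Hypothetical variadic front end, mirroring the printf/vprintf pair. */
int sum_ints(int count, ...) {
    va_list ap;
    va_start(ap, count);              /* start right after `count` */
    int total = vsum_ints(count, ap); /* the callee calls va_arg */
    va_end(ap);                       /* matches the va_start above */
    return total;
}
```

With these definitions, sum_ints(3, 1, 2, 3) walks the three variadic arguments and returns 6.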
C Variadics in Rust
Our goal in the C2Rust project is to translate any valid C99 program into equivalent Rust code. Naturally, this means we need to properly support translating C variadic functions into Rust.
For a long time, the Rust-C FFI only allowed one-way calls to C variadic functions: Rust code could call C variadic functions, but not the other way around. For example, the printf function could be declared and called from Rust as:
extern "C" {
    fn printf(fmt: *const c_char, ...) -> c_int;
}
but such a function could not be implemented in Rust.
Rust RFC 2137 proposed an interface for Rust code to provide C-compatible variadic functions, which was later implemented as a series of patches by Dan Robertson that were merged into nightly Rust between November 2018 and February 2019.
The new interface provides a new VaList type that is compatible with C’s va_list and implements the following interface (simplified for brevity):
impl VaList {
    /// Rust equivalent of `va_arg`, extracts the next argument of type `T`.
    pub unsafe fn arg<T>(&mut self) -> T;
    /// Calls function `f` with a copy of `self`, constructed with `va_copy`
    /// and safely destroyed using `va_end`.
    pub unsafe fn with_copy<F, R>(&self, f: F) -> R
        where F: FnOnce(VaList) -> R;
}
These methods provide a safer alternative to their C counterparts, guaranteeing that every call to va_start and va_copy has a matching va_end. They are still marked unsafe, since arg still performs a form of type punning.
C variadic functions defined in Rust use a special syntax: the variadic arguments are declared with a special ellipsis type, which the compiler internally transforms into a VaList type. The compiler also automatically calls va_start and va_end for that parameter, e.g.:
pub unsafe extern "C" fn variadic_function(mut ap: ...) {
    // rustc calls `va_start(ap)` internally here
    // Print the first argument as a `u32`
    println!("{}", ap.arg::<u32>());
    // rustc now calls `va_end(ap)` automatically
}
Implementing Clone for VaList
While the VaList API above does not directly match C’s macros, it provides a sufficient interface for the C2Rust transpiler to support conversion of C va_start, va_arg and va_end calls. However, some uses of va_copy cannot be translated to with_copy, such as
va_list ap1, ap2;
va_copy(ap1, ap);
if (condition)
    va_copy(ap2, ap);
// ...other code that uses ap1 and ap2...
va_end(ap1);
// ...more code that uses ap2...
if (same_condition)
    va_end(ap2);
and
va_list aq, ap1, ap2;
// ...initialize ap1 and ap2...
if (condition) {
    va_copy(aq, ap1);
} else {
    va_copy(aq, ap2);
}
// ...code that uses aq...
va_end(aq);
In the first example, the problem is that the lifetimes of the ap1 and ap2 variables overlap, but neither includes the other. Therefore, we cannot replace each va_copy call with with_copy while also maintaining the order of variadic operations (we could move va_end(ap1); after the last if statement, but that might change the behavior of the entire program). In the second example, the aq copy is initialized in one of the branches of the if statement, but needs to escape the statement and live until the corresponding va_end call, which is outside the statement. If we were to place the with_copy call inside the if, the latest its internal scope could end (and implicitly call va_end) would be at the end of each branch.
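The second pattern can be turned into a small, self-contained C function to show that it is perfectly ordinary C. This is only an illustrative sketch: sum_from and its skip_first parameter are hypothetical names, and the sketch assumes all variadic arguments are ints. The copy aq is created in one of two branches and is only released after the if statement has ended.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: sums the variadic ints, optionally skipping
   the first one. `aq` is copied from a different source in each branch
   and outlives the `if` statement that initializes it. */
int sum_from(int skip_first, int count, ...) {
    assert(count >= 0);
    va_list ap1, ap2, aq;
    int remaining;

    va_start(ap1, count);          /* ap1 points at the first argument */
    va_start(ap2, count);
    if (count > 0) {
        (void)va_arg(ap2, int);    /* ap2 is now one argument ahead */
    }

    if (skip_first && count > 0) {
        va_copy(aq, ap2);
        remaining = count - 1;
    } else {
        va_copy(aq, ap1);
        remaining = count;
    }

    /* Code that uses aq, outside the branch that created it. */
    int total = 0;
    while (remaining-- > 0) {
        total += va_arg(aq, int);
    }

    va_end(aq);
    va_end(ap2);
    va_end(ap1);
    return total;
}
```

Here sum_from(0, 3, 10, 20, 30) returns 60 and sum_from(1, 3, 10, 20, 30) returns 50. A closure-scoped copy placed inside either branch would have to end before the summing loop runs.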
For a real-world example of C code that cannot be trivially transformed to a with_copy call, see the vasprintf function in the Julia language implementation.
The underlying issue is that with_copy creates a new scope in which the copy lives and at the end of which it is destroyed, and assumes that all other uses of the copy can be cleanly moved into this scope. Our examples show that this is not always simple, or even possible at all.
To solve this problem, we submitted a Rust language pull request with an extension to this interface that would expose a Rust version of va_copy.
After several redesigns based on discussions and suggestions from the Rust language team, we settled on a final version of the interface with the following changes (based on a design proposed by Rust compiler team member eddyb):
- Split the previous VaList into two structures: an internal VaListImpl used by the compiler as the backing structure for the ellipsis argument, and VaList used as the public C-compatible interface for the former.
- Implement the Clone trait, which copies a given VaListImpl and returns the copy.
The VaList split brings Rust’s data structures closer to their C equivalents, since on some architectures va_list is defined, e.g., by clang, as
typedef struct __va_list_tag va_list[1];
Due to C’s implicit array-to-pointer decay, va_list decays to the struct __va_list_tag* pointer type when used in function signatures, but remains a single-element array (with the same size as the structure itself) when used to declare a local variable.
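The same behavior can be observed with any single-element array typedef. The sketch below uses a hypothetical stand-in (fake_tag and fake_list) instead of the real __va_list_tag, whose layout is platform-specific. It shows that a local of the array type occupies the full structure, while a parameter of the same declared type is really a pointer, so the callee's changes are visible to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the platform's __va_list_tag. */
struct fake_tag {
    int next;
};

/* Same shape as clang's `typedef struct __va_list_tag va_list[1];`. */
typedef struct fake_tag fake_list[1];

/* In a parameter position, `fake_list` is adjusted to
   `struct fake_tag *`, so this modifies the caller's object. */
void advance(fake_list list) {
    list->next += 1;
}

/* As a local, `fake_list` is a real one-element array that holds
   the structure itself. */
int next_after_call(void) {
    fake_list local = {{0}};
    assert(sizeof local == sizeof(struct fake_tag));
    advance(local);      /* decays to a pointer to local[0] */
    return local->next;  /* the callee's increment is visible here */
}
```

This is why a va_list passed to a callee on such platforms behaves like a mutable reference to the caller's state rather than an independent copy.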
To preserve this distinction in the Rust interface, we refactored VaList to match the pointer version of va_list, and added VaListImpl as an equivalent for the __va_list_tag structure.
The new interface for the two structures is (simplified once again):
impl VaListImpl {
    /// Explicitly convert `VaListImpl` -> `VaList` for callees.
    pub fn as_va_list(&mut self) -> VaList;
    pub unsafe fn arg<T>(&mut self) -> T;
    /// Calls function `f` with a copy of `self`, constructed with `va_copy`
    /// and safely destroyed using `va_end`.
    pub unsafe fn with_copy<F, R>(&self, f: F) -> R
        where F: FnOnce(VaList) -> R;
}
impl Deref for VaList {
    /// Deref-coercion for `VaList`, so it can be used in lieu of a `VaListImpl`.
    fn deref(&self) -> &VaListImpl;
}
/// `DerefMut` implementation for `arg`
impl DerefMut for VaList { ... }
impl Clone for VaListImpl {
    /// Copy `self` into a new `VaListImpl` and return it.
    fn clone(&self) -> Self;
}
We omitted one interesting and significant detail from the interface for brevity: the actual types are not VaListImpl and VaList, but VaListImpl<'f> and VaList<'a, 'f>. The 'a lifetime has a simple purpose: since VaList is internally just a &'a mut VaListImpl reference to its backing VaListImpl, 'a is the lifetime of that reference. On the other hand, 'f has a much more interesting motivation: by making both structures invariant over this lifetime, we tie each VaListImpl structure to the function it was created in (for the VaListImpl<'f> created implicitly by the compiler for an ellipsis argument, its lifetime argument is always the entire body of that variadic function), and tie each VaList to the lifetime of the VaListImpl it was created from.
This has some interesting safety consequences:
- it prevents users from accidentally assigning incompatible VaList or VaListImpl values, i.e., VaList values from two different variadic functions, and
- it prevents VaListImpl<'f> values from escaping their variadic function.
For example, this code will fail to compile:
pub unsafe extern fn foo<'a>(mut ap: ...) -> VaListImpl<'a> {
    // `VaListImpl` would escape
    ap
}
fn bar<'a, 'f, 'g: 'f>(ap: &mut VaList<'a, 'f>, aq: VaList<'a, 'g>) {
    // Incompatible types
    *ap = aq;
}
Translating C Variadics to Rust
With the extended interface, we have all the tools we need to convert C variadic macros and types into their Rust equivalents using the following rules:
C code | Rust equivalent | Rule |
---|---|---|
void foo(int x, ...) | fn foo(x: c_int, mut args: ...) | The ellipsis argument is named args |
void bar(int, va_list) | fn bar(_: c_int, _: VaList) | va_list function arguments get the VaList type |
va_list ap; | let mut ap: VaListImpl; | va_list function locals get the VaListImpl type |
va_start(ap, x); | ap = args.clone(); | va_start becomes args.clone() |
va_arg(ap, int) | ap.arg::<c_int>() | va_arg becomes VaListImpl::arg |
va_copy(aq, ap); | aq = ap.clone(); | va_copy becomes VaListImpl::clone |
va_end(ap); | /* Ignore va_end */ | |
The rules above are generally straightforward, with a few notable exceptions.
First, since VaListImpl does not provide a constructor or a start function, we actually implement va_start using the clone function. We use the function’s ellipsis argument as the source for every such clone call. Since this requires that the argument is named, we assign it a unique name of args plus an optional suffix.
Second, since VaListImpl values are automatically dropped, we simply ignore va_end calls.
For example,
void bar(int, va_list);
void foo(int x, ...) {
    va_list ap, aq;
    va_start(ap, x);
    va_copy(aq, ap);
    bar(x, aq);
    va_end(ap);
    va_end(aq);
}
becomes
extern "C" {
    #[no_mangle]
    fn bar(_: c_int, _: VaList);
}
pub unsafe extern fn foo(x: c_int, mut args: ...) {
    let mut ap: VaListImpl;
    let mut aq: VaListImpl;
    ap = args.clone();
    aq = ap.clone();
    bar(x, aq.as_va_list());
}
Acknowledgments
We would like to thank Dan Robertson, Josh Triplett, and Eduard-Mihai Burtescu (eddyb) for the comments and suggestions for the design and implementation of Clone for VaList and the VaList/VaListImpl split.