Type Punning Functions in C

August 10, 2016

C has a reputation for being inflexible. But did you know you can change the argument order of a C function if you don't like it?

#include <math.h>
#include <stdio.h>

double DoubleToTheInt(double base, int power) {
    return pow(base, power);
}

int main() {
    // cast to a function pointer with arguments reversed
    double (*IntPowerOfDouble)(int, double) =
        (double (*)(int, double))&DoubleToTheInt;

    printf("(0.99)^100: %lf \n", DoubleToTheInt(0.99, 100));
    printf("(0.99)^100: %lf \n", IntPowerOfDouble(100, 0.99));
}

The code above never actually defines the function IntPowerOfDouble — because there is no function IntPowerOfDouble. It's a variable that points to DoubleToTheInt, but with a type that says it likes its integer arguments to come before its doubles.

You might expect IntPowerOfDouble to take its arguments in the same order as DoubleToTheInt, but cast the arguments to a different type, or something like that. But that's not what happens.

Try it out — you'll see the same result value printed on both lines.

emiller@gibbon ~> clang something.c 
emiller@gibbon ~> ./a.out 
(0.99)^100: 0.366032 
(0.99)^100: 0.366032

Now try changing all the int arguments to float — you'll see that FloatPowerOfDouble does something even stranger. That is,

#include <math.h>
#include <stdio.h>

double DoubleToTheFloat(double base, float power) {
    return pow(base, power);
}

int main() {
    double (*FloatPowerOfDouble)(float, double) =
        (double (*)(float, double))&DoubleToTheFloat;

    printf("(0.99)^100: %lf \n", DoubleToTheFloat(0.99, 100));   // OK
    printf("(0.99)^100: %lf \n", FloatPowerOfDouble(100, 0.99)); // Uh-oh...
}

Produces:

(0.99)^100: 0.366032 
(0.99)^100: 0.000000

The value on the second line is “not even wrong” — if it were merely a matter of argument reversal, we'd expect the answer to be 100^0.99 = 95.5 and not zero. What's going on?

The above code examples represent a kind of type punning of functions: a dangerous form of "assembly without assembly" that should never be used on the job, in the vicinity of heavy machinery, or in conjunction with prescription drugs. The code examples will make perfect sense to anyone who understands code at the assembly level, but are likely to be baffling to everyone else.

I cheated a little bit above: I assumed you're running code on a 64-bit x86 PC with a Unix-like operating system such as Linux or macOS. If you're on another architecture, or on Windows, the trick above might not work. In spite of C's reputation for having an infinite number of dark corners, the int-double-argument-order behavior is certainly not a part of the C standard, which treats a call through an incompatible function pointer type as undefined behavior. It's a result of how functions are called on today's x86 machines, and can be used for some neat programming tricks.

That’s Not My Signature

If you took a C class in college, you might remember that arguments are passed to functions on the stack. The caller puts arguments on the stack in reverse order, and the called function (the callee) reads its arguments off of the stack memory.

That's how I was taught it, anyway, but most computers nowadays pass the first several arguments directly in CPU registers. That way the function never has to hit stack memory, which is slow to access compared to the registers.

The number and location of registers used for function arguments depend on something called the calling convention. Windows has one convention: the first four arguments travel in registers, and each argument's position selects either an integer register (RCX, RDX, R8, R9) or the corresponding floating-point register (XMM0 through XMM3). Unix has another convention, called the System V convention. It has eight registers set aside for floating-point values, and six registers for integer and pointer values. (If arguments don't fit into registers, then they go into stack memory the old-fashioned way.)

In C, header files really just exist to tell the compiler where to put the function arguments, often a combination of registers and the stack. Each calling convention has its own algorithm for allocating those arguments into registers and onto the stack. Unix, for example, is very aggressive about breaking up structs and trying to fit all of the fields into registers, whereas Windows is a bit lazier and just passes a pointer to a large struct parameter.

On Unix, the basic algorithm works like this:

Floating-point arguments are placed, in order, into SSE registers, labeled XMM0, XMM1, etc.
Integer and pointer arguments are placed, in order, into general registers, labeled RDI, RSI, RDX, RCX, etc.

Let's briefly look at how arguments are passed to the function DoubleToTheInt.

The function signature is:

double DoubleToTheInt(double base, int power);

When the compiler encounters DoubleToTheInt(0.99, 100), it lays out the registers like this:

RDI	RSI	RDX	RCX
100	???	???	???
XMM0	XMM1	XMM2	XMM3
0.99	???	???	???

(I'm showing only the first four registers of each kind for simplicity. Windows picks registers by argument position instead, so these layouts apply to the System V convention only.) If the function were instead:

double DoubleToTheDouble(double base, double power);

The arguments would be laid out like this:

RDI	RSI	RDX	RCX
???	???	???	???
XMM0	XMM1	XMM2	XMM3
0.99	100	???	???

Now you might have an inkling of why the little trick at the beginning worked. Consider the function signature:

double IntPowerOfDouble(int y, double x);

Called as IntPowerOfDouble(100, 0.99), the compiler will lay out the registers thus:

RDI	RSI	RDX	RCX
100	???	???	???
XMM0	XMM1	XMM2	XMM3
0.99	???	???	???

In other words, exactly the same as DoubleToTheInt(0.99, 100)!

Because the compiled function has no idea how it was called, only where in registers and stack memory to expect its arguments, we can call a function using a different argument order by casting the function pointer to an incorrect (but ABI-compatible) function signature.

In fact, as long as integer arguments and floating-point arguments appear in the same order, we can interleave integer and floating-point arguments in any way we'd like, and the register layout will be identical. That is,

double functionA(double a, double b, float c, int x, int y, int z);

Will have the same register layout as:

double functionB(int x, double a, int y, double b, int z, float c);

And the same register layout as:

double functionC(int x, int y, int z, double a, double b, float c);

In all three cases the register allocation will be:

RDI	RSI	RDX	RCX
`int x`	`int y`	`int z`	???
XMM0	XMM1	XMM2	XMM3
`double a`	`double b`	`float c`	???

Note that double-precision and single-precision arguments both occupy the XMM registers — but they are not ABI-compatible with each other. So if you recall the second code sample at the beginning, the reason that FloatPowerOfDouble returned zero (and not 95.5) is that the compiler placed a single-precision (32-bit) version of 100.0 into XMM0, and a double-precision (64-bit) version of 0.99 into XMM1 — but the callee expected a double-precision number in XMM0 and a single-precision number in XMM1. In the ensuing confusion, exponents went around pretending to be significands, significand bits were chopped off or treated as exponents, and the FloatPowerOfDouble function ended up raising a Very Small Number to a Very Large Number, producing a zero. Mystery solved.

Also note the ??? in the above diagrams. These register values are undefined — they could have any value from previous computations. The callee doesn't care what's in them, and is free to write over them during its own computations.

This raises an interesting possibility — in addition to calling a function with arguments in a different order, we can also call a function with a different number of arguments than it expects. There are a couple of reasons we might want to do something crazy like that.

Dial 1-800-I-Really-Enjoy-Type-Punning

Try this:

#include <math.h>
#include <stdio.h>

double DoubleToTheInt(double x, int y) {
    return pow(x, y);
}

int main() {
    double (*DoubleToTheIntVerbose)(
            double, double, double, double, int, int, int, int) =
        (double (*)(double, double, double, double, int, int, int, int))&DoubleToTheInt;

    printf("(0.99)^100: %lf \n", DoubleToTheIntVerbose(
                                 0.99, 0.0, 0.0, 0.0, 100, 0, 0, 0));
    printf("(0.99)^100: %lf \n", DoubleToTheInt(0.99, 100));
}

It should come as no surprise that both lines return the same result — all the arguments fit into registers, and the register layout is the same.

Now here's where the fun part comes in. We can define a new "verbose" function type that can be used to call many different kinds of functions, provided the arguments fit into registers and the functions have the same return type.

#include <math.h>
#include <stdio.h>

typedef double (*verbose_func_t)(double, double, double, double, int, int, int, int);

int main() {
	verbose_func_t verboseSin = (verbose_func_t)&sin;
	verbose_func_t verboseCos = (verbose_func_t)&cos;
	verbose_func_t verbosePow = (verbose_func_t)&pow;
	verbose_func_t verboseLDExp = (verbose_func_t)&ldexp;

	printf("Sin(0.5) = %lf\n",
		verboseSin(0.5, 0.0, 0.0, 0.0, 0, 0, 0, 0));
	printf("Cos(0.5) = %lf\n",
		verboseCos(0.5, 0.0, 0.0, 0.0, 0, 0, 0, 0));
	printf("Pow(0.99, 100) = %lf\n",
		verbosePow(0.99, 100.0, 0.0, 0.0, 0, 0, 0, 0));
	printf("0.99 * 2^12 = %lf\n",
		verboseLDExp(0.99, 0.0, 0.0, 0.0, 12, 0, 0, 0));
}

The type compatibility is handy because we could, for instance, build a simple calculator that dispatches to arbitrary functions that take and return doubles:

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef double (*four_arg_func_t)(double, double, double, double);

int main(int argc, char **argv) {
    // Require a function name before touching argv[1]
    if (argc < 2) {
        return 1;
    }
    four_arg_func_t verboseFunction = NULL;
    if (strcmp(argv[1], "sin") == 0) {
        verboseFunction = (four_arg_func_t)&sin;
    } else if (strcmp(argv[1], "cos") == 0) {
        verboseFunction = (four_arg_func_t)&cos;
    } else if (strcmp(argv[1], "pow") == 0) {
        verboseFunction = (four_arg_func_t)&pow;
    } else {
        return 1;
    }
    // Zero-fill so that unused argument registers get a defined value
    double xmm[4] = {0.0, 0.0, 0.0, 0.0};
    int i;
    // Read at most four numbers so we never write past the end of xmm[]
    for (i=2; i<argc && i-2<4; i++) {
        xmm[i-2] = strtod(argv[i], NULL);
    }

    printf("%lf\n", verboseFunction(xmm[0], xmm[1], xmm[2], xmm[3]));
    return 0;
}

Testing it out:

emiller@gibbon ~> clang calc.c
emiller@gibbon ~> ./a.out pow 0.99 100
0.366032
emiller@gibbon ~> ./a.out sin 0.5
0.479426
emiller@gibbon ~> ./a.out cos 0.5
0.877583

It is not exactly a competitive threat to Mathematica, but you might imagine a more sophisticated version with a table of function names that map to function pointers — the calculator could be updated with new functions just by updating the table, rather than invoking the new functions explicitly in code.

Another application involves JIT compilers. If you've ever worked through an LLVM tutorial, you may have unexpectedly encountered the message:

"Full-featured argument passing not supported yet!"

LLVM is adept at turning code into machine code, and loading the machine code into memory, but it's actually not very flexible when it comes to calling a function loaded into memory. With LLVMRunFunction, you can call main()-like functions (integer arg, pointer arg, pointer arg, integer return value), but not a whole lot else. Most tutorials recommend wrapping your compiled function in a function that looks like main(), stuffing all of your parameters behind a pointer argument, and using the wrapper function to pull the arguments out from behind the pointer and call the real function.

But with our newfound knowledge of x86 registers, we can simplify the ceremony, getting rid of the wrapper function in many cases. Rather than checking the provided function against a finite list of C-callable function signatures (int main(), int main(int), int main(int, void *), etc.), we can create a pointer whose function signature saturates all of the parameter registers, and so is assembly-compatible with all functions that pass arguments only via registers. We then call through that pointer, passing in zero (or anything, really) for unused arguments. Thus we just need to define a separate type for each return type, rather than for every possible function signature, and then call the function in a flexible way that would normally require use of assembly.

I'll show you one last trick before locking up the liquor cabinet. Try to figure out how this code works:

#include <math.h>
#include <stdio.h>

double NoOp(double a) {
	return a;
}

int main() {
	double (*ReturnLastReturnValue)() = (double (*)())&NoOp;
	double value = pow(0.99, 100.0);
	double other_value = ReturnLastReturnValue();
	printf("Value: %lf   Other value: %lf\n", value, other_value);
}

(You might want to read up on your calling conventions first…)

Some Assembly Required

If you ever ask a question on a programmer forum about assembly language, the usual first answer is something like: You don't need to know assembly — leave assembly to the genius Ph.D. compiler writers. Also, please keep your hands where I can see them.

Compiler writers are smart people, but I think it's a mistake to believe that assembly language should be scrupulously avoided by everyone else. In this short foray into function type-punning, we saw how register allocation and calling conventions (supposedly the exclusive concern of assembly-spinning compiler writers) occasionally pop their heads up in C, and we saw how to use this knowledge to do things that regular C programmers would think impossible.

But that's really just scratching the surface of assembly programming — deliberately undertaken here without a single line of assembly code — and I encourage anyone with the time to dig deeper into the subject. Assembly is the key to understanding how a CPU goes about the business of executing instructions — what a program counter is, what a frame pointer is, what a stack pointer is, what registers do — and lets you think about computer programs in a different (and brighter) light. Knowing even the basics can help you come up with solutions to problems that might not have occurred to you otherwise, and give you the lay of the land when you slip past the prison guards of your preferred high-level language, and begin squinting into the harsh, wonderful sun.

You’re reading evanmiller.org, a random collection of math, tech, and musings.