Category: Low level programming

  • Swapping values of two variables with XOR

    As programmers, we know that we can swap two variables by using a temporary variable. That’s fairly simple and intuitive: given two variables (x and y) that you want to swap, you store x in a temporary variable z, copy y to x, then finally copy z (the temporary) to y. But I learned recently (in my compiler’s course) that you can swap two variables without allocating a temporary variable, by using three xor operations. Here’s a little C program that demonstrates that idea:

    # include<stdio.h>
    void
    swap(int *x, int *y)
    {
         *x ^= *y;
         *y ^= *x;
         *x ^= *y;
    }
    
    int
    main(int argc, char *argv[])
    {
        int x = 123;
        int y = 234;
        printf("x=%d;y=%d\n", x, y);
        swap(&x, &y);
        printf("x=%d;y=%d\n", x, y);
    }
    ./xor-swap
    x=123;y=234
    x=234;y=123

    Neat, right?! But why would you ever want to take this approach? Especially if the underlying CPU instructions are more expensive with the xor approach. That is, the typical process for swapping two variables consists of 3 MOV instructions and the total number of cycles is 3 — one cycle per MOV instruction. Whereas 3 XOR instructions takes 9 cycles (3 cycles per instruction).

    Well, one explanation is that compiler writer is making a tradeoff: we’re sacrificing CPU cycles but in return we free up a register, which is especially important for the register allocator. In other words, we’re reducing register allocation pressure.

    Anyways, the last thing I want to demonstrate for you (and really, for myself) is to see 3 XOR operations at the binary level

    XOR at the binary level

    Although the code (above) proves that we can swap two variables using three XOR operations without a temporary variable, I wanted to actually carry out the XOR operations at the binary level to see for myself (cause seeing is believing, right?)

    For the example below, assume that we have two registers ebx (1101)and eax (0001).

    First operation – xor %ebx, %eax

    1101
    0001
    ====
    1100

    The xor operation yields 1100 and is then store into ebx.

    Second operation – xor %eax, ebx

    0001
    1100
    ====
    1101

    This operation yields 1101 which is then stored eax.

    Third operation – xor %ebx, %eax

    1100
    1101
    ====
    0001

    And hooray — the final xor yields: 0001.

    References

    http://www.asciitable.com
    https://c9x.me/x86/html/file_module_x86_id_330.html

  • Here’s some assembly instructions … now write the corresponding C code

    A wide grin formed on my face, after successfully completing an exercise (from Computer Systems a Programmer’s perspective) that required me to write C code, based off a sequence of a six assembly instructions:

    void decode1(long *xp, long *yp, long *zp) {
    /* xp stored in %rdi, yp in %rsi, zp in %rdx)
    
    decode1:
        movq (%rdi), %r8
        movq (%rsi), %rcx
        movq (%rdx), %rax
        movq %r8, (%rsi)
        movq %rcx, (%rdx)
        movq %rax, (%rdi)

    The exercise consists of solely of mov instructions, which is capable of moving bytes (movq specifically moves a quad word) from:

    • register to memory
    • register to register
    • immediate value to register
    • immediate value to memory

    So, I pulled a pen and paper from my backpack and began tracing the flow of data between the multiple registers and memory locations.  I then took that chicken scratch, and wrote the corresponding C code.  Finally, I flipped to the end of the chapter to compare my code.  Bingo—the C code I wrote mapped exactly to the author’s answer key.

    I celebrated the tiny victory.

    The purpose of the exercise is two fold.  First, the exercise illustrates that higher level programming languages—including C, Python, and Java—compile into a lower level language: assembly.  Moreover, this compels programmers, I think, to pause while writing code and question the performance implication; does this line require two instructions, three, four? Second, the exercise demystifies the concept of C pointers, a construct that many novice programmers stumble over. But after completing this exercise, I find pointers intuitive—nothing more than a value of a memory address.