Don’t create reference from pointer (part 1/2)

When programming in C++ (or any other language) you can sometimes hit very weird behavior, as if your “if” statements were being ignored. You usually find this out the hard way, when your program crashes. You check the debugger and see that it crashed on a null pointer inside an “if”. Yes, the very “if” that checks for null pointers. Strange.
In this post I will try to explain what is happening to your program.

In C++ we have something called a reference. A reference is basically an alias for some value or object. The standard defines several key properties of this alias. One of them is that a reference has to be bound to a valid object when it is declared.

Because of this binding at declaration, a reference cannot be null in a well-formed program. A reference is always an alias to something, so it is expected to be valid. This has consequences for compilation and especially for the resulting assembly.
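A minimal sketch of the alias behavior described above. The function and variable names are made up for illustration and do not come from the original code.

```cpp
#include <cassert>

// Hypothetical helper, used only to illustrate how a reference behaves.
int writeThroughAlias()
{
    int value = 5;
    int& alias = value;       // bound to 'value' at the point of declaration
    // int& unbound;          // would not compile: a reference needs an initializer

    alias = 7;                // writes to 'value'; 'alias' is just another name for it
    assert(&alias == &value); // both names refer to the same object
    return value;
}
```

Calling writeThroughAlias() returns 7, because the write through the alias changed the original variable.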

During compilation, the compiler can detect when you are doing something wrong and warn you. In the resulting assembly, the optimizer can omit some checks because it knows that the object is valid.
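To make this more concrete, here is a small sketch with two hypothetical helpers (the names are assumptions, not part of the original code). One takes a reference, the other takes a pointer. Compilers such as GCC and Clang typically warn about the address check in the reference version, and an optimizer is allowed to remove it, while the pointer check has to stay.

```cpp
#include <cassert>

// Hypothetical helper: takes a reference, so the compiler may assume that
// 'value' names a valid object.
int readThroughReference(const int& value)
{
    // In a well-formed program this can never be true, so the optimizer
    // is allowed to drop the whole branch.
    if (&value == nullptr)
    {
        return -1;
    }
    return value;
}

// Hypothetical helper: takes a pointer, which may legitimately be null,
// so this check is meaningful and has to be kept.
int readThroughPointer(const int* value)
{
    if (value == nullptr)
    {
        return -1;
    }
    return *value;
}

// Small usage check for the two helpers above. Only valid objects are
// passed by reference here.
void checkHelpers()
{
    int x = 42;
    assert(readThroughReference(x) == 42);
    assert(readThroughPointer(&x) == 42);
    assert(readThroughPointer(nullptr) == -1);
}
```

The sketch deliberately never binds a reference to a dereferenced null pointer; doing so is exactly the mistake the title warns about.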

A little example

This post explains why checks get ignored when you use references. First, however, let's look at a small sample without references. You will see that the optimized binary doesn’t contain the “if” we put there.

Check this simple code:

int doSomething();

int main( int argc, char** argv )
{
    int i = 5;

    if( i == 5 )
    {
        doSomething();
    }

    return 0;
}

This is the assembly result when no optimization is applied.

main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $32, %rsp
        movl    %edi, -20(%rbp)
        movq    %rsi, -32(%rbp)
        movl    $5, -4(%rbp)
        cmpl    $5, -4(%rbp)
        jne     .L2
        call    _Z11doSomethingv
.L2:
        movl    $0, %eax
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

This is the result with -O3 passed to the GCC compiler.

main:
.LFB0:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _Z11doSomethingv
        xorl    %eax, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc

As you can see, the compiler completely dropped the “if” in the optimized build. Why? Because it knows that the “if” will always evaluate to true.
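For contrast, here is a sketch of a variant where the compiler does not know the value up front. The callIfFive function and the body of doSomething() are assumptions added for illustration; the code above only declares doSomething(). Since “i” comes from the caller, a compiler generally has to keep the comparison in the function body.

```cpp
#include <cassert>

// Stand-in body so the sketch links; the original only declares doSomething().
int doSomething()
{
    return 1;
}

// Hypothetical variant: 'i' is a parameter, so its value is not known
// when this function is compiled on its own.
int callIfFive(int i)
{
    if (i == 5) // cannot be assumed true: 'i' may hold any value
    {
        return doSomething();
    }
    return 0;
}

// Small usage check: the branch is taken only for 5.
void checkCallIfFive()
{
    assert(callIfFive(5) == 1);
    assert(callIfFive(4) == 0);
}
```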

The main() example and its compiled assembly show that the compiler does some analysis based on what it knows. The same applies to references. More on that in the second part of this article.

SfDD (Segmentation fault Driven Development)

You know it, you just did not realize it till now.

Crash; some call it crush, boom, explosion... It happened before, it is happening right now and it will happen in the future. Let's face the truth: programmers are people, and people make mistakes. This is not limited to C or C++ devs; Python devs, JavaScript devs and even Rust devs make mistakes too. When you program in some languages, the products of these mistakes are core dumps.

Most of us write multi-threaded or asynchronous (event-based) programs, and as a result the code is complex. The program flow is not linear; it is not just a bunch of ifs but rather a set of modules that are somehow interconnected. It is very complex, and with complexity bugs are introduced very easily.

We programmers get issues with our code all the time. Most of the time it is something simple or stupid, but there are times when someone hits an issue that crashes our program. And it does not just crash; we simply have no clue what is happening there (the result of the complexity). Some of us take such things as a challenge, and that is where SfDD starts.

Sure, a wise programmer writes tests that can help, or tries to reproduce the crash with the minimal steps, config and environment required. In the end, however, we are just searching for the answer to one question: is it still crashing?

Sometimes we spend a few days on SfDD. The fix is usually simple, but it requires a deep understanding of the code and the app. SfDD is actually very good in this regard, because it always teaches the programmer a lesson or two.

The best thing about SfDD is when it ends. Let's imagine a very real example. You have a crash in the system and you don’t know what is causing it. You start to debug and unit test your code, producing core dumps like crazy. You have just had to clear /var/core for the second time because you ran out of disk space, and now you have finally found it. You make a commit that changes only a few lines: maybe a few ifs added, or some legacy code removed that was only there because other modules were not working as expected, and those have been fixed already. You push it and it all ends.

The issue is solved and the feeling you get is just super. This is the best part of SfDD: the feeling of good work, accomplishment, good game, a job nicely done. You feel the power of your new knowledge; the new stuff about the C standard you learned just 2 days ago is really useful. The crazy stuff about pointers and references in C++ is still crazy, but at least you know it. Compiler optimizations look less like black magic. You also know a part of the app you work on, and others start coming to you for help. For the rest of the day you take the least IQ-demanding issue you can find and solve it like nothing.

This is SfDD, one of the best software development methods ever invented.