Tag Archives: optimization

Learning C even more

Continue to read C book. Bitwise operations today. Not a very hard, but a crucial topic. Every programmer should be able to do binary math. Well, I have learned all this stuff in high school and than again in university, but doing web development relaxes the muscle which ticks in tact with CPU.

So I was playing with bits, mostly on paper, imagining how programmers did their job in times of PDP-7, times were men were men; when this idea came: let’s build some function to determine if the number is even and do it quick.

My first try looked like this:

bool odd(int i)
{
        return 1 & i;
}

This takes about 11.5 seconds on my Core 2 Duo, and it’s 1 second slower than traditional i % 2 == 0 solution. Who said re-inventing the bicycle?

Now, what does it actually do under the hood? Here’s how my odd function looks like:

_odd:
0000000100000f00	pushq	%rbp
0000000100000f01	movq	%rsp, %rbp
0000000100000f04	movl	%edi, 0xfffffffffffffffc(%rbp)
0000000100000f07	movl	0xfffffffffffffffc(%rbp), %edi
0000000100000f0a	andl	$0x1, %edi
0000000100000f10	movl	%edi, %eax
0000000100000f12	popq	%rbp
0000000100000f13	ret

I have used otool to generate this disassembly listing, but gcc can do it too with -S key
Ok, it moves stuff around and does “andl” which corresponds to the one operation, the function actually does. Interesting… Now what’s going to happen if I will ask gcc to optimize things a bit? `gcc even.c -O1 -f even` yields this:

_odd:
0000000100000ef0	pushq	%rbp
0000000100000ef1	movq	%rsp, %rbp
0000000100000ef4	andl	$0x1, %edi
0000000100000ef7	movl	%edi, %eax
0000000100000ef9	popq	%rbp
0000000100000efa	ret

Which runs just under 3.5 seconds. By removing memory access commands, gcc made this function three times faster!
Interesting enough here’s how ‘traditional’ function work after compile optimization:

_even:
0000000100000ef0	pushq	%rbp
0000000100000ef1	movq	%rsp, %rbp
0000000100000ef4	andl	$0x1, %edi
0000000100000ef7	xorl	$0x1, %edi
0000000100000efa	movl	%edi, %eax
0000000100000efc	popq	%rbp
0000000100000efd	ret

Which is just one operation longer, but runs a bit faster still. Well, that’s something i will have to dig into next time.

Today’s takeouts:

  • don’t re-invent the bicycle;
  • re-invent the bicycle only if you want to understand how things work;
  • C is fast, but using your compiler right is even faster;
  • writing fast code is hard (for obvious reasons);
  • is there a good C IDE for Mac?