Wednesday, December 15, 2010

"Inline" functions and you

There seems to be much confusion about inline functions in C++ and what, if anything, they offer. Many people believe that use of the 'inline' keyword and defining member functions inside class declarations actually cause the compiler to inline functions like macros so that a JMP call (and possibly more) can be avoided. Thus, in the name of efficiency people will define vast numbers of functions inside of headersand invent half-assed coding policies in order to gain the benefit of this optimization. Are they actually gaining anything or are they just causing themselves problems?

The first thing to realize about "inline" functions is that they don't have to be. The C++ standard clearly dictates that a compiler may completely ignore the request to inline a function. When a compiler does this, the only recourse it has is to create multiple definitions of that function, one for each translation unit (compiled source file) that it is used in. When this happens, not only has there been no speed gained through the inlining of the function, but there is now a size increase caused by there being more than one chunk of code that does exactly the same thing.

Of course, people who advocate use of inline functions for performance gain will point out that the linker will then come along and remove duplicates of the function if it was not actually inlined. Is this true? Maybe. The linker is of course free to do so if it is able to. The standard allows implementations to remove any chunk of code that's removal does not change the result of the program. Therefor unused functions and duplicates can be removed. It certainly does not HAVE to do this though.

There is also the point that program size is usually not as important as program speed. Well, this is often true and it is also sometimes false. Size vs. performance requires a cost/benefit analysis that cannot possibly dictate policy beyond any given project. A effort to produce a program on an embedded architecture, for example, will often have exactly the opposite requirement and optimization will be about making the program smaller and use less memory more than making it run faster.

So at this point we can safely say that putting your functions in headers as inline members or globals can increase performance (in which case it increases program space because inlined functions make the functions that use them larger) and may very well use up more space and not provide any performance benefit at all, depending on the implementation.

What about the other end though? It is an apparently wide held belief that not using inline functions makes it impossible for an implementation to perform an inline optimization. Even supposed experts make this very claim. This of course leads those worried about making their programs as fast as possible to spend an inordinate amount of time trying to make everything "inline" in order to gain this boost that may or may not ever occur. Worse, they spend all this effort long before ever actually testing the program to see if it even needs a performance boost at all or that the functions they are doing this with are bottlenecks and need to be inlined at all (something compilers are fairly good at deciding themselves these days).

But is it true that for a function to get inlined it must be in the header so that it is available to the compiler at the point where it is called? In other words, does a function definition have to be available within scope of the place it is called in order for an implementation to perform an inline optimization upon it? The answer to that question is nothing short but an astounding and certain maybe. You see, as with the case of code duplication caused by the inclusion of inlined functions that were not inlined being removed by the linker, today's linkers can also perform function inlining! That's right, an implementation is as free to inline functions at any point in the process as it is to remove duplicates or unused code.

What does your compiler do? Well, you'll just have to read the documentation to find out. The MS VC++ suite calls these linker optimizations "Whole Program Optimization" and you have to turn it on. In the GNU world it is called the standard link-time optimizer and is governed by the -flto switch.

When all is said and done, it is absolutely silly to dictate use of inline functions as policy. You're not necessarily gaining anything at all and can actually be causing bloat without purpose. What you should do, and it's really sad that so few people seem to understand this, is write your program however it seems makes it the most clear, maintainable, and understandable. THEN, if you notice that you're not getting the performance you want or need, you profile your program to find out WHERE the bottleneck is and fix it. This may or may not be answerable by using function inlining.

Too many people start with, "I want my program to be fast and so I must [INSERT ASSUMPTION HERE] in order to make that happen." They are, as in the case of using inline functions, almost certainly in error.

Friday, December 10, 2010

A pet peeve

Quite often on various help sites I frequent I'll run into someone confused about pointers that wonders something like, "I've got two pointers that point at the same thing, do I delete both? One? Which?" etc... I've yet to see anyone but myself correct their use of the terms and I think that's because people aren't really paying attention to exactly what is going on and why the common phrase, "...delete a pointer...," confuses the issue. Experts do know what's going on, but still use the terms incorrectly. I think it's time that we stopped and I bet that by correcting use and correcting those using them wrong, we could improve the understanding of noobs.

The first and foremost thing that all noobs fail to get is that pointers are values. A 'pointer to T' is not anything magical but for the green developer they always seem to be. The reason being is that we don't hammer into them that 'pointer to T' is nothing but another value type. In fact, the only reason that 'T' is involved at all is so that the compiler knows how far to increment it when we do math on the value. Besides that, 'pointer to T' is ENTIRELY unrelated to T or any instance of T. The 'pointer to T' type is simply another value type that holds a number and allows a few additional operators to manipulate it: *, ->, and delete.

The second thing that noobs seem to misunderstand is that they think 'pointer to T' somehow holds a T; this couldn't be further from the truth. The T that is 'pointed to' by a 'pointer to T' is exactly like any other T that you might create. It looks the same, acts the same, it IS the same. There doesn't even have to be any difference in location because 'pointer to T' can still 'point at' a T that was simply created as a value on the 'stack'. Another way of showing it is to point out that you can use new (or malloc) to create a T on the free-store (or heap) and not have any pointer anywhere: 'new T;'. That T exists just like it would with a pointer and is exactly the same shape as any other T created anywhere else. All that the value type 'pointer to T' does is tell you how to find it.

The next thing that noobs misunderstand, and this is the one I think experts should work harder to correct even in themselves, is that you do not delete (or free) pointers; you delete what they point at. Since a 'pointer to T' is a value type just like T is, it behaves just like T does. You don't delete them unless you created them with new. The only difference between 'pointer to T' and T itself (assuming T is not also a pointer type) is that it can be an operand of the delete operator. Passing a pointer into the delete function, by using the delete operator, tells that function to destroy the object pointed to by that pointer; you are NOT deleting the pointer. If that object is not on the free-store, or has been deleted already, then passing that pointer, a value that still exists and has the same value it always had, into the delete function results in very bad things. If you want to delete a pointer then you need to have a pointer to it and use THAT pointer as the delete operand; you will not be deleting the 'pointer to pointer to T' and hopefully the 'pointer to T' that 'pointer to pointer to T' points at was created on the free-store and still exists. In short, you NEVER delete value types that are not created on the free-store with the new operator and 'pointer to T' is not different in this regard.

So, the next time some noob asks something like:

I created an object with new, assigned it to ptr1 and then copied ptr1 into ptr2. Do I need to delete both pointers? Just one? Which one should I delete? What happens to the other one?...etc...


Please, when someone asks this, correct their terminology and your own. Explain to them that...

Neither. You don't delete the pointers, you want to delete what they point at. In order to delete an object you need to pass a value into the delete operator that contains a number representing the location of that object, a pointer. Since both of your pointers have the very same value, you can use either to delete the object they point to. Nothing happens to the pointers after that, and this is very important because they still contain a number representing a location that is no longer being occupied. You need to explicitly set those values to 0 so that you can tell again later or you need to destroy those pointers by allowing them to go out of scope.


Of course some people think that you shouldn't set a pointer to 0 because doing so hides "dual delete errors". Those people are retards though and are trying to debug through undefined behavior, which is just plain stupid.

At any rate, the point is to make sure that people understand that a pointer is simply another type of value. You do not delete it unless you've newed it and nothing is done to it by passing it as an operand to delete.