E01: Stupid C++ Tricks with Dave



Dave's Garage
YouTube Blogger

Hey, I'm Dave, welcome to my shop! I'm Dave Plummer, retired operating systems engineer from Microsoft going back to the MS-DOS and Windows 95 days, and I've been coding in C and C++ for almost 40 years.

Now that I'm finally getting good at it, I thought I'd take some time to show you a few of the cool and essential things I've learned along the way.

Sometimes they'll be about performance, sometimes about style, and sometimes just ways to make your life a lot easier or stay up to date, but either way I'm confident you'll discover something new that you can integrate into your own C and C++ development.

All that, plus the story of the time I changed a million lines of the Windows source code with a single check-in.

Are you sitting comfortably? Then we'll begin!

[Intro]

Hey, what's new? Exactly! New is the topic today because we'll be talking about the new operator and everything you always wanted to know but were afraid to ask, like bad allocation exceptions, implementing your own new, using placement new, unique overloads per class, using different types of memory pools, and so on.

Every now and then I'll also throw in a tangentially related C++ quick tip, like this: if you need the min or max of more than two values, I often see this done:

int x = min(a, min(b, min(c, d)));

That's seriously messy in my opinion, so instead, write it like this using a braced list:

int x = min({ a, b, c, d });

It's vastly more readable! Behind the scenes it's using the std::initializer_list class, so it's still entirely type safe as well, unlike a macro might be or even a function with a variable number of arguments.
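As a minimal sketch of that tip, the two spellings can be compared side by side. The function names here are made up for illustration; the only library piece is the std::min overload that takes a std::initializer_list, declared in <algorithm>.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Smallest of four values using nested two-argument calls.
int min_nested(int a, int b, int c, int d) {
    return std::min(a, std::min(b, std::min(c, d)));
}

// Same result using the std::initializer_list overload of std::min.
// Every element must have the same type, which is what keeps it type safe.
int min_braced(int a, int b, int c, int d) {
    return std::min({a, b, c, d});
}

// Small usage example: both spellings agree.
void check_min_forms() {
    assert(min_nested(4, 2, 9, 7) == 2);
    assert(min_braced(4, 2, 9, 7) == 2);
}
```

The braced form scales to any number of values without adding another level of nesting, and std::max has the same overload.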

To get started, one story I want to tell soon is the story of the day that I personally changed about a million lines of Windows code.

Literally.

It really needs its own episode to be told properly, but I'll give you the highlights today.

It all takes place in about 1994 back on Microsoft campus.

I had only been at Microsoft for about a year and had just moved over from the OLE / COM32 team to join a newly formed NT Shell team.

This was politically a bit of a hot potato because there already was a Windows NT Shell team, but they were working on a new, more better user interface for the future operating system known as Cairo.

It was assumed that Win95 would be the end of the line for its interface and that Cairo would take over the desktop of the future with its suitably futuristic and powerful interface.

Since we all know that didn't come to pass, what actually did happen? Well, it is a long story and that's why it needs its own episode, so be sure you're subscribed to the channel so that you don't miss it when I release it! But, long story short, a few close friends and I ported the entire Windows 95 user interface, including the start menu, common dialogs, common controls, desktop, and so on over to Windows NT.

That meant porting it from 8-bit ASCII text and the Windows 95 kernel over to 16-bit Unicode and the Windows NT kernel.

As the RISC guy, for me it also meant porting it to 32-bit MIPS platforms with all the attendant alignment concerns.

But perhaps the biggest portion of the changes was the move from 8-bit to 16-bit characters.

The problem is that before Unicode, programmers pretty much assumed that a character was synonymous with "8 bit quantity" and always would be.

It could be signed or unsigned, but it had always been 8 bits.

When we changed that assumption by making characters 16 bits, about a million lines of code had to be changed and hand-reviewed.

Anything that used a character, or a character pointer, or that used a sizeof operator, was suspect.

So was any memory allocation or string operation.

Consider that in those days it was incredibly common to pass the byte size of a character buffer to an API that expects the number of characters.

They used to mean the same thing, but not anymore.
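To make the byte-count versus character-count trap concrete, here is a small sketch. The copy_text function is a hypothetical API invented for this example, not a Windows call; it takes its capacity in characters, the way the text describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical API in the style described: 'capacity' is a count of
// characters (including room for the terminating null), not a byte count.
bool copy_text(wchar_t* dest, std::size_t capacity, const wchar_t* src) {
    assert(dest != nullptr && src != nullptr);
    std::size_t len = std::wcslen(src);
    if (len + 1 > capacity) return false;   // would overflow the buffer
    std::wmemcpy(dest, src, len + 1);       // copy the text plus its null
    return true;
}

// Byte size of a buffer holding 'chars' wide characters.
constexpr std::size_t bytes_for(std::size_t chars) {
    return chars * sizeof(wchar_t);
}

// Usage example: an 8-character buffer, sized correctly in characters.
bool fits_in_eight(const wchar_t* text) {
    wchar_t buffer[8];
    // Correct: element count. sizeof(buffer) alone would be a byte count
    // and would overstate the capacity once characters are wider than 1 byte.
    return copy_text(buffer, sizeof(buffer) / sizeof(buffer[0]), text);
}
```

With 8-bit characters, sizeof(buffer) and the element count are the same number, which is why the mistake went unnoticed for so long. With wide characters the byte size is at least double the count on common platforms, so passing it as a capacity invites an overrun.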

This required me to have the entire source code to all of Windows checked out onto my trusty 100MHz dual core MIPS RISC box.

The vast majority of our changes would be on the User side of things, focused primarily in the shell namespace, but it still mandated some kernel changes here and there as well.

You can see in my photo that I'm rocking dual full-height external SCSI hard drives just to accommodate the space needed to build all of Windows back then.

They were probably Seagate 68-pin SCSI II drives, but I'd imagine they were at most one gigabyte each.

Somewhere I wrote down some statistics about how many lines changed.

Between every instance of sizeof, character pointer, ARRAYSIZE, memcpy, strcpy, malloc, calloc, and anything else impacted by the Unicode change, it all amounted to more than a million individual changes.

And they were all checked out on my own dev machine using our source library management system known as SLM, which was so rudimentary back in the day that it didn't even support branching yet.

Here's another quick tip - when you need the size of an array, but you need it in elements and not bytes, you should create or use the ARRAYSIZE macro.

It's typically defined by dividing the size of the overall array by the size of an individual element, which gives the number of elements in the array.
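A typical definition looks like the sketch below. The macro body follows the description above; the array_size template is an addition of this sketch rather than something from the episode, included because the macro quietly gives the wrong answer when handed a pointer instead of a true array.

```cpp
#include <cassert>
#include <cstddef>

// Classic macro form: total bytes divided by bytes per element.
// Guarded because Windows headers already provide ARRAYSIZE.
#ifndef ARRAYSIZE
#define ARRAYSIZE(a) (sizeof(a) / sizeof((a)[0]))
#endif

// Type-checked alternative: binds only to real arrays, so passing a
// pointer is a compile error instead of a silently wrong number.
template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) noexcept {
    return N;
}

// Usage example: element count versus byte count of the same buffer.
void check_array_size() {
    wchar_t path[260];
    assert(ARRAYSIZE(path) == 260);                  // elements
    assert(array_size(path) == 260);                 // elements
    assert(sizeof(path) == 260 * sizeof(wchar_t));   // bytes
}
```

In C++17 and later, std::size from <iterator> does the same job as the template.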

For the shell, I hand reviewed every instance of sizeof in the code.

I then changed them to uppercase SIZEOF or ARRAYSIZE as each situation called for.

Rinse and repeat until all the lowercase sizeofs are eliminated, and you're done with that one step.

When the time came to check these million or so changes in, you can imagine there's a certain pucker factor.

I don't think I assumed the build lab would get through it the first time without a hitch, so needless to say I hung around until I knew it all built, and it did before I left for the day, so it was largely without incident.

In the photo you might have noticed that I'm posing along with a dollar bill on my corkboard.

What's that about? This was our first build of the shell that successfully compiled, but would it even run? And if so, how far would it make it before hurling digital chunks? Our dev lead Bob and I placed our bets - I bet we'd at least see the desktop colors, but Bob thought it would crash early in startup, long before that.

Once it was done compiling, I nervously typed explorer.exe into the command line and pressed enter.

It was a full symbolic debug build, so much larger and slower than normal.

And the hardware was all made 25 years ago! It churned away with the old Barracuda disk heads thrashing like mad as it brought in page after page of freshly ported binary.

And then, first light: the tell-tale aqua color of the Windows 95 desktop appeared.

That was followed moments later by the desktop icons and start button.

Everything had the wrong icons and the text was entirely messed up, but it got far enough to show us that much, so Bob cheerfully handed me his dollar, and I pinned it up on the corkboard for all to see and ask about for pretty much the rest of my career thereafter!

The huge check-in was a success, and the only big mistake I made, in fact, was automatically updating any file history it found with a new entry that read something like:

November-24-1995 davepl Ported to NT from Win95

The problem is that quite often this meant my alias was now the ONLY one that appeared in the file, and that was true for hundreds if not thousands of Windows source files, and so I'd get desperate late-night requests from interns and new hires looking for information on some obscure source file like the TAPI area code dialog.

And people would think I was merely playing dumb when I claimed not to know anything about it.

As noted, we had to inspect and correct every case where the memory allocations used new or malloc or any other memory API.

If character strings were involved at all, you had to factor in the new 16-bit width of the characters.

As a result I got a lot of practice at reviewing and rewriting memory allocation calculations, and along the way I saw a bug repeated from time to time: misuse and misunderstanding of the new operator, and so the new operator will be the focus for today's episode.

A lot of people think that new is basically the same as malloc, except it runs the constructor of the object before returning it to you.

And in essence, that is primarily what happens.

There's a subtle and important difference, however, that, if you don't know about it, will cause your error path never to be taken.

Instead, your program will just silently exit and you won't even know why most of the time.

Let's look at a hypothetical implementation of the new operator.

To be clear, when implementing new you do not need to call the object constructor yourself.

The compiler does so separately; all you need to worry about when writing your new is satisfying the memory allocation.

Our example code is straightforward enough: allocate the memory using malloc and then return the result.
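The code shown on screen is not part of this transcript, so what follows is only a sketch of the same idea, written as a class-level operator on a made-up Widget type so it does not replace the global allocator. It adds one thing to the bare malloc-and-return version: a check that throws std::bad_alloc, because an ordinary operator new is not allowed to return null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct Widget {                       // hypothetical class for illustration
    int value = 42;

    static int live_allocations;      // bookkeeping so the effect is visible

    // Supplies raw memory only; the compiler runs the constructor afterwards.
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();  // must not return null
        ++live_allocations;
        return p;
    }

    // Matching release; the destructor has already run by the time we get here.
    static void operator delete(void* p) noexcept {
        if (p != nullptr) --live_allocations;
        std::free(p);
    }
};

int Widget::live_allocations = 0;

// Usage example: callers write ordinary new and delete.
void check_widget_allocation() {
    Widget* w = new Widget;           // calls Widget::operator new, then the ctor
    assert(w->value == 42);
    assert(Widget::live_allocations == 1);
    delete w;                         // runs the dtor, then Widget::operator delete
    assert(Widget::live_allocations == 0);
}
```

Swapping std::malloc for a pool allocator is the only change needed to redirect every Widget to a different memory source.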

You could get the memory in any way you wanted, so for example if you wanted new to return memory from a shared pool of some sort you could overload the operator to do that sort of thing.

I've done similar on the ESP32 to return SPI memory when desired, and we'll talk about that shortly.

You can even overload it on a class by class basis so some objects come from different pools or sources.

When someone uses the new operator, they often tend to follow this pattern: they call new and check for a nullptr coming back, just as they would if using malloc.

And if new were actually implemented as shown in the sample, that would work.

But it isn't, because new is spec'd by the standard to never return a null.

If it cannot satisfy the memory request, it throws a bad_alloc exception instead.

Checking for null doesn't work, and will do nothing, because when the standard C++ new operator can't allocate the memory it needs, it just throws an exception.

That means you really have three options: you can either let those exceptions go unhandled and the system will terminate your process and clean up the mess.

Or your code can catch the exception and do something reasonable in response, like backing up to the last operation.

Perhaps, though, you don't want to mess with exceptions at all and would prefer the allocator just returned null when it fails to allocate, as so many people think it does anyway.

If that's the case, you can simply pass the constant std::nothrow to the new operator, which will cause it to return a null instead of throwing an exception.

Here's some sample code that first allocates all the memory it can using new until an exception is thrown and the code catches it.

In the second variation, it loops while allocating with the std::nothrow version of new.

This second version returns a null when the allocation fails.
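The sample itself is not in the transcript, so here is a hedged sketch of both variations. Rather than looping until memory runs out, which can take a long time on systems that overcommit memory, it makes one request assumed to be too large for any typical machine. The function names and the g_sink variable are inventions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <new>

// Storing the pointer here keeps the compiler from optimizing the
// allocation away, which it is otherwise permitted to do.
char* volatile g_sink = nullptr;

// A request size assumed to be unsatisfiable on typical systems.
std::size_t impossible_size() {
    return static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max());
}

// Plain new: failure arrives as an exception, never as a null pointer.
bool plain_new_throws(std::size_t bytes) {
    try {
        char* p = new char[bytes];
        g_sink = p;
        delete[] p;
        return false;                  // the allocation actually succeeded
    } catch (const std::bad_alloc&) {
        return true;                   // the only failure path for plain new
    }
}

// nothrow new: failure arrives as a null pointer, so the check means something.
bool nothrow_new_returns_null(std::size_t bytes) {
    char* p = new (std::nothrow) char[bytes];
    g_sink = p;
    if (p == nullptr) return true;
    delete[] p;
    return false;
}

// Usage example: both forms report the failure, each in its own way.
void check_allocation_failures() {
    assert(plain_new_throws(impossible_size()));
    assert(nothrow_new_returns_null(impossible_size()));
}
```

Catching std::bad_alloc also covers std::bad_array_new_length, which derives from it and is what some compilers throw for oversized array requests.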

[std::nothrow - cppreference.com]

If this looks a little bit new, let's consider how it works by looking at the signatures for the new operator, both with and without the second constant argument.

Passing the nothrow constant serves to disambiguate the new operator so that the compiler calls the specialization with that nothrow constant.

void* operator new(std::size_t);
void* operator new[](std::size_t);
void* operator new(std::size_t, const std::nothrow_t&) noexcept;
void* operator new[](std::size_t, const std::nothrow_t&) noexcept;

(Older references write the first two with a throw (std::bad_alloc) specification; dynamic exception specifications like that were removed from the language in C++17.)

Now what if you had an object in ROM, let's say? Or even one copied from ROM or disk into RAM.

It's ready to go, and you want to call its constructor and start using it, but how do you "construct" a C++ object at a particular place in memory, or if it already exists? I suppose you could just manually invoke its constructor, but that seems rather bush league.

Instead, we should use the rarely understood placement new operator.

void* operator new(std::size_t, void*) noexcept;

When you call placement new, you provide the type as normal and the compiler supplies the size, but you also specify the address in memory where the object will live.

That memory must already be set up and ready to go.

The compiler then calls the constructor for you but simply returns the same pointer you gave it, which has the effect of constructing the object in place and returning its address as the "this" pointer to the new object, ready to go.
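A minimal sketch of placement new, using a made-up Sensor type and a static byte buffer standing in for the memory that is "already set up":

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Sensor {                        // hypothetical type for illustration
    int id;
    explicit Sensor(int i) : id(i) {}
    ~Sensor() {}
};

// Storage that already exists; alignas makes it suitable for a Sensor.
alignas(Sensor) unsigned char g_storage[sizeof(Sensor)];

Sensor* construct_in_place(int id) {
    // Placement new: no allocation happens, the constructor simply runs
    // at the address we pass in, and that same address comes back.
    Sensor* s = new (g_storage) Sensor(id);
    assert(static_cast<void*>(s) == static_cast<void*>(g_storage));
    return s;
}

void destroy_in_place(Sensor* s) {
    // No delete here: the memory never came from new, so only the
    // destructor is run, explicitly.
    s->~Sensor();
}
```

The pairing matters: an object built with placement new is torn down with an explicit destructor call, and whoever owns the underlying storage decides what happens to it afterwards.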

Thanks to C++ inheritance, it's a little more complicated than that.

The actual order of operations is to first call the constructors for the virtual base classes.

Next it calls the constructors of the nonvirtual base classes.

Then it calls the constructors of all class members before finally calling the constructor of the class itself.

Thus, you can see why you likely don't want to try to replicate all that behavior yourself.
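That ordering is easy to observe with a few made-up types that each append a letter to a log as their constructor runs. This is just an illustration of the rule described above; all names are hypothetical.

```cpp
#include <cassert>
#include <string>

std::string g_log;   // records the order in which constructors run

struct VirtualBase { VirtualBase() { g_log += "V"; } };
struct PlainBase   { PlainBase()   { g_log += "B"; } };
struct Member      { Member()      { g_log += "M"; } };

// PlainBase is listed first on purpose: the virtual base is still
// constructed before it, regardless of declaration order.
struct Derived : PlainBase, virtual VirtualBase {
    Member member;
    Derived() { g_log += "D"; }
};

// Usage example: build one Derived and return the recorded order.
std::string construction_order() {
    g_log.clear();
    Derived d;
    (void)d;
    return g_log;
}

void check_construction_order() {
    // Virtual base, non-virtual base, member, then the class itself.
    assert(construction_order() == "VBMD");
}
```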

Let's pause to address a question you might have had by now - why does the regular allocating new operator exist at all? Couldn't you simply malloc the object and then run placement new on the address? And yes, you could.

But one benefit of the new operator is that it's type safe - you're required to cast and force the result of a malloc to match your type, but the return value from new actually IS that type, so it's safer as you don't coerce or change it.

Plus, another big benefit is that you can overload the new operator on a class by class basis, something that would be difficult or at least messy to accomplish with malloc.

Let's have a look at how to do that.

For this case I'll use a real-world example.

On the ESP32 microcontroller chip there are two types of RAM: regular RAM, which is directly connected to the CPU, and PSRAM, which is connected via the SPI bus.

Regular RAM is faster, but in much shorter supply.

PSRAM is slower, but more plentiful.

You had a similar situation on the old Amiga architecture with Chip RAM and Fast RAM.

Thus you'd like certain types to come out of PSRAM, while others should come out of regular RAM.

[Overloading new and delete operators at Global and Class level - thispointer.com]

The new and delete operators can be overloaded globally or per class.

Let's say that we have a class called AudioBuffer and we want it to be allocated from PSRAM rather than regular RAM whenever anyone creates one.

Even if the caller doesn't know anything about it, the special overloaded version for just that class will be called.

Here you can see that the AudioBuffer class defines its own new and delete operators that call PSAlloc and PSFree instead of the regular malloc and free.

Now any creation of an AudioBuffer class object will automatically happen in PSRAM instead of regular RAM.

It will also be freed properly as we defined the delete operator for it as well.

The user or caller of the class need not know anything about this; it just happens automatically because we've overloaded the operator for this one class.
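The on-screen class is not reproduced in the transcript, so this is a sketch of what such a class could look like. PSAlloc and PSFree are named in the text but never defined there; the versions below are stand-ins that wrap malloc and free and count calls so the example runs on a desktop. The member layout of AudioBuffer is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Stand-ins (assumptions) for the PSAlloc/PSFree mentioned in the text.
// On real hardware these would hand out and reclaim PSRAM instead.
int g_ps_allocs = 0;
int g_ps_frees  = 0;

void* PSAlloc(std::size_t size) { ++g_ps_allocs; return std::malloc(size); }
void  PSFree(void* p)           { if (p) ++g_ps_frees; std::free(p); }

class AudioBuffer {
public:
    // Every `new AudioBuffer` is routed here instead of the global allocator.
    static void* operator new(std::size_t size) {
        void* p = PSAlloc(size);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }

    // Every `delete` of an AudioBuffer returns the memory to the same pool.
    static void operator delete(void* p) noexcept { PSFree(p); }

    short samples[256] = {};   // payload size is an arbitrary assumption
};

// Usage example: the caller writes plain new/delete and never sees the pool.
void check_audio_buffer() {
    int allocs_before = g_ps_allocs;
    int frees_before  = g_ps_frees;
    AudioBuffer* buffer = new AudioBuffer;
    assert(g_ps_allocs == allocs_before + 1);
    delete buffer;
    assert(g_ps_frees == frees_before + 1);
}
```

On ESP-IDF the stand-ins would presumably wrap something like heap_caps_malloc with the MALLOC_CAP_SPIRAM capability and heap_caps_free. Note that new AudioBuffer[n] would still use the global allocator unless operator new[] and operator delete[] are overloaded the same way.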

I once massively increased the speed of a component that allocated tons of small buffers.

It placed a significant load on the heap manager, and they were always used in scope or below, never passed back out.

I simply overloaded the new and delete operators for that class to allocate from the stack using alloca() instead of malloc.

alloca() allocates memory from the stack instead of the heap and is orders of magnitude faster than allocating heap memory because it simply advances the stack pointer register, which is almost free to do, versus searching for and returning to you a free block of the appropriate size from the system heap.

The only catch, of course, is that they go away at the end of the function, since they're on the stack! The point, however, is that you can do some powerful things by overloading the new and delete operators appropriately and judiciously.

When it comes to arrays, there's a fair bit of responsibility placed on the programmer in C and C++.

When you allocate an array of objects like this:

Foo * f = new Foo[100];

The system allocates and constructs 100 Foo objects for you, and all you get to show for it is a pointer to the first one.

When it comes time to free the memory, what if you just did this:

delete f;

Well, the problem is that means you want to just delete the first element, since that's literally what you've said.

Formally it is undefined behavior, so it may well corrupt the heap too, for all we know.

The compiler has no way to know you're holding onto a pointer to an array or where you got it from, so when deleting an array of objects the onus is on you to do it properly by using the delete array version with the square brackets, as follows:

delete [] f;

Keep in mind this is true for built-in types like int as well, not just fancy classes with constructor hierarchies.

Failing to properly indicate you're deleting an array will likely implode at runtime no matter what the type is.
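A small sketch makes the correct pairing visible. Foo here is a made-up class that counts how many instances are alive, so the effect of new[] and delete[] can be checked; the mismatched delete f is deliberately not executed because its behavior is undefined.

```cpp
#include <cassert>
#include <cstddef>

struct Foo {
    static int alive;          // number of Foo objects currently constructed
    Foo()  { ++alive; }
    ~Foo() { --alive; }
};

int Foo::alive = 0;

// Allocates an array, records how many objects exist, then releases it
// with the matching array form of delete.
int peak_alive(std::size_t count) {
    Foo* f = new Foo[count];   // constructs 'count' objects
    int peak = Foo::alive;
    delete[] f;                // destroys all of them, then frees the block
    return peak;
}

// Usage example: 100 constructed, and none left over afterwards.
void check_array_delete() {
    assert(peak_alive(100) == 100);
    assert(Foo::alive == 0);

    int* numbers = new int[16];   // the same rule applies to built-in types
    numbers[0] = 1;
    delete[] numbers;
}
```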

This all raises one question - if the compiler must be told that it's even an array to begin with, how does it then know how many elements are in it? And why can't it use that information to know it's an array in the first place? Well, the answer is that if and only if it's an array, the compiler will have squirreled away the info it needs about how many elements there are.

How and where the compiler stores that is implementation-specific - it could be stored in memory along with the array or it might be in an associative map elsewhere.

But that's only present for arrays and not for simple pointers to single objects, so telling the compiler that you're deleting an array is what clues it in to the fact that it can now look up the size.

Now that we're up to speed on the new operator and how it works relative to pointers and arrays, next time we'll look at how to remove them from your life entirely by replacing them with safe pointers, or more specifically, unique and shared pointers from the C++ standard library.

That's all coming up in the next episode, so please make sure you're subscribed to the channel so you don't miss it! And remember, if you learned anything along the way today, please drop me a like on the video so I get a rough gauge of how advanced to make these.

I'll judge their success largely by your likes and subs, so please take a moment to like the video if you indeed found it useful or interesting! Thanks for joining me out here in the shop today! In the meantime, and in between time, I hope to see you next time, right here in Dave's Garage.
