
The Big C Trap

The biggest danger in C isn't pointers or macros or obscure undefined behavior surrounding bit shifts of signed integers. It's trying to write the fastest code all the time.

We've been warned all along: "Premature optimization is the root of all evil." This is like telling an alcoholic to "drink responsibly". Yes, but actually no. It's not specific enough, and not strong enough. It doesn't communicate the dangers and consequences effectively to some shithead who woke up drunk, grabbed the bottle on his night stand, and started trying to drink the hangover away. On a Thursday afternoon.

The affliction is bad. Really bad. The sort of person who is attracted to C, especially in this day and age, loves the idea of squeezing every last bit and clock cycle out of every struct and algorithm. In C it's possible. You have all the power, plus what the compiler gives you. SIMD is at your fingertips. And threads, and atomics, and Vulkan, and direct syscalls. Nobody has you on a leash, no stupid runtime is secretly adding branches to your pointer math or dragging around secret function parameters. Everyone else is just pretending to program by comparison.

And that's all true and wonderful! But with great power comes great responsitrillitrance. Or more precisely, if the original quote had been by a proper philosopher, great power requires great discipline. You don't have to go out and use your power for good, but you do have to refrain from using it for evil. And the greatest evil in C, aside from advocating defer, is the psychotic, obsessive compulsion to write efficient code.

At least that's what we call it to ourselves. I openly include myself in this category. I am a recovering optimizer, and I relapse frequently, especially if I haven't had any alcohol in a while. It's not optimization though. It's a mental illness that causes us to prioritize the tiniest, most insignificant performance and memory gains over actually important things like clarifying our thoughts, exploring the problem space, experimenting with other methods, and, you know, finishing the fucking project. The fact that this OCD is confined to C code means that it doesn't directly destroy our lives, but it doesn't change the nature or diminish the severity of the problem within our projects.


The core of every mental illness is a conflict between our desires and the nature of reality. If there weren't then it wouldn't be an illness, merely a philosophy. An alcoholic wants to drink all the time for whatever reason, but reality says that being drunk all the time will ruin your health and relationships. A coffee addict wants to drink coffee all the time and reality is fine with this. Alcoholism is a mental illness because it harms your life whereas coffeeism is a hobby at best because your life can go on normally. Neither alcohol nor coffee is inherently bad; most people drink both responsibly on occasion without any issues whatsoever. Likewise, there are times when it is completely appropriate to spend two days squeezing a handful of clock cycles out of an algorithm. But that's not every single string copy. That's not even once a year for most developers.

The limiting factor of software development is cognitive load. If you're not bottlenecked by cognitive load then you aren't using powerful enough programming techniques or you're bogged down by clunky toolchains. The first is a skill issue (see other articles on this site) and the second is a wisdom issue (stop doing dumb stuff like using grunt). Your precious, limited supply of cognitive capacity needs to be allocated to the most important, most productive parts of the application, not optimizing out a couple strlen() calls on file names. It's the same sort of insanity as someone who obsessively cleans their apartment while failing university classes. It's the same sort of insanity as an obese person obsessing over whether their snacks have GMO ingredients. It's the same sort of insanity as a compulsive gambler driving to the other side of town to save five cents a gallon on gas; sometimes you don't even get faster OCD code because you made it too complicated and made performance assumptions that were not correct.

C is already faster than everything else without you even trying. It's frequently twice as fast as idiomatic C++, which is an order of magnitude faster than most other compiled languages, which are an order of magnitude faster than jokes like Python and PHP. The most naive string handling in C is fast as hell. You can asprintf(3) everything and still win language shootouts.

Except those are the words of an addict. I naturally wandered off when writing this, and decided to leave it in and discuss it instead. I'm still thinking about the speed of string operations. String operations don't matter at all in nearly all cases. They're memory-bound, so CPU cycles are irrelevant.

And there I go again, right back to obsessing over unimportant things. The problem is the obsession, the constant mental attraction to the concept and feeling of microoptimization. It really is a feeling, a warm and cozy feeling that all is right in the world, or at least this function, because a couple of bytes or cycles didn't go to waste. It's like hipsters in Seattle feeling good about using a low-flow toilet before stepping out into the rain.

My name is C_Otter and I'm an addict.


We must stay eternally vigilant against this mental illness because we, unlike alcoholics, cannot go completely dry. We are master distillers who have to taste the casks to determine the right blend. Optimization is an important part of programming and we need to do it daily, just in the correct places. Otherwise we wake up in a ditch covered in our own for loops.

So what are the 12 steps? What are the down-to-earth practical things you can do to avoid falling into the biggest trap in C?

First, forget about optimizing strings. Completely. Use sprintfdup() or str_join() for everything. Alloc a new string, use it on one line, and free it on the next. (Prolly was in a fastbin anyw— there it goes again… (I grabbed the HTML entity for ellipsis while looking up the m-dash.)) String performance doesn't matter at all. It rounds to zero compared to everything else. So stop thinking about it at all. Call strdup() willy nilly on any and every string as you see fit. It doesn't matter, and you shouldn't be bothered by it either. Being bothered is the mental illness talking.
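
For illustration, here is a minimal sketch of what a sprintfdup()-style helper could look like. The name comes from the paragraph above, but the signature and implementation are assumptions, not the author's actual code: it asks vsnprintf() for the required length, allocates exactly that much, and formats again.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sprintfdup(): printf-style formatting into a freshly
 * malloc()-ed string. Returns NULL on failure; the caller free()s it. */
char *sprintfdup(const char *fmt, ...)
{
    assert(fmt != NULL);

    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);   /* the arguments get consumed twice, so keep a copy */

    /* First pass: a NULL buffer of size 0 just reports the needed length. */
    int len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }

    /* Second pass: format for real into an exactly-sized buffer. */
    char *out = malloc((size_t)len + 1);
    if (out)
        vsnprintf(out, (size_t)len + 1, fmt, ap2);
    va_end(ap2);
    return out;
}
```

Usage is the allocate, use, free pattern described above: `char *p = sprintfdup("%s/%s", dir, name);`, do something with `p`, then `free(p);`. Formatting twice is deliberately naive, which is rather the point.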

Second, stop worrying about memory allocation. malloc() is free. Well, not free(), but gratis. You know what I mean. Allocators are much faster than they used to be anyway (and again it rears its ugly head…). When you need memory, allocate it. When you're done, free it. Stop worrying about which block is where or if it's contiguous. You will naturally end up with reasonable memory layout just by not being completely retarded; any further optimization should be done at the behest of and with the guidance of real-world profiling. If you write your functions with the proper zen then it doesn't matter where the memory is. You can change the layout trivially at any point.

Third, zero-initialize everything, even if you're about to overwrite most of it with actual data. The compiler will figure out if your initialization of stack variables can be elided. In the case of heap allocations, the initialization cost of small ones is irrelevant, and big mmap()-ed ones are going to come from the OS zeroed out anyway so calloc() is free compared to malloc(). (I can't help it…) Zero-init also avoids entire classes of silly bugs and allows you to skip having constructors on most data structures.
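
A small sketch of the zero-init habit, using a made-up `struct entry` (the type and function names are hypothetical, chosen only for this example). One caveat worth stating: calloc() produces all-bits-zero memory, which reads back as a null pointer on every mainstream platform even though the C standard does not strictly promise it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type, used only for this illustration. */
struct entry {
    char  *name;
    size_t len;
    int    flags;
};

/* Stack: "= {0}" zeroes every member, so a blank entry needs no constructor. */
void entry_stack_demo(void)
{
    struct entry e = {0};
    assert(e.name == NULL && e.len == 0 && e.flags == 0);
}

/* Heap: calloc() returns zeroed memory, so every element starts out blank.
 * Returns NULL on allocation failure; the caller free()s the array. */
struct entry *entry_array_new(size_t n)
{
    return calloc(n, sizeof(struct entry));
}
```

With this convention, "empty" and "freshly allocated" are the same state, so code that checks `e.name == NULL` or `e.len == 0` works on brand-new structs without any setup call.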

Fourth, just use a loop, dynamic array, or hash table unless your testing shows that something more sophisticated is necessary. You don't need a balanced tree. You don't need a linked list. Those aren't even usually faster anyway due to memory locality. The simplest algorithm is the best one unless testing shows otherwise. Spend your precious mental energy optimizing the parts that are actually slow instead of the parts that you assume are slow but probably aren't.
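
To make "a dynamic array and a loop" concrete, here is a minimal sketch of a growable list of strings with a plain linear search. The `strlist` type and its functions are hypothetical names invented for this example; a zero-initialized `struct strlist` is a valid empty list.

```c
#define _POSIX_C_SOURCE 200809L   /* for strdup() on strict compilers */
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of owned strings. */
struct strlist {
    char **items;
    size_t len;
    size_t cap;
};

/* Append a copy of s. Returns 0 on success, -1 on allocation failure. */
int strlist_push(struct strlist *l, const char *s)
{
    if (l->len == l->cap) {
        size_t ncap = l->cap ? l->cap * 2 : 8;
        char **p = realloc(l->items, ncap * sizeof *p);
        if (!p)
            return -1;
        l->items = p;
        l->cap = ncap;
    }
    char *copy = strdup(s);
    if (!copy)
        return -1;
    l->items[l->len++] = copy;
    return 0;
}

/* Plain linear scan. Returns the index of s, or -1 if it is absent. */
long strlist_find(const struct strlist *l, const char *s)
{
    for (size_t i = 0; i < l->len; i++)
        if (strcmp(l->items[i], s) == 0)
            return (long)i;
    return -1;
}

/* Free every string and the array, leaving a valid empty list behind. */
void strlist_free(struct strlist *l)
{
    for (size_t i = 0; i < l->len; i++)
        free(l->items[i]);
    free(l->items);
    memset(l, 0, sizeof *l);
}
```

Start with `struct strlist l = {0};`, push things, scan with `strlist_find()`, and call `strlist_free(&l)` when done. If profiling later shows the scan is a hot spot, that is the moment to reach for a hash table.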

And finally, write or steal a set of boilerplate helpers for common operations. Stop thinking about FILE pointers and SEEK_SET and the flags for mmap(). You have a file path and you need the contents in a buffer, or vice versa. That's one simple function. Write that function and stop worrying about data streaming or some other nonsense unless the circumstance or testing shows that you actually need to. Combined with the prior points, this means that walking a directory consists of calling a function that returns/fills a dynamic array of dynamically-allocated strings for all the files. That's it. No opendir(), no FindNextFile(), no nonsense.
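
As one possible shape for the "file path in, buffer out, or vice versa" helpers, here is a sketch built only on standard stdio. The names `read_file()` and `write_file()` and their signatures are assumptions made for this example, not functions from the author's toolkit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the whole file at path into a malloc()-ed,
 * NUL-terminated buffer. Stores the byte count in *len_out if non-NULL.
 * Returns NULL on any failure; the caller free()s the result. */
char *read_file(const char *path, size_t *len_out)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    size_t len = 0, cap = 4096;
    char *buf = malloc(cap + 1);          /* +1 leaves room for the NUL */
    while (buf) {
        len += fread(buf + len, 1, cap - len, f);
        if (len < cap)
            break;                        /* short read: EOF or error */
        cap *= 2;
        char *bigger = realloc(buf, cap + 1);
        if (!bigger)
            free(buf);
        buf = bigger;
    }
    if (buf && ferror(f)) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (!buf)
        return NULL;

    buf[len] = '\0';
    if (len_out)
        *len_out = len;
    return buf;
}

/* The reverse: write len bytes to path. Returns 0 on success, -1 on failure. */
int write_file(const char *path, const void *data, size_t len)
{
    assert(path != NULL);
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t written = fwrite(data, 1, len, f);
    int rc = fclose(f);
    return (written == len && rc == 0) ? 0 : -1;
}
```

Every call site then shrinks to one line such as `char *text = read_file(path, &n);`. The doubling buffer is the unoptimized choice on purpose: no fseek(), no stat(), no mmap() flags to remember.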


This one big trap held me back for years. It still holds me back and I catch myself falling in all the time, but at least now I'm cognizant of this tendency and can guard against it. With proper discipline, you can be incredibly productive in C and get better performance to boot. Microoptimization steals mental energy and focus from macrooptimization, which is where the real gains are.

Where should you focus on optimizing instead? Nowhere, at least to start. Just write the program in whatever way has the best zen that you can see at the moment. Very likely the process of doing that will expose facets of the problem that you didn't see before and you'll have to rewrite significant portions of code to reflect the proper structure of things. Any time spent optimizing would have been completely wasted. Further, you seldom have a good enough grasp of the problem to be able to properly optimize it from the start anyway. Usually only the experience of writing the naive code can give you enough wisdom to make a better version, even if you think you already know the problem well.

Once you have a working application, profile it using whatever tools or system you want. Personally, I find that per-function profiling tools are of little utility. They clutter readouts with noise even when using graphical tools. I use a couple macros to time the sections of code I'm interested in, doing a sort of binary search until I find the hot spots. With manual instrumentation you can easily combine or subtract logical sections even if they are not organized that way in the code itself. To each his own; if you like gprof then you do you. The important part is that your optimization decisions and time usage are based on actual performance data instead of intuition. I've been at this a while and my intuition is still wrong more often than not; never forget that computers are magic.
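
The author's timing macros are not shown, so the following is only a guess at the general idea: a few hypothetical macros around POSIX clock_gettime(), where each named timer accumulates seconds across however many BEGIN/END pairs use that name. That accumulation is what lets you lump scattered sections into one logical bucket.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() on strict compilers */
#include <stdio.h>
#include <time.h>

/* Monotonic wall-clock time in seconds. */
double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical macros: TIMER_DECL declares a named accumulator plus its
 * start stamp; BEGIN/END add the elapsed time of one section to it. */
#define TIMER_DECL(name)   double timer_##name = 0.0, timer_##name##_t0 = 0.0
#define TIMER_BEGIN(name)  (timer_##name##_t0 = now_sec())
#define TIMER_END(name)    (timer_##name += now_sec() - timer_##name##_t0)
#define TIMER_REPORT(name) fprintf(stderr, "%-10s %.6f s\n", #name, timer_##name)
```

In use: `TIMER_DECL(parse);` once, then `TIMER_BEGIN(parse); ... TIMER_END(parse);` around each region of interest, and `TIMER_REPORT(parse);` at the end. Subtracting sections is plain arithmetic on the `timer_` variables, for example `timer_total - timer_parse`.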

Or maybe you get lucky and find out that the application is fast and lean and doesn't need any optimization at all. The difference between a 30k and a 40k binary image rounds to zero on a machine with 32 gigs of RAM.