Introduction

Why C++?

  • Runs very fast and remains one of the more scalable modern programming languages
  • Programmer has fine control over memory

Why not C++?

  • Easy to shoot yourself in the foot with poor memory management
  • Because you have so much control, it’s easy to mess up and not realize it
  • Supports many older legacy features that don’t belong in modern programming
  • Compiler errors can be cryptic sometimes

Basic datatypes

Remember that a byte is defined as 8 bits.

  • int: signed integer, typically 4 bytes (so 32 bits) on mainstream platforms. unsigned int: only zero and positive numbers, same size as int
    • The standard only guarantees minimum sizes. For exact widths, use the fixed-width types from <cstdint>, such as int64_t (8 bytes, signed)
  • float: typically 4 bytes, double: typically 8 bytes (so 64 bits). Both represent decimal (floating-point) numbers
  • bool: typically 1 byte. Holds either false (0) or true (1)
    • An object in C++ can’t be smaller than 1 byte (bit-fields aside). So, the upper 7 bits of a bool are unused.
  • char: a single character (usually ASCII-encoded), always exactly 1 byte.
    • While the main usage of char is to represent characters, at the end of the day it is just a small integer type. For example, the expression 'a' + 6 is legal, because 'a' is actually represented by its ASCII code (the number 97), so the result is 103, the code for 'g'.
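
To make the size and char points concrete, here is a minimal sketch. The helper name shift_letter is made up for illustration, and the size check assumes 8-bit bytes, as stated above.

```cpp
#include <cstdint>

// Fixed-width types have the same size wherever they exist
// (assuming 8-bit bytes, as stated above).
static_assert(sizeof(std::int64_t) == 8, "int64_t is 8 bytes");

// sizeof(char) is 1 by definition in C++.
static_assert(sizeof(char) == 1, "char is always 1 byte");

// Hypothetical helper: shifts a letter forward by `offset` positions.
// The char is promoted to int for the addition, then narrowed back to char.
char shift_letter(char c, int offset) {
    return static_cast<char>(c + offset);
}
```

With an ASCII-compatible character set, shift_letter('a', 6) gives 'g', which is the 'a' + 6 expression from the bullet above.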

There are many different compilers for C++. The three main ones are

  • MSVC: Microsoft’s compiler, shipped with Visual Studio (one of the more common C++ IDEs), Windows only
  • GNU Compiler Collection (GCC): works on Linux and other UNIX-like systems
    • MinGW is the Windows port of GCC
  • Clang: works on UNIX-like systems (including macOS) and on Windows

Compilation

Top-down compilation: a name must be declared earlier in the code before it can be used.

Forward declarations

// forward declaration
int twice(int);
 
int main() {
  int i = 5;
  i = twice(i);
}
 
// otherwise, wouldn't compile
int twice(int x) {
  return x * 2;
}

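The forward declaration promises the compiler that an implementation of twice() will be provided later. This allows main() to compile, but the program as a whole only builds because twice() is actually implemented later. If it weren’t, the linker would report an undefined symbol.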

Header files

Collections of forward declarations. Declarations for classes, functions, constants, etc.

  • Use #include "name.h" to refer to them. Effectively, the preprocessor pastes the entire contents of the .h file into your .cpp file at the point where you #include it

Separating the declaration and implementation

The declaration and implementations can be located in separate files. You can have a forward declaration in header.h, implementation in impl.cpp, and then call the function in main.cpp.

  • Usually call the header and implementation the same filename, e.g. wow.h and wow.cpp.
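  • This works because the compiler only needs the declaration to compile main.cpp. The compiler never looks inside other .cpp files; it is the linker that later matches the call to the implementation compiled from wow.cpp, and it reports an undefined (unresolved) symbol error if no implementation exists anywhere.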
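
Here is a rough sketch of that three-file layout, collapsed into one listing so that it compiles on its own. In a real project each part would live in the file named in its comment, and the two .cpp files would get the declaration through #include "wow.h". The function compute() is a made-up stand-in for whatever main() would do.

```cpp
// ---- wow.h: declaration only ----
int twice(int x);

// ---- main.cpp: would start with #include "wow.h" ----
// Only the declaration above is visible here, and that is enough to compile.
int compute() {
    return twice(5);
}

// ---- wow.cpp: would start with #include "wow.h" ----
// The implementation. The linker connects the call in compute() to this.
int twice(int x) {
    return x * 2;
}
```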

Building a project

Each .cpp file is treated as its “own” thing, called a compilation unit or translation unit. By “own” I mean it doesn’t know about any other .cpp files.

  • This is why we #include "header.h" in each source file. If we want to use a function from another source file, we first forward declare it here.

Each source file is compiled to an intermediate object file, with a .obj extension on MSVC or a .o extension with GCC and Clang. The object files are then linked together, and that’s when the linker checks that a forward declaration used in one source file is actually implemented in some object file.

  • These object files are linked together into one final singular executable.
  • This is also why we don’t include entire source files in other ones, e.g. #include "source.cpp" instead of #include "header.h". Every file that included source.cpp would compile its own copy of the implementations, and the linker would then fail with multiple-definition errors. It would also mean recompiling those implementations for every include directive, which takes more time than processing a simple forward declaration.

Preprocessor directives

Include guards

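Prevents the same header from being included more than once in a single translation unit. There are two ways to accomplish this. #pragma once is the simplest way; it is not part of the C++ standard, but all major compilers support it.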

#pragma once
 
float add(float a, float b);

An older method (but guaranteed to be available) is using #ifndef.

#ifndef MATHS_H
#define MATHS_H
 
float add(float a, float b);
 
#endif

What’s happening here is that the first time the preprocessor sees this header file, it will define the macro MATHS_H, because #ifndef MATHS_H is true (the macro has never been defined before because it’s never been processed before).

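These macros stay defined for the rest of the translation unit unless we explicitly #undef them (which we don’t). The next time the preprocessor meets this header while processing the same .cpp file (for example, through another header that also includes it), the #ifndef MATHS_H test will fail, because the macro was defined the first time around. This essentially skips the whole file up to the #endif line, which should be the last line in the file, so the contents are not included again. Each .cpp file starts with a clean slate, so the header is still included once per translation unit.

We should always use include guards because the compiler throws an error if it finds two definitions of the same class, struct, constant, etc. in one translation unit. Repeating a plain function declaration is harmless, but headers usually contain definitions too.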
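
One way to convince yourself: the sketch below pastes the guarded contents of a hypothetical maths.h twice into the same listing, which is roughly what the preprocessor sees when the header gets included twice. The struct range is an invented example of a definition that would break without the guard.

```cpp
// First inclusion: MATHS_H is not defined yet, so everything up to
// #endif is kept and MATHS_H becomes defined.
#ifndef MATHS_H
#define MATHS_H

struct range { float lo; float hi; };  // a definition: allowed once per translation unit
float add(float a, float b);

#endif

// Second inclusion: MATHS_H is already defined, so the preprocessor
// skips everything up to #endif. Without the guard, the second
// `struct range` would be a redefinition error.
#ifndef MATHS_H
#define MATHS_H

struct range { float lo; float hi; };
float add(float a, float b);

#endif

// Implementation, which would normally live in maths.cpp.
float add(float a, float b) {
    return a + b;
}
```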

Pass by copy

By default, all data passed into a scope is copied in. This means in a function call, the data is copied into the stack frame, to be used by the callee.
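
A small sketch of what that means in practice (the function names are made up for illustration): the callee modifies its own copy, and the caller’s variable is untouched.

```cpp
// `a` is a copy of the argument, so changing it does not affect the caller.
float doubled(float a) {
    a = a * 2.f;
    return a;
}

// Calls doubled() and then reports what the caller's variable holds.
float caller_value_after_call() {
    float x = 10.f;
    doubled(x);   // result ignored on purpose; only the copy was doubled
    return x;     // x is unchanged
}
```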

Pass by reference

The ampersand & symbol allows a function to access memory outside of its scope.

#include <iostream>

void foo(float &a) {
  a = a * 2.f;
}
 
int main() {
  float x = 10.f;
  std::cout << x << std::endl; // 10
  
  foo(x);
  std::cout << x << std::endl; // 20
}

In this example, x first prints out 10, and then 20, because x was modified by foo() since it was passed in by reference. Inside foo(), a is a reference to the variable x, not a copy of it. Both names refer to the same value in memory.

Readonly access with const

If we combine a reference with a const, this allows us to get readonly access to an external variable without having to copy it.

vec2 operator+(const vec2& v);

This allows us to save memory and also not have to worry about whether the function we’re calling modifies the value in some unexpected side effect.
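
The notes never define vec2, so the sketch below assumes a simple two-float struct to show the declaration in context. The trailing const on the member function is an addition here; it promises that the left-hand operand is not modified either.

```cpp
// Assumed definition of vec2 (not given in the notes): two floats.
struct vec2 {
    float x;
    float y;

    // `v` is read-only and is not copied. The trailing const means the
    // left-hand operand (*this) is not modified either.
    vec2 operator+(const vec2& v) const {
        // v.x = 0.f;  // would not compile: v is const
        return vec2{x + v.x, y + v.y};
    }
};
```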

Typically has less memory usage

If you pass variables containing large data structures into functions by reference, you save a ton of stack space, and you don’t waste time copying the data into a new object.

The memory cost of a reference is constant, because under the hood it is typically implemented as a memory address: 64 bits on a 64-bit system, and 32 bits on a 32-bit system. If the object we’re referencing is much, much larger than that, passing a reference saves memory.

Of course, simple types commonly take up less space than 64 bits (for example, integers), so it would be wasting space to use a reference; copying is cheaper here.
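
To put rough numbers on this, here is a sketch with a made-up struct Big. Note that sizeof applied to a reference reports the size of the referenced object, so the size of a pointer is used as a stand-in for what a reference typically costs.

```cpp
#include <cstddef>

// Hypothetical large type: 4096 floats.
struct Big {
    float data[4096];
};

// Passing by value copies all of Big into the callee.
float first_by_copy(Big b) { return b.data[0]; }

// Passing by const reference typically passes just one address.
float first_by_ref(const Big& b) { return b.data[0]; }

// Bytes copied when passing by value.
constexpr std::size_t copy_cost = sizeof(Big);

// Bytes for one address (8 on a typical 64-bit system).
constexpr std::size_t address_cost = sizeof(const Big*);
```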

References are a general concept

References are not just limited to function headers.

#include <iostream>

int main() {
  int a = 6;
  
  int &ref = a; // ref is another name for a
  ref = 10;
 
  std::cout << a; // 10
}

Here, we directly take a reference of another variable in the same function scope. We then modify it, which also changes the value of the original variable, a.

Memory model

The stack

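Stores temporary (local) variables. It’s managed automatically by compiler-generated code, with hardware support such as the CPU’s stack pointer register, and it is fast to allocate from. This is where local variables are allocated by default in C++.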

int main() {
  // allocates memory on the stack for `a`
  vec3 a = vec3(1.f, 3.f, 2.f);
}

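Variables are pushed onto the stack in last-in, first-out order (on most platforms the stack grows downward in memory). When a function creates a local variable, it gets pushed onto the stack. When the function returns, all variables created by that function are popped from the stack, in reverse order of creation. The term unwinding the stack describes this cleanup, most often when it is triggered by an exception.

  • Memory on the stack is relatively small (commonly a few megabytes by default). Deep recursion or large local arrays will eventually run out of it, causing a stack overflow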
  • Trying to maintain many different variables and keeping them alive in the same scope can become difficult, especially with many function calls.
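
The push and pop order can be observed with a small sketch. Tracer is an invented helper that appends its name to a log when it is destroyed, so the log records the order in which variables leave the stack.

```cpp
#include <string>

// Hypothetical helper type: appends its name to a log when destroyed.
struct Tracer {
    std::string& log;
    char name;

    Tracer(std::string& l, char n) : log(l), name(n) {}
    ~Tracer() { log += name; }
};

void scope_demo(std::string& log) {
    Tracer a(log, 'a');
    Tracer b(log, 'b');
    {
        Tracer c(log, 'c');
    }            // inner scope ends: c is destroyed first
    log += '|';  // marks the point just before the function returns
}                // function ends: b is destroyed, then a (reverse order)
```

After calling scope_demo() with an empty string, the log reads "c|ba": the inner-scope variable goes first, and the remaining two are destroyed in reverse order of creation.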

Passing values across scopes

When a function scope ends, the local variables are popped off the stack, and the memory is freed. So how do we get “out” the value that we’ve calculated within functions?

Returning it

Seems obvious, but a return statement places the value in a very specific location dictated by the platform’s calling convention (for small values on x86-64, typically a CPU register rather than the stack), which the function caller can then read from after the callee has finished.

// regular way
float sum(float a, float b) {
    return a + b;
}

Pointer or reference parameters

Since these two concepts are basically a “window” into an outside scope, we can actually use them to store the return value. For me this was confusing at first, because I’ve always thought of function parameters as a “one-way” door into the function, but that’s not true.

// Pass in a reference to a float from the outside, store the calculated
// value inside of it. Not really recommended... not intuitive
void sum(float a, float b, float &out) {
    out = a + b;
}
 
int main() {
  float result = 0.f;
  sum(5.f, 7.f, result);
  std::cout << result; // 12
}
 
// This is better because we have to explicitly dereference the pointer
// to use it or set any values. It helps to differentiate that this is where
// the return value is stored. Also, when we call the function, we have to
// use the special `&` symbol to get a pointer, which also helps with clarity.
void sum(float a, float b, float *out) {
    *out = a + b;
}
 
int main() {
  float result;
  sum(5.f, 7.f, &result);
}

The heap

Much, much larger than the stack. Allocating and freeing heap memory is usually slower than using the stack, and in C++ it is not automatic: memory allocated on the heap stays allocated until it is explicitly released.

How to leak memory

It’s fun!

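void leak_memory() {
    // allocated on the heap and never freed
    float *pi = new float(3.1416f);
}
 
int main() {
    // every iteration leaks one float; an int counter could never
    // reach 10e100, it would overflow first (undefined behavior)
    for (int i = 0; i < 1000000; ++i) {
        leak_memory();
    }
    
    return 0;
}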

While the pi variable is removed from the stack every time a call to leak_memory() ends, the memory that new allocated on the heap is never deallocated. This causes a memory leak: pi held the only pointer to that memory, and once it goes out of scope we can never access or free that memory again.
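
For contrast, a non-leaking counterpart might look like the sketch below (the name no_leak is made up). It frees the memory by hand before the pointer goes out of scope.

```cpp
// Allocates a float on the heap, reads it, and frees it again.
float no_leak() {
    float *pi = new float(3.1416f);
    float value = *pi;  // copy the value out while the memory is still valid
    delete pi;          // every new needs exactly one matching delete
    return value;
}
```

This works, but it relies on remembering the delete on every path out of the function, including early returns and exceptions.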

Heap allocation

There are many ways to allocate memory on the heap. Older methods like the new operator or straight up malloc() are heavily discouraged. Modern C++ relies on RAII and smart pointers.

Resource acquisition is initialization (RAII)

Coined by Bjarne Stroustrup in the mid 80s. This is the idea that an object acquires a resource (such as heap memory) in its constructor and releases it in its destructor, so the resource is freed automatically when the object’s lifetime ends.
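
As a rough illustration of the idea, here is a hand-written RAII wrapper around a heap array. FloatBuffer and its live_buffers counter are invented for this sketch; the counter only exists so the cleanup can be observed.

```cpp
#include <cstddef>

// Hypothetical RAII wrapper: owns a heap array of floats.
class FloatBuffer {
public:
    // Acquire the resource in the constructor (elements start at zero).
    explicit FloatBuffer(std::size_t n) : data_(new float[n]()), size_(n) {
        ++live_buffers;
    }

    // Release it in the destructor, which runs automatically at scope exit.
    ~FloatBuffer() {
        delete[] data_;
        --live_buffers;
    }

    // Copying is disabled so two objects never try to free the same memory.
    FloatBuffer(const FloatBuffer&) = delete;
    FloatBuffer& operator=(const FloatBuffer&) = delete;

    float& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

    static int live_buffers;  // number of buffers currently alive

private:
    float* data_;
    std::size_t size_;
};

int FloatBuffer::live_buffers = 0;

float use_buffer() {
    FloatBuffer buf(16);  // heap memory acquired here
    buf[3] = 2.5f;
    return buf[3];
}                         // buf goes out of scope: memory released, no delete in sight
```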

Smart pointers

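The C++ standard library provides two main smart pointer classes in the <memory> header, std::unique_ptr and std::shared_ptr, to help us access and manage heap memory. Smart pointers automatically deallocate the heap memory they own, so we don’t have to remember to.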

std::unique_ptr

A unique_ptr owns the heap object it wraps. It is usually created with std::make_unique, which allocates the object on the heap and hands it to the unique_ptr. When the unique_ptr is destroyed, it frees that same heap memory, automatically.
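
A minimal sketch of that lifecycle, assuming C++14 or later for std::make_unique. Widget and its alive counter are made up so that construction and destruction can be observed.

```cpp
#include <memory>

// Hypothetical type that counts how many instances are currently alive.
struct Widget {
    static int alive;
    int value;

    explicit Widget(int v) : value(v) { ++alive; }
    ~Widget() { --alive; }
};

int Widget::alive = 0;

int read_widget() {
    // Allocates a Widget on the heap and wraps it in a unique_ptr.
    std::unique_ptr<Widget> w = std::make_unique<Widget>(42);
    return w->value;  // used just like a raw pointer
}                     // w is destroyed here, which deletes the Widget
```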

Most C++ classes get a copy constructor by default. However, std::unique_ptr explicitly disables (deletes) its copy constructor. This makes sense, because only one object is allowed to own and control this memory on the heap.
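
The sketch below checks the disabled copy at compile time. It also shows std::move, which these notes have not covered: it is the standard way to hand ownership from one unique_ptr to another instead of copying. The function names are made up, and C++14 or later is assumed for std::make_unique.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Compile-time checks: unique_ptr cannot be copied, but it can be moved.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr cannot be copied");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr can be moved");

// Taking the unique_ptr by value means this function takes over ownership.
int consume(std::unique_ptr<int> p) {
    return *p;
}   // p is destroyed here, freeing the int

int transfer_demo() {
    std::unique_ptr<int> a = std::make_unique<int>(7);
    // std::unique_ptr<int> b = a;          // does not compile: copy is deleted
    std::unique_ptr<int> b = std::move(a);  // ownership moves from a to b

    // After the move, a is empty (it compares equal to nullptr).
    if (a != nullptr) {
        return -1;
    }
    return consume(std::move(b));
}
```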

Accessing the underlying pointer

It’s trivial to get a pointer to the object that the unique_ptr is managing by calling get(). This allows you to use the object in functions or APIs that may not expect the smart pointer wrapper.
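
For example, a sketch along these lines, where legacy_scale is an invented stand-in for an API that only understands raw pointers (C++14 or later assumed for std::make_unique):

```cpp
#include <memory>

// Hypothetical older API that works on raw pointers only.
void legacy_scale(float* value, float factor) {
    *value *= factor;
}

float scaled_with_get() {
    std::unique_ptr<float> p = std::make_unique<float>(2.0f);

    // get() lends out the raw pointer; ownership stays with p.
    legacy_scale(p.get(), 3.0f);

    return *p;
}   // p still owns the float and frees it here
```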

Note that C++ does not stop you from manually tampering with it. It is valid C++ code to call delete on this raw pointer, which will deallocate the heap memory. However, the unique_ptr does not know this happened, and when its lifetime ends, it will try to delete the memory again. Double deletion is undefined behavior and will probably crash the program.