Minimum Number of Virtual Pointer (vptr) Loads in C++ — vtable Explained
A table of pointers to virtual functions is known as a virtual table, and the pointer that points to that table is known as a virtual pointer.
Each object of a class with virtual functions holds a virtual pointer (vptr). This pointer points to the class's virtual table (vtbl). During initialization, the address of the virtual table is assigned to the virtual pointer. Let's delve into the origin…
A Simple Object Model
In the simple object model, the object is a table of slots. Each slot holds a pointer to a member rather than the member itself, so the object contains only references to where its members live. Every slot has the same size, which keeps the compiler's job simple, but each member access costs an extra indirection.
/**********************************************************
* Author :- Aditya Gaurav *
* Mail :- [email protected] *
* *
* Please support https://www.errbits.com *
* Some content is taken for educational purpose credit to *
* The C++ Object model. *
*********************************************************/
// A simple object model in initial draft
#include <ostream>

class Point {
public:
    Point(float xval);
    virtual ~Point();
    float x() const;
    static int PointCount();
protected:
    virtual std::ostream& print(std::ostream& os) const;
    float _x;
    static int _point_count;
};
Generated object Diagram
The table contains pointers to each member. This allows accessing each member using indexes, and the entries are initialized based on the class member declaration.
Table Object model
The model originates from the simple object model, but with a tweak: instead of a single table, two separate tables are used for data and function, and the object holds pointers to these two tables. This modification enhances data and function organization, leading to a more efficient system.
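To make the two-table idea concrete, here is a minimal hand-written sketch. It is not how any real compiler lays out objects, and every name in it (PointData, PointFunctions, TableModelPoint, read_x) is hypothetical.

```cpp
// Hypothetical emulation of the table object model: the object holds only
// two pointers, one to a data table and one to a function table.
struct PointData {
    float x;                                   // data member table
};

struct PointFunctions {
    float (*get_x)(const PointData&);          // member function table
};

float point_get_x(const PointData& data) { return data.x; }

// One function table shared by all "Point" objects.
const PointFunctions point_functions = { &point_get_x };

struct TableModelPoint {
    PointData* data;                           // pointer to the data table
    const PointFunctions* functions;           // pointer to the function table
};

// Every access pays two indirections: object -> table -> member.
float read_x(const TableModelPoint& p) {
    return p.functions->get_x(*p.data);
}
```

Calling read_x on a TableModelPoint built from a PointData and &point_functions returns the stored x. The extra hop through both tables is the cost the C++ object model avoids for non-static data members.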
The C++ Object Model
This model is directly derived from the simple object model where:
- Non-static data members are placed directly in each object.
- Static data members are placed outside the object.
- Static and non-static member functions are also placed outside the class object.
- Pointers to virtual functions are placed in a table (Virtual Table - 4a), and a pointer to that table (Virtual Pointer - 4b) is placed in each class object.
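The rules above can be checked with sizeof. The sketch below uses three hypothetical types (PlainPoint, StaticPoint, VirtualPoint). It assumes a mainstream compiler, where static members and non-virtual functions add nothing to the object and the first virtual function adds one hidden pointer.

```cpp
#include <cstddef>

// Hypothetical types for illustration only.
struct PlainPoint {
    float x;                              // non-static data: stored in the object
};

struct StaticPoint {
    float x;
    static int count;                     // static data: stored outside the object
    float get() const { return x; }       // non-virtual function: outside the object
};
int StaticPoint::count = 0;

struct VirtualPoint {
    virtual ~VirtualPoint() = default;    // first virtual function adds a vptr
    float x;
};

// Static members and non-virtual functions do not grow the object.
static_assert(sizeof(StaticPoint) == sizeof(PlainPoint),
              "static members live outside the object");

// The hidden vptr makes the polymorphic object larger.
static_assert(sizeof(VirtualPoint) > sizeof(PlainPoint),
              "vptr is stored inside the object");

// Bytes added by the vptr plus any alignment padding.
constexpr std::size_t vptr_overhead() {
    return sizeof(VirtualPoint) - sizeof(PlainPoint);
}
```

The exact value of vptr_overhead() depends on pointer size and padding, so treat it as implementation-specific.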
Minimizing vptr Loads — What the Compiler Does and What You Can Control
Every virtual function call requires the CPU to perform at least two memory loads:
- Load the vptr from the object (typically at offset 0 in the object layout)
- Load the function pointer from the vtable at the correct index
In a tight loop calling virtual functions on thousands of objects, this adds up. Minimizing the number of vptr loads is a key technique in performance-critical C++ — real-time systems, game engines, hardware simulators.
Why the Compiler Loads vptr More Than Once
The compiler is conservative by default. If it cannot prove that an object's vptr hasn't changed between two virtual calls, it must reload it. This happens when:
- The object is accessed through a pointer or reference — the compiler cannot rule out aliasing
- A function call between two virtual dispatches could theoretically modify the object
- The object's dynamic type is not known at compile time
void process(Base* obj) {
    obj->methodA();   // load vptr (1), load fn ptr, call
    doSomething();    // opaque call: the compiler must assume it could modify *obj
    obj->methodB();   // load vptr again (2): the compiler can't skip it
}
Technique 1 — The final Keyword
Marking a class or method final tells the compiler: no further overrides exist. This unlocks devirtualization — the compiler replaces the vptr lookup with a direct function call, eliminating the vptr load entirely.
class Derived final : public Base {
public:
    void methodA() override;   // compiler can devirtualize calls through Derived*
};

void process(Derived* obj) {
    obj->methodA();            // devirtualized: no vptr load at all
}
Mark as final any class that you know will never be subclassed. This costs nothing in the design, but it lets the compiler eliminate vptr loads at every call site where the concrete type is visible.
Technique 2 — Store Objects by Value, Not by Pointer
When an object is allocated on the stack or held in a container by value (not pointer), the compiler often knows its exact type at compile time and can devirtualize automatically — even without final.
// By pointer: the dynamic type is generally unknown, so each call loads the vptr
Base* ptr = new Derived();
ptr->methodA();

// By value: the compiler knows the exact type and may devirtualize
Derived obj;
obj.methodA();   // direct call, 0 vptr loads
Technique 3 — CRTP (Zero-Cost Static Polymorphism)
The Curiously Recurring Template Pattern (CRTP) achieves polymorphism entirely at compile time. There is no vtable, no vptr, and therefore zero vptr loads. The trade-off is that the type must be known at compile time — you lose runtime polymorphism.
template<typename Derived>
struct Base {
    void interface() {
        static_cast<Derived*>(this)->implementation();
    }
};

struct Concrete : Base<Concrete> {
    void implementation() { /* ... */ }
};

// No vtable. No vptr. No memory loads for dispatch.
Concrete c;
c.interface();
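As a slightly fuller sketch of the same pattern, the example below lets a generic function work with any CRTP type, and checks at compile time that no vtable is involved. The names AreaBase, SquareShape, RectShape and doubled_area are hypothetical.

```cpp
#include <type_traits>

// CRTP base: forwards to the derived class without any virtual function.
template <typename Derived>
struct AreaBase {
    int area() const {
        return static_cast<const Derived*>(this)->area_impl();
    }
};

struct SquareShape : AreaBase<SquareShape> {
    int side;
    explicit SquareShape(int s) : side(s) {}
    int area_impl() const { return side * side; }
};

struct RectShape : AreaBase<RectShape> {
    int w, h;
    RectShape(int w_, int h_) : w(w_), h(h_) {}
    int area_impl() const { return w * h; }
};

// Generic code is written against AreaBase<T>; T is resolved at compile time.
template <typename T>
int doubled_area(const AreaBase<T>& shape) {
    return 2 * shape.area();
}

// Neither type has a vtable or vptr.
static_assert(!std::is_polymorphic<SquareShape>::value, "no vtable expected");
static_assert(!std::is_polymorphic<RectShape>::value, "no vtable expected");
```

The trade-off shows in doubled_area: it must be a template, so SquareShape and RectShape cannot be stored together in one container of base pointers.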
Technique 4 — Hoist the vptr Load Out of Loops
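If you call the same virtual method on the same object in a loop, the compiler may not hoist the vptr load out of the loop, because the call itself is opaque and could in principle change the object. Caching a pointer to member such as &Base::tick does not help: a call through a pointer to a virtual member function still goes through the vtable every time. What you can do manually is resolve the concrete type once before the loop and then call through a final type, so the calls inside the loop can be devirtualized. The sketch below assumes Derived is a final class that overrides tick:
if (auto* d = dynamic_cast<Derived*>(obj)) {
    // Explicit hoist: one type check before the loop, direct calls inside it
    for (int i = 0; i < N; i++)
        d->tick(i);
} else {
    // Naive fallback: may reload the vptr every iteration
    for (int i = 0; i < N; i++)
        obj->tick(i);
}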
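A pointer to a virtual member function does not bind to one implementation: calling through it still selects the final overrider at call time. The sketch below checks that behaviour and contrasts it with resolving the concrete type once before the loop. Ticker and FastTicker are hypothetical names, and whether the inner calls are actually devirtualized depends on the compiler and optimisation level.

```cpp
// Hypothetical hierarchy for illustration.
struct Ticker {
    virtual ~Ticker() = default;
    virtual int tick(int i) { return i; }
};

struct FastTicker final : Ticker {
    int tick(int i) override { return 2 * i; }
};

// Calls through a pointer to member: each call is still dispatched
// virtually, so the override in the dynamic type runs.
int sum_via_member_pointer(Ticker* obj, int n) {
    int (Ticker::*fn)(int) = &Ticker::tick;
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += (obj->*fn)(i);
    return sum;
}

// Resolves the concrete type once; inside the first loop the static type
// is a final class, so the compiler is free to call FastTicker::tick directly.
int sum_resolved_once(Ticker* obj, int n) {
    int sum = 0;
    if (auto* fast = dynamic_cast<FastTicker*>(obj)) {
        for (int i = 0; i < n; ++i)
            sum += fast->tick(i);
    } else {
        for (int i = 0; i < n; ++i)
            sum += obj->tick(i);      // fallback: ordinary virtual call
    }
    return sum;
}
```

Both functions return the same sums for the same object. The difference is only in how much dispatch work the compiler can prove unnecessary.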
Quick Reference: Minimum vptr Loads by Technique
| Technique | vptr Loads | Trade-off |
|---|---|---|
| Regular virtual call (pointer/ref) | 1 per call | Full runtime polymorphism |
| final class + visible type | 0 (devirtualized) | Class cannot be subclassed |
| Object by value (stack/container) | 0 (devirtualized) | No heap, no pointer indirection |
| CRTP | 0 (no vtable) | Type must be known at compile time |
| Dispatch resolved once before the loop | 1 total (hoisted) | Slightly less readable |
| Profile-Guided Optimization (PGO) | 1 for the guard check, then a direct call (speculative devirt) | Requires profiling run |
These techniques, final included, help most in tight loops over many heterogeneous objects where cache pressure is high.
Bonus
- The handling of the virtual pointer is automatically done in the constructor, destructor, and copy assignment operator.
- If a class has a virtual base class, a separate virtual table is made for that base class, and in this case a separate pointer (bptr) is added to the object.
- The bptr is used to reach the base class virtual table, so one such pointer is added to the object for each virtual base class.
- bptr is often symbolized as __vptr__xxxx, where xxxx is the class name.
What Is the Minimum Number of vptr Loads?
The short answer: zero — under ideal compiler optimisations. The longer answer depends on what the compiler can prove about the dynamic type at each call site.
| Scenario | Minimum vptr loads | Why |
|---|---|---|
| Regular virtual call through pointer/ref, no optimisation | 1 per call | Compiler must load vptr + load fn ptr from vtable each time |
| Same object, multiple calls, no intervening calls | 1 total (hoisted by compiler) | Compiler may cache vptr in register if aliasing is ruled out |
| Class marked final, concrete type visible | 0 | Devirtualized at compile time: direct call, no table lookup |
| Object stored by value (not pointer) | 0 | Compiler knows exact type and typically devirtualizes at -O2 |
| CRTP (compile-time polymorphism) | 0 | No vtable exists — dispatch is resolved entirely at compile time |
| Profile-Guided Optimisation (PGO), speculative devirt | 1 (for the inline guard) | Compiler speculatively inlines the most-likely derived type; the guard still reads the vptr, but the indirect call disappears on the hot path |
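In short: the minimum is zero whenever the compiler can resolve the call at compile time, which you can encourage with final, value semantics, or CRTP.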
vtable Memory Layout — What the Compiler Actually Generates
For every class with at least one virtual function, the compiler generates a static vtable: an array of function pointers, usually placed in the read-only data section of the binary. Each object of that class carries a hidden vptr, typically at offset 0, pointing to its class's vtable.
/* Given this class hierarchy */
class Animal {
public:
virtual void speak() = 0;
virtual void move() {}
virtual ~Animal() {}
};
class Dog : public Animal {
public:
void speak() override {}
void move() override {}
};
/* Compiler generates roughly (slot order and destructor entries are ABI-specific):
Animal vtable (in .rodata):
┌─────────────────────────────────────┐
│ [0] &Animal::~Animal (destructor) │
│ [1] &Animal::speak (pure = 0) │
│ [2] &Animal::move │
└─────────────────────────────────────┘
Dog vtable (in .rodata):
┌─────────────────────────────────────┐
│ [0] &Dog::~Dog (overrides base) │
│ [1] &Dog::speak (overrides base) │
│ [2] &Dog::move (overrides base) │
└─────────────────────────────────────┘
Dog object layout in memory:
┌──────────────┐
│ vptr ───────┼──→ Dog vtable[0]
│ (8 bytes) │
├──────────────┤
│ data members│
└──────────────┘
*/
When you call animal_ptr->speak(), the CPU executes:
- Load vptr from *animal_ptr at offset 0 (one memory read)
- Load the function pointer from vptr[1], speak's vtable index in the layout above (second memory read)
- Call the function pointer (indirect branch)
Total: 2 memory loads + 1 indirect branch per virtual call, in the unoptimised case. Under -O2 with a visible concrete type, both loads collapse to a direct call.
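The two loads and the indirect call can be spelled out by hand. The sketch below emulates a vtable with plain structs and function pointers. It is an illustration of the mechanism, not what a compiler literally emits, and all names (ManualVTable, ManualAnimal, dog_vtable, call_speak) are hypothetical.

```cpp
struct ManualAnimal;

// Hand-written "vtable": one slot per virtual function.
struct ManualVTable {
    int (*speak)(const ManualAnimal*);
    int (*move)(const ManualAnimal*);
};

struct ManualAnimal {
    const ManualVTable* vptr;   // plays the role of the hidden vptr
    int legs;                   // ordinary data member
};

int dog_speak(const ManualAnimal*) { return 1; }
int dog_move(const ManualAnimal* self) { return self->legs; }

// One static table per "class", shared by every object of that class.
const ManualVTable dog_vtable = { &dog_speak, &dog_move };

ManualAnimal make_dog() {
    return ManualAnimal{ &dog_vtable, 4 };
}

int call_speak(const ManualAnimal* animal) {
    const ManualVTable* table = animal->vptr;           // load 1: vptr from the object
    int (*fn)(const ManualAnimal*) = table->speak;      // load 2: slot from the table
    return fn(animal);                                  // indirect call
}
```

Devirtualization corresponds to the compiler replacing the body of call_speak with a direct call to dog_speak once it can prove which table the object points to.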
Counting vptr Loads in Code — Interview Examples
A common interview question gives you a code snippet and asks how many vptr loads occur. Here is how to count methodically:
Example 1 — baseline (no optimisation assumptions)
void test(Base* p) {
p->methodA(); // vptr load #1 + fn ptr load
p->methodB(); // vptr load #2 + fn ptr load (compiler cannot cache)
p->methodC(); // vptr load #3
}
/* Answer: 3 vptr loads (worst case, no inter-procedural analysis) */
Example 2 — with an intervening non-inlined call
void test(Base* p) {
p->methodA(); // vptr load #1
external_func(p); // could change *p's vptr (e.g., placement new)
p->methodB(); // vptr load #2 — compiler must reload
}
/* Answer: minimum 2 vptr loads */
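Example 2 mentions placement new as a way the dynamic type behind an address can change. A minimal sketch of that mechanism follows, using hypothetical types Shape, Circle and Square. To stay within defined behaviour, it always calls through the pointer returned by the most recent placement new rather than a stale one.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical hierarchy for illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual const char* name() const = 0;
};
struct Circle final : Shape {
    const char* name() const override { return "circle"; }
};
struct Square final : Shape {
    const char* name() const override { return "square"; }
};

constexpr std::size_t kStorageSize =
    sizeof(Circle) > sizeof(Square) ? sizeof(Circle) : sizeof(Square);

// Builds two objects of different dynamic type at the same address,
// one after the other, and records which override each call reaches.
std::string names_at_same_address() {
    alignas(std::max_align_t) unsigned char storage[kStorageSize];

    Shape* first = new (storage) Circle;    // vptr in storage -> Circle's vtable
    std::string log = first->name();
    first->~Shape();                        // end the Circle's lifetime

    Shape* second = new (storage) Square;   // same bytes, vptr -> Square's vtable
    log += ",";
    log += second->name();
    second->~Shape();

    return log;
}
```

Because the bytes holding the vptr are rewritten by the second construction, a compiler that cannot see into external_func has to assume something like this may have happened and reload the vptr.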
Example 3 — ideal optimisation, final class
class Derived final : public Base {
public:
void methodA() override;
void methodB() override;
};
void test(Derived* p) { // concrete type known
p->methodA(); // devirtualized — 0 vptr loads
p->methodB(); // devirtualized — 0 vptr loads
}
/* Answer under ideal optimisation: 0 vptr loads */
Example 4 — C++17, under ideal compiler optimisations
struct Base { virtual int compute() = 0; };
struct Impl final : Base { int compute() override { return 42; } };
int run() {
Impl obj; // stack object, concrete type known
return obj.compute(); // devirtualized → direct call → inlined
}
/* Under -O2: 0 vptr loads. The entire function may reduce to: return 42; */
This is the canonical answer to the interview question "under ideal compiler optimisations, when test() is executed, what is the minimum number of virtual pointer (vptr) loads?" — the answer is zero, because devirtualization eliminates the vtable dispatch entirely.
vptr During Construction and Destruction
The vptr is not constant throughout an object's lifetime. The compiler updates it at each level of the constructor chain — a subtlety that catches many experienced engineers.
#include <iostream>

class A {
public:
A() {
/* vptr points to A's vtable here */
call_virtual(); // calls A::call_virtual, NOT B's version
}
virtual void call_virtual() { std::cout << "A\n"; }
};
class B : public A {
public:
B() {
/* vptr now updated to B's vtable */
call_virtual(); // calls B::call_virtual
}
void call_virtual() override { std::cout << "B\n"; }
};
B obj;
/* Output:
A ← A::A() runs first, vptr = A's vtable
B ← B::B() runs next, vptr = B's vtable
*/
Key rules:
- During A::A(), the object's dynamic type is A, even if the final object is B
- The vptr is updated before the constructor body runs at each level
- During A::~A(), the vptr is restored to A's vtable; destruction mirrors construction in reverse
- Calling virtual functions from constructors/destructors is legal but almost always a design mistake
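The destructor half of these rules can be observed the same way as the constructor half. The sketch below records which override runs at each step into a string. Stage, FinalStage, lifetime_log and trace_lifetime are hypothetical names, and the virtual calls from constructors and destructors are there only to expose the behaviour, not as a recommended design.

```cpp
#include <string>

// Hypothetical log used only to record which override ran.
std::string& lifetime_log() {
    static std::string log;
    return log;
}

struct Stage {
    Stage() { lifetime_log() += "ctor:" + who() + ";"; }
    virtual ~Stage() { lifetime_log() += "dtor:" + who() + ";"; }
    virtual std::string who() const { return "Stage"; }
};

struct FinalStage : Stage {
    FinalStage() { lifetime_log() += "ctor:" + who() + ";"; }
    ~FinalStage() override { lifetime_log() += "dtor:" + who() + ";"; }
    std::string who() const override { return "FinalStage"; }
};

// Creates and destroys one FinalStage and returns the recorded trace.
std::string trace_lifetime() {
    lifetime_log().clear();
    {
        FinalStage obj;   // constructed, then destroyed at the closing brace
    }
    return lifetime_log();
}
```

The trace reads Stage, FinalStage on the way up and FinalStage, Stage on the way down, which matches the vptr being set at each constructor level and reset at each destructor level.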