C++ Performance Optimization: Avoiding Common Pitfalls and Best Practices Guide

happyer - Mar 18 - Dev Community

1. Preface

In modern C++ programming practices, performance optimization is a crucial area. As the C++ language continues to evolve, it provides developers with an increasing array of tools and features to better control program performance. However, these powerful features also bring additional complexity, making performance optimization a task that requires careful consideration. While pursuing code efficiency, developers must be vigilant against traps that may lead to performance degradation.

This article will delve into some common issues that C++ engineers may encounter while performing performance optimizations and provide corresponding solutions. We will start with the problem of abusing std::shared_ptr and discuss how to use smart pointers correctly to reduce performance overhead. Next, we will discuss the performance costs of type-erasure tools std::function and std::any, and how to use them only when necessary. The article will also cover improper use of std::optional and std::string_view, pitfalls of std::async, and misuse of std::move, among other issues.

Additionally, we will discuss the impact of hidden copies and destruction on performance, the performance overhead of virtual functions, unnecessary copying caused by structured bindings, and the importance of tail recursion optimization. New features introduced in C++20, such as concepts and constraints, coroutines, std::span, and modules, will also be examined to ensure they are used appropriately and do not become a burden on performance.

Through this article, we hope to help C++ developers better understand and address the challenges in performance optimization, thereby writing code that is both efficient and robust.

2. Misusing std::shared_ptr

Problem Description:
std::shared_ptr provides a convenient reference counting mechanism, but it is not free: creating one allocates a control block, and every copy and destruction updates the reference count atomically, which is costly on hot paths and when several threads share the same pointer.

Solution:

  • Prefer std::unique_ptr whenever possible, and use std::shared_ptr only when shared ownership is truly needed.
  • Use std::make_shared to construct std::shared_ptr instances; it allocates the object and its control block in a single allocation instead of two.

Code Example:

// Using std::unique_ptr
std::unique_ptr<MyClass> ptr1 = std::make_unique<MyClass>();

// Using std::shared_ptr
std::shared_ptr<MyClass> ptr2 = std::make_shared<MyClass>();
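
The cost of copying a std::shared_ptr is easiest to see in function signatures. The following is a minimal sketch, assuming a hypothetical Widget type and helper names that are not part of the original example: a by-value parameter changes the reference count, while a const reference does not.

```cpp
#include <memory>

// Hypothetical payload type used only for illustration.
struct Widget {
    int value = 0;
};

// Borrowing: no ownership change, so the reference count is not touched.
long countWhenBorrowed(const std::shared_ptr<Widget>& w) {
    return w.use_count();
}

// Sharing: the by-value parameter is a copy, so the count is atomically
// incremented on entry and decremented again on exit.
long countWhenCopied(std::shared_ptr<Widget> w) {
    return w.use_count();
}

// If the callee only needs the object, it can take a plain reference
// and stay independent of how the caller manages ownership.
int readValue(const Widget& w) {
    return w.value;
}
```

With a single owner, countWhenBorrowed reports a count of 1 and countWhenCopied reports 2 for the duration of the call. Taking the pointer by value is still the right choice when the callee really has to keep a share of the ownership.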

3. Type Erasure: std::function and std::any

Problem Description:
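std::function and std::any provide flexible type erasure, but at a cost: calls through std::function are indirect and usually cannot be inlined, typical implementations heap-allocate callables or objects that do not fit their small internal buffer, and std::any performs a type check on every std::any_cast.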

Solution:

  • Use std::function and std::any only when you need to store functions or objects of uncertain types.
  • Consider using templates and static polymorphism as an alternative to std::function.

Code Example:

// Using std::function
std::function<void(int)> func = [](int value) { /* ... */ };

// Using templates and static polymorphism
template <typename Callable>
void callFunction(Callable&& func, int value) {
    func(value);
}
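
To make the trade-off concrete, here is a small sketch with two hypothetical helpers, applyTwice and runPipeline. The first uses a template parameter because the callable type is known at the call site; the second uses std::function because a single container has to hold callables of different types, which is the case type erasure is meant for.

```cpp
#include <functional>
#include <vector>

// Static polymorphism: the callable's type is a template parameter,
// so the compiler sees the body and can inline the calls.
template <typename Callable>
int applyTwice(Callable&& f, int x) {
    return f(f(x));
}

// Type erasure is justified here: every lambda has its own unique type,
// so a vector of stages needs one common wrapper type.
int runPipeline(const std::vector<std::function<int(int)>>& stages, int x) {
    for (const auto& stage : stages) {
        x = stage(x);
    }
    return x;
}
```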

4. Improper Use of std::optional

Problem Description:
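std::optional stores its value inline next to an "engaged" flag, so copying or moving an optional copies or moves the contained value. When the contained type is expensive to construct, copy, or move, wrapping it in std::optional adds that cost wherever the optional is passed or returned by value.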

Solution:

  • Use std::optional only when you need to represent an optional value.
  • When "no value" and "empty" mean the same thing to the caller, consider returning an empty container or a special value instead.

Code Example:

// Using std::optional
std::optional<std::vector<int>> getOptionalVector(bool condition) {
    if (condition) {
        return std::vector<int>{1, 2, 3};
    }
    return std::nullopt;
}
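
The two suggestions above can be sketched as follows. The function names (collectEven, makeLabel, labelLength) are illustrative assumptions, not part of the original article. The sketch returns a plain container when emptiness already says "nothing found", constructs the optional's value in place when absence is a distinct state, and reads the value through a reference instead of copying it out.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// "Nothing found" and "empty result" mean the same thing here,
// so an empty vector is enough and no optional is needed.
std::vector<int> collectEven(const std::vector<int>& in) {
    std::vector<int> out;
    for (int v : in) {
        if (v % 2 == 0) out.push_back(v);
    }
    return out;
}

// Absence is a distinct state here. std::in_place forwards the arguments
// to std::string's constructor, so no temporary string is built first.
std::optional<std::string> makeLabel(bool enabled) {
    if (!enabled) return std::nullopt;
    return std::optional<std::string>(std::in_place, 3, 'x');
}

// Access through a reference; the contained string is never copied.
std::size_t labelLength(const std::optional<std::string>& label) {
    return label ? label->size() : 0;
}
```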

5. std::string_view Lifetime Issues

Problem Description:
std::string_view does not own the string it refers to; if the original string is released or modified, std::string_view may refer to invalid memory.

Solution:

  • Ensure the lifetime of std::string_view does not exceed that of the string it references.
  • Never build a std::string_view from a local or temporary std::string that dies before the view is used; when a function creates the text itself, return std::string instead.

Code Example:

std::string_view getStringView() {
    std::string str = "Hello, World!";
    return std::string_view(str); // Dangerous: str is destroyed after the function returns
}
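
For contrast with the dangerous function above, here is a sketch of two safe patterns, using the hypothetical names firstWord and makeGreeting. A view that is derived from a view parameter stays valid as long as the caller's buffer does, and a function that creates new text returns an owning std::string.

```cpp
#include <string>
#include <string_view>

// Safe: the result points into the caller's buffer, which the caller owns
// and keeps alive. If no space is found, find() returns npos and substr()
// yields the whole input.
std::string_view firstWord(std::string_view text) {
    return text.substr(0, text.find(' '));
}

// Safe: the function creates the text, so it returns an owning string.
std::string makeGreeting() {
    return "Hello, World!";
}
```

The caller of firstWord is still responsible for keeping the original string alive and unmodified while the returned view is in use.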

6. Pitfalls of std::async

Problem Description:
std::async can silently run synchronously. The std::future it returns blocks in its destructor until the task has finished, so if the future is not stored, the temporary is destroyed at the end of the statement and the caller waits right there.

Solution:

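  • Store the std::future returned by std::async in a variable, and call get() or wait() only when the result is needed.
  • Pass std::launch::async explicitly; the default launch policy allows the implementation to defer the task until get() or wait() is called.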

Code Example:

// Correct use of std::async
auto future1 = std::async(std::launch::async, []() { /* ... */ });
auto future2 = std::async(std::launch::async, []() { /* ... */ });
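
A complete version of this pattern might look like the sketch below. The function parallelSum is a hypothetical example, and it assumes the program is built with thread support. Both futures are stored in named variables, so the second task is launched before anything waits on the first.

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Sums the two halves of the input concurrently.
long parallelSum(const std::vector<int>& data) {
    const auto mid = data.begin() + static_cast<std::ptrdiff_t>(data.size() / 2);

    // Named futures: no temporary is destroyed here, so nothing blocks
    // between the two launches.
    auto left = std::async(std::launch::async, [&] {
        return std::accumulate(data.begin(), mid, 0L);
    });
    auto right = std::async(std::launch::async, [&] {
        return std::accumulate(mid, data.end(), 0L);
    });

    // get() is the only place where this function waits for the tasks.
    return left.get() + right.get();
}
```

Had each std::async call been written as a statement of its own without storing the result, the first task would have finished before the second one even started.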

7. Misusing std::move

Problem Description:
Misusing std::move is not only unhelpful but can also degrade performance. In particular, when Named Return Value Optimization (NRVO) could apply, writing return std::move(obj) disables copy elision and forces a move construction that would otherwise be skipped.

Solution:

  • Use std::move only when you need to transfer ownership.
  • Avoid using std::move when returning local objects.

Code Example:

// Avoid misusing std::move
MyClass createObject() {
    MyClass obj;
    // ... modify obj ...
    return obj; // Do not use std::move
}
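
The effect can be observed by counting move constructions. The sketch below uses a hypothetical Tracker type with a static counter (an assumption for illustration only). Compilers may warn about the second function, which is exactly the pattern to avoid.

```cpp
#include <utility>

// Hypothetical type that counts how often it is move-constructed.
struct Tracker {
    static inline int moves = 0;
    Tracker() = default;
    Tracker(const Tracker&) = default;
    Tracker(Tracker&&) noexcept { ++moves; }
};

Tracker makePlain() {
    Tracker t;
    return t;             // NRVO candidate: compilers typically elide the move
}

Tracker makeWithMove() {
    Tracker t;
    return std::move(t);  // elision is no longer allowed: always one move
}
```

After calling makeWithMove, Tracker::moves has grown by exactly one. After calling makePlain it has grown by zero when NRVO is applied, and by at most one when it is not.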

8. Hidden Copies

Problem Description:
Object copying in C++ can lead to performance issues, especially in the following scenarios:

  • Constructors that assign members in the body instead of using member initializer lists, or that copy their parameters into members instead of using std::move.
  • Range-based for loops not using references.
  • Lambda expressions capturing by value without using std::move or capturing by reference.
  • Implicit type conversions leading to unnecessary copies.

Solution:

  • Use member initializer lists and std::move to avoid unnecessary copies.
  • Use references in range-based for loops.
  • Use reference capture (only when the lambda does not outlive the captured object) or an init-capture with std::move in lambda expressions.

Code Example:

// Using a member initializer list and std::move
class MyClass {
    std::vector<int> data;
public:
    MyClass(std::vector<int>&& d) : data(std::move(d)) {}
};

// Avoiding copies by using references
std::vector<std::string> vec;
for (const std::string& s : vec) {
    // ...
}

// Using reference capture or std::move
std::string str = "example";
auto lambda = [&str]() { /* ... */ };
auto lambda_move = [s = std::move(str)]() { /* ... */ };
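
The last item in the problem list, implicit conversions, is easy to miss even when a reference is used. A well-known case is iterating over a std::map with the wrong pair type. The sketch below is illustrative only (Table and boundInPlace are hypothetical names): it counts how many loop variables refer to the element stored in the map rather than to a converted temporary.

```cpp
#include <map>
#include <string>
#include <utility>

using Table = std::map<std::string, int>;

// Counts the iterations in which the loop variable is the map's own element.
template <typename Entry>
int boundInPlace(const Table& table) {
    int inPlace = 0;
    auto it = table.begin();
    for (const Entry& entry : table) {
        // Compare the address of the loop variable with the stored element.
        if (static_cast<const void*>(&entry) == static_cast<const void*>(&*it)) {
            ++inPlace;
        }
        ++it;
    }
    return inPlace;
}

// Table::value_type is std::pair<const std::string, int>, so
// boundInPlace<Table::value_type> binds directly to every element.
// boundInPlace<std::pair<std::string, int>> drops the const on the key:
// each element is converted into a temporary pair (copying the string),
// and the reference binds to that temporary instead.
```

Writing const auto& for the loop variable sidesteps the problem because the deduced type always matches the element type.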

9. Hidden Destruction

Problem Description:
Destruction of complex types can be very time-consuming. If an object's destructor takes a long time to execute, it can inadvertently add to the function's execution time.

Solution:

  • Avoid creating and destroying complex objects on the hot path.
  • Use object pools to manage the lifecycle of complex objects.

Code Example:

// Assuming ComplexType is a type with significant destruction overhead
class ComplexType {
    // ...
public:
    ~ComplexType() { /* ... */ }
};

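// Using an object pool: released objects stay alive and are reused
// (requires <memory> and <vector>)
class ComplexTypePool {
    std::vector<std::unique_ptr<ComplexType>> pool;
public:
    std::unique_ptr<ComplexType> acquire() {
        if (pool.empty()) {
            return std::make_unique<ComplexType>();
        }
        // Take ownership before removing the slot; returning a reference
        // to an element that is then popped would dangle
        auto obj = std::move(pool.back());
        pool.pop_back();
        return obj;
    }
    void release(std::unique_ptr<ComplexType> obj) {
        pool.push_back(std::move(obj));
    }
};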
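
A pool is not always necessary. Often it is enough to move the object out of the loop so that it is constructed and destroyed once instead of once per iteration. A minimal sketch, assuming a hypothetical totalLength function that needs a scratch buffer for every line:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the total size of all lines once a newline is appended to each.
std::size_t totalLength(const std::vector<std::string>& lines) {
    std::string scratch;             // constructed once, destroyed once
    std::size_t total = 0;
    for (const auto& line : lines) {
        scratch.clear();             // drops the contents, keeps the capacity
        scratch.append(line).append("\n");
        total += scratch.size();
    }
    return total;
}
```

Declaring scratch inside the loop would give the same result, but it would run the constructor and destructor, and possibly an allocation and deallocation, on every iteration.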

10. Virtual Functions

Problem Description:
Virtual functions provide the ability for runtime polymorphism, but they come with additional performance overhead:

  • Extra indirection: each call loads the function address from the object's virtual function table.
  • Pipeline stalls: virtual calls are indirect branches, and a mispredicted target forces the CPU to discard speculative work.
  • Blocked inlining: unless the compiler can prove the dynamic type (devirtualization), a virtual call cannot be inlined.

Solution:

  • When polymorphism is not a necessity, consider using non-virtual member functions.
  • Use templates and static polymorphism (such as CRTP) to replace runtime polymorphism.

Code Example:

// Traditional polymorphism using virtual functions
class Base {
public:
    virtual void doWork() { /* ... */ }
    virtual ~Base() {}
};

class Derived : public Base {
public:
    void doWork() override { /* ... */ }
};

// Static polymorphism using CRTP
template <typename Derived>
class BaseCRTP {
public:
    void doWork() {
        static_cast<Derived*>(this)->doWorkImpl();
    }
};

class DerivedCRTP : public BaseCRTP<DerivedCRTP> {
public:
    void doWorkImpl() { /* ... */ }
};
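
To show how CRTP code is called, here is a self-contained sketch with hypothetical Shape, Square, and Rect classes. Generic code takes the base as a template, so every call is resolved at compile time and remains a candidate for inlining.

```cpp
// CRTP base: forwards to the derived class without a virtual call.
template <typename Derived>
class Shape {
public:
    double area() const {
        return static_cast<const Derived*>(this)->areaImpl();
    }
};

class Square : public Shape<Square> {
    double side_;
public:
    explicit Square(double side) : side_(side) {}
    double areaImpl() const { return side_ * side_; }
};

class Rect : public Shape<Rect> {
    double w_, h_;
public:
    Rect(double w, double h) : w_(w), h_(h) {}
    double areaImpl() const { return w_ * h_; }
};

// Generic code is itself a template; there is no common runtime base type.
template <typename T>
double doubledArea(const Shape<T>& shape) {
    return 2.0 * shape.area();
}
```

The price is that Square and Rect no longer share a base class, so they cannot be stored together in one container of base pointers. When that is required, virtual functions (or std::variant) remain the appropriate tool.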

11. Unnecessary Copies with Structured Bindings

Problem Description:
C++17 introduced structured bindings, which allow you to conveniently unpack tuples or structures. However, if not careful, structured bindings can lead to unnecessary object copying.

Solution:

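  • When binding to an existing object, use a reference to avoid copying it, for example, auto& [x, y] = my_pair;. When binding to a function's return value, use auto [x, y] or const auto& [x, y]; a non-const auto& cannot bind to a temporary.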

Code Example:

std::pair<int, std::vector<int>> getPair() {
    return {1, {1, 2, 3}};
}

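// Binding directly to the returned temporary: no copy is made
auto [id, vec] = getPair();

// Binding by value to an existing object copies it, including the vector
std::pair<int, std::vector<int>> existing = getPair();
auto [id_copy, vec_copy] = existing;

// Binding by reference avoids the copy
// (auto& cannot bind to the temporary returned by getPair())
auto& [id_ref, vec_ref] = existing;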
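
The difference between the two binding forms becomes visible once the bound names are modified. In this sketch (appendViaCopy and appendViaReference are hypothetical helpers), the by-value binding works on a private copy of the pair, while the by-reference binding changes the original.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Record = std::pair<int, std::vector<int>>;

// By-value binding: the whole pair, including the vector, is copied.
std::size_t appendViaCopy(Record& record) {
    auto [id, values] = record;
    values.push_back(id);          // modifies the copy only
    return record.second.size();   // the original is unchanged
}

// By-reference binding: the names are aliases for the original members.
std::size_t appendViaReference(Record& record) {
    auto& [id, values] = record;
    values.push_back(id);          // modifies the original
    return record.second.size();
}
```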

12. Tail Recursion Optimization

Problem Description:
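Tail recursion optimization (tail call elimination) lets the compiler turn a recursive call in tail position into a jump, so the recursion runs in constant stack space. The C++ standard does not guarantee it, and hidden operations that must run after the call (such as destructors of local objects) prevent it.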

Solution:

  • Use trivially destructible objects, such as std::string_view, to help the compiler implement tail recursion optimization.

Code Example:

// Tail-recursive binary-to-decimal conversion; std::string_view has no
// destructor that would have to run after the recursive call
unsigned btd_tail(std::string_view input, unsigned v) {
    if (input.empty()) {
        return v;
    } else {
        v = v * 2 + (input.front() - '0');
        return btd_tail(input.substr(1), v);
    }
}
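
For comparison, the sketch below places a std::string variant next to a std::string_view variant of the same conversion. The names btd_string and btd_view are assumptions made for this illustration. Both compute the same result, but only the second leaves the recursive call in true tail position.

```cpp
#include <string>
#include <string_view>

// Not a tail call: `rest` owns memory, and its destructor has to run
// after the recursive call returns.
unsigned btd_string(const std::string& input, unsigned v) {
    if (input.empty()) return v;
    v = v * 2 + static_cast<unsigned>(input.front() - '0');
    std::string rest = input.substr(1);
    return btd_string(rest, v);
}

// Tail call: std::string_view is trivially destructible, so nothing
// remains to be done after the recursive call.
unsigned btd_view(std::string_view input, unsigned v) {
    if (input.empty()) return v;
    v = v * 2 + static_cast<unsigned>(input.front() - '0');
    return btd_view(input.substr(1), v);
}
```

Whether the compiler actually eliminates the call depends on the compiler and optimization level, so inspecting the generated assembly is the only reliable check.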

13. Misuse of Concepts and Constraints

Problem Description:
C++20 introduced concepts and constraints, which provide a more powerful way to specify template parameter requirements. However, overly complex concepts and constraints can increase compilation time and make error messages difficult to understand.

Solution:

  • Use concepts and constraints only when you need to clearly express interface requirements.
  • Avoid creating overly complex concepts; keep them simple and clear.

Code Example:

template<typename T>
concept Addable = requires(T a, T b) {
    { a + b } -> std::convertible_to<T>;
};

// Using concepts
template<Addable T>
T add(T a, T b) {
    return a + b;
}

14. Improper Use of Coroutines

Problem Description:
C++20 introduced coroutines, which are a powerful tool for asynchronous programming. However, improper use of coroutines can lead to performance issues, such as excessive coroutine switching causing performance degradation.

Solution:

  • Use coroutines in I/O-intensive or asynchronous operation scenarios.
  • Avoid frequent starting and suspending of coroutines in performance-critical code paths.

Code Example:

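// Task<int> is a placeholder for a coroutine return type that provides a promise_type
// (user-defined or from a coroutine library); standard std::future does not provide one
Task<int> asyncComputation() {
    co_return 42; // Simplifying asynchronous operations with coroutines
}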

15. Improper Use of std::span

Problem Description:
std::span is a lightweight container view introduced in C++20, providing access to a contiguous region of an array or container. However, if the original data is released or modified, std::span may refer to invalid memory.

Solution:

  • Ensure the lifetime of std::span does not exceed that of the data it references.
  • Be particularly mindful of ownership and lifetime issues when using std::span.

Code Example:

std::span<int> getSpan(std::vector<int>& vec) {
    return std::span<int>(vec); // Valid only while vec is alive and is not reallocated (e.g. by push_back)
}

16. Improper Use of Modules

Problem Description:
C++20 introduced modules, intended to replace the traditional header and source file model, improving compilation efficiency. However, improper module partitioning can lead to increased compilation time, especially when there are complex dependencies between modules.

Solution:

  • Reasonably partition modules to avoid over-segmentation.
  • Manage dependencies between modules well to reduce unnecessary imports.

Code Example:

// my_module.cppm
export module my_module;

export int add(int a, int b) {
    return a + b;
}

// Using modules
import my_module;

int result = add(1, 2);
