Picking a C++ style

Andreas Hohmann April 10, 2024 #c++ #style #clang-format

After several years in JVM and JavaScript land, I recently started working with C++ again. To catch up on the latest (and exciting) developments such as concepts and coroutines, I've been using C++ for some of my toy projects as well. This naturally raises the question of which style to follow. When coding for an organization, I just accept whatever convention the organization has chosen. While I may prefer one or the other variation, I appreciate that I'm not wasting any time thinking about these differences. Not having these constraints for my personal projects is therefore both a blessing and a curse.

Some languages (most notably Go) avoid the style discussions altogether by prescribing a style and enforcing at least some portion of it with formatting and linting tools. C++ wouldn't be C++ without plenty of choices, and coding style is not an exception. C++ styles differ significantly, from the plain formatting to naming conventions to the permitted features and overall programming paradigm. Is there a line break before every or some opening brace? Should function names be CamelCase, camelCase, or snake_case? Are exceptions allowed? Should one prefer value semantics or stick to good old pointers? How much template metaprogramming is acceptable?

Formatting

Clang-format solves half of the formatting problem by providing a well-accepted formatting tool with enough options to support the most common styles. There are even online tools such as the clang-format configurator that show the effect of the many configuration parameters.

Clicking through the base styles, I quickly decided that I like Chromium's style the best. The decisive factor was the formatting of multiple function parameters. C++ parameter lists can quickly get long, and putting each parameter on its own line when the list overflows looks clearer to me than a mix of single- and multi-parameter lines.

Allowing multi-parameter lines in multiline parameter lists (default style):

std::vector<uint32_t>
return_vector(uint32_t *some_parameter1, double *long_name_for_parameter2,
              const float &long_name_for_parameter3,
              const std::map<std::string, int32_t> &long_name_for_parameter4) {
  return {};
}

Allowing only single-parameter lines in multiline parameter lists (Chromium style):

std::vector<uint32_t> return_vector(
    uint32_t* some_parameter1,
    double* long_name_for_parameter2,
    const float& long_name_for_parameter3,
    const std::map<std::string, int32_t>& long_name_for_parameter4) {
  return {};
}

I changed only one setting, namely "BreakBeforeBraces = Stroustrup". This follows Stroustrup's convention and places the opening curly brace of a function on a new line. I wouldn't do this in Java or TypeScript, but it results in a clearer function layout for C++ with its often complicated parameter types:

std::vector<uint32_t> return_vector(
    uint32_t* some_parameter1,
    double* long_name_for_parameter2,
    const float& long_name_for_parameter3,
    const std::map<std::string, int32_t>& long_name_for_parameter4)
{
  return {};
}
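
As a rough sketch, this setup corresponds to a .clang-format file along the following lines. Both keys are standard clang-format options; placing the file in the project root is an assumption about the project layout.

```yaml
# .clang-format (assumed to live in the project root)
BasedOnStyle: Chromium
BreakBeforeBraces: Stroustrup
```

Running `clang-format -i` on a source file then rewrites it in place according to this configuration.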

That's it. Long live clang-format!

Naming

While writing a lot of Java, TypeScript, Scala, and Kotlin code trained my brain to read and write camelCase identifiers, I still prefer snake_case for function and variable names in C++. Not only does this follow Stroustrup's style (and the language's inventor should have some say when it comes to style), it's also in line with most C styles, other systems programming languages (Rust, Zig), and even Python's official style. So, CamelCase structs and classes and snake_case functions and variables (and members) it is!

I'm not a fan of encoding type or visibility information in names (such as the infamous Hungarian notation), but I like the trailing underscore for the names of class (instance) fields. It's just one character, and it makes it immediately obvious that we are dealing with object state (which requires more attention than local state).

This field and method naming lines up nicely for C++ getter and setter methods (if they are needed):

class PersonName {
public:
  const std::string& first_name() const { return first_name_; }
  PersonName& first_name(std::string first_name) {
    first_name_ = std::move(first_name);
    return *this;
  }
  ...
  std::string full_name() const {
    return std::format("{} {}", first_name_, last_name_);
  }
private:
  std::string first_name_;
  std::string last_name_;
};

I like the C++ setter convention of returning the mutable instance reference *this because the resulting fluent API is a decent substitute for named constructor arguments (if one cannot use aggregate initializers).
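
To illustrate the chaining, here is a minimal, self-contained sketch based on the PersonName class. The last_name setter is assumed to mirror first_name, make_example_name is a hypothetical helper, and plain string concatenation replaces std::format so that the sketch also compiles as C++17.

```cpp
#include <string>
#include <utility>

// Compact variant of PersonName; the last_name accessors are assumed
// to mirror the first_name ones.
class PersonName {
public:
  const std::string& first_name() const { return first_name_; }
  PersonName& first_name(std::string first_name) {
    first_name_ = std::move(first_name);
    return *this;
  }

  const std::string& last_name() const { return last_name_; }
  PersonName& last_name(std::string last_name) {
    last_name_ = std::move(last_name);
    return *this;
  }

  // Concatenation instead of std::format keeps the sketch C++17-friendly.
  std::string full_name() const { return first_name_ + " " + last_name_; }

private:
  std::string first_name_;
  std::string last_name_;
};

// Hypothetical helper: each setter returns *this, so the calls chain and
// read almost like named constructor arguments.
inline PersonName make_example_name()
{
  return PersonName{}.first_name("Ada").last_name("Lovelace");
}
```

The chain works on a temporary because every setter hands back a reference to the same object, which is then copied into the return value.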

Features

I don't miss exceptions in Rust, and C++23's std::expected or Abseil's absl::StatusOr are feasible options for return types incorporating failures. I therefore don't use exceptions in my own code, wrap the few third-party APIs that I'm using in exception-less functions, and treat exceptions in the standard library as the fatal errors that they are. I may change my mind, however, when an exception-based library turns out to be too useful to ignore.
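
A minimal sketch of such a wrapper, using std::stoi as the throwing function. ParseError, ParseResult, and parse_int are hypothetical names, and std::variant stands in for std::expected so that the sketch compiles as C++17; with C++23, the return type could be std::expected<int, ParseError> instead.

```cpp
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical error type for this sketch.
enum class ParseError { not_a_number, out_of_range };

// std::variant is a stand-in for std::expected<int, ParseError> (C++23).
using ParseResult = std::variant<int, ParseError>;

// Wraps the throwing std::stoi in a function that reports failure by value.
// Any other exception would hit the noexcept boundary and terminate,
// which matches treating such errors as fatal.
inline ParseResult parse_int(const std::string& text) noexcept
{
  try {
    return std::stoi(text);
  }
  catch (const std::invalid_argument&) {
    return ParseError::not_a_number;
  }
  catch (const std::out_of_range&) {
    return ParseError::out_of_range;
  }
}
```

A caller can then branch on std::holds_alternative<int>(result) or std::get_if<int>(&result) instead of writing a try block.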

Regarding all the other features, I'm glad that I restarted my C++ programming after C++20. Having spent a lot of time coding in pre-C++11, I appreciate how all the features are slowly but surely coming together to provide a better overall C++ programming experience. It's hard to grasp at this point how one could ever write templates without concepts, parameter packs, and perfect forwarding.

Programming style

Formatting and naming are obviously only two small (albeit important) aspects of a good programming style. How to structure software to arrive at correct, clear, performant, and extensible solutions is a much bigger subject. Over time I have found the combination of the "functional core, imperative shell" idea with object-oriented types and interfaces to be fairly effective. But that's a topic for another post (or two).