Performance Metrics

Metrics combine multiple hardware events into calculated values — knowing you had 1 million cache misses is less useful than knowing those misses represent a 5% miss rate.

Tip

See the example: metric.cpp. For inspiration when creating custom metrics, explore the Likwid project.


Available Built-in Metrics

perf-cpp provides ready-to-use metrics that cover the most common performance analysis needs. You don't need any special setup—just use them like regular events by adding their names to your EventCounter.

| Metric | What It Tells You | Formula |
|---|---|---|
| gigahertz | CPU frequency during measurement | cycles / (seconds × 10⁹) |
| cycles-per-instruction | How many cycles each instruction takes (lower is better) | cycles / instructions |
| instructions-per-cycle | How many instructions complete per cycle (higher is better) | instructions / cycles |
| cache-hit-ratio | Fraction of cache accesses that found data | cache-references / (cache-references + cache-misses) |
| cache-miss-ratio | Fraction of cache accesses that missed | cache-misses / (cache-references + cache-misses) |
| dTLB-miss-ratio | How often data address translation misses | dTLB-load-misses / dTLB-loads |
| iTLB-miss-ratio | How often instruction address translation misses | iTLB-load-misses / iTLB-loads |
| L1-data-miss-ratio | L1 data cache miss rate | L1-dcache-load-misses / L1-dcache-loads |
| branch-miss-ratio | Branch prediction failure rate | branch-misses / branches |
| watts-pkg | CPU package power consumption in Watts (requires RAPL) | energy-pkg / seconds |
| watts-cores | CPU core power consumption in Watts (requires RAPL) | energy-cores / seconds |
| watts-ram | RAM power consumption in Watts (requires RAPL) | energy-ram / seconds |
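
To make the formulas concrete, here is a small standalone sketch that applies three of them to raw counter values. The `RawCounts` struct and the helper functions are hypothetical names for illustration only; they are not part of perf-cpp, which computes these values for you.

```cpp
#include <cassert>
#include <cstdint>

/// Raw values of one measurement (hypothetical container, not a perf-cpp type).
struct RawCounts
{
    std::uint64_t cycles;
    std::uint64_t instructions;
    std::uint64_t cache_references;
    std::uint64_t cache_misses;
    double seconds;
};

/// cycles / instructions
double cycles_per_instruction(const RawCounts& counts)
{
    assert(counts.instructions > 0U);
    return static_cast<double>(counts.cycles) / static_cast<double>(counts.instructions);
}

/// cache-misses / (cache-references + cache-misses), as listed in the table.
double cache_miss_ratio(const RawCounts& counts)
{
    const auto accesses = counts.cache_references + counts.cache_misses;
    assert(accesses > 0U);
    return static_cast<double>(counts.cache_misses) / static_cast<double>(accesses);
}

/// cycles / (seconds × 10⁹)
double gigahertz(const RawCounts& counts)
{
    assert(counts.seconds > 0.0);
    return static_cast<double>(counts.cycles) / (counts.seconds * 1e9);
}
```

With 3 billion cycles, 2 billion instructions, 19 million cache references, and 1 million cache misses measured over one second, these helpers yield a CPI of 1.5, a miss ratio of 0.05 (5%), and a frequency of 3 GHz.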

Note

The watts-* metrics require RAPL (Running Average Power Limit) support, which is available on most modern Intel and AMD processors. Available RAPL domains vary by hardware: energy-pkg is widely supported, energy-cores and energy-ram depend on the processor model. Reading RAPL counters may require perf_event_paranoid <= 0 or CAP_SYS_ADMIN.

Working with Metrics

Metrics work exactly like regular events — add them, measure, and retrieve results:

#include <perfcpp/event_counter.hpp>

auto event_counter = perf::EventCounter{};

/// Add metrics just like regular events.
event_counter.add("cycles-per-instruction");

/// Measure your code.
event_counter.start();
/// ... your code being measured ...
event_counter.stop();

/// Get the calculated metric value.
const auto result = event_counter.result();
const auto cpi = result.get("cycles-per-instruction");

/// Release resources explicitly, or let the destructor handle it.
event_counter.close();

The required hardware events (e.g., cycles and instructions for CPI) are configured automatically if not already being measured.
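
The idea behind this automatic configuration can be sketched as a simple dependency expansion. The following is a standalone illustration, not perf-cpp's implementation: the `metric_requirements` table and `resolve_events` function are hypothetical, and the two entries only mirror formulas from the table of built-in metrics.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

/// Hypothetical lookup table: metric name -> hardware events it depends on.
const std::map<std::string, std::vector<std::string>> metric_requirements = {
    { "cycles-per-instruction", { "cycles", "instructions" } },
    { "branch-miss-ratio", { "branch-misses", "branches" } },
};

/// Expands the requested names into the hardware events that must be measured.
/// Every event appears once, even if it is requested directly and also needed by a metric.
std::vector<std::string> resolve_events(const std::vector<std::string>& requested)
{
    std::vector<std::string> events;
    std::set<std::string> seen;

    const auto add_once = [&events, &seen](const std::string& name) {
        if (seen.insert(name).second)
        {
            events.push_back(name);
        }
    };

    for (const auto& name : requested)
    {
        const auto metric = metric_requirements.find(name);
        if (metric == metric_requirements.end())
        {
            add_once(name); /// Not a known metric: treat it as a plain event.
        }
        else
        {
            for (const auto& dependency : metric->second)
            {
                add_once(dependency);
            }
        }
    }

    return events;
}
```

Requesting `cycles` together with `cycles-per-instruction` resolves to just `cycles` and `instructions` in this sketch, because the shared event is added only once.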

Creating Custom Metrics

perf-cpp supports two approaches for defining custom metrics: formula-based and class-based.

Custom metrics are registered on a perf::CounterDefinition, which is then passed to the EventCounter (→ read more about adding custom events and metrics).

Formula-Based Metrics

For straightforward calculations, express your metric as a mathematical formula. This approach works well when you need to combine a few events with basic arithmetic:

auto counter_definition = perf::CounterDefinition{};

/// Define a metric showing what percentage of stalls come from memory loads
counter_definition.add("stalls-by-mem-loads", 
                       "(CYCLE_ACTIVITY_STALLS_LDM_PENDING / CYCLE_ACTIVITY_STALLS_TOTAL) * 100");

auto event_counter = perf::EventCounter{ counter_definition };
event_counter.add("stalls-by-mem-loads");

This example uses Intel SkylakeX events to identify memory bottlenecks—adapted from Likwid's cycle stalls metrics.

Supported Operations

Your formulas can use:

- Basic arithmetic: +, -, *, /
- Scientific notation: 1E9, 2.5e-6
- Parentheses for grouping: (a + b) / c
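
To illustrate how such a formula turns into a number, here is a minimal recursive-descent evaluator covering the three features listed above. It is a sketch under stated assumptions, not perf-cpp's actual parser: the class name `FormulaSketch` is hypothetical, counter values are supplied through a plain map, and quoted names and functions are left out.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <stdexcept>
#include <string>

/// Illustrative evaluator for +, -, *, /, parentheses, and scientific notation.
class FormulaSketch
{
public:
    FormulaSketch(const std::string& formula, const std::map<std::string, double>& values)
        : _formula(formula), _values(values)
    {
    }

    [[nodiscard]] double evaluate()
    {
        _pos = 0U;
        const auto value = expression();
        skip_spaces();
        if (_pos != _formula.size())
        {
            throw std::invalid_argument("unexpected input after formula");
        }
        return value;
    }

private:
    std::string _formula;
    std::map<std::string, double> _values;
    std::size_t _pos{ 0U };

    static bool is_name_char(const char c)
    {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_' || c == '.';
    }

    void skip_spaces()
    {
        while (_pos < _formula.size() && std::isspace(static_cast<unsigned char>(_formula[_pos])) != 0) { ++_pos; }
    }

    /// Consumes the character `c` if it is the next non-space character.
    bool consume(const char c)
    {
        skip_spaces();
        if (_pos < _formula.size() && _formula[_pos] == c) { ++_pos; return true; }
        return false;
    }

    /// expression := term (('+' | '-') term)*
    double expression()
    {
        auto value = term();
        while (true)
        {
            if (consume('+')) { value += term(); }
            else if (consume('-')) { value -= term(); }
            else { return value; }
        }
    }

    /// term := factor (('*' | '/') factor)*
    double term()
    {
        auto value = factor();
        while (true)
        {
            if (consume('*')) { value *= factor(); }
            else if (consume('/')) { value /= factor(); }
            else { return value; }
        }
    }

    /// factor := '(' expression ')' | number | name
    double factor()
    {
        if (consume('('))
        {
            const auto value = expression();
            if (!consume(')')) { throw std::invalid_argument("missing ')'"); }
            return value;
        }
        if (_pos < _formula.size() && (std::isdigit(static_cast<unsigned char>(_formula[_pos])) != 0 || _formula[_pos] == '.'))
        {
            /// std::strtod accepts plain and scientific notation such as 1E9 or 2.5e-6.
            const char* begin = _formula.c_str() + _pos;
            char* end = nullptr;
            const auto value = std::strtod(begin, &end);
            if (end == begin) { throw std::invalid_argument("malformed number"); }
            _pos += static_cast<std::size_t>(end - begin);
            return value;
        }
        const auto start = _pos;
        while (_pos < _formula.size() && is_name_char(_formula[_pos])) { ++_pos; }
        if (start == _pos) { throw std::invalid_argument("expected a number, a name, or '('"); }
        return _values.at(_formula.substr(start, _pos - start)); /// Throws std::out_of_range for unknown names.
    }
};
```

Splitting the grammar into `expression`, `term`, and `factor` gives `*` and `/` higher precedence than `+` and `-`, so `(a + b) / c * 1E3` with a = 1, b = 2, c = 4 evaluates to 750.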

Built-in Functions

perf-cpp provides helper functions to simplify common patterns:

| Function | Purpose | Example |
|---|---|---|
| ratio(a, b) | Safe division with null handling | ratio('branch-misses', 'branches') |
| d_ratio(a, b) | Same as ratio() (compatibility alias) | d_ratio('misses', 'attempts') |
| sum(a, b, ...) | Add multiple values together | sum('l1_hits', 'l2_hits', 'l3_hits') |

You can nest functions for complex calculations:

/// Calculate the ratio of load misses to load hits, summed over all cache levels
counter_definition.add("total-cache-miss-ratio", 
    "ratio("
    "  sum('mem_load_retired.l1_miss', 'mem_load_retired.l2_miss', 'mem_load_retired.l3_miss'),"
    "  sum('mem_load_retired.l1_hit', 'mem_load_retired.l2_hit', 'mem_load_retired.l3_hit')"
    ")"
);
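
The semantics of nesting `ratio()` and `sum()` can be sketched with `std::optional`. This is an illustration only: the exact behavior of perf-cpp's functions for missing values or a zero divisor is assumed here, not taken from the library, and the free functions below are hypothetical stand-ins.

```cpp
#include <initializer_list>
#include <optional>

/// A counter value that may be missing (for example, if an event could not be measured).
using Value = std::optional<double>;

/// Adds all operands. Assumption: the result is empty if any operand is missing.
Value sum(std::initializer_list<Value> operands)
{
    double total = 0.0;
    for (const auto& operand : operands)
    {
        if (!operand.has_value())
        {
            return std::nullopt;
        }
        total += operand.value();
    }
    return total;
}

/// "Safe" division. Assumption: the result is empty if an operand is missing
/// or the divisor is zero, instead of producing inf or NaN.
Value ratio(const Value& numerator, const Value& denominator)
{
    if (!numerator.has_value() || !denominator.has_value() || denominator.value() == 0.0)
    {
        return std::nullopt;
    }
    return numerator.value() / denominator.value();
}
```

Because both functions take and return the same optional type, a call such as `ratio(sum({10.0, 20.0, 30.0}), sum({100.0, 200.0}))` composes directly and gives 60 / 300 = 0.2, while a missing input anywhere in the expression propagates to an empty result.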

Important

Event names containing operators (like the hyphens in L1-dcache-load-misses) must be wrapped in single quotes: 'L1-dcache-load-misses'. This prevents the parser from interpreting each hyphen as subtraction.
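
A small tokenizer shows why the quotes matter. This is a hypothetical sketch, not perf-cpp's tokenizer: it treats letters, digits, `_`, and `.` as name characters, reads a quoted name as a single token, and emits every other character as an operator.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

/// Splits a formula into names/numbers and single-character operators.
std::vector<std::string> tokenize(const std::string& formula)
{
    const auto is_name_char = [](const char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_' || c == '.';
    };

    std::vector<std::string> tokens;
    std::size_t pos = 0U;
    while (pos < formula.size())
    {
        const char c = formula[pos];
        if (std::isspace(static_cast<unsigned char>(c)) != 0)
        {
            ++pos;
        }
        else if (c == '\'')
        {
            /// Quoted name: everything up to the closing quote is one token.
            const auto end = formula.find('\'', pos + 1U);
            if (end == std::string::npos)
            {
                throw std::invalid_argument("missing closing quote");
            }
            tokens.push_back(formula.substr(pos + 1U, end - pos - 1U));
            pos = end + 1U;
        }
        else if (is_name_char(c))
        {
            /// Unquoted name or number: stops at the first operator, including '-'.
            const auto start = pos;
            while (pos < formula.size() && is_name_char(formula[pos]))
            {
                ++pos;
            }
            tokens.push_back(formula.substr(start, pos - start));
        }
        else
        {
            /// Operator or parenthesis.
            tokens.push_back(std::string(1U, c));
            ++pos;
        }
    }
    return tokens;
}
```

In this sketch, the unquoted input `L1-dcache-load-misses` falls apart into seven tokens (`L1`, `-`, `dcache`, `-`, `load`, `-`, `misses`), whereas `'L1-dcache-load-misses'` stays a single name.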

Class-Based Metrics

When your metric needs complex logic, validation, or stateful computation, implement it as a class. This approach gives you full control over the calculation:

#include <optional>
#include <string>
#include <vector>

#include <perfcpp/metric/metric.hpp>

class StallsPerCacheMiss final : public perf::Metric
{
public:
    /// Define the metric's identifier
    [[nodiscard]] std::string name() const override 
    {
        return "stalls-per-cache-miss"; 
    }

    /// Declare which events this metric needs
    [[nodiscard]] std::vector<std::string> required_counter_names() const override
    { 
        return {"stalls", "cache-misses"}; 
    }

    /// Perform the calculation after measurement completes
    [[nodiscard]] std::optional<double> calculate(const perf::CounterResult& result) const override
    {
        const auto stalls = result.get("stalls");
        const auto cache_misses = result.get("cache-misses");

        /// Handle missing data gracefully
        if (stalls.has_value() && cache_misses.has_value())
        {
            /// Avoid division by zero
            if (cache_misses.value() > 0)
            {
                return stalls.value() / cache_misses.value();
            }
        }

        return std::nullopt;  // Return empty if calculation isn't possible
    }
};

Register your custom metric class with the counter definition:

auto counter_definition = perf::CounterDefinition{};

/// Register using the metric's built-in name
counter_definition.add(std::make_unique<StallsPerCacheMiss>());

/// Or register with a custom name
counter_definition.add("SPCM", std::make_unique<StallsPerCacheMiss>());

/// Use it like any other metric
auto event_counter = perf::EventCounter{ counter_definition };
event_counter.add("stalls-per-cache-miss");  /// Or "SPCM" if using custom name

Use class-based metrics when the calculation requires complex logic, validation, or architecture-specific behavior.
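
The calculation logic of such a class can be checked without touching hardware counters. The sketch below repeats the `calculate()` logic of `StallsPerCacheMiss` against a stand-in for the measurement result. `CounterValues`, `get`, and `stalls_per_cache_miss` are hypothetical names introduced for this illustration, not perf-cpp types or functions.

```cpp
#include <map>
#include <optional>
#include <string>

/// Stand-in for a measurement result: counter name -> value (hypothetical).
using CounterValues = std::map<std::string, double>;

/// Looks up a counter; returns an empty optional if it was not measured.
std::optional<double> get(const CounterValues& values, const std::string& name)
{
    const auto iterator = values.find(name);
    if (iterator == values.end())
    {
        return std::nullopt;
    }
    return iterator->second;
}

/// Same logic as StallsPerCacheMiss::calculate(), applied to the stand-in.
std::optional<double> stalls_per_cache_miss(const CounterValues& values)
{
    const auto stalls = get(values, "stalls");
    const auto cache_misses = get(values, "cache-misses");

    /// Both inputs must be present and the divisor must be positive.
    if (stalls.has_value() && cache_misses.has_value() && cache_misses.value() > 0)
    {
        return stalls.value() / cache_misses.value();
    }

    return std::nullopt;
}
```

With 500 stalls and 100 cache misses, the function returns 5. With zero cache misses or a missing counter, it returns an empty optional, matching the two guard conditions in the class above.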