Performance Metrics

Metrics combine multiple hardware events into calculated values — knowing you had 1 million cache misses is less useful than knowing those misses represent a 5% miss rate.

Tip

See the example: metric.cpp. For inspiration when creating custom metrics, explore the Likwid project.

Available Built-in Metrics

perf-cpp provides ready-to-use metrics that cover the most common performance analysis needs. You don't need any special setup—just use them like regular events by adding their names to your EventCounter.

Metric	What It Tells You	Formula
`gigahertz`	CPU frequency during measurement	`cycles / (seconds × 10⁹)`
`cycles-per-instruction`	How many cycles each instruction takes (lower is better)	`cycles / instructions`
`instructions-per-cycle`	How many instructions complete per cycle (higher is better)	`instructions / cycles`
`cache-hit-ratio`	Fraction of cache accesses that found data	`cache-references / (cache-references + cache-misses)`
`cache-miss-ratio`	Fraction of cache accesses that missed	`cache-misses / (cache-references + cache-misses)`
`dTLB-miss-ratio`	How often data address translation misses	`dTLB-load-misses / dTLB-loads`
`iTLB-miss-ratio`	How often instruction address translation misses	`iTLB-load-misses / iTLB-loads`
`L1-data-miss-ratio`	L1 data cache miss rate	`L1-dcache-load-misses / L1-dcache-loads`
`branch-miss-ratio`	Branch prediction failure rate	`branch-misses / branches`
`watts-pkg`	CPU package power consumption in Watts (requires RAPL)	`energy-pkg / seconds`
`watts-cores`	CPU core power consumption in Watts (requires RAPL)	`energy-cores / seconds`
`watts-ram`	RAM power consumption in Watts (requires RAPL)	`energy-ram / seconds`

Note

The watts-* metrics require RAPL (Running Average Power Limit) support, which is available on most modern Intel and AMD processors. Available RAPL domains vary by hardware: energy-pkg is widely supported, energy-cores and energy-ram depend on the processor model. Reading RAPL counters may require perf_event_paranoid <= 0 or CAP_SYS_ADMIN.

Working with Metrics

Metrics work exactly like regular events — add them, measure, and retrieve results:

#include <perfcpp/event_counter.hpp>

auto event_counter = perf::EventCounter{};

/// Add metrics just like regular events.
event_counter.add("cycles-per-instruction");

/// Measure your code.
event_counter.start();
/// ... your code being measured ...
event_counter.stop();

/// Get the calculated metric value.
/// The value is optional and may be empty if the metric could not be computed.
const auto result = event_counter.result();
const auto cpi = result.get("cycles-per-instruction");

/// Release resources explicitly, or let the destructor handle it.
event_counter.close();

The required hardware events (e.g., cycles and instructions for CPI) are configured automatically if not already being measured.
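
The class-based example later on this page treats the value returned by result.get(...) as a std::optional<double>. Assuming the same holds for metric values read from an EventCounter, a caller should check for an empty value before using it. The helper below is a hypothetical, standalone sketch of that check and is not part of perf-cpp.

```cpp
#include <optional>
#include <string>

/// Hypothetical helper (not a perf-cpp function): turns an optional
/// metric value into printable text instead of dereferencing it blindly.
std::string describe_metric(const std::string& name, const std::optional<double>& value)
{
    if (!value.has_value())
    {
        /// The metric could not be calculated, e.g., a required event was missing.
        return name + ": not available";
    }

    return name + ": " + std::to_string(value.value());
}
```

In the snippet above, this could be called as describe_metric("cycles-per-instruction", cpi), under the stated assumption about the type of cpi.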

Creating Custom Metrics

perf-cpp supports two approaches for defining custom metrics: formula-based and class-based.

Custom metrics are registered on a perf::CounterDefinition, which is then passed to the EventCounter (→ read more about adding custom events and metrics).

Formula-Based Metrics

For straightforward calculations, express your metric as a mathematical formula. This approach works well when you need to combine a few events with basic arithmetic:

auto counter_definition = perf::CounterDefinition{};

/// Define a metric showing what percentage of stalls come from memory loads
counter_definition.add("stalls-by-mem-loads", 
                       "(CYCLE_ACTIVITY_STALLS_LDM_PENDING / CYCLE_ACTIVITY_STALLS_TOTAL) * 100");

auto event_counter = perf::EventCounter{ counter_definition };
event_counter.add("stalls-by-mem-loads");

This example uses Intel SkylakeX events to identify memory bottlenecks—adapted from Likwid's cycle stalls metrics.

Supported Operations

Your formulas can use:

- Basic arithmetic: +, -, *, /
- Scientific notation: 1E9, 2.5e-6
- Parentheses for grouping: (a + b) / c

Built-in Functions

perf-cpp provides helper functions to simplify common patterns:

Function	Purpose	Example
`ratio(a, b)`	Safe division with null handling	`ratio('branch-misses', 'branches')`
`d_ratio(a, b)`	Same as `ratio()` (compatibility alias)	`d_ratio('misses', 'attempts')`
`sum(a, b, ...)`	Add multiple values together	`sum('l1_hits', 'l2_hits', 'l3_hits')`

You can nest functions for complex calculations:

/// Calculate miss ratio across all cache levels
counter_definition.add("total-cache-miss-ratio", 
    "ratio("
    "  sum('mem_load_retired.l1_miss', 'mem_load_retired.l2_miss', 'mem_load_retired.l3_miss'),"
    "  sum('mem_load_retired.l1_hit', 'mem_load_retired.l2_hit', 'mem_load_retired.l3_hit')"
    ")"
);

Important

Event names containing operators (like the hyphen in L1-dcache-misses) must be wrapped in single quotes: 'L1-dcache-misses'. This prevents the parser from interpreting the hyphen as subtraction.
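
The effect of the quotes can be illustrated with a deliberately minimal tokenizer. This is a hypothetical sketch and not perf-cpp's parser: it assumes names consist of letters, digits, underscores, and dots, treats any other non-space character as a one-character operator, and ignores scientific notation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

/// True for characters this sketch allows inside an unquoted name or number.
bool is_name_char(const char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_' || c == '.';
}

/// Hypothetical tokenizer: splits a formula into names, operators, and quoted names.
std::vector<std::string> sketch_tokenize(const std::string& formula)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < formula.size())
    {
        const char c = formula[i];
        if (std::isspace(static_cast<unsigned char>(c)) != 0)
        {
            ++i;  /// Skip whitespace.
        }
        else if (c == '\'')
        {
            /// Quoted name: everything up to the closing quote is one token.
            const std::size_t close = formula.find('\'', i + 1);
            const std::size_t end = (close == std::string::npos) ? formula.size() : close;
            tokens.push_back(formula.substr(i + 1, end - i - 1));
            i = (close == std::string::npos) ? end : end + 1;
        }
        else if (is_name_char(c))
        {
            /// Unquoted name or number: stops at the first operator character.
            std::size_t j = i;
            while (j < formula.size() && is_name_char(formula[j]))
            {
                ++j;
            }
            tokens.push_back(formula.substr(i, j - i));
            i = j;
        }
        else
        {
            /// Operator or parenthesis.
            tokens.push_back(std::string(1, c));
            ++i;
        }
    }
    return tokens;
}
```

With this sketch, the unquoted input L1-dcache-misses / cycles splits into L1, -, dcache, -, misses, /, cycles, whereas 'L1-dcache-misses' / cycles keeps the event name as a single token.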

Class-Based Metrics

When your metric needs complex logic, validation, or stateful computation, implement it as a class. This approach gives you full control over the calculation:

#include <optional>
#include <string>
#include <vector>

#include <perfcpp/metric/metric.hpp>

class StallsPerCacheMiss final : public perf::Metric
{
public:
    /// Define the metric's identifier
    [[nodiscard]] std::string name() const override 
    {
        return "stalls-per-cache-miss"; 
    }

    /// Declare which events this metric needs
    [[nodiscard]] std::vector<std::string> required_counter_names() const override
    { 
        return {"stalls", "cache-misses"}; 
    }

    /// Perform the calculation after measurement completes
    [[nodiscard]] std::optional<double> calculate(const perf::CounterResult& result) const override
    {
        const auto stalls = result.get("stalls");
        const auto cache_misses = result.get("cache-misses");

        /// Handle missing data gracefully
        if (stalls.has_value() && cache_misses.has_value())
        {
            /// Avoid division by zero
            if (cache_misses.value() > 0)
            {
                return stalls.value() / cache_misses.value();
            }
        }

        return std::nullopt;  // Return empty if calculation isn't possible
    }
};

Register your custom metric class with the counter definition:

auto counter_definition = perf::CounterDefinition{};

/// Register using the metric's built-in name
counter_definition.add(std::make_unique<StallsPerCacheMiss>());

/// Or register with a custom name
counter_definition.add("SPCM", std::make_unique<StallsPerCacheMiss>());

/// Use it like any other metric
auto event_counter = perf::EventCounter{ counter_definition };
event_counter.add("stalls-per-cache-miss");  /// Or "SPCM" if using custom name

Use class-based metrics when the calculation requires complex logic, validation, or architecture-specific behavior.