2026-03-21 - Different Ways to Read CTF

There are many ways to skin a cat, but not overly many ways to read Common Trace Format (CTF) traces. Let's dive into what options actually exist by collecting a trace and implementing the same processing using different CTF libraries.

The data was collected using sudo trace-cmd record -e sched:sched_switch -C mono on my Linux desktop, sporting a Ryzen 7 3700X CPU, 2666 MHz RAM and some M.2 NVMe SSD (this setup also hosts the processing). The sched_switch events provide information about which processes are running on each CPU, allowing us to estimate which programs are occupying the CPU the most. Two sample events are provided below, showing how "UnityGfxDeviceW" gets replaced by "monado-service" on core 10.


[03:12:01.107934447] sched:sched_switch: { cpu_id = 10 }, { prev_comm = "UnityGfxDeviceW", prev_pid = 21846, prev_prio = 120, prev_state = 256, next_comm = "monado-service", next_pid = 21931, next_prio = 120 }
[03:12:01.107946159] sched:sched_switch: { cpu_id = 10 }, { prev_comm = "monado-service", prev_pid = 21931, prev_prio = 120, prev_state = 1, next_comm = "wine_sechost_se", next_pid = 21653, next_prio = 120 }
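To make the bookkeeping concrete: the gap between two consecutive sched_switch events on a core is the time the outgoing task (the prev_comm of the later event) spent running. A quick back-of-the-envelope check in Python for the two sample events above (only the fractional-second parts differ, so those are all we need):

```python
# Fractional-second parts of the two sample timestamps above (in ns)
t0 = 107_934_447  # switch to monado-service on core 10
t1 = 107_946_159  # switch away from monado-service

# The delta is credited to the prev_comm of the *second* event,
# i.e. monado-service ran for this long on core 10.
delta_ns = t1 - t0
print(delta_ns)  # 11712 ns, about 11.7 us
```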

To avoid a trace full of idle CPUs (yawn), I decided to trace while operating my VR setup. So the traced system is running a VR headset, VRChat, Dance Dash and OBS, recording the adventure using a CPU encoder.

Dance dash gameplay with overlaid 3D avatar

If you are in the know, you might have noticed that trace-cmd does not produce data in the CTF format. That is an issue, considering that I want to compare CTF readers. I could have traced the system using lttng to produce a trace in the correct format, but I was curious about the bt2-ftrace-to-ctf project presented at FOSDEM 2026, so I took this as an opportunity to test it out. bt2-ftrace-to-ctf is a babeltrace2 plugin that acts as a converter from the trace.dat format (produced by trace-cmd) to CTF. With a simple ftrace-to-ctf -b 03:12:01.107806147 -e 03:12:02.107806147 -c 2 recording/trace.dat recording/ctf I converted a one-second snippet of the collected trace data into the CTF format, yielding a 47M CTF 2 trace.

The 47M CTF trace was processed using different CTF libraries and languages, but with the same underlying logic, in an attempt to make a fairer comparison in terms of code complexity and performance. The following libraries and language bindings were tested: babeltrace2 (C and Python), Trace Compass (Java) and actf (C, Rust and Lua).

yactfr (C++) seemed interesting as well, but was not chosen since it only operates on a single data stream at a time, which would either make the comparison unfair or require me to write a bunch of glue code. According to its README, its performance is similar to babeltrace 1's, so we can squint and imagine a C++ API with performance similar to what the C/Rust alternatives offer.

Another approach, which I have (sadly) seen way too many times in the wild, is regex parsing of babeltrace's text output using Python... Suffice it to say, that setup was not tested here. Although after seeing the performance of the Python bindings, it would have been interesting to compare the regex numbers against them.

The data processing is quite simple. The data consists of scheduling events from each core, so a hash map is used to map each core to its last scheduling timestamp, together with yet another hash map keeping track of each task's total time spent on the core. Since C does not come with a hash map (search.h does not count), map.h from actf was used to store the analysis data for the C implementations. The output from the processing was in the following form, bar some minor formatting differences.

CPU 0
           swapper/0: 0.508601 s
                 obs: 0.074376 s
      winedevice.exe: 0.071495 s
      Dance Dash.exe: 0.037824 s
          wineserver: 0.028300 s
CPU 1
           swapper/1: 0.512032 s
      winedevice.exe: 0.067111 s
                 obs: 0.059902 s
          wineserver: 0.035980 s
      Dance Dash.exe: 0.031961 s
CPU 2
           swapper/2: 0.503841 s
                 obs: 0.098667 s
      winedevice.exe: 0.073203 s
          wineserver: 0.043991 s
      Dance Dash.exe: 0.032262 s
CPU 3
           swapper/3: 0.494540 s
     LighthouseDirec: 0.092279 s
                 obs: 0.068384 s
      winedevice.exe: 0.065736 s
          wineserver: 0.033332 s
CPU 4
           swapper/4: 0.467450 s
                 obs: 0.083199 s
          VRChat.exe: 0.059995 s
          wineserver: 0.057738 s
      winedevice.exe: 0.049785 s
CPU 5
           swapper/5: 0.454535 s
      Dance Dash.exe: 0.070342 s
                 obs: 0.066225 s
          wineserver: 0.057865 s
      monado-service: 0.056186 s
CPU 6
           swapper/6: 0.450509 s
     LighthouseDirec: 0.066589 s
          wineserver: 0.053331 s
      Dance Dash.exe: 0.051936 s
                 obs: 0.050678 s
CPU 7
           swapper/7: 0.440899 s
     LighthouseDirec: 0.072027 s
          wineserver: 0.060782 s
                 obs: 0.059672 s
      Dance Dash.exe: 0.056041 s
CPU 8
           swapper/8: 0.481720 s
     LighthouseDirec: 0.087384 s
                 obs: 0.065288 s
      winedevice.exe: 0.060120 s
      Dance Dash.exe: 0.046687 s
CPU 9
           swapper/9: 0.520220 s
      winedevice.exe: 0.071630 s
                 obs: 0.053362 s
          wineserver: 0.042097 s
          VRChat.exe: 0.039459 s
CPU 10
          swapper/10: 0.484793 s
                 obs: 0.077734 s
      winedevice.exe: 0.066219 s
      Dance Dash.exe: 0.045152 s
       LighthouseUSB: 0.039816 s
CPU 11
          swapper/11: 0.483829 s
     LighthouseDirec: 0.122355 s
      winedevice.exe: 0.059622 s
                 obs: 0.055846 s
      Dance Dash.exe: 0.034227 s
CPU 12
          swapper/12: 0.460862 s
     LighthouseDirec: 0.079508 s
          wineserver: 0.056605 s
                 obs: 0.053798 s
          VRChat.exe: 0.049304 s
CPU 13
          swapper/13: 0.464974 s
          wineserver: 0.064827 s
                 obs: 0.062498 s
          VRChat.exe: 0.062445 s
      Dance Dash.exe: 0.049045 s
CPU 14
          swapper/14: 0.441174 s
                 obs: 0.079303 s
     LighthouseDirec: 0.078323 s
          wineserver: 0.059251 s
      Dance Dash.exe: 0.048622 s
CPU 15
          swapper/15: 0.458847 s
          VRChat.exe: 0.072892 s
                 obs: 0.069808 s
          wineserver: 0.058439 s
      Dance Dash.exe: 0.054922 s
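Stripped of any particular CTF library, the bookkeeping described earlier boils down to a handful of lines. Below is a sketch in plain Python over synthetic events; the dict field names simply mirror the sched_switch fields and are not any library's API:

```python
import collections

class CpuData:
    def __init__(self, tstamp):
        self.last_tstamp = tstamp
        self.task_to_time = collections.Counter()  # comm -> accumulated ns

def process(event, cpu_to_data):
    # Credit the time since the previous switch on this CPU to the
    # task that was just switched out (prev_comm).
    cpu, tstamp = event["cpu_id"], event["tstamp"]
    data = cpu_to_data.setdefault(cpu, CpuData(tstamp))
    data.task_to_time[event["prev_comm"]] += tstamp - data.last_tstamp
    data.last_tstamp = tstamp

# Synthetic sched_switch stream on a single core
events = [
    {"cpu_id": 0, "tstamp": 0,      "prev_comm": "swapper/0"},
    {"cpu_id": 0, "tstamp": 40_000, "prev_comm": "obs"},
    {"cpu_id": 0, "tstamp": 50_000, "prev_comm": "swapper/0"},
]
cpu_to_data = {}
for ev in events:
    process(ev, cpu_to_data)
print(cpu_to_data[0].task_to_time.most_common())  # [('obs', 40000), ('swapper/0', 10000)]
```

Every implementation below is a translation of this loop into the respective library's event and field accessors.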

Measurements and Code

babeltrace 2.1.2

Babeltrace is the reference implementation of the CTF format. It is written in C/C++ and has Python bindings for convenient scripting. It is both a CLI and a library and has a plugin-based design where you create trace processing graphs composed of sources, filters and sinks. I wanted to consume data only, so I wrote a sink plugin.

C

To define a sink plugin in C, babeltrace2 requires you to define a set of callbacks which act as the methods of said plugin. The plugin was compiled as a shared library which the babeltrace2 CLI can load and use (it is possible to write a standalone CLI as well using libbabeltrace, but not without the pain of setting up the trace processing graph yourself). There was a bit of boilerplate to get through, but in the end, I had an initialization function, a consume function and a finalization function to put the analysis code into.

It ran at a decent speed, averaging 507 ms over 10 runs.

babeltrace C sink plugin source code

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <string.h>
#include <babeltrace2/babeltrace.h>

typedef struct fatcharp {
	char *s;
	size_t len;
} fatcharp;

/* fnv-1a for non null-terminated fatcharps */
static inline size_t fatcharp_hash_fnv(fatcharp p)
{
	size_t prime, offset_basis;
	if (sizeof(prime) == 4) {
		prime = UINT32_C(16777619);
		offset_basis = UINT32_C(2166136261);
	} else { // assume 64-bit
		prime = UINT64_C(1099511628211);
		offset_basis = UINT64_C(14695981039346656037);
	}
	size_t hash = offset_basis;
	for (size_t i = 0; i < p.len; i++) {
		hash ^= p.s[i];
		hash *= prime;
	}
	return hash;
}

static inline int fatcharpcmp(fatcharp p1, fatcharp p2) {
	if (p1.len < p2.len) {
		return -1;
	} else if (p1.len > p2.len) {
		return 1;
	} else {
		return strncmp(p1.s, p2.s, p1.len);
	}
}

static void fatcharp_free(fatcharp f)
{
	free(f.s);
}

#define MAP_NAME fatcharptou64
#define MAP_KEY_TYPE fatcharp
#define MAP_KEY_FREE fatcharp_free
#define MAP_KEY_CMP fatcharpcmp
#define MAP_VAL_TYPE uint64_t
#define MAP_HASH fatcharp_hash_fnv
#include "crust/map.h"

struct cpudata {
	int64_t last_tstamp;
	fatcharptou64 tasktotime;
};

static void cpudata_free(struct cpudata d)
{
	fatcharptou64_free(&d.tasktotime);
}

#define MAP_NAME u64tocpudata
#define MAP_KEY_TYPE uint64_t
#define MAP_KEY_CMP uint64cmp
#define MAP_VAL_TYPE struct cpudata
#define MAP_VAL_FREE cpudata_free
#define MAP_HASH hash_uint64
#include "crust/map.h"

struct task {
	fatcharp name;
	uint64_t time;
};

static int tasktimecmp(const void *a0, const void *a1)
{
	const struct task *t0 = a0;
	const struct task *t1 = a1;
	if (t0->time > t1->time) {
		return -1;
	} else if (t0->time < t1->time) {
		return 1;
	} else {
		return 0;
	}
}

struct process_out {
	bt_message_iterator *message_iterator;
	u64tocpudata cputodata;
};

static bt_component_class_initialize_method_status process_out_initialize(
        bt_self_component_sink *self_component_sink,
        bt_self_component_sink_configuration *configuration,
        const bt_value *params, void *initialize_method_data)
{
	struct process_out *process_out = malloc(sizeof(*process_out));
	u64tocpudata_init_cap(&process_out->cputodata, 16);

	bt_self_component_set_data(
		bt_self_component_sink_as_self_component(self_component_sink),
		process_out);
	bt_self_component_sink_add_input_port(self_component_sink,
					      "in", NULL, NULL);
	return BT_COMPONENT_CLASS_INITIALIZE_METHOD_STATUS_OK;
}

static int u64cmp(const void *a0, const void *a1)
{
	const uint64_t *v0 = a0;
	const uint64_t *v1 = a1;
	if (*v0 < *v1) {
		return -1;
	} else if (*v0 > *v1) {
		return 1;
	} else {
		return 0;
	}
}

static void process_out_finalize(bt_self_component_sink *self_component_sink)
{
	struct process_out *p = bt_self_component_get_data(
		bt_self_component_sink_as_self_component(self_component_sink));

	uint64_t *cpus = malloc(sizeof(*cpus) * p->cputodata.len);
	struct cpudata tmp;
	struct u64tocpudata_it it = { 0 };
	for (size_t i = 0; u64tocpudata_foreach(&p->cputodata, &cpus[i],
						&tmp, &it) == 0; i++) {};
	qsort(cpus, p->cputodata.len, sizeof(*cpus), u64cmp);

	for (size_t i = 0; i < p->cputodata.len; i++) {
		printf("CPU %"PRIu64"\n", cpus[i]);
		struct cpudata *data = u64tocpudata_find(&p->cputodata, cpus[i]);

		struct task *tasks = malloc(sizeof(*tasks) * data->tasktotime.len);
		struct fatcharptou64_it it = { 0 };
		for (size_t j = 0; fatcharptou64_foreach(&data->tasktotime, &tasks[j].name,
							 &tasks[j].time, &it) == 0; j++) {};

		qsort(tasks, data->tasktotime.len, sizeof(*tasks), tasktimecmp);
		for (size_t j = 0; j < data->tasktotime.len && j < 5; j++) {
			printf("%20.*s: %f s\n", (int)tasks[j].name.len, tasks[j].name.s,
			       (double)tasks[j].time / 1000000000);
		}
		free(tasks);
	}
	free(cpus);

	u64tocpudata_free(&p->cputodata);
	free(p);
}

static bt_component_class_sink_graph_is_configured_method_status process_out_graph_is_configured(
	bt_self_component_sink *self_component_sink)
{
	struct process_out *process_out = bt_self_component_get_data(
		bt_self_component_sink_as_self_component(self_component_sink));

	bt_self_component_port_input *in_port =
		bt_self_component_sink_borrow_input_port_by_index(
			self_component_sink, 0);
	bt_message_iterator_create_from_sink_component(self_component_sink,
						       in_port, &process_out->message_iterator);

	return BT_COMPONENT_CLASS_SINK_GRAPH_IS_CONFIGURED_METHOD_STATUS_OK;
}

static bt_component_class_get_supported_mip_versions_method_status process_get_supported_mip_versions_method(
	bt_self_component_class_sink *source_component_class,
	const bt_value *params, void *initialize_method_data,
	bt_logging_level logging_level,
	bt_integer_range_set_unsigned *supported_versions)
{
	bt_integer_range_set_unsigned_add_range(supported_versions, 0, 1);
	return BT_COMPONENT_CLASS_GET_SUPPORTED_MIP_VERSIONS_METHOD_STATUS_OK;
}

static void process(struct process_out *p, const bt_message *msg)
{
	if (bt_message_get_type(msg) != BT_MESSAGE_TYPE_EVENT) {
		return;
	}

	const bt_event *ev = bt_message_event_borrow_event_const(msg);
	const bt_event_class *evc = bt_event_borrow_class_const(ev);
	const char *name = bt_event_class_get_name(evc);
	if (strcmp(name, "sched:sched_switch") != 0) {
		return;
	}
	int64_t tstamp;
	bt_clock_snapshot_get_ns_from_origin(
		bt_message_event_borrow_default_clock_snapshot_const(msg), &tstamp);
	uint64_t cpu_id = bt_field_integer_unsigned_get_value(
		bt_field_structure_borrow_member_field_by_name_const(
			bt_packet_borrow_context_field_const(bt_event_borrow_packet_const(ev)), "cpu_id"));
	struct cpudata *data = u64tocpudata_find(&p->cputodata, cpu_id);
	if (! data) {
		struct cpudata d;
		d.last_tstamp = tstamp;
		fatcharptou64_init_cap(&d.tasktotime, 16);
		u64tocpudata_insert(&p->cputodata, cpu_id, d);
		data = u64tocpudata_find(&p->cputodata, cpu_id);
	}

	const bt_field *prev_comm_fld = bt_field_structure_borrow_member_field_by_name_const(
		bt_event_borrow_payload_field_const(ev), "prev_comm");
	fatcharp prev_comm = {
		.s = (char *)bt_field_string_get_value(prev_comm_fld),
		.len = bt_field_string_get_length(prev_comm_fld)
	};

	uint64_t diff = (uint64_t)(tstamp - data->last_tstamp);

	uint64_t *t = fatcharptou64_find(&data->tasktotime, prev_comm);
	if (t) {
		*t += diff;
	} else {
		prev_comm.s = strdup(prev_comm.s);
		fatcharptou64_insert(&data->tasktotime, prev_comm, diff);
	}

	data->last_tstamp = tstamp;
}

static bt_component_class_sink_consume_method_status process_out_consume(
        bt_self_component_sink *self_component_sink)
{
	bt_component_class_sink_consume_method_status status =
		BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_OK;

	struct process_out *process_out = bt_self_component_get_data(
		bt_self_component_sink_as_self_component(self_component_sink));

	bt_message_array_const messages;
	uint64_t message_count;
	bt_message_iterator_next_status next_status =
		bt_message_iterator_next(process_out->message_iterator, &messages,
					 &message_count);
	switch (next_status) {
	case BT_MESSAGE_ITERATOR_NEXT_STATUS_END:
		bt_message_iterator_put_ref(process_out->message_iterator);
		status = BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_END;
		goto end;
	case BT_MESSAGE_ITERATOR_NEXT_STATUS_AGAIN:
		status = BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_AGAIN;
		goto end;
	case BT_MESSAGE_ITERATOR_NEXT_STATUS_MEMORY_ERROR:
		status = BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_MEMORY_ERROR;
		goto end;
	case BT_MESSAGE_ITERATOR_NEXT_STATUS_ERROR:
		status = BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_ERROR;
		goto end;
	default:
		break;
	}

	for (uint64_t i = 0; i < message_count; i++) {
		const bt_message *message = messages[i];
		process(process_out, message);
		bt_message_put_ref(message);
	}
end:
	return status;
}

BT_PLUGIN_MODULE();
BT_PLUGIN(process);
BT_PLUGIN_SINK_COMPONENT_CLASS(output, process_out_consume);
BT_PLUGIN_SINK_COMPONENT_CLASS_INITIALIZE_METHOD(output, process_out_initialize);
BT_PLUGIN_SINK_COMPONENT_CLASS_FINALIZE_METHOD(output, process_out_finalize);
BT_PLUGIN_SINK_COMPONENT_CLASS_GRAPH_IS_CONFIGURED_METHOD(output, process_out_graph_is_configured);
BT_PLUGIN_SINK_COMPONENT_CLASS_GET_SUPPORTED_MIP_VERSIONS_METHOD(output, process_get_supported_mip_versions_method);

Python

It is possible to define a sink plugin in Python as well, but to make our lives easier, babeltrace2 exposes a bt2.TraceCollectionMessageIterator that can be constructed from a CTF trace and consumed as a regular Python iterator. Very convenient!

The convenience appears to have come with a price though... The Python processing runs in 22.7 seconds on average. That is... incredibly slow. I ran it under cProfile to check whether I was doing something silly, but it appears most of the time was spent accessing the event and its fields, so I deemed myself innocent.

babeltrace Python iterator source code

import bt2
import sys
import collections

class CpuData:
    def __init__(self, tstamp):
        self.last_tstamp = tstamp
        self.exec_times = collections.Counter()

cputodata = {}

for msg in bt2.TraceCollectionMessageIterator(sys.argv[1]):
    if type(msg) is not bt2._EventMessageConst:
        continue
    event = msg.event
    if event.name != 'sched:sched_switch':
        continue
    tstamp = msg.default_clock_snapshot.ns_from_origin
    cpu = int(event['cpu_id'])
    data = cputodata.get(cpu)
    if data is None:
        data = CpuData(tstamp)
        cputodata[cpu] = data

    prev_comm = str(event['prev_comm'])
    diff = tstamp - data.last_tstamp
    data.exec_times[prev_comm] += diff
    data.last_tstamp = tstamp

for cpu, data in sorted(cputodata.items()):
    print(f"CPU {cpu}")
    for comm, ns in data.exec_times.most_common(5):
        print('\t{:20}{} s'.format(comm, ns / 1e9))

Trace Compass

Trace Compass is primarily a GUI application which reads several types of traces and performs analyses whose results can be browsed interactively in different views.

Java

I need to confess here. I am not the biggest fan of Java, so I did not attempt to use the Trace Compass CTF reader in a professional Java developer way. There are for sure better ways to do this... The CTF reader is part of Trace Compass, but does not have any direct dependencies on it, so I built the project as normal using maven and then went spelunking for jar-files in ~/.m2.

The classes related to CTF reading are public and documented, so I could start writing the necessary code based on that. Then when I tried to compile it, Java happily complained about non-existing packages, so I dug through my ~/.m2 folder of jars using jar tf to find where each package was declared. One after the other, jars were put into a libs folder.

Eventually the Java code compiled (yay), but despair struck me yet again as I realized that I also had to resolve all runtime dependencies. Back to the ~/.m2 folder... The endeavour resulted in a library folder with five jar-files that were enough to run the processing. If you ever want to do this as well, these jars were necessary:

I presume you can do this in a more professional way using maven, but javac -cp '.:libs/*' main.java was good enough for me. As a side-note, normally in Trace Compass you would use a wrapper around the CTF event, making it more convenient to use. I did not delve into the rabbit hole to figure out what extra jars would be required for that.

In 10 runs, the Java implementation averaged 1.107 seconds. Not amazing, but still very acceptable.

Trace Compass Java source code

import java.util.HashMap;
import java.util.Map;

import org.eclipse.tracecompass.ctf.core.trace.CTFTrace;
import org.eclipse.tracecompass.ctf.core.trace.CTFTraceReader;
import org.eclipse.tracecompass.ctf.core.CTFException;
import org.eclipse.tracecompass.ctf.core.event.IEventDefinition;
import org.eclipse.tracecompass.ctf.core.event.types.StringDefinition;
import org.eclipse.tracecompass.ctf.core.event.types.SimpleDatatypeDefinition;

class CpuData {
	long last_tstamp;
	Map<String, Long> tasktotime;

	CpuData(long tstamp) {
		last_tstamp = tstamp;
		tasktotime = new HashMap<>();
	}
}

void process(IEventDefinition defs, Map<Long, CpuData> cpu_to_data)
{
	var name = defs.getDeclaration().getName();
	if (! name.equals("sched:sched_switch")) {
		return;
	}
	var tstamp = defs.getTimestamp();
	var flds = defs.getFields();
	var cpu_id = ((SimpleDatatypeDefinition)defs.getPacketContext()
		.getDefinition("cpu_id")).getIntegerValue();
	var data = cpu_to_data.computeIfAbsent(cpu_id, k -> new CpuData(tstamp));

	var prev_comm = ((StringDefinition)flds.getDefinition("prev_comm"))
		.getValue();

	var diff = tstamp - data.last_tstamp;

	data.tasktotime.compute(prev_comm, (k, v) -> (v == null) ? diff : v + diff);
	data.last_tstamp = tstamp;
}

public void main(String[] args) throws CTFException {
	var path = args[0];
	CTFTrace trace = new CTFTrace(path);
	CTFTraceReader reader = new CTFTraceReader(trace);

	var cpu_to_data = new HashMap<Long, CpuData>();
	while (reader.hasMoreEvents()) {
		var defs = reader.getCurrentEventDef();
		process(defs, cpu_to_data);
		reader.advance();
	}
	reader.close();

	cpu_to_data.entrySet().stream()
		.sorted(Map.Entry.<Long, CpuData>comparingByKey())
		.forEach(e0 -> {
				System.out.println("CPU: "+e0.getKey());
				e0.getValue().tasktotime.entrySet().stream()
					.sorted(Map.Entry.<String, Long>comparingByValue().reversed())
					.limit(5)
					.forEach(e1 -> {
							System.out.format("\t%s: %f\n", e1.getKey(),
									  (double)e1.getValue() / 1000000000);
						});
			});
}

actf

actf is my CTF 2 implementation, written in C and with both Rust and Lua bindings for a safer or easier time respectively. actf is easy to embed in other applications due to its few dependencies and pull-based API, but is more of a pure CTF library rather than a trace manipulation framework like babeltrace.

C

The C API is fairly straightforward (biased indeed). You instantiate a reader, open a CTF trace with it and then call read on it repeatedly until all events are consumed.

The 10 runs executed at an average of 306 ms per run.

actf C source code

#include <inttypes.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#include <actf/actf.h>

typedef struct fatcharp {
	const char *s;
	size_t len;
} fatcharp;

/* fnv-1a for non null-terminated fatcharps */
static inline size_t fatcharp_hash_fnv(fatcharp p)
{
	size_t prime, offset_basis;
	if (sizeof(prime) == 4) {
		prime = UINT32_C(16777619);
		offset_basis = UINT32_C(2166136261);
	} else { // assume 64-bit
		prime = UINT64_C(1099511628211);
		offset_basis = UINT64_C(14695981039346656037);
	}
	size_t hash = offset_basis;
	for (size_t i = 0; i < p.len; i++) {
		hash ^= p.s[i];
		hash *= prime;
	}
	return hash;
}

static inline int fatcharpcmp(fatcharp p1, fatcharp p2) {
	if (p1.len < p2.len) {
		return -1;
	} else if (p1.len > p2.len) {
		return 1;
	} else {
		return strncmp(p1.s, p2.s, p1.len);
	}
}

#define MAP_NAME fatcharptou64
#define MAP_KEY_TYPE fatcharp
#define MAP_KEY_CMP fatcharpcmp
#define MAP_VAL_TYPE uint64_t
#define MAP_HASH fatcharp_hash_fnv
#include "crust/map.h"

struct cpudata {
	int64_t last_tstamp;
	fatcharptou64 tasktotime;
};

static void cpudata_free(struct cpudata d)
{
	fatcharptou64_free(&d.tasktotime);
}

#define MAP_NAME u64tocpudata
#define MAP_KEY_TYPE uint64_t
#define MAP_KEY_CMP uint64cmp
#define MAP_VAL_TYPE struct cpudata
#define MAP_VAL_FREE cpudata_free
#define MAP_HASH hash_uint64
#include "crust/map.h"

void process(actf_event *ev, u64tocpudata *cputodata)
{
	const actf_event_cls *evc = actf_event_event_cls(ev);
	const char *name = actf_event_cls_name(evc);
	if (strcmp(name, "sched:sched_switch") != 0) {
		return;
	}
	int64_t tstamp = actf_event_tstamp_ns_from_origin(ev);
	uint64_t cpu_id = actf_fld_uint64(actf_pkt_fld(actf_event_pkt(ev), "cpu_id"));
	struct cpudata *data = u64tocpudata_find(cputodata, cpu_id);
	if (! data) {
		struct cpudata d;
		d.last_tstamp = tstamp;
		fatcharptou64_init_cap(&d.tasktotime, 16);
		u64tocpudata_insert(cputodata, cpu_id, d);
		data = u64tocpudata_find(cputodata, cpu_id);
	}

	const actf_fld *prev_comm_fld = actf_event_fld(ev, "prev_comm");
	fatcharp prev_comm = {
		.s = actf_fld_str_raw(prev_comm_fld),
		.len = actf_fld_str_sz(prev_comm_fld)
	};

	uint64_t diff = (uint64_t)(tstamp - data->last_tstamp);

	uint64_t *t = fatcharptou64_find(&data->tasktotime, prev_comm);
	if (t) {
		*t += diff;
	} else {
		fatcharptou64_insert(&data->tasktotime, prev_comm, diff);
	}

	data->last_tstamp = tstamp;
}

struct task {
	fatcharp name;
	uint64_t time;
};

static int tasktimecmp(const void *a0, const void *a1)
{
	const struct task *t0 = a0;
	const struct task *t1 = a1;
	if (t0->time > t1->time) {
		return -1;
	} else if (t0->time < t1->time) {
		return 1;
	} else {
		return 0;
	}
}

static int u64cmp(const void *a0, const void *a1)
{
	const uint64_t *v0 = a0;
	const uint64_t *v1 = a1;
	if (*v0 < *v1) {
		return -1;
	} else if (*v0 > *v1) {
		return 1;
	} else {
		return 0;
	}
}

int main(int argc, char *argv[])
{
	if (argc != 2) {
		fprintf(stderr, "Usage: %s <CTF_PATH>\n", argv[0]);
		return 1;
	}

	int rc = 0;
	struct actf_freader_cfg cfg = { 0 };
	actf_freader *rd = actf_freader_init(cfg);
	if (!rd) {
		perror("actf_freader_init");
		return rc;
	}
	if (actf_freader_open_folders(rd, argv + 1, argc - 1) < 0) {
		fprintf(stderr, "actf_freader_open_folder: %s\n", actf_freader_last_error(rd));
		goto err;
	}

	u64tocpudata cputodata;
	u64tocpudata_init_cap(&cputodata, 16);
	size_t evs_len = 0;
	actf_event **evs = NULL;
	while ((rc = actf_freader_read(rd, &evs, &evs_len)) == 0 && evs_len) {
		for (size_t i = 0; i < evs_len; i++) {
			process(evs[i], &cputodata);
		}
	}
	if (rc < 0) {
		fprintf(stderr, "actf_freader_read: %s\n", actf_freader_last_error(rd));
	}

	uint64_t *cpus = malloc(sizeof(*cpus) * cputodata.len);
	struct cpudata tmp;
	struct u64tocpudata_it it = { 0 };
	for (size_t i = 0; u64tocpudata_foreach(&cputodata, &cpus[i],
						&tmp, &it) == 0; i++) {};
	qsort(cpus, cputodata.len, sizeof(*cpus), u64cmp);

	for (size_t i = 0; i < cputodata.len; i++) {
		printf("CPU %"PRIu64"\n", cpus[i]);
		struct cpudata *data = u64tocpudata_find(&cputodata, cpus[i]);

		struct task *tasks = malloc(sizeof(*tasks) * data->tasktotime.len);
		struct fatcharptou64_it it = { 0 };
		for (size_t j = 0; fatcharptou64_foreach(&data->tasktotime, &tasks[j].name,
							 &tasks[j].time, &it) == 0; j++) {};

		qsort(tasks, data->tasktotime.len, sizeof(*tasks), tasktimecmp);
		for (size_t j = 0; j < data->tasktotime.len && j < 5; j++) {
			printf("%20.*s: %f s\n", (int)tasks[j].name.len, tasks[j].name.s,
			       (double)tasks[j].time / 1000000000);
		}
		free(tasks);
	}

	free(cpus);
	u64tocpudata_free(&cputodata);
err:
	actf_freader_free(rd);
	return rc;
}

Rust

The C API of actf has a few sharp parts where you may hurt yourself, such as non null-terminated strings from fields or the ability to keep data around from a call which is then invalidated on subsequent calls. To absolve myself from these sins, I experimented with a safe Rust API, which gives access to the full actf API but removes most of the potential accidents by the use of lifetimes and safer types. Reading events and their fields from a CTF trace is 100% safe in Rust, but to use higher level C components of actf, such as filters, unsafe is a necessity.

The Rust API adds a thin extra layer between all actf calls plus utf8 validation, so it fell slightly behind the C implementation, averaging 341 ms over 10 runs. The Rust code is impressively concise while safer than the C alternative, but the numerous unwraps pain me.

actf Rust source code

use actf::Event;
use actf::EventGenerate;
use actf::Freader;
use actf::FreaderConfig;
use std::collections::HashMap;
use std::path::Path;

struct CpuData {
    last_tstamp: i64,
    task_to_time: HashMap<String, u64>,
}

impl CpuData {
    fn new(tstamp: i64) -> Self {
        Self {
            last_tstamp: tstamp,
            task_to_time: HashMap::new(),
        }
    }
}

fn process(ev: &Event, cpu_to_data: &mut HashMap<u64, CpuData>) {
    let name = ev.class().name().unwrap();
    if name != "sched:sched_switch" {
        return;
    }
    let tstamp = ev.timestamp_ns_from_origin();
    let cpu_id = ev.packet().field("cpu_id").unwrap().as_u64().unwrap();
    let data = cpu_to_data.entry(cpu_id).or_insert_with(|| CpuData::new(tstamp));

    let prev_comm_s = ev.field("prev_comm").unwrap().as_string().unwrap();
    let prev_comm = prev_comm_s.as_str().unwrap();

    let diff = (tstamp - data.last_tstamp) as u64;

    if let Some(t) = data.task_to_time.get_mut(prev_comm) {
        *t += diff;
    } else {
        data.task_to_time.insert(prev_comm.to_string(), diff);
    }

    data.last_tstamp = tstamp;
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let path = std::env::args().nth(1).expect("no path given");
    let mut fr = Freader::open_folder(FreaderConfig::default(), &Path::new(&path)).unwrap();

    let mut cpu_to_data: HashMap<u64, CpuData> = HashMap::new();

    while let Some(evs) = fr.generate() {
        for &ev in evs? {
            process(ev, &mut cpu_to_data);
        }
    }

    let mut cpus: Vec<(&u64, &CpuData)> = cpu_to_data.iter().collect();
    cpus.sort_by(|a, b| a.0.cmp(b.0));
    for (cpu, data) in cpus {
        println!("CPU {cpu}");
        let mut tasks: Vec<(&String, &u64)> = data.task_to_time.iter().collect();
        tasks.sort_by(|a, b| b.1.cmp(a.1));
        for &(task, &time) in tasks.iter().take(5) {
            println!("{:>20}: {} s", task, time as f64 / 1000000000.0);
        }
    }

    Ok(())
}

Lua

actf can be built with Lua support to allow you to filter and/or analyze CTF events using a Lua script. A Lua-based filter can be set up using the API, but the most convenient way to use it is through the actf command-line, where you can provide a Lua script directly. The Lua script operates with three callbacks, actf.init, actf.filter and actf.fini, which are called during initialization, on every event and during finalization respectively.

Time-wise, the Lua approach spent an average of 602 ms to process the data.

actf Lua source code

function actf.init(...)
   cputodata = {} -- cpu_id to {last_tstamp, tasktotime}
end

function actf.filter(ev)
   if ev:name() ~= "sched:sched_switch" then
      return nil
   end
   local cpu = ev:get("cpu_id")
   local tstamp = ev:timestamp_ns()
   local data = cputodata[cpu]
   if data == nil then
      local tasktotime = {}
      setmetatable(tasktotime, {__index = function () return 0 end})
      data = {last_tstamp=tstamp, tasktotime=tasktotime}
      cputodata[cpu] = data
   end
   local prev_comm = ev:get("prev_comm")
   local diff = tstamp - data.last_tstamp
   data.tasktotime[prev_comm] = data.tasktotime[prev_comm] + diff
   data.last_tstamp = tstamp
end

function actf.fini()
   local cpus = {}
   for k, v in pairs(cputodata) do table.insert(cpus, k) end
   table.sort(cpus)
   for _, cpu in ipairs(cpus) do
      print("CPU " .. cpu)
      local tasktimes = {}
      for k, v in pairs(cputodata[cpu].tasktotime) do table.insert(tasktimes, {k, v}) end
      table.sort(tasktimes, function (b, c) return b[2] > c[2] end)
      for i, tasktime in ipairs(tasktimes) do
	 print(string.format("%20s: %f s", tasktime[1], tasktime[2] / 1000000000))
	 if i == 5 then break end
      end
   end
end

So what to pick?

Looking straight at the performance of the different approaches, actf is the clear winner, regardless of whether you are writing C or Rust. babeltrace2's C API does run in a respectable time though, so if you are not willing to pay the price of using a niche library like actf, babeltrace2 is a great choice.

Average runtime of each library processing a 47M trace (3 runs warmup, 10 runs measured)

CTF library            Runtime (ms)
babeltrace2 (C)                 507
babeltrace2 (Python)          22700
Trace Compass (Java)           1107
actf (C)                        306
actf (Rust)                     341
actf (Lua)                      602

Performance is not everything though. I usually find myself writing small throwaway programs and scripts to explore traces and discover new insights from them. For that purpose, the iteration speed in C is not good enough. Instead, the babeltrace2 Python API has been my go-to. The iteration speed there is great if the data you're analyzing is small enough.

Due to the poor performance, I sometimes end up dividing my Python scripts into two parts: one that parses the CTF trace and extracts all relevant data into a separate file, and one that reads the extracted data and performs the analysis. Loading the extracted data is orders of magnitude faster than reparsing the CTF trace and gives me a tighter feedback loop when figuring out what to do with the data. For this kind of use-case, Lua-based scripting now becomes a contender for me, since I can write the logic directly in Lua and skip the two-stage approach I use for Python. Yet, Lua cannot compete with the vast Python ecosystem, so the two-stage Python-Python rocket might just change to a Lua-Python rocket if I need to do something more involved...
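For illustration, the two-stage trick can be as simple as dumping the extracted fields to JSON and reloading them in the analysis script. This is a sketch: the hard-coded events list stands in for whatever a real bt2 iterator loop would extract.

```python
import json
import os
import tempfile

# Stage 1 (slow, run once): walk the CTF trace and dump only the fields
# the analysis needs. The events here are hard-coded for illustration.
events = [{"cpu_id": 10, "tstamp": 107_934_447, "prev_comm": "UnityGfxDeviceW"}]
path = os.path.join(tempfile.gettempdir(), "extracted.json")
with open(path, "w") as f:
    json.dump(events, f)

# Stage 2 (fast, rerun while iterating on the analysis): load the
# extracted data instead of reparsing the whole CTF trace.
with open(path) as f:
    reloaded = json.load(f)
assert reloaded == events
```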