2026-03-21 - Different Ways to Read CTF
There are many ways to skin a cat, but not all that many ways to read Common Trace Format (CTF) traces. Let's dive into the options that actually exist by collecting a trace and implementing the same processing using different CTF libraries.
The data was collected using sudo trace-cmd record -e
sched:sched_switch -C mono on my Linux desktop,
sporting a Ryzen 7 3700X CPU, 2666 MHz RAM and an M.2 NVMe
SSD (this setup is also the host of the processing).
The sched_switch events provide information
about which processes are running on each CPU, allowing us to
estimate which programs occupy the CPU the most. Two
sample events are provided below, showing how
"UnityGfxDeviceW" gets replaced by "monado-service" on core
10.
[03:12:01.107934447] sched:sched_switch: { cpu_id = 10 }, { prev_comm = "UnityGfxDeviceW", prev_pid = 21846, prev_prio = 120, prev_state = 256, next_comm = "monado-service", next_pid = 21931, next_prio = 120 }
[03:12:01.107946159] sched:sched_switch: { cpu_id = 10 }, { prev_comm = "monado-service", prev_pid = 21931, prev_prio = 120, prev_state = 1, next_comm = "wine_sechost_se", next_pid = 21653, next_prio = 120 }
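To make the accounting concrete: each sched_switch credits prev_comm with the time elapsed since the previous switch on that CPU. A minimal sketch, independent of any CTF library, using only the two events quoted above:

```python
from collections import Counter

# (tstamp_ns, cpu_id, prev_comm) taken from the two events above;
# only the sub-second part of the timestamps is kept for brevity.
events = [
    (107934447, 10, "UnityGfxDeviceW"),
    (107946159, 10, "monado-service"),
]

last_tstamp = {}        # cpu_id -> timestamp of the previous switch
exec_times = Counter()  # (cpu_id, comm) -> total ns spent on the CPU

for tstamp, cpu, prev_comm in events:
    if cpu in last_tstamp:
        # prev_comm ran from the previous switch until this one
        exec_times[(cpu, prev_comm)] += tstamp - last_tstamp[cpu]
    last_tstamp[cpu] = tstamp

print(exec_times[(10, "monado-service")])  # 11712 ns
```

Every implementation below is this same loop, just expressed through a different CTF reader.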
To avoid tracing nothing but a bunch of idle CPUs (yawn), I decided to trace while operating my VR setup. So the traced system is running a VR headset, VRChat, Dance Dash and OBS, which records the adventure using a CPU encoder.
If you are in the know, you might have noticed
that trace-cmd does not produce data in
the CTF format. That is an issue, considering that I want to
compare CTF readers. I could have traced the system
using lttng to produce a
trace in the correct format, but I was curious about
the bt2-ftrace-to-ctf
project presented
at FOSDEM
2026 so I took this as an opportunity to test it
out. bt2-ftrace-to-ctf is a babeltrace2 plugin
that acts as a converter from the trace.dat format (produced
by trace-cmd) to CTF. With a
simple ftrace-to-ctf -b 03:12:01.107806147 -e
03:12:02.107806147 -c 2 recording/trace.dat
recording/ctf I converted a one second snippet of the
collected trace data into the CTF format, yielding a 47M CTF 2
trace.
The 47M CTF trace was processed using different CTF libraries and languages, but with the same underlying logic, in an attempt to make a fairer comparison in terms of code complexity and performance. The following libraries and language bindings were tested:
- babeltrace (C/Python)
- trace compass (Java)
- actf (C/Rust/Lua)
yactfr (C++) seemed interesting as well but was not chosen since it only operates on a single data stream at a time, which would either make the comparison unfair or require me to write a bunch of glue code. According to its README, it performs similarly to babeltrace 1, so we can squint and imagine a C++ API with performance similar to what the C/Rust alternatives offer.
Another approach which I have (sadly) seen way too many times in the wild is regex parsing of babeltrace's text output using Python... Suffice it to say, that setup was not tested here. Although after seeing the performance of the Python bindings, the regex numbers would have been interesting to compare against.
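For reference, the regex approach usually looks something like this sketch, which parses a babeltrace-style text line like the sample events above (the pattern is illustrative and only extracts the fields this analysis would need):

```python
import re

# A line of babeltrace-style text output, as in the sample events above.
line = ('[03:12:01.107934447] sched:sched_switch: { cpu_id = 10 }, '
        '{ prev_comm = "UnityGfxDeviceW", prev_pid = 21846, prev_prio = 120, '
        'prev_state = 256, next_comm = "monado-service", next_pid = 21931, '
        'next_prio = 120 }')

# Pull out the timestamp, cpu_id and prev_comm; everything else is skipped.
pat = re.compile(r'\[(\d+):(\d+):(\d+)\.(\d+)\] sched:sched_switch: '
                 r'\{ cpu_id = (\d+) \}, \{ prev_comm = "([^"]*)"')

m = pat.match(line)
h, mi, s, frac, cpu, prev_comm = m.groups()
tstamp_ns = (int(h) * 3600 + int(mi) * 60 + int(s)) * 10**9 + int(frac)
print(cpu, prev_comm, tstamp_ns)
```

Brittle against any change in the text formatter, which is exactly why it stayed out of the comparison.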
The data processing is quite simple. The data consists of scheduling events from each core, so a hash map maps each core to its last scheduling timestamp together with another hash map that tracks each task's total time spent on the core. Since C does not come with a hash map (search.h does not count), map.h from actf was used to store the analysis data for the C implementations. The output from the processing was in the following form, bar some minor formatting differences.
CPU 0
swapper/0: 0.508601 s
obs: 0.074376 s
winedevice.exe: 0.071495 s
Dance Dash.exe: 0.037824 s
wineserver: 0.028300 s
CPU 1
swapper/1: 0.512032 s
winedevice.exe: 0.067111 s
obs: 0.059902 s
wineserver: 0.035980 s
Dance Dash.exe: 0.031961 s
CPU 2
swapper/2: 0.503841 s
obs: 0.098667 s
winedevice.exe: 0.073203 s
wineserver: 0.043991 s
Dance Dash.exe: 0.032262 s
CPU 3
swapper/3: 0.494540 s
LighthouseDirec: 0.092279 s
obs: 0.068384 s
winedevice.exe: 0.065736 s
wineserver: 0.033332 s
CPU 4
swapper/4: 0.467450 s
obs: 0.083199 s
VRChat.exe: 0.059995 s
wineserver: 0.057738 s
winedevice.exe: 0.049785 s
CPU 5
swapper/5: 0.454535 s
Dance Dash.exe: 0.070342 s
obs: 0.066225 s
wineserver: 0.057865 s
monado-service: 0.056186 s
CPU 6
swapper/6: 0.450509 s
LighthouseDirec: 0.066589 s
wineserver: 0.053331 s
Dance Dash.exe: 0.051936 s
obs: 0.050678 s
CPU 7
swapper/7: 0.440899 s
LighthouseDirec: 0.072027 s
wineserver: 0.060782 s
obs: 0.059672 s
Dance Dash.exe: 0.056041 s
CPU 8
swapper/8: 0.481720 s
LighthouseDirec: 0.087384 s
obs: 0.065288 s
winedevice.exe: 0.060120 s
Dance Dash.exe: 0.046687 s
CPU 9
swapper/9: 0.520220 s
winedevice.exe: 0.071630 s
obs: 0.053362 s
wineserver: 0.042097 s
VRChat.exe: 0.039459 s
CPU 10
swapper/10: 0.484793 s
obs: 0.077734 s
winedevice.exe: 0.066219 s
Dance Dash.exe: 0.045152 s
LighthouseUSB: 0.039816 s
CPU 11
swapper/11: 0.483829 s
LighthouseDirec: 0.122355 s
winedevice.exe: 0.059622 s
obs: 0.055846 s
Dance Dash.exe: 0.034227 s
CPU 12
swapper/12: 0.460862 s
LighthouseDirec: 0.079508 s
wineserver: 0.056605 s
obs: 0.053798 s
VRChat.exe: 0.049304 s
CPU 13
swapper/13: 0.464974 s
wineserver: 0.064827 s
obs: 0.062498 s
VRChat.exe: 0.062445 s
Dance Dash.exe: 0.049045 s
CPU 14
swapper/14: 0.441174 s
obs: 0.079303 s
LighthouseDirec: 0.078323 s
wineserver: 0.059251 s
Dance Dash.exe: 0.048622 s
CPU 15
swapper/15: 0.458847 s
VRChat.exe: 0.072892 s
obs: 0.069808 s
wineserver: 0.058439 s
Dance Dash.exe: 0.054922 s
Measurements and Code
babeltrace 2.1.2
Babeltrace is the reference implementation of the CTF format. It is written in C/C++ and has Python bindings for convenient scripting. It is both a CLI and a library and has a plugin-based design where you create trace processing graphs composed of sources, filters and sinks. I wanted to consume data only, so I wrote a sink plugin.
C
To define a sink plugin in C, babeltrace2 requires you to define a set of callbacks which act as the methods of said plugin. The plugin was compiled as a shared library which the babeltrace2 CLI can load and use (it is possible to write a standalone CLI as well using libbabeltrace, but not without the pain of setting up the trace processing graph yourself). There was a bit of boilerplate to get through, but in the end, I had an initialization function, a consume function and a finalization function to put the analysis code into.
It ran at a decent speed, averaging 507 ms over 10 runs.
babeltrace C sink plugin source code
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <string.h>
#include <babeltrace2/babeltrace.h>
typedef struct fatcharp {
char *s;
size_t len;
} fatcharp;
/* fnv-1a for non null-terminated fatcharps */
static inline size_t fatcharp_hash_fnv(fatcharp p)
{
size_t prime, offset_basis;
if (sizeof(prime) == 4) {
prime = UINT32_C(16777619);
offset_basis = UINT32_C(2166136261);
} else { // assume 64-bit
prime = UINT64_C(1099511628211);
offset_basis = UINT64_C(14695981039346656037);
}
size_t hash = offset_basis;
for (size_t i = 0; i < p.len; i++) {
hash ^= p.s[i];
hash *= prime;
}
return hash;
}
static inline int fatcharpcmp(fatcharp p1, fatcharp p2) {
if (p1.len < p2.len) {
return -1;
} else if (p1.len > p2.len) {
return 1;
} else {
return strncmp(p1.s, p2.s, p1.len);
}
}
static void fatcharp_free(fatcharp f)
{
free(f.s);
}
#define MAP_NAME fatcharptou64
#define MAP_KEY_TYPE fatcharp
#define MAP_KEY_FREE fatcharp_free
#define MAP_KEY_CMP fatcharpcmp
#define MAP_VAL_TYPE uint64_t
#define MAP_HASH fatcharp_hash_fnv
#include "crust/map.h"
struct cpudata {
int64_t last_tstamp;
fatcharptou64 tasktotime;
};
static void cpudata_free(struct cpudata d)
{
fatcharptou64_free(&d.tasktotime);
}
#define MAP_NAME u64tocpudata
#define MAP_KEY_TYPE uint64_t
#define MAP_KEY_CMP uint64cmp
#define MAP_VAL_TYPE struct cpudata
#define MAP_VAL_FREE cpudata_free
#define MAP_HASH hash_uint64
#include "crust/map.h"
struct task {
fatcharp name;
uint64_t time;
};
static int tasktimecmp(const void *a0, const void *a1)
{
const struct task *t0 = a0;
const struct task *t1 = a1;
if (t0->time > t1->time) {
return -1;
} else if (t0->time < t1->time) {
return 1;
} else {
return 0;
}
}
struct process_out {
bt_message_iterator *message_iterator;
u64tocpudata cputodata;
};
static bt_component_class_initialize_method_status process_out_initialize(
bt_self_component_sink *self_component_sink,
bt_self_component_sink_configuration *configuration,
const bt_value *params, void *initialize_method_data)
{
struct process_out *process_out = malloc(sizeof(*process_out));
u64tocpudata_init_cap(&process_out->cputodata, 16);
bt_self_component_set_data(
bt_self_component_sink_as_self_component(self_component_sink),
process_out);
bt_self_component_sink_add_input_port(self_component_sink,
"in", NULL, NULL);
return BT_COMPONENT_CLASS_INITIALIZE_METHOD_STATUS_OK;
}
static int u64cmp(const void *a0, const void *a1)
{
const uint64_t *v0 = a0;
const uint64_t *v1 = a1;
if (*v0 < *v1) {
return -1;
} else if (*v0 > *v1) {
return 1;
} else {
return 0;
}
}
static void process_out_finalize(bt_self_component_sink *self_component_sink)
{
struct process_out *p = bt_self_component_get_data(
bt_self_component_sink_as_self_component(self_component_sink));
uint64_t *cpus = malloc(sizeof(*cpus) * p->cputodata.len);
struct cpudata tmp;
struct u64tocpudata_it it = { 0 };
for (size_t i = 0; u64tocpudata_foreach(&p->cputodata, &cpus[i],
&tmp, &it) == 0; i++) {};
qsort(cpus, p->cputodata.len, sizeof(*cpus), u64cmp);
for (size_t i = 0; i < p->cputodata.len; i++) {
printf("CPU %"PRIu64"\n", cpus[i]);
struct cpudata *data = u64tocpudata_find(&p->cputodata, cpus[i]);
struct task *tasks = malloc(sizeof(*tasks) * data->tasktotime.len);
struct fatcharptou64_it it = { 0 };
for (size_t j = 0; fatcharptou64_foreach(&data->tasktotime, &tasks[j].name,
&tasks[j].time, &it) == 0; j++) {};
qsort(tasks, data->tasktotime.len, sizeof(*tasks), tasktimecmp);
for (size_t j = 0; j < data->tasktotime.len && j < 5; j++) {
printf("%20.*s: %f s\n", (int)tasks[j].name.len, tasks[j].name.s,
(double)tasks[j].time / 1000000000);
}
free(tasks);
}
free(cpus);
u64tocpudata_free(&p->cputodata);
free(p);
}
static bt_component_class_sink_graph_is_configured_method_status process_out_graph_is_configured(
bt_self_component_sink *self_component_sink)
{
struct process_out *process_out = bt_self_component_get_data(
bt_self_component_sink_as_self_component(self_component_sink));
bt_self_component_port_input *in_port =
bt_self_component_sink_borrow_input_port_by_index(
self_component_sink, 0);
bt_message_iterator_create_from_sink_component(self_component_sink,
in_port, &process_out->message_iterator);
return BT_COMPONENT_CLASS_SINK_GRAPH_IS_CONFIGURED_METHOD_STATUS_OK;
}
static bt_component_class_get_supported_mip_versions_method_status process_get_supported_mip_versions_method(
bt_self_component_class_sink *source_component_class,
const bt_value *params, void *initialize_method_data,
bt_logging_level logging_level,
bt_integer_range_set_unsigned *supported_versions)
{
bt_integer_range_set_unsigned_add_range(supported_versions, 0, 1);
return BT_COMPONENT_CLASS_GET_SUPPORTED_MIP_VERSIONS_METHOD_STATUS_OK;
}
static void process(struct process_out *p, const bt_message *msg)
{
if (bt_message_get_type(msg) != BT_MESSAGE_TYPE_EVENT) {
return;
}
const bt_event *ev = bt_message_event_borrow_event_const(msg);
const bt_event_class *evc = bt_event_borrow_class_const(ev);
const char *name = bt_event_class_get_name(evc);
if (strcmp(name, "sched:sched_switch") != 0) {
return;
}
int64_t tstamp;
bt_clock_snapshot_get_ns_from_origin(
bt_message_event_borrow_default_clock_snapshot_const(msg), &tstamp);
uint64_t cpu_id = bt_field_integer_unsigned_get_value(
bt_field_structure_borrow_member_field_by_name_const(
bt_packet_borrow_context_field_const(bt_event_borrow_packet_const(ev)), "cpu_id"));
struct cpudata *data = u64tocpudata_find(&p->cputodata, cpu_id);
if (! data) {
struct cpudata d;
d.last_tstamp = tstamp;
fatcharptou64_init_cap(&d.tasktotime, 16);
u64tocpudata_insert(&p->cputodata, cpu_id, d);
data = u64tocpudata_find(&p->cputodata, cpu_id);
}
const bt_field *prev_comm_fld = bt_field_structure_borrow_member_field_by_name_const(
bt_event_borrow_payload_field_const(ev), "prev_comm");
fatcharp prev_comm = {
.s = (char *)bt_field_string_get_value(prev_comm_fld),
.len = bt_field_string_get_length(prev_comm_fld)
};
uint64_t diff = (uint64_t)(tstamp - data->last_tstamp);
uint64_t *t = fatcharptou64_find(&data->tasktotime, prev_comm);
if (t) {
*t += diff;
} else {
prev_comm.s = strdup(prev_comm.s);
fatcharptou64_insert(&data->tasktotime, prev_comm, diff);
}
data->last_tstamp = tstamp;
}
static bt_component_class_sink_consume_method_status process_out_consume(
bt_self_component_sink *self_component_sink)
{
bt_component_class_sink_consume_method_status status =
BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_OK;
struct process_out *process_out = bt_self_component_get_data(
bt_self_component_sink_as_self_component(self_component_sink));
bt_message_array_const messages;
uint64_t message_count;
bt_message_iterator_next_status next_status =
bt_message_iterator_next(process_out->message_iterator, &messages,
&message_count);
switch (next_status) {
case BT_MESSAGE_ITERATOR_NEXT_STATUS_END:
bt_message_iterator_put_ref(process_out->message_iterator);
status = BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_END;
goto end;
case BT_MESSAGE_ITERATOR_NEXT_STATUS_AGAIN:
status = BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_AGAIN;
goto end;
case BT_MESSAGE_ITERATOR_NEXT_STATUS_MEMORY_ERROR:
status = BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_MEMORY_ERROR;
goto end;
case BT_MESSAGE_ITERATOR_NEXT_STATUS_ERROR:
status = BT_COMPONENT_CLASS_SINK_CONSUME_METHOD_STATUS_ERROR;
goto end;
default:
break;
}
for (uint64_t i = 0; i < message_count; i++) {
const bt_message *message = messages[i];
process(process_out, message);
bt_message_put_ref(message);
}
end:
return status;
}
BT_PLUGIN_MODULE();
BT_PLUGIN(process);
BT_PLUGIN_SINK_COMPONENT_CLASS(output, process_out_consume);
BT_PLUGIN_SINK_COMPONENT_CLASS_INITIALIZE_METHOD(output, process_out_initialize);
BT_PLUGIN_SINK_COMPONENT_CLASS_FINALIZE_METHOD(output, process_out_finalize);
BT_PLUGIN_SINK_COMPONENT_CLASS_GRAPH_IS_CONFIGURED_METHOD(output, process_out_graph_is_configured);
BT_PLUGIN_SINK_COMPONENT_CLASS_GET_SUPPORTED_MIP_VERSIONS_METHOD(output, process_get_supported_mip_versions_method);
Python
It is possible to define a sink plugin for Python as well,
but to make our lives easier, babeltrace2 exposes
a bt2.TraceCollectionMessageIterator that can be
constructed using a CTF trace and consumed as a regular Python
iterator. Very convenient!
The convenience appears to have come at a price though... The Python processing runs in 22.7 seconds on average. That is... incredibly slow. I ran it under cProfile to check whether I was doing something silly, but it appears most of the time is spent accessing the event and its fields, so I deemed myself innocent.
babeltrace Python iterator source code
import bt2
import sys
import collections

class CpuData:
    def __init__(self, tstamp):
        self.last_tstamp = tstamp
        self.exec_times = collections.Counter()

cputodata = {}
for msg in bt2.TraceCollectionMessageIterator(sys.argv[1]):
    if type(msg) is not bt2._EventMessageConst:
        continue
    event = msg.event
    if event.name != 'sched:sched_switch':
        continue
    tstamp = msg.default_clock_snapshot.ns_from_origin
    cpu = int(event['cpu_id'])
    data = cputodata.get(cpu)
    if data is None:
        data = CpuData(tstamp)
        cputodata[cpu] = data
    prev_comm = str(event['prev_comm'])
    diff = tstamp - data.last_tstamp
    data.exec_times[prev_comm] += diff
    data.last_tstamp = tstamp

for cpu, data in sorted(cputodata.items()):
    print(f"CPU {cpu}")
    for comm, ns in data.exec_times.most_common(5):
        print('\t{:20}{} s'.format(comm, ns / 1e9))
Trace Compass
Trace Compass is primarily a GUI application which reads several types of traces and performs analyses whose results can be browsed interactively in different views.
Java
I need to confess here. I am not the biggest fan of Java, so
I did not attempt to use the Trace Compass CTF reader in a
professional Java developer way. There are for sure better ways
to do this... The CTF reader is part of Trace Compass, but
does not have any direct dependencies on it, so I built the
project as normal using maven and then went spelunking for
jar-files in ~/.m2.
The classes related to CTF reading are public and documented,
so I could start writing the necessary code based on that.
Then when I tried to compile it, Java happily complained about
non-existing packages, so I dug through my ~/.m2
folder of jars using jar tf to find where each
package was declared. One after the other, jars were put into
a libs folder.
Eventually the Java code compiled (yay), but despair struck
me yet again as I realized that I also had to resolve all
runtime dependencies. Back to the ~/.m2 folder...
The endeavour resulted in a library folder with five jar-files
that were enough to run the processing. If you ever want to do
this as well, these jars were necessary:
- org.eclipse.tracecompass.ctf.core-5.1.0-SNAPSHOT.jar
- org.eclipse.tracecompass.ctf.parser-1.2.0-SNAPSHOT.jar
- antlr-runtime-3.2.jar
- com.google.guava-33.5.0.jre.jar
- org.apache.commons.commons-io-2.21.0.jar
I presume you can do this in a more professional way using
maven, but javac -cp '.:libs/*' main.java was good
enough for me. As a side note, normally in Trace Compass you
would use a wrapper around the CTF event, making it more
convenient to use. I did not go down the rabbit hole of
figuring out what extra jars that would require.
In 10 runs, the Java implementation averaged 1.107 seconds. Not amazing, but still very acceptable.
Trace Compass Java source code
import java.util.HashMap;
import java.util.Map;

import org.eclipse.tracecompass.ctf.core.trace.CTFTrace;
import org.eclipse.tracecompass.ctf.core.trace.CTFTraceReader;
import org.eclipse.tracecompass.ctf.core.CTFException;
import org.eclipse.tracecompass.ctf.core.event.IEventDefinition;
import org.eclipse.tracecompass.ctf.core.event.types.StringDefinition;
import org.eclipse.tracecompass.ctf.core.event.types.SimpleDatatypeDefinition;
class CpuData {
long last_tstamp;
Map<String, Long> tasktotime;
CpuData(long tstamp) {
last_tstamp = tstamp;
tasktotime = new HashMap<>();
}
}
void process(IEventDefinition defs, Map<Long, CpuData> cpu_to_data)
{
var name = defs.getDeclaration().getName();
if (! name.equals("sched:sched_switch")) {
return;
}
var tstamp = defs.getTimestamp();
var flds = defs.getFields();
var cpu_id = ((SimpleDatatypeDefinition)defs.getPacketContext()
.getDefinition("cpu_id")).getIntegerValue();
var data = cpu_to_data.computeIfAbsent(cpu_id, k -> new CpuData(tstamp));
var prev_comm = ((StringDefinition)flds.getDefinition("prev_comm"))
.getValue();
var diff = tstamp - data.last_tstamp;
data.tasktotime.compute(prev_comm, (k, v) -> (v == null) ? diff : v + diff);
data.last_tstamp = tstamp;
}
public void main(String[] args) throws CTFException {
var path = args[0];
CTFTrace trace = new CTFTrace(path);
CTFTraceReader reader = new CTFTraceReader(trace);
var cpu_to_data = new HashMap<Long, CpuData>();
while (reader.hasMoreEvents()) {
var defs = reader.getCurrentEventDef();
process(defs, cpu_to_data);
reader.advance();
}
reader.close();
cpu_to_data.entrySet().stream()
.sorted(Map.Entry.<Long, CpuData>comparingByKey())
.forEach(e0 -> {
System.out.println("CPU: "+e0.getKey());
e0.getValue().tasktotime.entrySet().stream()
.sorted(Map.Entry.<String, Long>comparingByValue().reversed())
.limit(5)
.forEach(e1 -> {
System.out.format("\t%s: %f\n", e1.getKey(),
(double)e1.getValue() / 1000000000);
});
});
}
actf
actf is my CTF 2 implementation, written in C with both Rust and Lua bindings for a safer or easier time respectively. actf is easy to embed in other applications due to its few dependencies and pull-based API, but it is more of a pure CTF library than a trace manipulation framework like babeltrace.
C
The C API is fairly straightforward (biased indeed). You instantiate a reader, open a CTF trace with it and then call read on it repeatedly until all events are consumed.
The 10 runs executed at an average of 306 ms per run.
actf C source code
#include <inttypes.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <actf/actf.h>
typedef struct fatcharp {
const char *s;
size_t len;
} fatcharp;
/* fnv-1a for non null-terminated fatcharps */
static inline size_t fatcharp_hash_fnv(fatcharp p)
{
size_t prime, offset_basis;
if (sizeof(prime) == 4) {
prime = UINT32_C(16777619);
offset_basis = UINT32_C(2166136261);
} else { // assume 64-bit
prime = UINT64_C(1099511628211);
offset_basis = UINT64_C(14695981039346656037);
}
size_t hash = offset_basis;
for (size_t i = 0; i < p.len; i++) {
hash ^= p.s[i];
hash *= prime;
}
return hash;
}
static inline int fatcharpcmp(fatcharp p1, fatcharp p2) {
if (p1.len < p2.len) {
return -1;
} else if (p1.len > p2.len) {
return 1;
} else {
return strncmp(p1.s, p2.s, p1.len);
}
}
#define MAP_NAME fatcharptou64
#define MAP_KEY_TYPE fatcharp
#define MAP_KEY_CMP fatcharpcmp
#define MAP_VAL_TYPE uint64_t
#define MAP_HASH fatcharp_hash_fnv
#include "crust/map.h"
struct cpudata {
int64_t last_tstamp;
fatcharptou64 tasktotime;
};
static void cpudata_free(struct cpudata d)
{
fatcharptou64_free(&d.tasktotime);
}
#define MAP_NAME u64tocpudata
#define MAP_KEY_TYPE uint64_t
#define MAP_KEY_CMP uint64cmp
#define MAP_VAL_TYPE struct cpudata
#define MAP_VAL_FREE cpudata_free
#define MAP_HASH hash_uint64
#include "crust/map.h"
void process(actf_event *ev, u64tocpudata *cputodata)
{
const actf_event_cls *evc = actf_event_event_cls(ev);
const char *name = actf_event_cls_name(evc);
if (strcmp(name, "sched:sched_switch") != 0) {
return;
}
int64_t tstamp = actf_event_tstamp_ns_from_origin(ev);
uint64_t cpu_id = actf_fld_uint64(actf_pkt_fld(actf_event_pkt(ev), "cpu_id"));
struct cpudata *data = u64tocpudata_find(cputodata, cpu_id);
if (! data) {
struct cpudata d;
d.last_tstamp = tstamp;
fatcharptou64_init_cap(&d.tasktotime, 16);
u64tocpudata_insert(cputodata, cpu_id, d);
data = u64tocpudata_find(cputodata, cpu_id);
}
const actf_fld *prev_comm_fld = actf_event_fld(ev, "prev_comm");
fatcharp prev_comm = {
.s = actf_fld_str_raw(prev_comm_fld),
.len = actf_fld_str_sz(prev_comm_fld)
};
uint64_t diff = (uint64_t)(tstamp - data->last_tstamp);
uint64_t *t = fatcharptou64_find(&data->tasktotime, prev_comm);
if (t) {
*t += diff;
} else {
fatcharptou64_insert(&data->tasktotime, prev_comm, diff);
}
data->last_tstamp = tstamp;
}
struct task {
fatcharp name;
uint64_t time;
};
static int tasktimecmp(const void *a0, const void *a1)
{
const struct task *t0 = a0;
const struct task *t1 = a1;
if (t0->time > t1->time) {
return -1;
} else if (t0->time < t1->time) {
return 1;
} else {
return 0;
}
}
static int u64cmp(const void *a0, const void *a1)
{
const uint64_t *v0 = a0;
const uint64_t *v1 = a1;
if (*v0 < *v1) {
return -1;
} else if (*v0 > *v1) {
return 1;
} else {
return 0;
}
}
int main(int argc, char *argv[])
{
if (argc != 2) {
fprintf(stderr, "Usage: %s <CTF_PATH>\n", argv[0]);
return 1;
}
int rc = 0;
struct actf_freader_cfg cfg = { 0 };
actf_freader *rd = actf_freader_init(cfg);
if (!rd) {
perror("actf_freader_init");
return 1;
}
if (actf_freader_open_folders(rd, argv + 1, argc - 1) < 0) {
fprintf(stderr, "actf_freader_open_folders: %s\n", actf_freader_last_error(rd));
rc = 1;
goto err;
}
u64tocpudata cputodata;
u64tocpudata_init_cap(&cputodata, 16);
size_t evs_len = 0;
actf_event **evs = NULL;
while ((rc = actf_freader_read(rd, &evs, &evs_len)) == 0 && evs_len) {
for (size_t i = 0; i < evs_len; i++) {
process(evs[i], &cputodata);
}
}
if (rc < 0) {
fprintf(stderr, "actf_freader_read: %s\n", actf_freader_last_error(rd));
}
uint64_t *cpus = malloc(sizeof(*cpus) * cputodata.len);
struct cpudata tmp;
struct u64tocpudata_it it = { 0 };
for (size_t i = 0; u64tocpudata_foreach(&cputodata, &cpus[i],
&tmp, &it) == 0; i++) {};
qsort(cpus, cputodata.len, sizeof(*cpus), u64cmp);
for (size_t i = 0; i < cputodata.len; i++) {
printf("CPU %"PRIu64"\n", cpus[i]);
struct cpudata *data = u64tocpudata_find(&cputodata, cpus[i]);
struct task *tasks = malloc(sizeof(*tasks) * data->tasktotime.len);
struct fatcharptou64_it it = { 0 };
for (size_t j = 0; fatcharptou64_foreach(&data->tasktotime, &tasks[j].name,
&tasks[j].time, &it) == 0; j++) {};
qsort(tasks, data->tasktotime.len, sizeof(*tasks), tasktimecmp);
for (size_t j = 0; j < data->tasktotime.len && j < 5; j++) {
printf("%20.*s: %f s\n", (int)tasks[j].name.len, tasks[j].name.s,
(double)tasks[j].time / 1000000000);
}
free(tasks);
}
free(cpus);
u64tocpudata_free(&cputodata);
err:
actf_freader_free(rd);
return rc;
}
Rust
The C API of actf has a few sharp edges where you may hurt yourself, such as non null-terminated strings from fields, or data returned by one call being invalidated by subsequent calls. To absolve myself from these sins, I experimented with a safe Rust API, which gives access to the full actf API but removes most of the potential accidents through lifetimes and safer types. Reading events and their fields from a CTF trace is 100% safe in Rust, but to use the higher level C components of actf, such as filters, unsafe is a necessity.
The Rust API adds a thin extra layer plus utf8 validation on top of all actf calls, so it fell slightly behind the C implementation, averaging 341 ms over 10 runs. The Rust code is impressively concise while being safer than the C alternative, but the numerous unwraps pain me.
actf Rust source code
use actf::Event;
use actf::EventGenerate;
use actf::Freader;
use actf::FreaderConfig;
use std::collections::HashMap;
use std::path::Path;
struct CpuData {
last_tstamp: i64,
task_to_time: HashMap<String, u64>,
}
impl CpuData {
fn new(tstamp: i64) -> Self {
Self {
last_tstamp: tstamp,
task_to_time: HashMap::new(),
}
}
}
fn process(ev: &Event, cpu_to_data: &mut HashMap<u64, CpuData>) {
let name = ev.class().name().unwrap();
if name != "sched:sched_switch" {
return;
}
let tstamp = ev.timestamp_ns_from_origin();
let cpu_id = ev.packet().field("cpu_id").unwrap().as_u64().unwrap();
let data = cpu_to_data.entry(cpu_id).or_insert(CpuData::new(tstamp));
let prev_comm_s = ev.field("prev_comm").unwrap().as_string().unwrap();
let prev_comm = prev_comm_s.as_str().unwrap();
let diff = (tstamp - data.last_tstamp) as u64;
if let Some(t) = data.task_to_time.get_mut(prev_comm) {
*t += diff;
} else {
data.task_to_time.insert(prev_comm.to_string(), diff);
}
data.last_tstamp = tstamp;
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
let path = std::env::args().nth(1).expect("no path given");
let mut fr = Freader::open_folder(FreaderConfig::default(), &Path::new(&path)).unwrap();
let mut cpu_to_data: HashMap<u64, CpuData> = HashMap::new();
while let Some(evs) = fr.generate() {
for &ev in evs? {
process(ev, &mut cpu_to_data);
}
}
let mut cpus: Vec<(&u64, &CpuData)> = cpu_to_data.iter().collect();
cpus.sort_by(|a, b| a.0.cmp(b.0));
for (cpu, data) in cpus {
println!("CPU {cpu}");
let mut tasks: Vec<(&String, &u64)> = data.task_to_time.iter().collect();
tasks.sort_by(|a, b| b.1.cmp(a.1));
for &(task, &time) in tasks.iter().take(5) {
println!("{:>20}: {} s", task, time as f64 / 1000000000.0);
}
}
Ok(())
}
Lua
actf can be built with Lua support to allow you to filter
and/or analyze CTF events using a Lua script. A Lua-based
filter can be set up using the API, but the most convenient
way to use it is through the actf command line, where you can
provide a Lua script directly. The Lua script operates with
three callbacks, actf.init, actf.filter
and actf.fini, which are called during
initialization, on every event and during finalization,
respectively.
Time-wise, the Lua approach spent an average of 602 ms to process the data.
actf Lua source code
function actf.init(...)
cputodata = {} -- cpu_id to {last_tstamp, tasktotime}
end
function actf.filter(ev)
if ev:name() ~= "sched:sched_switch" then
return nil
end
local cpu = ev:get("cpu_id")
local tstamp = ev:timestamp_ns()
local data = cputodata[cpu]
if data == nil then
local tasktotime = {}
setmetatable(tasktotime, {__index = function () return 0 end})
data = {last_tstamp=tstamp, tasktotime=tasktotime}
cputodata[cpu] = data
end
local prev_comm = ev:get("prev_comm")
local diff = tstamp - data.last_tstamp
data.tasktotime[prev_comm] = data.tasktotime[prev_comm] + diff
data.last_tstamp = tstamp
end
function actf.fini()
local cpus = {}
for k in pairs(cputodata) do table.insert(cpus, k) end
table.sort(cpus)
for _, cpu in ipairs(cpus) do
print("CPU " .. cpu)
local tasktimes = {}
for k, v in pairs(cputodata[cpu].tasktotime) do table.insert(tasktimes, {k, v}) end
table.sort(tasktimes, function (b, c) return b[2] > c[2] end)
for i, tasktime in ipairs(tasktimes) do
print(string.format("%20s: %f s", tasktime[1], tasktime[2] / 1000000000))
if i == 5 then break end
end
end
end
So what to pick?
Looking purely at the performance of the different approaches, actf is the clear winner, regardless of whether you are writing C or Rust. babeltrace2's C API does run in a respectable time though, so if you are not willing to pay the price of using a niche library like actf, babeltrace2 is a great choice.
| CTF library | Runtime (ms) |
|---|---|
| babeltrace2 (C) | 507 |
| babeltrace2 (Python) | 22700 |
| Trace Compass (Java) | 1107 |
| actf (C) | 306 |
| actf (Rust) | 341 |
| actf (Lua) | 602 |
Performance is not everything though. I usually find myself writing small throwaway programs and scripts to explore traces and discover new insights from them. For that purpose, the iteration speed in C is not good enough. Instead, the babeltrace2 Python API has been my go-to. The iteration speed there is great if the data you're analyzing is small enough.
Due to the poor performance, I sometimes end up dividing my Python scripts into two parts: one that parses the CTF trace and extracts all relevant data into a separate file, and one that reads the extracted data and performs the analysis. Loading the extracted data is orders of magnitude faster than reparsing the CTF trace and gives me a tighter feedback loop when figuring out what to do with the data.
For this kind of use case, Lua-based scripting now becomes a contender for me, since I can write the logic directly in Lua and skip the two-stage approach I use for Python. Yet, Lua cannot compete with the vast Python ecosystem, so the two-stage Python-Python rocket might just change to a Lua-Python rocket if I need to do something more involved...
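The two-stage approach mentioned above can be as simple as dumping the relevant fields to JSON once, then iterating on the analysis against that file. A sketch with a made-up file name and record layout; in the real stage one the records would come from bt2.TraceCollectionMessageIterator rather than being hard-coded:

```python
import json

# Stage 1 (slow, run once): extract only the fields the analysis needs.
# Synthetic events stand in for the output of the CTF iterator here.
extracted = [
    {"tstamp": 107934447, "cpu": 10, "prev_comm": "UnityGfxDeviceW"},
    {"tstamp": 107946159, "cpu": 10, "prev_comm": "monado-service"},
]
with open("extracted.json", "w") as f:
    json.dump(extracted, f)

# Stage 2 (fast, rerun freely): reload the extracted data and analyze.
# Reloading JSON beats reparsing the CTF trace with the Python bindings.
with open("extracted.json") as f:
    events = json.load(f)
print(len(events), "events reloaded")
```

Any serialization works here; JSON just keeps the intermediate file inspectable while tweaking the analysis side.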