Roll Counting
-------------

This is how I propose to deal with the problem of pid wrapping.

The problem:
------------

Although a pid uniquely identifies a process *while it is running*, it
does not uniquely identify it over the lifetime of a computing session.
pids eventually wrap around and are reused.

The solution:
-------------

I propose to add another column to the event table, "rollcount", which
will store the number of times that we have previously seen this pid.
A process will then be uniquely identified by its pid/rollcount pair.

Implementation:
---------------

This should be done in post-processing - not while the system is
actively collecting data. Run this algorithm:

    pids = "SELECT DISTINCT (pid) FROM event";
    For each (item in pids):
        cur_forks = "SELECT * FROM event WHERE rc = item.pid AND (it's a fork event) ORDER BY date";
        cur_count = 0;
        For each (fork in cur_forks):
            "UPDATE event SET rollcount = cur_count WHERE pid = item.pid AND date >= fork.date"
            cur_count++;
        end
    end

Each later fork overwrites the rollcount of the events that follow it,
so once the loop finishes every event carries the count of the most
recent fork that preceded it.
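
The post-processing pass could be sketched roughly as follows, using Python's sqlite3 module. This is only an illustration under stated assumptions, not the actual schema. The table layout, the `type = 'fork'` test, the integer `date` values and the function names `assign_rollcounts` and `demo` are all hypothetical. The sketch also assumes that `rc` on a fork event holds the pid of the newly created child, and that `rollcount` defaults to 0.

```python
import sqlite3


def assign_rollcounts(conn):
    """Post-processing pass: set event.rollcount for every pid."""
    cur = conn.cursor()
    pids = [row[0] for row in cur.execute("SELECT DISTINCT pid FROM event")]
    for pid in pids:
        # Assumption: a fork event stores the child's pid in `rc`.
        forks = cur.execute(
            "SELECT date FROM event "
            "WHERE rc = ? AND type = 'fork' ORDER BY date",
            (pid,),
        ).fetchall()
        # The n-th fork that produced this pid starts incarnation n.
        for count, (fork_date,) in enumerate(forks):
            cur.execute(
                "UPDATE event SET rollcount = ? WHERE pid = ? AND date >= ?",
                (count, pid, fork_date),
            )
    conn.commit()


def demo():
    """Build a tiny hypothetical event table and run the pass over it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event (pid INTEGER, rc INTEGER, type TEXT, "
        "date INTEGER, rollcount INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO event (pid, rc, type, date) VALUES (?, ?, ?, ?)",
        [
            (1, 42, "fork", 10),     # pid 1 forks child 42
            (42, None, "exec", 11),
            (42, None, "exit", 12),
            (1, 42, "fork", 20),     # pid 42 is handed out again after wrapping
            (42, None, "exec", 21),
        ],
    )
    assign_rollcounts(conn)
    return conn.execute(
        "SELECT pid, date, rollcount FROM event ORDER BY date"
    ).fetchall()
```

Calling `demo()` returns one `(pid, date, rollcount)` row per event. In this made-up data the first process with pid 42 keeps rollcount 0, and the event logged after the second fork gets rollcount 1, so the two processes are told apart by their pid/rollcount pair.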