Abstract. Understanding the dynamic behavior of parallel programs is a critical issue both for debugging and for optimization. A visualization tool displaying an animated sequence of the global states the program runs through offers valuable support for th
pass the events in a correct order.However,the data acquisition layer may receive events generated on different hosts in an incorrect order due to event buffering done by the hosts or due to network delays.It is impossible to order these events correctly by using their time stamps,since their resolution is usually not good enough,and clocks on different hosts cannot be synchronized with the required accuracy.However,the order of events
must obey the causality (or “happened before”)relation [10],i.e.an event
must be prior to an event if there is a causal dependency from to (written as ).As an example,consider to be the sending of a message and to be the receipt of this message on a different host.Even if the data acquisition layer receives before due to buffer or network delays,it must forward to the modeling layer first,because a message cannot be received before it has been sent.In the current implementation,we store incoming events in different buffers for each host used by the application,assuming that the monitoring system ensures correct event ordering within a single host.Whenever the next event is requested by the modeling layer,we have to determine the buffer from which to read that event,thus incrementally merging the local buffers into a consistent global sequence.
The problem of merging local event traces to a global one according to causal rela-tionships is well known;there are also several solutions yet (e.g.[8,9]).However,with a state based visualization approach it turns out that just considering causal dependencies between events is not sufficient for achieving a consistent visualization.As an example,consider the situation depicted in the space-time diagram in Fig.3.Note that there are no causal dependencies (in the sense of “happened before”)between events in different
tasks.However,it is obviously incorrect to order these events as
,because the broadcast actually sent the message to task ,but with this sequence would not be in the group when the broadcast is executed.In contrast,the orders and
are both correct.
1task t on host 12Fig.3.Example illustrating consistency constraints between events and modeled state This implies that in addition to dependencies between events,we have constraints imposed by the current state in the model when selecting the next event to process.Therefore,we use an extended sorting algorithm that works as follows:The input are event buffers,one for each host.We read and remove the next event to be processed