First, we observe that because PRC by definition utilizes only the current state of an LP, combined with the last event or last set of events it processed and most recently sent forward events to undo its forward computation, events can be locally committed without having to know the global virtual time (GVT) of the system. Consequently, there is no need for GVT calculations and fossil collection to be done in optimistic PRC parallel simulation systems. Based on our previous experience with distributed Time Warp systems, we believe the removal of GVT and fossil collection computations will allow parallel PRC models to scale when run on large, distributed processor configurations [17,18,22].
While the removal of GVT and fossil collection overheads may be the silver lining for optimistic parallel PRC simulations, rollback processing does have a potential dark side. Because rollback is realized as the reverse of forward processing, this implies that rollback itself is an optimistic computation and subject to being rolled back as well. Unlike TiPRC, rolling back an LP, results in the scheduling of events in the reverse to potentially other LPs. Thus, to complete a rollback an LP may have to wait on a reverse event from another LP. Because such a causality chain is not known a priori, the LP has the option to continue processing its local set of reverse events. If a reverse event arrives in the LP's rollback past, then the LP must rollback the rollback, which is equivalent to executing events in the forward direction (i.e., increasing timestamps). Consequently, a oscillation between forward and reverse execution may ensue to complete a single rollback. If this oscillation spreads to all processors in the system, will the system live lock?
To avoid this problem, we do observe that it might be possible to exploit the discovered lookahead of the forward execution path to perform the rollback in a conservative, non-risky way. Meaning, PRC would optimistically process events in the forward case yet rollback using a conservative synchronization approach so as to avoid rollback thrashing. Because the lookahead of the system is determined ``on the fly'' as the forward path is explored, it need not be known at runtime.
Because support for a bi-directional processing capability is required, the event scheduler must have two priority queues. The first contains events that are scheduled into the logical future of the simulator, either forwards or backwards depending on the start and end time values specified at runtime. The second queue is used to contain all events scheduled in the logical past (i.e., event to ``rollback'' an LP). The scheduling discipline will process all events from the logical past queue prior to processing any events from the logical future queue. We believe this approach will ensure that all LPs rolling back will get first priority, thus stamping out the spread of incorrect computation faster and ideally making the system more stable.