|
...race conditions and deadlocks--are difficult uncover and analyze, and often remain undetected past the product deployment.
There are a number of reasons for this difficulty:
* For a given functional test, the size of the set of possible interleavings is exponential in the length of the program, and it is not practical to test them all (except for small programs (1,2)). Only a few of the interleavings actually produce the concurrent fault, and thus the probability of producing the concurrent fault is very low.
* Since the scheduler is deterministic, the execution of short tests that are independent of I/0 delays and network load will usually create the same interleaving regardless of the environment. This is also the case when long tests that are not too dependent on I/0 delays and network load are executed in a similar environment.
* Tests that reveal a concurrent fault in a production environment or in stress test are usually long and run under various environmental conditions (such as system load or software version). As a result, such tests are not necessarily repeatable, and when a fault is found, much effort is invested in recreating the conditions in which it occurred.
Research on finding concurrent faults focuses on detecting actual data races (e.g., References 3, 4, and 5 among others), using algorithms for efficient identification of race conditions in the current run. As mentioned above, the chance that a race condition will occur is low, and the race detection tool does nothing to increase it. Moreover, if the race is intentional, false alarms will result. Furthermore, some concurrent defects are not captured by the formal definition of a race condition. For example, one could write a faulty program that depends on the scheduling algorithm. (6)
Researchers have also looked at the problem of replay in distributed and concurrent settings. In a distributed setting, processes might synchronize by sending and receiving messages. For a given run, an order is determined over the message sending and message receiving events. A replay for a distributed setting is a re-execution of the program in which all these events occur in the same order as in the original run. In a concurrent setting, a replay algorithm preserves the order of shared memory access and synchronization events. (This problem was solved for the Java language in Reference 7.) Replay support facilitates debugging as it removes nondeterminism from the execution of a distributed or concurrent program.
Deterministic testing or debugging of distributed or concurrent programs is discussed in References 8 and 9. It is a weak form of replay in which some of the dependencies between processes are captured at run time. Next, during subsequent runs, the debugging tool attempts to force these dependencies. In contrast to replay, no guarantee is provided that the same timing will be obtained in each run.
Model checking has been applied to testing of multithreaded Java programs in Reference 10. In model checking each process is modeled by a state machine. State machines might communicate by sending and receiving messages. An additional state machine models the predicate to be checked. Together the process state machines and the predicate state machine form the Cartesian product state machine. State space exploration is applied to the Cartesian product state machine. At state space exploration time, the predicate state machine is used to determine if the predicate is violated. Systematic state space exploration has inherent scalability issues. (11)
In this paper we study the problem of generating different interleavings for the purpose of revealing concurrent faults. Since the size of the search space is exponential in the program length, we take a heuristic approach. We study whether seeding the program with sleep(), yield(), and priority() primitives increases fault detection capability. We seed the program at shared memory access events and synchronization events. At run time, we make random or coverage-based decisions as to whether seeded primitives are to be executed. Using the seeding technique, we were able to dramatically increase the probability of finding typical concurrent faults (6) injected in Java programs. The probability of observing the concurrent faults without the seeded delays was very low. In addition, we found that the sleep() primitive was almost always better at finding the injected defects than the yield() and priority() primitives. Finally, we defined and implemented an architecture, ConTest, that combines the replay algorithm introduced in Reference 7, which is essential for debugging, with our seeding technique.
Our experiments were conducted on a single processor. Any concurrent defect found by the seeding technique on a single processor is also a defect on a multiprocessor.
Although the seeding technique gives good results in practice, it has inherent limitations. First, the interleaving space is exponential in the length of the program. In addition, different interleavings occur with different probability. We discuss how coverage-directed generation of interleavings can be used to overcome these limitations.
We utilize the test suites employed in functional test and system test to detect concurrent faults. A test suite definition includes the expected results, which are used to indicate if a fault occurred. We rerun each test many times. The seeding technique causes different interleavings to occur. The final results are examined to determine if a fault was observed. This approach integrates seamlessly into standard testing practices. The automated tests are simply re-executed. Our approach can be combined with race detection tools to improve the detection of concurrent faults.
The rest of the paper is organized as follows. In the next section, we cover race detection and replay. In the section "Use scenario," we describe a typical use of ConTest for testing multithreaded Java applications. The section "Architecture" explains ConTest's architecture and discusses some design issues. Next, we describe the seeding technique, and then we explain how coverage can be used to improve it. The section "Experiments" details six experiments in which some commonly found concurrent defects are detected. We present our conclusions in the last section.
Race detectors and replay
In this section we discuss existing race detection and replay algorithms in the context of debugging synchronization faults. For the purposes of our work, we define an interleaving as a complete order over the operations of the program, such that on a system with a single processor there is a possible run in which this was the temporal order of the execution. In multiprocessor systems the definition of temporal order is more complicated, but for our purposes any reasonable definition will do (see Reference 12, for example).
Detecting race conditions. Many concurrent defects result from data-race conditions. A data-race condition is defined by Savage (4) as two accesses to a shared variable, at least one of which is a write, with no mechanism used by either to prevent simultaneous access.
A race condition is a possible source for a defect, since the value of the variable at the time of reading depends on the scheduling. However, not all race conditions are defects. For example, the following code swaps two integers. There is a race condition, but no defect, as the swapping occurs regardless of the interleaving.
class Change{ static int x = 4, y = 5; //Used to implement a busy wait. static int z1 = -1, z2 = -1; //Swap the value of x and y concurrently public static void main(String args[ ]){ (new Thread(new ChangeA())).start(); (new Thread(new ChangeB())).start();} } class ChangeA implements Runnable{ public void run(){ Change.z1 = Change.x; while(Change.z2 == -1) System.out.println("A is waiting"); Change.x = Change.z2;}} class ChangeB implements Runnable{ public void run(){ Change.z2 = Change.y; while(Change.z1 == -1) System.out.println("B is waiting"); Change.y = Change.z1;}}
It should be noted that race conditions are execution-dependent: a program might be in a race condition in one execution and not in another. Therefore, tools that detect races at run time (or by analyzing the trace of a given run) are likely to miss some potential data races.
In addition, the use of locks might prevent race detection tools from exposing a concurrent defect which is not a race condition. For example, suppose there are two accesses to a variable, and suppose there is a lock associated with that variable. If the lock is obtained before each access and released just after it, then the order of the accesses is arbitrary. If this is unintentional, it might be a defect; however, formally, there is no race condition, and race detection tools will not alert us to the problem.
In conclusion, race detection tools, having no knowledge of the function of the program, would miss defects on the one hand and result in false alarms on the other. The "benign" data race cited above is an example of code that results in a false alarm.
Replay. Deterministic programs are debugged by repeatedly executing the program using the...
NOTE: All illustrations and photos
have been removed from this article.

More articles from IBM Systems Journal
The software testing automation framework., March 01, 2002 Flavers: a finite state verification technique for software systems., March 01, 2002
Looking for additional articles?
Search our database of over 3 million articles.
Looking for more in-depth information on this industry?
Search our complete database of Industry & Market reports by text, subject, publication
name or publication date.
About Goliath
Whether you're looking for sales prospects, competitive information, company
analysis or best practices in managing your organization,
Goliath can help you meet your business needs.
Our extensive business information databases empower business
professionals with both the breadth and depth of credible,
authoritative information they need to support their business
goals. Whether it be strategic planning, sales prospecting,
company research or defining management best practices -
Goliath is your leading source for accurate information.
|