Up Next

Part I
Simple tests

Generating tests

We wish to generate tests which exhibit non-SC behaviours when run on x86 or Power machines.

For each architecture we first generate tests with diy. Those tests are violations of SC induced by a cycle of candidate relaxations at which at least one is relaxed by our model for the targeted architecture.

For x86, we get 58 ’X’ tests from the configuration file X.conf (i.e. diy is invoked as “diy -conf X.conf”). The condition on test outcome generated by diy detects one non-SC behaviour. We rather wish to detect any non-SC behaviour. Equivalently, we shall require SC behaviour by running our (non-released) simulator memevents in SC mode and changing test conditions so that SC behaviour is required.

As an example of the approach, running X006.litmus on some x86 machine yields 12 outcomes:

% litmus -mach chianti X006.litmus
...
Histogram (12 states)
5     :>1:EAX=1; 1:EBX=0; 2:EAX=1; 2:EBX=0; x=1;
999488:>1:EAX=2; 1:EBX=0; 2:EAX=1; 2:EBX=0; x=1;
1355  :>1:EAX=1; 1:EBX=1; 2:EAX=1; 2:EBX=0; x=1;
1062834:>1:EAX=2; 1:EBX=1; 2:EAX=1; 2:EBX=0; x=1;
1020  :>1:EAX=1; 1:EBX=0; 2:EAX=1; 2:EBX=1; x=1;
698661:>1:EAX=2; 1:EBX=0; 2:EAX=1; 2:EBX=1; x=1;
6     :>1:EAX=1; 1:EBX=1; 2:EAX=1; 2:EBX=1; x=1;
739   :>1:EAX=2; 1:EBX=1; 2:EAX=1; 2:EBX=1; x=1;
540   :>1:EAX=2; 1:EBX=0; 2:EAX=1; 2:EBX=0; x=2;
595701:>1:EAX=2; 1:EBX=1; 2:EAX=1; 2:EBX=0; x=2;
1050680:>1:EAX=2; 1:EBX=0; 2:EAX=1; 2:EBX=1; x=2;
588971:>1:EAX=2; 1:EBX=1; 2:EAX=1; 2:EBX=1; x=2;
...

However, there are only 10 SC outcomes as can be seen by running the test against SC:

% memevents -model sc X006.litmus
...
States 10
1:EAX=1; 1:EBX=0; 2:EAX=1; 2:EBX=1; x=1;
1:EAX=1; 1:EBX=1; 2:EAX=1; 2:EBX=0; x=1;
1:EAX=1; 1:EBX=1; 2:EAX=1; 2:EBX=1; x=1;
1:EAX=2; 1:EBX=0; 2:EAX=1; 2:EBX=0; x=1;
1:EAX=2; 1:EBX=0; 2:EAX=1; 2:EBX=1; x=1;
1:EAX=2; 1:EBX=0; 2:EAX=1; 2:EBX=1; x=2;
1:EAX=2; 1:EBX=1; 2:EAX=1; 2:EBX=0; x=1;
1:EAX=2; 1:EBX=1; 2:EAX=1; 2:EBX=0; x=2;
1:EAX=2; 1:EBX=1; 2:EAX=1; 2:EBX=1; x=1;
1:EAX=2; 1:EBX=1; 2:EAX=1; 2:EBX=1; x=2;
..

We see that the condition exists (x=2 /\ 1:EAX=2 /\ 1:EBX=0 /\ 2:EAX=1 /\ 2:EBX=0) of test X006 only detects one of the two non-SC outcomes, the other (1:EAX=1; 1:EBX=0; 2:EAX=1; 2:EBX=0; x=1) would remain unnoticed. We thus change the condition of test, resulting into the new test X006, so as to detect all SC outcomes. Notice that the new condition is computed automatically by memevents in SC mode.

We apply the above procedure to the complete batch of tests produced by diy. As a result, any of the 58 new ’X’ tests will yield “No” as soon as it produces a non-SC outcome and “Ok” when all outcomes are SC. Notice that the new conditions are unreadable, except for simple tests such as X054.

For Power, we proceed similarly, with a specific diy configuration file PPC.conf, resulting in 209 PPC’ tests.

Applying offence

For x86 we apply offence three times to all tests of the initial batch(containing for instance X006), using various synchronisation techniques:

For Power we apply offence four times to all tests of the initial batch (containing for instance PPC002), using various synchronisation techniques:

One may notice that Power code uses macros for FNO, STA, LOCK and UNLOCK, while locking idioms are expanded in x86 code. Here are expanded versions of PPC002-A, PPC002-B, and PPC002-L.

Running the tests and checking soundness

For x86, we run the tests on two machines saumur (Intel Xeon with 4, 2-ways hyperthreaded, cores (8 logical procs) running Linux) and chianti (Intel Xeon with 8, 2-ways hyperthreaded, cores (16 logical procs) running Linux).

We performed several runs of the complete batch of tests, varying testing conditions (see litmus documentation). The variations are recorded in files saumur.txt and chianti.txt. This produces files S.00 to S.13 for saumur and files C.00 to C.12 for chianti. We aggregate those into S.X and C.X, respectively.

The following four tables expose the conclusion of the experiment: offence restores SC for all tests.

 C.XS.X
X000No, 415M/27GNo, 5.6M/4.1G
X001No, 25M/17GNo, 1.8M/2.0G
X002No, 20M/17GNo, 1.6M/2.0G
X003No, 18M/17GNo, 1.8M/2.0G
X004No, 4.6M/14GNo, 353k/2.0G
X005No, 2.1M/14GNo, 100k/2.0G
X006No, 3.4M/17GNo, 641k/2.0G
X007No, 810k/14GNo, 102k/2.0G
X008No, 128k/14GNo, 20k/2.0G
X009No, 221M/27GNo, 7.1M/4.1G
X010No, 16M/17GNo, 1.6M/2.0G
X011No, 6.1M/17GNo, 411k/2.0G
X012No, 242M/27GNo, 5.9M/4.1G
X013No, 5.3M/14GNo, 337k/2.0G
X014No, 1.1M/14GNo, 99k/2.0G
X015No, 18M/17GNo, 1.6M/2.0G
X016No, 16M/17GNo, 1.9M/2.0G
X017No, 3.3M/14GNo, 412k/2.0G
X018No, 1.8M/14GNo, 96k/2.0G
X019No, 4.1M/17GNo, 800k/2.0G
X020No, 1.4M/14GNo, 98k/2.0G
X021No, 273k/14GNo, 26k/2.0G
X022No, 61M/27GNo, 721k/4.1G
X023No, 149k/14GNo, 27k/2.0G
X024No, 5.2M/17GNo, 489k/2.0G
X025No, 4.4M/17GNo, 671k/2.0G
X026No, 734k/14GNo, 95k/2.0G
X027No, 248k/14GNo, 31k/2.0G
X028No, 160k/17GNo, 34k/2.0G
X029No, 107k/14GNo, 24k/2.0G
X030No, 7.7k/14GNo, 1.3k/2.0G
X031No, 40M/27GNo, 8.3M/4.1G
X032No, 9.5M/17GNo, 1.5M/2.0G
X033No, 4.9M/17GNo, 375k/2.0G
X034No, 53M/27GNo, 10M/4.1G
X035No, 12M/17GNo, 1.6M/2.0G
X036No, 1.8M/14GNo, 333k/2.0G
X037No, 961k/14GNo, 92k/2.0G
X038No, 4.2M/17GNo, 605k/2.0G
X039No, 841k/14GNo, 89k/2.0G
X040No, 160k/14GNo, 23k/2.0G
X041No, 6.5M/27GNo, 2.4M/4.1G
X042No, 5.9M/17GNo, 505k/2.0G
X043No, 1.0M/14GNo, 99k/2.0G
X044No, 286k/14GNo, 39k/2.0G
X045No, 362k/17GNo, 23k/2.0G
X046No, 271k/14GNo, 24k/2.0G
X047No, 5.3k/14GNo, 1.4k/2.0G
X048No, 64M/27GNo, 6.0M/4.1G
X049No, 14M/17GNo, 1.9M/2.0G
X050No, 2.5M/14GNo, 367k/2.0G
X051No, 1.2M/14GNo, 96k/2.0G
X052No, 4.1M/17GNo, 601k/2.0G
X053No, 206k/14GNo, 28k/2.0G
X054No, 5.7M/27GNo, 809k/4.1G
X055No, 404k/14GNo, 47k/2.0G
X056No, 163k/17GNo, 40k/2.0G
X057No, 3.7k/14GNo, 1.1k/2.0G
    
 C.XS.X
X000-FOkOk
X001-FOkOk
X002-FOkOk
X003-FOkOk
X004-FOkOk
X005-FOkOk
X006-FOkOk
X007-FOkOk
X008-FOkOk
X009-FOkOk
X010-FOkOk
X011-FOkOk
X012-FOkOk
X013-FOkOk
X014-FOkOk
X015-FOkOk
X016-FOkOk
X017-FOkOk
X018-FOkOk
X019-FOkOk
X020-FOkOk
X021-FOkOk
X022-FOkOk
X023-FOkOk
X024-FOkOk
X025-FOkOk
X026-FOkOk
X027-FOkOk
X028-FOkOk
X029-FOkOk
X030-FOkOk
X031-FOkOk
X032-FOkOk
X033-FOkOk
X034-FOkOk
X035-FOkOk
X036-FOkOk
X037-FOkOk
X038-FOkOk
X039-FOkOk
X040-FOkOk
X041-FOkOk
X042-FOkOk
X043-FOkOk
X044-FOkOk
X045-FOkOk
X046-FOkOk
X047-FOkOk
X048-FOkOk
X049-FOkOk
X050-FOkOk
X051-FOkOk
X052-FOkOk
X053-FOkOk
X054-FOkOk
X055-FOkOk
X056-FOkOk
X057-FOkOk
    
 C.XS.X
X000-AOkOk
X001-AOkOk
X002-AOkOk
X003-AOkOk
X004-AOkOk
X005-AOkOk
X006-AOkOk
X007-AOkOk
X008-AOkOk
X009-AOkOk
X010-AOkOk
X011-AOkOk
X012-AOkOk
X013-AOkOk
X014-AOkOk
X015-AOkOk
X016-AOkOk
X017-AOkOk
X018-AOkOk
X019-AOkOk
X020-AOkOk
X021-AOkOk
X022-AOkOk
X023-AOkOk
X024-AOkOk
X025-AOkOk
X026-AOkOk
X027-AOkOk
X028-AOkOk
X029-AOkOk
X030-AOkOk
X031-AOkOk
X032-AOkOk
X033-AOkOk
X034-AOkOk
X035-AOkOk
X036-AOkOk
X037-AOkOk
X038-AOkOk
X039-AOkOk
X040-AOkOk
X041-AOkOk
X042-AOkOk
X043-AOkOk
X044-AOkOk
X045-AOkOk
X046-AOkOk
X047-AOkOk
X048-AOkOk
X049-AOkOk
X050-AOkOk
X051-AOkOk
X052-AOkOk
X053-AOkOk
X054-AOkOk
X055-AOkOk
X056-AOkOk
X057-AOkOk
    
 C.XS.X
X000-LOkOk
X001-LOkOk
X002-LOkOk
X003-LOkOk
X004-LOkOk
X005-LOkOk
X006-LOkOk
X007-LOkOk
X008-LOkOk
X009-LOkOk
X010-LOkOk
X011-LOkOk
X012-LOkOk
X013-LOkOk
X014-LOkOk
X015-LOkOk
X016-LOkOk
X017-LOkOk
X018-LOkOk
X019-LOkOk
X020-LOkOk
X021-LOkOk
X022-LOkOk
X023-LOkOk
X024-LOkOk
X025-LOkOk
X026-LOkOk
X027-LOkOk
X028-LOkOk
X029-LOkOk
X030-LOkOk
X031-LOkOk
X032-LOkOk
X033-LOkOk
X034-LOkOk
X035-LOkOk
X036-LOkOk
X037-LOkOk
X038-LOkOk
X039-LOkOk
X040-LOkOk
X041-LOkOk
X042-LOkOk
X043-LOkOk
X044-LOkOk
X045-LOkOk
X046-LOkOk
X047-LOkOk
X048-LOkOk
X049-LOkOk
X050-LOkOk
X051-LOkOk
X052-LOkOk
X053-LOkOk
X054-LOkOk
X055-LOkOk
X056-LOkOk
X057-LOkOk

More precisely, remembering that “No” means that some non-SC behaviour have been observed, while “Ok” means that all observed behaviours are SC, we see that, for each test, offence experimentally restores SC with three different techniques. The leftmost table also shows the number of non-SC outcomes observed/total number of outcomes collected. Notice that the tests produced by offence have been run at least as many times as the initial tests.

For Power, we run the tests on three machines abducens (Power 6 with 4 cores 2-way SMT running Linux), vargas (Power 6 with 32 cores 2-way SMT running AIX) and power7 (Power 7 with 8 cores 4-way SMT running Linux) We performed several runs of the complete batch of tests, varying testing conditions (see litmus documentation). The variations are recorded in files abducens.txt, vargas.txt and power7.txt. We aggregate run logs into A.X, V.X, and P.X, respectively.

The tables resulting from the experiment are in a separate page. One easily checks that all outcomes collected by running offence generated programs are SC, by seeing that no cell in the last four tables ever contains ”No”. As a curiosity, notice that a few of the initial tests do not exhibit any non-SC behaviour, see PCC170 for instance.

Complete sources and logs: for x86, X86-SIMPLE (archive) and Power, PPC-SIMPLE (archive).


Up Next