memory-barriers.txt
http://infocenter.arm.com/help/topic/com.arm.doc.genc007826/Barrier_Litmus_Tests_and_Cookbook_A08.pdf

-----

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka14041.html
states:

  It is architecturally defined that software must perform a Data Memory
  Barrier (DMB) operation:
  - between acquiring a resource, for example, through locking a mutex
    (MUTual EXclusion) or decrementing a semaphore, and making any access
    to that resource
  - before making a resource available, for example, through unlocking a
    mutex or incrementing a semaphore.

-----

To generate a scheduling barrier:
  asm volatile ("" : : : "memory");

To generate a hardware memory barrier:
  __sync_synchronize();

This appears to issue a "dmb sy" on cortex-m3 in arm-eabi-none


Based on these excerpts from a gcc-help thread. References:
http://gcc.gnu.org/ml/gcc-help/2011-04/msg00166.html
http://gcc.gnu.org/ml/gcc-help/2011-04/msg00168.html
http://gcc.gnu.org/ml/gcc-help/2011-04/msg00180.html

From: Ian Lance Taylor <iant at google dot com>
Date: Mon, 11 Apr 2011 14:42:07 -0700
Subject: Re: full memory barrier?

Hei Chan <structurechart at yahoo dot com> writes:

> I am a little bit confused what asm volatile ("" : : : "memory") does.
>
> I searched online; many people said that it creates the "full memory barrier".
>
> I have a test code:
> int main() {
>         bool bar;
>         asm volatile ("" : : : "memory");
>         bar = true;
>         return 1;
> }
>
> Running g++ -c -g -Wa,-a,-ad foo.cpp gives me:
>
>    2:foo.cpp       ****         bool bar;
>    3:foo.cpp       ****         asm volatile ("" : : : "memory");
>   22                            .loc 1 3 0
>    4:foo.cpp       ****         bar = true;
>   23                            .loc 1 4 0
>
> It doesn't involve any fence instruction.
>
> Maybe I completely misunderstand the idea of "full memory barrier".

The definition of "memory barrier" is ambiguous when looking at code
written in a high-level language.

The statement "asm volatile ("" : : : "memory");" is a compiler
scheduling barrier for all expressions that load from or store values to
memory. That means something like a pointer dereference, an array
index, or an access to a volatile variable. It may or may not include a
reference to a local variable, as a local variable need not be in
memory.

This kind of compiler scheduling barrier can be used in conjunction with
a hardware memory barrier. The compiler doesn't know that a hardware
memory barrier is special, and it will happily move memory access
instructions across the hardware barrier. Therefore, if you want to use
a hardware memory barrier in compiled code, you must use it along with a
compiler scheduling barrier.

On the other hand a compiler scheduling barrier can be useful even
without a hardware memory barrier. For example, in a coroutine based
system with multiple light-weight threads running on a single processor,
you need a compiler scheduling barrier, but you do not need a hardware
memory barrier.

gcc will generate a hardware memory barrier if you use the
__sync_synchronize builtin function. That function acts as both a
hardware memory barrier and a compiler scheduling barrier.

Ian

-----

From: Ian Lance Taylor <iant at google dot com>
Date: Mon, 11 Apr 2011 15:20:27 -0700
Subject: Re: full memory barrier?

Hei Chan <structurechart at yahoo dot com> writes:

> You mentioned the statement "is a compiler scheduling barrier for all
> expressions that load from or store values to memory". Does "memory" mean the
> main memory? Or does it include the CPU cache?

I tried to explain what I meant by way of example. It means pointer
reference, array reference, volatile variable access.
Also I should have added global variable access. In general it means
memory from the point of view of the compiler. The compiler doesn't know
anything about the CPU cache. When thinking about a "compiler scheduling
barrier," you have to think about the world that the compiler sees,
which is quite different from, though obviously related to, the world
that the hardware sees.

Ian

-----

From: Ian Lance Taylor <iant at google dot com>
Date: Tue, 12 Apr 2011 15:36:58 -0700
Subject: Re: full memory barrier?

David Brown <david at westcontrol dot com> writes:

> On 11/04/2011 23:42, Ian Lance Taylor wrote:
>>
>> The definition of "memory barrier" is ambiguous when looking at code
>> written in a high-level language.
>>
>> The statement "asm volatile ("" : : : "memory");" is a compiler
>> scheduling barrier for all expressions that load from or store values to
>> memory. That means something like a pointer dereference, an array
>> index, or an access to a volatile variable. It may or may not include a
>> reference to a local variable, as a local variable need not be in
>> memory.
>>
>
> Is there any precise specifications for what counts as "memory" here?
> As gcc gets steadily smarter, it gets harder to be sure that
> order-specific code really is correctly ordered, while letting the
> compiler do its magic on the rest of the code.

I'm not aware of a precise specification. It would be something like
the list I made above, to which I would add global variables. But
you're right, as the compiler gets smarter, it is increasingly able to
lift things out of memory. I suppose that in the extreme case, it's
possible that only volatile variables count.


> For example, if you have code like this:
>
> static int x;
> void test(void) {
>     x = 1;
>     asm volatile ("" : : : "memory");
>     x = 2;
> }
>
> The variable "x" is not volatile - can the compiler remove the
> assignment "x = 1"? Perhaps with aggressive optimisation, the
> compiler will figure out how and when x is used, and discover that it
> doesn't need to store it in memory at all, but can keep it in a
> register (perhaps all uses have ended up inlined inside the same
> function). Then "x" is no longer in memory - will it still be
> affected by the memory clobber?

If the compiler manages to lift x into a register, then it will not be
affected by the memory clobber, and, yes, the compiler would most likely
remove the assignment "x = 1".


> Also, is there any way to specify a more limited clobber than just
> "memory", so that the compiler has as much freedom as possible?
> Typical examples are to specify "clobbers" for just certain variables,
> leaving others unaffected, or to distinguish between reads and writes.
> For example, you might want to say "all writes should be completed by
> this point, but data read into registers will stay valid".
>
> Some of this can be done with volatile accesses in different ways, but
> not always optimally, and not always clearly.

You can clobber certain variables by listing them in the output of the
asm statement. There is no way to distinguish between reads and writes.

Ian
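
-----

A minimal sketch pulling the pieces of the thread together, assuming GCC
or a compatible compiler. The names compiler_barrier, hardware_barrier,
store_twice, store_with_barriers and barrier_self_check are made up for
illustration and do not come from the thread. The "+m" (x) operand is one
way of listing a variable in the output of the asm statement, as Ian
suggests for a more limited clobber; treat it as an assumption about how
that advice is usually written, not as the only form.

```c
#include <assert.h>

/* Hypothetical helper names; only the two statements they wrap come
 * from the notes above. Assumes GCC or a compatible compiler. */

/* Compiler scheduling barrier only: emits no instruction. */
#define compiler_barrier() __asm__ __volatile__ ("" : : : "memory")

/* Hardware memory barrier; per the thread it also acts as a
 * compiler scheduling barrier. */
#define hardware_barrier() __sync_synchronize()

static int x;

/* Limited clobber: listing x as a read-write memory operand tells the
 * compiler that the (empty) asm may read and modify x, so the store of
 * 1 should not be dropped, while other variables are left alone. */
int store_twice(void)
{
    x = 1;
    __asm__ __volatile__ ("" : "+m" (x));
    x = 2;
    return x;
}

/* Store, apply both barriers, then read back. On a single thread this
 * only shows that the barriers compile and leave the value intact. */
int store_with_barriers(int value)
{
    x = value;
    compiler_barrier();
    hardware_barrier();
    return x;
}

/* Tiny self-check exercising both helpers. */
void barrier_self_check(void)
{
    assert(store_twice() == 2);
    assert(store_with_barriers(7) == 7);
}
```

Compiling this with "gcc -O2 -S" and reading the generated assembly is
the way to confirm which barrier instruction, if any, a given target
emits for hardware_barrier().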