m3dev

cortex m3 debug tools -- superseded by mdebug
git clone http://frotz.net/git/m3dev.git

memory-barriers.txt


http://infocenter.arm.com/help/topic/com.arm.doc.genc007826/Barrier_Litmus_Tests_and_Cookbook_A08.pdf

-----

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka14041.html
states:

It is architecturally defined that software must perform a Data Memory
Barrier (DMB) operation:
- between acquiring a resource, for example, through locking a mutex
  (MUTual EXclusion) or decrementing a semaphore, and making any access
  to that resource
- before making a resource available, for example, through unlocking a
  mutex or incrementing a semaphore.

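A minimal sketch of how that rule might look for a simple spin lock on
cortex-m3, assuming gcc; the lock variable and function names are
illustrative, not from the ARM note:

   /* Hypothetical shared lock flag. */
   static volatile int lock;

   static inline void dmb(void)
   {
       asm volatile ("dmb" : : : "memory");  /* data memory barrier */
   }

   void lock_acquire(void)
   {
       /* Spin until the previous value was 0, i.e. we now own the lock. */
       while (__sync_lock_test_and_set(&lock, 1))
           ;
       dmb();   /* DMB between acquiring the resource and accessing it */
   }

   void lock_release(void)
   {
       dmb();   /* DMB before making the resource available again */
       lock = 0;
   }
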
-----

To generate a compiler scheduling barrier:
   asm volatile ("" : : : "memory");

To generate a hardware memory barrier:
   __sync_synchronize();

   This appears to issue a "dmb sy" on cortex-m3 with arm-none-eabi gcc.

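The two can be wrapped up as macros; a small sketch (the names barrier()
and mb() are just placeholders):

   /* Compiler scheduling barrier only: no instruction is emitted. */
   #define barrier()  asm volatile ("" : : : "memory")

   /* Hardware memory barrier; also acts as a compiler scheduling barrier.
      With arm-none-eabi gcc for cortex-m3 this should show up as "dmb sy"
      in the disassembly (e.g. arm-none-eabi-objdump -d). */
   #define mb()       __sync_synchronize()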

Based on these excerpts from a gcc-help thread.  References:
http://gcc.gnu.org/ml/gcc-help/2011-04/msg00166.html
http://gcc.gnu.org/ml/gcc-help/2011-04/msg00168.html
http://gcc.gnu.org/ml/gcc-help/2011-04/msg00180.html

From: Ian Lance Taylor <iant at google dot com>
Date: Mon, 11 Apr 2011 14:42:07 -0700
Subject: Re: full memory barrier?

Hei Chan <structurechart at yahoo dot com> writes:

> I am a little bit confused what asm volatile ("" : : : "memory") does.
>
> I searched online; many people said that it creates the "full memory barrier".
>
> I have a test code:
> int main() {
>         bool bar;
>         asm volatile ("" : : : "memory");
>         bar = true;
>         return 1;
> }
>
> Running g++ -c -g -Wa,-a,-ad foo.cpp gives me:
>
>    2:foo.cpp       ****         bool bar;
>    3:foo.cpp       ****         asm volatile ("" : : : "memory");
>   22                            .loc 1 3 0
>    4:foo.cpp       ****         bar = true;
>   23                            .loc 1 4 0
>
> It doesn't involve any fence instruction.
>
> Maybe I completely misunderstand the idea of "full memory barrier".

The definition of "memory barrier" is ambiguous when looking at code
written in a high-level language.

The statement "asm volatile ("" : : : "memory");" is a compiler
scheduling barrier for all expressions that load from or store values to
memory.  That means something like a pointer dereference, an array
index, or an access to a volatile variable.  It may or may not include a
reference to a local variable, as a local variable need not be in
memory.

This kind of compiler scheduling barrier can be used in conjunction with
a hardware memory barrier.  The compiler doesn't know that a hardware
memory barrier is special, and it will happily move memory access
instructions across the hardware barrier.  Therefore, if you want to use
a hardware memory barrier in compiled code, you must use it along with a
compiler scheduling barrier.

On the other hand a compiler scheduling barrier can be useful even
without a hardware memory barrier.  For example, in a coroutine based
system with multiple light-weight threads running on a single processor,
you need a compiler scheduling barrier, but you do not need a hardware
memory barrier.

gcc will generate a hardware memory barrier if you use the
__sync_synchronize builtin function.  That function acts as both a
hardware memory barrier and a compiler scheduling barrier.

Ian

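As a concrete sketch of the last two points, publishing data through a
shared flag, where __sync_synchronize() supplies both the hardware and
the compiler barrier (the variable names and the scenario are
illustrative):

   /* Assumed to be shared with some other observer, e.g. an interrupt
      handler or another bus master. */
   static int shared_data;
   static volatile int data_ready;

   void publish(int value)
   {
       shared_data = value;
       __sync_synchronize();  /* the store to shared_data is ordered
                                 before the flag, both by the compiler
                                 and by the hardware */
       data_ready = 1;
   }
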
-----

From: Ian Lance Taylor <iant at google dot com>
Date: Mon, 11 Apr 2011 15:20:27 -0700
Subject: Re: full memory barrier?

Hei Chan <structurechart at yahoo dot com> writes:

> You mentioned the statement "is a compiler scheduling barrier for all
> expressions that load from or store values to memory".  Does "memory" mean the
> main memory?  Or does it include the CPU cache?

I tried to explain what I meant by way of example.  It means pointer
reference, array reference, volatile variable access.  Also I should
have added global variable access.  In general it means memory from the
point of view of the compiler.  The compiler doesn't know anything about
the CPU cache.  When thinking about a "compiler scheduling barrier," you
have to think about the world that the compiler sees, which is quite
different from, though obviously related to, the world that the hardware
sees.

Ian

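A short sketch of that distinction, using made-up names; which accesses
count as "memory" here follows the explanation above rather than any
precise specification:

   int global_counter;                    /* global variable: memory */

   void example(int *p)
   {
       int local = 3;                     /* may live only in a register */

       *p = 1;                            /* pointer dereference: memory */
       global_counter++;                  /* global access: memory */

       asm volatile ("" : : : "memory");  /* compiler scheduling barrier */

       /* The compiler must assume *p and global_counter may have been
          read or written by the asm, so it cannot cache them in
          registers across the barrier.  'local' need not be in memory
          at all, so the clobber may or may not affect it. */
       global_counter += *p + local;
   }
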
-----

From: Ian Lance Taylor <iant at google dot com>
Date: Tue, 12 Apr 2011 15:36:58 -0700
Subject: Re: full memory barrier?

David Brown <david at westcontrol dot com> writes:

> On 11/04/2011 23:42, Ian Lance Taylor wrote:
>>
>> The definition of "memory barrier" is ambiguous when looking at code
>> written in a high-level language.
>>
>> The statement "asm volatile ("" : : : "memory");" is a compiler
>> scheduling barrier for all expressions that load from or store values to
>> memory.  That means something like a pointer dereference, an array
>> index, or an access to a volatile variable.  It may or may not include a
>> reference to a local variable, as a local variable need not be in
>> memory.
>>
>
> Is there any precise specification for what counts as "memory" here?
> As gcc gets steadily smarter, it gets harder to be sure that
> order-specific code really is correctly ordered, while letting the
> compiler do its magic on the rest of the code.

I'm not aware of a precise specification.  It would be something like
the list I made above, to which I would add global variables.  But
you're right, as the compiler gets smarter, it is increasingly able to
lift things out of memory.  I suppose that in the extreme case, it's
possible that only volatile variables count.

> For example, if you have code like this:
>
> static int x;
> void test(void) {
> 	x = 1;
> 	asm volatile ("" : : : "memory");
> 	x = 2;
> }
>
> The variable "x" is not volatile - can the compiler remove the
> assignment "x = 1"?  Perhaps with aggressive optimisation, the
> compiler will figure out how and when x is used, and discover that it
> doesn't need to store it in memory at all, but can keep it in a
> register (perhaps all uses have ended up inlined inside the same
> function).  Then "x" is no longer in memory - will it still be
> affected by the memory clobber?

If the compiler manages to lift x into a register, then it will not be
affected by the memory clobber, and, yes, the compiler would most likely
remove the assignment "x = 1".

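For comparison, a sketch of the obvious variant with x made volatile, in
which case both stores have to be performed (not from the thread, just
one common way to keep the assignments):

   static volatile int x;

   void test(void)
   {
       x = 1;   /* cannot be removed: x is volatile */
       asm volatile ("" : : : "memory");
       x = 2;
   }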

> Also, is there any way to specify a more limited clobber than just
> "memory", so that the compiler has as much freedom as possible?
> Typical examples are to specify "clobbers" for just certain variables,
> leaving others unaffected, or to distinguish between reads and writes.
> For example, you might want to say "all writes should be completed by
> this point, but data read into registers will stay valid".
>
> Some of this can be done with volatile accesses in different ways, but
> not always optimally, and not always clearly.

You can clobber certain variables by listing them in the output of the
asm statement.  There is no way to distinguish between reads and writes.

Ian

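A sketch of the narrower clobber described in the last answer, listing a
single variable as an output of the asm; the "+m" constraint is one way
to spell it and is an assumption here, not a quote from the thread:

   static int x, y;

   void narrow_clobber(void)
   {
       x = 1;
       y = 1;

       /* Only x is declared as read and written by the asm; accesses
          to y remain free to move across it. */
       asm volatile ("" : "+m" (x));

       x = 2;
       y = 2;
   }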