Timestamp:
Aug 16, 2017, 1:38:57 PM (8 years ago)
Author:
[email protected]
Message:

Enhance MacroAssembler::probe() to support an initializeStackFunction callback.
https://bugs.webkit.org/show_bug.cgi?id=175617
<rdar://problem/33912104>

Reviewed by JF Bastien.

This patch adds a new feature to MacroAssembler::probe() where the probe function
can provide a ProbeFunction callback to fill in stack values after the stack
pointer has been adjusted. The probe function can use this feature as follows:

  1. Set the new sp value in the ProbeContext's CPUState.
  2. Set the ProbeContext's initializeStackFunction to a ProbeFunction callback which will do the work of filling in the stack values after the probe trampoline has adjusted the machine stack pointer.
  3. Set the ProbeContext's initializeStackArg to any value that the client wants to pass to the initializeStackFunction callback.
  4. Return from the probe function.
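
The four steps above can be sketched with a simplified model. This is only an illustration: MockProbeContext, MockProbeFunction, probe, fillStack, and runMockProbe are hypothetical stand-ins, not the real JSC types, and a plain array plays the role of the machine stack. The model does not relocate the ProbeContext the way the real trampoline does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for JSC's ProbeContext and CPUState.
struct MockProbeContext;
using MockProbeFunction = void (*)(MockProbeContext*);

struct MockProbeContext {
    uintptr_t* sp { nullptr };                              // models the sp value in CPUState
    MockProbeFunction initializeStackFunction { nullptr };  // null unless the probe sets it
    void* initializeStackArg { nullptr };
};

// The callback from step 2: writes only to slots at or above the new sp.
static void fillStack(MockProbeContext* context)
{
    uintptr_t count = reinterpret_cast<uintptr_t>(context->initializeStackArg);
    for (uintptr_t i = 0; i < count; ++i)
        context->sp[i] = 0xbeef0000u + i;
}

// The probe function, following steps 1 to 4.
static void probe(MockProbeContext* context)
{
    context->sp -= 4;                                                     // 1. new sp (stack grows down)
    context->initializeStackFunction = fillStack;                         // 2. callback
    context->initializeStackArg = reinterpret_cast<void*>(uintptr_t(4));  // 3. argument
}                                                                         // 4. return

// Models what the trampoline does around the probe call.
static std::vector<uintptr_t> runMockProbe()
{
    std::vector<uintptr_t> stack(16, 0);
    MockProbeContext context;
    context.sp = stack.data() + 8;

    probe(&context);
    if (context.initializeStackFunction)
        context.initializeStackFunction(&context);
    return stack;
}
```

Calling runMockProbe() leaves slots 4 through 7 of the mock stack filled and everything below the new sp untouched. A probe that never sets initializeStackFunction takes the original path, which matches the trampoline zeroing that field before it calls the probe function.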

Upon returning from the probe function, the probe trampoline will adjust the
stack pointer based on the sp value in CPUState. If initializeStackFunction
is not set, the probe trampoline will restore registers and return to its caller.

If initializeStackFunction is set, the trampoline will move the ProbeContext
beyond the range of the stack pointer, i.e. it will place the new ProbeContext at
an address lower than where CPUState.sp() points. This ensures that the
ProbeContext will not be trashed by the initializeStackFunction when it writes to
the stack. Then, the trampoline will call back to the initializeStackFunction
ProbeFunction to let it fill in the stack values as desired. The
initializeStackFunction ProbeFunction will be passed the moved ProbeContext at
the new location.
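
The relocation decision described above can be sketched as plain address arithmetic. This is a rough model of the cmp / sub / bic sequence in the ARMv7 trampoline, not the implementation itself: safeProbeContextAddress is a hypothetical helper, and probeSize is a placeholder value, since the real PROBE_SIZE depends on the ProbeContext layout.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder sizes for illustration only.
constexpr uintptr_t probeSize = 0x200;        // hypothetical stand-in for PROBE_SIZE
constexpr uintptr_t outSize = 4;              // OUT_SIZE is GPREG_SIZE (4) on ARMv7
constexpr uintptr_t stackAlignmentMask = 0xf; // the trampoline aligns to 16 bytes

// Returns the address the ProbeContext should occupy before
// initializeStackFunction is called.
constexpr uintptr_t safeProbeContextAddress(uintptr_t context, uintptr_t resultSP)
{
    // If the context plus its OUT_SIZE buffer already ends at or below the
    // result sp, it can stay where it is. Otherwise it moves to an aligned
    // address far enough below the result sp to hold it.
    return resultSP >= context + probeSize + outSize
        ? context
        : ((resultSP - (probeSize + outSize)) & ~stackAlignmentMask);
}
```

For example, with the context at 0x10000 and a result sp of 0x10100, the context would overlap the region initializeStackFunction may write to, so the sketch relocates it to 0xfef0.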

initializeStackFunction may now write to the stack at addresses greater than or
equal to CPUState.sp(), but not below that. initializeStackFunction is also
not allowed to change CPUState.sp(). If the initializeStackFunction does not
abide by these rules, then behavior is undefined, and bad things may happen.

For future reference, some implementation details that this patch needed to
be mindful of:

  1. When the probe trampoline allocates stack space for the ProbeContext, it should include OUT_SIZE as well. This ensures that it doesn't have to move the ProbeContext on exit if the probe function didn't change the sp.
  2. If the trampoline has to move the ProbeContext, it needs to point the machine sp to the new ProbeContext first before copying over the ProbeContext data. This protects the new ProbeContext from possibly being trashed by interrupts.
  3. When computing the new address of ProbeContext to move to, we need to make sure that it is properly aligned in accordance with stack ABI requirements (just like we did when we allocated the ProbeContext on entry to the probe trampoline).
  4. When copying the ProbeContext to its new location, the trampoline should always copy words from low addresses to high addresses. This is because if we're moving the ProbeContext, we'll always be moving it to a lower address.
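
The last point, copy direction, can be illustrated with a small sketch. copyWordsLowToHigh and demoOverlappingMoveDown are hypothetical names, and the array is a stand-in for stack memory. The sketch shows why a low-to-high copy stays correct when the destination overlaps the source but sits at a lower address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies words starting at the lowest address. With overlapping ranges this
// is only safe when dst is below src: each source word is read before any
// later store can overwrite it.
static void copyWordsLowToHigh(uint32_t* dst, const uint32_t* src, size_t wordCount)
{
    for (size_t i = 0; i < wordCount; ++i)
        dst[i] = src[i];
}

static bool demoOverlappingMoveDown()
{
    // Source occupies slots 4..11, destination occupies slots 2..9.
    uint32_t stack[12] = { 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8 };
    copyWordsLowToHigh(stack + 2, stack + 4, 8);

    for (uint32_t i = 0; i < 8; ++i) {
        if (stack[2 + i] != i + 1)
            return false;
    }
    return true;
}
```

Copying in the opposite direction with the same layout would overwrite source words before they are read, which is why the trampoline's copy loop walks upward.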
  • assembler/MacroAssembler.h:
  • assembler/MacroAssemblerARM.cpp:
  • assembler/MacroAssemblerARM64.cpp:
  • assembler/MacroAssemblerARMv7.cpp:
  • assembler/MacroAssemblerX86Common.cpp:
  • assembler/testmasm.cpp:

(JSC::testProbePreservesGPRS):
(JSC::testProbeModifiesStackPointer):
(JSC::fillStack):
(JSC::testProbeModifiesStackWithCallback):
(JSC::run):

File:
1 edited

  • trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.cpp

--- trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.cpp (r220579)
+++ trunk/Source/JavaScriptCore/assembler/MacroAssemblerARMv7.cpp (r220807)
@@ -43,6 +43,8 @@
 #define PROBE_PROBE_FUNCTION_OFFSET (0 * PTR_SIZE)
 #define PROBE_ARG_OFFSET (1 * PTR_SIZE)
-
-#define PROBE_FIRST_GPREG_OFFSET (2 * PTR_SIZE)
+#define PROBE_INIT_STACK_FUNCTION_OFFSET (2 * PTR_SIZE)
+#define PROBE_INIT_STACK_ARG_OFFSET (3 * PTR_SIZE)
+
+#define PROBE_FIRST_GPREG_OFFSET (4 * PTR_SIZE)
 
 #define GPREG_SIZE 4
     
@@ -102,6 +104,8 @@
 #define PROBE_CPU_D30_OFFSET (PROBE_FIRST_FPREG_OFFSET + (30 * FPREG_SIZE))
 #define PROBE_CPU_D31_OFFSET (PROBE_FIRST_FPREG_OFFSET + (31 * FPREG_SIZE))
+
 #define PROBE_SIZE (PROBE_FIRST_FPREG_OFFSET + (32 * FPREG_SIZE))
-#define PROBE_ALIGNED_SIZE (PROBE_SIZE)
+
+#define OUT_SIZE GPREG_SIZE
 
 // These ASSERTs remind you that if you change the layout of ProbeContext,
     
@@ -110,4 +114,6 @@
 COMPILE_ASSERT(PROBE_OFFSETOF(probeFunction) == PROBE_PROBE_FUNCTION_OFFSET, ProbeContext_probeFunction_offset_matches_ctiMasmProbeTrampoline);
 COMPILE_ASSERT(PROBE_OFFSETOF(arg) == PROBE_ARG_OFFSET, ProbeContext_arg_offset_matches_ctiMasmProbeTrampoline);
+COMPILE_ASSERT(PROBE_OFFSETOF(initializeStackFunction) == PROBE_INIT_STACK_FUNCTION_OFFSET, ProbeContext_initializeStackFunction_offset_matches_ctiMasmProbeTrampoline);
+COMPILE_ASSERT(PROBE_OFFSETOF(initializeStackArg) == PROBE_INIT_STACK_ARG_OFFSET, ProbeContext_initializeStackArg_offset_matches_ctiMasmProbeTrampoline);
 
 COMPILE_ASSERT(!(PROBE_CPU_R0_OFFSET & 0x3), ProbeContext_cpu_r0_offset_should_be_4_byte_aligned);
     
@@ -133,5 +139,5 @@
 COMPILE_ASSERT(PROBE_OFFSETOF(cpu.sprs[ARMRegisters::fpscr]) == PROBE_CPU_FPSCR_OFFSET, ProbeContext_cpu_fpscr_offset_matches_ctiMasmProbeTrampoline);
 
-COMPILE_ASSERT(!(PROBE_CPU_D0_OFFSET & 0xf), ProbeContext_cpu_d0_offset_should_be_16_byte_aligned);
+COMPILE_ASSERT(!(PROBE_CPU_D0_OFFSET & 0x7), ProbeContext_cpu_d0_offset_should_be_8_byte_aligned);
 
 COMPILE_ASSERT(PROBE_OFFSETOF(cpu.fprs[ARMRegisters::d0]) == PROBE_CPU_D0_OFFSET, ProbeContext_cpu_d0_offset_matches_ctiMasmProbeTrampoline);
     
@@ -170,6 +176,4 @@
 
 COMPILE_ASSERT(sizeof(ProbeContext) == PROBE_SIZE, ProbeContext_size_matches_ctiMasmProbeTrampoline);
-COMPILE_ASSERT(!(PROBE_ALIGNED_SIZE & 0xf), ProbeContext_aligned_size_offset_should_be_16_byte_aligned);
-
 #undef PROBE_OFFSETOF
 
     
@@ -194,9 +198,9 @@
     "mov       ip, sp" "\n"
     "mov       r0, sp" "\n"
-    "sub       r0, r0, #" STRINGIZE_VALUE_OF(PROBE_ALIGNED_SIZE) "\n"
+    "sub       r0, r0, #" STRINGIZE_VALUE_OF(PROBE_SIZE + OUT_SIZE) "\n"
 
     // The ARM EABI specifies that the stack needs to be 16 byte aligned.
     "bic       r0, r0, #0xf" "\n"
-    "mov       sp, r0" "\n"
+    "mov       sp, r0" "\n" // Set the sp to protect the ProbeContext from interrupts before we initialize it.
 
     "str       lr, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_PC_OFFSET) "]" "\n"
     
@@ -229,7 +233,53 @@
     "mov       fp, sp" "\n" // Save the ProbeContext*.
 
+    // Initialize ProbeContext::initializeStackFunction to zero.
+    "mov       r0, #0" "\n"
+    "str       r0, [fp, #" STRINGIZE_VALUE_OF(PROBE_INIT_STACK_FUNCTION_OFFSET) "]" "\n"
+
     "ldr       ip, [sp, #" STRINGIZE_VALUE_OF(PROBE_PROBE_FUNCTION_OFFSET) "]" "\n"
     "mov       r0, sp" "\n" // the ProbeContext* arg.
     "blx       ip" "\n"
+
+    // Make sure the ProbeContext is entirely below the result stack pointer so
+    // that register values are still preserved when we call the initializeStack
+    // function.
+    "ldr       r1, [fp, #" STRINGIZE_VALUE_OF(PROBE_CPU_SP_OFFSET) "]" "\n" // Result sp.
+    "add       r2, fp, #" STRINGIZE_VALUE_OF(PROBE_SIZE + OUT_SIZE) "\n" // End of ProveContext + buffer.
+    "cmp       r1, r2" "\n"
+    "it        ge" "\n"
+    "bge     " LOCAL_LABEL_STRING(ctiMasmProbeTrampolineProbeContextIsSafe) "\n"
+
+    // Allocate a safe place on the stack below the result stack pointer to stash the ProbeContext.
+    "sub       r1, r1, #" STRINGIZE_VALUE_OF(PROBE_SIZE + OUT_SIZE) "\n"
+    "bic       r1, r1, #0xf" "\n" // The ARM EABI specifies that the stack needs to be 16 byte aligned.
+    "mov       sp, r1" "\n" // Set the new sp to protect that memory from interrupts before we copy the ProbeContext.
+
+    // Copy the ProbeContext to the safe place.
+    // Note: we have to copy from low address to higher address because we're moving the
+    // ProbeContext to a lower address.
+    "mov       r5, fp" "\n"
+    "mov       r6, r1" "\n"
+    "add       r7, fp, #" STRINGIZE_VALUE_OF(PROBE_SIZE) "\n"
+
+    LOCAL_LABEL_STRING(ctiMasmProbeTrampolineCopyLoop) ":" "\n"
+    "ldr       r3, [r5], #4" "\n"
+    "ldr       r4, [r5], #4" "\n"
+    "str       r3, [r6], #4" "\n"
+    "str       r4, [r6], #4" "\n"
+    "cmp       r5, r7" "\n"
+    "it        lt" "\n"
+    "blt     " LOCAL_LABEL_STRING(ctiMasmProbeTrampolineCopyLoop) "\n"
+
+    "mov       fp, r1" "\n"
+
+    // Call initializeStackFunction if present.
+    LOCAL_LABEL_STRING(ctiMasmProbeTrampolineProbeContextIsSafe) ":" "\n"
+    "ldr       r2, [fp, #" STRINGIZE_VALUE_OF(PROBE_INIT_STACK_FUNCTION_OFFSET) "]" "\n"
+    "cbz       r2, " LOCAL_LABEL_STRING(ctiMasmProbeTrampolineRestoreRegisters) "\n"
+
+    "mov       r0, fp" "\n" // Set the ProbeContext* arg.
+    "blx       r2" "\n" // Call the initializeStackFunction (loaded into r2 above).
+
+    LOCAL_LABEL_STRING(ctiMasmProbeTrampolineRestoreRegisters) ":" "\n"
 
     "mov       sp, fp" "\n"
     
@@ -248,82 +298,11 @@
 
     // There are 5 more registers left to restore: ip, sp, lr, pc, and apsr.
-    // There are 2 issues that complicate the restoration of these last few
-    // registers:
-    //
-    // 1. Normal ARM calling convention relies on moving lr to pc to return to
-    //    the caller. In our case, the address to return to is specified by
-    //    ProbeContext.cpu.gprs[pc]. And at that moment, we won't have any available
-    //    scratch registers to hold the return address (lr needs to hold
-    //    ProbeContext.cpu.gprs[lr], not the return address).
-    //
-    //    The solution is to store the return address on the stack and load the
-    //    pc from there.
-    //
-    // 2. Issue 1 means we will need to write to the stack location at
-    //    ProbeContext.cpu.gprs[sp] - PTR_SIZE. But if the user probe function had
-    //    modified the value of ProbeContext.cpu.gprs[sp] to point in the range between
-    //    &ProbeContext.cpu.gprs[ip] thru &ProbeContext.cpu.sprs[aspr], then the action
-    //    for Issue 1 may trash the values to be restored before we can restore them.
-    //
-    //    The solution is to check if ProbeContext.cpu.gprs[sp] contains a value in
-    //    the undesirable range. If so, we copy the remaining ProbeContext
-    //    register data to a safe area first, and restore the remaining register
-    //    from this new safe area.
-
-    // The restore area for the pc will be located at 1 word below the resultant sp.
-    // All restore values are located at offset <= PROBE_CPU_APSR_OFFSET. Hence,
-    // we need to make sure that resultant sp > offset of apsr + 1.
-    "add       ip, sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_APSR_OFFSET + PTR_SIZE) "\n"
+
+    // Set up the restore area for sp and pc.
     "ldr       lr, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_SP_OFFSET) "]" "\n"
-    "cmp       lr, ip" "\n"
-    "it        gt" "\n"
-    "bgt     " SYMBOL_STRING(ctiMasmProbeTrampolineEnd) "\n"
-
-    // Getting here means that the restore area will overlap the ProbeContext data
-    // that we will need to get the restoration values from. So, let's move that
-    // data to a safe place before we start writing into the restore area.
-    // Let's locate the "safe area" at 2x sizeof(ProbeContext) below where the
-    // restore area. This ensures that:
-    // 1. The safe area does not overlap the restore area.
-    // 2. The safe area does not overlap the ProbeContext.
-    //    This makes it so that we can use memcpy (does not require memmove) semantics
-    //    to copy the restore values to the safe area.
-
-    // lr already contains [sp, #STRINGIZE_VALUE_OF(PROBE_CPU_SP_OFFSET)].
-    "sub       lr, lr, #(2 * " STRINGIZE_VALUE_OF(PROBE_ALIGNED_SIZE) ")" "\n"
-
-    "mov       ip, sp" "\n" // Save the original ProbeContext*.
-
-    // Make sure the stack pointer points to the safe area. This ensures that the
-    // safe area is protected from interrupt handlers overwriting it.
-    "mov       sp, lr" "\n" // sp now points to the new ProbeContext in the safe area.
-
-    "mov       lr, ip" "\n" // Use lr as the old ProbeContext*.
-
-    // Copy the restore data to the new ProbeContext*.
-    "ldr       ip, [lr, #" STRINGIZE_VALUE_OF(PROBE_CPU_IP_OFFSET) "]" "\n"
-    "str       ip, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_IP_OFFSET) "]" "\n"
-    "ldr       ip, [lr, #" STRINGIZE_VALUE_OF(PROBE_CPU_SP_OFFSET) "]" "\n"
-    "str       ip, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_SP_OFFSET) "]" "\n"
-    "ldr       ip, [lr, #" STRINGIZE_VALUE_OF(PROBE_CPU_LR_OFFSET) "]" "\n"
-    "str       ip, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_LR_OFFSET) "]" "\n"
-    "ldr       ip, [lr, #" STRINGIZE_VALUE_OF(PROBE_CPU_PC_OFFSET) "]" "\n"
-    "str       ip, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_PC_OFFSET) "]" "\n"
-    "ldr       ip, [lr, #" STRINGIZE_VALUE_OF(PROBE_CPU_APSR_OFFSET) "]" "\n"
-    "str       ip, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_APSR_OFFSET) "]" "\n"
-
-    // ctiMasmProbeTrampolineEnd expects lr to contain the sp value to be restored.
-    // Since we used it as scratch above, let's restore it.
-    "ldr       lr, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_SP_OFFSET) "]" "\n"
-
-    ".thumb_func " THUMB_FUNC_PARAM(ctiMasmProbeTrampolineEnd) "\n"
-    SYMBOL_STRING(ctiMasmProbeTrampolineEnd) ":" "\n"
-
-    // Set up the restore area for sp and pc.
-    // lr already contains [sp, #STRINGIZE_VALUE_OF(PROBE_CPU_SP_OFFSET)].
 
     // Push the pc on to the restore area.
     "ldr       ip, [sp, #" STRINGIZE_VALUE_OF(PROBE_CPU_PC_OFFSET) "]" "\n"
-    "sub       lr, lr, #" STRINGIZE_VALUE_OF(PTR_SIZE) "\n"
+    "sub       lr, lr, #" STRINGIZE_VALUE_OF(OUT_SIZE) "\n"
     "str       ip, [lr]" "\n"
     // Point sp to the restore area.