With the -enable-bat option enabled, the effect of a second round of optimization is weakened

In our environment we cannot deploy a separate instance just for perf sampling, so we turn on the -enable-bat option to do continuous optimization.
However, we find that BoltOptBinary2 performs worse than BoltOptBinary1, even though the original code has not changed at all. Through analysis, we found that compared with BoltOptBinary1, BoltOptBinary2's iTLB-miss rate and L1-icache-load-misses rate have both increased.

OriginBinary -> [perf record] -> perf.data -> [bolt transform, -enable-bat] -> BoltOptBinary1

BoltOptBinary1 -> [perf record] -> perf.data -> [bolt transform applied to OriginBinary] -> BoltOptBinary2
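For reference, the two passes above can be sketched with the standard perf/perf2bolt/llvm-bolt tooling. This is only an illustration of our workflow, not our exact commands: the binary names, the workload placeholder, and the reordering flags are assumptions; the relevant part is `-enable-bat` and that pass 2 profiles BoltOptBinary1 but feeds the translated profile back into OriginBinary.

```shell
# Pass 1: profile the original binary and BOLT it with BAT enabled.
perf record -e cycles:u -j any,u -o perf1.data -- ./OriginBinary <workload>
perf2bolt -p perf1.data -o perf1.fdata ./OriginBinary
llvm-bolt ./OriginBinary -o ./BoltOptBinary1 -data=perf1.fdata \
    -reorder-blocks=ext-tsp -reorder-functions=hfsort -enable-bat

# Pass 2: profile the BOLTed binary in production. Because BoltOptBinary1
# carries a BAT section, perf2bolt translates its samples back to
# OriginBinary addresses, and we re-optimize OriginBinary from scratch.
perf record -e cycles:u -j any,u -o perf2.data -- ./BoltOptBinary1 <workload>
perf2bolt -p perf2.data -o perf2.fdata ./BoltOptBinary1
llvm-bolt ./OriginBinary -o ./BoltOptBinary2 -data=perf2.fdata \
    -reorder-blocks=ext-tsp -reorder-functions=hfsort -enable-bat
```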

We are curious: is this expected? Does -enable-bat make some optimizations less effective in our scenario?

Thanks for the report.
This is somewhat expected: we lose some accuracy when mapping samples back to the input binary across certain transformations (NOP removal, indirect call promotion, etc.).
Can you share the BOLT log from the last step? What profile staleness do you see?