2023-07-20
Participants
-
Peter Smith
-
Michael Platings
-
Anton Repetov
-
Michael Jones
-
Nathan Sidwell
-
Petr Hosek
-
Prabhu Rajasekaran
-
Scott
-
Stan
-
Yvan Roux
-
Yung-Chia Lin
-
Vince Del Vecchio
-
Zhi Zhuang
-
Garrett Van Mourik
-
Volodymyr Turanskyy
Agenda
-
Planning for the Pre-LLVM-DEV’23 – Embedded Toolchains Workshop.
-
Follow up on code reviews in progress.
-
Ideas/questions from Scott in Discord:
- Memory region function attributes and how they’d impact inlining and output section.
- Assembly inline with the source similar to opt-viewer, but be able to have gcc assembly alongside clang generated assembly.
- Using Arm trace data as an input to PGO. That’d give high quality performance data without needing any instrumentation.
-
AoB
Discussion
Pre-LLVM DevMeeting workshop (Peter)
NOTE: LLVM sync on 12th Oct will overlap with the LLVMDev meeting, so we will skip it.
-
Proposal submitted, no response yet. About 25 attendees requested. A list of possible topics was suggested: we need to review and confirm the topics and agree on who can drive each of them.
-
News and next steps to be posted on Discourse when the workshop is confirmed.
Code reviews
-
Update from Alan Phipps on MC/DC: code reviews have been accepted, thanks for the help!
-
Michael P: libc++ with picolibc testing: code review accepted and expected to land soon. Buildkite CI will then test the picolibc (embedded) configuration of libc++ running in QEMU on Armv7-M.
-
Unified LTO, discussed previously, landed (RFC, patch a1ca3af) - impact/opportunities for embedded?
-
Unified LTO landed: the choice between thin and full LTO can be made at link time.
-
FatLTO: changes are mostly accepted and have started landing; it may take a few more days to finish.
Code-size comparison (Scott)
-
opt-viewer style tool: compare code by disassembling with objdump and llvm-objdump, using debug info to match the two outputs.
-
May be similar to LLVM performance testing: there is a system to use perf data to compare performance per building block between builds from different days.
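A simpler starting point than side-by-side disassembly is a per-symbol size comparison. The sketch below is only an illustration: it assumes symbol listings in the `<address> <size> <type> <name>` layout (the kind of listing nm-style tools print when size printing is enabled), and the two sample listings and all function names are made up for the example.

```python
"""Sketch: compare per-symbol code size between two builds."""


def parse_symbol_sizes(listing):
    """Parse lines of the form '<address> <size> <type> <name>' (hex fields)."""
    sizes = {}
    for line in listing.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # blank line or symbol without a size field
        _address, size, _kind, name = parts
        try:
            sizes[name] = int(size, 16)
        except ValueError:
            continue  # not a hexadecimal size, ignore the line
    return sizes


def size_deltas(baseline, candidate):
    """Return (name, candidate - baseline) pairs, largest change first."""
    names = set(baseline) | set(candidate)
    deltas = {n: candidate.get(n, 0) - baseline.get(n, 0) for n in names}
    changed = [(n, d) for n, d in deltas.items() if d != 0]
    return sorted(changed, key=lambda item: (-abs(item[1]), item[0]))


# Hypothetical listings for the same program built by two compilers.
GCC_LISTING = """\
00000000 00000010 T reset_handler
00000010 00000040 T main
00000050 00000020 T uart_write
"""

CLANG_LISTING = """\
00000000 00000010 T reset_handler
00000010 00000038 T main
00000048 00000030 T uart_write
"""

if __name__ == "__main__":
    gcc_sizes = parse_symbol_sizes(GCC_LISTING)
    clang_sizes = parse_symbol_sizes(CLANG_LISTING)
    for name, delta in size_deltas(gcc_sizes, clang_sizes):
        print(f"{name}: {delta:+d} bytes")
```

Symbols with the largest size difference are then the natural candidates for a closer look at the generated assembly.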
Placement of code (Scott)
-
A function attribute that assigns a function to a memory region, with the compiler providing copies of the functions it depends on in the same memory region.
-
Similar to what is needed for LTO to support placement in output sections. Do a pre-assignment of the output section before running the LTO itself. There was a link to the relevant presentation in the Discord channel: 2022 LLVM Dev Mtg: Link-Time Attributes for LTO: Incorporating linker knowledge into the LTO... - YouTube.
-
Automatic attribute propagation through the call graph is useful when there are libraries whose source code cannot be changed.
-
Somewhat similar to overlay logic, which decides whether or not to copy functions for different overlays.
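The propagation idea discussed above can be sketched on a toy call graph. This is not how any existing compiler or linker implements it: the graph, the region names and the function `propagate_regions` are hypothetical, and the sketch assumes the whole call graph is known up front.

```python
"""Sketch: propagate memory-region annotations through a call graph."""
from collections import deque


def propagate_regions(call_graph, annotated):
    """Map each reachable function to the set of regions it is needed in.

    call_graph: {caller: [callees]}
    annotated:  {function: region} for explicitly annotated functions
    """
    regions = {}
    for root, region in annotated.items():
        seen = {root}
        queue = deque([root])
        while queue:
            fn = queue.popleft()
            regions.setdefault(fn, set()).add(region)
            for callee in call_graph.get(fn, ()):
                if callee not in seen:
                    seen.add(callee)
                    queue.append(callee)
    return regions


# Hypothetical program: memcpy comes from a library that cannot be edited.
CALL_GRAPH = {
    "isr_handler": ["read_fifo", "memcpy"],
    "read_fifo": ["memcpy"],
    "main": ["init", "memcpy"],
    "init": [],
    "memcpy": [],
}
ANNOTATED = {"isr_handler": "fast_ram", "main": "flash"}

if __name__ == "__main__":
    for fn, needed in sorted(propagate_regions(CALL_GRAPH, ANNOTATED).items()):
        note = " (needs one copy per region)" if len(needed) > 1 else ""
        print(f"{fn}: {sorted(needed)}{note}")
```

In this toy example `memcpy` is reachable from both annotated roots, so it ends up needed in two regions, which is the case where a copy per region would be required.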
PGO from traces (Scott)
-
PGO: can the trace capability of higher-end CPUs be used as input to PGO (without code instrumentation)? Branch instructions are the most interesting part, as they allow the control flow to be recreated. Should be possible in principle. Arm Streamline is a trace-based tool; armcc (Arm Compiler 5) was able to read its output, but armclang (Arm Compiler 6) is not.
-
There are a lot of trace formats out there so it could be tricky to parse all of them.
-
Compiler teams use a lot of models for testing, however for people working with peripherals there are fewer options.
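To illustrate why branch records are the interesting part of a trace, the sketch below turns a list of taken branches into edge counts and per-source branch probabilities, which is the kind of information a profile carries. The record format (pairs of source and target addresses) is an assumption made for the example; real trace formats would need to be decoded into something like this first, and the sample trace is made up.

```python
"""Sketch: derive branch edge counts from decoded trace records."""
from collections import Counter


def edge_counts(branch_records):
    """Count how often each (source, target) branch edge was taken."""
    return Counter((src, dst) for src, dst in branch_records)


def branch_probabilities(counts):
    """Convert edge counts into per-source probabilities."""
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


# Hypothetical decoded trace: (branch address, destination address).
TRACE = [
    (0x100, 0x120),
    (0x100, 0x120),
    (0x100, 0x104),
    (0x130, 0x100),
]

if __name__ == "__main__":
    counts = edge_counts(TRACE)
    for (src, dst), p in sorted(branch_probabilities(counts).items()):
        print(f"{src:#x} -> {dst:#x}: taken {counts[(src, dst)]}x, p={p:.2f}")
```

Mapping the addresses back to functions and basic blocks, for example via debug info, would still be needed before a compiler could consume the result.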
Findings from migrating a hypervisor (Peter)
-
Fiasco hypervisor (GitHub - kernkonzept/fiasco: The development version of the Fiasco.OC microkernel) supports the clang compiler, but not the LLVM binutils.
-
Some issues found with the LLVM binutils: llvm-objdump and llvm-objcopy behave slightly differently from their GNU counterparts, which causes build issues.
-
Peter will raise upstream issues for these.
-
LLD: asserts in linker scripts behave differently because LD and LLD check the conditions at different times, which again leads to build failures.
FP modes in compiler-rt (Peter)
-
compiler-rt software emulation of floating point: rounding modes and flush to zero. Who is interested in improvements, such as a choice between faster and stricter IEEE modes? Arm can contribute.
-
Most of the time no-FP is used, thus limited experience and/or interest.
Embedded benchmarking (Petr)
-
What is a good set of benchmarks for embedded? embench (https://www.embench.org/)?
-
May be good to add something to LLVM test suite, if the benchmark is open-source.
-
Peter: Dhrystone, CoreMark and EEMBC are widely used, however they are mostly C (no C++).
-
CMSIS DSP, CMSIS NN can be used as application benchmarks, especially for SIMD.
-
embench was considered by the Arm team, however it has not been adopted for regular testing yet.
-
Scott: MicroPython has a set of benchmarks, can be seen as a more real world use case.
CMSIS clang support (Petr)
-
CMSIS is a dependency of a project the team is working on, but it does not support clang yet.
-
Volodymyr: CMSIS6 clang support is in progress: Core(M): Add support for LLVM/Clang · ARM-software/CMSIS_6@193243d · GitHub
-
There is currently no plan to backport to CMSIS5. However, the clang enablement is a minor change, and CMSIS6 is mostly compatible with CMSIS5 (it is a better split and arrangement of the same components), so migration should be straightforward.