You're reading from LLVM Code Generation A deep dive into compiler backend development

Product type Paperback

Published in May 2025

Publisher Packt

ISBN-13 9781837637782

Length 620 pages

Edition 1st Edition

Languages

C++

Tools

LLVM

Concepts

Programming Language

Author (1):

Quentin Colombet

Table of Contents (30) Chapters

Preface

1. Part 1: Getting Started with LLVM

2. Building LLVM and Understanding the Directory Structure

3. Contributing to LLVM

4. Compiler Basics and How They Map to LLVM APIs

5. Writing Your First Optimization

6. Dealing with Pass Managers

7. TableGen – LLVM Swiss Army Knife for Modeling

8. Part 2: Middle-End: LLVM IR to LLVM IR

9. Understanding LLVM IR

10. Survey of the Existing Passes

11. Introducing Target-Specific Constructs

12. Hands-On Debugging LLVM IR Passes

13. Part 3: Introduction to the Backend

14. Getting Started with the Backend

15. Getting Started with the Machine Code Layer

16. The Machine Pass Pipeline

17. Part 4: LLVM IR to Machine IR

18. Getting Started with Instruction Selection

19. Instruction Selection: The IR Building Phase

20. Instruction Selection: The Legalization Phase

21. Instruction Selection: The Selection Phase and Beyond

22. Part 5: Final Lowering and Optimizations

23. Instruction Scheduling

24. Register Allocation

25. Lowering of the Stack Layout

26. Getting Started with the Assembler

27. Unlock Your Book’s Exclusive Benefits

28. Other Books You May Enjoy

29. Index

Canonicalization passes

Canonicalization passes are transformations that put the IR in a canonical form, that is, an agreed-upon way to represent expressions.

For instance, consider the C statement a = b - c. A frontend could produce the following two equivalent IRs:

Version 1:
```
%a = sub i64 %b, %c
```

Version 2:

```
%neg_c = sub i64 0, %c
%a = add i64 %b, %neg_c
```

Both versions have the same semantics, but the first is a more direct translation of the input C statement and is also more compact. In LLVM, the canonical form of this expression is the first version. Keep in mind that the LLVM middle end is built around the canonical representation.

In practice, this means the following:

1. With the standard pipeline, anything that is not canonical will be canonicalized.
2. Optimizations have been tested almost exclusively using the canonical form.
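To make these two points concrete, here is a minimal, self-contained C++ sketch. It does not use the LLVM API: the `Expr` type and the helpers `makeValue`, `makeConst`, `makeBinary`, `canonicalize`, and `foldSelfSubtract` are hypothetical names invented for this illustration. The sketch rewrites the `add`-of-a-negation shape from Version 2 into the `sub` shape from Version 1, and pairs it with a toy optimization that, like many real ones, only recognizes the canonical shape.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy expression tree. This is NOT LLVM's IR API; every name is hypothetical.
enum class Op { Value, Const, Add, Sub };

struct Expr {
  Op op;
  std::string name;               // Used when op == Op::Value.
  long value;                     // Used when op == Op::Const.
  std::shared_ptr<Expr> lhs, rhs; // Used when op is Add or Sub.
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeValue(const std::string &name) {
  return std::make_shared<Expr>(Expr{Op::Value, name, 0, nullptr, nullptr});
}
ExprPtr makeConst(long value) {
  return std::make_shared<Expr>(Expr{Op::Const, "", value, nullptr, nullptr});
}
ExprPtr makeBinary(Op op, ExprPtr lhs, ExprPtr rhs) {
  return std::make_shared<Expr>(Expr{op, "", 0, lhs, rhs});
}

// Returns true for the shape `sub 0, x`, that is, a negation.
bool isNegation(const ExprPtr &e) {
  return e->op == Op::Sub && e->lhs->op == Op::Const && e->lhs->value == 0;
}

// Rewrites `add b, (sub 0, c)` (in either operand order) into `sub b, c`.
ExprPtr canonicalize(const ExprPtr &e) {
  if (e->op != Op::Add && e->op != Op::Sub)
    return e; // Leaves (values and constants) are already canonical.
  ExprPtr lhs = canonicalize(e->lhs);
  ExprPtr rhs = canonicalize(e->rhs);
  if (e->op == Op::Add) {
    if (isNegation(rhs))
      return makeBinary(Op::Sub, lhs, rhs->rhs);
    if (isNegation(lhs))
      return makeBinary(Op::Sub, rhs, lhs->rhs);
  }
  return makeBinary(e->op, lhs, rhs);
}

// Toy optimization: folds `sub x, x` to 0. It only looks at the top-level
// expression and only matches the canonical `sub` shape.
ExprPtr foldSelfSubtract(const ExprPtr &e) {
  bool sameValue = e->op == Op::Sub && e->lhs->op == Op::Value &&
                   e->rhs->op == Op::Value && e->lhs->name == e->rhs->name;
  return sameValue ? makeConst(0) : e;
}

// Prints an expression in a compact prefix notation for inspection.
std::string toString(const ExprPtr &e) {
  switch (e->op) {
  case Op::Value:
    return "%" + e->name;
  case Op::Const:
    return std::to_string(e->value);
  case Op::Add:
    return "(add " + toString(e->lhs) + ", " + toString(e->rhs) + ")";
  case Op::Sub:
    return "(sub " + toString(e->lhs) + ", " + toString(e->rhs) + ")";
  }
  assert(false && "unknown opcode");
  return "";
}
```

Under these assumptions, calling `foldSelfSubtract` directly on the tree for `add %x, (sub 0, %x)` leaves it unchanged, because the pattern is written against the canonical shape only. Running `canonicalize` first turns the tree into `sub %x, %x`, which the fold then reduces to the constant `0`. The toy optimization is simpler to write because it can assume the canonical form, but it silently misses anything that has not been canonicalized.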

The second point means that...