# CHROMIUM Sync Token Internals

Chrome uses a mechanism known as "sync tokens" to synchronize different command
buffers in the GPU process. This document discusses the internals of the sync
token system.

[TOC]

## Rationale

In Chrome, multiple processes, for example the browser and the renderer, submit
work to the GPU process asynchronously in command buffers. However, there are
dependencies between the work submitted by different processes, such as
SkiaRenderer in the display compositor in the GPU process rendering a tile
produced by the raster worker in the renderer process.

Sync tokens are used to synchronize the work contained in command buffers
without waiting for the work to complete. This improves pipelining and, with the
introduction of GPU scheduling, allows prioritization of work. Although
originally built for synchronizing command buffers, they can be used for other
work in the GPU process.

## Generation

Sync tokens are represented by a namespace, an identifier, and the *fence
release count*. The identifier is a `CommandBufferId`, a 64-bit unsigned integer
that is unique within a `CommandBufferNamespace`. For example, IPC command
buffers are in the *GPU_IO* `CommandBufferNamespace` and are identified by a
`CommandBufferId` with the process id in the most significant bits and the IPC
route id in the least significant bits.

The fence release count marks completion of some work in a command buffer. Note
that this is CPU-side work, which includes command decoding, validation, issuing
GL calls to the driver, etc., and not GPU-side work. See
[gpu_synchronization.md](/docs/design/gpu_synchronization.md) for more
information about synchronizing GPU work.

Fences are typically generated or inserted on the client using a sequential
counter. The corresponding GL API is `GenSyncTokenCHROMIUM`, which generates the
fence using `CommandBufferProxyImpl::GenerateFenceSyncRelease()` and also adds
the fence to the command buffer using the internal `InsertFenceSync` command.
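The three parts of a sync token and the client-side counter can be sketched as
follows. This is a minimal illustration, not the Chromium definitions:
`SyncTokenSketch`, `MakeCommandBufferId`, and `FenceGeneratorSketch` are
hypothetical names, and the 32/32 bit split between process id and route id and
the counter starting at 1 are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the namespace enum; values are illustrative.
enum class NamespaceSketch : uint8_t { kInvalid, kGpuIo, kInProcess };

// A sync token names a point in one command buffer's stream of work.
struct SyncTokenSketch {
  NamespaceSketch namespace_id;
  uint64_t command_buffer_id;  // Unique within the namespace.
  uint64_t release_count;      // The fence release count.
};

// Assumption: process id in the upper 32 bits, route id in the lower 32 bits.
constexpr uint64_t MakeCommandBufferId(uint32_t process_id, uint32_t route_id) {
  return (static_cast<uint64_t>(process_id) << 32) | route_id;
}

// Client-side generator: each new fence takes the next counter value.
class FenceGeneratorSketch {
 public:
  explicit FenceGeneratorSketch(uint64_t command_buffer_id)
      : command_buffer_id_(command_buffer_id) {}

  SyncTokenSketch GenerateToken() {
    return SyncTokenSketch{NamespaceSketch::kGpuIo, command_buffer_id_,
                           ++last_release_};
  }

 private:
  uint64_t command_buffer_id_;
  uint64_t last_release_ = 0;  // Assumption: the first release count is 1.
};

// Usage: two tokens from the same command buffer differ only in release count.
inline void ExampleGeneration() {
  FenceGeneratorSketch generator(MakeCommandBufferId(/*process_id=*/7,
                                                     /*route_id=*/3));
  SyncTokenSketch first = generator.GenerateToken();
  SyncTokenSketch second = generator.GenerateToken();
  assert(first.command_buffer_id == second.command_buffer_id);
  assert(first.release_count < second.release_count);
}
```

Because the counter only grows, a later token from the same command buffer
always has a larger release count than an earlier one.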

## Verification

Different client processes communicate with the GPU process using *channels*. A
channel wraps a message pipe, which doesn't provide ordering guarantees with
respect to other pipes. For example, a message from the browser process
containing a sync token wait can arrive before the message from the renderer
process that releases or fulfills the sync token promise.

To prevent this problem, client processes must verify sync tokens before sending
them to another process. Verification involves a synchronous nop IPC message,
`GpuChannelMsg_Nop`, to the GPU process, which ensures that the GPU process has
read previous messages from the pipe.

Sync tokens used within a process do not need to be verified, and the
`GenSyncTokenUnverifiedCHROMIUM` GL API serves this common case. Such sync
tokens must be verified using `VerifySyncTokensCHROMIUM` before they are sent to
another process. Sync tokens generated using `GenSyncTokenCHROMIUM` are already
verified. `SyncToken` has a `verified_flush` bit that guards against
accidentally sending unverified sync tokens over IPC.
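The verification flow can be sketched as follows. This is only an illustration:
`TokenSketch`, `ChannelSketch`, `VerifyTokens`, and `SendToOtherProcess` are
hypothetical names, and verifying a whole batch of tokens with a single round
trip is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical token: only the fields needed to show verification.
struct TokenSketch {
  uint64_t release_count;
  bool verified_flush;
};

// Hypothetical channel that counts synchronous round trips to the GPU process.
class ChannelSketch {
 public:
  // Stands in for the synchronous nop IPC: once it returns, the GPU process
  // has read every message sent earlier on this pipe.
  void SyncNop() { ++nop_round_trips_; }
  int nop_round_trips() const { return nop_round_trips_; }

 private:
  int nop_round_trips_ = 0;
};

// Marks all tokens verified, paying for a round trip only if one is needed.
inline void VerifyTokens(ChannelSketch& channel,
                         std::vector<TokenSketch>& tokens) {
  bool needs_round_trip = false;
  for (const TokenSketch& token : tokens) {
    if (!token.verified_flush) needs_round_trip = true;
  }
  if (needs_round_trip) channel.SyncNop();
  for (TokenSketch& token : tokens) token.verified_flush = true;
}

// The flag acts as a guard: sending an unverified token is a programming error.
inline void SendToOtherProcess(const TokenSketch& token) {
  assert(token.verified_flush && "unverified sync token sent over IPC");
  // Serialization would happen here.
}
```

Tokens that never leave the process skip `VerifyTokens` entirely, which is why
the unverified path is the cheap one.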

## Streams

In the GPU process, command buffers are organized into logical streams of
execution called *sequences*. Tasks within a sequence are ordered, but they are
asynchronous with respect to tasks in other sequences. Dependencies between
tasks are specified as sync tokens. For IPC command buffers, this implies flush
ordering within a sequence.

A sequence is created by `Scheduler::CreateSequence`, which returns a
`SequenceId`. Tasks are posted to a sequence using `Scheduler::ScheduleTask`.
Typically there is one sequence per channel, but sometimes there are more, such
as the raster, compositor, and media streams in the renderer's channel.

The scheduler also provides a means for cooperative scheduling through
`Scheduler::ShouldYield` and `Scheduler::ContinueTask`. These allow a task to
yield and continue once higher priority work is complete. Together with the GPU
scheduler, multiple sequences provide the means for prioritization of UI work
over raster prepaint work.
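The ordering rules above can be illustrated with a toy scheduler. This is a
sketch under stated assumptions, not the real `Scheduler` API: the class
`SchedulerSketch`, its integer priorities (lower value runs first), and
`RunAll` are hypothetical, and yielding and sync token dependencies are left
out to keep the example short.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

class SchedulerSketch {
 public:
  using SequenceId = int;

  // Assumption of this sketch: a lower priority value means more urgent work.
  SequenceId CreateSequence(int priority) {
    SequenceId id = next_id_++;
    sequences_[id] = Sequence{priority, {}};
    return id;
  }

  // Tasks posted to one sequence run in the order they were posted.
  void ScheduleTask(SequenceId id, std::function<void()> task) {
    assert(sequences_.count(id) == 1);
    sequences_[id].tasks.push_back(std::move(task));
  }

  // Repeatedly runs the front task of the most urgent non-empty sequence.
  void RunAll() {
    while (true) {
      Sequence* best = nullptr;
      for (auto& entry : sequences_) {
        Sequence& sequence = entry.second;
        if (sequence.tasks.empty()) continue;
        if (best == nullptr || sequence.priority < best->priority) {
          best = &sequence;
        }
      }
      if (best == nullptr) return;
      std::function<void()> task = std::move(best->tasks.front());
      best->tasks.pop_front();
      task();
    }
  }

 private:
  struct Sequence {
    int priority;
    std::deque<std::function<void()>> tasks;
  };

  std::map<SequenceId, Sequence> sequences_;
  SequenceId next_id_ = 0;
};

// Usage: UI work overtakes raster work, but raster tasks keep their own order.
inline std::vector<std::string> ExampleRunOrder() {
  SchedulerSketch scheduler;
  std::vector<std::string> log;
  int raster = scheduler.CreateSequence(/*priority=*/2);
  int ui = scheduler.CreateSequence(/*priority=*/1);
  scheduler.ScheduleTask(raster, [&log] { log.push_back("raster1"); });
  scheduler.ScheduleTask(raster, [&log] { log.push_back("raster2"); });
  scheduler.ScheduleTask(ui, [&log] { log.push_back("ui1"); });
  scheduler.RunAll();
  return log;
}
```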

## Waiting and Completion

Sync tokens are managed in the GPU process by `SyncPointManager` and its helper
classes `SyncPointOrderData` and `SyncPointClientState`. `SyncPointOrderData`
holds state for a logical stream of execution, typically containing work of
multiple command buffers from one process. `SyncPointClientState` holds sync
token state for a client that generates sync tokens, typically an IPC command
buffer.

The GPU scheduler maintains a `SyncPointOrderData` per sequence. Clients must
create a `SyncPointClientState` using
`SyncPointManager::CreateSyncPointClientState` and identify their namespace, id,
and sequence.

Waiting on a sync token is done by calling `SyncPointManager::Wait()` with a
sync token, an order number for the wait, and a callback. The callback is
enqueued with the `SyncPointClientState` of the target, along with the release
count of the sync token. The scheduler does this internally for sync token
dependencies of scheduled tasks, but the wait can also be performed when running
the `WaitSyncTokenCHROMIUM` GL command.

Sync tokens are completed when the fence is released in the GPU process by
calling `SyncPointClientState::ReleaseFenceSync()`. For GL command buffers, the
`InsertFenceSync` command, which contains the release count generated in the
client, calls this when executed in the service. This runs the callbacks and
allows waiting command buffers to resume their work.
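The wait and release bookkeeping for a single client can be sketched as
follows. `ClientStateSketch` is a hypothetical, single-threaded simplification
and not the real `SyncPointClientState`. Returning `false` from `Wait` when the
fence has already been released is a choice made for this sketch, and order
numbers are omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

class ClientStateSketch {
 public:
  // Registers `callback` to run once `release_count` has been released.
  // Returns false if that release already happened; the caller can then
  // proceed immediately and the callback is not stored.
  bool Wait(uint64_t release_count, std::function<void()> callback) {
    assert(callback != nullptr);
    if (release_count <= released_) return false;
    waiters_.emplace(release_count, std::move(callback));
    return true;
  }

  // Marks every fence up to `release_count` as released and runs its waiters.
  void ReleaseFenceSync(uint64_t release_count) {
    if (release_count <= released_) return;
    released_ = release_count;

    // Collect ready callbacks first so a callback may call Wait() safely.
    std::vector<std::function<void()>> ready;
    auto end = waiters_.upper_bound(release_count);
    for (auto it = waiters_.begin(); it != end; ++it) {
      ready.push_back(std::move(it->second));
    }
    waiters_.erase(waiters_.begin(), end);

    for (std::function<void()>& callback : ready) callback();
  }

 private:
  uint64_t released_ = 0;
  std::multimap<uint64_t, std::function<void()>> waiters_;
};
```

Releasing count 3 in this sketch also satisfies a waiter on count 2, since
release counts from one client are sequential.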

## Correctness

Correctness of waits and releases amounts to checking that there are no
indefinite waits because of broken promises or circular wait chains. This is
ensured by associating an order number with each wait and release and
maintaining the invariant that the order number of a release is less than or
equal to the order number of the wait.

Each task is assigned a global sequential order number generated by
`SyncPointOrderData::GenerateUnprocessedOrderNumber`; these numbers are stored
in a queue of unprocessed order numbers. In `SyncPointManager::Wait()`, the
callbacks are also enqueued with the order number of the waiting task in
`SyncPointOrderData` in a queue called `OrderFenceQueue`.

`SyncPointOrderData` maintains the invariant that all waiting callbacks must
have an order number greater than the sequence's next unprocessed order number.
This invariant is checked when enqueuing a new callback in
`SyncPointOrderData::ValidateReleaseOrderNumber` and after completing a task in
`SyncPointOrderData::FinishProcessingOrderNumber`.
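A toy model shows how order numbers rule out circular waits and broken
promises. `OrderDataSketch` and `ValidateWait` are hypothetical and much
simpler than the real `SyncPointOrderData`. The acceptance rule used here (the
releasing stream must still have an unprocessed task whose order number is less
than or equal to the wait's) is this sketch's reading of the release-before-wait
invariant, not the exact production check.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

class OrderDataSketch {
 public:
  // All streams share one counter so that order numbers are global.
  explicit OrderDataSketch(uint32_t* global_counter)
      : global_counter_(global_counter) {}

  // Called when a task is enqueued on this stream.
  uint32_t GenerateUnprocessedOrderNumber() {
    uint32_t order_num = ++*global_counter_;
    unprocessed_.push_back(order_num);
    return order_num;
  }

  // Can a task with `wait_order_num` wait on a release from this stream?
  // Only if this stream still has work that was ordered no later than the
  // waiting task, so the release cannot depend on the waiter.
  bool ValidateWait(uint32_t wait_order_num) const {
    return !unprocessed_.empty() && unprocessed_.front() <= wait_order_num;
  }

  // Called when this stream finishes its oldest task.
  void FinishProcessing(uint32_t order_num) {
    assert(!unprocessed_.empty() && unprocessed_.front() == order_num);
    unprocessed_.pop_front();
  }

 private:
  uint32_t* global_counter_;
  std::deque<uint32_t> unprocessed_;
};

// Usage: stream A's task is ordered before stream B's task.
inline void ExampleOrdering() {
  uint32_t counter = 0;
  OrderDataSketch a(&counter);
  OrderDataSketch b(&counter);
  uint32_t a_task = a.GenerateUnprocessedOrderNumber();
  uint32_t b_task = b.GenerateUnprocessedOrderNumber();

  // A waiting on B is rejected: B's only work is ordered after A's task.
  assert(!b.ValidateWait(a_task));
  // B waiting on A is accepted, so the two cannot wait on each other.
  assert(a.ValidateWait(b_task));

  // Once A has no work left, nothing could release the fence any more.
  a.FinishProcessing(a_task);
  assert(!a.ValidateWait(b_task));
}
```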

## See Also

[CHROMIUM_sync_point](/gpu/GLES2/extensions/CHROMIUM/CHROMIUM_sync_point.txt)
[gpu_synchronization.md](/docs/design/gpu_synchronization.md)
[Lightweight GPU Sync Points](https://docs.google.com/document/d/1XwBYFuTcINI84ShNvqifkPREs3sw5NdaKzKqDDxyeHk/edit)