@@ -127,13 +127,20 @@ SLUB Debug output
127
127
128
128
Here is a sample of slub debug output:
129
129
130
- *** SLUB kmalloc-8: Redzone Active@0xc90f6d20 slab 0xc528c530 offset=3360 flags=0x400000c3 inuse=61 freelist=0xc90f6d58
131
- Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
132
- Object 0xc90f6d20: 31 30 31 39 2e 30 30 35 1019.005
133
- Redzone 0xc90f6d28: 00 cc cc cc .
134
- FreePointer 0xc90f6d2c -> 0xc90f6d58
135
- Last alloc: get_modalias+0x61/0xf5 jiffies_ago=53 cpu=1 pid=554
136
- Filler 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
130
+ ====================================================================
131
+ BUG kmalloc-8: Redzone overwritten
132
+ --------------------------------------------------------------------
133
+
134
+ INFO: 0xc90f6d28-0xc90f6d2b. First byte 0x00 instead of 0xcc
135
+ INFO: Slab 0xc528c530 flags=0x400000c3 inuse=61 fp=0xc90f6d58
136
+ INFO: Object 0xc90f6d20 @offset=3360 fp=0xc90f6d58
137
+ INFO: Allocated in get_modalias+0x61/0xf5 age=53 cpu=1 pid=554
138
+
139
+ Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
140
+ Object 0xc90f6d20: 31 30 31 39 2e 30 30 35 1019.005
141
+ Redzone 0xc90f6d28: 00 cc cc cc .
142
+ Padding 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
143
+
137
144
[<c010523d>] dump_trace+0x63/0x1eb
138
145
[<c01053df>] show_trace_log_lvl+0x1a/0x2f
139
146
[<c010601d>] show_trace+0x12/0x14
@@ -155,74 +162,108 @@ Filler 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
155
162
[<c0104112>] sysenter_past_esp+0x5f/0x99
156
163
[<b7f7b410>] 0xb7f7b410
157
164
=======================
158
- @@@ SLUB kmalloc-8: Restoring redzone (0xcc) from 0xc90f6d28-0xc90f6d2b
159
165
166
+ FIX kmalloc-8: Restoring Redzone 0xc90f6d28-0xc90f6d2b=0xcc
160
167
168
+ If SLUB encounters a corrupted object (full detection requires the kernel
169
+ to be booted with slub_debug) then the following output will be dumped
170
+ into the syslog:
161
171
162
- If SLUB encounters a corrupted object then it will perform the following
163
- actions:
164
-
165
- 1. Isolation and report of the issue
172
+ 1. Description of the problem encountered
166
173
167
174
This will be a message in the system log starting with
168
175
169
- *** SLUB <slab cache affected>: <What went wrong>@<object address>
170
- offset=<offset of object into slab> flags=<slabflags >
171
- inuse=<objects in use in this slab> freelist=<first free object in slab>
176
+ ===============================================
177
+ BUG <slab cache affected>: <What went wrong >
178
+ -----------------------------------------------
172
179
173
- 2. Report on how the problem was dealt with in order to ensure the continued
174
- operation of the system.
180
+ INFO: <corruption start>-<corruption_end> <more info>
181
+ INFO: Slab <address> <slab information>
182
+ INFO: Object <address> <object information>
183
+ INFO: Allocated in <kernel function> age=<jiffies since alloc> cpu=<allocated by
184
+ cpu> pid=<pid of the process>
185
+ INFO: Freed in <kernel function> age=<jiffies since free> cpu=<freed by cpu>
186
+ pid=<pid of the process>
175
187
176
- These are messages in the system log beginning with
177
-
178
- @@@ SLUB <slab cache affected>: <corrective action taken>
188
+ (Object allocation / free information is only available if SLAB_STORE_USER is
189
+ set for the slab. slub_debug sets that option)
179
190
191
+ 2. The object contents if an object was involved.
180
192
181
- In the above sample SLUB found that the Redzone of an active object has
182
- been overwritten. Here a string of 8 characters was written into a slab that
183
- has the length of 8 characters. However, a 8 character string needs a
184
- terminating 0. That zero has overwritten the first byte of the Redzone field.
185
- After reporting the details of the issue encountered the @@@ SLUB message
186
- tell us that SLUB has restored the redzone to its proper value and then
187
- system operations continue.
188
-
189
- Various types of lines can follow the @@@ SLUB line:
193
+ Various types of lines can follow the BUG SLUB line:
190
194
191
195
Bytes b4 <address> : <bytes>
192
- Show a few bytes before the object where the problem was detected.
196
+ Shows a few bytes before the object where the problem was detected.
193
197
Can be useful if the corruption does not stop with the start of the
194
198
object.
195
199
196
200
Object <address> : <bytes>
197
201
The bytes of the object. If the object is inactive then the bytes
198
- typically contain poisoning values. Any non-poison value shows a
202
+ typically contain poison values. Any non-poison value shows a
199
203
corruption by a write after free.
200
204
201
205
Redzone <address> : <bytes>
202
- The redzone following the object. The redzone is used to detect
206
+ The Redzone following the object. The Redzone is used to detect
203
207
writes after the object. All bytes should always have the same
204
208
value. If there is any deviation then it is due to a write after
205
209
the object boundary.
206
210
207
- Freepointer
208
- The pointer to the next free object in the slab. May become
209
- corrupted if overwriting continues after the red zone.
210
-
211
- Last alloc:
212
- Last free:
213
- Shows the address from which the object was allocated/freed last.
214
- We note the pid, the time and the CPU that did so. This is usually
215
- the most useful information to figure out where things went wrong.
216
- Here get_modalias() did an kmalloc(8) instead of a kmalloc(9).
211
+ (Redzone information is only available if SLAB_RED_ZONE is set.
212
+ slub_debug sets that option)
217
213
218
- Filler <address> : <bytes>
214
+ Padding <address> : <bytes>
219
215
Unused data to fill up the space in order to get the next object
220
216
properly aligned. In the debug case we make sure that there are
221
- at least 4 bytes of filler . This allow for the detection of writes
217
+ at least 4 bytes of padding . This allows the detection of writes
222
218
before the object.
223
219
224
- Following the filler will be a stackdump. That stackdump describes the
225
- location where the error was detected. The cause of the corruption is more
226
- likely to be found by looking at the information about the last alloc / free.
220
+ 3. A stackdump
221
+
222
+ The stackdump describes the location where the error was detected. The cause
223
+ of the corruption is may be more likely found by looking at the function that
224
+ allocated or freed the object.
225
+
226
+ 4. Report on how the problem was dealt with in order to ensure the continued
227
+ operation of the system.
228
+
229
+ These are messages in the system log beginning with
230
+
231
+ FIX <slab cache affected>: <corrective action taken>
232
+
233
+ In the above sample SLUB found that the Redzone of an active object has
234
+ been overwritten. Here a string of 8 characters was written into a slab that
235
+ has the length of 8 characters. However, a 8 character string needs a
236
+ terminating 0. That zero has overwritten the first byte of the Redzone field.
237
+ After reporting the details of the issue encountered the FIX SLUB message
238
+ tell us that SLUB has restored the Redzone to its proper value and then
239
+ system operations continue.
240
+
241
+ Emergency operations:
242
+ ---------------------
243
+
244
+ Minimal debugging (sanity checks alone) can be enabled by booting with
245
+
246
+ slub_debug=F
247
+
248
+ This will be generally be enough to enable the resiliency features of slub
249
+ which will keep the system running even if a bad kernel component will
250
+ keep corrupting objects. This may be important for production systems.
251
+ Performance will be impacted by the sanity checks and there will be a
252
+ continual stream of error messages to the syslog but no additional memory
253
+ will be used (unlike full debugging).
254
+
255
+ No guarantees. The kernel component still needs to be fixed. Performance
256
+ may be optimized further by locating the slab that experiences corruption
257
+ and enabling debugging only for that cache
258
+
259
+ I.e.
260
+
261
+ slub_debug=F,dentry
262
+
263
+ If the corruption occurs by writing after the end of the object then it
264
+ may be advisable to enable a Redzone to avoid corrupting the beginning
265
+ of other objects.
266
+
267
+ slub_debug=FZ,dentry
227
268
228
- Christoph Lameter, <
[email protected] >, May
23 , 2007
269
+ Christoph Lameter, <
[email protected] >, May
30 , 2007
0 commit comments