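Memory Error Detection
KASAN: Kernel Address Sanitizer
KASAN (Kernel Address SANitizer) is the kernel's most comprehensive dynamic memory error detector. It can detect:
- Heap out-of-bounds accesses
- Stack out-of-bounds accesses
- Use-after-free
- Use-after-return (a stack variable accessed after its function has returned)
Enabling KASAN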
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y # Generic mode (x86/arm64)
# or
CONFIG_KASAN_SW_TAGS=y # Software tag-based mode (arm64, better performance)
Reading a KASAN report
==================================================================
BUG: KASAN: slab-out-of-bounds in my_driver_write+0x45/0x120
Write of size 4 at addr ffff8880123456b8 by task my_app/1234
CPU: 2 PID: 1234 Comm: my_app
Call Trace:
dump_stack+0x6b/0x8b
print_address_description+0x1f/0x1f0
kasan_report+0x138/0x160
my_driver_write+0x45/0x120
vfs_write+0xb5/0x1f0
Allocated by task 1234:
kmalloc+0x1f/0x30
my_driver_probe+0x89/0x200
The buggy address belongs to the object at ffff888012345670
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 8 bytes to the right of
64-byte region [ffff888012345670, ffff888012345670+0x40)
==================================================================
Analysis: in my_driver_write, 4 bytes were written starting at byte offset 72 of a 64-byte buffer, that is, 8 bytes past its end.
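The arithmetic behind that analysis can be sketched in plain C. This is a userspace illustration only, not kernel code: `bytes_right_of` and `write_in_bounds` are hypothetical helper names, and the second one shows the kind of bounds check a write path could apply before copying.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how far past the end of an object does an access
 * start? Returns 0 if the access starts inside the object. */
static uint64_t bytes_right_of(uint64_t obj_start, uint64_t obj_size,
                               uint64_t access_addr)
{
    assert(access_addr >= obj_start);   /* this sketch ignores underflows */
    uint64_t offset = access_addr - obj_start;
    return offset >= obj_size ? offset - obj_size : 0;
}

/* Hypothetical bounds check: may `len` bytes be written at `offset`
 * inside a buffer of `size` bytes? Written to avoid integer wraparound. */
static int write_in_bounds(uint64_t offset, uint64_t len, uint64_t size)
{
    return offset <= size && len <= size - offset;
}
```

With the numbers from the report, an access at object start + 72 into a 64-byte object gives `bytes_right_of(...) == 8`, and `write_in_bounds(72, 4, 64)` rejects the write.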
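Common KASAN error types
slab-out-of-bounds: out-of-bounds access to heap (slab) memory
stack-out-of-bounds: out-of-bounds access to stack memory
use-after-free: memory used after it was freed
use-after-return: stack variable used after its function has returned
global-out-of-bounds: out-of-bounds access to a global variable
KFENCE: lightweight memory error detection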
KFENCE (Kernel Electric Fence) is a lightweight, sampling-based alternative to KASAN that is suitable for production systems:
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=100 # Guard one allocation every 100 ms
# Show KFENCE statistics
cat /sys/kernel/debug/kfence/stats
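# List the objects currently guarded by KFENCE (error reports themselves are printed to the kernel log)
cat /sys/kernel/debug/kfence/objects
kmemleak: memory leak detection
CONFIG_DEBUG_KMEMLEAK=y
# Trigger a scan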
echo scan > /sys/kernel/debug/kmemleak
# Show the leak report
cat /sys/kernel/debug/kmemleak
# Clear the leaks found so far (tracking starts fresh)
echo clear > /sys/kernel/debug/kmemleak
Example report:
unreferenced object 0xffff888012345678 (size 64):
comm "my_app", pid 1234, jiffies 4294967295
backtrace:
kmalloc+0x1f/0x30
my_driver_open+0x45/0x80 ← allocation site
chrdev_open+0x89/0x200
do_open+0x1f/0x30
Analysis: the 64 bytes allocated in my_driver_open were never freed (my_driver_release is missing the kfree).
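The fix is to pair every allocation made in the open path with a free in the release path. The following is a minimal userspace analogue, not the driver's actual code: `malloc`/`free` stand in for `kmalloc`/`kfree`, and `struct my_ctx`, `my_open`, `my_release`, and `live_allocations` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for a driver's per-open state (hypothetical). */
struct my_ctx {
    char buf[64];
};

/* Crude counter of objects that were allocated but not yet freed. */
static int live_allocations;

/* Models my_driver_open: allocate the 64-byte per-open context. */
static struct my_ctx *my_open(void)
{
    struct my_ctx *ctx = malloc(sizeof(*ctx));
    if (ctx)
        live_allocations++;
    return ctx;
}

/* Models my_driver_release: the matching free that the leaky version
 * forgot. */
static void my_release(struct my_ctx *ctx)
{
    assert(ctx != NULL);
    free(ctx);
    live_allocations--;
}
```

After a balanced `my_open()` / `my_release()` pair the counter returns to zero. In a real driver the pointer would typically be kept somewhere the release handler can reach it, but that detail is outside this sketch.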
UBSAN: undefined behavior detection
CONFIG_UBSAN=y
CONFIG_UBSAN_SANITIZE_ALL=y
Detects: integer overflow, out-of-bounds array indexing, null pointer dereference, misaligned access, and more.
UBSAN: Undefined behaviour in drivers/mydriver/my_driver.c:42:15
signed integer overflow:
2147483647 + 1 cannot be represented in type 'int'
lockdep: deadlock detection
CONFIG_LOCKDEP=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCKDEP=y
lockdep tracks the order in which every lock is acquired at runtime and detects potential deadlocks.
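The idea of order tracking can be illustrated with a toy lock-order graph. This is a simplified sketch, not lockdep's real implementation: locks are small integers, `record_order` and `reachable` are hypothetical names, and each "took B while holding A" event adds an edge A -> B unless that edge would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8

/* order[a][b] is true once lock b has been taken while a was held. */
static bool order[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: is `to` reachable from `from` via recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_LOCKS; next++)
        if (order[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/* Records "acquired `taken` while holding `held`".
 * Returns false if the new edge would close a cycle, i.e. the reverse
 * ordering has already been observed (a possible deadlock). */
static bool record_order(int held, int taken)
{
    bool seen[MAX_LOCKS] = { false };

    assert(held >= 0 && held < MAX_LOCKS);
    assert(taken >= 0 && taken < MAX_LOCKS);
    if (reachable(taken, held, seen))
        return false;           /* taken -> ... -> held already exists */
    order[held][taken] = true;
    return true;
}
```

If lock 0 stands for `&dev->mutex` and lock 1 for `&priv->lock`, `record_order(0, 1)` succeeds and a later `record_order(1, 0)` is rejected. Note that no actual deadlock has to occur; observing both orderings once is enough. The report the kernel prints for such an inversion looks like this: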
WARNING: possible circular locking dependency detected
my_driver/1234 is trying to acquire lock:
(&priv->lock){+.+.}, at: my_driver_write+0x45
but task is already holding lock:
(&dev->mutex){+.+.}, at: my_driver_ioctl+0x23
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&dev->mutex){+.+.}:
my_driver_ioctl+0x23
-> #0 (&priv->lock){+.+.}:
my_driver_write+0x45
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&dev->mutex);
                               lock(&priv->lock);
                               lock(&dev->mutex);   ← waits for CPU0
  lock(&priv->lock);                                ← waits for CPU1
DEADLOCK
Combined debugging configuration
Recommended kernel configuration for the development phase:
# Memory debugging
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_SLUB_DEBUG=y
CONFIG_DEBUG_PAGEALLOC=y
# Lock debugging
CONFIG_LOCKDEP=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
# General debugging
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_INFO=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_DYNAMIC_DEBUG=y
CONFIG_FRAME_POINTER=y
# Note: these options significantly reduce performance; use them for development and debugging only