在一个太阳毒辣的周五,刚到公司的我还没吃完手里的面包,肯尼斯把我抓到他的屏幕前:“你来说说这段代码什么意思。”
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
我凝视屏幕一分钟,问道:“这...有注释吗?”
见状,肯尼斯渐渐露出笑容。我仿佛从肯尼斯的笑脸中,看到了下一分钟认怂的我。
一分钟后,肯尼斯从 Kernel 的双向链表给我讲起...
测试环境
下面的 Kernel 源码以及测试环境均为 RHEL7.3:
kernel-3.10.0-514.el7.x86_64
gcc-4.8.5-11.el7.x86_64
通常的双向链表实现
我理解的双向链表是长这个样子的:
gpointer data;
GList *next;
GList *prev;
};
双向链表由一个个的节点组成,每个节点都包含着数据以及分别之前前后节点的指针。画个示意图:
Linux Kernel 的双向链表实现
Kernel 的链表是通过 "Intrusive list" 来实现的。也就是这样:
struct list_head *next, *prev;
};
好吧,数据存哪? 事实上,这个链表本身不存数据,它只负责连接数据。画个示意图:
数据是由单独的数据结构定义,只需要在这些数据结构中加入 list_head, 并把 list_head 放到链表里即可。这样做的一个好处是,一个数据结构可以同时存在于多个链表中。如上图, Data1可以有两个不同的 list_head,两个 list_head 在不同的两个链表中。
问题来了,通过这些 list_head,怎样能找到其所在的数据结构?比如,得到了上图"Data1"中"List1"的地址,怎样可以得到"Data1"的地址?这就用到了最开始的那个宏函数了。(它是有注释的)
* container_of - cast a member of a structure out to the containing structure
* @ptr: the pointer to the member.
* @type: the type of the container struct this is embedded in.
* @member: the name of the member within the struct.
*
*/
#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
给出一个 type 里的 member 的指针 ptr, container_of 可以返回该 type 的起始地址。 offsetof() 用于找到 member 相对 type 的偏移 offset,将 member 的地址与 offset 相减,即可得到 type 的起始地址。举个例子:
#include <stdlib.h>
#define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
struct student{
int id;
int age;
};
int main(int agrc, char **argv)
{
struct student lihua;
struct student *tmp;
lihua.id = 2001;
lihua.age = 10;
printf("Relative offset of age to student = %d\n", ((size_t)&((struct student *)0)->age));
const typeof( ((struct student *)0)->age ) *__mptr = &lihua.age;
printf("Pointer to lihua.age = %p\n", __mptr);
tmp = (struct student *)( (char *)__mptr - offsetof(struct student, age));
printf("Calculated pointer to lihua = %p\n", tmp);
printf("The actual pointer = %p\n", &lihua);
return 0;
}
上面的程序首先定义一个 struct student, 里面有 id 和 age 两个 member。定义一个类型为 struct student 的变量 lihua, 如果能知道 lihua.age 的地址,即可推算出 lihua 的地址。
$ ./test Relative offset of age to student = 4 Pointer to lihua.age = 0x7fffdb408b84 Calculated pointer to lihua = 0x7fffdb408b80 The actual pointer = 0x7fffdb408b80
另外,从 container_of 的第一条语句可以知道 __mptr 的值与 ptr 是相等的,这条语句有什么存在意义呢?
它的一个作用是,它可以验证 member 是否存在于 type 之中,避免犯错。例如,捏造一个叫 score 的 member,编译的时候就会报错。
$ gcc -Wall -o test test.c
test.c:19:40: error: ‘struct student’ has no member named ‘score’
const typeof( ((struct student *)0)->score ) *__mptr = &lihua.age;
Kernel 的应用例子
双向链表在 Kernel 中大量存在,下面将用 task_struct 进行验证。
对于每一个进程(task),Kernel 都会用一个 task_struct 来记录相关信息。在 task_struct 中,有一个 struct list_head tasks 的定义。接下来将验证能否通过这个 tasks 链表,找到其他 task_struct 的信息。
volatile long state;
void *stack;
atomic_t usage;
<--snip-->
struct list_head tasks;
}
随便找个进程作为例子:
[ffff88007bc53ec0] volatile long state;
<--snip-->
[ffff88007bc542f0] struct list_head tasks; <---------
可以算出,struct list head tasks 相对于 task_struct 起始地址的 offset 是 0x430.
hexadecimal: 430
从这个 task_struct 的 list_head tasks 中,可以读取下一个 list_head tasks 的地址。
struct list_head {
[ffff88007bc542f0] struct list_head *next;
[ffff88007bc542f8] struct list_head *prev;
}
crash> list_head 0xffff88007bc542f0
struct list_head {
next = 0xffff88007a2d8430,
prev = 0xffff88003657a390
}
可以看到,它指向的下一个 list_head tasks 的地址是 0xffff88007a2d8430,进而可以算出对应 task_struct 的地址:
hexadecimal: ffff88007a2d8000
从这个 task_struct 的 list_head tasks 中,又可以看到它的 prev 其实是指向原先的 task_struct 的 tasks (0xffff88007bc542f0):
struct task_struct {
[ffff88007a2d8000] volatile long state;
<--snip-->
[ffff88007a2d8430] struct list_head tasks;
crash> list_head 0xffff88007a2d8430
struct list_head {
next = 0xffff880036712390,
prev = 0xffff88007bc542f0 <---------
}
从另一个方向同样可以验证:
hexadecimal: ffff880036579f60
crash> task_struct 0xffff880036579f60 -ox
struct task_struct {
[ffff880036579f60] volatile long state;
<--snip-->
[ffff88003657a390] struct list_head tasks;
crash> list_head 0xffff88003657a390
struct list_head {
next = 0xffff88007bc542f0, <---------
prev = 0xffff88003657e250
}
使用示例
道理都懂,但是这样的双向链表要怎么用?举个例子吧。
首先,从 kernel 代码中抠出双向链表的部分实现,放到 list.h 中。其中,LIST_HEAD() 用于初始化链表, list_add() 用于向链表中添加一个 entry,list_entry 用于获取 entry 所在的数据结构。
#define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
/** include/linux/kernel.h **/
/**
* container_of - cast a member of a structure out to the containing structure
* @ptr: the pointer to the member.
* @type: the type of the container struct this is embedded in.
* @member: the name of the member within the struct.
*
*/
#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
/** include/linux/types.h **/
struct list_head {
struct list_head *next, *prev;
};
/** include/linux/list.h **/
#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) \
struct list_head name = LIST_HEAD_INIT(name)
static inline void INIT_LIST_HEAD(struct list_head *list)
{
list->next = list;
list->prev = list;
}
/*
* Insert a new entry between two known consecutive entries.
*
* This is only for internal list manipulation where we know
* the prev/next entries already!
*/
static inline void __list_add(struct list_head *new,
struct list_head *prev,
struct list_head *next)
{
next->prev = new;
new->next = next;
new->prev = prev;
prev->next = new;
}
/**
* list_add - add a new entry
* @new: new entry to be added
* @head: list head to add it after
*
* Insert a new entry after the specified head.
* This is good for implementing stacks.
*/
static inline void list_add(struct list_head *new, struct list_head *head)
{
__list_add(new, head, head->next);
}
/**
* list_entry - get the struct for this entry
* @ptr: the &struct list_head pointer.
* @type: the type of the struct this is embedded in.
* @member: the name of the list_struct within the struct.
*/
#define list_entry(ptr, type, member) \
container_of(ptr, type, member)
下面的程序进行以下操作:
- 定义 struct person,里面包含 struct list_head list;
- 初始化链表 persons;
- 创建两个 person,并添加到 persons 链表中;
- 读取 persons 链表中每一个 person 的信息;
#include <stdio.h>
#include <stdlib.h>
#include "list.h"
struct person{
unsigned int id;
unsigned int height;
unsigned int weight;
struct list_head list;
};
int main(int argc, char **argv)
{
struct person *tmp;
struct list_head *pos;
struct person persons;
INIT_LIST_HEAD(&persons.list);
// add 1st person
tmp = (struct person *)malloc(sizeof(struct person));
tmp->id = 1;
tmp->height = 170;
tmp->weight = 65;
printf("%d, %d, %d\n", tmp->id, tmp->height, tmp->weight);
list_add(&(tmp->list), &(persons.list));
// add 2nd person
tmp = (struct person *)malloc(sizeof(struct person));
tmp->id = 2;
tmp->height = 160;
tmp->weight = 60;
printf("%d, %d, %d\n", tmp->id, tmp->height, tmp->weight);
list_add(&(tmp->list), &(persons.list));
// read persons
pos = persons.list.next;
tmp = list_entry(pos, struct person, list);
printf("item1 %d, %d, %d \n", tmp->id, tmp->height, tmp->weight);
pos = pos->next;
tmp = list_entry(pos, struct person, list);
printf("item2 %d, %d, %d \n", tmp->id, tmp->height, tmp->weight);
pos = pos->next;
tmp = list_entry(pos, struct person, list);
printf("item3 %d, %d, %d \n", tmp->id, tmp->height, tmp->weight);
return 0;
}
从输出可以验证,只要知道 person 中 list 的地址,就能读取到每个 person 的内容。至于 item3 是链表的头,这个程序没给它进行初始化,所以数值比较奇怪。
1, 170, 65
2, 160, 60
item1 2, 160, 60
item2 1, 170, 65
item3 1, 0, 0
参考资料
[1] Data Structures in the Linux Kernel - Linux Inside
https://0xax.gitbooks.io/linux-insides/DataStructures/dlist.html
[2] Linux Kernel Linked List Explained
https://isis.poly.edu/kulesh/stuff/src/klist/
思考题
为什么作者在开头用“太阳毒辣”来形容周五?