The Linux Kernel Module Programming Guide
Chapter 4. Character Device Files

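4.1. Character Device Drivers

4.1.1. The file_operations Structure

The file_operations structure is defined in linux/fs.h and holds pointers to functions defined by the driver that perform various operations on the device. Each field of the structure corresponds to the address of a function the driver defines to handle a requested operation.

For example, every character driver needs to define a function that reads from the device. The file_operations structure holds the address of the module's function that performs that operation. Here is the definition as it appears in kernel 2.6.5: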

struct file_operations {
	struct module *owner;
	 loff_t(*llseek) (struct file *, loff_t, int);
	 ssize_t(*read) (struct file *, char __user *, size_t, loff_t *);
	 ssize_t(*aio_read) (struct kiocb *, char __user *, size_t, loff_t);
	 ssize_t(*write) (struct file *, const char __user *, size_t, loff_t *);
	 ssize_t(*aio_write) (struct kiocb *, const char __user *, size_t,
			      loff_t);
	int (*readdir) (struct file *, void *, filldir_t);
	unsigned int (*poll) (struct file *, struct poll_table_struct *);
	int (*ioctl) (struct inode *, struct file *, unsigned int,
		      unsigned long);
	int (*mmap) (struct file *, struct vm_area_struct *);
	int (*open) (struct inode *, struct file *);
	int (*flush) (struct file *);
	int (*release) (struct inode *, struct file *);
	int (*fsync) (struct file *, struct dentry *, int datasync);
	int (*aio_fsync) (struct kiocb *, int datasync);
	int (*fasync) (int, struct file *, int);
	int (*lock) (struct file *, int, struct file_lock *);
	 ssize_t(*readv) (struct file *, const struct iovec *, unsigned long,
			  loff_t *);
	 ssize_t(*writev) (struct file *, const struct iovec *, unsigned long,
			   loff_t *);
	 ssize_t(*sendfile) (struct file *, loff_t *, size_t, read_actor_t,
			     void __user *);
	 ssize_t(*sendpage) (struct file *, struct page *, int, size_t,
			     loff_t *, int);
	unsigned long (*get_unmapped_area) (struct file *, unsigned long,
					    unsigned long, unsigned long,
					    unsigned long);
};

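Some operations are not implemented by a driver. For example, a driver that handles a video card has no need to read from a directory structure. The corresponding entries in the file_operations structure should be set to NULL.

A gcc extension makes assigning to this structure more convenient. You will see it in modern drivers, and it may catch you by surprise. This is what the extension syntax looks like: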

struct file_operations fops = {
	read: device_read,
	write: device_write,
	open: device_open,
	release: device_release
};

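However, there is also a C99 way of assigning to elements of a structure, and it is definitely preferred over the GNU extension. The version of gcc the author used when writing this, 2.95, supports the new C99 syntax. You should use this syntax in case someone wants to port your driver; it helps with compatibility: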

struct file_operations fops = {
	.read = device_read,
	.write = device_write,
	.open = device_open,
	.release = device_release
};

The meaning is clear, and you should be aware that any member of the structure that you do not explicitly assign will be initialized to NULL by gcc.
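The NULL-initialization behavior can be tried out in user space. The following is a minimal sketch, not kernel code: struct demo_ops, demo_fops and the demo_* functions are hypothetical stand-ins invented for illustration, and the NULL check in demo_do_write only mimics the idea of testing an entry before calling through it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct file_operations. */
struct demo_ops {
	int (*open)(void);
	int (*release)(void);
	long (*read)(char *buf, size_t len);
	long (*write)(const char *buf, size_t len);
	long (*llseek)(long offset, int whence);
};

static int demo_open(void)
{
	return 0;
}

/* Fills buf with len copies of 'x' and reports how many bytes it wrote. */
static long demo_read(char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		buf[i] = 'x';
	return (long)len;
}

/* Only .open and .read are named; every other member starts out NULL. */
static const struct demo_ops demo_fops = {
	.open = demo_open,
	.read = demo_read,
};

/* Check the table entry before calling through it. */
static long demo_do_write(const struct demo_ops *ops, const char *buf,
			  size_t len)
{
	assert(ops != NULL);
	if (ops->write == NULL)
		return -1;	/* operation not implemented */
	return ops->write(buf, len);
}
```

Calling demo_do_write(&demo_fops, "hi", 2) takes the "not implemented" branch here, because .write was never assigned.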

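An instance of struct file_operations containing pointers to the functions that implement the read, write, open, ... system calls is commonly named fops.

4.1.2. The file Structure

Each device is represented in the kernel by a file structure, which is defined in linux/fs.h. Be aware that a file is a kernel-level structure and never appears in a user space program. It is not the same thing as a FILE, which is defined by glibc and never appears in a kernel space function. Its name is also a bit misleading: it represents an abstract open "file", not a file on a disk, which is represented by a structure named inode.

An instance of struct file is commonly named filp. You will also see it referred to as struct file file. Resist the temptation.

Go ahead and look at the definition of file. Most of the entries you see, such as struct dentry, are not used by device drivers, and you can ignore them. This is because drivers do not fill file directly; they only use structures contained in file, which are created elsewhere.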

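4.1.3. Registering a Device

As discussed earlier, char devices are accessed through device files, usually located in /dev[1]. The major number tells you which driver handles which device file. The minor number is used only by the driver itself, to tell which device it is operating on in case the driver handles more than one device.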

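Adding a driver to your system means registering it with the kernel. This is synonymous with assigning it a major number during the module's initialization. You do this with the register_chrdev function, declared in linux/fs.h: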

int register_chrdev(unsigned int major, const char *name, struct file_operations *fops);

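Here unsigned int major is the major number you want to request, const char *name is the name of the device as it will appear in /proc/devices, and struct file_operations *fops is a pointer to the file_operations table for your driver. A negative return value means the registration failed. Note that we did not pass the minor number to register_chrdev. That is because the kernel does not care about the minor number; only our driver uses it.

Now the question is, how do you get a major number without hijacking one that is already in use? The easiest way would be to look through Documentation/devices.txt and pick an unused one. That is a bad way of doing things, because you can never be sure that the number you picked will not be assigned later. The answer is that you can ask the kernel to assign you a dynamic major number.

If you pass a major number of 0 to register_chrdev, the return value will be the dynamically allocated major number. The downside is that you cannot create a device file in advance, since you do not know what the major number will be. There are a few ways around this. First, the driver itself can print the newly assigned number, and we can create the device file by hand. Second, the newly registered device will have an entry in /proc/devices, and we can either create the device file by hand or write a shell script that reads the file and creates the device file. Third, we can have our driver create the device file with the mknod system call after a successful registration and remove it during the call to cleanup_module.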
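The second approach, reading /proc/devices, does not have to be a shell script. Here is a rough user-space sketch in C. It assumes the usual layout of that file, a "Character devices:" section followed by a "Block devices:" section with one "number name" pair per line; find_char_major is a name invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scans text in the format of /proc/devices and returns the major number
 * listed for name in the "Character devices:" section, or -1 if the name
 * does not appear there.
 */
static int find_char_major(FILE *fp, const char *name)
{
	char line[256];
	char dev[128];
	int number;
	int in_char_section = 0;

	assert(fp != NULL && name != NULL);
	while (fgets(line, sizeof(line), fp) != NULL) {
		if (strncmp(line, "Character devices:", 18) == 0) {
			in_char_section = 1;
			continue;
		}
		if (strncmp(line, "Block devices:", 14) == 0) {
			in_char_section = 0;
			continue;
		}
		/* Lines that are not "number name" pairs are skipped. */
		if (in_char_section &&
		    sscanf(line, "%d %127s", &number, dev) == 2 &&
		    strcmp(dev, name) == 0)
			return number;
	}
	return -1;
}
```

A caller would open /proc/devices with fopen, pass the stream and the name "chardev", and then feed the result to mknod.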

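4.1.4. Unregistering a Device

We cannot allow the kernel module to be rmmod'ed whenever root feels like it. If the device file is opened by a process and we then remove the kernel module, using the file would cause a call to the memory location where the appropriate function (read/write) used to be. If we are lucky, no other code was loaded there, and we get an ugly error message. If we are unlucky, another kernel module was loaded into the same location, which means a jump into the middle of another function within the kernel. The results of this would be impossible to predict, but they cannot be very positive.

Normally, when you do not want to allow something, you return an error code (a negative number) from the function that is supposed to do it. With cleanup_module that is impossible, because it is a void function. However, there is a counter that keeps track of how many processes are using your module. You can see its value by looking at the third field of /proc/modules. If this number is not zero, rmmod will fail. Note that you do not have to check the counter from within cleanup_module, because the check is performed for you by the system call sys_delete_module, defined in kernel/module.c. You should not use this counter directly, but there are functions defined in linux/module.h that let you increase, decrease and display it:

try_module_get(THIS_MODULE): Increment the use count.
module_put(THIS_MODULE): Decrement the use count.

It is important to keep the counter accurate; if you ever lose track of the correct usage count, you will never be able to unload the module, and it is reboot time, boys and girls. This is bound to happen to you sooner or later during a module's development.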
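While debugging a stuck use count, it can help to read the third field of /proc/modules from a program. The following user-space sketch parses a single line of that file. It assumes each line starts with "name size use-count"; module_use_count is a hypothetical helper name, and the sample line in the note after the code is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parses one line in the format of /proc/modules. If the line describes
 * the module called name, stores its third field (the use count) in
 * *count and returns 1. Returns 0 for any other line, including lines
 * whose third field is not a number.
 */
static int module_use_count(const char *line, const char *name, int *count)
{
	char mod[64];
	unsigned long size;
	int refs;

	assert(line != NULL && name != NULL && count != NULL);
	if (sscanf(line, "%63s %lu %d", mod, &size, &refs) != 3)
		return 0;
	if (strcmp(mod, name) != 0)
		return 0;
	*count = refs;
	return 1;
}
```

Given a line such as "chardev 2724 1 - Live 0xe0a5b000", the helper stores 1 in *count, which means rmmod would refuse to unload the module until the device file is closed.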

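4.1.5. chardev.c

The next code sample creates a char driver named chardev. You can cat its device file (or open the file with a program) and the driver will put the number of times the device file has been read from into the file. We do not support writing to the file (like echo "hi" > /dev/hello), but we catch these attempts and tell the user that the operation is not supported. Do not worry if you do not see what we do with the data we read into the buffer; we do not do much with it. We simply read in the data and print a message acknowledging that we received it.

Example 4-1. chardev.c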

/*
 *  chardev.c: Creates a read-only char device that says how many times
 *  you've read from the dev file
 */

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/fs.h>
#include <asm/uaccess.h>	/* for put_user */

/*  
 *  Prototypes - this would normally go in a .h file
 */
int init_module(void);
void cleanup_module(void);
static int device_open(struct inode *, struct file *);
static int device_release(struct inode *, struct file *);
static ssize_t device_read(struct file *, char *, size_t, loff_t *);
static ssize_t device_write(struct file *, const char *, size_t, loff_t *);

#define SUCCESS 0
#define DEVICE_NAME "chardev"	/* Dev name as it appears in /proc/devices   */
#define BUF_LEN 80		/* Max length of the message from the device */

/* 
 * Global variables are declared as static, so are global within the file. 
 */

static int Major;		/* Major number assigned to our device driver */
static int Device_Open = 0;	/* Is device open?  
				 * Used to prevent multiple access to device */
static char msg[BUF_LEN];	/* The msg the device will give when asked */
static char *msg_Ptr;

static struct file_operations fops = {
	.read = device_read,
	.write = device_write,
	.open = device_open,
	.release = device_release
};

/*
 * This function is called when the module is loaded
 */
int init_module(void)
{
	Major = register_chrdev(0, DEVICE_NAME, &fops);

	if (Major < 0) {
		printk(KERN_ALERT "Registering char device failed with %d\n", Major);
		return Major;
	}

	printk(KERN_INFO "I was assigned major number %d. To talk to\n", Major);
	printk(KERN_INFO "the driver, create a dev file with\n");
	printk(KERN_INFO "'mknod /dev/%s c %d 0'.\n", DEVICE_NAME, Major);
	printk(KERN_INFO "Try various minor numbers. Try to cat and echo to\n");
	printk(KERN_INFO "the device file.\n");
	printk(KERN_INFO "Remove the device file and module when done.\n");

	return SUCCESS;
}

/*
 * This function is called when the module is unloaded
 */
void cleanup_module(void)
{
	/* 
	 * Unregister the device 
	 */
	int ret = unregister_chrdev(Major, DEVICE_NAME);
	if (ret < 0)
		printk(KERN_ALERT "Error in unregister_chrdev: %d\n", ret);
}

/*
 * Methods
 */

/* 
 * Called when a process tries to open the device file, like
 * "cat /dev/mycharfile"
 */
static int device_open(struct inode *inode, struct file *file)
{
	static int counter = 0;

	if (Device_Open)
		return -EBUSY;

	Device_Open++;
	sprintf(msg, "I already told you %d times Hello world!\n", counter++);
	msg_Ptr = msg;
	try_module_get(THIS_MODULE);

	return SUCCESS;
}

/* 
 * Called when a process closes the device file.
 */
static int device_release(struct inode *inode, struct file *file)
{
	Device_Open--;		/* We're now ready for our next caller */

	/* 
	 * Decrement the usage count, or else once you opened the file, you'll
	 * never get rid of the module. 
	 */
	module_put(THIS_MODULE);

	return 0;
}

/* 
 * Called when a process, which already opened the dev file, attempts to
 * read from it.
 */
static ssize_t device_read(struct file *filp,	/* see include/linux/fs.h   */
			   char *buffer,	/* buffer to fill with data */
			   size_t length,	/* length of the buffer     */
			   loff_t * offset)
{
	/*
	 * Number of bytes actually written to the buffer 
	 */
	int bytes_read = 0;

	/*
	 * If we're at the end of the message, 
	 * return 0 signifying end of file 
	 */
	if (*msg_Ptr == 0)
		return 0;

	/* 
	 * Actually put the data into the buffer 
	 */
	while (length && *msg_Ptr) {

		/* 
		 * The buffer is in the user data segment, not the kernel 
		 * segment so "*" assignment won't work.  We have to use 
		 * put_user which copies data from the kernel data segment to
		 * the user data segment. 
		 */
		put_user(*(msg_Ptr++), buffer++);

		length--;
		bytes_read++;
	}

	/* 
	 * Most read functions return the number of bytes put into the buffer
	 */
	return bytes_read;
}

/*  
 * Called when a process writes to dev file: echo "hi" > /dev/hello 
 */
static ssize_t
device_write(struct file *filp, const char *buff, size_t len, loff_t * off)
{
	printk(KERN_ALERT "Sorry, this operation isn't supported.\n");
	return -EINVAL;
}
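
The way device_read hands out the message across several read calls can be tried without loading a module. The sketch below is a user-space imitation, not driver code: sim_msg, sim_ptr, sim_open and sim_read are hypothetical names, and a plain assignment stands in for put_user because both buffers live in the same address space here.

```c
#include <assert.h>
#include <stddef.h>

static const char sim_msg[] = "I already told you 0 times Hello world!\n";
static const char *sim_ptr = sim_msg;	/* plays the role of msg_Ptr */

/* Plays the role of device_open(): rewind to the start of the message. */
static void sim_open(void)
{
	sim_ptr = sim_msg;
}

/*
 * Plays the role of device_read(): copy at most length bytes, advance
 * the cursor, and return the number of bytes copied. A return value of
 * 0 means the whole message has been consumed (end of file).
 */
static long sim_read(char *buffer, size_t length)
{
	long bytes_read = 0;

	assert(buffer != NULL);
	while (length && *sim_ptr) {
		*buffer++ = *sim_ptr++;	/* the driver uses put_user() here */
		length--;
		bytes_read++;
	}
	return bytes_read;
}
```

A caller that keeps reading with a small buffer receives the message in pieces and then a 0, which is how cat knows when to stop.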

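4.1.6. Writing Modules for Multiple Kernel Versions

The system calls, which are the major interface the kernel shows to processes, generally stay the same across versions. A new system call may be added, but usually the old ones behave exactly as they used to. This is necessary for backward compatibility: a new kernel version is not supposed to break regular processes. In most cases, the device files also remain the same. On the other hand, the internal interfaces within the kernel can and do change between versions.

Linux kernel versions are divided between stable versions (n.<even number>.m) and development versions (n.<odd number>.m). The development versions include all the cool new ideas, including those that will be considered a mistake, or reimplemented, in the next version. As a result, you cannot trust the interface to remain the same in those versions (which is why I do not bother to support them in this book; it is too much work and it would become dated too quickly). In the stable versions, on the other hand, we can expect the interface to remain the same regardless of the bug fix version (the m number).

There are differences between kernel versions, and if you want to support multiple kernel versions, you will find yourself having to code conditional compilation directives. The way to do this is to compare the macro LINUX_VERSION_CODE to the macro KERNEL_VERSION. In version a.b.c of the kernel, the value of this macro would be $2^{16}a+2^{8}b+c$.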
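The packing formula and the comparison can be illustrated outside the kernel tree. In the sketch below, DEMO_KERNEL_VERSION and DEMO_VERSION_CODE are hypothetical user-space stand-ins for KERNEL_VERSION and LINUX_VERSION_CODE, and the build is simply assumed to target 2.6.5; a real module would include linux/version.h and use the kernel's own macros.

```c
#include <assert.h>

/* Stand-in for KERNEL_VERSION: packs a.b.c as 2^16*a + 2^8*b + c. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Stand-in for LINUX_VERSION_CODE; assumes a build against 2.6.5. */
#define DEMO_VERSION_CODE DEMO_KERNEL_VERSION(2, 6, 5)

/* Same computation as a function, with range checks on the fields. */
static int demo_version_code(int a, int b, int c)
{
	assert(a >= 0 && b >= 0 && b < 256 && c >= 0 && c < 256);
	return DEMO_KERNEL_VERSION(a, b, c);
}

/* Selects a code path at compile time, as a module would. */
static const char *demo_api_flavour(void)
{
#if DEMO_VERSION_CODE >= DEMO_KERNEL_VERSION(2, 6, 0)
	return "2.6-style interface";
#else
	return "pre-2.6 interface";
#endif
}
```

For 2.6.5 the packed value is 131072 + 1536 + 5 = 132613, so any comparison against DEMO_KERNEL_VERSION(2, 6, 0), which is 132608, selects the 2.6 branch.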

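While previous versions of this guide showed in detail how to write backward compatible code with such constructs, we decided to break with this tradition for the better. People interested in doing so should now use a LKMPG with a version matching their kernel. We decided to version the LKMPG like the kernel, at least as far as the major and minor numbers are concerned. We use the patchlevel for our own versioning, so use LKMPG version 2.4.x for kernels 2.4.x, LKMPG version 2.6.x for kernels 2.6.x, and so on. Also make sure that you always use current, up-to-date versions of both kernel and guide.

Update: What we said above was true for kernels up to and including 2.6.10. You might already have noticed that recent kernels look different. In case you have not, they look like 2.6.x.y now. The meaning of the first three items basically stays the same, but a subpatchlevel has been added that indicates security fixes until the next stable patchlevel is out. So people can choose between a stable tree with security updates and the latest kernel as the developer tree. Search the kernel mailing list archives if you are interested in the full story.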
