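So far we have two ways to generate output from kernel modules: we can register a device driver and use mknod to create a device file, or we can create a /proc file. Either way, the kernel module can tell us anything it likes. The only problem is that we have no way to answer back. The first way we will send input to kernel modules is by writing back to the /proc file.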
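Because the proc filesystem was written mainly to let the kernel report its state to processes, it makes no special provision for input. The struct proc_dir_entry does not include a pointer to an input function the way it includes a pointer to an output function. Instead, to write into a /proc file, we need to use the standard filesystem mechanism.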
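Linux has a standard mechanism for filesystem registration. Since every filesystem has to have its own functions to handle inode and file operations[1], there is a special structure that holds pointers to all those functions, struct inode_operations, which in turn includes a pointer to a struct file_operations. In /proc, whenever we register a new file, we are allowed to specify which struct inode_operations will be used to access it. This is the mechanism we use: a struct inode_operations that includes a pointer to a struct file_operations, which includes pointers to our module_input and module_output functions.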
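The chain of pointers described above can be sketched in ordinary user-space C. This is only an illustration under stated assumptions: every toy_ name is hypothetical, the structures are cut down to two or three members, and memcpy stands in for the kernel's user-memory accessors. It is not the kernel's real layout, only the shape of the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct toy_file_operations {
    long (*read)(char *buf, size_t len);         /* module output */
    long (*write)(const char *buf, size_t len);  /* module input  */
};

struct toy_inode_operations {
    struct toy_file_operations *default_file_ops;
    int (*permission)(int op);                   /* unused in this sketch */
};

struct toy_proc_entry {
    const char *name;
    struct toy_inode_operations *ops;
};

static char toy_message[80];

/* "read" from the process's point of view: the module produces output. */
static long toy_output(char *buf, size_t len)
{
    size_t n = strlen(toy_message);
    if (n > len)
        n = len;
    memcpy(buf, toy_message, n);
    return (long)n;
}

/* "write" from the process's point of view: the module receives input. */
static long toy_input(const char *buf, size_t len)
{
    size_t n = len < sizeof(toy_message) - 1 ? len : sizeof(toy_message) - 1;
    memcpy(toy_message, buf, n);
    toy_message[n] = '\0';
    return (long)n;
}

static struct toy_file_operations toy_fops = { toy_output, toy_input };
static struct toy_inode_operations toy_iops = { &toy_fops, NULL };
static struct toy_proc_entry toy_entry = { "rw_test", &toy_iops };

/* What a write on the entry boils down to: follow the pointers. */
static long toy_dispatch_write(struct toy_proc_entry *e,
                               const char *buf, size_t len)
{
    assert(e != NULL && e->ops != NULL && e->ops->default_file_ops != NULL);
    return e->ops->default_file_ops->write(buf, len);
}

/* The same lookup for a read. */
static long toy_dispatch_read(struct toy_proc_entry *e, char *buf, size_t len)
{
    assert(e != NULL && e->ops != NULL && e->ops->default_file_ops != NULL);
    return e->ops->default_file_ops->read(buf, len);
}
```

Calling toy_dispatch_write(&toy_entry, "hello", 5) followed by toy_dispatch_read stores and then returns the same five bytes, which mirrors how a write to the /proc file ends up in module_input and a later read is served by module_output.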
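It is important to note that the standard roles of read and write are reversed in the kernel. Read functions are used for output, whereas write functions are used for input. The reason is that read and write refer to the user's point of view: if a process reads something from the kernel, the kernel needs to output it, and if a process writes something to the kernel, the kernel receives it as input.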
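Another interesting point is the module_permission function. This function is called whenever a process tries to do something with the /proc file, and it can decide whether to allow access or not. Right now its decision is based only on the operation and the uid of the current user (available in current, a pointer to a structure that holds information about the currently running process), but it could be based on anything we like, such as what other processes are doing with the same file, the time of day, or the last input we received.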
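As a rough illustration of basing the decision on something besides the uid, here is a user-space sketch of a permission check that also looks at the time of day. The function name toy_permission, the explicit euid and hour parameters, and the 9-to-17 window are all assumptions made for this example; a real module would read the uid from current and would obtain the time from the kernel.

```c
#include <assert.h>
#include <errno.h>

/* Access-mask values, matching the ones used in this chapter's listing. */
#define TOY_MAY_WRITE 2
#define TOY_MAY_READ  4

/* Return 0 to allow the operation, or a negative errno value to refuse it.
 * euid: effective uid of the caller (hypothetical explicit parameter).
 * hour: local hour of day, 0..23 (hypothetical extra input). */
static int toy_permission(int op, unsigned int euid, int hour)
{
    assert(hour >= 0 && hour < 24);

    /* Everybody may read at any time. */
    if (op == TOY_MAY_READ)
        return 0;

    /* Only root may write, and only during "office hours". */
    if (op == TOY_MAY_WRITE && euid == 0 && hour >= 9 && hour < 17)
        return 0;

    /* Anything else is denied. */
    return -EACCES;
}
```

With this policy, toy_permission(TOY_MAY_WRITE, 0, 10) allows the write, while the same call with hour 20, or with a non-zero euid, is refused with -EACCES.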
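The reason for put_user and get_user is that Linux memory (under the Intel architecture; it may be different under other processors) is segmented. This means that a pointer, by itself, does not reference a unique location in memory, only a location within a memory segment, and you need to know which segment it is to be able to use it. There is one memory segment for the kernel, and one for each of the processes.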
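The only memory segment accessible to a process is its own, so when you write regular programs that run as processes there is no need to worry about segments. When you write a kernel module, you normally want to access the kernel memory segment, which the system handles automatically. However, when the contents of a memory buffer need to be passed between the currently running process and the kernel, the kernel function receives a pointer to a buffer that lives in the process segment. The put_user and get_user macros let you access that memory.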
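The idea that an address only means something together with a segment can be modelled in plain user-space C. The sketch below is a toy model, not how the kernel implements put_user or get_user: each segment is a small array, an "address" is just an offset into it, and the hypothetical helpers toy_get_byte, toy_put_byte and toy_copy_in must be told which segment to use.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SEGMENT_SIZE 64

/* A toy memory segment: one for the "kernel", one per "process". */
struct toy_segment {
    unsigned char bytes[TOY_SEGMENT_SIZE];
};

/* Read one byte at addr inside seg; returns 0 on success, -1 if the
 * address lies outside the segment. */
static int toy_get_byte(const struct toy_segment *seg, size_t addr,
                        unsigned char *out)
{
    if (addr >= TOY_SEGMENT_SIZE)
        return -1;
    *out = seg->bytes[addr];
    return 0;
}

/* Write one byte at addr inside seg; same return convention. */
static int toy_put_byte(struct toy_segment *seg, size_t addr,
                        unsigned char value)
{
    if (addr >= TOY_SEGMENT_SIZE)
        return -1;
    seg->bytes[addr] = value;
    return 0;
}

/* Copy up to len bytes from the user segment into the kernel segment,
 * one byte at a time, and return how many bytes were actually copied. */
static size_t toy_copy_in(struct toy_segment *kernel, size_t kernel_addr,
                          const struct toy_segment *user, size_t user_addr,
                          size_t len)
{
    size_t i;

    assert(kernel != NULL && user != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c;

        if (toy_get_byte(user, user_addr + i, &c) != 0)
            break;
        if (toy_put_byte(kernel, kernel_addr + i, c) != 0)
            break;
    }
    return i;
}
```

Note that address 8 in the user segment and address 8 in the kernel segment are different bytes; the same number names two locations until a segment is chosen.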
Example 6-1. procfs.c
/* procfs.c - create a "file" in /proc, which allows both input and output. */

#include <linux/kernel.h>   /* We're doing kernel work */
#include <linux/module.h>   /* Specifically, a module */

/* Necessary because we use proc fs */
#include <linux/proc_fs.h>

/* In 2.2.3 /usr/include/linux/version.h includes a
 * macro for this, but 2.0.35 doesn't - so I add it
 * here if necessary. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c))
#endif

#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
#include <asm/uaccess.h>  /* for get_user and put_user */
#endif


/* The module's file functions ********************** */

/* Here we keep the last message received, to prove
 * that we can process our input */
#define MESSAGE_LENGTH 80
static char Message[MESSAGE_LENGTH];


/* Since we use the file operations struct, we can't
 * use the special proc output provisions - we have to
 * use a standard read function, which is this function */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
static ssize_t module_output(
    struct file *file,   /* The file read */
    char *buf,           /* The buffer to put data to (in the
                          * user segment) */
    size_t len,          /* The length of the buffer */
    loff_t *offset)      /* Offset in the file - ignore */
#else
static int module_output(
    struct inode *inode, /* The inode read */
    struct file *file,   /* The file read */
    char *buf,           /* The buffer to put data to (in the
                          * user segment) */
    int len)             /* The length of the buffer */
#endif
{
  static int finished = 0;
  int i;
  char message[MESSAGE_LENGTH+30];

  /* We return 0 to indicate end of file, that we have
   * no more information. Otherwise, processes will
   * continue to read from us in an endless loop. */
  if (finished) {
    finished = 0;
    return 0;
  }

  /* We use put_user to copy the string from the kernel's
   * memory segment to the memory segment of the process
   * that called us. get_user, BTW, is
   * used for the reverse. */
  sprintf(message, "Last input:%s", Message);
  for(i=0; i<len && message[i]; i++)
    put_user(message[i], buf+i);

  /* Notice, we assume here that the size of the message
   * is below len, or it will be received cut. In a real
   * life situation, if the size of the message is greater
   * than len then we'd return len and on the second call
   * start filling the buffer with the len+1'th byte of
   * the message. */
  finished = 1;

  return i;  /* Return the number of bytes "read" */
}


/* This function receives input from the user when the
 * user writes to the /proc file. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
static ssize_t module_input(
    struct file *file,   /* The file itself */
    const char *buf,     /* The buffer with input */
    size_t length,       /* The buffer's length */
    loff_t *offset)      /* offset to file - ignore */
#else
static int module_input(
    struct inode *inode, /* The file's inode */
    struct file *file,   /* The file itself */
    const char *buf,     /* The buffer with the input */
    int length)          /* The buffer's length */
#endif
{
  int i;

  /* Put the input into Message, where module_output
   * will later be able to use it */
  for(i=0; i<MESSAGE_LENGTH-1 && i<length; i++)
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
    get_user(Message[i], buf+i);
  /* In version 2.2 the semantics of get_user changed,
   * it no longer returns a character, but expects a
   * variable to fill up as its first argument and a
   * user segment pointer to fill it from as its
   * second.
   *
   * The reason for this change is that the version 2.2
   * get_user can also read a short or an int. The way
   * it knows the type of the variable it should read
   * is by using sizeof, and for that it needs the
   * variable itself. */
#else
    Message[i] = get_user(buf+i);
#endif
  Message[i] = '\0';  /* we want a standard, zero
                       * terminated string */

  /* We need to return the number of input characters
   * used */
  return i;
}


/* This function decides whether to allow an operation
 * (return zero) or not allow it (return a non-zero
 * which indicates why it is not allowed).
 *
 * The operation can be one of the following values:
 * 1 - Execute (run the "file" - meaningless in our case)
 * 2 - Write (input to the kernel module)
 * 4 - Read (output from the kernel module)
 *
 * This is the real function that checks file
 * permissions. The permissions returned by ls -l are
 * for reference only, and can be overridden here. */
static int module_permission(struct inode *inode, int op)
{
  /* We allow everybody to read from our module, but
   * only root (uid 0) may write to it */
  if (op == 4 || (op == 2 && current->euid == 0))
    return 0;

  /* If it's anything else, access is denied */
  return -EACCES;
}


/* The file is opened - we don't really care about
 * that, but it does mean we need to increment the
 * module's reference count. */
int module_open(struct inode *inode, struct file *file)
{
  MOD_INC_USE_COUNT;

  return 0;
}


/* The file is closed - again, interesting only because
 * of the reference count. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
int module_close(struct inode *inode, struct file *file)
#else
void module_close(struct inode *inode, struct file *file)
#endif
{
  MOD_DEC_USE_COUNT;

#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
  return 0;  /* success */
#endif
}


/* Structures to register as the /proc file, with
 * pointers to all the relevant functions. ********** */

/* File operations for our proc file. This is where we
 * place pointers to all the functions called when
 * somebody tries to do something to our file. NULL
 * means we don't want to deal with something. */
static struct file_operations File_Ops_4_Our_Proc_File =
  {
    NULL,  /* lseek */
    module_output,  /* "read" from the file */
    module_input,   /* "write" to the file */
    NULL,  /* readdir */
    NULL,  /* select */
    NULL,  /* ioctl */
    NULL,  /* mmap */
    module_open,    /* Somebody opened the file */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
    NULL,  /* flush, added here in version 2.2 */
#endif
    module_close,   /* Somebody closed the file */
    /* etc. etc. etc. (they are all given in
     * /usr/include/linux/fs.h). Since we don't put
     * anything here, the system will keep the default
     * data, which in Unix is zeros (NULLs when taken as
     * pointers). */
  };


/* Inode operations for our proc file. We need it so
 * we'll have some place to specify the file operations
 * structure we want to use, and the function we use for
 * permissions. It's also possible to specify functions
 * to be called for anything else which could be done to
 * an inode (although we don't bother, we just put
 * NULL). */
static struct inode_operations Inode_Ops_4_Our_Proc_File =
  {
    &File_Ops_4_Our_Proc_File,
    NULL,  /* create */
    NULL,  /* lookup */
    NULL,  /* link */
    NULL,  /* unlink */
    NULL,  /* symlink */
    NULL,  /* mkdir */
    NULL,  /* rmdir */
    NULL,  /* mknod */
    NULL,  /* rename */
    NULL,  /* readlink */
    NULL,  /* follow_link */
    NULL,  /* readpage */
    NULL,  /* writepage */
    NULL,  /* bmap */
    NULL,  /* truncate */
    module_permission  /* check for permissions */
  };


/* Directory entry */
static struct proc_dir_entry Our_Proc_File =
  {
    0,  /* Inode number - ignore, it will be filled by
         * proc_register[_dynamic] */
    7,  /* Length of the file name */
    "rw_test",  /* The file name */
    S_IFREG | S_IRUGO | S_IWUSR,
    /* File mode - this is a regular file which
     * can be read by its owner, its group, and everybody
     * else. Also, its owner can write to it.
     *
     * Actually, this field is just for reference, it's
     * module_permission that does the actual check. It
     * could use this field, but in our implementation it
     * doesn't, for simplicity. */
    1,  /* Number of links (directories where the
         * file is referenced) */
    0, 0,  /* The uid and gid for the file -
            * we give it to root */
    80,  /* The size of the file reported by ls. */
    &Inode_Ops_4_Our_Proc_File,
    /* A pointer to the inode structure for
     * the file, if we need it. In our case we
     * do, because we need a write function. */
    NULL
    /* The read function for the file. Irrelevant,
     * because we put it in the inode structure above */
  };


/* Module initialization and cleanup ******************* */

/* Initialize the module - register the proc file */
int init_module()
{
  /* Success if proc_register[_dynamic] is a success,
   * failure otherwise */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
  /* In version 2.2, proc_register assigns a dynamic
   * inode number automatically if it is zero in the
   * structure, so there's no more need for
   * proc_register_dynamic */
  return proc_register(&proc_root, &Our_Proc_File);
#else
  return proc_register_dynamic(&proc_root, &Our_Proc_File);
#endif
}


/* Cleanup - unregister our file from /proc */
void cleanup_module()
{
  proc_unregister(&proc_root, Our_Proc_File.low_ino);
}
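The listing's module_output uses a static finished flag and assumes the whole message fits in one read. A comment in it describes what a real implementation would do instead: hand out the message in pieces, continuing where the previous call stopped. Below is a minimal user-space sketch of that idea. The name toy_read_at and its signature are assumptions for illustration, and memcpy stands in for the put_user loop a module would need.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the next chunk of message into buf, starting at *offset.
 * Returns the number of bytes copied, 0 at end of message, or -1 for a
 * negative offset. *offset is advanced so that the next call continues
 * where this one stopped. */
static long toy_read_at(const char *message, char *buf, size_t len,
                        long *offset)
{
    size_t total, pos, n;

    assert(message != NULL && buf != NULL && offset != NULL);
    if (*offset < 0)
        return -1;

    total = strlen(message);
    pos = (size_t)*offset;
    if (pos >= total)
        return 0;            /* nothing left: report end of file */

    n = total - pos;         /* bytes still unread */
    if (n > len)
        n = len;             /* never overrun the caller's buffer */

    memcpy(buf, message + pos, n);
    *offset += (long)n;
    return (long)n;
}
```

A caller keeps reading until the function returns 0. Because the position lives in the caller's offset rather than in a static variable, two readers with separate offsets would not disturb each other, which the finished flag cannot guarantee.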
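[1] The difference between the two is that file operations deal with the file itself, while inode operations deal with ways of referencing the file, such as creating links to it.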