The Linux Kernel Module Programming Guide
Chapter 4. Character Device Files
4.1. Character Device Drivers
4.1.1. The file_operations Structure
The file_operations structure is
defined in linux/fs.h, and holds pointers
to functions defined by the driver that perform various operations
on the device. Each field of the structure corresponds to the
address of some function defined by the driver to handle a
requested operation.
For example, every character driver needs to define a function
that reads from the device. The file_operations structure holds the address of the
module's function that performs that operation. Here is what the
definition looks like for kernel 2.6.5:
struct file_operations {
	struct module *owner;
	loff_t (*llseek) (struct file *, loff_t, int);
	ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
	ssize_t (*aio_read) (struct kiocb *, char __user *, size_t, loff_t);
	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
	ssize_t (*aio_write) (struct kiocb *, const char __user *, size_t,
			      loff_t);
	int (*readdir) (struct file *, void *, filldir_t);
	unsigned int (*poll) (struct file *, struct poll_table_struct *);
	int (*ioctl) (struct inode *, struct file *, unsigned int,
		      unsigned long);
	int (*mmap) (struct file *, struct vm_area_struct *);
	int (*open) (struct inode *, struct file *);
	int (*flush) (struct file *);
	int (*release) (struct inode *, struct file *);
	int (*fsync) (struct file *, struct dentry *, int datasync);
	int (*aio_fsync) (struct kiocb *, int datasync);
	int (*fasync) (int, struct file *, int);
	int (*lock) (struct file *, int, struct file_lock *);
	ssize_t (*readv) (struct file *, const struct iovec *, unsigned long,
			  loff_t *);
	ssize_t (*writev) (struct file *, const struct iovec *, unsigned long,
			   loff_t *);
	ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t,
			     void __user *);
	ssize_t (*sendpage) (struct file *, struct page *, int, size_t,
			     loff_t *, int);
	unsigned long (*get_unmapped_area) (struct file *, unsigned long,
					    unsigned long, unsigned long,
					    unsigned long);
};
Some operations are not implemented by a driver. For example, a
driver that handles a video card won't need to read from a
directory structure. The corresponding entries in the file_operations structure should be set to NULL.
There is a gcc extension that makes assigning to this structure
more convenient. You'll see it in modern drivers, and it may catch you
by surprise. This is what the new way of assigning to the structure
looks like:
struct file_operations fops = {
	read: device_read,
	write: device_write,
	open: device_open,
	release: device_release
};
However, there's also a C99 way of assigning to elements of a
structure, and this is definitely preferred over using the GNU
extension. The version of gcc the author used when writing this,
2.95, supports the new C99 syntax. You
should use this syntax in case someone wants to port your driver.
It will help with compatibility:
struct file_operations fops = {
	.read = device_read,
	.write = device_write,
	.open = device_open,
	.release = device_release
};
The meaning is clear, and you should be aware that any member of
the structure which you don't explicitly assign will be initialized
to NULL by gcc.
An instance of struct file_operations
containing pointers to functions that are used to implement read,
write, open, ... syscalls is commonly named fops.
4.1.2. The file structure
Each open device file is represented in the kernel by a file structure,
which is defined in linux/fs.h. Be aware that a file is a kernel-level
structure and never appears in a user-space program. It's not the same
thing as a FILE, which is defined by glibc and would never appear in a
kernel-space function. Also, its name is a bit misleading; it represents
an abstract open `file', not a file on a disk, which is represented by a
structure named inode.
An instance of struct file is commonly
named filp. You'll also see it referred to
as struct file file. Resist the
temptation.
Go ahead and look at the definition of file. Most of the entries you
see, like struct dentry, aren't used by device drivers, and you can
ignore them. This is because drivers don't fill file directly; they only
use structures contained in file which are created elsewhere.
4.1.3. Registering A Device
As discussed earlier, char devices are accessed through device
files, usually located in /dev. The major number tells you which driver
handles which device file. The minor number is used only by the
driver itself to differentiate which device it's operating on, just
in case the driver handles more than one device.
Adding a driver to your system means registering it with the
kernel. This is synonymous with assigning it a major number during
the module's initialization. You do this by using the register_chrdev
function, declared in linux/fs.h.
int register_chrdev(unsigned int major, const char *name, struct file_operations *fops);
where unsigned int major is the major
number you want to request, const char
*name is the name of the device as it'll appear in /proc/devices and struct
file_operations *fops is a pointer to the file_operations table for your driver. A negative
return value means the registration failed. Note that we didn't
pass the minor number to register_chrdev.
That's because the kernel doesn't care about the minor number; only
our driver uses it.
Now the question is, how do you get a major number without
hijacking one that's already in use? The easiest way would be to
look through Documentation/devices.txt
and pick an unused one. That's a bad way of doing things because
you'll never be sure if the number you picked will be assigned
later. The answer is that you can ask the kernel to assign you a
dynamic major number.
If you pass a major number of 0 to register_chrdev, the return value
will be the dynamically allocated major number. The downside is that you
can't make a device file in advance, since you don't know what the major
number will be. There are three ways to work around this. First, the
driver itself can print the newly assigned number and we can make
the device file by hand. Second, the newly registered device will
have an entry in /proc/devices, and we
can either make the device file by hand or write a shell script to
read the file in and make the device file. Third, we can have our
driver make the device file using the mknod system call after a
successful registration and remove it during the call to cleanup_module.
4.1.4. Unregistering A Device
We can't allow the kernel module to be rmmod'ed whenever root feels like it. If the
device file is opened by a process and then we remove the kernel
module, using the file would cause a call to the memory location
where the appropriate function (read/write) used to be. If we're
lucky, no other code was loaded there, and we'll get an ugly error
message. If we're unlucky, another kernel module was loaded into
the same location, which means a jump into the middle of another
function within the kernel. The results of this would be impossible
to predict, but they can't be very positive.
Normally, when you don't want to allow something, you return an
error code (a negative number) from the function which is supposed
to do it. With cleanup_module that's
impossible because it's a void function. However, there's a counter
which keeps track of how many processes are using your module. You
can see what its value is by looking at the 3rd field of
/proc/modules. If this number isn't zero,
rmmod will fail. Note that you don't have
to check the counter from within cleanup_module because the check will
be performed for you by the system call sys_delete_module, defined in
kernel/module.c. You shouldn't use this counter directly, but there are
functions defined in linux/module.h which let you increase, decrease and
display this counter. The two used in this chapter are
try_module_get(THIS_MODULE), which increments the use count, and
module_put(THIS_MODULE), which decrements it.
It's important to keep the counter accurate; if you ever do lose
track of the correct usage count, you'll never be able to unload
the module; it's now reboot time, boys and girls. This is bound to
happen to you sooner or later during a module's development.
4.1.5. chardev.c
The next code sample creates a char driver named chardev. You can cat its
device file (or open the file with a
program) and the driver will put the number of times the device
file has been read from into the file. We don't support writing to
the file (like echo "hi" > /dev/chardev),
but catch these attempts and tell the user that the operation isn't
supported. Don't worry about the data passed to the write function; we
don't do anything with it. We simply print a message saying that
writing isn't supported and return an error.
Example 4-1. chardev.c
/*
 * chardev.c: Creates a read-only char device that says how many times
 * you've read from the dev file
 */

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/fs.h>
#include <asm/uaccess.h>	/* for put_user */

/*
 * Prototypes - this would normally go in a .h file
 */
int init_module(void);
void cleanup_module(void);
static int device_open(struct inode *, struct file *);
static int device_release(struct inode *, struct file *);
static ssize_t device_read(struct file *, char __user *, size_t, loff_t *);
static ssize_t device_write(struct file *, const char __user *, size_t,
			    loff_t *);

#define SUCCESS 0
#define DEVICE_NAME "chardev"	/* Dev name as it appears in /proc/devices */
#define BUF_LEN 80		/* Max length of the message from the device */

/*
 * Global variables are declared as static, so are global within the file.
 */
static int Major;		/* Major number assigned to our device driver */
static int Device_Open = 0;	/* Is device open?
				 * Used to prevent multiple access to device */
static char msg[BUF_LEN];	/* The msg the device will give when asked */
static char *msg_Ptr;

static struct file_operations fops = {
	.read = device_read,
	.write = device_write,
	.open = device_open,
	.release = device_release
};

/*
 * This function is called when the module is loaded
 */
int init_module(void)
{
	Major = register_chrdev(0, DEVICE_NAME, &fops);

	if (Major < 0) {
		printk(KERN_ALERT "Registering char device failed with %d\n", Major);
		return Major;
	}

	printk(KERN_INFO "I was assigned major number %d. To talk to\n", Major);
	printk(KERN_INFO "the driver, create a dev file with\n");
	printk(KERN_INFO "'mknod /dev/%s c %d 0'.\n", DEVICE_NAME, Major);
	printk(KERN_INFO "Try various minor numbers. Try to cat and echo to\n");
	printk(KERN_INFO "the device file.\n");
	printk(KERN_INFO "Remove the device file and module when done.\n");

	return SUCCESS;
}

/*
 * This function is called when the module is unloaded
 */
void cleanup_module(void)
{
	/*
	 * Unregister the device
	 */
	int ret = unregister_chrdev(Major, DEVICE_NAME);

	if (ret < 0)
		printk(KERN_ALERT "Error in unregister_chrdev: %d\n", ret);
}

/*
 * Methods
 */

/*
 * Called when a process tries to open the device file, like
 * "cat /dev/chardev"
 */
static int device_open(struct inode *inode, struct file *file)
{
	static int counter = 0;

	if (Device_Open)
		return -EBUSY;

	Device_Open++;
	sprintf(msg, "I already told you %d times Hello world!\n", counter++);
	msg_Ptr = msg;
	try_module_get(THIS_MODULE);

	return SUCCESS;
}

/*
 * Called when a process closes the device file.
 */
static int device_release(struct inode *inode, struct file *file)
{
	Device_Open--;		/* We're now ready for our next caller */

	/*
	 * Decrement the usage count, or else once you opened the file, you'll
	 * never get rid of the module.
	 */
	module_put(THIS_MODULE);

	return 0;
}

/*
 * Called when a process, which already opened the dev file, attempts to
 * read from it.
 */
static ssize_t device_read(struct file *filp,	/* see include/linux/fs.h */
			   char __user *buffer,	/* buffer to fill with data */
			   size_t length,	/* length of the buffer */
			   loff_t *offset)
{
	/*
	 * Number of bytes actually written to the buffer
	 */
	int bytes_read = 0;

	/*
	 * If we're at the end of the message,
	 * return 0 signifying end of file
	 */
	if (*msg_Ptr == 0)
		return 0;

	/*
	 * Actually put the data into the buffer
	 */
	while (length && *msg_Ptr) {
		/*
		 * The buffer is in the user data segment, not the kernel
		 * segment so "*" assignment won't work. We have to use
		 * put_user which copies data from the kernel data segment to
		 * the user data segment. It returns nonzero if the user
		 * address is bad.
		 */
		if (put_user(*(msg_Ptr++), buffer++))
			return -EFAULT;

		length--;
		bytes_read++;
	}

	/*
	 * Most read functions return the number of bytes put into the buffer
	 */
	return bytes_read;
}

/*
 * Called when a process writes to dev file: echo "hi" > /dev/chardev
 */
static ssize_t
device_write(struct file *filp, const char __user *buff, size_t len,
	     loff_t *off)
{
	printk(KERN_ALERT "Sorry, this operation isn't supported.\n");
	return -EINVAL;
}
4.1.6. Writing Modules for Multiple Kernel Versions
The system calls, which are the major interface the kernel shows
to the processes, generally stay the same across versions. A new
system call may be added, but usually the old ones will behave
exactly like they used to. This is necessary for backward
compatibility -- a new kernel version is not supposed to break
regular processes. In most cases, the device files will also remain
the same. On the other hand, the internal interfaces within the
kernel can and do change between versions.
The Linux kernel versions are divided between the stable
versions (n.<even number>.m) and the development versions
(n.<odd number>.m). The development versions include all
the cool new ideas, including those which will be considered a
mistake, or reimplemented, in the next version. As a result, you
can't trust the interface to remain the same in those versions
(which is why I don't bother to support them in this book; it's too
much work and it would become dated too quickly). In the stable
versions, on the other hand, we can expect the interface to remain
the same regardless of the bug fix version (the m number).
There are differences between different kernel versions, and if
you want to support multiple kernel versions, you'll find yourself
having to code conditional compilation directives. The way to do
this is to compare the macro LINUX_VERSION_CODE to the macro
KERNEL_VERSION. In version a.b.c of the kernel, the value of
LINUX_VERSION_CODE would be $2^{16}a+2^{8}b+c$.
While previous versions of this guide showed in great detail how to
write backward-compatible code with such constructs, we
decided to break with this tradition for the better. People
interested in doing so can now use the LKMPG version that
matches their kernel. We decided to version the LKMPG like the
kernel, at least as far as the major and minor numbers are concerned. We
use the patchlevel for our own versioning, so use LKMPG version
2.4.x for kernels 2.4.x, LKMPG version 2.6.x for kernels 2.6.x,
and so on. Also make sure that you always use current, up-to-date
versions of both the kernel and the guide.
Update: What we've said above was true for kernels up to and
including 2.6.10. You might already have noticed that recent
kernels look different. In case you haven't, they look like 2.6.x.y
now. The meaning of the first three items basically stays the same,
but a subpatchlevel has been added that indicates security fixes
until the next stable patchlevel is out. So people can choose
between a stable tree with security updates and the
latest kernel as a developer tree. Search the kernel mailing list
archives if you're interested in the full story.