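Chapter 8. FAQ

Here are frequently asked questions (with answers) about Linux assembly programming. Some of the questions (and answers) were taken from the linux-assembly mailing list.

8.1. How do I do graphics programming in Linux?
8.2. How do I debug pure assembly code under Linux?
8.3. Any other useful debugging tools?
8.4. How do I access BIOS functions from Linux (BSD, BeOS, etc)?
8.5. Is it possible to write kernel modules in assembly?
8.6. How do I allocate memory dynamically?
8.7. I can't understand how to use the select system call!

8.1. How do I do graphics programming in Linux?

An answer from Paul Furber: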

Ok you have a number of options for graphics in Linux. Which one you use
depends on what you want to do. There isn't one Web site with all the
information but here are some tips:

SVGALib: This is a C library for console SVGA access.
Pros: very easy to learn, good coding examples, not all that different
from equivalent gfx libraries for DOS, all the effects you know from DOS
can be converted with little difficulty.
Cons: programs need superuser rights to run since they write directly to
the hardware, doesn't work with all chipsets, can't run under X-Windows.
Search for svgalib-1.4.x on http://ftp.is.co.za

Framebuffer: do it yourself graphics at SVGA res
Pros: fast, linear mapped video access, ASM can be used if you want :)
Cons: has to be compiled into the kernel, chipset-specific issues, must
switch out of X to run, relies on good knowledge of linux system calls
and kernel, tough to debug
Examples: asmutils (http://www.linuxassembly.org) and the leaves example
and my own site for some framebuffer code and tips in asm
(http://ma.verick.co.za/linux4k/)

Xlib: the application and development libraries for XFree86.
Pros: Complete control over your X application
Cons: Difficult to learn, horrible to work with and requires quite a bit
of knowledge as to how X works at the low level. 
Not recommended but if you're really masochistic go for it. All the
include and lib files are probably installed already so you have what
you need. 

Low-level APIs: include PTC, SDL, GGI and Clanlib
Pros: very flexible, run under X or the console, generally abstract away
the video hardware a little so you can draw to a linear surface, lots of
good coding examples, can link to other APIs like OpenGL and sound libs,
Windows DirectX versions for free
Cons: Not as fast as doing it yourself, often in development so versions
can (and do) change frequently.
Examples: PTC and GGI have excellent demos, SDL is used in sdlQuake,
Myth II, Civ CTP and Clanlib has been used for games as well.

High-level APIs: OpenGL - any others?
Pros: clean api, tons of functionality and examples, industry standard
so you can learn from SGI demos for example
Cons: hardware acceleration is normally a must, some quirks between
versions and platforms
Examples: loads - check out www.mesa3d.org under the links section.

To get going try looking at the svgalib examples and also install SDL
and get it working. After that, the sky's the limit.
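
As a rough illustration of the framebuffer option mentioned above, here is a minimal sketch in NASM. It assumes the device node is /dev/fb0, that the kernel has framebuffer support, and that the caller may open the device; the file name fbclear.asm and the 4096-byte length are arbitrary choices for the example. It maps the start of video memory with mmap2 and zeroes it. What those bytes look like on screen depends on the video mode.

```nasm
; fbclear.asm - zero the first 4096 bytes of the framebuffer (sketch).
; Build: nasm -f elf fbclear.asm && ld -m elf_i386 -o fbclear fbclear.o

section .data
fbdev   db  "/dev/fb0", 0       ; assumed device node

section .text
global _start
_start:
        mov     eax, 5          ; sys_open
        mov     ebx, fbdev
        mov     ecx, 2          ; O_RDWR
        xor     edx, edx        ; mode (unused without O_CREAT)
        int     0x80
        test    eax, eax
        js      fail            ; negative value = -errno
        mov     edi, eax        ; fd argument for mmap2

        mov     eax, 192        ; sys_mmap2
        xor     ebx, ebx        ; addr = NULL, let the kernel choose
        mov     ecx, 4096       ; length in bytes
        mov     edx, 3          ; PROT_READ | PROT_WRITE
        mov     esi, 1          ; MAP_SHARED
        xor     ebp, ebp        ; offset 0 (in pages)
        int     0x80
        cmp     eax, 0xfffff000
        ja      fail            ; values above this are -errno

        mov     edi, eax        ; start of mapped video memory
        xor     eax, eax        ; fill value
        mov     ecx, 1024       ; 1024 dwords = 4096 bytes
        cld
        rep     stosd

        xor     ebx, ebx        ; exit status 0
        jmp     quit
fail:
        mov     ebx, 1          ; exit status 1
quit:
        mov     eax, 1          ; sys_exit
        int     0x80
```

Run it from a text console rather than from inside X. A real program would first query the mode geometry with the framebuffer ioctls instead of assuming a length.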

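8.2. How do I debug pure assembly code under Linux?

There is an early version of the Assembly Language Debugger, which is designed to work with assembly code and is portable enough to run on Linux and *BSD. It is already functional and should be the right choice; check it out!

You can also try gdb ;). Although it is a source-level debugger, it can be used to debug pure assembly code, and with some trickery you can make gdb do what you need (unfortunately, the nasm '-g' switch does not generate proper debug information for gdb; I think this is a nasm bug). Here is an answer from Dmitry Bakhvalov: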

Personally, I use gdb for debugging asmutils. Try this:
 
1) Use the following stuff to compile:
   $ nasm -f elf -g smth.asm
   $ ld -o smth smth.o

2) Fire up gdb:
   $ gdb smth

3) In gdb:
   (gdb) disassemble _start
   Place a breakpoint at _start+1 (If placed at _start the breakpoint
   wouldn't work, dunno why)
   (gdb) b *0x8048075

   To step thru the code I use the following macro:
   (gdb)define n
   >ni
   >printf "eax=%x ebx=%x ...etc...",$eax,$ebx,...etc...
   >disassemble $pc,$pc+15
   >end

   Then start the program with r command and debug with n.

   Hope this helps.

An observation from ???:

    I have such a macro in my .gdbinit for quite some time now, and it
    for sure makes life easier. A small difference : I use "x /8i $pc",
    which guarantees a fixed number of disassembled instructions. Then,
    with a well chosen size for my xterm, gdb output looks like it is
    refreshed, and not scrolling.

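If you want to set breakpoints across your code, you can simply use the int 3 instruction as a breakpoint (instead of entering the address manually in gdb).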
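
A minimal sketch of the int 3 technique; the file name bp.asm and the value loaded into ebx are arbitrary. Under gdb the program stops with SIGTRAP right after the int 3, so registers can be inspected and execution resumed with the continue command.

```nasm
; bp.asm - hard-coded breakpoint via int 3 (sketch).
; Build: nasm -f elf bp.asm && ld -m elf_i386 -o bp bp.o

section .text
global _start
_start:
        mov     ebx, 7          ; some state worth inspecting
        int     3               ; gdb stops here with SIGTRAP
        mov     eax, 1          ; sys_exit, status is still in ebx
        int     0x80
```

Remove such breakpoints before shipping: outside a debugger the default action for SIGTRAP terminates the process.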

If you're using gas, you should consult gas and gdb related tutorials.

8.3. Any other useful debugging tools?

strace can certainly help a lot (ktrace and kdump on FreeBSD); it traces system calls and signals. Read its manual page (man strace) and the output of strace --help for details.

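8.4. How do I access BIOS functions from Linux (BSD, BeOS, etc)?

Short answer: no way. This is protected mode; use OS services instead. Again, you can't use int 0x10, int 0x13, etc. Fortunately, almost everything can be implemented by means of system calls or library functions. In the worst case you can use direct port access, write a kernel patch that implements the needed functionality, or use the LRMI library to access BIOS functions.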
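
To illustrate the direct port access fallback, here is a sketch that asks the kernel for access to two I/O ports with the ioperm system call and then reads one byte from the CMOS real-time clock. The port numbers 0x70/0x71 and the meaning of register 0 (seconds, often BCD-encoded) are assumptions about PC-compatible hardware; the file name is arbitrary. It must run as root.

```nasm
; rtcsec.asm - read the RTC seconds register through I/O ports (sketch).
; Build: nasm -f elf rtcsec.asm && ld -m elf_i386 -o rtcsec rtcsec.o

section .text
global _start
_start:
        mov     eax, 101        ; sys_ioperm
        mov     ebx, 0x70       ; first port
        mov     ecx, 2          ; number of ports (0x70 and 0x71)
        mov     edx, 1          ; 1 = enable access
        int     0x80
        test    eax, eax
        jnz     fail            ; non-zero = -errno (e.g. not root)

        xor     eax, eax        ; CMOS register 0 (assumed: seconds)
        out     0x70, al        ; select the register
        in      al, 0x71        ; read its value
        movzx   ebx, al         ; return the raw byte as exit status
        jmp     quit
fail:
        mov     ebx, 255        ; exit status 255 on error
quit:
        mov     eax, 1          ; sys_exit
        int     0x80
```

After running it, echo $? in the shell shows the raw register value, or 255 if ioperm was refused.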

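8.5. Is it possible to write kernel modules in assembly?

Yes, indeed it is. While in general it is not a good idea (it will hardly speed anything up), there may be a need for such wizardry. Writing a module is not that hard: a module must have some predefined global functions, and it may also need to call some external functions from the kernel. Examine kernel source code (the parts that can be built as modules) for details.

Meanwhile, here is an example of a minimal dumb kernel module (module.asm), based on the example by mammon_ from APJ #8: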

section .text

	global init_module
	global cleanup_module
	global kernel_version

	extern printk

init_module:
	push	dword str1
	call	printk
	pop	eax
	xor	eax,eax
	ret

cleanup_module:
	push	dword str2
	call	printk
	pop	eax
	ret
	
str1		db	"init_module done",0xa,0
str2		db	"cleanup_module done",0xa,0

kernel_version	db	"2.2.18",0

The only thing this example does is report its actions. Modify kernel_version to match your kernel, and build the module with:

$ nasm -f elf -o module.m module.asm

$ ld -r -o module.o module.m

Now you can play with it using insmod/rmmod/lsmod (root privileges are required); a lot of fun, huh?

8.6. How do I allocate memory dynamically?

A laconic answer from H-Peter Recktenwald:

	ebx := 0	(in fact, any value below .bss seems to do)
	sys_brk
	eax := current top (of .bss section)

	ebx := [ current top < ebx < (esp - 16K) ]
	sys_brk
	eax := new top of .bss

An extensive answer from Tiago Gasiba:

section	.bss

var1	resb	1

section	.text

;
;allocate memory
;

%define	LIMIT	0x4000000			; 64 MB

	mov	ebx,0				; get bottom of data segment
	call	sys_brk

	cmp	eax,-1				; ok?
	je	erro1

	add	eax,LIMIT			; allocate +LIMIT memory
	mov	ebx,eax
	call	sys_brk
	
	cmp	eax,-1				; ok?
	je	erro1

	cmp	eax,var1+1			; has the data segment grown?
	je	erro1

;
;use allocated memory
;
						; now eax contains bottom of
						; data segment
	mov	ebx,eax				; save bottom
	mov	eax,var1			; eax=beginning of data segment
repeat:	
	mov	byte	[eax],1			; fill up with 1's, one byte per step
	inc	eax
	cmp	ebx,eax				; current pos = bottom?
	jne	repeat

;
;free memory
;

	mov	ebx,var1			; deallocate memory
	call	sys_brk				; by forcing its beginning=var1

	cmp	eax,-1				; ok?
	je	erro2
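
For comparison, here is a self-contained sketch of the same idea that uses the raw int 0x80 interface instead of a sys_brk helper. The file name and the 4096-byte growth are arbitrary. Note one assumption that differs from the listing above: the raw brk system call does not return -1 on failure; it returns the unchanged break, so the sketch compares the new break against the old one.

```nasm
; brkdemo.asm - grow the heap by one page, use it, release it (sketch).
; Build: nasm -f elf brkdemo.asm && ld -m elf_i386 -o brkdemo brkdemo.o

section .text
global _start
_start:
        xor     ebx, ebx        ; brk(0): just query the current break
        mov     eax, 45         ; sys_brk
        int     0x80
        mov     esi, eax        ; esi = old break = start of new memory

        lea     ebx, [esi + 4096]
        mov     eax, 45         ; sys_brk: request old break + 4096
        int     0x80
        cmp     eax, esi
        jbe     fail            ; break did not move up: allocation failed

        mov     byte [esi], 42  ; the new memory is usable

        mov     ebx, esi        ; free it by restoring the old break
        mov     eax, 45
        int     0x80

        xor     ebx, ebx        ; exit status 0
        jmp     quit
fail:
        mov     ebx, 1          ; exit status 1
quit:
        mov     eax, 1          ; sys_exit
        int     0x80
```

Writing only inside the range returned by the two brk calls avoids touching any gap the kernel may leave between the end of .bss and the initial break.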

8.7. I can't understand how to use the select system call!

An answer from Patrick Mochel:

When you call sys_open, you get back a file descriptor, which is simply an
index into a table of all the open file descriptors that your process has.
stdin, stdout, and stderr are always 0, 1, and 2, respectively, because
that is the order in which they are always opened for your process.
Also, notice that the first file descriptor that you open yourself (w/o first
closing any of those magic three descriptors) is always 3, and they increment
from there.

Understanding the index scheme will explain what select does. When you
call select, you are saying that you are waiting on certain file descriptors
to read from, certain ones to write to, and certain ones to watch for
exceptions. Your process can have up to 1024 file descriptors open,
so an fd_set is just a bit mask describing which file descriptors are valid
for each operation. Make sense?

Since each fd that you have open is just an index, and it only needs to be
on or off for each fd_set, you need only 1024 bits for an fd_set structure.
1024 / 32 = 32 longs needed to represent the structure.

Now, for the loose example.
Suppose you want to read from a file descriptor (w/o timeout).

- Allocate the equivalent to an fd_set.  

section .data

my_fds: times 32 dd 0

- open the file descriptor that you want to read from.

- set that bit in the fd_set structure.

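   First, you need to figure out which of the 32 dwords the bit is in.  

   Then, use bts to set the bit in that dword. With a register destination
   or an immediate offset, bts takes the bit offset modulo 32, which is why
   you first figure out which dword to start with. (With a memory operand
   and a register offset, bts indexes the whole bit string, so the split
   below is harmless but not strictly required.)

   ; eax = file descriptor to add to the set
   xor edx, edx                      ; clear high half of dividend edx:eax
   mov ebx, 32
   div ebx                           ; eax = dword index, edx = bit number

   bts dword [my_fds + eax*4], edx   ; set that bit (NASM syntax)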

- repeat the last step for any file descriptors you want to read from.

- repeat the entire exercise for either of the other two fd_sets if you want action from them.

That leaves two other parts of the equation - the n parameter and the timeout
parameter. I'll leave the timeout parameter as an exercise for the reader
(yes, I'm lazy), but I'll briefly talk about the n parameter.

It is the value of the largest file descriptor you are selecting from (from
any of the fd_sets), plus one. Why plus one? Well, because it's easy to
determine a mask from that value. Suppose that there is data available on
x file descriptors, but the highest one you care about is (n - 1). Since
an fd_set is just a bitmask, the kernel needs some efficient way for
determining whether to return or not from select. So, it masks off the bits
that you care about, checks if anything is available from the bits that are
still set, and returns if there is (pause as I rummage through kernel source).
Well, it's not as easy as I fantasized it would be. To see how the kernel
determines that mask, look in fs/select.c in the kernel source tree.

Anyway, you need to know that number, and the easiest way to do it is to save
the value of the last file descriptor open somewhere so you don't lose it.

Ok, that's what I know. A warning about the code above (as always) is that
it is not tested. I think it should work, but if it doesn't let me know.
But, if it starts a global nuclear meltdown, don't call me. ;-)
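
Putting the steps above together, here is a minimal sketch of a complete program that blocks in select until stdin (descriptor 0) becomes readable. It assumes 32-bit Linux, where the five-argument form of select is system call 142 (_newselect); the label read_fds and the file name are arbitrary. It relies on bts with a memory operand and a register offset addressing the whole bit string, so no separate division is needed.

```nasm
; selstdin.asm - wait for input on stdin with select (sketch).
; Build: nasm -f elf selstdin.asm && ld -m elf_i386 -o selstdin selstdin.o

section .bss
read_fds  resd 32               ; 1024-bit fd_set, zeroed at startup

section .text
global _start
_start:
        mov     eax, 0          ; descriptor to watch (stdin)
        bts     [read_fds], eax ; set bit number eax in the fd_set

        lea     ebx, [eax + 1]  ; n = highest descriptor + 1
        mov     ecx, read_fds   ; set of descriptors to read from
        xor     edx, edx        ; no write set
        xor     esi, esi        ; no exception set
        xor     edi, edi        ; no timeout: block until ready
        mov     eax, 142        ; _newselect on i386
        int     0x80            ; eax = ready count, or -errno

        mov     ebx, eax        ; return the result as exit status
        mov     eax, 1          ; sys_exit
        int     0x80
```

On return the kernel has rewritten read_fds so that only the ready descriptors remain set; a program watching several descriptors would test those bits with bt.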

That's all for now, folks.