4. 实施

现在所有的解释都讲完了,是时候用 Linux 来实施带宽管理了。

4.1. 注意事项

限制发送到 DSL 调制解调器的实际数据速率并不像看起来那么简单。大多数 DSL 调制解调器实际上只是以太网桥,在您的 Linux 机器和 ISP 的网关之间来回桥接数据。大多数 DSL 调制解调器使用 ATM 作为链路层来发送数据。ATM 以信元形式发送数据,信元长度始终为 53 字节。其中 5 个字节是头部信息,剩下 48 个字节可用于数据。即使您只发送 1 字节的数据,也会消耗整个 53 字节的带宽,因为 ATM 信元始终为 53 字节长。这意味着,如果您发送一个典型的 TCP ACK 数据包,它由 0 字节数据 + 20 字节 TCP 头部 + 20 字节 IP 头部 + 18 字节以太网头部组成。实际上,即使您发送的以太网数据包只有 40 字节的有效负载(TCP 和 IP 头部),以太网数据包的最小有效负载为 46 字节数据,因此剩余的 6 字节用空字符填充。这意味着以太网数据包加上头部的实际长度为 18 + 46 = 64 字节。为了通过 ATM 发送 64 字节,您必须发送两个 ATM 信元,这将消耗 106 字节的带宽。这意味着对于每个 TCP ACK 数据包,您都会浪费 42 字节的带宽。如果 Linux 考虑了 DSL 调制解调器使用的封装,这本可以接受,但实际上,Linux 只计算 TCP 头部、IP 头部和 14 字节的 MAC 地址(Linux 不计算 4 字节的 CRC,因为这在硬件级别处理)。Linux 既不计算 46 字节的最小以太网数据包大小,也不考虑固定的 ATM 信元大小。

所有这些意味着您必须将出站带宽限制为略低于您的真实容量(直到我们能找到一种数据包调度器,它可以考虑正在使用的各种类型的封装)。您可能会发现您已经找到了一个限制带宽的好数字,但是当您下载一个大文件时,延迟开始飙升到 3 秒以上。这很可能是因为 Linux 错误地计算了那些小 ACK 数据包消耗的带宽。

我一直在研究解决这个问题几个月了,并且几乎确定了一个解决方案,我将很快向公众发布以进行进一步测试。该解决方案涉及使用用户空间队列而不是 Linux 的 QoS 来限制数据包速率。我基本上使用 Linux 用户空间队列实现了一个简单的 HTB 队列。这个解决方案(到目前为止)已经能够非常好地调节出站流量,即使在大量的批量下载(多个流)和批量上传(gnutella,多个流)期间,延迟峰值也只有 400 毫秒,而我的正常无流量延迟约为 15 毫秒。有关此 QoS 方法的更多信息,请订阅电子邮件列表以获取更新,或查看此 HOWTO 的更新。

4.2. 脚本:myshaper

以下是我用来控制我的 Linux 路由器上带宽的脚本列表。它使用了本文档中介绍的几个概念。出站流量根据类型被放入 7 个队列之一。入站流量被放入两个队列中,如果入站数据超速,TCP 数据包将首先被丢弃(最低优先级)。此脚本中给出的速率似乎适用于我的设置,但您的结果可能会有所不同。

#!/bin/bash
#
# myshaper - DSL/Cable modem outbound traffic shaper and prioritizer.
#            Based on the ADSL/Cable wondershaper (www.lartc.org)
#
# Written by Dan Singletary (8/7/02)
#
# NOTE!! - This script assumes your kernel has been patched with the
#          appropriate HTB queue and IMQ patches available here:
#          (subnote: future kernels may not require patching)
#
#       http://luxik.cdi.cz/~devik/qos/htb/
#       http://luxik.cdi.cz/~patrick/imq/
#
# Configuration options for myshaper:
#  DEV    - set to ethX that connects to DSL/Cable Modem
#  RATEUP - set this to slightly lower than your
#           outbound bandwidth on the DSL/Cable Modem.
#           I have a 1500/128 DSL line and setting
#           RATEUP=90 works well for my 128kbps upstream.
#           However, your mileage may vary.
#  RATEDN - set this to slightly lower than your
#           inbound bandwidth on the DSL/Cable Modem.
#
#
#  Theory on using imq to "shape" inbound traffic:
#
#     It's impossible to directly limit the rate of data that will
#  be sent to you by other hosts on the internet.  In order to shape
#  the inbound traffic rate, we have to rely on the congestion avoidance
#  algorithms in TCP.  Because of this, WE CAN ONLY ATTEMPT TO SHAPE
#  INBOUND TRAFFIC ON TCP CONNECTIONS.  This means that any traffic that
#  is not tcp should be placed in the high-prio class, since dropping
#  a non-tcp packet will most likely result in a retransmit which will
#  do nothing but unnecessarily consume bandwidth.  
#     We attempt to shape inbound TCP traffic by dropping tcp packets
#  when they overflow the HTB queue which will only pass them on at
#  a certain rate (RATEDN) which is slightly lower than the actual
#  capability of the inbound device.  By dropping TCP packets that
#  are over-rate, we are simulating the same packets getting dropped
#  due to a queue-overflow on our ISP's side.  The advantage of this
#  is that our ISP's queue will never fill because TCP will slow it's
#  transmission rate in response to the dropped packets in the assumption
#  that it has filled the ISP's queue, when in reality it has not.
#     The advantage of using a priority-based queuing discipline is
#  that we can specifically choose NOT to drop certain types of packets
#  that we place in the higher priority buckets (ssh, telnet, etc).  This
#  is because packets will always be dequeued from the lowest priority class
#  with the stipulation that packets will still be dequeued from every
#  class fairly at a minimum rate (in this script, each bucket will deliver
#  at least it's fair share of 1/7 of the bandwidth).  
#
#  Reiterating main points:
#   * Dropping a tcp packet on a connection will lead to a slower rate
#     of reception for that connection due to the congestion avoidance algorithm.
#   * We gain nothing from dropping non-TCP packets.  In fact, if they
#     were important they would probably be retransmitted anyways so we want to
#     try to never drop these packets.  This means that saturated TCP connections
#     will not negatively effect protocols that don't have a built-in retransmit like TCP.
#   * Slowing down incoming TCP connections such that the total inbound rate is less
#     than the true capability of the device (ADSL/Cable Modem) SHOULD result in little
#     to no packets being queued on the ISP's side (DSLAM, cable concentrator, etc).  Since
#     these ISP queues have been observed to queue 4 seconds of data at 1500Kbps or 6 megabits
#     of data, having no packets queued there will mean lower latency.
#
#  Caveats (questions posed before testing):
#   * Will limiting inbound traffic in this fashion result in poor bulk TCP performance?
#     - Preliminary answer is no!  Seems that by prioritizing ACK packets (small <64b)
#       we maximize throughput by not wasting bandwidth on retransmitted packets
#       that we already have.
#   

# NOTE: The following configuration works well for my 
# setup: 1.5M/128K ADSL via Pacific Bell Internet (SBC Global Services)

DEV=eth0
RATEUP=90
RATEDN=700  # Note that this is significantly lower than the capacity of 1500.
            # Because of this, you may not want to bother limiting inbound traffic
            # until a better implementation such as TCP window manipulation can be used.

# 
# End Configuration Options
#

if [ "$1" = "status" ]
then
        echo "[qdisc]"
        tc -s qdisc show dev $DEV
        tc -s qdisc show dev imq0
        echo "[class]"
        tc -s class show dev $DEV
        tc -s class show dev imq0
        echo "[filter]"
        tc -s filter show dev $DEV
        tc -s filter show dev imq0
        echo "[iptables]"
        iptables -t mangle -L MYSHAPER-OUT -v -x 2> /dev/null
        iptables -t mangle -L MYSHAPER-IN -v -x 2> /dev/null
        exit
fi

# Reset everything to a known state (cleared)
tc qdisc del dev $DEV root    2> /dev/null > /dev/null
tc qdisc del dev imq0 root 2> /dev/null > /dev/null
iptables -t mangle -D POSTROUTING -o $DEV -j MYSHAPER-OUT 2> /dev/null > /dev/null
iptables -t mangle -F MYSHAPER-OUT 2> /dev/null > /dev/null
iptables -t mangle -X MYSHAPER-OUT 2> /dev/null > /dev/null
iptables -t mangle -D PREROUTING -i $DEV -j MYSHAPER-IN 2> /dev/null > /dev/null
iptables -t mangle -F MYSHAPER-IN 2> /dev/null > /dev/null
iptables -t mangle -X MYSHAPER-IN 2> /dev/null > /dev/null
ip link set imq0 down 2> /dev/null > /dev/null
rmmod imq 2> /dev/null > /dev/null

if [ "$1" = "stop" ] 
then 
        echo "Shaping removed on $DEV."
        exit
fi

###########################################################
#
# Outbound Shaping (limits total bandwidth to RATEUP)

# set queue size to give latency of about 2 seconds on low-prio packets
ip link set dev $DEV qlen 30

# changes mtu on the outbound device.  Lowering the mtu will result
# in lower latency but will also cause slightly lower throughput due 
# to IP and TCP protocol overhead.
ip link set dev $DEV mtu 1000

# add HTB root qdisc
tc qdisc add dev $DEV root handle 1: htb default 26

# add main rate limit classes
tc class add dev $DEV parent 1: classid 1:1 htb rate ${RATEUP}kbit

# add leaf classes - We grant each class at LEAST it's "fair share" of bandwidth.
#                    this way no class will ever be starved by another class.  Each
#                    class is also permitted to consume all of the available bandwidth
#                    if no other classes are in use.
tc class add dev $DEV parent 1:1 classid 1:20 htb rate $[$RATEUP/7]kbit ceil ${RATEUP}kbit prio 0
tc class add dev $DEV parent 1:1 classid 1:21 htb rate $[$RATEUP/7]kbit ceil ${RATEUP}kbit prio 1
tc class add dev $DEV parent 1:1 classid 1:22 htb rate $[$RATEUP/7]kbit ceil ${RATEUP}kbit prio 2
tc class add dev $DEV parent 1:1 classid 1:23 htb rate $[$RATEUP/7]kbit ceil ${RATEUP}kbit prio 3
tc class add dev $DEV parent 1:1 classid 1:24 htb rate $[$RATEUP/7]kbit ceil ${RATEUP}kbit prio 4
tc class add dev $DEV parent 1:1 classid 1:25 htb rate $[$RATEUP/7]kbit ceil ${RATEUP}kbit prio 5
tc class add dev $DEV parent 1:1 classid 1:26 htb rate $[$RATEUP/7]kbit ceil ${RATEUP}kbit prio 6

# attach qdisc to leaf classes - here we at SFQ to each priority class.  SFQ insures that
#                                within each class connections will be treated (almost) fairly.
tc qdisc add dev $DEV parent 1:20 handle 20: sfq perturb 10
tc qdisc add dev $DEV parent 1:21 handle 21: sfq perturb 10
tc qdisc add dev $DEV parent 1:22 handle 22: sfq perturb 10
tc qdisc add dev $DEV parent 1:23 handle 23: sfq perturb 10
tc qdisc add dev $DEV parent 1:24 handle 24: sfq perturb 10
tc qdisc add dev $DEV parent 1:25 handle 25: sfq perturb 10
tc qdisc add dev $DEV parent 1:26 handle 26: sfq perturb 10

# filter traffic into classes by fwmark - here we direct traffic into priority class according to
#                                         the fwmark set on the packet (we set fwmark with iptables
#                                         later).  Note that above we've set the default priority
#                                         class to 1:26 so unmarked packets (or packets marked with
#                                         unfamiliar IDs) will be defaulted to the lowest priority
#                                         class.
tc filter add dev $DEV parent 1:0 prio 0 protocol ip handle 20 fw flowid 1:20
tc filter add dev $DEV parent 1:0 prio 0 protocol ip handle 21 fw flowid 1:21
tc filter add dev $DEV parent 1:0 prio 0 protocol ip handle 22 fw flowid 1:22
tc filter add dev $DEV parent 1:0 prio 0 protocol ip handle 23 fw flowid 1:23
tc filter add dev $DEV parent 1:0 prio 0 protocol ip handle 24 fw flowid 1:24
tc filter add dev $DEV parent 1:0 prio 0 protocol ip handle 25 fw flowid 1:25
tc filter add dev $DEV parent 1:0 prio 0 protocol ip handle 26 fw flowid 1:26

# add MYSHAPER-OUT chain to the mangle table in iptables - this sets up the table we'll use
#                                                      to filter and mark packets.
iptables -t mangle -N MYSHAPER-OUT
iptables -t mangle -I POSTROUTING -o $DEV -j MYSHAPER-OUT

# add fwmark entries to classify different types of traffic - Set fwmark from 20-26 according to
#                                                             desired class. 20 is highest prio.
iptables -t mangle -A MYSHAPER-OUT -p tcp --sport 0:1024 -j MARK --set-mark 23 # Default for low port traffic 
iptables -t mangle -A MYSHAPER-OUT -p tcp --dport 0:1024 -j MARK --set-mark 23 # "" 
iptables -t mangle -A MYSHAPER-OUT -p tcp --dport 20 -j MARK --set-mark 26     # ftp-data port, low prio
iptables -t mangle -A MYSHAPER-OUT -p tcp --dport 5190 -j MARK --set-mark 23   # aol instant messenger
iptables -t mangle -A MYSHAPER-OUT -p icmp -j MARK --set-mark 20               # ICMP (ping) - high prio, impress friends
iptables -t mangle -A MYSHAPER-OUT -p udp -j MARK --set-mark 21                # DNS name resolution (small packets)
iptables -t mangle -A MYSHAPER-OUT -p tcp --dport ssh -j MARK --set-mark 22    # secure shell
iptables -t mangle -A MYSHAPER-OUT -p tcp --sport ssh -j MARK --set-mark 22    # secure shell
iptables -t mangle -A MYSHAPER-OUT -p tcp --dport telnet -j MARK --set-mark 22 # telnet (ew...)
iptables -t mangle -A MYSHAPER-OUT -p tcp --sport telnet -j MARK --set-mark 22 # telnet (ew...)
iptables -t mangle -A MYSHAPER-OUT -p ipv6-crypt -j MARK --set-mark 24         # IPSec - we don't know what the payload is though...
iptables -t mangle -A MYSHAPER-OUT -p tcp --sport http -j MARK --set-mark 25   # Local web server
iptables -t mangle -A MYSHAPER-OUT -p tcp -m length --length :64 -j MARK --set-mark 21 # small packets (probably just ACKs)
iptables -t mangle -A MYSHAPER-OUT -m mark --mark 0 -j MARK --set-mark 26      # redundant- mark any unmarked packets as 26 (low prio)

# Done with outbound shaping
#
####################################################

echo "Outbound shaping added to $DEV.  Rate: ${RATEUP}Kbit/sec."

# uncomment following line if you only want upstream shaping.
# exit

####################################################
#
# Inbound Shaping (limits total bandwidth to RATEDN)

# make sure imq module is loaded

modprobe imq numdevs=1

ip link set imq0 up

# add qdisc - default low-prio class 1:21

tc qdisc add dev imq0 handle 1: root htb default 21

# add main rate limit classes
tc class add dev imq0 parent 1: classid 1:1 htb rate ${RATEDN}kbit

# add leaf classes - TCP traffic in 21, non TCP traffic in 20
#
tc class add dev imq0 parent 1:1 classid 1:20 htb rate $[$RATEDN/2]kbit ceil ${RATEDN}kbit prio 0
tc class add dev imq0 parent 1:1 classid 1:21 htb rate $[$RATEDN/2]kbit ceil ${RATEDN}kbit prio 1

# attach qdisc to leaf classes - here we at SFQ to each priority class.  SFQ insures that
#                                within each class connections will be treated (almost) fairly.
tc qdisc add dev imq0 parent 1:20 handle 20: sfq perturb 10
tc qdisc add dev imq0 parent 1:21 handle 21: red limit 1000000 min 5000 max 100000 avpkt 1000 burst 50

# filter traffic into classes by fwmark - here we direct traffic into priority class according to
#                                         the fwmark set on the packet (we set fwmark with iptables
#                                         later).  Note that above we've set the default priority
#                                         class to 1:26 so unmarked packets (or packets marked with
#                                         unfamiliar IDs) will be defaulted to the lowest priority
#                                         class.
tc filter add dev imq0 parent 1:0 prio 0 protocol ip handle 20 fw flowid 1:20
tc filter add dev imq0 parent 1:0 prio 0 protocol ip handle 21 fw flowid 1:21

# add MYSHAPER-IN chain to the mangle table in iptables - this sets up the table we'll use
#                                                         to filter and mark packets.
iptables -t mangle -N MYSHAPER-IN
iptables -t mangle -I PREROUTING -i $DEV -j MYSHAPER-IN

# add fwmark entries to classify different types of traffic - Set fwmark from 20-26 according to
#                                                             desired class. 20 is highest prio.
iptables -t mangle -A MYSHAPER-IN -p ! tcp -j MARK --set-mark 20              # Set non-tcp packets to highest priority
iptables -t mangle -A MYSHAPER-IN -p tcp -m length --length :64 -j MARK --set-mark 20 # short TCP packets are probably ACKs
iptables -t mangle -A MYSHAPER-IN -p tcp --dport ssh -j MARK --set-mark 20    # secure shell
iptables -t mangle -A MYSHAPER-IN -p tcp --sport ssh -j MARK --set-mark 20    # secure shell
iptables -t mangle -A MYSHAPER-IN -p tcp --dport telnet -j MARK --set-mark 20 # telnet (ew...)
iptables -t mangle -A MYSHAPER-IN -p tcp --sport telnet -j MARK --set-mark 20 # telnet (ew...)
iptables -t mangle -A MYSHAPER-IN -m mark --mark 0 -j MARK --set-mark 21              # redundant- mark any unmarked packets as 26 (low prio)

# finally, instruct these packets to go through the imq0 we set up above
iptables -t mangle -A MYSHAPER-IN -j IMQ

# Done with inbound shaping 
#
####################################################

echo "Inbound shaping added to $DEV.  Rate: ${RATEDN}Kbit/sec."