meredtf − Table driven Mice and Elephant Random Early Detection Filter |
tc qdisc ... meredtf qMaxPkts npackets bps bits mtu bytes [ IPIPP ] [ ED ] [ th1 npackets ] [ th2 npackets ] [ th3 npackets ] [ wq val ] [ maxCount npackets ] [ aredIval nseconds ] [ flr val ] [ elepQsize BYTES ] [ mFlows NRECS ] [ flowActivSecs SECONDS ] [ qFull option ] [ fairQfact val ] [ fltbl flow table ] |
meredtf is a filter for use only with the ingress "queueing discipline". It is designed to discriminate between traffic flows entering an interface. You can use it to favor short flows or impose fairness between flows. It works by either marking or instructing "ingress" to drop packet from selected flows in order to achieve a target overall flow rate. It may seem counterintuitive that perceived performance can be improved by dropping packets after they have been received, but that is the purpose of this filter. A suitable use for this filter would be to regulate flows coming in from an ADSL connection to a domestic residence or small business. You could use it to create fairness between bandwidth consumptions of the several PCs. In order to be effective, the target download speed must be set to 85% to 90% of the actual potential throughput of the interface used (or possibly the interface being limited). This should have the effect of reducing the average queue size of the interface down to a fraction of what is was. |
A Mice and Elephants queueing discipline is one which favours short flows over long flows. Flows are classified according to how many bytes have passed in that flow. Flows longer than a given threshold are referred to as elephants. Flows that up until now have not reached that threshold are referred to as mice. Mice and Elephants is a concept that is analagous to shortest job first in scheduling. The short flows gain, and the long flows are no worse off. The Mice are advantaged in two ways. The first is a dropping advantage and the second is a queuing advantage. meredtf is a filter, and can only mark or drop, so therefor it cannot have this second queuing advantage. The way the dropping advantage works, is that mice packets are given some immunity to dropping. The flow length in bytes of each active flow is maintained. This flow length is converted to a flow value by means of a piecewise linear function, which can be specified in table form by the user. As each packet is processed, the flow value of its corresponding flow record contributes to a running average flow value. The dropping probability for the packet given by ARED is multiplied by the flow value for the flow and divided by the average flow value to give the actual dropping probability used. This scheme allows configuration that has a simple mice/elephants cutoff point, or can permit a flow to gradually become an elephant and have some elephants larger than others. |
Random Early Detection (RED) is a classless qdisc which manages its queue size smartly. Regular queues simply drop packets from the tail when they are full, which may not be the optimal behaviour. RED also performs tail drop, but does so in a more gradual way. Once the queue hits a certain average length, packets enqueued have a configurable chance of being marked (which may mean dropped). This chance increases linearly up to a point called the th2 average queue length, although the queue might get bigger. This has a host of benefits over simple taildrop, while not being processor intensive. It prevents synchronous retransmits after a burst in traffic, which cause further retransmits, etc. The goal is the have a small queue size, which is good for interactivity while not disturbing TCP/IP traffic with too many sudden drops after a burst of traffic. The average queue size is used for determining the marking probability. This is calculated using an Exponential Weighted Moving Average, which can be more or less sensitive to bursts. When the average queue size is below th1 packets, no packet will ever be marked. When it exceeds th1, the probability of doing so climbs linearly up to maxp, until the average queue size hits th2 packets. Because probability is normally not set to 100%, the queue size might conceivably rise above th2 bytes, so the qMaxPkts parameter is provided to set a hard maximum for the size of the queue. This uses the "gentle" variety of RED as suggested by Sally Floyd where the dropping probablity does not go immediately to 1 after the queue size hits th2. Instead, the dropping probability climbs linearly up to 1 as the queue size hits th3. |
RED does not work well unless it is properly configured for the traffic. This is not so easy to do. One feature of an ill configured is oscillations in the size of the queue. Sadly, the volume of traffic can change, and if RED was properly configured, then it will no longer be properly configured after the chante. Adaptive RED ARED, like RED, may be attributed to Sally Floyd et al. It automatically adusts the key parameter (maxp) to keep the RED properly configured. There is a target queue size (which will be 1/6 of the actual queue size. A second, slower estimate of average queue size is kept, and if this is larger than the target queue size, then maxp is increased. Otherwise, maxp is decreased. You can configure all of the parameters to ARED directly. Or, you can choose to provide only the mandatory parameters, and ARED will attempt to make sensible assumptions and choose sutiable values for the optional parameters. This latter strategy is recommended. |
meredtf uses a virtual queue. It is a filter, and cannot queue the packets, but it keeps a notional queue size that is calculated to be what the queue size would be if it did queue the packets through an interface of its size. |
qMaxPkts |
A mandatory integer parameter - the total number of packets that the queue will hold. In order to set this, you should take into account the queuing delay of a queue at the target queue size (normally 1/3 of the total number of packets) and also the memory usage. (You might use the MTU times the half queue size.) If you have no clues, then try 200. |
bps |
A mandatory integer parameter - the bits per second rate of your interface in the direction of the packet. If you are using SDSL this could be anything from 250000 to 1500000. If your are configuring an upstream ADSL, this could be anything from 56000 to 250000. |
||
mtu |
A mandatory integer parameter - the Maximum Transmission Unit (MTU). You can find this out using the ifconfig command. |
||
IPIPP |
IPIPP specifies the definition of a flow. You might with the definition of a flow to be the definition of a TCP connection, or you might decide to make the definition of a flow to be all traffic a given PC, or you might decide to make it something else. |
The letters of IPIPP must be in order and correspond to destination (I)p number, destination (P)ort number, source (I)p number, source (P)ort number and (P)rotocol in that order. A capitalised letter indicates that the attribute will be used as part of the flow definition. Lowercase indicates that it will not. "ipipp" is illegal, because that would mean there would be no attributes to differentiate between flows. "IPIPP" is the default and it means that flows a re differentiated by the protocol as well as the source and destination IP and Port numbers. |
ED |
ED is an optional parameter which specifies the action for flow notification. The letters of ED must be in order and correspond to (E)xplicit congestion notification and (D)ropping. A capitalised letter selects that feature. Lowercase deselects. "ed" is illegal, because that would mean that targeted packets would be neither dropped or marked. "Ed" indicates that there is to be no dropping. Packets are marked if ECN is supported. "ED" indicates that ECN will be used where it is supported and dropping otherwise. "eD" is the default, and indicates that targeted packets are to be dropped. |
||
th1 |
An optional integer parameter - This is the number of packets below which the dropping probability is set to zero. This will be set to qMaxPkts/6 if not specified. |
||
th2 |
An optional integer parameter - Knee of the packet dropping probability curve, after which the probability of marking slopes to 1. If not specified, this will be set to qMaxPkts/2. |
||
th3 |
An optional integer parameter - Queue size in packets after which dropping probablity becomes 1. If not specified, this will be set to th2 * 2, which would normally be qMaxPkts. If you do not wish the gentle variety of RED, you could set this to be th2+1. |
||
wq |
An optional floating point parameter - Used for determining how fast the average queue size is influenced by the real queue size. As each packet arrives, the current queue size is multiplied by (1-wq) and the current queue size times wq is added. If not specified, then this will be set according to the following formula :- "ps = bps / (80 * mtu)" "wq = 1 - exp(ln(0.37)/ps)". |
maxcount |
An optional integer parameter - As proposed by Floyd et al, this version of ARED uses "Uniform random variables" as opposed to "geometric random variables". If the dropping probability is low, then the number of packets we allow without checking for dropping is potentially high. The queue may be in a different state by that time, so this version of ARED imposes a limit (maxcount) on how many packets enter before we check again. If not specified, this is set to th1/4. |
aredIval |
An optional floating point parameter - This is "ARED interval" the number of seconds between each adjustment of maxp. If not specified, this is set to half a second. |
elepQsize |
An optional integer parameter - Once a flow has passed more than this size in bytes, it is regarded as an elephant for the purposes of queuing. If unspecified, this defaults to 6000 bytes. |
flr |
An optional floating point parameter - Used for determining how fast the average flow value is influenced by the flow value of the flow corresponding to each packet as it arrives. As each packet arrives, the current flow value is multiplied by flr and the flow value corresponding to the current packet times (1-flr) is added. If not specified, then this will be set to (1-wq). |
|
mFlows |
An optional integer parameter - The maximum number of flows that this queueing discipline can keep track of. Packets coming from a new flow after this threshold is reached will not have their flow recognised and will be treated as mice flows. If unspecified, this defaults to qMaxPkts flow records. |
flowActivSecs |
An optional integer parameter - Flows with no activity are forgotten after this many seconds. If not specified this defaults to 10. |
fltbl |
An optional table - The flow table consists of 14 integers separated by spaces. This is treated as 7 X,Y pairs, where X corresponds to the actual flow length in bytes, and Y corresponds to the flow length weighting used to determine packet dropping probabilities. The first Y value should be a zero. The X values must be in non decreasing order. The Y values must be in strictly increasing order. This queueing disciplines idea of individual flow length will not be permitted to exceed the highest X value, but will be truncated to that whenever it grows larger. If not specified, then flow values are zero until a flow reaches 6000 bytes, and linearly increases to 100 at the point where a flow reaches 30000 bytes. |
fairQfact |
The fairQfact optional floating point parameter provides a mechanism to alter the behaviour of this filter so that it can be a pure mice and elephants filter or a fair queueing filter, or something in between. It works by reducing the size of this filter’s idea of individual flow lengths in bytes by a value derived by multiplying the total flow target by the the fairQfact and dividing by the number of flows. If fairQfact is set to 0.0 (the default), then this is a pure mice and elephants filter. If it is 1.0 (or a bit above that) and the mice and elephants threshold is low, then this is a fair queuing filter. A compromise is possible by setting in the vicinity of 1.0 and having the mice and elephants threshold not so low. As mentioned above, meredt’s idea of a flow length will be truncated so as not to excees the highest flow length of the flow table. |
qFull |
The optional qFull parameter indicates how meredtf is to behave when the notional queue is full. A value of "slide" indicates that new packet arrivals when the queue is full will be permitted through, but they will have no effect on the queue size. A value of "stretch" indicates that new packet arrivals when the queue is full will be permitted through and the queue size will expand, accomodating new packets. A value of "drop" indicates that new packet arrivals will be rejected when the queue is full. |
$tc qdisc add dev eth0 handle ffff: ingress $tc filter add dev eth0 parent ffff: protocol ip prio 50 handle 1 meredtf qMaxBytes 50000 bps 56000 atu 800 ED |
tc(8) |
o |
Guo and Matta, The War Between Mice and Elephants http://www.cs.bu.edu/techreports/pdf/2001-005-war-tcp-rio.pdf |
||
o |
Floyd and Jacobson, Random Early Detection gateways for Congestion Avoidance. http://www.aciri.org/floyd/papers/red/red.html |
||
o |
Floyd, Gummadi and Shenker, An Algorithm for increasing the robustness of RED’s Active Queue Management. http://www.icir.org/floyd/papers/adaptiveRed.pdf |
Stephen Braithwaite Some code originates from the RED queueing discipline and the SFQ discipline which was written by Alexey N. Kuznetsov, <kuznet@ms2.inr.ac.ru>, Alexey Makarenko <makar@phoenix.kharkov.ua>, J Hadi Salim <hadi@nortelnetworks.com>. |