So today was a rather weird day and I had a searing pain in my head. So I decided on a special masochistic thing to learn: traffic shaping. Since I am working on a project that needs this exclusively, I reckoned it would be helpful too.
I am sure most of you have come across this term. In case you haven't, the wiki page here provides a good introduction. Traffic shaping is mostly used by ISPs and people who manage routers. It is a way to give priority to certain users or certain types of data. The term "giving priority" hugely understates the complexity of the task. E.g. real-time streaming needs its data delivered on time, whereas other kinds of traffic can tolerate some delay. In general, once traffic shaping gets close to QoS, it becomes a difficult problem to solve.
However, here I am going to rant about the ISP aspect of traffic shaping (i.e. allocating bandwidth per user, per service, etc). "tc" is a very powerful Linux command for implementing traffic shaping. I won't be giving sample commands or such because tc is quite complex. Instead (if anyone ever ends up using this post, which I strongly suspect is going to be my future self) I will give links to the tutorials I used for my setup.
A short introduction to tc is in order. With tc you can create queues per device, and more than one queue per device. For each queue you choose a queueing discipline, which says how the queue is managed. You can have a standard FIFO queue, a Token Bucket Filter (which neatly manages the bandwidth difference between two lines), or Stochastic Fair Queueing (which ensures fairness between flows).
Queues can be arranged in a hierarchy, and a child can borrow from its parent. E.g. I might define one queue for www traffic that gets 40% of the bandwidth and another for everything else that gets the remaining 60%. Now I can define another queue for ssh as a child of the "other" queue. This basically means that the ssh queue has its own guaranteed share, but can borrow bandwidth from its parent and go beyond its limit to occupy up to 60% of the bandwidth. This way you guarantee a minimum level of service. On routers, each queue can represent a customer or a group of customers, and high-paying customers get a bigger share of the bandwidth.
After we are done defining these classes of traffic, the next step is to define rules saying which traffic belongs to which class. tc provides a wide variety of filters for this purpose. You can match on almost any TCP or IP header field, on the interface the packet came from, etc. For most practical purposes, this set of filters proves to be sufficient.
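Since the token bucket idea is easier to see in code than in words, here is a rough Python sketch of it. This is not tc's implementation and not tc syntax: the class name, the parameters and the byte-based accounting are all my own assumptions for illustration. It also simply refuses packets that don't fit, whereas a real shaper would hold them in the queue until enough tokens have accumulated.

```python
class TokenBucket:
    """Toy token bucket (hypothetical, for illustration only).

    Tokens refill at `rate` bytes per second, up to `capacity` bytes.
    A packet may pass only if the bucket holds enough tokens for it.
    """

    def __init__(self, rate, capacity):
        self.rate = rate            # refill speed, bytes per second
        self.capacity = capacity    # largest burst the bucket can absorb
        self.tokens = capacity      # start with a full bucket
        self.last = 0.0             # time of the previous check, in seconds

    def allow(self, size, now):
        """Return True if a packet of `size` bytes may be sent at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


bucket = TokenBucket(rate=1000, capacity=1500)
print(bucket.allow(1500, now=0.0))   # full bucket, so the packet fits
print(bucket.allow(1500, now=0.5))   # only 500 tokens refilled so far
print(bucket.allow(1500, now=1.5))   # another second of refill tops it up
```

The long-run rate can never exceed `rate`, but a sender that has been idle can spend the saved-up tokens in one go, which is what makes the bucket tolerant of short bursts.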
Now that we know how tc works, let's see what tc can do:
1) Provide only a defined amount of bandwidth to a particular user (write a filter based on IP address)
2) Provide a defined amount of bandwidth to a particular service (write a filter based on port number)
3) Enable flexibility, i.e. if there is a burst of traffic in one class, accommodate the burst instead of enforcing hard boundaries (the burst and cburst parameters control that)
4) Rates can be specified as a percentage of the total as well as absolute values like 8kbit, etc.
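To make the first three points a bit more concrete, here is a toy model in Python of classifying packets and letting a class borrow. Everything in it is hypothetical: the packet is a plain dict, the rules and the IP address are made up, and the arithmetic is a deliberate simplification, not the real scheduling algorithm. I have borrowed the names rate and ceil from HTB for the guaranteed share and the borrowing limit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrafficClass:
    name: str
    rate: float                    # guaranteed share, in kbit/s
    ceil: float                    # most it may use after borrowing (>= rate)
    parent: Optional["TrafficClass"] = None


def classify(packet, rules, default):
    """Return the class of the first rule whose fields all match the packet."""
    for match, cls in rules:
        if all(packet.get(key) == value for key, value in match.items()):
            return cls
    return default


def allowed_rate(cls, demand, spare_in_parent):
    """Grant the guaranteed rate, plus borrowed spare capacity up to ceil."""
    if demand <= cls.rate:
        return demand
    borrowed = min(demand - cls.rate, spare_in_parent, cls.ceil - cls.rate)
    return cls.rate + borrowed


# A 1000 kbit/s link split 40/60, with ssh as a child of "other".
www = TrafficClass("www", rate=400, ceil=400)
other = TrafficClass("other", rate=600, ceil=600)
ssh = TrafficClass("ssh", rate=100, ceil=600, parent=other)
user = TrafficClass("user-10.0.0.5", rate=50, ceil=50, parent=other)

rules = [
    ({"src_ip": "10.0.0.5"}, user),   # per-user limit, matched on IP address
    ({"dst_port": 80}, www),          # per-service limit, matched on port
    ({"dst_port": 22}, ssh),
]

pkt = {"src_ip": "10.0.0.9", "dst_port": 22}
cls = classify(pkt, rules, default=other)
print(cls.name, allowed_rate(cls, demand=300, spare_in_parent=500))
```

The order of the rules matters here: a packet from 10.0.0.5 to port 80 lands in the per-user class because that rule is checked first.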
An addition to tc can do even more wonderful stuff. It is called "netem", which stands for network emulator. netem has tunable knobs for pretty much every parameter you need to simulate a WAN. Here is what we tried today:
1) Simulate delays. If you are simulating a WAN at home, the major issue is that delays don't get simulated. With tc combined with netem, you can introduce fake delays. You can even vary the delay around a point, e.g. you can ask for a random delay of 100ms +- 10ms. You can even change the probability distribution it uses to randomise the delay.
2) Simulate losses. On a WAN, losses are inevitable, due to either network congestion or corruption. netem with tc can simulate both. You can simulate percentage losses, losses in bursts, losses following a specific probability distribution, losses following a particular pattern. Anything under the sun.
3) Packet duplication, packet corruption, packet reordering.
4) Introducing latency in only a defined type of traffic (to test QoS)
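For a feel of what these knobs do to a packet stream, here is a small Python simulation of delay with jitter, random loss and duplication. It only mimics the effects; it does not call netem, and the function name, the parameters, the uniform jitter and the fixed sending interval are all assumptions of mine.

```python
import random


def emulate_link(n_packets, delay_ms=100.0, jitter_ms=10.0, loss=0.01,
                 duplicate=0.0, interval_ms=5.0, seed=None):
    """Return (seq, delay_ms, arrival_ms) tuples sorted by arrival time.

    Hypothetical model: packet `seq` is sent at seq * interval_ms, each
    packet is dropped with probability `loss`, duplicated with probability
    `duplicate`, and delayed by delay_ms +- jitter_ms (uniformly random).
    """
    rng = random.Random(seed)
    arrivals = []
    for seq in range(n_packets):
        if rng.random() < loss:
            continue                       # packet lost on the way
        copies = 2 if rng.random() < duplicate else 1
        for _ in range(copies):
            delay = max(0.0, delay_ms + rng.uniform(-jitter_ms, jitter_ms))
            arrivals.append((seq, delay, seq * interval_ms + delay))
    # Sorting by arrival time shows reordering whenever the jitter is
    # larger than the gap between two consecutive packets.
    arrivals.sort(key=lambda item: item[2])
    return arrivals


received = emulate_link(1000, loss=0.05, seed=42)
order = [seq for seq, _, _ in received]
print("delivered:", len(received), "of 1000")
print("reordered:", order != sorted(order))
```

With the defaults the jitter (10ms) is bigger than the sending interval (5ms), so some packets overtake others, which gives reordering for free on top of the delay and loss.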
Anyone who knows a little about networks realises how difficult each one of these is to implement. Tuning all these knobs appropriately can give you an awesome simulation of the internet. I am working on a project which needs me to simulate network congestion at home. With a line-speed router and a speed of 100 Mbps, it is really tough to simulate congestion. We used the tc command exclusively (with loss and corruption values taken from a lot of actual studies) to successfully simulate the internet at home. We now get a very good TCP congestion window graph. The things that tc can do are pretty amazing. Unfortunately tc is a complex command and you need a fair amount of background knowledge before you can start using it. Fret not. Here are links to some tutorials that can get you going:
The classic TLDP tutorial
HTB tutorial: contains some excellent explanations and practical examples with commands.
Learning tc is a process and the end result is pretty satisfying. \m/ to the author of tc.