Appendix D. IP Route Management
Understanding routing in an IP network is one of the fundamentals you will need in order to grasp the flexibility of IP networking and of the services which run on IP networks. It is not enough to address the machines and mix yourself a dirty martini. You'll need to verify that each machine has a route to any network with which it needs to exchange IP packets.
One key element to remember when designing networks, viewing routing tables, debugging networking problems, and viewing network traffic on the wire is that IP routing is stateless []. This means that every time a new packet hits the routing stage, the router makes an independent decision about where to send this packet.
In this section, we'll look at the tools available to manipulate and view the routing table(s). We'll start with the well-known route command, and move on to the increasingly used ip route and its companion tools, which are part of the iproute2 package.
[] For those who have some doubt, netfilter provides a connection tracking mechanism for packets passing through a linux router. This connection tracking, however, is independent of routing. It is important to not conflate the packet filtering connection tracking statefulness with the statelessness of IP routing. For an example of a complex networking setup where netfilter's statefulness and the statelessness of IP routing collide, see .
D.1. route
In the same way that is the venerable utility for IP address management, route is a tremendously useful command for manipulating and displaying IP routing tables.
Here we'll look at several tasks you can perform with route. You can display the routing table, add routes (most importantly, the default route), remove routes, and examine the routing cache. I will switch between traditional and CIDR notation for network addressing in this (and subsequent) sections, so the reader unaware of these notations is encouraged to refer liberally to the links provided in .
When using route and ip route on the same machine, it is important to understand that not all routing table entries can be shown with route. The key distinction is that route only displays information in the main routing table. NAT routes, and routes in tables other than the main routing table, must be managed and viewed separately with the ip route tool.
D.1.1. Displaying the routing table with route
By far the simplest and most common task one performs with route is viewing the routing table. On a single-homed desktop like tristan, the routing table will be very simple, probably comprised of only a few routes. Compare this to a complex routing table on a host with multiple interfaces and static routes to internal networks, such as masq-gw. It is by using the route command that you can determine where a packet goes when it leaves your machine.
Example D.1. Viewing a simple routing table with route
[root@tristan]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.99.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         192.168.99.254  0.0.0.0         UG    0      0        0 eth0
In the simplest routing tables, as in tristan's case, you'll see three separate routes. The route which is customarily present on all machines (and which I'll not remark on after this) is the route to the loopback interface. The loopback interface is an IP interface completely local to the host itself. Most commonly, loopback is configured as a single IP address in a class A-sized network. This entire network has been set aside for use on loopback devices. The address used is usually 127.0.0.1/8, and the device name under all default installations of linux I have seen is lo. It is not at all unheard of for people to host services on loopback which are intended only for consumption on that machine, e.g., SMTP on tcp/25.
The remaining two lines define how tristan should reach any other IP address anywhere on the Internet. These two routing table entries divide the world into two different categories: a locally reachable network (192.168.99.0/24) and everything else. If an address falls within the 192.168.99.0/24 range, tristan knows it can reach the IP range directly on the wire, so any packets bound for this range will be pushed out onto the local media.
If the packet falls in any other range, tristan will consult its routing table and find no more specific route that matches. In this case, the default route functions as a terminal choice. If no other route matches, the packet will be forwarded to the default gateway, which is usually a router to another set of networks and routers (which eventually lead to the Internet).
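The decision described above can be sketched in a few lines of shell. This is only an illustration of the arithmetic, not how the kernel implements it; the helper names ip_to_int and in_network are hypothetical, and the addresses are taken from tristan's routing table.

```shell
#!/bin/sh
# Sketch only: ip_to_int and in_network are hypothetical helper names.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1                      # split the address on dots
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) if address $1 lies inside network $2
# with prefix length $3.
in_network() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    if [ "$3" -eq 0 ]; then
        mask=0                     # a prefix of 0 matches every address
    else
        mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    fi
    [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
}

# Mimic the choice tristan makes for a local and a remote destination.
for dest in 192.168.99.17 194.52.197.133; do
    if in_network "$dest" 192.168.99.0 24; then
        echo "$dest: deliver directly on eth0"
    else
        echo "$dest: forward to default gateway 192.168.99.254"
    fi
done
```

The first destination matches the locally connected network, and the second falls through to the default route, which (as a network of 0.0.0.0 with a prefix length of 0) matches any address.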
Viewing a complex routing table is no more difficult than viewing a simple routing table, although it can be a bit more difficult to read, interpret, and sometimes even find the route you wish to examine.
Example D.2. Viewing a complex routing table with route
[root@masq-gw]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.100.0   0.0.0.0         255.255.255.252 U     0      0        0 eth3
205.254.211.0   0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.100.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.99.0    0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.98.0    192.168.99.1    255.255.255.0   UG    0      0        0 eth2
10.38.0.0       192.168.100.1   255.255.0.0     UG    0      0        0 eth3
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         205.254.211.254 0.0.0.0         UG    0      0        0 eth1
The above routing table shows a more complex set of static routes than one finds on a single-homed host. By comparing the network mask of the routes above, we can see that the network mask is listed from the most specific to the least specific. Refer to for more discussion.
A quick glance down this routing table also provides us with a good deal of knowledge about the topology of the network. Immediately we can identify four separate Ethernet interfaces, three locally connected class C-sized networks, and one tiny subnet (192.168.100.0/30). We can also determine that there are two networks reachable via static routes behind internal routers.
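The same quick glance can be approximated mechanically. The following is a rough sketch, assuming the column layout shown in the route -n output above; summarize_routes is a hypothetical helper name, and the sample data is copied from the masq-gw routing table.

```shell
#!/bin/sh
# Sketch only: summarize_routes is a hypothetical helper that reads
# route -n style output on standard input.
summarize_routes() {
    awk '
        # Header lines do not start with a digit; skip them.
        $1 !~ /^[0-9]/ { next }
        # Remember every interface except loopback.
        $8 != "lo" { ifaces[$8] = 1 }
        # A G flag marks a route through a gateway; leave out the default route.
        $4 ~ /G/ && $1 != "0.0.0.0" {
            printf "static route: %s mask %s via %s\n", $1, $3, $2
        }
        END {
            n = 0
            for (i in ifaces) n++
            printf "non-loopback interfaces: %d\n", n
        }
    '
}

summarize_routes <<'EOF'
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.100.0   0.0.0.0         255.255.255.252 U     0      0        0 eth3
205.254.211.0   0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.100.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.99.0    0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.98.0    192.168.99.1    255.255.255.0   UG    0      0        0 eth2
10.38.0.0       192.168.100.1   255.255.0.0     UG    0      0        0 eth3
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         205.254.211.254 0.0.0.0         UG    0      0        0 eth1
EOF
```

On a live system you would pipe the output of route -n into the function instead of using the embedded sample.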
Now that we have taken a quick glance at the output from the route command, let's examine a bit more systematically what it's reporting to us.
D.1.2. Reading route's output
For this discussion refer to the network map in the appendix, and also to . route is a venerable command, one which can manipulate routing tables for protocols other than IP. If you wish to know what other protocols are supported, try route --help at your leisure. Fortunately, route defaults to inet (IPv4) routes if no other address family is specified.
By combining the values in columns one and three you can determine the destination network or host address. The first line in masq-gw's routing table shows 192.168.100.0/255.255.255.252, which is more conveniently written in CIDR notation as 192.168.100.0/30. This is the smallest possible network according to . The only two useable addresses are 192.168.100.1 (service-router) and 192.168.100.2 (masq-gw).
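The conversion from a dotted netmask to a CIDR prefix length is simply a count of the leading one bits. A minimal sketch follows; mask_to_prefix is a hypothetical helper name, and the function assumes the mask is contiguous (it does not check the order of the octets).

```shell
#!/bin/sh
# Sketch only: mask_to_prefix is a hypothetical helper name.
# Convert a dotted-quad netmask, as printed in the Genmask column,
# to a CIDR prefix length by adding up the one bits in each octet.
mask_to_prefix() {
    bits=0
    old_ifs=$IFS; IFS=.
    set -- $1                      # split the mask on dots
    IFS=$old_ifs
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0)   ;;
            *)   echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$bits"
}

mask_to_prefix 255.255.255.252   # prints 30
mask_to_prefix 255.255.0.0       # prints 16
```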
The second column holds the IP address of the gateway to the destination if the destination is not a locally connected network. If there is a value other than 0.0.0.0 in this field, the kernel will address the outbound packet to this device (a router of some kind) rather than directly to the destination. The column after the netmask column (Flags) should always contain a G for destinations not locally connected to the linux machine.
The fields Metric, Ref and Use are not generally used in simple or even moderately complex routing tables; however, we will discuss the Use column further in .
The final field in the route output contains the name of the interface through which the destination is reachable. This can be any interface known to the kernel which has an IP address. In Example D.2 we can learn immediately that 192.168.98.0/24 is reachable through interface eth2.
After this brief examination of the commonest of output from route, let's look at some of the other things we can learn from route and also how we can change the routing table.
D.1.3. Using route to display the routing cache
The routing cache is used by the kernel as a lookup table analogous to a quick reference card. It's faster for the kernel to refer to the cache (internally implemented as a hash table) for a recently used route than to look up the destination address again. Routes existing in the route cache are periodically expired. If you need to clean out the routing cache entirely, you'll want to become familiar with .
At first, it might surprise you to learn that there are no entries for locally connected networks in a routing cache. After a bit of reflection, you come to realize that there is no need to cache an IP route for a locally connected network because the machine is connected to the same Ethernet. So, any given destination has an entry in either the arp table or in the routing cache. For a clearer picture of the differences between each of the cached routes, I'd suggest adding a -e switch.
Example D.3. Viewing the routing cache with route
[root@tristan]# route -Cen
Kernel IP routing cache
Source          Destination     Gateway         Flags   MSS Window  irtt Iface
194.52.197.133  192.168.99.35   192.168.99.35   l        40 0          0 lo
192.168.99.35   194.52.197.133  192.168.99.254         1500 0         29 eth0
192.168.99.35   192.168.99.254  192.168.99.254         1500 0          0 eth0
192.168.99.254  192.168.99.35   192.168.99.35   il       40 0          0 lo
192.168.99.35   192.168.99.35   192.168.99.35   l     16436 0          0 lo
192.168.99.35   194.52.197.133  192.168.99.254         1500 0          0 eth0
192.168.99.35   192.168.99.254  192.168.99.254         1500 0          0 eth0
FIXME! I don't really know why there are three entries in the routing cache for each destination. Here, for example, we see three entries in the routing cache for 194.52.197.133 (a Swedish destination).
The MSS column tells us what path MTU discovery has determined to be the maximum segment size for the route to this destination. By discovering the proper segment size for a route and caching this information, we can make most efficient use of bandwidth to the destination, without incurring the overhead of packet fragmentation en route. See for a more complete discussion of MSS and MTU.
FIXME! There has to be more we can say about the routing cache here.
D.1.4. Creating a static route with route add
Static routes are explicit routes to non-local destinations through routers or gateways which are not the default gateway. The case of the routing table on tristan is a classic example of the need for a static route. There are two routers in the same network, masq-gw and isdn-router. If tristan has packets for the 192.168.98.0/24 network, they should be routed to 192.168.99.1 (isdn-router). Refer also to for this example.
As with , route has a syntax unlike most standard unix command line utilities, mixing options and arguments with less regularity. Note the mandatory -net or -host options when adding or removing any route other than the default route.
In order to add a static route to the routing table, you'll need to gather several pieces of information about the remote network.
In our example network, masq-gw can only reach 10.38.0.0/16 through service-router. Let's add a static route to the masquerading firewall to ensure that 10.38.0.0/16 is reachable. Our intended routing table will look like the routing table in Example D.2. Let's also view the output if we mistype the IP address of the gateway and specify an address which is not a locally reachable address.
Example D.4. Adding a static route to a network with route add
[root@masq-gw]# route add -net 10.38.0.0 netmask 255.255.0.0 gw 192.168.109.1
SIOCADDRT: Network is unreachable
[root@masq-gw]# route add -net 10.38.0.0 netmask 255.255.0.0 gw 192.168.100.1
It should be clear now that the gateway address must be reachable on a locally connected network for a static route to be useable (or even make sense). In the first line, where we mistyped, the route could not be added to the routing table because the gateway address was not a reachable address.
Now, instead of sending packets with a destination of 10.38.0.0/16 to the default gateway, wan-gw, masq-gw will send this traffic to service-router at IP address 192.168.100.1.
The above is a simple example of routing a network to a separate gateway, a gateway other than the default gateway. This is a common need on networks central to an operation, and less common in branch offices and remote networks.
Occasionally, however, you'll have a single machine with an IP address in a different range on the same Ethernet as some other machines. Or you might have a single machine which is reachable via a router. Let's look at these two scenarios to see how we can create static routes to solve this routing need.
Occasionally, you may have a desire to restrict communication from one network to another by not including routes to the network. In our sample network, tristan may be a workstation of an employee who doesn't need to reach any machines in the branch office. Perhaps this employee needs to periodically access some data or service supplied on 192.168.98.101. We'll need to add a static route to allow this machine to access this single host IP in the branch office network [].
Here's a summary of the information needed for our static route. The destination is 192.168.98.101/32 and the gateway is 192.168.99.1.
Example D.5. Adding a static route to a host with route add
[root@tristan]# route add -host 192.168.98.101 gw 192.168.99.1
[root@tristan]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.98.101  192.168.99.1    255.255.255.255 UG    0      0        0 eth0
192.168.99.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         192.168.99.254  0.0.0.0         UG    0      0        0 eth0
Now, we have successfully altered the routing table to include a host route for the single machine we want our employee to be able to reach.
Even rarer, you may encounter a situation where a single Ethernet network is used to host multiple IP networks. There are reasons people might do this, although I regard this as bad form. If possible, it is cleaner, more secure, and easier to troubleshoot if you do not share IP networks on the same media segment. With that said, you can still convince your linux box to be a part of each network [].
Let's assume for the sake of this example that NAT is not an option for us, and we need to move the machine 205.254.211.184 into another network. Though it violates the concept of security partitioning, we have decided to put the server into the same network as service-router. Naturally, we'll need to modify the routing table on masq-gw.
Be sure to refer to for a complete discussion of this unusual networking scenario.
Example D.6. Adding a static route to a host on the same media with route add
[root@masq-gw]# route add -host 205.254.211.184 dev eth3
I'll leave as an exercise to the reader's imagination the question of how to send all traffic to a locally connected network to an interface. In light of the host route above, it should be a logical step for the reader to make.
The above are common examples of the usage of the route command.
D.1.5. Creating a default route with route add default
The default route is a special case of a static route. Any machine which is connected to the Internet has a default route. For the majority of smaller networks which are not running dynamic routing protocols, each machine on an internal network uses a router or firewall as its default gateway, forwarding all traffic to that destination. Typically, this router or firewall forwards the traffic to the next router or device via a static route until the traffic reaches the ISP's service access router. Many ISPs use dynamic routing internally to determine the best path out of their networks to remote destinations.
But we are only interested in adding a default route and understanding that packets are reaching the default gateway. Once the packets have reached the default gateway, we assume that the administrator of that device is monitoring its correct operation.
With this bit of background about the default route, it is easy to see why a default route is a key part of any networking device's configuration. If the machine is to reach machines other than the machines on the local network, it must know the address of the default gateway.
Because the default gateway is so important, there is particular support for adding a default route included in the route command. Refer to Example D.7 for a simple example of adding a default route. The syntax of the command is as follows:
Example D.7. Setting the default route with route
[root@tristan]# route add default gw 192.168.99.254
This is the commonest method used for setting a default route, although the route can also be specified by the following command. I find the alternate method more explicit than the common method for setting the default gateway, because the destination address and network mask are treated exactly like any other network address and netmask.
Example D.8. An alternate method of setting the default route with route
[root@tristan]# route add -net 0.0.0.0 netmask 0.0.0.0 gw 192.168.99.254
The alternate method of setting a default route specifies a network and netmask of 0, which is shorthand for all destinations. I'll reiterate that the kernel sees these two methods of setting the default route as identical. The resulting routing table is exactly the same. You may select whichever of these route invocations you find more comfortable.
Now that we have covered adding static routes and the special static route, the default route, let's try our hand at removing existing routes from routing tables.
D.1.6. Removing routes with route del
Any route can be removed from the routing table as easily as it can be added. The syntax of the command is exactly the same as the syntax of the route add command.
After we went to all of the trouble above to put our machine 205.254.211.184 into the network with service-router, we probably realize that from a security partitioning standpoint, it is not only stupid, but also foolhardy! So now, we conclude that we need to return 205.254.211.184 to its former network (the DMZ proper). We'll now remove the special host route for its IP, so the network route for 205.254.211.0/24 will now be used for reaching this host. (If you have questions about why, read .)
Example D.9. Removing a static host route with route del
[root@masq-gw]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
205.254.211.184 0.0.0.0         255.255.255.255 U     0      0        0 eth3
192.168.100.0   0.0.0.0         255.255.255.252 U     0      0        0 eth3
205.254.211.0   0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.100.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.99.0    0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.98.0    192.168.99.1    255.255.255.0   UG    0      0        0 eth2
10.38.0.0       192.168.100.1   255.255.0.0     UG    0      0        0 eth3
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         205.254.211.254 0.0.0.0         UG    0      0        0 eth1
[root@masq-gw]# route del -host 205.254.211.184 dev eth3
Another possible example might be the prohibition of Internet traffic to a particular user. If a machine does not have a default route, but instead has a routing table populated only with routes to internal networks, then that machine can only reach IP addresses in networks to which it has a routing table entry. Let's say that you have a user who routinely spends work hours browsing the Internet, fetching mail from a POP account outside your network, and in short wastes time on the Internet. You can easily prevent this user from reaching anything except your internal networks. Naturally, this sort of a problem employee should probably face some sort of administrative sanction to address the real problem, but as a technical component of the strategy to prevent this user from wasting time on the Internet, you could remove access to the Internet from this employee's machine.
In the example below, we'll use the route command a number of times for different operations, all of which you should be familiar with by now.
Example D.10. Removing the default route with route del
[root@morgan]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.98.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         192.168.98.254  0.0.0.0         UG    0      0        0 eth0
[root@morgan]# route del default gw 192.168.98.254
[root@morgan]# route add -net 192.168.99.0 netmask 255.255.255.0 gw 192.168.98.254
[root@morgan]# route add -net 192.168.100.0 netmask 255.255.255.0 gw 192.168.98.254
[root@morgan]# route add -net 205.254.211.0 netmask 255.255.255.0 gw 192.168.98.254
[root@morgan]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
205.254.211.0   192.168.98.254  255.255.255.0   U     0      0        0 eth0
192.168.100.0   192.168.98.254  255.255.255.0   U     0      0        0 eth0
192.168.99.0    192.168.98.254  255.255.255.0   U     0      0        0 eth0
192.168.98.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
Now, the user on morgan can only reach the specified networks. The networks we have entered here are all of our corporate networks. If the user tries to generate a packet to any other destination, the kernel is not going to know where to send it, so it will return an error code to the application trying to make the network connection.
While this can be a very effective way to restrict access to an individual machine, it is an ineffective method of systems administration, since it requires that the user log in to the affected machine and make changes to the routing table on demand. A better solution would be to use .
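One way to confirm that a routing table no longer contains a default route is to look for an entry whose destination and netmask are both 0.0.0.0. The following is a small sketch, assuming the route -n column layout used in this section; has_default_route is a hypothetical helper name, and the sample data is morgan's routing table after the changes above.

```shell
#!/bin/sh
# Sketch only: has_default_route is a hypothetical helper that reads
# route -n style output on standard input and succeeds (exit status 0)
# if a default route is present.
has_default_route() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" { found = 1 } END { exit !found }'
}

if has_default_route <<'EOF'
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
205.254.211.0   192.168.98.254  255.255.255.0   U     0      0        0 eth0
192.168.100.0   192.168.98.254  255.255.255.0   U     0      0        0 eth0
192.168.99.0    192.168.98.254  255.255.255.0   U     0      0        0 eth0
192.168.98.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
EOF
then
    echo "default route present"
else
    echo "no default route: only the listed networks are reachable"
fi
```

On a live machine, the same check would read from the command itself, for example route -n | has_default_route.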
[] Though tristan does not have a direct route to the 192.168.98.0/24 network, it does have a default route which knows about this destination network. Therefore, for the purposes of this illustrative example, we'll assume that masq-gw is configured to drop or reject all traffic to 192.168.98.0/24 from 192.168.99.0/24 and vice versa. Effectively this means that the only path to reach the branch office from the main office is via isdn-router.
[] There can potentially be routing problems with multiple IP networks on the same media segment, but if you can remember that IP routing is essentially stateless, you can plan around these routing problems and solve these problems. For a fuller discussion of these issues, see and .
D.2. ip route
Another part of the iproute2 suite of tools for IP management, ip route provides management tools for manipulating any of the routing tables. Operations include or the , , , , and and .
One thing to keep in mind when using ip route is that you can operate on any of the 255 routing tables with this command. Where the route command operated only on the main routing table (table 254), the ip route command operates by default on the main routing table, but can be easily coaxed into using other tables with the table parameter.
Fortunately, as mentioned earlier, the iproute2 suite of tools does not rely on DNS for any operation, so the ubiquitous -n switch in previous examples will not be required in any example here.
All operations with the ip route command are atomic, so each command will return either RTNETLINK answers: No such process in the case of an error, or nothing in the face of success. The -s switch, which provides additional statistical information when reporting link layer information, will only provide additional information when reporting on the state of the or .
The ip route utility when used in conjunction with the utility can create stateless NAT tables. It can even manipulate the local routing table, a routing table used for traffic bound for broadcast addresses and IP addresses hosted on the machine itself.
In order to understand the context in which this tool runs, you need to understand some of the basics of IP routing, so if you have read the above introduction to the ip route tool, and are confused, you may want to read and grasp some of the concepts of IP routing (with linux) before continuing here.
D.2.1. Displaying a routing table with ip route show
In its simplest form, ip route can be used to display the main routing table output. The output of this command is significantly different from the output of route. For comparison, let's look at the output of both route -n and ip route show.
Example D.11. Viewing the main routing table with ip route show
[root@tristan]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.99.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         192.168.99.254  0.0.0.0         UG    0      0        0 eth0
[root@tristan]# ip route show
192.168.99.0/24 dev eth0 scope link
127.0.0.0/8 dev lo scope link
default via 192.168.99.254 dev eth0
If you are accustomed to the route output format, the ip route output can seem terse. The same basic information is displayed, however. As with our former example, let's ignore the 127.0.0.0/8 loopback route for the moment. This is a required route for any IPs hosted on the loopback interface. We are far more interested in the other two routes.
The network 192.168.99.0/24 is available on eth0 with a scope of link, which means that the network is valid and reachable through this device (eth0). Refer to for definitions of possible scopes. As long as link remains good on that device, we should be able to reach any IP address inside of 192.168.99.0/24 through the eth0 interface.
Finally, our all-important default route is expressed in the routing table with the word default. Note that any destination which is reachable through a gateway appears in the routing table output with the keyword via. This final line matches semantically with the final line of output from route -n above.
Now, let's have a look at the local routing table, which we can't see with route. To be fair, it is usually completely unnecessary to view and/or manipulate the local routing table, which is why route provides no way to access this information.
Example D.12. Viewing the local routing table with ip route show table local
[root@tristan]# ip route show table local
local 192.168.99.35 dev eth0 proto kernel scope host src 192.168.99.35
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
broadcast 192.168.99.255 dev eth0 proto kernel scope link src 192.168.99.35
broadcast 127.0.0.0 dev lo proto kernel scope link src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
This gives us a good deal of information about the IP networks to which the machine is directly connected, and an inside look into the way that the routing tables treat special addresses like broadcast addresses and locally configured addresses.
The first field in this output tells us whether the route is for a broadcast address or an IP address or range locally hosted on this machine. Subsequent fields inform us through which device the destination is reachable, and notably (in this table) that the kernel has added these routes as part of bringing up the IP layer interfaces.
For each IP hosted on the machine, it makes sense that the machine should restrict accessibility to that IP or IP range to itself only. This explains why, in Example D.12, 192.168.99.35 has a host scope. Because tristan hosts this IP, there's no reason for the packet to be routed off the box. Similarly, a destination of localhost (127.0.0.1) does not need to be forwarded off this machine. In each of these cases, the scope has been set to host.
For broadcast addresses, which are intended for any listeners who happen to share the IP network, the destination only makes sense within a scope of devices connected to the same link layer [].
The final characteristic available to us in each line of the local routing table output is the src keyword. This is treated as a hint to the kernel about what IP address to select for a source address on outgoing packets on this interface. Naturally, this is most commonly used (and abused) on multi-homed hosts, although almost every machine out there uses this hint for connections to localhost [].
Now that we have inspected the main routing table and the local routing table, let's see how easy it is to look at any one of the other routing tables. This is as simple as specifying the table by its name in /etc/iproute2/rt_tables or by number. There are a few reserved table identifiers in this file, but the other table numbers between 1 and 252 are available for the user. Please note that this example is for demonstration only and has no intrinsic value other than showing the use of the table parameter.
Example D.13. Viewing a routing table with ip route show table
[root@tristan]# ip route show table special
Error: argument "special" is wrong: table id value is invalid
[root@tristan]# echo 7 special >> /etc/iproute2/rt_tables
[root@tristan]# ip route show table special
[root@tristan]# ip route add table special default via 192.168.99.254
[root@tristan]# ip route show table special
default via 192.168.99.254 dev eth0
In the above example you get a first glance at how to add a route to a table other than the main routing table, but what we are really interested in is the final command and output. In Example D.13, we have identified table 7 by the name "special" and have added a route to this table. The command ip route show table special shows us routing table number 7 from the kernel.
ip route consults /etc/iproute2/rt_tables for a table identifier. If it finds no identifier, it complains that it cannot find a reference to such a table. If a table identifier is found, then the corresponding routing table is displayed.
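The name-to-number lookup can be imitated with a few lines of shell. This is only a sketch of the idea, not how ip itself parses the file; table_id is a hypothetical helper name, and the sample file contents are an assumption modelled on the usual reserved entries plus the "special" entry added above.

```shell
#!/bin/sh
# Sketch only: table_id is a hypothetical helper name.
# table_id NAME [FILE]: print the numeric id registered for a table name.
# Exits non-zero if the name is not found.
table_id() {
    awk -v name="$1" '
        $1 ~ /^#/ { next }                       # skip comment lines
        $2 == name { print $1; found = 1; exit }
        END { exit !found }
    ' "${2:-/etc/iproute2/rt_tables}"
}

# Demonstrate against a temporary copy rather than the real file.
tmp=$(mktemp)
printf '%s\n' '# reserved values' '255 local' '254 main' '253 default' \
    '0 unspec' '7 special' > "$tmp"
table_id special "$tmp"   # prints 7
table_id main "$tmp"      # prints 254
rm -f "$tmp"
```

A lookup of this kind can be handy in scripts which need to check that a table name has been registered before trying to use it.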
The use of multiple routing tables can make a router very complex, very quickly. Using names instead of numbers for these tables can assist in the management of this complexity. For further discussion on managing multiple routing tables and some issues of handling them see .
D.2.2. Displaying the routing cache with ip route show cache
The routing cache is used by the kernel as a lookup table analogous to a quick reference card. It's faster for the kernel to refer to the cache (internally implemented as a hash table) for a recently used route than to look up the destination address again. Routes existing in the route cache are periodically expired.
The routing cache can be displayed in all its glory with ip route show cache, which provides a detailed view of recent destination IP addresses and salient characteristics about those destinations. On routers, masquerading boxen and firewalls, the routing cache can become very large. Instead of viewing the entire routing cache even on a workstation, we'll select a particular destination from the routing cache to examine.
Example D.14. Displaying the routing cache with ip route show cache
[root@tristan]# ip route show cache 192.168.100.17
192.168.100.17 from 192.168.99.35 via 192.168.99.254 dev eth0
    cache mtu 1500 rtt 18ms rttvar 15ms cwnd 15 advmss 1460
192.168.100.17 via 192.168.99.254 dev eth0 src 192.168.99.35
    cache mtu 1500 advmss 1460
FIXME! I don't know how to explain rtt, rttvar, and cwnd, even after reading Alexey's comments in the iproute2 documentation! Not only that, I'm not sure why there are two entries!
The output in Example D.14 summarizes the reachability of the destination 192.168.100.17 from 192.168.99.35. The first line of each entry provides some important information for us: the destination IP, the source IP, the gateway through which the destination is reachable, and the interface through which packets were routed. Together, these data identify a route entry in the cache.
Characteristics of that route are summarized in the second line of each entry. For the route between tristan and isolde, we see that Path MTU discovery has identified 1500 bytes as the maximum packet size from end to end. The maximum segment size (MSS) of data is 1460 bytes. Although this is not usually of any but the most casual of interest, it can be helpful diagnostic information.
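The relationship between the two numbers is simple arithmetic: the advertised MSS is the MTU less the IPv4 and TCP headers. A quick check, assuming headers of 20 bytes each (that is, no IP or TCP options):

```shell
#!/bin/sh
# Assumption: IPv4 and TCP headers without options, 20 bytes each.
mtu=1500
ip_header=20
tcp_header=20
advmss=$((mtu - ip_header - tcp_header))
echo "advmss for mtu $mtu: $advmss"   # prints: advmss for mtu 1500: 1460
```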
If you are a die-hard fan of statistics, and can't get enough information about the routing on your machine, you can always throw the -s switch.
Example D.15. Displaying statistics from the routing cache with ip -s route show cache
[root@tristan]# ip -s route show cache 192.168.100.17
192.168.100.17 from 192.168.99.35 via 192.168.99.254 dev eth0
    cache users 1 used 326 age 12sec mtu 1500 rtt 72ms rttvar 22ms cwnd 2 advmss 1460
192.168.100.17 via 192.168.99.254 dev eth0 src 192.168.99.35
    cache users 1 used 326 age 12sec mtu 1500 advmss 1460
With this output, you'll get just a bit more information about the routes. The most interesting datum is usually the "used" field, which indicates the number of times this route has been accessed in the routing cache. This can give you a very good idea of how many times a particular route has been used. The age field is used by the kernel to decide when to expire a cache entry. The age is reset every time the route is accessed [].
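If you want to pull a single counter out of this output in a script, a small awk filter is enough. The following is a minimal sketch, not part of iproute2: the helper name cache_field is hypothetical, and the sample text is copied from Example D.15 (with assumed line breaks) rather than read from a live routing cache.

```shell
# Sample text copied from Example D.15; the line breaks are an assumption.
sample='192.168.100.17 from 192.168.99.35 via 192.168.99.254 dev eth0
    cache users 1 used 326 age 12sec mtu 1500 rtt 72ms rttvar 22ms cwnd 2 advmss 1460'

# cache_field NAME (hypothetical helper): read route cache text on stdin
# and print the token that follows each occurrence of NAME.
cache_field() {
    awk -v key="$1" '{ for (i = 1; i < NF; i++) if ($i == key) print $(i + 1) }'
}

# Extract the usage counter and the age of the sample entry.
printf '%s\n' "$sample" | cache_field used
printf '%s\n' "$sample" | cache_field age
```

On a live machine you would pipe the output of ip -s route show cache 192.168.100.17 into cache_field instead of the sample text.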
In sum, you can use the routing cache to learn a good deal about remote IP destinations and some of the characteristics of the network path to those destinations.
D.2.3. Using ip route add to populate a routing table
ip route add is used to populate a routing table. Although you can use route to do the same thing, ip route add offers a large number of options that are not possible with the venerable route command. After we have looked at some simple examples, we'll discuss more complex routes with ip route.
In , we used two classic examples of adding a network route (to our service provider's network from ) and a host route. Let's look at the difference in syntax with the ip route command.
Example D.16. Adding a static route to a network with ip route add, cf.
[root@masq-gw]# ip route add 10.38.0.0/16 via 192.168.100.1
This is one of the simplest examples of the syntax of ip route. As you may recall, you can only add a route to a destination network through a gateway that is itself already reachable. In this case, masq-gw already knows a route to 192.168.100.1 (service-router). Now any packets bound for 10.38.0.0/16 will be forwarded to 192.168.100.1.
Other interesting examples of this command involve the use of prohibit and from. Use of prohibit will cause the router to report that the requested destination is unreachable. If you know a netblock that hosts a service you are not interested in allowing your users to access, this is an effective way to block the outbound connection attempts.
Let's look at an example of output which shows the prohibit route in action.
Example D.17. Adding a prohibit route with ip route add
[root@masq-gw]# ip route add prohibit 209.10.26.51
[root@tristan]# ssh 209.10.26.51
ssh: connect to address 209.10.26.51 port 22: No route to host
[root@masq-gw]# tcpdump -nnq -i eth2
tcpdump: listening on eth2
22:13:13.740406 192.168.99.35.51973 > 209.10.26.51.22: tcp 0 (DF)
22:13:13.740714 192.168.99.254 > 192.168.99.35: icmp: host 209.10.26.51 unreachable - admin prohibited filter [tos 0xc0]
Compare the ICMP packet returned to the sender in this case with the one returned if you used iptables and the REJECT target []. Although the net effect is identical (the user is unable to reach the intended destination), the user gets two different error messages. With an iptables REJECT, the user sees Connection refused, whereas the user sees No route to host with the use of prohibit. These are but two of the options for controlling outbound access from your network.
Supposing you don't want to block access to this particular host for all of your users, the from option comes to your aid.
Example D.18. Using from in a routing command with ip route add
[root@masq-gw]# ip route add prohibit 209.10.26.51 from 192.168.99.35
Now, you have effectively blocked the source IP 192.168.99.35 from reaching 209.10.26.51. Any packets matching this source and destination address will match this route. In this case, masq-gw will generate an ICMP error message indicating that the destination is administratively unreachable.
If you are still following along here, you can see that the options for identifying particular routes are many and multi-faceted. The src option provides a hint to the kernel for source address selection. When you are working with multiple routing tables and different classes of traffic, you can ease your administrative burden by hosting several different IPs on your linux machine and setting the source address differently, depending on the type of traffic.
In the example below, let's assume that our masquerading host also runs a DNS resolver for the internal network and we have selected all of the outbound DNS packets to be routed according to table 7 []. Now, any packet which originates on this box (or is masqueraded through this table) will have its source IP set to 205.254.211.198.
Example D.19. Using src in a routing command with ip route add
[root@masq-gw]# ip route add default via 205.254.211.254 src 205.254.211.198 table 7
FIXME!! I have nothing to say about nexthop yet, because I have never used it. This goes for equalize and onlink as well. If anybody has some examples s/he would like to contribute, I'd love to hear.
There are other options to ip route add documented in Alexey's thorough iproute2 documentation. For further research, I'd suggest acquiring and reading this manual.
D.2.4. Adding a default route with ip route add default
Naturally, one of the most important routes on a machine is its default route. Adding a default route is one of the simplest operations with ip route.
We need exactly one piece of information in order to set the default route on a machine. This is the IP address of the gateway. The syntax of the command is extremely simple and, aside from the use of via instead of gw, it is almost the same command as the equivalent route add default.
Example D.20. Setting the default route with ip route add default
[root@tristan]# ip route add default via 192.168.99.254
D.2.5. Setting up NAT with ip route add nat
Be sure to see for a full treatment of the issues involved in network address translation (NAT). If you are here to learn a bit more about how to set up NAT in your network, then you should know that ip route add nat is only half of the solution. Performing NAT with iproute2 involves one command to rewrite the inbound packet (ip route add nat), and another command to rewrite the outbound packet (ip rule add nat). If you only get half of the system in place, your NAT will only work halfway--or not at all, depending on how you define "work".
Alexey documents clearly in the appendix to the iproute2 manual that the NAT provided by the iproute2 suite is stateless. This is distinctly unlike NAT with netfilter. Refer to and for a better look at the connection tracking and network address translation support available under netfilter.
The ip route add nat command is used to rewrite the destination address of a packet from one IP or range to another IP or range. The iproute2 tools can only operate on the entire IP packet. There is no provision directly within the iproute2 suite to support conditional rewriting based on the destination port of a UDP datagram or TCP segment. It's the whole packet, every packet, and nothing but the packet [].
Example D.21. Creating a NAT route for a single IP with ip route add nat
[root@masq-gw]# ip route add nat 205.254.211.17 via 192.168.100.17
[root@masq-gw]# ip route show table local | grep ^nat
nat 205.254.211.17 via 192.168.100.17 scope host
The route entry we have just made tells the kernel to rewrite any inbound packet bound for 205.254.211.17 to 192.168.100.17. The actual rewriting of the packet occurs at the routing stage of the packet's trip through the kernel. This is an important detail, illuminated more fully in .
Not only can iproute2 support network address translation for single IPs, but also for entire network ranges. The syntax is substantially similar to the syntax above, but uses a CIDR network address instead of a single IP.
Example D.22. Creating a NAT route for an entire network with ip route add nat
[root@masq-gw]# ip route add nat 205.254.211.32/29 via 192.168.100.32
[root@masq-gw]# ip route show table local | grep ^nat
nat 205.254.211.32/29 via 192.168.100.32 scope host
In this example, we are adding a route for an entire network. Any IP packets which come to us destined for any address between 205.254.211.32 and 205.254.211.39 will be rewritten to the corresponding address in the range 192.168.100.32 through 192.168.100.39. This is a shorthand way to specify multiple translations with CIDR notation.
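The correspondence between the two ranges is plain offset arithmetic: the Nth address of the public block maps to the Nth address starting at the private base. The sketch below only prints the pairs implied by Example D.22 and does not touch any routing table. The helper names ip2int, int2ip and nat_pairs are hypothetical, and the script assumes a shell with 64-bit arithmetic.

```shell
# ip2int A.B.C.D: print the dotted-quad address as an integer.
ip2int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# int2ip N: print the integer N as a dotted-quad address.
int2ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# nat_pairs PUBLIC_NET/LEN PRIVATE_BASE: list each public address
# next to the private address it would be rewritten to.
nat_pairs() {
    net=${1%/*}
    len=${1#*/}
    pub=$(ip2int "$net")
    priv=$(ip2int "$2")
    count=$(( 1 << (32 - len) ))
    i=0
    while [ "$i" -lt "$count" ]; do
        a=$(int2ip "$(( pub + i ))")
        b=$(int2ip "$(( priv + i ))")
        echo "$a -> $b"
        i=$(( i + 1 ))
    done
}

nat_pairs 205.254.211.32/29 192.168.100.32
```

For the /29 above, this lists eight pairs, beginning with 205.254.211.32 -> 192.168.100.32 and ending with 205.254.211.39 -> 192.168.100.39.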
Again, this is only one half of the story for NAT with iproute2. Please be certain to read the section below for usage information on ip rule add nat, in addition to which will provide fuller documentation for NAT support under linux. Don't forget to use ip route flush cache after you add NAT routes and the corresponding NAT rules [].
D.2.6. Removing routes with ip route del
The ip route del command takes exactly the same syntax as the ip route add command, so if you have familiarized yourself with the syntax, this should be a snap.
It is, in fact, almost trivial to delete routes on the command line with ip route del. You can simply identify the route you wish to remove with the ip route show command and append the output line verbatim to ip route del.
Example D.23. Removing routes with ip route del []
[root@masq-gw]# ip route show
192.168.100.0/30 dev eth3 scope link
205.254.211.0/24 dev eth1 scope link
192.168.100.0/24 dev eth0 scope link
192.168.99.0/24 dev eth0 scope link
192.168.98.0/24 via 192.168.99.1 dev eth0
10.38.0.0/16 via 192.168.100.1 dev eth3
127.0.0.0/8 dev lo scope link
default via 205.254.211.254 dev eth1
[root@masq-gw]# ip route del 10.38.0.0/16 via 192.168.100.1 dev eth3
We identified the network route to 10.38.0.0/16 as the route we wished to remove, and simply appended the description of the route to our ip route del command.
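The same "find the line, append it to ip route del" step can be scripted. The sketch below is a dry run under stated assumptions: it works on a saved copy of the routing table from Example D.23 and only prints the command it would run. The helper name del_cmd_for is hypothetical.

```shell
# Routing table text copied from Example D.23 (one route per line).
routes='192.168.100.0/30 dev eth3 scope link
205.254.211.0/24 dev eth1 scope link
192.168.100.0/24 dev eth0 scope link
192.168.99.0/24 dev eth0 scope link
192.168.98.0/24 via 192.168.99.1 dev eth0
10.38.0.0/16 via 192.168.100.1 dev eth3
127.0.0.0/8 dev lo scope link
default via 205.254.211.254 dev eth1'

# del_cmd_for DEST (hypothetical helper): read "ip route show" text on
# stdin and print an "ip route del" command for every route whose
# destination field equals DEST exactly.
del_cmd_for() {
    awk -v dst="$1" '$1 == dst { print "ip route del " $0 }'
}

# Print, but do not execute, the command for the 10.38.0.0/16 route.
printf '%s\n' "$routes" | del_cmd_for 10.38.0.0/16
```

Printing the command first, and running it only after a look, is a cheap guard against removing the wrong route.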
This command can be used to remove routes such as broadcast routes and routes to locally hosted IPs in addition to manipulation of any of the other routing tables. This means that you can cause some very strange problems on your machine by inadvertently removing routes, especially routes to locally hosted IP addresses.
D.2.7. Altering existing routes with ip route change
Occasionally, you'll want to remove a route and replace it with another one. Fortunately, this can be done atomically with ip route change.
Let's change the default route on tristan with this command.
Example D.24. Altering existing routes with ip route change
[root@tristan]# ip route change default via 192.168.99.113 dev eth0
[root@tristan]# ip route show
192.168.99.0/24 dev eth0 scope link
127.0.0.0/8 dev lo scope link
default via 192.168.99.113 dev eth0
If you do use the ip route change command, you should be aware that it does not communicate a routing table state change to the routing cache, so here is another good place to get in the habit of using ip route flush cache.
There's not much more to say about the use of this command. If you don't want to use an ip route del immediately followed by an ip route add, you can use ip route change.
D.2.8. Programmatically fetching route information with ip route get
When configuring routing tables, it is not always sufficient to search for the destination manually. Especially with large routing tables, this can become a rather boring and time-consuming endeavor. Fortunately, ip route get elegantly solves the problem. By simulating a request for the specified destination, ip route get causes the routing selection algorithm to be run. When this is complete, it prints out the resulting path to the destination. In one sense, this is almost equivalent to sending an ICMP echo request packet and then using ip route show cache.
Example D.25. Testing routing tables with ip route get
[root@tristan]# ip -s route get 127.0.0.1/32
local 127.0.0.1 dev lo src 127.0.0.1
    cache
For casual use, ip route get is an invaluable tool. An obvious side effect of using ip route get is the increase in the usage count of every touched entry in the routing cache. While this is no problem, it will alter the count of packets which have used that particular route. If you are using ip to count outbound packets (people have done it!) you should be cautious with this command.
D.2.9. Clearing routing tables with ip route flush
The flush option, when used with ip route, empties a routing table or removes the route for a particular destination. In Example D.26, we'll first remove a route for a destination network using ip route flush, and then we'll remove all of the routes in the main routing table with one command.
If you do not wish to delete routes by hand, you can quickly empty all of the routes in a table by specifying a table identifier to the ip route flush command.
Example D.26. Removing a specific route and emptying a routing table with ip route flush
[root@masq-gw]# ip route flush
"ip route flush" requires arguments
[root@masq-gw]# ip route flush 10.38
Nothing to flush.
[root@masq-gw]# ip route flush 10.38.0.0/16
[root@masq-gw]# ip route show
192.168.100.0/30 dev eth3 scope link
205.254.211.0/24 dev eth1 scope link
192.168.100.0/24 dev eth0 scope link
192.168.99.0/24 dev eth0 scope link
192.168.98.0/24 via 192.168.99.1 dev eth0
127.0.0.0/8 dev lo scope link
default via 205.254.211.254 dev eth1
[root@masq-gw]# ip route flush table main
[root@masq-gw]# ip route show
[root@masq-gw]#
Note that you should exercise caution when using ip route flush table because you can easily destroy your own route to the machine by specifying the main routing table or a routing table that is used to send packets to your workstation. Naturally, this is not a problem if you are connected to the machine via a serial, modem, console, or other out of band connection.
D.2.10. ip route flush cache
Above, in , we looked at the contents of the routing cache, a hash table in the kernel which contains recently used routes. To quote John S. Denker, you should not forget to use ip route flush cache after you have changed the routing tables; "otherwise changes will take effect only after some maddeningly irreproducible delay." []
Since the kernel refers to the routing cache before fetching a new route from the routing tables, ip route flush cache empties the cache of any data. Now when the kernel goes to the routing cache to locate the best route to a destination, it finds the cache empty. Next, it traverses the routing policy database and routing tables. When the kernel finds the route, it will enter the newly fetched destination into the routing cache.
Example D.27. Emptying the routing cache with ip route flush cache
[root@tristan]# ip route show cache
local 127.0.0.1 from 127.0.0.1 tos 0x10 dev lo
    cache
When making routing changes to a linux box, you can save yourself some troubleshooting time (and confusion) by getting in the habit of finishing your routing commands with ip route flush cache.
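One way to build that habit is to wrap the two steps in a single shell function. This is only a sketch of the idea: the function name route_and_flush and the IP_CMD variable are hypothetical, and the demonstration runs with IP_CMD=echo so that it prints the commands instead of changing any routes.

```shell
# IP_CMD names the program to run. It defaults to ip; set it to echo
# to preview the commands without touching the routing tables.
IP_CMD=${IP_CMD:-ip}

# route_and_flush ARGS...: run "ip route ARGS..." and, only if that
# succeeds, follow it with "ip route flush cache".
route_and_flush() {
    "$IP_CMD" route "$@" && "$IP_CMD" route flush cache
}

# Dry run in a subshell: print the two commands from Example D.24.
(IP_CMD=echo; route_and_flush change default via 192.168.99.113 dev eth0)
```

With IP_CMD left at its default and root privileges, the same call would change the default route and then flush the routing cache.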
D.2.11. Summary of the use of ip route
With this overview of the use of the ip route utility, you should be ready to step into some advanced territory to harness multiple routing tables, take advantage of special types of routes, use network address translation, and gather detailed statistics on the usage of your routing tables.
[] I'm going to specifically neglect a discussion of bridging and broadcast addresses for now. Let's assume a simple Ethernet where the entire IP network is on one hub or switch.
[] When a user initiates a connection to localhost (let's say localhost:25, where a private SMTP server is listening), the connection could, of course, come from the IP assigned to any of the Ethernet interfaces. It makes the most sense, however, for the source IP to be set to 127.0.0.1, since the connection is actually initiated from on the local machine. Some services running on a local machine rely on the loopback interface and will restrict incoming connections to source addresses of 127.0.0.1. Frankly, I find this quite sensible for services which are not intended for public use.
[] Be wary of using ip route get and ip route show cache because ip route get implicitly causes a route lookup to be performed, thus increasing the used counter on the route, and resetting the age. This will alter the statistics reported by ip -s route show cache.
[] Please note that in the cross-referenced example I have used iptables. The same behaviour should be expected with ipchains. (Anybody have any proof?)
[] If you wonder how this kind of magic is accomplished, you'll want to read.
[] This should not lead you into believing it cannot be done. This is linux after all! By routing via fwmark, and using the --mark option to ipchains or the MARK target and --set-mark option in iptables, you can perform conditional routing based on characteristics and contents of the packet.
[] You can always use my and instead of entering your own commands, however, it is always important to understand the tool you are using.
[] Please note that this is the same routing table as is shown in the, which displays the output from route -n on masq-gw.
[] See this remark in his of a workaround with FreeS/WAN and iproute2 to approximate more RFC-like SPD behaviour for a linux IPSec tunnel.
D.3. ip rule
Another part of the iproute2 software package, ip rule is the single tool for manipulating the routing policy database (RPDB) under linux. For a fuller discussion of the RPDB, see . The RPDB can be displayed with ip rule show. Particular rules can be added and removed with (predictably, if you have been reading the sections on the other iproute2 tools) the ip rule add command and the ip rule del command. We'll make a particular example of ip rule add nat.
D.3.1. ip rule show
Briefly, the RPDB mediates access to the routing tables. In the overwhelming majority of installations (most workstations, servers, and even routers), there is no need to take advantage of the RPDB. A single IP routing table is all that is required for basic connectivity. In more complex networking configurations, however, the RPDB allows the administrator to programmatically select a routing table based on characteristics of a packet.
Along with this freedom and flexibility comes the power to break networking in strange and unexpected ways. I will reiterate: IP routing is stateless. Because IP routing is stateless, the network architect, planner or administrator needs to be aware of the issues involved with using multiple routing tables.
For a fuller discussion of some of these issues, be sure to read . Now, let's look at some of the ways to use ip rule.
D.3.2. Displaying the RPDB with ip rule show
To display the RPDB, use the command ip rule show. The output of the command is a list of rules in the RPDB sorted by order of priority. The rules with the highest priority will be displayed at the top of the output.
Example D.28. Displaying the RPDB with ip rule show
[root@isolde]# ip rule show
0: from all lookup local
32766: from all lookup main
32767: from all lookup 253
There are some interesting items to observe here. First, these are the three default rules in the RPDB which will be available on any machine with an RPDB. The first rule specifies that any packet from anywhere should first be matched against routes in the local routing table. Remember that the local routing table is for broadcast addresses on link layers, network address translation, and locally hosted IP addresses.
If a packet is not bound for any of these three destinations, the kernel will check the next entry in the RPDB. In the simple case above, on isolde, a packet bound for 205.254.211.182 would first pass through the local routing table without matching any of the local destinations. The next entry in the RPDB recommends using the main routing table to select a destination route.
In isolde's main routing table, it is likely that there is no host nor network match for this destination, thus the packet will match the default route in the main routing table.
FIXME!! Can anybody (somebody?) explain to me why there is a rule priority 32767 which refers to table 253? I'm still confused about this.
D.3.3. Adding a rule to the RPDB with ip rule add
Adding a rule to the routing policy database is simple. The syntax of the ip rule add command should be familiar to those who have read or have used the ip route to populate routing tables.
A simple rule selects a packet based on the packet's characteristics. Some characteristics available as selection criteria are the source address, the destination, the type of service (ToS), the interface on which the packet arrived, and an fwmark.
One great way to take advantage of the RPDB is to split different types of traffic to different providers based on packet characteristics. Let's assume two network connections on masq-gw, one that is a highly reliable, high cost connection, and another that is a much lower cost, less reliable connection. Let's also assume that we are using Type of Service flags on IP packets on the internal network.
We might want to prefer a low-latency, highly reliable link for one type of packet. By using tos as a selection criterion with ip rule we can effectively route these packets via our faster and more reliable internet connection.
Example D.29. Creating a simple entry in the RPDB with ip rule add []
[root@masq-gw]# ip route add default via 205.254.211.254 table 8
[root@masq-gw]# ip rule add tos 0x08 table 8
[root@masq-gw]# ip route flush cache
[root@masq-gw]# ip rule show
0: from all lookup local
32765: from all tos 0x08 lookup 8
32766: from all lookup main
32767: from all lookup 253
Note that because we did not specify a priority, the rule we inserted took the next available higher priority in the RPDB (32765, just ahead of the rule for table main). If we wished to specify a priority, we could use prio.
Now any packet with an IP ToS field matching 0x08 will be routed according to the instructions in table 8. If no route in table 8 applies to the matched packet (not possible, since we added a default route), the packet would be routed according to the instructions in table "main".
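To make the order of consultation concrete, the following sketch walks the rule list from Example D.29 and prints the tables that would be tried, in priority order, for a packet with a given ToS value. It is a deliberately simplified model under stated assumptions: it looks only at the tos selector (every rule in the example is "from all"), it does not consult any routes, and the helper name tables_for_tos is hypothetical. In the kernel, the walk stops at the first table that yields a matching route.

```shell
# Rule list copied from the ip rule show output in Example D.29.
rules='0: from all lookup local
32765: from all tos 0x08 lookup 8
32766: from all lookup main
32767: from all lookup 253'

# tables_for_tos TOS (hypothetical helper): read "ip rule show" text on
# stdin and print the table of every rule that has no tos selector or
# whose tos selector equals TOS. Input order is preserved, so the
# output is in priority order.
tables_for_tos() {
    awk -v tos="$1" '{
        want = ""; table = ""
        for (i = 1; i < NF; i++) {
            if ($i == "tos") want = $(i + 1)
            if ($i == "lookup") table = $(i + 1)
        }
        # Append "" to force a string comparison of the ToS values.
        if (want == "" || (want "") == (tos "")) print table
    }'
}

# A packet marked with ToS 0x08 gets table 8 ahead of table main.
printf '%s\n' "$rules" | tables_for_tos 0x08
# A packet with a different ToS value skips table 8 entirely.
printf '%s\n' "$rules" | tables_for_tos 0x10
```

The first call lists local, 8, main and 253; the second lists only local, main and 253.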
The selection criteria for matching a packet can be grouped. Let's look at a more complex example of ip rule where we use multiple selection criteria.
Example D.30. Creating a complex entry in the RPDB with ip rule add
[root@masq-gw]# ip rule add from 192.168.100.17 tos 0x08 fwmark 4 table 7
Frankly, that's a very complex rule! I do not know if I could describe a scenario where this particular rule would be required. The point, though, is that you can have arbitrarily complex selection criteria, and multiple rules which look up routes in as many of the 253 routing tables as you wish.
ip rule add, while a powerful tool, can quickly make a routing table or router too complex to easily understand. It's important to try to design and implement the simplest configuration to maintain on your router. If you cannot avoid using multiple routing tables and the RPDB, at least be systematic about it.
D.3.4. ip rule add nat
As discussed more thoroughly in , this is the other half of iproute2 supported network address translation. The two components are ip route add nat and ip rule add nat.
ip rule add nat is used to rewrite the source IP on packets during the routing stage. Each packet from the real IP is translated to the NAT IP without altering the destination address of the packet.
NAT is commonly used to publish a service in an internal network on a public IP. Thus packets returning to the public network need to be readdressed to appear with a source address of the publicly accessible IP.
Example D.31. Creating a NAT rule with ip rule add nat
[root@masq-gw]# ip rule add nat 205.254.211.17 from 192.168.100.17
[root@masq-gw]# ip rule show
0: from all lookup local
32765: from 192.168.100.17 lookup main map-to 205.254.211.17
32766: from all lookup main
32767: from all lookup 253
In more complex situations, entire subnets can be translated to provide NAT for a range of IPs. The example below shows how to specify the ip rule add nat command to complete the NAT mapping in Example D.22.
Example D.32. Creating a NAT rule for an entire network with ip rule add nat
[root@masq-gw]# ip rule add nat 205.254.211.32 from 192.168.100.32/29
[root@masq-gw]# ip rule show
0: from all lookup local
32765: from 192.168.100.32/29 lookup main map-to 205.254.211.32
32766: from all lookup main
32767: from all lookup 253
Notice the ip rule synonym for the nat option. It is valid to substitute map-to for nat.
D.3.5. ip rule del
Naturally, no iproute2 tool would be complete without the ability to undo what has been done. With ip rule del, individual rules can be removed from the RPDB.
It is at first quite confusing that the word all in the ip rule show output needs to be replaced with the network address 0/0. I do not know why all is not acceptable as a synonym for 0/0, but you'll save yourself some headache by getting in the habit of replacing all with 0/0.
By replacing the verb add in any of the command lines above with the verb del, you can remove the specified entry from the RPDB.
Example D.33. Removing a NAT rule for an entire network with ip rule del nat
[root@masq-gw]# ip rule del nat 205.254.211.32 from 192.168.100.32/29
[root@masq-gw]# ip rule show
0: from all lookup local
32766: from all lookup main
32767: from all lookup 253
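The habit of replacing all with 0/0 can be captured in a small text filter that turns a line of ip rule show output into a candidate ip rule del command. This is a sketch under stated assumptions: the helper name rule_del_cmds is hypothetical, the result is printed rather than executed, and it assumes that ip rule del accepts the lookup keyword exactly as it appears in the listing. Check the printed command before running it.

```shell
# rule_del_cmds (hypothetical helper): read "ip rule show" lines on
# stdin; strip the leading "PRIORITY:" field, rewrite "from all" as
# "from 0/0", and prefix the result with "ip rule del".
rule_del_cmds() {
    sed -e 's/^[0-9]*:[[:space:]]*//' \
        -e 's|^from all |from 0/0 |' \
        -e 's/^/ip rule del /'
}

# Dry run on the ToS rule listed in Example D.29.
printf '%s\n' '32765: from all tos 0x08 lookup 8' | rule_del_cmds
```

For the line above, the filter prints ip rule del from 0/0 tos 0x08 lookup 8.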
The ip rule utility can be a great boon in the manipulation and maintenance of complex routers.
[] Please note that this is an incomplete example. Simply put, I'm not dealing with the issues of inbound packets or packets destined for locally connected networks in this example. Keep in mind the instructional nature of this example, and plan your own network accordingly. For a fuller discussion of the issues involved with handling multiple Internet links, see . Note also, that there is no corresponding network connection in the example network for this network connection.