Tuesday, November 22, 2011

NetworkManager: per device routing tables


Today Alex Fiestas told me about a problem he is having, one that I also had long ago.

Everybody uses the Internet nowadays, and most of you probably know what an IP address, a netmask and a gateway address are. For example, my notebook is using this configuration right now:

eth0
IP address: 192.168.1.10
Netmask: 255.255.255.0
Gateway: 192.168.1.1

wlan0
IP address: 192.168.1.12
Netmask: 255.255.255.0
Gateway: 192.168.1.1

Yes, two devices in the same local network, but that does not invalidate this post :-)
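To make those numbers concrete, here is a minimal sketch that derives the prefix length (the "/24" form used by the ip command) and the network number from an address and a netmask. The helper names mask_to_prefix and network_of are mine, purely for illustration, and the sketch assumes a valid, contiguous IPv4 netmask.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of NetworkManager or iproute2.

# Convert a dotted netmask such as 255.255.255.0 to a prefix length.
# Assumes the mask is contiguous (no check is made for that).
mask_to_prefix() {
    prefix=0
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the mask into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        case $octet in
            255) prefix=$((prefix + 8)) ;;
            254) prefix=$((prefix + 7)) ;;
            252) prefix=$((prefix + 6)) ;;
            248) prefix=$((prefix + 5)) ;;
            240) prefix=$((prefix + 4)) ;;
            224) prefix=$((prefix + 3)) ;;
            192) prefix=$((prefix + 2)) ;;
            128) prefix=$((prefix + 1)) ;;
            0) ;;
            *) return 1 ;;   # not a valid netmask octet
        esac
    done
    echo "$prefix"
}

# AND each octet of the address ($1) with the netmask ($2).
network_of() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2       # eight positional parameters: 4 address + 4 mask octets
    IFS=$old_ifs
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
}

mask_to_prefix 255.255.255.0             # prints 24
network_of 192.168.1.10 255.255.255.0    # prints 192.168.1.0
```

Both of my devices therefore sit in 192.168.1.0/24, which is why they share the same gateway address.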

Although both of my devices have the same gateway IP address, the Linux kernel sees 192.168.1.1/eth0 and 192.168.1.1/wlan0, not 192.168.1.1 alone. When there is only one active device there is no problem, but what happens if you have two, like my notebook?

Only one of those two IP address/device pairs is the default route, that is, the pair the Linux kernel falls back to when no more specific route matches the destination. Enter NetworkManager. NetworkManager sorts devices by type when deciding which one to configure as the default route. If I am not mistaken it sorts like this: wired, wifi, mobile broadband.
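As a toy model of that ordering (and only that: this is not NetworkManager's code, and the type labels "wired", "wifi" and "mobile" are names I made up for the sketch), picking the device that gets the default route could look like this:

```shell
#!/bin/sh
# Toy model of the ordering described above; hypothetical names throughout.

# Lower rank wins: wired, then wifi, then mobile broadband.
type_rank() {
    case $1 in
        wired)  echo 1 ;;
        wifi)   echo 2 ;;
        mobile) echo 3 ;;
        *)      echo 9 ;;   # unknown types go last
    esac
}

# Read "device type" pairs on stdin and print the best-ranked device.
pick_default_device() {
    best_dev=""
    best_rank=99
    while read -r dev type; do
        rank=$(type_rank "$type")
        if [ "$rank" -lt "$best_rank" ]; then
            best_rank=$rank
            best_dev=$dev
        fi
    done
    echo "$best_dev"
}

printf 'wlan0 wifi\neth0 wired\n' | pick_default_device   # prints eth0
```

With only wlan0 in the list the function prints wlan0; as soon as eth0 appears it wins, which is exactly the switch that causes the trouble.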

Now the problem:

Suppose you are connected using wlan0 only, with Kmail, Kopete, Konversation, etc. up and running. Then you decide to plug in the ethernet cable (eth0) and keep wlan0 active as well. NetworkManager will switch the default route from wlan0 to eth0. Now what happens to the connections Kmail, Kopete and Konversation opened when wlan0 was the default route? Well, if you do not have a proper routing table configuration they will stop working, because the Linux kernel, by default, no longer expects to receive IP packets from the old gateway's IP address/device pair, only from the new gateway's IP address/device pair.

When the connections time out (in a couple of minutes or so) the Linux kernel will close them, but not before you get annoyed at seeing your connections close in front of you even though you have two active network connections, not just one.

Long ago I created a script similar to this one to solve this problem for myself. I had even forgotten that I had this configuration on my notebook. If you have the same problem just do:

Code
$ wget http://kde-mg.org/wp-content/uploads/2011/11/per_device_routing_tables.txt
$ sudo chown root:root per_device_routing_tables.txt
$ sudo chmod 700 per_device_routing_tables.txt
$ sudo mv per_device_routing_tables.txt \
  /etc/NetworkManager/dispatcher.d/per_device_routing_tables.sh

Add one line per device to file /etc/iproute2/rt_tables like this:
File /etc/iproute2/rt_tables
100     eth0
101     wlan0

The first column contains the table's id; the ids just need to be different from each other and from the ids already listed in the file. The second column contains the table's name. The script above assumes the table's name is equal to the device's interface name.
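If you want to double-check your entries, here is a small sketch that reads a file in rt_tables format. The function names table_id_for and duplicate_ids are hypothetical helpers of mine, not part of iproute2; both skip comment lines.

```shell
#!/bin/sh
# Hypothetical helpers for checking a file in /etc/iproute2/rt_tables format.

# Print the table id registered for a table name ($1) in file ($2).
# Prints nothing if the name is not listed.
table_id_for() {
    awk -v name="$1" '/^[^#]/ && $2 == name { print $1; exit }' "$2"
}

# Print every table id that is used more than once in file ($1).
duplicate_ids() {
    awk '/^[^#]/ && NF >= 2 { if (seen[$1]++ == 1) print $1 }' "$1"
}
```

For example, table_id_for wlan0 /etc/iproute2/rt_tables should print 101 with the lines above in place, and duplicate_ids /etc/iproute2/rt_tables should print nothing.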

Finally, restart all your connections. From then on, when NM switches the default route, all existing connections will keep working.

What the script does is create one routing table for each device (one for eth0 and one for wlan0 in the example above). There is one default route in each routing table, and the global default route (the one NM changes) still exists. Even when the global default route is changed, the per-device default routes remain intact and let the existing connections reach the Internet using the correct IP address/device pair.
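To give an idea of what that looks like in practice, the sketch below only prints the kind of iproute2 commands involved for one device; it does not run them, and the real script may differ in its details. The function name print_device_routes is hypothetical, the values are the ones from my notebook above, and the table names assume the rt_tables entries shown earlier.

```shell
#!/bin/sh
# Dry run: print (do not execute) per-device routing commands.
# Arguments: interface, address, network/prefix, gateway.
# The table is assumed to be named after the interface.
print_device_routes() {
    iface=$1 addr=$2 net=$3 gw=$4
    # Local network reachable directly through this device.
    echo "ip route add $net dev $iface src $addr table $iface"
    # Default route that belongs to this device only.
    echo "ip route add default via $gw dev $iface table $iface"
    # Packets sent from this device's address look up its own table.
    echo "ip rule add from $addr table $iface"
}

print_device_routes eth0 192.168.1.10 192.168.1.0/24 192.168.1.1
print_device_routes wlan0 192.168.1.12 192.168.1.0/24 192.168.1.1
```

The rule on the source address is what ties an existing connection to its device: a connection opened from 192.168.1.12 keeps using the wlan0 table no matter where the global default route points.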

Enjoy :-)

17 comments:

Anonymous said...

Isn't this fixed now in NM 0.9.2? Please file a bug if it is not.

Rudd-O said...

I fixed this problem by simply multihoming: the SAME IP address on several different devices in my laptop.

Now, when I unplug or plug the network cable, traffic seamlessly and autonomically moves from device to device, without losing any connectivity.

You should try it:


/var/lib/dhclient@karen.dragonfear Ω:
ifconfig
eth0 Link encap:Ethernet HWaddr 14:DA:E9:17:ED:36
inet addr:10.254.102.5 Bcast:10.254.102.255 Mask:255.255.255.0
inet6 addr: fe80::16da:e9ff:fe17:ed36/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15987376 errors:0 dropped:0 overruns:0 frame:0
TX packets:3022389 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:23384085170 (21.7 GiB) TX bytes:1394768076 (1.2 GiB)
Interrupt:46 Base address:0x4000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:753 errors:0 dropped:0 overruns:0 frame:0
TX packets:753 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:75974 (74.1 KiB) TX bytes:75974 (74.1 KiB)

tun0 Link encap:Ethernet HWaddr 7E:BB:84:87:2B:FC
inet addr:10.254.102.5 Bcast:10.254.102.255 Mask:255.255.255.0
inet6 addr: fe80::7cbb:84ff:fe87:2bfc/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1166 errors:0 dropped:0 overruns:0 frame:0
TX packets:627 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:152914 (149.3 KiB) TX bytes:91418 (89.2 KiB)

wlan0 Link encap:Ethernet HWaddr 48:5D:60:CB:3E:F5
inet addr:10.254.102.5 Bcast:10.254.102.255 Mask:255.255.255.0
inet6 addr: fe80::4a5d:60ff:fecb:3ef5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:286 errors:0 dropped:0 overruns:0 frame:0
TX packets:183 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:71242 (69.5 KiB) TX bytes:52734 (51.4 KiB)

Rudd-O said...

See? Three devices (one virtual VPN that loops back to my LAN). I unplug the LAN cable, move my wireless network to the phone, and roam away. Everything continues to operate normally. Even the NFS server mounts continue to operate.

Multihoming rocks.

Gaurav Chaturvedi said...

Hey great post.
One minor observation :-

In a few distros we don't idle in the root shell.

So it would be really nice, if you could add sudo in front of your commands to download and copy your bash script.

Pedro Alves said...

@Rudd-O Hey, that's very interesting. Now if NetworkManager had an option to configure that automatically for me, it'd be super.

Lamarque said...

@Anonymous, there is this bug already, which can be solved using the configuration from my script.

@Rudd-O, you use static IP configuration, right? If not, then you would need to configure each and every DHCP server your computer uses to give the same IP address to your computer's network devices. I would rather use a configuration that works transparently with any DHCP server, which is what most people use nowadays.

My script works with any IPv4 DHCP server and can be adapted to work with IPv6, ppp, and static IP.

I have never used a configuration like yours though. In my configuration all connections that pass through one device continue to pass through only that device when the second device is activated. Does your configuration allow a connection to move to a different device? I do not think so, but I cannot be sure since I have never used multihoming.

@Gaurav Chaturvedi, Ok, done.

@Pedro Alves, NM has the option "Use this connection only for resources on its network". However, I think it does not configure per device routing tables, it only makes the device not set a default route as far as I know.

Fri13 said...

@Lamarque
Next time, if you replace "kernel" with "operating system", it works correctly for every Unix system. The Linux kernel is a monolithic kernel, which means the Linux kernel is the whole operating system. "Kernel" and "operating system" are synonyms for each other.

But on Unix operating systems that are not monolithic by architecture but server-client, "kernel" does not work so well, as people confuse "kernel" with "microkernel". In server-client architecture OSes (like HURD, MINIX, XNU, NT and so on) the microkernel has nothing to do with networking, as networking is done by a server, not by the microkernel.
There are a few BSD-family OSes that are not monolithic but server-client by architecture, so it is wiser to use the common term "operating system" than its synonym "kernel".

@Gaurav Chaturvedi

"So it would be really nice, if you could add sudo in front of your commands to download and copy your bash script."

NO! Never, EVER add sudo in front of the program names. Sudo is just one program among others that can be typed into bash or other command-line shells, and it is not wise to use sudo the way it is configured in Ubuntu, where it replaces the root user.

Every guide should be written without the sudo program in the command line.

If a person is too lazy to type sudo because their distributor is careless about security and does not care about a good sudo configuration, then it is that user's job to learn it.

And when copying and pasting, almost everyone else cannot simply paste the command; they need to go back and remove sudo from the front before executing the actual program.

Sudo users have it much easier: they just type sudo and then paste. If they forget to type sudo first, it is their fault.

And bash scripts should be written with Unix-style portability in mind, which means sudo should not be mentioned at all.
On wisely configured systems, sudo is not set up as it is today in Ubuntu or its variants; it is always configured not to replace the root user or give all permissions to a user.

Fri13 said...

Are there any plans to add the old Unix feature of making multiple network connections work as a single one?

In the days of 56k and ISDN connections, people doubled or even tripled their connection speeds by joining all their network connections into a single, faster one.

That would still be valid today, when people have a 1 Mbit/s ADSL connection and another 1 Mbit/s connection on their phone as a hotspot.

Of course it does not work so well for games because of latency, but for downloading and other service functions it would still be great today.

And maybe some day we can have the same kind of functionality in NetworkManager as we have in PulseAudio, so that we can decide which connection each application uses.

For example, put ktorrent on the 10 Mbit/s eth0 connection and the web browser on the 1 Mbit/s WLAN.
Or even the other way round, so the user can decide that YouTube playback gets the full speed of one connection while every other application keeps using the other.

Lamarque said...

@Fri13, since I do not know how other operating systems behave regarding routing tables, I had better use "Linux kernel" instead of "operating system" :-P

Linux is monolithic modular. Still monolithic in the sense that all operating system functions (scheduling, drivers, filesystems, etc.) are implemented in kernel mode, but by being modular the Linux kernel overcomes almost all the shortcomings of not being a micro-kernel, so it is important to say that too.

In well configured systems if you run sudo as root it will also work. You do not need to remove it from the command if you do a copy and paste.

PS: I use Gentoo and used to use Unix (Sun OS, Solaris, AIX) long ago before using Linux. I also do not like the way Ubuntu configures sudo by default (without asking for password).

Pedro Alves said...

@Lamarque Rudd-O did say:

"Now, when I unplug or plug the network cable, traffic seamlessly and autonomically moves from device to device, without losing any connectivity."

I'd guess that his tun0 interface serves as frontend, and routing is transparent to the appropriate real interface, whichever's present.

I think it should be possible to use such a scheme with DHCP, though it may take some smart probing.

Fri13 said...

@Lamarque
"Linux is monolithic modular. Still monolithic in the sense all operating system functions (scheduling, drivers, filesystems, etc) are implemented in kernel mode, but by being modular the Linux kernel overcomes almost all shortcomings of not being a micro-kernel, so it is important to say it too."

Kernel mode vs. user mode is not the point, since a monolithic kernel can be modular (as Linux has been since version 2.2, or was it 2.0...), as you said; the difference is in the architecture, in how the modularity is actually done. In a server-client architecture the servers (not modules) can live in either kernel space or user space; it does not matter. But the architecture is such that every server is like its own program, shielded from the others. The architecture is the same whichever of the two spaces the server is located in.
But in a monolithic architecture a module can never exist in user space (as you know). And the difference from the server-client architecture is that once a module is loaded (like a device driver or network protocol), it works as if it had never been separate from the main OS image.
So the difference really is that in a modular monolithic OS the separation is at the binary level, whereas in a server-client OS the separation is at the architecture level.

"In well configured systems if you run sudo as root it will also work. You do not need to remove it from the command if you do a copy and paste."

I have seen well configured systems (run by everyone from crazy security nazis to more relaxed security people) where sudo is set up in ways that sysadmins could debate, but I would say that on the other half of well configured systems it is possible. Still, I would take the non-sudo version of a script (or tutorial), which can then be edited if needed to work for the "other half". =)

"I also do not like the way Ubuntu configures sudo by default (without asking for password)."

Has Ubuntu moved to that now? Or did I misunderstand, and it just does not ask for the password again once it has been given to sudo?
I have mostly used systems where sudo is used to grant specific people specific rights on specific networks/computers, which is how sudo was designed. No one even knows who the root user is, and it works very well. No one can execute all commands, just very specific ones.

The security habits that Ubuntu teaches its users (a single password for everything) make the situation even worse. Too many times I have seen Ubuntu users using the same password, with the same login name, for login and for most web sites and online services. So all an attacker needs to do is intercept the password at some point, in some manner, and root access to the whole computer is possible. Whether it is through malware (in the user directory) or just someone watching over the shoulder, it is too risky to teach people to have a single password for system administration and the user account.

And that reminds me that a few days ago I noticed for the first time that NM can store network settings system-wide instead of per user. Very easy, just by clicking "system connection".
Don't know why I hadn't noticed it earlier.

Lamarque said...

@Fri13, yes, I knew the difference between modular monolithic and micro-kernel, and sudo can be configured to ask for the password even from root.

As I said I do not use Ubuntu, but in every (K)Ubuntu system where I had to do something, sudo was configured to never ask for the password.

The system-connection checkbox has worked since April, if I recall correctly, and has always worked in the nm09 branch.

misc said...

There is a rather important security issue in your script: you should not write, as root, to a file with a predictable name in /tmp.

If someone just links /tmp/log_nm2.txt to another file (like ln -s /etc/shadow /tmp/log_nm2.txt), that file may get written to (it depends on the distro; some kernels carry patches against that).

Lamarque said...

@misc, ok, I have changed the log file path to /var/log/$0.log. I just hope the user does not change the script filename to daemon or Xorg.0, or it will overwrite /var/log/daemon.log or /var/log/Xorg.0.log :-)
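For anyone adapting the idea, one way to build such a log path from the script's own name is sketched below. This is only an illustration: log_path_for is a hypothetical helper, not the code in my script, and it assumes $0 may hold a full path ending in .sh, which basename strips.

```shell
#!/bin/sh
# Hypothetical helper: derive a log file path under /var/log from a script path.
log_path_for() {
    # basename drops the directory part and the optional .sh suffix.
    echo "/var/log/$(basename "$1" .sh).log"
}

log_path_for /etc/NetworkManager/dispatcher.d/per_device_routing_tables.sh
# prints /var/log/per_device_routing_tables.log
```

Inside a script you would call it as log_path_for "$0".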

Anonymous said...

A minor improvement; remove the rules when pulling the interfaces down!

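#!/bin/sh
# NetworkManager dispatcher script: $1 is the interface name, $2 is the action.

export PATH="/bin:/sbin:/usr/bin:/usr/sbin"

# Succeed if $1 is listed as a table name (second column) in /etc/iproute2/rt_tables.
has_table() {
    [ -n "$(awk -v name="$1" '/^[^#]/ && $2 == name { print $2 }' /etc/iproute2/rt_tables)" ]
}

case "$2" in
    up|dhcp4-change)
        if has_table "$DEVICE_IP_IFACE"
        then
            # Take the prefix length from the first field ("address/prefix") of IP4_ADDRESS_0.
            IP_PREFIX=$(echo "$IP4_ADDRESS_0" | cut -d ' ' -f 1 | cut -d / -f 2)
            # Route to the local network, in the per-device table and in the main table.
            ip route add "$DHCP4_NETWORK_NUMBER/$IP_PREFIX" dev "$DEVICE_IP_IFACE" src "$DHCP4_IP_ADDRESS" table "$DEVICE_IP_IFACE"
            ip route add "$DHCP4_NETWORK_NUMBER/$IP_PREFIX" dev "$DEVICE_IP_IFACE" src "$DHCP4_IP_ADDRESS"
            # Per-device default route via the first router in DHCP4_ROUTERS.
            ip route add default via "$(echo "$DHCP4_ROUTERS" | cut -d ' ' -f 1)" table "$DEVICE_IP_IFACE"
            # Packets sent from this device's address use the per-device table.
            ip rule add from "$DHCP4_IP_ADDRESS" table "$DEVICE_IP_IFACE"
        fi
        ;;
    down)
        if has_table "$1"
        then
            # Remove the rule that points at this device's table.
            ip rule del table "$1"
        fi
        ;;
    *)
        exit 0
        ;;
esac

exit 0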
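As a side note on the IP_PREFIX line: the two cut calls can also be written with plain shell parameter expansion, which avoids spawning extra processes. The sketch below assumes, as the script does, that the value is shaped like "address/prefix gateway"; the function name prefix_of and the sample value are only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: extract the prefix length from a value shaped like
# "address/prefix gateway" (the shape the script expects in IP4_ADDRESS_0).
prefix_of() {
    first=${1%% *}       # keep everything before the first space
    echo "${first#*/}"   # keep everything after the first slash
}

prefix_of "192.168.1.10/24 192.168.1.1"   # prints 24
```

Both forms give the same result for values of that shape.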

Lamarque said...

Thanks for the tip.

Scott Bertilson said...

Very handy, thanks much for creating this capability. Disappointing that NetworkManager still doesn't seem to have integrated this capability. Drove me crazy years back that my Windows XP laptop would allow me to keep wireless connections alive when also activating Ethernet, but Linux wouldn't. Finally!

Not sure if I'm missing something, but I don't think this route is necessary since the kernel supplies it when the interface receives its address:
ip route add $DHCP4_NETWORK_NUMBER/$IP_PREFIX dev $DEVICE_IP_IFACE src $DHCP4_IP_ADDRESS
At least my routes look less cluttered without it and it doesn't seem to have caused me any issues to have commented it out.