(not really, as usual, once you know what's going on...)
During a network troubleshooting session, while running the excellent mtr to a remote IP, I needed to run another instance of it to another destination.
So while I was running this:
    Host                  Loss%   Snt   Last    Avg   Best   Wrst  StDev
    1. gw.office           0.0%   148    0.3    0.4    0.2    3.9    0.7
    2. ge-0-1.transit      0.0%   148    1.6    4.7    1.3   74.1    9.5
    3. rtr4.example.com    0.0%   148    1.1   15.9    0.9  195.1   38.8
    ...
I opened another window and ran another mtr. Right from the beginning, this second mtr (but it could just as easily have been the first one, as we'll see) was showing a ridiculous percentage of packet loss for the first hop:
    Host                               Loss%   Snt   Last    Avg   Best   Wrst  StDev
    1. gw.office                       85.7%     8    0.3    0.3    0.3    0.3    0.0
    2. ge-0-1.transit                   0.0%     8    1.8    1.7    1.4    2.2    0.3
    3. lon-par.upstream2.example.com    0.0%     8    1.1    1.1    1.0    1.2    0.1
    ...
Obviously there was something strange (especially because packets were certainly getting through, given the 0% loss at the following hops), but what?
Before proceeding, let's briefly review how mtr works. The full story is here; the executive summary is that, like any good traceroute-like program, mtr needs to be able to receive ICMP error messages as proof that hosts along the path are alive. The error packets that are returned are of the "time to live exceeded" kind, except for the last hop which sends back a regular echo reply in ICMP mode, and a "port unreachable" ICMP error when in UDP mode.
In the above scenario, I was using ICMP mode, but it would have been the same in UDP mode.
If mtr is expecting an ICMP error message to come back from a certain host but doesn't receive it, it considers this a symptom of packet loss at that hop.
Now, let's go back to our two concurrent mtr instances. As can be seen, the first hop (where the supposed packet loss was happening according to the second instance) is the same for both traces. Since each mtr instance by default sends one packet per second, the office gateway should have been sending back two ICMP error messages every second. But tcpdump on my machine showed only ONE such message per second. So obviously, the mtr instance that wasn't getting its expected ICMP error was declaring packet loss at the first hop.
Now here's the catch: the office gateway is a Linux machine, and by default Linux limits the rate at which certain ICMP packets can be sent. The limit is per-IP. There are a couple of files under
It turns out that by default the ICMP error messages that mtr needs (both "time to live exceeded" and "port unreachable") are rate-limited, and the limit is one packet per second. This is from man 7 icmp:
    icmp_ratelimit (integer; default: 1000; since Linux 2.4.10)
        Limit the maximum rates for sending ICMP packets whose type matches icmp_ratemask (see below) to specific targets. 0 to disable any limiting, otherwise the minimum space between responses in milliseconds.

    icmp_ratemask (integer; default: see below; since Linux 2.4.10)
        Mask made of ICMP types for which rates are being limited.

        Significant bits: IHGFEDCBA9876543210
        Default mask:     0000001100000011000 (0x1818)

        Bit definitions (see the Linux kernel source file include/linux/icmp.h):

            0 Echo Reply
            3 Destination Unreachable *
            4 Source Quench *
            5 Redirect
            8 Echo Request
            B Time Exceeded *
            C Parameter Problem *
            D Timestamp Request
            E Timestamp Reply
            F Info Request
            G Info Reply
            H Address Mask Request
            I Address Mask Reply

        The bits marked with an asterisk are rate limited by default (see the default mask above).
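To double-check which ICMP types the default mask covers, the bit table from the man page excerpt can be decoded mechanically. The following is a small sketch in Python; the function name rate_limited_types is made up for this illustration, and the type table is copied from the excerpt above.

```python
# ICMP type numbers and names, as listed in the man page excerpt.
# Bits A..I of the mask correspond to types 10..18.
ICMP_TYPES = {
    0: "Echo Reply",
    3: "Destination Unreachable",
    4: "Source Quench",
    5: "Redirect",
    8: "Echo Request",
    11: "Time Exceeded",
    12: "Parameter Problem",
    13: "Timestamp Request",
    14: "Timestamp Reply",
    15: "Info Request",
    16: "Info Reply",
    17: "Address Mask Request",
    18: "Address Mask Reply",
}


def rate_limited_types(mask):
    """Return the names of the ICMP types whose bit is set in mask.

    Hypothetical helper for illustration; not part of any library.
    """
    return [name for bit, name in sorted(ICMP_TYPES.items()) if mask & (1 << bit)]


if __name__ == "__main__":
    # 0x1818 is the default mask quoted in the man page.
    for name in rate_limited_types(0x1818):
        print(name)
```

For the default mask this lists the same four types that carry an asterisk in the man page, including the two that mtr depends on.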
Mystery solved. When running two instances, the instance that gets the ICMP error can be either one, depending on how their packets are interleaved; in general, since they both send packets at regular intervals of one second, one of them will get all the ICMP errors, and the other one none, which is what we observed. But here the rate-limited hop was in the local LAN; if the shared, rate-limited hop is further away, then latency can play a role and the returning ICMP errors can have "jitter", so to speak, and thus be more distributed between the two mtr instances (which will then both report packet loss at that hop whereas it's likely that there is none).
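The starvation effect is easy to reproduce on paper. Below is a rough simulation in Python, assuming the limiter behaves exactly as the man page wording suggests (a fixed minimum spacing between responses to one target); the real kernel implementation may differ, for instance by tolerating short bursts. The function simulate and its parameters are invented for this sketch, and the 400 ms start offset is an arbitrary example value.

```python
def simulate(offsets_ms, interval_ms=1000, min_gap_ms=1000, duration_ms=60000):
    """Count probes sent and answered for each prober.

    Assumption: the gateway answers a probe only if at least min_gap_ms
    have passed since its previous answer (simplified model).

    offsets_ms: start time of each prober, in milliseconds
    interval_ms: time between two probes of the same prober
    """
    # Build the merged, time-ordered list of (send_time, prober_index).
    events = []
    for idx, start in enumerate(offsets_ms):
        t = start
        while t < duration_ms:
            events.append((t, idx))
            t += interval_ms
    events.sort()

    sent = [0] * len(offsets_ms)
    answered = [0] * len(offsets_ms)
    last_reply = None
    for t, idx in events:
        sent[idx] += 1
        if last_reply is None or t - last_reply >= min_gap_ms:
            answered[idx] += 1
            last_reply = t
    return sent, answered


if __name__ == "__main__":
    # Two probers, the second one started 400 ms after the first.
    sent, answered = simulate([0, 400])
    for i, (s, a) in enumerate(zip(sent, answered)):
        print(f"prober {i}: sent={s} answered={a}")
```

In this model, with no latency variation, the prober whose packet arrives first in each one-second window keeps winning, and the other never sees a reply from that hop.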
Solution: do not run two or more mtr instances from the same source IP address, whose traces share a Linux hop. And preferably do not use an interval less than one second for probes, even if you're running a single instance, otherwise you'll see strange packet losses popping up if some hops of your trace happen to be Linux machines. This is probably the case if you're sending N probes per second and a packet loss percentage of about
If you have access to the Linux host in question, another approach could be to remove or loosen the rate limit setting (but think carefully, because the limitation is there for a reason).
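Before changing anything, it helps to see what the host is currently configured to do. The sketch below reads the two settings described in the man page excerpt, assuming they are exposed as decimal integers in /proc/sys/net/ipv4/icmp_ratelimit and /proc/sys/net/ipv4/icmp_ratemask, which is where Linux normally publishes them. The helper names are made up for this example, and the base directory is a parameter so the code can be tried against a copy of the files.

```python
from pathlib import Path


def read_icmp_limits(base="/proc/sys/net/ipv4"):
    """Return (icmp_ratelimit in ms, icmp_ratemask) read from base.

    Hypothetical helper; assumes both files hold one decimal integer.
    """
    base = Path(base)
    ratelimit_ms = int((base / "icmp_ratelimit").read_text().strip())
    ratemask = int((base / "icmp_ratemask").read_text().strip())
    return ratelimit_ms, ratemask


def time_exceeded_is_limited(ratelimit_ms, ratemask):
    """True if "Time Exceeded" (type 11, bit B) replies are rate limited."""
    # A ratelimit of 0 disables limiting regardless of the mask.
    return ratelimit_ms != 0 and bool(ratemask & (1 << 11))


# Example usage on the Linux host itself:
#   limit, mask = read_icmp_limits()
#   print(limit, hex(mask), time_exceeded_is_limited(limit, mask))
```

This only reads the values; changing them requires root on that host, and the caveat above about the limit being there for a reason still applies.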
Finally, you can of course do nothing, if you are aware of this fact and can live with it.
Many thanks to Jordi Clariana for the discussion we had about this and the ideas he brought in.