I have managed networks for the past 20+ years. I am able to use my experience to very quickly diagnose ISP problems that arise with my own home Internet connection. I am documenting my basic troubleshooting steps for the benefit of anyone else who is having trouble with their Internet connection.
IMPORTANT: If you are troubleshooting a problem with your Internet connection, be sure to disable any VPN connection before you run any of these tests!
The most important terms to understand when troubleshooting problems with your Internet connection are “Latency”, “Packet Loss”, and “Bandwidth”. I also define related terms, including the tools we will use.
Latency: This is the round-trip time it takes for a “packet” of information to travel from your device, to the remote host, and back to your device. Usually measured in milliseconds (ms). 1000 ms = 1 second.
Ping: This tool sends one or more “packets” of information to a remote host (aka “Echo Request”) and expects to receive the same number of packets back from the remote host (aka “Echo Reply”). Ping will display the Latency for each packet, or will tell you that a reply was not received.
Packet Loss: This indicates the number of “packets” that were lost when pinging a remote host. Packet Loss is measured in %. If 10 packets were sent and only 9 responses (90%) were received, Packet Loss would be 10%.
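To make the arithmetic concrete, here is a minimal Python sketch of the Packet Loss calculation described above. The function name `packet_loss_percent` is made up for illustration and is not part of ping or any other tool.

```python
def packet_loss_percent(sent: int, received: int) -> float:
    """Return packet loss as a percentage of the packets sent."""
    if sent <= 0:
        raise ValueError("sent must be a positive number of packets")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    lost = sent - received
    return 100.0 * lost / sent


# 10 packets sent, 9 replies received, so 1 packet was lost.
print(packet_loss_percent(10, 9))
```

With 10 packets sent and 9 replies received, this gives the same 10% figure as the example in the definition.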
Traceroute: This tool maps each “hop” between your computer and a remote computer (and collects Packet Loss % and Latency for each “hop”) by taking advantage of an IP feature called “Time to Live” (aka “TTL”), which limits the lifespan of a packet on the Internet. Each router that forwards a packet reduces its TTL by 1, and a router that reduces the TTL to 0 discards the packet and sends an error message back to the sender. Traceroute sends 3 probe packets with a TTL value of 1 toward the remote computer, which causes the 1st “hop” between your computer and the remote computer to reply 3 times. The process is repeated with a TTL of 2 to collect information from the 2nd hop, and so on until the entire path is mapped, showing Packet Loss and Latency for each “hop”.
MTR: This tool is very similar to Traceroute, but provides a continuous real-time view of Packet Loss and Latency at each “hop” between your device and the remote host. Instead of sending 3 pings to each TTL value and expecting 3 replies from each “hop” like Traceroute, MTR continuously sends 1 ping to each TTL value and updates the Packet Loss and Latency stats for each hop.
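The TTL trick that Traceroute and MTR rely on can be illustrated without touching the network. The Python sketch below is a simulation only, not how either tool is implemented: the device names in `path` are hypothetical, and `probe` and `simulate_traceroute` are names invented for this example.

```python
from typing import List


def probe(path: List[str], ttl: int) -> str:
    """Return the device that answers a probe sent with the given TTL."""
    if not path:
        raise ValueError("path must contain at least the remote host")
    remaining = ttl
    for router in path[:-1]:      # every device before the remote host
        remaining -= 1            # each router reduces the TTL by 1
        if remaining == 0:
            return router         # TTL expired here, so this router replies
    return path[-1]               # the probe survived and reached the host


def simulate_traceroute(path: List[str], max_ttl: int = 30) -> List[str]:
    """Discover the path one hop at a time by raising the TTL from 1."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder = probe(path, ttl)
        hops.append(responder)
        if responder == path[-1]:  # stop once the remote host answers
            break
    return hops


# Hypothetical path: three routers, then the remote host.
path = ["home-router", "isp-router", "backbone-router", "remote-host"]
for hop_number, device in enumerate(simulate_traceroute(path), start=1):
    print(hop_number, device)
```

A TTL of 1 is answered by the first router, a TTL of 2 by the second, and so on, which is why each TTL value reveals exactly one hop.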
Bandwidth: This is the maximum amount of data that can be transferred per second between your computer and a remote host. Bandwidth is usually measured in “bits per second” or “bytes per second”. 1 byte = 8 bits. “Download” speed (from the Internet to your device) and “Upload” speed (from your device to the Internet) are usually measured separately. For example, a Gigabit Cable connection may have a “1 Gig” download speed (aka 1000 Mbps) and a “40 Mbps” upload speed, while a Gigabit Fiber connection may have a “1 Gig” download and “1 Gig” upload speed.
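Because speeds are advertised in bits per second while file sizes are usually shown in bytes, a quick conversion can help set expectations. This Python sketch is a rough, best-case estimate: it assumes decimal units (1 MB = 1,000,000 bytes) and ignores protocol overhead, and the function names are made up for illustration.

```python
def mbps_to_megabytes_per_second(mbps: float) -> float:
    """Convert megabits per second to megabytes per second (1 byte = 8 bits)."""
    return mbps / 8


def transfer_time_seconds(size_megabytes: float, mbps: float) -> float:
    """Best-case time to move a file of the given size at the given speed."""
    return size_megabytes / mbps_to_megabytes_per_second(mbps)


# The Gigabit Cable example: 1000 Mbps download, 40 Mbps upload.
print(mbps_to_megabytes_per_second(1000))   # download, in MB per second
print(mbps_to_megabytes_per_second(40))     # upload, in MB per second
print(transfer_time_seconds(1000, 1000))    # seconds to download 1000 MB
print(transfer_time_seconds(1000, 40))      # seconds to upload 1000 MB
```

On that cable connection, the same 1000 MB file that downloads in about 8 seconds would need about 200 seconds to upload.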
Ping is the first tool I reach for when troubleshooting problems with my Internet connection. This tool can tell me the following:
- Do I have any Internet connectivity?
- How frequently am I losing Internet connectivity?
- Do I have any Internet latency issues?
- How often am I having Internet latency issues?
Begin by pinging a reliable remote host such as “google.com” or “aws.amazon.com”. To run a continuous ping, type “ping aws.amazon.com” in a Mac/Linux Terminal or type “ping -t aws.amazon.com” in a Windows Command Prompt.
In the example above, I ran “ping aws.amazon.com” then pressed CTRL+C after 10 pings to stop the ping and show a summary. My latency ranged from 34 ms to 44 ms. My packet loss was 0%.
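If you save ping output to a text file, a short script can compute the same kind of summary. The Python sketch below is illustrative only. The sample output is made up (192.0.2.10 is an address reserved for documentation), and the two patterns are assumptions about the common reply and timeout lines printed by ping on Mac, Linux, and Windows, so they may need adjusting for your system.

```python
import re

# Assumed formats: "time=34.0 ms" (Mac/Linux) or "time=34ms" / "time<1ms" (Windows).
REPLY_TIME = re.compile(r"time[=<]([\d.]+)\s*ms")
# Assumed formats: "Request timeout ..." (Mac) or "Request timed out." (Windows).
TIMEOUT = re.compile(r"Request time(?:out|d out)", re.IGNORECASE)


def summarize_ping(output: str) -> dict:
    """Count replies and timeouts and report min/max latency in ms."""
    times = []
    timeouts = 0
    for line in output.splitlines():
        match = REPLY_TIME.search(line)
        if match:
            times.append(float(match.group(1)))
        elif TIMEOUT.search(line):
            timeouts += 1
    sent = len(times) + timeouts
    if sent == 0:
        raise ValueError("no ping replies or timeouts found in output")
    return {
        "sent": sent,
        "received": len(times),
        "loss_percent": 100.0 * timeouts / sent,
        "min_ms": min(times) if times else None,
        "max_ms": max(times) if times else None,
    }


# Made-up sample output, not taken from a real session.
SAMPLE = """\
64 bytes from 192.0.2.10: icmp_seq=0 ttl=54 time=34.0 ms
64 bytes from 192.0.2.10: icmp_seq=1 ttl=54 time=44.0 ms
Request timeout for icmp_seq 2
64 bytes from 192.0.2.10: icmp_seq=3 ttl=54 time=38.5 ms
"""
print(summarize_ping(SAMPLE))
```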
Your latency will vary depending on the type of Internet connection you have, your location, and the location of the remote host. The further a remote host is from your location, the higher the latency. If you are in the United States, latency to another host in the U.S. (e.g. “amazon.com”) will usually be much lower than latency to a host in Europe (e.g. “amazon.de”) or Asia (e.g. “amazon.cn”).
If you only get a “Request timeout” instead of the normal replies shown above, try to ping a different hostname or IP address. Here are some personal favorite hostnames and IP addresses:
- 220.127.116.11 (DNS resolver operated by OpenDNS)
- 18.104.22.168 (DNS resolver operated by OpenDNS)
- 22.214.171.124 (DNS resolver operated by Cloudflare)
TIP: If I am having intermittent problems with my Internet connection, I will leave a Terminal window running a continuous audible ping in the background (e.g. “ping -i2 -A google.com” on Mac). This command will run a ping every 2 seconds and I will hear a bell sound anytime a packet is dropped. Anytime I feel like a website or app is not responding normally, I can switch to the Terminal window and look at the ping output to see if any recent pings are showing high latency or dropped packets.
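The "look back at recent pings" step can also be scripted once the latencies have been collected. This Python sketch assumes the results are already in a list, with `None` standing for a dropped packet. The 100 ms threshold and the function name `find_problem_pings` are arbitrary choices for illustration.

```python
from typing import List, Optional


def find_problem_pings(samples: List[Optional[float]],
                       threshold_ms: float = 100.0) -> List[str]:
    """Describe every ping that was dropped or slower than threshold_ms."""
    problems = []
    for index, latency in enumerate(samples):
        if latency is None:
            problems.append(f"ping {index}: dropped")
        elif latency > threshold_ms:
            problems.append(f"ping {index}: high latency ({latency} ms)")
    return problems


# Hypothetical recent results: one dropped packet and one slow reply.
recent = [35.2, None, 250.0, 36.0]
for problem in find_problem_pings(recent):
    print(problem)
```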
TIP: If you can ping an IP address but you cannot ping a hostname, review your DNS settings. Incorrect DNS settings can prevent you from accessing ANY websites by hostname, but would still allow you to browse to a website by IP address. For example, here is a website you can browse to by IP address even if you are having DNS problems: https://126.96.36.199/.
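The reasoning in this tip can be written down as a small decision helper. In the Python sketch below, `can_resolve` uses the standard library's `socket.getaddrinfo` to ask the system resolver for an address, while `diagnose` and its messages are hypothetical names and wording chosen for this example. Whether an IP address is reachable is something you would determine yourself with ping.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the system resolver finds an address for hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def diagnose(ip_reachable: bool, hostname_resolves: bool) -> str:
    """Turn the two observations into a suggested next step."""
    if ip_reachable and not hostname_resolves:
        return "Connectivity works but name lookups fail: review DNS settings"
    if not ip_reachable:
        return "No reply by IP address: try another IP before blaming DNS"
    return "IP connectivity and DNS both look fine"


# Example: ping to an IP address works, but hostnames do not resolve.
print(diagnose(ip_reachable=True, hostname_resolves=False))
```

In practice you might call `diagnose(True, can_resolve("google.com"))` after confirming that a ping to an IP address succeeds.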
When ping is reporting high latency on my Internet connection, I can use traceroute to determine where the high latency is occurring. The issue is almost always the first “hop” outside of my home network, indicating a problem with my home Internet connection. However, the latency issue could be further upstream between my Internet provider and another network.
Here is an example of output from a traceroute. In this example, I ran a traceroute to IP address “188.8.131.52”, but you can run a traceroute to any hostname or IP address. In this example, I used the “-n” option to skip name lookups, which helps the traceroute run a little faster. However, showing names can help you determine which network each hop belongs to (e.g. your ISP, AT&T, Amazon, Google, Microsoft, etc).
Note: Mac/Linux users type “traceroute 184.108.40.206”; Windows users type “tracert 220.127.116.11”.
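If you script these checks, the platform difference can be handled in one place. The Python sketch below only builds the command as a list and does not run it. The function name is made up, and `-d` is the Windows tracert option for skipping name lookups, which this article does not otherwise cover.

```python
import platform
from typing import List, Optional


def traceroute_command(host: str, skip_names: bool = False,
                       system: Optional[str] = None) -> List[str]:
    """Build the traceroute command line for the given operating system."""
    system = system or platform.system()   # "Windows", "Darwin", "Linux", ...
    if system == "Windows":
        command = ["tracert"]
        if skip_names:
            command.append("-d")           # Windows: do not resolve names
    else:
        command = ["traceroute"]
        if skip_names:
            command.append("-n")           # Mac/Linux: skip name lookups
    command.append(host)
    return command


print(traceroute_command("aws.amazon.com", skip_names=True, system="Darwin"))
print(traceroute_command("aws.amazon.com", skip_names=True, system="Windows"))
```

The resulting list could be passed to `subprocess.run` if you want Python to launch the command for you.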
Observations from the traceroute output above:
- Hop 1 is an IP address inside my home network. The best ping time is 2.655 ms, which seems reasonable for a ping to a device on my home network.
Wireless: If I were seeing very high latency (e.g. >50 ms) to this very first hop on my network and I was using WiFi, I would switch from wireless to a hard-wired network connection by plugging directly into my home router, then run the test again to see if the latency issue disappears. High latency that only happens on WiFi could indicate a wireless interference problem, an older WiFi access point or router that cannot handle the number of devices or your bandwidth needs, or a failing WiFi access point.
Wired: If I were seeing very high latency (e.g. >50 ms) to this very first hop on my network and I was using a hard-wired network connection, I would suspect a problem with my network wiring, a network switch, or my network router. The performance problem is likely inside my network and no fault of the ISP.
- Hop 2 is the first Internet device outside of my home network. It is not unusual for one or more hops in a traceroute not to respond. There are a number of reasons why this hop might not respond, including IP addressing or IP filtering.
- Hop 3 is the first Internet device outside my home network that I can ping. This is a router at my ISP. The best ping time is 8.521 ms, which seems reasonable for a ping to a local/regional ISP router outside my home network. If I see a significant increase in latency from Hop 1 (inside my network) to this next hop (at my ISP), I would suspect an issue with my router, my cable modem, or my ISP connection.
Tip: Plug directly into your cable modem with a network cable and repeat the test to eliminate the possibility that the latency issue is being caused by your router.
Tip: Perform a cable modem swap and repeat the test to reduce the possibility that the latency issue is being caused by your cable modem. I say “reduce” instead of “eliminate” because I’ve swapped cable modems, only to find that my brand new cable modem was also faulty.
- Hops 4, 6, 7: Notice that these hops show latency for 2 or 3 different IP addresses. This is very common and happens when traffic can travel multiple redundant paths to the same destination.
- Hop 8 is the remote host. It is NOT unusual for the remote host to block ping replies, so do not be surprised if the last hop(s) show no response (* * *).
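The way I read a traceroute, looking for the hop where latency jumps, can be sketched in Python. This is only an illustration of the idea: the function name is made up, hops that did not respond are represented as `None`, and only the first and third latencies below come from the observations above. The last two values are hypothetical.

```python
from typing import List, Optional, Tuple


def largest_latency_jump(
    best_ms: List[Optional[float]],
) -> Optional[Tuple[int, int, float]]:
    """Return (from_hop, to_hop, increase_ms) for the biggest increase.

    best_ms[0] is the best latency for hop 1, best_ms[1] for hop 2, etc.
    Hops that did not respond are None and are skipped. My own device is
    treated as hop 0 with 0 ms latency.
    """
    largest = None
    previous_hop, previous_ms = 0, 0.0
    for hop, latency in enumerate(best_ms, start=1):
        if latency is None:
            continue
        increase = latency - previous_ms
        if largest is None or increase > largest[2]:
            largest = (previous_hop, hop, increase)
        previous_hop, previous_ms = hop, latency
    return largest


# Hops 1 and 3 use the values above; hop 2 did not respond;
# hops 4 and 5 are hypothetical values for illustration.
print(largest_latency_jump([2.655, None, 8.521, 9.1, 48.7]))
```

In this made-up case the biggest increase is between hops 4 and 5, which would point upstream of my ISP's first router rather than at my home connection.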
TIP: If I want to determine who an IP address belongs to, I perform an IP WHOIS lookup on the ARIN website (https://www.arin.net/). The results will tell me who is responsible for that IP address (e.g. my ISP, “Amazon”, “Google”, etc.). If the result mentions another IP registry, repeat the lookup on their website, e.g. RIPE (https://www.ripe.net/), APNIC (https://www.apnic.net/), etc.
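Addresses inside a home network (like Hop 1) are private and are not registered to an ISP or company, so a WHOIS lookup on them is not informative. As a small sketch, Python's standard `ipaddress` module can tell the two kinds apart before you bother with a lookup. The function name `worth_whois_lookup` is made up for this example.

```python
import ipaddress


def worth_whois_lookup(address: str) -> bool:
    """Return True for public (globally routable) addresses only."""
    return ipaddress.ip_address(address).is_global


# A typical home-network address versus a well-known public DNS resolver.
print(worth_whois_lookup("192.168.1.1"))
print(worth_whois_lookup("8.8.8.8"))
```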
MTR is a free tool that performs a continuous traceroute. I usually reach for this tool instead of traceroute because of its speed and because of its continuous nature.
Here is an example of output from MTR after 30 seconds. Because MTR requires special permissions, I had to type “sudo mtr 18.104.22.168” in my Mac Terminal. I stopped MTR by pressing CTRL+C.
Observations from MTR output:
- Overall, this output is nearly identical to the traceroute output. See traceroute observations above.
- Because I did not skip name resolution, I can see that Hop 6 belongs to Equinix, likely in Chicago (chi). I can also see that my final hop (22.214.171.124) has a vanity reverse name of “one.one.one.one”.
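To show what "continuously updating the stats for each hop" means, here is a minimal Python sketch of per-hop bookkeeping. It is not MTR's actual implementation. The class name `HopStats` and its fields are invented for illustration, and a lost probe is recorded as `None`.

```python
from typing import Optional


class HopStats:
    """Running Packet Loss and Latency figures for a single hop."""

    def __init__(self) -> None:
        self.sent = 0
        self.received = 0
        self.total_ms = 0.0
        self.best_ms: Optional[float] = None
        self.worst_ms: Optional[float] = None

    def record(self, latency_ms: Optional[float]) -> None:
        """Record one probe result; None means no reply was received."""
        self.sent += 1
        if latency_ms is None:
            return
        self.received += 1
        self.total_ms += latency_ms
        if self.best_ms is None or latency_ms < self.best_ms:
            self.best_ms = latency_ms
        if self.worst_ms is None or latency_ms > self.worst_ms:
            self.worst_ms = latency_ms

    @property
    def loss_percent(self) -> float:
        if self.sent == 0:
            return 0.0
        return 100.0 * (self.sent - self.received) / self.sent

    @property
    def average_ms(self) -> Optional[float]:
        if self.received == 0:
            return None
        return self.total_ms / self.received


# Hypothetical results for one hop: three replies and one lost probe.
hop = HopStats()
for result in [10.0, None, 20.0, 30.0]:
    hop.record(result)
print(hop.loss_percent, hop.average_ms, hop.best_ms, hop.worst_ms)
```

A tool like MTR would keep one such record per hop and refresh the display after every round of probes.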
I can identify the cause of most Internet performance problems using some combination of Ping and Traceroute (or MTR), but it can be helpful to perform bandwidth tests. For example, after I upgraded to a Gigabit plan, I used a bandwidth speed test to confirm that my service had only been upgraded to 500 Mbps. The ISP was able to find/resolve the issue right away.
Here are a few of my favorite tests:
- Ookla Speed Test (https://www.speedtest.net/) — I like that I can easily save an image of the results
- Google Speed Test (https://www.google.com/search?q=Internet+Speed+Test) — I like that this is built into the Google search result page
- Your ISP Speed Test — I like to use these test results when working with my ISP to troubleshoot speed issues. If these results are significantly better than one of the other speed tests, I’ll share both results with my ISP.
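When comparing a speed test result against the plan you pay for, it can help to express the result as a percentage. This Python sketch uses the numbers from my Gigabit upgrade story. The function name is made up, and real results will normally land somewhat below 100% even on a healthy connection.

```python
def percent_of_plan(measured_mbps: float, plan_mbps: float) -> float:
    """Return the measured speed as a percentage of the advertised speed."""
    if plan_mbps <= 0:
        raise ValueError("plan_mbps must be positive")
    return 100.0 * measured_mbps / plan_mbps


# A 500 Mbps result on a Gigabit (1000 Mbps) plan.
print(percent_of_plan(500, 1000))
```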
If you need to measure the available bandwidth between two Linux hosts, consider using “iperf”. You must be able to install the tool on both hosts.
Thanks for Reading!
I intend to update this article with example outputs when my Internet connection is having issues or some point between my Internet provider and another network is having issues.
Did you find this helpful? Let me know by sending me a comment. I tend to update and maintain posts more frequently if I know others find them helpful. Thanks for visiting!