Introduction
Ping is a network utility for checking connectivity to a target destination. It is built into Windows, Linux and macOS, and it operates on top of the Internet Control Message Protocol (ICMP). It is commonly used to find out whether a destination host is reachable.
Intuition
In this quarantine period, people are staying at home under lockdown restrictions. After spending countless days binge-watching Netflix, you suddenly miss your friends. The natural thing to do is to reach for your iPhone and call them: “Hey <insert weird name here>, are you still alive?”
Machines also need to know whether their peers (the target destinations) are alive before they communicate. Instead of iPhones, machines use the PING command to check whether their peers are still reachable on the network.
There are two ways for machines to check host reachability:
(A) using keepalive messages (periodic)
Some devices, such as routers, send keepalive messages periodically to check whether their peers are still alive. For example, routers running RIPv1 send updates to their neighbours every 30 seconds, which doubles as a heartbeat.
(B) using the ping command (triggered)
End devices normally send a ping request to a destination host after a period of inactivity and wait for the ping reply to determine whether the target is reachable. Devices that are already in an active TCP session do not need to ping each other, since the ongoing session already shows that they are connected.
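To make the periodic approach a little more concrete, here is a minimal Python sketch of how a device might track keepalives from its peers. The class name KeepaliveTable, the peer address and the 180-second hold time (six missed 30-second updates) are assumptions chosen for illustration; this is not how any particular router implements it.

```python
from dataclasses import dataclass, field


@dataclass
class KeepaliveTable:
    """Tracks when each peer was last heard from (all times in seconds)."""
    hold_time: float = 180.0  # assumed timeout: six missed 30 s updates
    last_seen: dict = field(default_factory=dict)

    def heard_from(self, peer: str, now: float) -> None:
        # Called whenever a keepalive (or periodic update) arrives from a peer.
        self.last_seen[peer] = now

    def is_alive(self, peer: str, now: float) -> bool:
        # A peer counts as alive only if it spoke within the hold time.
        seen = self.last_seen.get(peer)
        return seen is not None and (now - seen) <= self.hold_time


table = KeepaliveTable()
table.heard_from("10.10.10.2", now=0.0)
print(table.is_alive("10.10.10.2", now=30.0))   # True
print(table.is_alive("10.10.10.2", now=200.0))  # False
```

The triggered approach needs no such table: the device simply asks once and waits for the answer.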
ICMP
The Internet Control Message Protocol (ICMP) is a helper protocol that supports IP for error reporting and simple queries. Imagine throwing a stone at your neighbour’s house — your neighbour might throw the same piece of stone back at you; or if you got lucky, he might throw a piece of gold back at you. ICMP provides machines the same capability to throw this ‘are you alive’ message back and forth.
ICMP is a simple protocol. Its messages fall into two broad categories: query messages, such as the echo request and echo reply used by ping, and error messages, such as destination unreachable.
In this example, Host_A pings Host_B. Host_A sends an ICMP echo request to Host_B. If Host_B is reachable, it sends an ICMP echo reply back to Host_A. If Host_B is not reachable, a router along the path generates an ICMP error message and sends it to Host_A.
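For the curious, the request that Host_A sends is a small packet. The Python sketch below builds the bytes of an ICMP echo request (type 8, code 0) with its checksum; the matching echo reply uses type 0. The function names, the identifier and sequence values, and the payload are assumptions for illustration. Actually sending the packet would need a raw socket and usually administrator rights, so this sketch only builds and checks it.

```python
import struct

ICMP_ECHO_REQUEST = 8  # type 8, code 0
ICMP_ECHO_REPLY = 0    # type 0, code 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used in the ICMP header."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    # Header layout: type (1 byte), code (1), checksum (2), identifier (2), sequence (2).
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


packet = build_echo_request(identifier=1, sequence=1)
# Re-running the checksum over a correctly built packet gives zero.
print(internet_checksum(packet) == 0)  # True
```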
There are two possible outcomes of a ping:
(A) Ping is successful
The source host received the ICMP reply from the destination host. In this case, the connection metrics are shown.
(B) Ping failed
The source host did not receive an ICMP reply from the destination within the timeout period. In this case, an error message is shown on the source host instead.
Each ICMP message type comes with a set of codes. These codes make ICMP error messages intuitive, so we can quickly guess what is causing a connectivity issue. The full table of ICMP codes can be found here. The two most common failures are Destination network unreachable and Request timed out.
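As a rough illustration of how type and code pairs map to readable errors, here is a small Python lookup covering a handful of common cases. The table is deliberately incomplete, and the wording of each description is an assumption; the exact text printed by ping differs between operating systems.

```python
# (type, code) -> meaning, for a few common ICMP error messages.
ICMP_ERRORS = {
    (3, 0): "Destination network unreachable",
    (3, 1): "Destination host unreachable",
    (3, 2): "Destination protocol unreachable",
    (3, 3): "Destination port unreachable",
    (11, 0): "TTL expired in transit",
}


def describe_icmp(icmp_type: int, code: int) -> str:
    """Translate an ICMP type/code pair into a human-readable description."""
    return ICMP_ERRORS.get(
        (icmp_type, code),
        f"Unknown ICMP message (type {icmp_type}, code {code})",
    )


print(describe_icmp(3, 0))  # Destination network unreachable
```

Note that Request timed out has no entry: it is not an ICMP message at all, just the source host giving up after hearing nothing.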
How to use PING command
Ping is pre-installed as part of the network utilities in most modern OS. The syntax for ping is:
ping <IP address of target host>
e.g.
ping 192.168.1.1
To use PING on Windows:
1. Launch Command Prompt.
2. Type the command ‘ping’ followed by the ‘destination host IP address’.
To use PING on macOS:
1. Launch Terminal.
2. Type the command ‘ping’ followed by the ‘destination host IP address’.
To use PING on Linux:
1. Launch a shell.
2. Type the command ‘ping’ followed by the ‘destination host IP address’.
As you can see, the ping command is universal across most operating systems. The only differences are the options we can add to it and some default behaviours. For example, on Windows, 4 ICMP requests are sent for each ping command. Meanwhile, on Linux, ICMP requests are sent indefinitely until the user manually stops them.
We can find out the options for PING using the -h parameter.
If we ping the Google web server on a Mac, the machine will ping continuously until we press Ctrl+C to terminate it.
To send only 2 ICMP requests, we use the -c option. Here, we simply type ‘ping google.com -c 2’ to set the ICMP count.
There are many other options that you can experiment with, using the -h output as a guide.
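One difference worth knowing when scripting ping: the count option is -c on Linux and macOS, but -n on Windows. The Python sketch below picks the right flag for the current machine. The function name build_ping_command is an assumption for illustration, and the function only builds the command; it does not run it.

```python
import platform


def build_ping_command(host: str, count: int = 2, system: str = "") -> list:
    """Build a ping command that sends `count` requests on the given OS."""
    # platform.system() returns "Windows", "Linux" or "Darwin" (macOS).
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


print(build_ping_command("google.com", 2, system="Windows"))
# ['ping', '-n', '2', 'google.com']
print(build_ping_command("google.com", 2, system="Linux"))
# ['ping', '-c', '2', 'google.com']
```

Passing the resulting list to subprocess.run would execute the command.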
The figure below visualises what just happened: a local host with IP 192.168.1.1 pings the google.com web server, located somewhere on the Internet with the IP 172.217.174.174. First, the local host sends a DNS request to a DNS server to resolve ‘google.com’. Second, the DNS server returns the IP 172.217.174.174 to the local host. Third, the local host builds an ICMP request addressed to 172.217.174.174. After a short while, Google replies to the local host.
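The name resolution step can be sketched in a few lines of Python. The helper name resolve_target is an assumption for illustration. It returns the input unchanged when it is already an IP address, and otherwise asks the system resolver, which is IPv4-only with gethostbyname.

```python
import ipaddress
import socket


def resolve_target(target: str) -> str:
    """Return the IP address that a ping to `target` would be sent to."""
    try:
        # Already an IP address: no DNS lookup is needed.
        return str(ipaddress.ip_address(target))
    except ValueError:
        # A hostname such as 'google.com': ask the resolver (IPv4 only).
        return socket.gethostbyname(target)


print(resolve_target("172.217.174.174"))  # 172.217.174.174
```

Calling resolve_target('google.com') triggers a real DNS query, and the address returned can differ from the one in this example depending on where and when you ask.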
Troubleshooting connectivity issues with PING
Continuing from the earlier example, suppose PC1 failed to ping the Google web server. How do we fix this connectivity issue? The first step is to isolate the point of failure. In this case, any device, from PC1 itself, through the routers in the middle, to the destination web server, can be the point of failure. To find out where it fails, we need to systematically ping each node along the path to detect where the ICMP packets are dropped.
Systematic troubleshooting starts from the source and works towards the destination. Here’s how we can troubleshoot step by step:
(1) Perform Self Diagnostic on PC1
On PC1, we perform a self-ping using the command ping 127.0.0.1. This is a self-diagnostic test that checks whether the TCP/IP stack on PC1 is working properly. If the ping is successful, the problem lies on some other node after PC1.
If the ping fails, the TCP/IP stack on PC1 itself is not working properly. Note that the loopback address does not exercise the physical NIC, so a faulty NIC or cable only shows up in the next step.
(2) Perform Ping Test to Router1
If PC1 is cleared, next we ping the next-hop address, which is Router1. We can ping either interface on Router1: 192.168.1.254 or 10.10.10.1. If the ping is successful, the packet is being discarded somewhere after Router1.
If the ping fails, this implies that Router1 is misconfigured with the wrong IP address, that the interface has not been set to UP (using the no shutdown command), or that the router is simply broken.
(3) Perform Ping Test to Router2
If Router1 is cleared, next we ping 10.10.10.2 on Router2. If the ping is successful, packets are reaching Router2 and the problem lies beyond it.
If it fails, this can mean several things:
(a) The IP address is configured wrongly on Router2
(b) The interface on Router2 is not set to UP
(c) Router2 has a hardware issue
(d) Routing is not configured from PC1’s network to Google’s network
(e) A firewall is blocking ICMP requests from entering Router2
(f) If the ping passes at 10.10.10.2 but fails at 172.217.174.174, this could indicate an issue with the NAT configuration on Router2
(4) Perform Ping Test to Google Web Server
If Router2 is cleared, this means that the packet has successfully travelled from PC1 → Router1 → Router2, so the last possible point of failure is the Google web server.
If the ping is successful, the Google web server sends the ICMP reply back in the opposite direction to PC1, and that is how PC1 sees this output on its end.
If the ping fails, this could be because the server is down at the moment, or because the load balancer in the Google data center is simply not responding. Servers like these are also often placed behind a firewall that stops outsiders from pinging them, as a guard against denial-of-service attacks. In some rare cases, the server is actually reachable, but the ICMP reply travelling back from the server → PC1 is dropped on the way because of security settings such as an ACL.
Normally, we would ping the destination host directly in a ping test. Step-by-step troubleshooting is only necessary if that ping fails.
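The whole procedure can be sketched as a short Python routine: ping the destination first, and only walk the path hop by hop if that fails. The function names ping_once and first_unreachable are assumptions for illustration, and the path list simply reuses the addresses from the example network. The sketch treats the exit code of the system ping as the success signal, which is a simplification that may not hold on every platform.

```python
import platform
import subprocess
from typing import Callable, Optional, Sequence


def ping_once(host: str) -> bool:
    """Send one ping with the system ping tool; True if it reports success."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_unreachable(
    path: Sequence[str], ping: Callable[[str], bool] = ping_once
) -> Optional[str]:
    """Return the first node on the path that does not answer, or None."""
    if ping(path[-1]):
        return None  # destination answers, no step-by-step test needed
    for node in path:  # walk from the source towards the destination
        if not ping(node):
            return node
    return None  # every node answered this time (the failure was transient)


# Path from the example: loopback, Router1, Router2, Google web server.
EXAMPLE_PATH = ["127.0.0.1", "192.168.1.254", "10.10.10.2", "172.217.174.174"]
```

Calling first_unreachable(EXAMPLE_PATH) on PC1 would return the address where the pings start failing. The ping argument accepts any function, which makes it easy to try the logic without touching a real network.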
Checking end to end connectivity with Trace Route
Step-by-step troubleshooting is cool, but imagine isolating a point of failure on a topology with 200 routers between the source and the destination. That’s where trace route comes in.
Trace route is a network utility (like ping on steroids) that checks end-to-end connectivity on a per-hop basis. Think of it as automatically pinging each node on the path leading to the destination, all with one ‘tracert’ command. The command is called tracert on Windows and traceroute on Linux and macOS. The syntax is:
tracert <destination IP>
e.g.
tracert 172.217.174.174
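Since the command name differs between operating systems, a script that wants to run a trace has to pick the right one. The Python sketch below does that. The function name build_trace_command is an assumption for illustration, and it only builds the command rather than running it.

```python
import platform


def build_trace_command(destination: str, system: str = "") -> list:
    """Build the trace route command for the given OS (default: this one)."""
    # The tool is 'tracert' on Windows and 'traceroute' on Linux and macOS.
    system = system or platform.system()
    tool = "tracert" if system == "Windows" else "traceroute"
    return [tool, destination]


print(build_trace_command("172.217.174.174", system="Windows"))
# ['tracert', '172.217.174.174']
print(build_trace_command("172.217.174.174", system="Linux"))
# ['traceroute', '172.217.174.174']
```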
Using the same example network, we can quickly find the point of failure by running tracert on PC1 towards the Google web server.
The output shows that the Google web server is reachable from PC1 across 3 hops. The latency of each hop is also shown. Tracert is also useful for finding poorly performing nodes, for example to find out which router is congested when an online game is lagging. In our example, the hops are so fast that the packet is forwarded to the next hop in only 6ms.
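One rough way to spot a congested router from such output is to look for the hop where the latency jumps the most compared with the hop before it. The Python sketch below does this for a list of per-hop latencies. The function name slowest_hop and the latency values are made up for illustration; they are not taken from the output above.

```python
def slowest_hop(latencies_ms):
    """Return (hop number, added delay in ms) for the largest latency jump.

    latencies_ms[i] is the round-trip time reported for hop i + 1.
    """
    if not latencies_ms:
        raise ValueError("need at least one hop")
    previous = 0.0
    worst_hop, worst_delta = 0, float("-inf")
    for hop, latency in enumerate(latencies_ms, start=1):
        delta = latency - previous  # delay added between the previous hop and this one
        if delta > worst_delta:
            worst_hop, worst_delta = hop, delta
        previous = latency
    return worst_hop, worst_delta


# Hypothetical latencies for a 3-hop path, in milliseconds.
print(slowest_hop([2.0, 6.0, 48.0]))  # (3, 42.0)
```

Treat the result as a hint rather than proof, since routers may answer trace probes more slowly than they forward ordinary traffic.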
The time it takes for a packet to travel from the source to the destination and back to the source is called the round-trip time (RTT). RTT gives a clearer picture of the network quality, since the path taken from source → destination and the path back from destination → source might be different.
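As a small illustration of RTT, the Python sketch below times a request and reply exchange and then summarises several samples as min, avg and max, similar to the summary line ping prints. The function names and the sample values are assumptions for illustration, and send_and_wait stands in for whatever actually sends the request and blocks until the reply arrives.

```python
import statistics
import time


def measure_rtt_ms(send_and_wait) -> float:
    """Time one request/reply exchange and return the RTT in milliseconds."""
    start = time.perf_counter()
    send_and_wait()  # placeholder: send the request, block until the reply
    return (time.perf_counter() - start) * 1000.0


def rtt_summary(samples_ms):
    """Summarise RTT samples as min, avg and max."""
    return {
        "min": min(samples_ms),
        "avg": statistics.fmean(samples_ms),
        "max": max(samples_ms),
    }


# Hypothetical RTT samples in milliseconds.
print(rtt_summary([6.0, 8.0, 10.0]))  # {'min': 6.0, 'avg': 8.0, 'max': 10.0}
```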