The first command to enter on both the server and the client is ping 127.0.0.1. This is the loopback address and testing it will indicate whether any networking support is functioning at all. On Unix, you can use ping 127.0.0.1 with the statistics option and interrupt it after a few lines. On Sun workstations, the command is typically /usr/etc/ping -s 127.0.0.1; on Linux, just ping 127.0.0.1. On Windows clients, run ping 127.0.0.1 in an MS-DOS window and it will stop by itself after four lines.
Here is an example on a Linux server:
$ ping 127.0.0.1 PING localhost: 56 data bytes 64 bytes from localhost (127.0.0.1): icmp-seq=0. time=1. ms 64 bytes from localhost (127.0.0.1): icmp-seq=1. time=0. ms 64 bytes from localhost (127.0.0.1): icmp-seq=2. time=1. ms ^C ----127.0.0.1 PING Statistics---- 3 packets transmitted, 3 packets received, 0% packet loss round-trip (ms) min/avg/max = 0/0/1
If you get "ping: no answer from..." or "100% packet loss," you have no IP networking at all installed on the machine. The address 127.0.0.1 is the internal loopback address and doesn't depend on the computer being physically connected to a network. If this test fails, you have a serious local problem. TCP/IP either isn't installed or is seriously misconfigured. See your operating system documentation if it is a Unix server. If it is a Windows client, follow the instructions in Chapter 3, Configuring Windows Clients, to install networking support.
If you're the network manager, some good references are Craig Hunt's TCP/IP Network Administration, Chapter 11, and Craig Hunt & Robert Bruce Thompson's new book, Windows NT TCP/IP Network Administration, both published by O'Reilly.
Next, try to ping localhost on the Samba server. localhost is the conventional hostname for the 127.0.0.1 loopback, and it should resolve to that address. After typing ping localhost, you should see output similar to the following:
$ ping localhost PING localhost: 56 data bytes 64 bytes from localhost (127.0.0.1): icmp-seq=0. time=0. ms 64 bytes from localhost (127.0.0.1): icmp-seq=1. time=0. ms 64 bytes from localhost (127.0.0.1): icmp-seq=2. time=0. ms ^C
If this succeeds, try the same test on the client. Otherwise:
If you get "unknown host: localhost," there is a problem resolving the host name localhost into a valid IP address. (This may be as simple as a missing entry in a local hosts file.) From here, skip down to the section Section 9.2.8, Troubleshooting Name Services.
If you get "ping: no answer," or "100% packet loss," but pinging 127.0.0.1 worked, then name services is resolving to an address, but it isn't the correct one. Check the file or database (typically /etc/hosts on a Unix system) that the name service is using to resolve addresses to ensure that the entry is corrected.
Next, ping the server's network IP address from itself. This should get you exactly the same results as pinging 127.0.0.1:
$ ping 192.168.236.86 PING 192.168.236.86: 56 data bytes 64 bytes from 192.168.236.86 (192.168.236.86): icmp-seq=0. time=1. ms 64 bytes from 192.168.236.86 (192.168.236.86): icmp-seq=1. time=0. ms 64 bytes from 192.168.236.86 (192.168.236.86): icmp-seq=2. time=1. ms ^C ----192.168.236.86 PING Statistics---- 3 packets transmitted, 3 packets received, 0% packet loss round-trip (ms) min/avg/max = 0/0/1
If this works on the server, repeat it for the client. Otherwise:
If ping network_ip fails on either the server or client, but ping 127.0.0.1 works on that machine, you have a TCP/IP problem that is specific to the Ethernet network interface card on the computer. Check with the documentation for the network card or the host operating system to determine how to correctly configure it. However, be aware that on some operating systems, the ping command appears to work even if the network is disconnected, so this test doesn't always diagnose all hardware problems.
Now, ping the server by name (instead of its IP address), once from the server and once from the client. This is the general test for working network hardware:
$ ping server PING server.example.com: 56 data bytes 64 bytes from server.example.com (192.168.236.86): icmp-seq=0. time=1. ms 64 bytes from server.example.com (192.168.236.86): icmp-seq=1. time=0. ms 64 bytes from server.example.com (192.168.236.86): icmp-seq=2. time=1. ms ^C ----server.example.com PING Statistics---- 3 packets transmitted, 3 packets received, 0% packet loss round-trip (ms) min/avg/max = 0/0/1
On Microsoft Windows, a ping of the server would look like Figure 9.1.
If successful, this test tells us five things:
The hostname (e.g., "server") is being found by your local nameserver.
The hostname has been expanded to the full name (e.g., server.example.com).
Its address is being returned (192.168.236.86).
The client has sent the Samba server four 56-byte UDP/IP packets.
The Samba server has replied to all four packets.
If this test isn't successful, there can be one of several things wrong with the network:
First, if you get "ping: no answer," or "100% packet loss," you're not connecting to the network, the other machine isn't connecting, or one of the addresses is incorrect. Check the addresses that the ping command reports on each machine, and ensure that they match the ones you set up initially.
If not, there is at least one mismatched address between the two machines. Try entering the command arp -a, and see if there is an entry for the other machine. The arp command stands for the Address Resolution Protocol. The arp -a command lists all the addresses known on the local machine. Here are some things to try:
If you receive a message like "192.168.236.86 at (incomplete)," the Ethernet address of 192.168.236.86 is unknown. This indicates a complete lack of connectivity, and you're likely having a problem at the very bottom of the TCP/IP protocol stack, at the Ethernet-interface layer. This is discussed in Chapters 5 and 6 of TCP/IP Network Administration (O'Reilly).
If you receive a response similar to "server (192.168.236.86) at 8:0:20:12:7c:94," then the server has been reached at some time, or another machine is answering on its behalf. However, this means that ping should have worked: you may have an intermittent networking or ARP problem.
If the IP address from ARP doesn't match the addresses you expected, investigate and correct the addresses manually.
If each machine can ping itself but not another, something is wrong on the network between them.
If you get "ping: network unreachable" or "ICMP Host Unreachable," then you're not receiving an answer and there is likely more than one network involved.
In principle, you shouldn't try to troubleshoot SMB clients and servers on different networks. Try to test a server and client on the same network. The three tests that follow assume you might be testing between two networks:
First, perform the tests for no answer described earlier in this section. If this doesn't identify the problem, the remaining possibilities are the following: an address is wrong, your netmask is wrong, a network is down, or just possibly you've been stopped by a firewall.
Check both the address and the netmasks on source and destination machines to see if something is obviously wrong. Assuming both machines really are on the same network, they both should have the same netmasks and ping should report the correct addresses. If the addresses are wrong, you'll need to correct them. If they're right, the programs may be confused by an incorrect netmask. See Section 9.2.9.1, Netmasks, later in this chapter.
If the commands are still reporting that the network is unreachable and neither of the previous two conditions is in error, one network really may be unreachable from the other. This, too, is a network manager issue.
If you get "ICMP Administratively Prohibited," you've struck a firewall of some sort or a misconfigured router. You will need to speak to your network security officer.
If you get "ICMP Host redirect," and ping reports packets getting through, this is generally harmless: you're simply being rerouted over the network.
If you get a host redirect and no ping responses, you are being redirected, but no one is responding. Treat this just like the "Network unreachable" response and check your addresses and netmasks.
If you get "ICMP Host Unreachable from gateway gateway_name," ping packets are being routed to another network, but the other machine isn't responding and the router is reporting the problem on its behalf. Again, treat this like a "Network unreachable" response and start checking addresses and netmasks.
If you get "ping: unknown host hostname," your machine's name is not known. This tends to indicate a name-service problem, which didn't affect localhost. Have a look at Section 9.2.8, later in this chapter.
If you get a partial success, with some pings failing but others succeeding, you either have an intermittent problem between the machines or an overloaded network. Ping for longer, and see if more than about 3 percent of the packets fail. If so, check it with your network manager: a problem may just be starting. However, if only a few fail, or if you happen to know some massive network program is running, don't worry unduly. Ping's ICMP (and UDP) are designed to drop occasional packets.
If you get a response like "smtsvr.antares.net is alive" when you actually pinged client.example.com, you're either using someone else's address or the machine has multiple names and addresses. If the address is wrong, name service is clearly the culprit; you'll need to change the address in the name service database to refer to the right machine. This is discussed in Section 9.2.8, later in this chapter.
Server machines are often multihomed: connected to more than one network, with different names on each net. If you are getting a response from an unexpected name on a multihomed server, look at the address and see if it's on your network (see the section Section 9.2.9.1, later in this chapter). If so, you should use that address, rather than one on a different network, for both performance and reliability reasons.
Servers may also have multiple names for a single Ethernet address, especially if they are web servers. This is harmless, if otherwise startling. You probably will want to use the official (and permanent) name, rather than an alias which may change.
If everything works, but the IP address reported is 127.0.0.1, you have a name service error. This typically occurs when a operating system installation program generates an /etc/hosts line similar to 127.0.0.1 localhost hostname.domainname. The localhost line should say 127.0.0.1 localhost or 127.0.0.1 localhost loghost. Correct it, lest it cause failures to negotiate who is the master browse list holder and who is the master browser. It can, also cause (ambiguous) errors in later tests.
If this worked from the server, repeat it from the client.