| EIW Fall 2003 Lecture Notes |
This is a very brief overview of TCP/IP (and Ethernet). You don't need to worry about understanding everything completely - a general understanding of what TCP/IP is (and that the Internet is just a bunch of machines that talk TCP/IP) is enough.
The Internet is a collection of end-systems that communicate using the TCP/IP protocol suite. TCP/IP stands for "Transmission Control Protocol/Internet Protocol".
A Protocol Suite is a set of protocols defining services at a number of layers (often corresponding to the layers in the OSI reference model).
The TCP/IP protocol suite includes services at the network and transport layers. Remember that the network layer needs some data link layer to provide basic communication between end-systems on the same network. TCP/IP does not include a data link layer (or a physical layer), which means that TCP/IP can run on many different types of networks (many different data link layers). We will discuss one popular data link layer to get some understanding of how a data link layer can operate.
Ethernet is a popular data link layer used in many LANs. Ethernet is supported by a variety of physical layer implementations. Although the physical layer may vary, all ethernet networks share a common mechanism for transmitting data between computers on the same network, including the following:
Each end-system has an ethernet address, for example C0:B3:44:17:21:17.
The source system constructs a chunk of data (called a frame) that includes the ethernet address of the destination and puts this data "on the wire" (asks the physical layer to transmit the data).
All end-systems on the network (wire) receive the chunk of data and look at the destination address. If the destination address matches the ethernet address of the end-system, it keeps the data - otherwise it throws it away.
Each frame includes a "header" that includes information such as the destination address - the header is always in the same place in the chunk of data.

The payload is data that comes from/to higher level layers.
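The delivery mechanism described above can be sketched in a few lines of Python. This is a simplified model, not the real ethernet frame format: the header here holds only a 6-byte destination address and a 6-byte source address, and the function names and the second and third addresses are made up for illustration.

```python
HEADER_LEN = 12  # simplified header: 6-byte destination + 6-byte source


def parse_mac(text: str) -> bytes:
    """Convert 'C0:B3:44:17:21:17' into 6 raw bytes."""
    raw = bytes(int(part, 16) for part in text.split(":"))
    if len(raw) != 6:
        raise ValueError("an ethernet address has 6 bytes")
    return raw


def build_frame(dest: str, src: str, payload: bytes) -> bytes:
    """Header first (always in the same place), then the payload."""
    return parse_mac(dest) + parse_mac(src) + payload


def accept_frame(my_address: str, frame: bytes):
    """Return the payload if the frame is addressed to us, else None."""
    if frame[:6] == parse_mac(my_address):
        return frame[HEADER_LEN:]
    return None  # not for us: throw it away


frame = build_frame("C0:B3:44:17:21:17", "00:11:22:33:44:55", b"hello")
# Every station on the wire sees the frame; only the addressee keeps it.
print(accept_frame("C0:B3:44:17:21:17", frame))  # b'hello'
print(accept_frame("00:AA:BB:CC:DD:EE", frame))  # None
```

Because the destination address always sits in the first bytes of the frame, each end-system can decide whether to keep the data without looking at the payload.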
The network layer of TCP/IP is called IP (Internet Protocol). Since this is a network layer, it is responsible for sending data between end-systems on different networks. IP accomplishes this by using a data link layer such as ethernet to transmit data between systems on the same network. Often the system that ethernet delivers the data to is a router, which can then forward the data to another network. IP networks have the following properties:
Each end-system has a unique address (an IP address). Some systems have multiple network interfaces and therefore multiple IP addresses.
The IP layer in each end-system uses a data link layer to transmit and receive IP packets. For example, an IP packet might be the payload of an ethernet frame.
Since some systems (routers) must be able to forward IP packets, there must be some mechanism for determining where each packet should go. This is accomplished using network addresses (identifying networks) and host addresses (identifying individual hosts, or end-systems, on a network). Together the network and host address make up the IP address.
An IP address is 32 bits long, and is usually shown using "dotted decimal notation" in which each byte is shown as the decimal number encoded by the 8 bits. For example:
| Binary | Dotted Decimal |
|---|---|
| 00000001 00000010 00000011 00000100 | 1.2.3.4 |
| 10000000 11010101 00000001 00000001 | 128.213.1.1 |
The first part of each IP address specifies the network address. For example, the network address 128.213 is the address for the Computer Science department. The last 2 bytes specify which host (end-system) within the CS network.
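The conversion between a 32-bit address and dotted decimal notation, and the split into network and host parts, can be sketched as follows. The function names are illustrative, and the 16-bit network part is an assumption taken from the 128.213 example; other networks may divide the 32 bits differently.

```python
def to_dotted_decimal(address: int) -> str:
    """Render a 32-bit integer as four decimal bytes, most significant first."""
    return ".".join(str((address >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def from_dotted_decimal(text: str) -> int:
    """Pack 'a.b.c.d' back into a 32-bit integer."""
    value = 0
    for part in text.split("."):
        value = (value << 8) | int(part)
    return value


def split_address(text: str, network_bits: int = 16):
    """Split an address into (network, host); 16 network bits is assumed."""
    address = from_dotted_decimal(text)
    host_bits = 32 - network_bits
    network = address >> host_bits
    host = address & ((1 << host_bits) - 1)
    return network, host


print(to_dotted_decimal(0b10000000_11010101_00000001_00000001))  # 128.213.1.1
network, host = split_address("128.213.1.1")
print(to_dotted_decimal(network << 16), host)  # 128.213.0.0 257
```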
When routing packets, the routers only look at the network address. Each router must have a table (called a routing table) that indicates which network it should send a packet to, given the destination network address. Routers also have a "default route" that indicates where to send packets whose network addresses are not in the routing table.
[Figure: routing table and router diagram]
The routing table above might correspond to the router shown in the picture to the right. The router is connected to 3 networks and each of these networks has (at least) another router. Note the default route (labeled anything else).
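A routing table lookup with a default route can be sketched as a dictionary lookup with a fallback. The entries below are hypothetical and do not reproduce the table from the lecture; the sketch also assumes that the first two bytes of every address form the network address, as in the 128.213 example.

```python
# Hypothetical routing table: destination network -> next-hop router.
# None of these entries come from the lecture's table.
ROUTING_TABLE = {
    "128.213": "router A",
    "128.113": "router B",
    "192.168": "router C",
}
DEFAULT_ROUTE = "router D"  # "anything else"


def network_of(ip_address: str) -> str:
    """Assume the first two bytes are the network address, as in 128.213."""
    return ".".join(ip_address.split(".")[:2])


def next_hop(destination: str) -> str:
    """Look only at the network part; fall back to the default route."""
    return ROUTING_TABLE.get(network_of(destination), DEFAULT_ROUTE)


print(next_hop("128.213.1.1"))  # router A
print(next_hop("10.0.0.7"))     # router D
```

Note that the host part of the address never enters the lookup, which keeps the table small: one entry per network rather than one per end-system.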
The TCP/IP protocol suite includes 2 transport layers, called UDP (user datagram protocol) and TCP (transmission control protocol). Although these are both transport layers and the primary responsibility is to deliver packets between processes (not just between end-systems), these transport layers provide very different types of service:
UDP - provides for the delivery of a single chunk of data (called a datagram) between processes. Each datagram is independent of any others, and they may be lost by the network or arrive out of order. A receiving system has no way of knowing if the sender has sent a datagram unless one arrives, nor does it know if the datagram received was the next one sent by the sender (no assumption can be made about the order received). UDP is an unreliable, connectionless, datagram oriented transport protocol.
Unreliable - there is no guarantee that the network will deliver a message, and the network does not inform the sender if the message is lost.
Connectionless - there is no initial "connection" made between the sender and receiver before data is sent.
Datagram-oriented - data is sent and received in chunks without any order imposed or implied.
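As a rough illustration of these properties, the sketch below sends a single UDP datagram using Python's standard `socket` module. It assumes both endpoints live in the same process and talk over the loopback address 127.0.0.1; the function name `udp_demo` is made up for this example.

```python
import socket


def udp_demo(message: bytes = b"hello") -> bytes:
    """Send one datagram to ourselves over the loopback interface."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)  # a lost datagram would otherwise block forever
        # No connection setup: the datagram is simply addressed and sent.
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


print(udp_demo())
```

The timeout is there because UDP itself gives the receiver no way to learn that a datagram was sent and lost; any such detection has to be added by the application.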
TCP - provides for connection-oriented, bi-directional, byte stream transmission of data. Each endpoint views a TCP "connection" as an ordered stream of bytes. An initial connection must be made between the sender and receiver before any data flows. TCP provides a reliable service - it attempts to fix any problems (retransmits data if something is lost) and notifies the sending process of problems.
UDP is a minimal transport protocol, adding as little as possible to the services provided by IP. TCP provides lots of features (including reliability) although there is some overhead involved with the additional services.
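For comparison, here is a minimal sketch of a TCP exchange, again using Python's standard `socket` module over the loopback address (an assumption made so the example is self-contained). The sender makes two separate writes, but the receiver simply sees one ordered stream of bytes.

```python
import socket
import threading


def tcp_demo() -> bytes:
    """Send two writes over one loopback TCP connection and read them back."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    listener.settimeout(5.0)
    address = listener.getsockname()

    def client() -> None:
        # Creating the connection performs the initial setup with the server.
        with socket.create_connection(address) as conn:
            conn.sendall(b"hello, ")
            conn.sendall(b"world")
        # Leaving the with-block closes the connection (end of stream).

    worker = threading.Thread(target=client)
    worker.start()
    conn, _peer = listener.accept()
    conn.settimeout(5.0)
    received = b""
    with conn:
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # empty read: the sender closed the stream
                break
            received += chunk
    worker.join()
    listener.close()
    return received


print(tcp_demo())  # b'hello, world'
```

The receiver cannot tell where one write ended and the next began. If an application needs message boundaries on top of TCP, it has to define them itself.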
Most applications on The Internet are based on TCP.
Both TCP and UDP must provide communication between processes identified by protocol "ports" (really just an integer). So the destination address of any TCP or UDP data includes an IP address and a protocol port number. Many network services operate on a prescribed port number, for example HTTP (WWW) servers use port 80 and mail servers (SMTP) use port 25.
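The role of the port number can be pictured as a lookup from port to receiving process. The sketch below is a toy model of that idea, not how an operating system actually implements it; the `listeners` table and the `deliver` function are hypothetical.

```python
# Well-known ports mentioned in the text.
WELL_KNOWN_PORTS = {"http": 80, "smtp": 25}

# Hypothetical table of which process is listening on which port.
listeners = {
    80: lambda data: f"web server got {data!r}",
    25: lambda data: f"mail server got {data!r}",
}


def deliver(destination: tuple, data: bytes) -> str:
    """Hand data to the process bound to the destination port."""
    ip_address, port = destination
    handler = listeners.get(port)
    if handler is None:
        return f"no process is listening on {ip_address}:{port}"
    return handler(data)


print(deliver(("128.213.1.1", WELL_KNOWN_PORTS["http"]), b"GET /"))
print(deliver(("128.213.1.1", 9999), b"???"))
```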
The Internet is a very large TCP/IP network. Each end-system on The Internet must have a unique IP address (these are assigned in blocks by a central authority). We are running out of IP addresses! IPv6 (a.k.a. IPng) is a new version of IP that extends the address length from 32 bits to 128 bits.
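A little arithmetic shows why 32-bit addresses are a limited resource. The sketch below uses Python's standard `ipaddress` module and assumes, for illustration only, a block with a 16-bit network address like the 128.213 example.

```python
import ipaddress

total_ipv4 = 2 ** 32  # every possible 32-bit address
print(f"{total_ipv4:,}")  # 4,294,967,296

# One block with a 16-bit network address leaves 16 bits for hosts.
block = ipaddress.ip_network("128.213.0.0/16")
print(block.num_addresses)                # 65536
print(total_ipv4 // block.num_addresses)  # 65536 blocks of that size
```

Addresses are handed out a block at a time, so many of them are reserved for an organization whether or not a machine is actually using them.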
In the old days...
The Internet has been around for over 10 years, although until recently the major applications were email, file transfer and remote login. These applications had text-oriented, command based user interfaces. The use of The Internet was generally restricted to people with Unix systems who knew the various commands for each specific application. Although many of the old services still exist on the internet, the WWW hides most of the details. Transferring a file now just requires a click; in the past a sequence of text commands was necessary. We will try out some old services as an exercise...
Humans typically don't deal with IP addresses. Instead, we usually give a client the "host name". These names are arranged in a hierarchical structure that makes them easy to remember (easier than IP addresses). The naming hierarchy is based on the concept of naming "domains", where each domain covers some subset of the entire set of names. At the top level there are domains corresponding to educational institutions (.edu), commercial entities (.com), public organizations (.org), government entities (.gov), etc. There are also top level domains for countries. The organization of the top level in the naming hierarchy is likely to change soon…

Individual organizations each get a name at the second level in the name hierarchy. For example, Rensselaer has the domain "rpi.edu", Microsoft has "microsoft.com", etc. Within organizations it is possible to add additional naming levels. This is typically done to identify sub-organizations or departments within the larger organization. Computer Science at RPI has the domain "cs.rpi.edu", for example.
Finally, each individual "host" (end-system) has a unique name within its department or sub-organization. The complete name of a host includes this name as well as the names of all sub-organization units, the organizational name and the top level domain. For example, a specific computer in the computer science department is "coffeepot.cs.rpi.edu" and another might be "joe.theory.cs.rpi.edu".
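Reading a complete host name from left to right moves from the most specific name to the top level domain. A short sketch (the function name is made up) that lists every domain containing a given host:

```python
def enclosing_domains(host_name: str) -> list:
    """List the domains containing a host, from most specific to top level."""
    labels = host_name.split(".")
    return [".".join(labels[i:]) for i in range(1, len(labels))]


print(enclosing_domains("joe.theory.cs.rpi.edu"))
# ['theory.cs.rpi.edu', 'cs.rpi.edu', 'rpi.edu', 'edu']
```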
There is a networked service that provides conversion between these host names and IP addresses. Client processes typically access this service automatically whenever a user enters a host name. This networked service is the "Domain Name System" (DNS) and is an integral part of the Internet. There are thousands of participating DNS servers; each server can provide IP addresses for a specific domain.
When you attempt to access a remote computer on the Internet using a host name, your client process contacts the DNS server that controls the domain in which the client is found. It asks this server for the IP address of the remote computer. Once it gets a reply, the client uses the IP address to establish communication with the remote computer. Since it is not likely that the local DNS server knows the IP address of all hosts on the Internet, it might need to contact another DNS server (one that handles a larger domain) and ask for help. This process continues up the naming tree until reaching a "top-level" DNS server, at which point the top level server will forward the request to the DNS server that handles the top-level domain of the remote host (which can also pass the request along to lower level domain servers). Since the conversion between names and IP addresses is such a common operation, it is important that these requests are handled efficiently and that the collection of name servers can grow to accommodate the growth in the naming hierarchy. DNS is probably one of the best designed network services; so far it has kept pace with the growth of the Internet without major problems…
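The way a request is passed down the naming tree can be illustrated with a toy model. This is not the real DNS protocol: the servers are just dictionary entries, only the downward half of the lookup (from the top of the tree to the server that knows the host) is modeled, and the IP address given for coffeepot.cs.rpi.edu is hypothetical.

```python
# Toy name servers keyed by the domain they handle. Each server either
# knows a host's address or knows which servers handle its sub-domains.
# All addresses here are hypothetical.
SERVERS = {
    "": {"delegates": ["edu", "com"]},  # top of the tree
    "edu": {"delegates": ["rpi.edu"]},
    "rpi.edu": {"delegates": ["cs.rpi.edu"]},
    "cs.rpi.edu": {"hosts": {"coffeepot.cs.rpi.edu": "128.213.1.1"}},
}


def resolve(host_name: str):
    """Walk down from the top-level server, following delegations."""
    domain, path = "", []
    while True:
        server = SERVERS[domain]
        path.append(domain or "(root)")
        address = server.get("hosts", {}).get(host_name)
        if address is not None:
            return address, path
        # Find a delegated sub-domain that contains the host name.
        for child in server.get("delegates", []):
            if host_name.endswith("." + child) and child in SERVERS:
                domain = child
                break
        else:
            return None, path  # nobody below us handles this name


print(resolve("coffeepot.cs.rpi.edu"))
# ('128.213.1.1', ['(root)', 'edu', 'rpi.edu', 'cs.rpi.edu'])
```

In an ordinary program none of this is visible: a call such as Python's `socket.gethostbyname` hands the host name to the system's resolver and returns the IP address.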
All Internet applications use TCP/IP to provide communication services. The data (the payload) delivered by TCP/IP must be formatted according to application protocols - these protocols define the exact nature of the data that is exchanged between processes. The development of protocols is typically done by a group called the Internet Engineering Task Force, with small committees responsible for development of each individual protocol. For example, there is a committee that developed the SMTP protocol used to deliver email messages, and another that developed the FTP protocol used for file transfer. Each protocol is made widely available, so that anyone can develop software that adheres to the protocol. A number of Internet Application Protocols and links to the published protocol documentation are listed below:
HTTP - Hypertext Transfer Protocol
FTP - File Transfer Protocol
SMTP - Simple Mail Transfer Protocol
TFTP - Trivial File Transfer Protocol
RIP - Routing Information Protocol
These documents are known as "Request for Comments" (an historical name) and are often difficult to read, although they are generally the reference for specific protocols.
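To give a feel for what "formatted according to an application protocol" means, here is a small sketch that builds a minimal HTTP/1.0 request and parses the first line of a response. The helper names and the host name www.cs.rpi.edu are assumptions for illustration, and the response is a hand-written sample rather than output from a real server.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Format a minimal HTTP/1.0 GET request as the bytes sent over TCP."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes):
    """Split the first line of a response into (version, code, reason)."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


print(build_request("www.cs.rpi.edu"))
# b'GET / HTTP/1.0\r\nHost: www.cs.rpi.edu\r\n\r\n'
print(parse_status_line(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"))
# ('HTTP/1.0', 200, 'OK')
```

TCP only delivers these bytes; it is the application protocol that says the request line comes first and that a blank line ends the headers.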
Additional Information: