CSCI.4220 Network Programming
Fall, 2006
Class 3: Low level networking, Ethernet, ARP

The bottom two layers of the OSI protocol stack are the Physical Layer and the Datalink layer. Their mission is to get information from one node to another. It does not matter whether a node is a router or a host; the principles are the same.

The Physical layer is concerned with how bytes are converted into volts or wavelengths (or whatever the physical medium uses), transmitted, and converted back into bytes at the other end. We will not discuss this layer any more in this course.

The DataLink Layer receives an IP packet from the Network Layer and has to get it to the next hop. The unit of transmission at this level is called a frame. In general, one frame contains one IP Packet with some header information, although it is possible for an IP packet to be fragmented, that is, broken up into several frames to be reassembled at the destination. The DataLink Layer has to convert the bytes in the packet to whatever the physical medium happens to be.

Examples of protocols at this layer include 802.11 wireless, token ring, PPP, ATM, and the various flavors of Ethernet.

There are two broad categories of transmission at this level: point-to-point and broadcast. In a point-to-point network, there is a direct link (a wire, or at least a dedicated circuit) between the sender and the receiver; the sender simply outputs the frame and it goes to the receiver. In a broadcast network, all messages are sent to all hosts, and, hopefully, only the designated receiver reads the message. The analogy is the boss going to the cubicle farm and yelling "Smith, come to my office". Everyone can hear the message, but only Smith does anything about it. Of necessity, any wireless or satellite network is broadcast.

Communication at this level is typically done with a network interface card (NIC), also known as an adapter. This is a board or card which contains some memory, a digital signal processor, an interface to the host bus, and a link interface, which does the actual translation from the bits to volts, radio signals or whatever.

The network interface card has an address encoded in it. Addresses at the physical layer are 6 bytes (48 bits) and are unique on the planet. Physical layer addresses are controlled by the IEEE. When a company wants to manufacture NICs, it is assigned a block of addresses; typically the IEEE fixes the first 24 bits and the manufacturer assigns the remaining 24 bits.
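To make the address layout concrete, here is a minimal Python sketch that splits a 6-byte address into its two 24-bit halves. It assumes the usual arrangement in which the first 24 bits identify the manufacturer's block and the last 24 bits are chosen by the manufacturer. The function name split_mac is made up for illustration.

```python
def split_mac(mac: str) -> tuple[str, str]:
    """Split a 6-byte MAC address into (manufacturer block, card-specific part)."""
    # Accept the common separators: colon, dash, or dot between bytes.
    octets = mac.replace("-", ":").replace(".", ":").lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a 6-byte MAC address: {mac!r}")
    for o in octets:
        int(o, 16)  # raises ValueError if a byte is not hexadecimal
    return ":".join(octets[:3]), ":".join(octets[3:])


block, card = split_mac("1b.aa.58.91.c0.00")
print(block, card)
```

Two cards from the same manufacturer block would share the first value and differ in the second.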

Note that the physical address is assigned to the card, not to the host. If you replace the Ethernet card on your computer with a new card, you will have a new physical address. Conversely, if you take your computer to Starbucks and connect to the Internet using WiFi, your physical address stays the same, but the WiFi network assigns you a temporary IP address (much more on this shortly).

Some hosts have two or more interface cards; thus they have two or more physical addresses. The address applies to the interface, not to the host.

Broadcast protocols present a problem; any host can transmit at any time, and so it is possible that two (or more) hosts can start sending at the same time, resulting in a corrupted transmission. This is known as a collision. There are several solutions to the collision problem. Hosts can take turns, they can partition the channel, or they can use a randomization procedure.

The datalink layer has to have a mechanism to deal with this. The datalink layer is sometimes divided into two sublayers; the lower sublayer is called the Medium Access Control (MAC) sublayer, and it contains the software which determines whether or not to transmit. The 6 byte physical address is often called a MAC address.

Ethernet

So far we have discussed this layer in general terms. I will use Ethernet as a specific example. Ethernet is by far the most widely used protocol at this level, and so it is a good choice.

The original Ethernet was invented in the 1970s by Bob Metcalfe at Xerox PARC (Palo Alto Research Center). Since that time there have been a number of upgrades in speed, but the original concept is essentially unchanged. There are now a number of different Ethernet technologies. The most widely used are 10BaseT, which uses twisted pair copper wire (the same as a telephone connection) and transmits at up to 10 megabits per second (10 Mbps); 100BaseT, which also uses twisted pair copper wire and transmits at up to 100 Mbps; and Gigabit Ethernet, which can transmit at up to 1,000 Mbps.

Ethernet is a broadcast protocol. In the old days, the actual ether was a coax cable. Each host connected to it through a vampire tap. In modern installations, each host connects to an Ethernet hub in a star topology. The hub simply passes all the bits on to all of the other hosts.

An Ethernet frame consists of the following fields:
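As a rough illustration, the Python sketch below unpacks the addressing part of a frame. The layout it assumes is the common Ethernet II header (a 6-byte destination address, a 6-byte source address, and a 2-byte type field, followed by the data); the preamble and trailing CRC are left out. The function name parse_ethernet_header and the sample frame are invented for the example.

```python
import struct


def parse_ethernet_header(frame: bytes) -> dict:
    """Unpack the 14-byte header in front of the payload (assumed Ethernet II layout)."""
    if len(frame) < 14:
        raise ValueError("frame shorter than a 14-byte Ethernet header")
    # "!" = network byte order; two 6-byte strings and one 16-bit unsigned integer.
    dest, src, frame_type = struct.unpack("!6s6sH", frame[:14])

    def fmt(addr: bytes) -> str:
        return ":".join(f"{b:02x}" for b in addr)

    return {"dest": fmt(dest), "src": fmt(src), "type": frame_type, "payload": frame[14:]}


# A made-up broadcast frame; 0x0806 is the type value used for ARP.
sample = bytes.fromhex("ffffffffffff" "1baa5891c000" "0806") + b"payload"
header = parse_ethernet_header(sample)
print(header["dest"], header["src"], hex(header["type"]))
```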

Note that Ethernet is a connectionless, unreliable protocol. It is connectionless because there is no attempt to reserve resources or to alert the receiver before a message is sent. It is unreliable because there is no acknowledgment; at this level, the sender has no idea whether or not the frame was successfully received (other layers may send an acknowledgment).

Because Ethernet is a broadcast protocol, it is possible for two or more frames to collide because more than one host decides to transmit at the same time. When this happens, both messages are garbled. Ethernet addresses this problem through a technique called Carrier Sense Multiple Access with Collision Detection (CSMA/CD). When a host wishes to transmit, it first listens on the line, and waits until there is no other frame being transmitted. It then transmits a frame. The NIC is able to detect that a collision occurred by noting that there is more power on the line than it actually sent.

When the NIC detects that its frame was corrupted by a collision, it immediately stops transmitting. It uses an algorithm called Binary Exponential Backoff to determine when to retransmit. The NIC generates a random bit, either 0 or 1. If it generates a 0, it starts retransmitting again immediately. If it generates a 1, it waits for one time period. This time period is the maximum time that it takes for a frame to get from one end to the other (typically about 100 microseconds). Then it retransmits.

In the simple case where a collision occurs because two hosts are trying to transmit at the same time, both messages will be sent with a 50% probability (if one chooses 0 and the other chooses 1). If they both choose the same number, there will be another collision. If this happens, both choose a random number between 0 and 3, and wait that many time slots before transmitting. The probability that both will choose the same number is one in four. Thus, the probability of successful transmission is 0.75. If a collision occurs at this level, each chooses a number between 0 and 7, and they follow the same procedure.
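The probabilities above can be checked with a small simulation. This is only a sketch of the two-host case: after the n-th collision each host draws a wait between 0 and 2^n - 1 slots, and the attempt counts as a success when the two draws differ. The function names are invented here, and the limits that real adapters place on the number of retries are left out.

```python
import random


def backoff_slots(collisions: int, rng: random.Random) -> int:
    """After the n-th collision, wait a random number of slots in 0 .. 2**n - 1."""
    return rng.randrange(2 ** collisions)


def success_rate(collisions: int, trials: int = 100_000, seed: int = 1) -> float:
    """Estimate how often two colliding hosts choose different waits."""
    rng = random.Random(seed)
    different = 0
    for _ in range(trials):
        if backoff_slots(collisions, rng) != backoff_slots(collisions, rng):
            different += 1
    return different / trials


for n in (1, 2, 3):
    # The estimate should land close to the exact value 1 - 1/2**n.
    print(n, success_rate(n), 1 - 1 / 2 ** n)
```

Each further collision doubles the range of waits, which is where the "exponential" in the name comes from.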

Ethernet works better in practice than in theory. Under low load, it is extremely efficient. Under periods of very high load, when many hosts want to transmit at the same time, it degrades as any network would, but it eventually reaches a point where essentially no frames are transmitted because there are too many collisions, and the hosts that want to transmit are spending long periods of time waiting.

Ethernet has been around for a long time, and it works pretty well, but recently, there have been a number of extensions to adapt to newer technologies. In 1998, the IEEE issued a standard to allow a frame format extension to support a Virtual Local Area Network (VLAN) field. This allows a large network to be divided into subnets based on a virtual field rather than a physical hub. Thus, a number of computers on different physical networks can behave as if they were on a single network.

Gigabit networks can allow a sender to transmit in burst mode, in which they can transmit a series of frames, all part of a single stream, without giving up control of the medium.

The Address Resolution Protocol (ARP)

Consider this network.

When hillary wants to send a message to marykate, it can get marykate's IP address with the gethostbyname() function or equivalent, but it also needs to know marykate's MAC address. This is the role of the Address Resolution Protocol (ARP).

Each host on the network maintains an ARP table. Each entry in this table has the following fields: an IP address, the corresponding MAC address, and an expiration time.

When a sender wants to send a packet to another host, it knows the IP address and so it can look up the MAC address.

Suppose hillary wants to send a message to marykate, but marykate is not in hillary's ARP table. There are a number of reasons why this could be: The entry could have expired (recall that each entry has an expiration time); marykate could be newly added to the network, or hillary could be recently rebooted and so it has not populated its ARP table yet. If it cannot find the host in its ARP table, hillary sends an ARP request packet. This packet has four fields.

The packet would look like this

Source IP addr   Source MAC addr     Dest IP addr     Broadcast MAC addr
128.213.71.31    1b.aa.58.91.c0.00   128.213.44.11    ff.ff.ff.ff.ff.ff

This is sent to the broadcast MAC address, so every host on the network reads it, including marykate. marykate sees that the request has its IP address, so it sends an ARP reply that looks like this.

Source IP addr   Source MAC addr     Dest IP addr     Dest MAC addr
128.213.44.11    4f.ca.60.06.5e.22   128.213.71.31    1b.aa.58.91.c0.00

All of the other hosts on the network ignore the request. When hillary receives the reply, it makes the appropriate entry in its ARP table, and is now able to send the message to marykate.
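The exchange can be sketched in Python as follows. The host names and addresses are the ones used in the example; the Host class, its method names, and the 20-minute entry lifetime are assumptions made for illustration, and the "network" is just a list of objects rather than real frames on a wire.

```python
import time

BROADCAST_MAC = "ff.ff.ff.ff.ff.ff"


class Host:
    def __init__(self, name, ip, mac, ttl_seconds=1200.0):
        self.name, self.ip, self.mac = name, ip, mac
        self.ttl_seconds = ttl_seconds   # assumed entry lifetime, not from the notes
        self.arp_table = {}              # IP address -> (MAC address, expiry time)

    def lookup(self, ip, now):
        """Return the cached MAC address, or None if it is missing or expired."""
        entry = self.arp_table.get(ip)
        if entry is not None and entry[1] > now:
            return entry[0]
        self.arp_table.pop(ip, None)
        return None

    def handle_request(self, request):
        """Answer an ARP request only if it asks for this host's IP address."""
        src_ip, src_mac, dest_ip, _dest_mac = request
        if dest_ip != self.ip:
            return None                  # every other host ignores the request
        return (self.ip, self.mac, src_ip, src_mac)

    def resolve(self, ip, lan, now=None):
        """Look up ip, broadcasting an ARP request on a table miss."""
        now = time.time() if now is None else now
        mac = self.lookup(ip, now)
        if mac is None:
            request = (self.ip, self.mac, ip, BROADCAST_MAC)
            for host in lan:             # a broadcast reaches every host
                reply = None if host is self else host.handle_request(request)
                if reply is not None:
                    mac = reply[1]
                    self.arp_table[ip] = (mac, now + self.ttl_seconds)
        return mac


hillary = Host("hillary", "128.213.71.31", "1b.aa.58.91.c0.00")
marykate = Host("marykate", "128.213.44.11", "4f.ca.60.06.5e.22")
lan = [hillary, marykate]
print(hillary.resolve("128.213.44.11", lan, now=0.0))
```

A second call to resolve() before the entry expires is answered from the table without another broadcast.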

ARP is a plug-and-play protocol. It does not need to be configured by the system administrator. When a new host is added to the network, other hosts learn about it over time without any human intervention.

Hubs, Repeaters, Bridges, and Switches

There are a number of ways that several LANs can be connected. The lowest level is a hub. A hub is a physical layer interface; it only knows about bits; it often does not even know about MAC addresses. It simply amplifies the bits and sends them out to everyone on the network. But even a hub can provide some management features. If an adapter malfunctions and continuously sends frames, the hub can detect this and internally disconnect the malfunctioning adapter.

A repeater is similar to a hub in the sense that it is a physical layer device; a repeater can be attached as one of the devices on a LAN to connect another Ethernet. Like a hub, it simply amplifies the bits and passes them on. This means that two networks connected with a repeater are in the same collision domain.

Two Ethernets can also be connected with a bridge or a switch. These are datalink layer devices; they know about MAC addresses and frame boundaries. A bridge connects exactly two networks; a switch connects multiple networks. A switch knows which hosts are on each of its networks. When a switch receives a frame, it looks at the MAC address; if it comes from a host on network A and is being sent to a host on network B (or the reverse), it forwards it, otherwise it ignores it.

Like ARP, a switch is self learning; it does not need to be configured by the system administrator. Here is the algorithm:
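A common self-learning scheme is sketched below in Python. It is offered as an assumption about the general approach, not as the exact algorithm intended here: the switch records the port on which each source address was seen, forwards a frame to the recorded port of its destination, drops it if that is the port it arrived on, and floods it to all other ports when the destination is unknown. The class and method names are invented, and aging out of old entries is omitted.

```python
class LearningSwitch:
    def __init__(self, num_ports: int):
        self.ports = range(num_ports)
        self.table = {}  # MAC address -> port on which it was last seen

    def receive(self, in_port: int, src_mac: str, dest_mac: str) -> list[int]:
        """Process one frame and return the list of ports it is sent out on."""
        self.table[src_mac] = in_port              # learn where the sender lives
        out_port = self.table.get(dest_mac)
        if out_port is None:
            # Unknown destination: send to every port except the arrival port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []                              # same segment: nothing to forward
        return [out_port]


switch = LearningSwitch(3)
print(switch.receive(0, "A", "B"))  # B not yet known, so the frame is flooded
print(switch.receive(1, "B", "A"))  # A was learned on port 0
print(switch.receive(0, "A", "B"))  # B was learned on port 1
```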

Token Ring

Token Ring is another LAN protocol. It is waning in popularity now. It solves the problem of collisions in a very different way from Ethernet.

All of the hosts on a network are in a ring. When the network is idle, a token is passed around the ring from one host to another. If a host receives the token and does not want to transmit, it passes it on to the next host. If host A wants to send a message to Host B, it has to wait for the token to arrive before it can transmit. Once it receives the token, rather than passing it on, it sends out its frame. This prevents the problem of collisions because only one host at a time is permitted to transmit.

Once it has sent a packet, it passes the token on to the next host in the ring.
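A toy Python model of the token passing is shown below. It assumes that a host sends at most one frame each time it holds the token, and it represents each host simply as a queue of frames waiting to be sent; run_token_ring and the frame labels are made up for the example.

```python
from collections import deque


def run_token_ring(queues, rounds):
    """Pass the token around the ring `rounds` times.

    queues: one list of pending frames per host, in ring order.
    Returns the frames in the order they were transmitted, as (host, frame) pairs.
    """
    pending = [deque(q) for q in queues]
    sent = []
    holder = 0                                   # the host currently holding the token
    for _ in range(rounds * len(pending)):
        if pending[holder]:
            # Only the token holder may transmit, so no collision is possible.
            sent.append((holder, pending[holder].popleft()))
        holder = (holder + 1) % len(pending)     # pass the token to the next host
    return sent


print(run_token_ring([["a1", "a2"], [], ["c1"]], rounds=2))
```

Host 0 has two frames queued but must wait a full trip around the ring between them, which gives host 2 its turn in between.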

Theoretically, token ring is slightly less efficient under periods of low load (the norm on most LANs) because when a host wants to transmit, it has to wait for the token. In practice, this is not noticeable.

Under periods of very heavy load, when many hosts want to transmit large messages, token ring is noticeably better than Ethernet. Obviously there will be degraded performance under large load with both, but Ethernet will eventually collapse and very little data will be sent, while token ring will continue to perform.

Fiber-distributed data interface (FDDI) is a LAN protocol which uses a token ring system. It operates over a pair of fiber optic rings, with the two rings passing tokens in opposite directions. FDDI networks transmitted at 100 Mbps, which initially made them quite popular for high-speed networking. With the advent of 100-Mbps Ethernet, which is cheaper and easier to administer, FDDI has waned in popularity.

PPP, the Point-to-Point Protocol

PPP is a link layer protocol used for dial-up connections. As the name implies, it is a point-to-point protocol, so collisions are not an issue as they are in a broadcast network.

A PPP frame looks like this

If the end-of-frame byte appears in the data, PPP uses byte stuffing. There is an escape byte, 01111101. This is inserted before any end-of-frame byte that occurs in the data. If the escape byte itself occurs in the data, it is preceded by another escape byte. The receiver strips these escape bytes out.
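The stuffing rule can be sketched in a few lines of Python. The sketch follows the simplified description given here and assumes the end-of-frame byte is 01111110 and the escape byte is 01111101. Actual PPP framing additionally alters the escaped byte (it is XORed with 0x20), which is not modeled. The function names stuff and unstuff are invented.

```python
FLAG = 0b01111110     # end-of-frame byte (assumed value, 0x7E)
ESCAPE = 0b01111101   # escape byte (0x7D)


def stuff(data: bytes) -> bytes:
    """Sender side: put an escape byte in front of every flag or escape byte."""
    out = bytearray()
    for b in data:
        if b in (FLAG, ESCAPE):
            out.append(ESCAPE)
        out.append(b)
    return bytes(out)


def unstuff(data: bytes) -> bytes:
    """Receiver side: drop each escape byte and keep the byte that follows it."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == ESCAPE:
            b = next(it, None)   # the byte after an escape is taken literally
            if b is None:
                raise ValueError("data ends with a dangling escape byte")
        out.append(b)
    return bytes(out)


payload = bytes([0x41, FLAG, ESCAPE, 0x42])
assert unstuff(stuff(payload)) == payload
print(stuff(payload).hex(" "))
```

After stuffing, an unescaped flag byte can only mean the end of the frame.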

ATM and Tunneling

Asynchronous Transfer Mode (ATM) is a virtual circuit network protocol developed in the 1980s. ATM uses a different protocol stack. The ATM Adaptation Layer (AAL) is roughly comparable to the Transport Layer in the OSI stack. It breaks a large message down into segments on the sender side and reassembles the segments into a message and does error checking on the receiver side. In ATM speak, the segments are called cells.

ATM supports several different service models.

The ATM layer is roughly comparable to the Network Layer. It uses virtual circuit routing. Each cell is 53 bytes. A cell contains a 5 byte header and 48 bytes of payload. The header contains a 28 bit Virtual Channel Identifier (VCI). All of the cells associated with a particular session have the same VCI value. ATM is connection oriented in that an end-to-end virtual channel is created prior to any data being transmitted. The VCI tells each intermediate router which channel a particular cell is associated with. This makes routing through a network faster than the IP datagram model.

The cell header also contains a 3 bit payload type field and an 8 bit checksum.

When an Internet datagram is sent over an ATM network, the entire datagram is encapsulated into ATM cells, a process known as tunneling. The datagram is reassembled at the other side of the ATM network. To the IP layer, the ATM network looks like a single hop, even though the data may have been passed through a number of ATM switches.
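The arithmetic of cutting a datagram into cells can be illustrated with a short Python sketch. It is deliberately simplified: the 5-byte header here is a placeholder that just stores the VCI, not the real header bit layout, the last cell is padded with zero bytes, and the original length is passed to the receiver separately, whereas a real adaptation layer handles padding and length itself. The names to_cells and from_cells are hypothetical.

```python
CELL_SIZE = 53
HEADER_SIZE = 5
PAYLOAD_SIZE = CELL_SIZE - HEADER_SIZE   # 48 bytes of payload per cell


def to_cells(datagram: bytes, vci: int) -> list[bytes]:
    """Cut a datagram into 53-byte cells that all carry the same VCI."""
    header = vci.to_bytes(HEADER_SIZE, "big")    # placeholder header, not the real format
    cells = []
    for offset in range(0, len(datagram), PAYLOAD_SIZE):
        chunk = datagram[offset:offset + PAYLOAD_SIZE]
        cells.append(header + chunk.ljust(PAYLOAD_SIZE, b"\x00"))  # pad the last cell
    return cells


def from_cells(cells: list[bytes], length: int) -> bytes:
    """Reassemble the datagram at the far side and drop the padding."""
    return b"".join(cell[HEADER_SIZE:] for cell in cells)[:length]


datagram = bytes(range(100))
cells = to_cells(datagram, vci=42)
print(len(cells), [len(c) for c in cells])
assert from_cells(cells, len(datagram)) == datagram
```

A 100-byte datagram needs three cells, since two cells carry only 96 bytes of payload.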

802.11 Wireless LANs (WiFi)

Before communication occurs, a host needs to establish a connection (an association) with a base station. It is sometimes the case that one host can receive signals from several base stations, so it has to choose one. Periodically, each access point sends out a beacon frame containing its ID number and MAC address. A new host scans the available channels for beacon frames and chooses one access point with which to establish an association. The host may be required to authenticate itself.

Once an association has been established, the host and access point can communicate. The protocol is called Carrier Sense Multiple Access with Collision Avoidance, or CSMA/CA (in contrast with Ethernet, which is CSMA/CD). There is no attempt to detect collisions; once a frame starts, it is sent in its entirety.

Frames have a checksum, similar to Ethernet. However, in contrast with Ethernet, each frame is acknowledged by the receiver if it is successfully received. This is done because the probability of a frame being corrupted is higher with wireless transmission than with transmission on a wire. If the sender does not receive an acknowledgement in a reasonable amount of time, it retransmits the frame.

The frame structure of 802.11 is more complex than that of Ethernet. It starts with a 16 bit frame control field, which contains fields pertaining to version and such. Each frame has three MAC addresses (there is actually a fourth address as well, but it is only used in ad hoc networks). The three addresses are

  1. MAC address of the wireless station that is to receive the frame
  2. MAC address of the source
  3. MAC address of the destination router or switch

An example will help explain this.

Host A has established an association with Base Station 1. It wants to send a packet to the Internet. Address 1 of the frame would be the MAC address of Base Station 1 and Address 2 would be the MAC address of Host A. Recall that the base station does not understand IP addresses. The third address is the MAC address of the Internet Gateway, which is just another host on the Ethernet. When the frame arrives at the Base Station, the Base Station converts it from an 802.11 (WiFi) format to an 802.3 (Ethernet) format and passes it on to the Gateway. It puts Address 3 in the destination address of the Ethernet frame. The source is Host A's MAC address.

Likewise, if a packet arrives at the Internet Gateway destined for Host A, the gateway sends an ordinary Ethernet frame onto the network, with Host A's MAC address as the destination. The base station reads this frame and converts it to a WiFi frame, with the MAC address of Host A in Address 1, the MAC address of the base station in Address 2, and the MAC address of the Gateway in Address 3.

Note that the Gateway is not aware that there is a base station between the network and Host A.
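The address handling at the base station can be sketched in Python as below. The two frame classes are stripped down to the address fields and a payload, and the class names, function names, and MAC addresses are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class WifiFrame:
    addr1: str      # wireless station that is to receive the frame
    addr2: str      # wireless station that transmits the frame
    addr3: str      # the endpoint on the wired side (here, the gateway)
    payload: bytes


@dataclass
class EthernetFrame:
    dest: str
    src: str
    payload: bytes


def wifi_to_ethernet(frame: WifiFrame) -> EthernetFrame:
    """Host to gateway: Address 3 becomes the destination, Address 2 the source."""
    return EthernetFrame(dest=frame.addr3, src=frame.addr2, payload=frame.payload)


def ethernet_to_wifi(frame: EthernetFrame, base_station_mac: str) -> WifiFrame:
    """Gateway to host: the base station inserts itself as Address 2."""
    return WifiFrame(addr1=frame.dest, addr2=base_station_mac,
                     addr3=frame.src, payload=frame.payload)


HOST_A, BASE_1, GATEWAY = "aa.aa.aa.aa.aa.aa", "bb.bb.bb.bb.bb.bb", "cc.cc.cc.cc.cc.cc"

uplink = wifi_to_ethernet(WifiFrame(BASE_1, HOST_A, GATEWAY, b"request"))
downlink = ethernet_to_wifi(EthernetFrame(HOST_A, GATEWAY, b"reply"), BASE_1)
print(uplink)
print(downlink)
```

In neither direction does the base station's own address appear in the Ethernet frame, which is why the gateway never learns that it is there.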

Required Reading

As usual, the Wikipedia description of Ethernet is excellent

and here is its treatment of ATM

More than you ever wanted to know about ARP