CSCI E-131b Communication Protocols and Internet Architectures [Intro and CH. 1]

For Fall '09, Harvard Extension is offering Communication Protocols & Internet Architectures with Len Evenchik. The meetings are Mondays from 5:30-7:30 PM.

I decided to take the class because the Web is my world and it's been a while since I understood how it actually works. Right now I see it as a bunch of TLAs (three-letter-acronyms) that scare the heck out of me. Why should I care about TCPs, UDPs and IPs? I'm just a developer!

But after working for a major newspaper and trying to run my own start-ups I realized that's an awful mindset to have. I need to have a better picture of this whole world -- wide web -- that I love to work in. At times I'm not just a back-end developer, I'm the front-end guru, or the database genius, the sys-admin g*d, etc, etc.

Now let's get started. In blogs titled "Communication Protocols and Internet Architectures" I'll be referencing the book Computer Networks by Larry L. Peterson and Bruce S. Davie (Fourth Edition).

Note: All of my bullets can probably start off with the phrase "I think that" or "I don't really know but it seems like".

Chapter 1: Foundation

  • Network was a term originally used to describe the relationship between remote terminals and a mainframe computer. Think of using a keyboard and an LCD screen but having the computer itself (CD-ROM, etc.) 200 ft away.
  • Many networks are optimized for a specific task. There are telephone networks, cable networks, and satellite networks to name a few. Not every network carries computer, phone, or cable traffic as efficiently as one built for that kind of traffic.

1.1 Applications
  • The World Wide Web (Web for short) is not the internet. The Web is an application that uses the internet.
  • Every selectable item on a web page has a special identifier. It's referred to as a URL or URI (Uniform Resource Locator/Identifier).
  • Every time you click on a URL, many messages are sent between you and remote computers. Some include establishing connections and discovering IP-addresses in order to retrieve data.
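Here's a rough sketch of what a single click sets in motion, assuming Python's standard library. The URL is a placeholder I made up, and the function name `plan_fetch` is hypothetical. It only splits the URL into its parts and lists the steps; it never touches the network.

```python
from urllib.parse import urlsplit

def plan_fetch(url):
    """List the rough steps a browser takes for a URL (illustrative only)."""
    parts = urlsplit(url)
    # Default ports for the two common web schemes.
    port = parts.port or {"http": 80, "https": 443}.get(parts.scheme)
    path = parts.path or "/"
    return [
        f"look up the IP address of {parts.hostname}",
        f"open a TCP connection to port {port}",
        f"send a request for {path}",
        "read the response and close the connection",
    ]

steps = plan_fetch("http://www.example.com/index.html")
for step in steps:
    print(step)
```

Each of those steps is itself several messages on the wire, which is why one click turns into so much traffic.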

1.2 Requirements
  • Building the internet involves trying to please almost everyone.
  • Application programmers (speed), network designers (efficiency), and the network providers (administration) all have different goals.

1.2.1 Connectivity
  • Scale is a popular buzzword. It means to support growth and typically refers to a system built to grow-and-grow-and-grow-and-grow. The internet scales as new users and systems are added to it.
  • A link is a physical connection between computers (a cable, fiber, or wireless). A link shared by several machines is called multiple-access. When machines are instead connected indirectly, through nodes that forward data from one link to another, they form a switched network.
  • There are two main types of switched networks, circuit-switched and packet-switched. If you've ever watched Mad Men you'll occasionally see circuit-switched networks. Way-back-when, a telephone operator would physically move widgets and doodads in order to get your call through. I wasn't alive then, but that's how I imagine it.
  • Packet-switched refers to a trickier concept. It's where packets (blocks of data) are passed from machine to machine, at times using shared resources and links. Since not all of the data can flow at once, the packets "take turns."
  • A packet-switched network uses a store-and-forward strategy. A machine will first save the data and then pass it to the next machine. In your office you may have a piece of hardware called a switch whose primary function is to do this well.
  • The process of determining where certain packets go is called routing.
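To make store-and-forward a little more concrete, here's a minimal sketch of a toy switch. Everything in it is an assumption for illustration: the `Switch` class, the hand-filled `routes` table, and the tiny buffer size. Real routing is the job of building that table; this only looks things up in it.

```python
from collections import deque

class Switch:
    """Toy store-and-forward switch with a bounded buffer (hypothetical)."""

    def __init__(self, routes, buffer_size=3):
        self.routes = routes          # destination -> outgoing link name
        self.buffer = deque()
        self.buffer_size = buffer_size
        self.dropped = []

    def receive(self, packet):
        # "Store": keep the whole packet until it's its turn to leave.
        if len(self.buffer) >= self.buffer_size:
            self.dropped.append(packet)   # no room, so the packet is lost
            return False
        self.buffer.append(packet)
        return True

    def forward(self):
        # "Forward": send the oldest stored packet out the right link.
        if not self.buffer:
            return None
        packet = self.buffer.popleft()
        return self.routes[packet["dst"]], packet

switch = Switch({"A": "link1", "B": "link2"}, buffer_size=2)
for i, dst in enumerate(["A", "B", "A"]):
    switch.receive({"id": i, "dst": dst})

print(switch.forward())   # oldest packet leaves first
print(len(switch.dropped), "packet(s) dropped")
```

With room for only two packets, the third arrival has nowhere to wait and is discarded.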

1.2.2. Cost-Effective Resource Sharing
  • Sharing a common resource such as a link is called multiplexing. It's how data from several machines can somehow "fit" on the same wire. De-multiplexing is how the same data makes it off of the wire and to its destination.
  • There are several ways to share a resource. The time-share way is called synchronous time-division multiplexing (STDM). As you can imagine, it is sharing based on predetermined time-slots. This can be inefficient when the time-slots are not being used.
  • Frequency division multiplexing (FDM) transmits data at different frequencies over the same resource. I believe this is how we can get cable TV, internet, and phone through the same coaxial cable.
  • When computers talk to each other, the data is sent in smaller blocks of data called packets. Occasionally, packets are lost or dropped when too much data is passing through a switch. Remember store-and-forward? When something (like a packet) can't be stored, it's good-bye!
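A small sketch of why fixed time-slots can be wasteful, assuming three made-up senders (A, B, C) and a hypothetical `stdm_schedule` function. Each sender owns one slot per round whether or not it has anything to say.

```python
def stdm_schedule(queues, rounds):
    """Synchronous TDM: each sender owns one fixed slot per round."""
    senders = sorted(queues)
    timeline = []
    for _ in range(rounds):
        for sender in senders:
            pending = queues[sender]
            # The slot belongs to this sender whether or not it has data.
            timeline.append(pending.pop(0) if pending else None)
    return timeline

queues = {"A": ["a1", "a2", "a3"], "B": [], "C": ["c1"]}
timeline = stdm_schedule(queues, rounds=3)
print(timeline)
idle = timeline.count(None)
print(f"{idle} of {len(timeline)} slots went unused")
```

Sender A could have finished sooner if it were allowed to borrow the slots that B and C left empty, which is the motivation for sharing on demand instead of on a fixed schedule.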

1.2.3 Reliability
  • There are three classes of network issues: bit errors (0s turning into 1s and vice versa), lost packets, and failures of a node or link (a cut cable, or a machine hitting the blue-screen-of-death).
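As an illustration of how a bit error can be noticed at all, here's a minimal sketch using a single even-parity bit. Parity is my own pick for the example because it's the simplest check I know of; it isn't taken from these notes, and the function names are hypothetical.

```python
def add_parity(bits):
    """Append an even-parity bit so the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def looks_intact(frame):
    """True if the frame still has an even number of 1s."""
    return sum(frame) % 2 == 0

frame = add_parity([1, 0, 1, 1, 0, 0, 1])
corrupted = frame.copy()
corrupted[2] ^= 1            # a single bit flips in transit

print(looks_intact(frame))
print(looks_intact(corrupted))
```

One parity bit only catches an odd number of flipped bits; two flips cancel each other out and slip through, so real protocols use stronger checks.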

1.3 Network Architecture
  • A network architecture (blueprint) helps network designers plan and implement a network.

1.3.1 Layering and Protocols
  • Layering allows you to modify or add functionality at a specific layer.
  • A protocol provides a communication service that things (higher-level objects, such as applications or other protocols) can use.
  • A protocol has two parts/interfaces (a service and a peer interface).
  • The service interface defines how software on the same computer can use the protocol.
  • The peer interface defines how two different machines can use the protocol to communicate.
  • A protocol specification defines how a protocol is designed and implemented.
  • Interoperate is a term used to describe two protocol modules that both implement the same specification accurately, which is what allows them to communicate.

1.3.1 Encapsulation
  • As there are different layers, or network modules, there are specific ways to handle the piece of data that is being transferred. Encapsulation is the process where each layer wraps the data handed down to it with its own header and treats that data as an opaque blob, so each module only looks at the part that's useful to it.
  • There are two main parts, the header and the payload.
  • The header is like a mailing label: it carries just enough information to know where a packet is heading and where it came from.
  • The payload is the data itself.
  • Encapsulation is a recursive process where each layer can wrap and unwrap a packet.
  • The demultiplexing key (demux key) seems to be an identifier carried in the header that tells the receiving side which higher-level protocol or application a message should be handed to. So it's about delivering to the right recipient, not about disassembling and reassembling packets.
  • ISO is a common term: the International Organization for Standardization, often called the International Standards Organization.
  • They helped define a common way to connect computers using an architecture called Open Systems Interconnection (OSI).
  • It's more or less a 7-layer system (application [...] physical)
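The header/payload wrapping described above can be sketched as nested dictionaries. This is only a toy under my own assumptions: the layer names, the demux key values, and the functions `encapsulate` and `decapsulate` are all made up for illustration, and real headers are packed bytes, not dictionaries.

```python
def encapsulate(payload, layers):
    """Wrap the payload top-down; each layer adds its own header."""
    message = payload
    for name, demux_key in layers:
        # The demux key says who should get the payload on the other side.
        message = {"header": {"layer": name, "demux_key": demux_key},
                   "payload": message}
    return message

def decapsulate(message):
    """Unwrap bottom-up, recording the header each layer sees."""
    seen = []
    while isinstance(message, dict) and "header" in message:
        seen.append(message["header"])
        message = message["payload"]
    return seen, message

# Listed in the order the message travels down the sender's stack.
layers = [("transport", "web-app"), ("internet", "transport"), ("link", "internet")]
packet = encapsulate("GET /index.html", layers)
headers, data = decapsulate(packet)
print([h["layer"] for h in headers])
print(data)
```

The receiver peels the headers off in the opposite order they were added, and each layer only ever reads its own header.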

1.3.3 Internet Architecture
  • The internet architecture is often called the TCP/IP architecture after its two main protocols.
  • The protocols, when illustrated top to bottom, form an hourglass-shaped figure where everything is funneled through a common IP layer.
  • The protocol modules at the top (application) and bottom (network) may be interchanged, while IP is necessary for any internet traffic.
  • Interestingly enough, the layering is not strict and modules may leap-frog over a module... An application may use IP directly.

1.4.1 Application Programming Interface (Sockets)
  • The Operating System (OS) provides a way to hook into and use network protocols. It's called an Application Programming Interface (API), and the socket interface is the best-known one.
  • A protocol is a provider -- of a service -- and the API (here, sockets) is the syntax that the OS provides for using it.
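Here's a minimal sketch of the socket calls in action, using Python's standard `socket` module on the loopback address so nothing leaves the machine. The little echo server, the helper name `echo_once`, and the message are my own assumptions for the example.

```python
import socket
import threading

def echo_once(server_sock):
    """Accept one connection and send back whatever arrives."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

# Server side: create, bind, and listen. Port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=echo_once, args=(server,))
worker.start()

# Client side: connect, send, receive.
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"hello, network")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply)
```

Notice that the application never mentions packets, headers, or routing. It asks the OS for a socket and the protocols underneath do the rest.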

1.4.3 Protocol Implementation Issues
  • There are 2 main ways that an Operating System allocates resources for a network subsystem.
  • Process-per-protocol involves dedicating a process/thread to each layer or protocol module that's involved. A message is handed from one process to the next as it moves through the stack, which costs a context switch at every layer.
  • A protocol module is a subsystem that handles a particular protocol (TCP, IP, UDP, etc).
  • Process-per-message involves allocating a process/thread to the message and having it carry the message through the entire network stack, with each layer reached by an ordinary procedure call.
  • [??] I am confused about what this all means.
  • The OS maintains an abstract data type for the message (a message buffer) as it is handled by the system. This lets each layer add or strip its header without the OS having to copy the message as it moves up and down the network stack.
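Since the two process models are easy to mix up, here's a rough sketch of both, assuming Python threads stand in for OS processes. The two "layers" just prepend a label, and every name here is hypothetical.

```python
import queue
import threading

def tcp_layer(msg):
    return "TCP|" + msg

def ip_layer(msg):
    return "IP|" + msg

LAYERS = [tcp_layer, ip_layer]   # order a message travels on the way down

def process_per_message(msg):
    """One thread of control carries the message through every layer."""
    for layer in LAYERS:
        msg = layer(msg)          # a plain procedure call per layer
    return msg

def process_per_protocol(msg):
    """Each layer runs in its own thread; messages hop between inboxes."""
    inboxes = [queue.Queue() for _ in LAYERS] + [queue.Queue()]

    def run(layer, inbox, outbox):
        outbox.put(layer(inbox.get()))   # each handoff is a context switch

    workers = [
        threading.Thread(target=run, args=(layer, inboxes[i], inboxes[i + 1]))
        for i, layer in enumerate(LAYERS)
    ]
    for w in workers:
        w.start()
    inboxes[0].put(msg)
    result = inboxes[-1].get()
    for w in workers:
        w.join()
    return result

print(process_per_message("hello"))
print(process_per_protocol("hello"))
```

Both produce the same wrapped message. The difference is in how the work is scheduled: the first is a chain of function calls, the second pays for a handoff between threads at every layer.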

1.5 Performance
  • Network designers build for performance.
  • Performance is measured by bandwidth and latency.
  • Bandwidth (heavily related to throughput) is how much data can be transmitted in a given amount of time, usually stated in bits per second.
  • Bandwidth is constantly improving as technology improves. More bits (1s and 0s) can fit on the same wire.
  • Latency (delay) is how long it takes data to reach its destination.
  • It's often more useful to measure latency by its round-trip time (RTT). This is how long data takes to reach its destination and come back.
  • Bandwidth is more significant than latency with larger file sizes. Latency is more significant when fewer packets are being transferred.
  • The specific amount of time that it takes to send a signal from one end to the other is called the propagation-delay.
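To see when bandwidth matters and when latency matters, here's a back-of-the-envelope sketch. It assumes a transfer costs one round trip plus the time to push the bits onto the wire, and it ignores queuing and protocol overhead. The sizes, speeds, and the function name `transfer_time` are illustrative assumptions.

```python
def transfer_time(size_bytes, bandwidth_bps, rtt_s):
    """Rough time to fetch an object: one round trip plus transmit time.

    Ignores queuing delay and protocol overhead.
    """
    transmit = size_bytes * 8 / bandwidth_bps
    return rtt_s + transmit

small = 1                   # a 1-byte message
large = 25 * 10**6          # a 25 MB file
for label, size in [("1 byte", small), ("25 MB", large)]:
    for bw in (1e6, 100e6):
        for rtt in (0.001, 0.100):
            t = transfer_time(size, bw, rtt)
            print(f"{label:>6} at {bw/1e6:>5.0f} Mbps, RTT {rtt*1000:>3.0f} ms: {t:.4f} s")
```

For the 1-byte message, a hundred times more bandwidth barely changes anything, while the RTT decides almost the whole total. For the 25 MB file it's the other way around.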

1.5.2 Delay x Bandwidth Product
  • If the delay is how long, and bandwidth is how much at a time, then the product is the volume (of the channel/pipe).
  • [??] I'm not sure if I understand the significance.
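One way to think about the significance: the product is how many bits are "in flight" inside the pipe at any moment. Here's a small sketch with numbers I picked for illustration (a 50 ms one-way delay on a 45 Mbps link); the function name is hypothetical.

```python
def bits_in_flight(delay_s, bandwidth_bps):
    """Delay x bandwidth: how many bits fit in the pipe at once."""
    return delay_s * bandwidth_bps

one_way_bits = bits_in_flight(0.050, 45e6)
round_trip_bits = bits_in_flight(2 * 0.050, 45e6)

print(f"one way:    {one_way_bits / 8 / 1000:.0f} KB in the pipe")
print(f"round trip: {round_trip_bits / 8 / 1000:.0f} KB sent before any reply can arrive")
```

As far as I can tell, this matters to protocol design because a sender that wants to keep the pipe full has to push out that much data before it hears anything back. If it stops and waits sooner, part of the link sits idle.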

1.5.4 Application Performance Needs
  • Some apps, such as video, can state a ceiling on how much bandwidth is needed.
  • Video can be compressed and only the bits that change from frame to frame can be sent.
  • Jitter describes the variation in latency from packet to packet, which shows up as a changing interpacket gap.
  • If you know the upper and lower bounds of latency on a network, you can delay the start of a video just long enough that it streams without any hiccups.
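A minimal sketch of that playback-delay idea, assuming made-up latencies in milliseconds and hypothetical function names. The player waits for the difference between the worst and best latency before starting, then plays frames at the same spacing they were sent.

```python
def playout_delay(min_latency_ms, max_latency_ms):
    """Extra wait before playback so the slowest packet still arrives in time."""
    return max_latency_ms - min_latency_ms

def play_times(send_times_ms, latencies_ms, start_delay_ms):
    """Return (arrival, scheduled playback) pairs for each packet."""
    first_arrival = send_times_ms[0] + latencies_ms[0]
    schedule = []
    for sent, latency in zip(send_times_ms, latencies_ms):
        arrival = sent + latency
        # Play each frame at the same spacing it was sent with.
        play_at = first_arrival + start_delay_ms + (sent - send_times_ms[0])
        schedule.append((arrival, play_at))
    return schedule

send_times = [0, 33, 66, 99]          # one frame roughly every 33 ms
latencies = [50, 90, 60, 120]         # all within the 50..120 ms bounds
delay = playout_delay(50, 120)

buffered = play_times(send_times, latencies, delay)
unbuffered = play_times(send_times, latencies, 0)
print("late frames with buffer:   ", sum(a > p for a, p in buffered))
print("late frames without buffer:", sum(a > p for a, p in unbuffered))
```

The cost is that the video starts a little later; the benefit is that no frame shows up after its turn to be played.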

1.6 Summary
  • Networks must be cost effective.
  • The layered internet architecture defines a blueprint and the protocols are the means of communication.
  • The socket interface is the interface between applications and the OS networking subsystem.
  • Networks have an emphasis on high-performance. Delay * Bandwidth is important in protocol design.