Raptor Codes - Part 1
These notes describe the theory, design, and practical realization of Raptor Codes.
Data Transmission
-
Digital media have become an integral part of our lives. Whether surfing the web, making a wireless phone call, watching a satellite TV, or listening to digital music, a large part of our professional and leisure time is filled with all things digital.
-
The replacement of analog media by their digital brethren and the explosion of the Internet use has had a perhaps unintended consequence.
- Whereas, the analog media were previously replaced by digital media to preserve quality, the existence of high-speed computer networks makes digital media available to potentially everyone, anywhere, and at any time.
- This possibility is the basis for modern scientific and economic developments centered around the distribution of digital data to a worldwide audience.
-
Reliable transport of digital data to heterogeneous clients becomes thus a central and at times a critical issue. Receivers can be anywhere and they may be connected to networks with widely differing fidelities.
The Transmission Control Protocol
-
The basic transmission protocol used by any internet transmission is the Internet Protocol, commonly known as IP.
- The data to be transmitted is divided into packets; these packets are given headers with information pertaining to their origin and their destination; pretty much like sending a regular letter, where we put addresses of the receiver and that of the sender on the envelope.
- Routers that take the role of mail stations inspect the headers and forward the packets to another router closer to the destination.
- To do this, they consult regularly updated routing tables, through which they can determine the shortest path between them and the destination.
- Eventually, following the path from one router to another, packets may be delivered to their destinations.
-
In theory, this protocol is sufficient for data delivery; however. the reality looks different.
- Routers tend to get overwhelmed at times by incoming traffic, leading them to drop some of the incoming packets.
- These dropped packets will never reach their destination.
- To overcome this problem, researchers proposed already in the early days of the Internet the "Transmission Control Protocol", or TCP.
- TCP withstood the test of time, as it remains the most widely used transmission protocol on the Internet.
-
We give here a very simplified description which has the advantage of clarifying the main mechanisms behind the real TCP.
- In effect, for every packet sent, an acknowledgment is expected from receiver.
- If the acknowledgement is not received after a prescribed period of time, the packet is considered lost and countermeasures are initiated, with the most basic of these countermeasures consisting of resending the missing packet.
- The other integral part of these countermeasures is the reduction of transmission rate, which is done in the following way: the real TCP does not await acknowledgements of individual packets before sending the next one, but instead has at any time a number of packets in transit.
- The acknowledgment of packet is only expected after all packets sent previous to the packet have been acknowledged.
- When a packet is lost, detected at the server by received acknowledgments of packets sent after the lost packet, the number of packets allowed to be in transit is reduced, which effectively reduces the rate at which packets are sent to the receiver and provides a rate control mechanism.
The User Datagram Protocol
- Another transmission protocol, "User Datagram Protocol". Originally this protocol was envisioned for short messages without strict reliability requirements.
- Essentially it is analogous to sending mail through the postal service: each UDP packet contains a source address and a destination address, and the packet is routed through the network toward its destination without any guarantees that it will arrive; the packet may, e.g., be lost en route due to a router buffer overflow, or to a wireless transmission error, etc.
- Furthermore, each UDP packet is sent independently of all other UDP packets, and a source may sent a sequence of UDP packets at an arbitrarily high rate that can easily overload the network. e
- Thus UDP lacks TCP's higher level reliability and rate control mechanisms.
- Because of this, delivery protocols that use UDP are sometimes blocked by firewalls from entering corporate networks.
- Nevertheless, in some situations using UDP can be quite useful.
- For example, the destination address of a UDP packet can be a group address instead of an individual receiver address.
- Thus, any receiver that is part of the group can receive UDP packets sent to that group, thus enabling UDP to be used effectively in conjunction with a broadcast or multicast enabled network in a scalable way.
- For example, broadcast file delivery and broadcast streaming applications often use UDP because the sent packets can be delivered concurrently to many receivers in a scalable way.
- In these types of applications, the packet sending rate is fixed at the source according to the available capacity of the network and/or requirements of the application.
- In these types of applications, adding a reliability protocol on top of UDP can be quite valuable, and providing this reliability is one of the main applications of Raptor Codes.
Point-to-point Transmission
- The simplest transmission scenario is point-to-point transmission.
- Here, a sender transmits data to one receiver, as described in the following figure in which a sender
S
is transmitting data to a receiverR
.
- If the distance between the sender and the receiver is not too large, then TCP is a perfect transmission protocol.
- However, if the distance is large, then TCP exhibits inefficiencies: during the time in which acknowledgements are awaited, transmission is in an idle mode and hence the real capacity of the network may not be achieved.
- The situation is compounded when there is loss on the network, i.e., the TCP transmission rate slows down even more.
Point-to-multipoint Transmission
- The second transmission scenario is the point-to-multipoint transmission.
- The situation is described in the figure below, in which a sender is transmitting data to receivers .
- A typical example is distributing live video over the Internet.
- Unless the number of receivers is small, TCP turns out to have some scaling issues in this setting.
- The reason is that the sender needs to keep track of the reception of every individual receiver, and furthermore that each receiver needs to be sent a separate stream of data.
- Therefore, the server load and the network load increase with the number of receivers, and reliable transmission becomes more challenging.
- Ironically, the more popular the content, the more difficult it becomes to deliver it to all receivers.
- This phenomenon that is typically referred to as the "curse of popularity" makes it difficult to provide a scalable broadcast service on the Internet.
Multipoint-to-point Transmission
- Another scenario is multipoint-to-point transmission.
- Here, a group of senders each possessing a copy of the same data, wants to transmit this copy to one receiver.
- The following figure shows an example in which senders are transmitting to a common receiver .
- In addition to problems discussed in the case of point-to-point transmission, multipoint-to-point solutions based on TCP lead to enormous inefficiencies: the packets received from various senders may be identical if senders are not coordinated.
- The reception of duplicate packets is one of the principal sources of inefficient network usage in this case.
- The fountain code properties make Raptor codes ideally suited for basing efficient solutions to the multipoint-to-point transmission protocols, where either UDP or TCP may be the underlying protocol onto which the usage of Raptor codes is layered.
Multipoint-to-multipoint Transmission
- Another scenario is multipoint-to-multipoint transmission, depicted in the following figure in which we have a group of senders denoted , each processing a piece of data and a group of receivers , each of which connects to a subset of the senders and receives the data.
- A good example, of this transmission scenario is a peer-to-peer network.
- All the problems discussed for the previous three transmission scenarios are also valid here.
- These problems are compounded if senders and receivers are transient, as in the case in a larger peer-to-peer network.