Adaptive streaming

The most common way to deliver online content is via adaptive streaming over HTTP.

Adaptive streaming is a client/server-based system for delivering content over the Internet that avoids some of the disadvantages of progressively downloading an entire file for playback. With adaptive streaming, the content is logically segmented and delivered in small chunks, which allows, for example, a quick start-up time. The client can also adjust to changes in the available network bandwidth during presentation, so that the content plays as smoothly as possible for the end user, without rebuffering.

A media presentation that is prepared for adaptive streaming is normally encoded in multiple ways. For example, the same content is encoded at different bit rates to provide versions that require more or less bandwidth. As another example, the content can be encoded with different codecs and configurations, such as Dolby Digital Plus surround sound or stereo AAC for audio tracks. Additional or alternative content (for example, audio in additional languages) can also be encoded as part of the media presentation. All versions of the content are made available on the server and are described in the manifest file.

To prepare content for adaptive streaming, the content is encoded, packaged in a container format (such as a fragmented ISO base media file or an MPEG-2 transport stream), and divided into small segments (literally or virtually). Meanwhile, one or more manifests or playlist files are generated that describe the content alternatives available for a media presentation, the URL of each segment, and other relevant information. Based on this information, a playback client checks the delivery conditions (such as available bandwidth and client capabilities) and requests the most appropriate version of the content.
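The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Representation class, its field names, the codec labels, and the bit rates are hypothetical stand-ins for whatever a real manifest format provides, and a real client would weigh more factors than bandwidth and codec support.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One encoded version of the content, as a manifest might describe it (hypothetical)."""
    rep_id: str
    bandwidth_bps: int  # bit rate needed to stream this version
    codec: str          # illustrative codec label


def select_representation(representations, available_bps, supported_codecs):
    """Pick the highest-bit-rate version the client can decode and sustain."""
    playable = [r for r in representations if r.codec in supported_codecs]
    if not playable:
        raise ValueError("no representation uses a supported codec")
    affordable = [r for r in playable if r.bandwidth_bps <= available_bps]
    if affordable:
        return max(affordable, key=lambda r: r.bandwidth_bps)
    # Nothing fits the measured bandwidth: fall back to the lowest bit rate.
    return min(playable, key=lambda r: r.bandwidth_bps)


# Illustrative audio alternatives; the values are assumptions, not recommendations.
AUDIO = [
    Representation("stereo-low", 64_000, "aac"),
    Representation("stereo-high", 128_000, "aac"),
    Representation("surround", 192_000, "ec-3"),
]

choice = select_representation(AUDIO, available_bps=150_000, supported_codecs={"aac"})
print(choice.rep_id)
```

Falling back to the lowest bit rate when nothing fits is one possible policy; a client could instead wait or report an error.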

The diagram provides an overview of the content preparation process.

Figure: Adaptive streaming content preparation overview

During playback, the client monitors delivery conditions and can switch to a different version of the content. Because the segments are short, the client can respond quickly to changes in bandwidth or other conditions and switch to a different encoding of the content with little or no interruption to playback.
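One way a client might monitor bandwidth and decide when to switch is sketched below. It assumes an exponentially weighted moving average over per-segment download measurements and a fixed safety margin; both are common choices but are assumptions here, as are the class name, function name, and parameter values.

```python
class ThroughputEstimator:
    """Smooths per-segment throughput samples with an exponential moving average."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha        # weight given to the newest sample (assumed value)
        self.estimate_bps = None  # no estimate until the first segment arrives

    def add_sample(self, segment_bytes, download_seconds):
        """Record one finished segment download and return the updated estimate."""
        sample_bps = segment_bytes * 8 / download_seconds
        if self.estimate_bps is None:
            self.estimate_bps = sample_bps
        else:
            self.estimate_bps = (
                self.alpha * sample_bps + (1 - self.alpha) * self.estimate_bps
            )
        return self.estimate_bps


def next_bitrate(ladder_bps, estimate_bps, safety=0.8):
    """Choose the highest bit rate that fits within a fraction of the estimate."""
    budget = estimate_bps * safety
    fitting = [b for b in sorted(ladder_bps) if b <= budget]
    # If even the lowest bit rate exceeds the budget, use the lowest anyway.
    return fitting[-1] if fitting else min(ladder_bps)


ladder = [500_000, 1_500_000, 3_000_000]  # illustrative bit rates in bits per second
estimator = ThroughputEstimator(alpha=0.5)
estimator.add_sample(segment_bytes=1_000_000, download_seconds=2.0)
estimator.add_sample(segment_bytes=250_000, download_seconds=2.0)
print(next_bitrate(ladder, estimator.estimate_bps))
```

Because the decision is re-evaluated after every segment, shorter segments give the client more frequent opportunities to switch.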

The ability to switch between different content streams also allows delivery of alternative versions of the same content (for example, to switch to other camera angles for video, or to an alternative language for the audio).

To take advantage of adaptive streaming and provide a good user experience, a playback product must implement some fundamental heuristics. These include fast start-up, the highest-quality presentation that the available bandwidth allows, and smooth adaptation to available (and potentially varying) network bandwidth. For example, a client may start playback with the lowest quality available, which provides the fastest start-up time. During the initial download, the client should estimate network conditions so that it can switch to a higher quality that the current network bandwidth supports. As another example, to seek to an arbitrary point in the audio/video content, the client should determine which media segment covers the corresponding time interval and request that segment immediately. Furthermore, a client may use more metrics than just available bandwidth. It may take into account other resources, such as available CPU processing power (or dedicated hardware support), to avoid stuttering during playback even when the bandwidth would allow streaming a higher bit-rate version of the content.
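The seek case can be illustrated with a short Python sketch. It assumes that every segment has the same fixed duration and that segment URLs follow a numbered template; real manifests may describe segment timing and addressing differently. The function names, the URL template, and the example host are hypothetical.

```python
def segment_for_time(seek_seconds, segment_duration, total_duration):
    """Return (index, start, end) of the segment that covers seek_seconds.

    Assumes fixed-duration segments with zero-based indexes.
    """
    if not 0 <= seek_seconds < total_duration:
        raise ValueError("seek position is outside the presentation")
    index = int(seek_seconds // segment_duration)
    start = index * segment_duration
    end = min(start + segment_duration, total_duration)
    return index, start, end


def segment_url(template, rep_id, index):
    """Fill a hypothetical URL template; segment numbers start at 1 here."""
    return template.format(rep_id=rep_id, number=index + 1)


# Seek to 2 min 5 s in a 10-minute presentation cut into 4-second segments.
index, start, end = segment_for_time(125.0, segment_duration=4.0, total_duration=600.0)
url = segment_url("https://example.com/{rep_id}/seg-{number}.m4s", "video-low", index)
print(index, start, end, url)
```

Requesting the lowest-quality version of the target segment first, as in the usage lines above, mirrors the start-up heuristic: playback resumes quickly, and the client can switch to a higher quality once it has a bandwidth estimate.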

The diagram provides an overview of the playback process.

Figure: Adaptive streaming playback overview
