Adaptive streaming

Adaptive streaming over HTTP is the most common way to deliver online media content. A multimedia presentation that is prepared for adaptive streaming is normally encoded in several versions (for example, at different bit rates). Additional or alternative content can also be encoded (for example, to provide audio in additional languages).

To prepare content for adaptive streaming, the content is encoded, packaged in a container format (such as an .mp4 file or an MPEG-2 transport stream), and segmented. One or more files that instruct a client application how to download and play back the content must also be created. These manifest or playlist files include instructions for the client to play back any of the available content that forms a media presentation.
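As a rough illustration of what such a manifest can contain, the following sketch writes a minimal HLS-style master playlist that lists three encodings of the same content. HLS is used here only as one familiar playlist format. The `Rendition` class, the bit rates, and the file paths are hypothetical, and a real packager normally emits more attributes (such as codecs and audio groups).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    """One encoding of the content (hypothetical structure for illustration)."""
    bandwidth_bps: int  # peak bit rate advertised to the client
    width: int
    height: int
    uri: str            # location of this rendition's own media playlist


def build_master_playlist(renditions: list[Rendition]) -> str:
    """Return an HLS-style master playlist that lists every rendition."""
    lines = ["#EXTM3U"]
    # List the renditions from lowest to highest bit rate.
    for r in sorted(renditions, key=lambda r: r.bandwidth_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        lines.append(r.uri)
    return "\n".join(lines) + "\n"


# Example values only; they do not describe any real service.
RENDITIONS = [
    Rendition(800_000, 640, 360, "360p/index.m3u8"),
    Rendition(5_000_000, 1920, 1080, "1080p/index.m3u8"),
    Rendition(2_500_000, 1280, 720, "720p/index.m3u8"),
]

if __name__ == "__main__":
    print(build_master_playlist(RENDITIONS), end="")
```

Each entry points to a separate playlist for one encoding, which is what lets a client pick a version first and fetch its segments afterwards.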

Your product may perform all of these functions, or other products may perform some of them, depending on your workflow. For example, your product may encode and multiplex the content while a third-party segmenter performs the segmentation. The following diagram gives an overview of the content preparation process.

Figure 1. Adaptive streaming content preparation overview

In an adaptive streaming delivery system, several versions of the content are made available on the server. For example, content is encoded at several quality levels to provide versions that require more or less bandwidth. Content is divided into small segments (literally or virtually). The client checks the delivery conditions (such as available bandwidth and client capabilities) and requests the most appropriate version of the content.
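The selection step can be sketched as follows. This is a minimal illustration, not the logic of any particular player: the `Version` type, the `choose_version` function, and the 0.8 safety margin are assumptions made for the example. The client's capability is modeled as a maximum picture height it can display.

```python
from typing import NamedTuple, Sequence


class Version(NamedTuple):
    """One available encoding (hypothetical structure for illustration)."""
    bitrate_bps: int
    height: int


def choose_version(
    versions: Sequence[Version],
    bandwidth_bps: float,
    max_height: int,
    safety: float = 0.8,
) -> Version:
    """Pick the highest bit rate that fits the bandwidth and the device."""
    # Client capabilities: ignore versions the device cannot display.
    usable = [v for v in versions if v.height <= max_height]
    if not usable:
        raise ValueError("no version fits the client's capabilities")

    # Delivery conditions: leave headroom below the measured bandwidth.
    budget = bandwidth_bps * safety
    affordable = [v for v in usable if v.bitrate_bps <= budget]
    if affordable:
        return max(affordable, key=lambda v: v.bitrate_bps)

    # Nothing fits the budget, so fall back to the lowest usable bit rate.
    return min(usable, key=lambda v: v.bitrate_bps)


# Example values only.
VERSIONS = [
    Version(800_000, 360),
    Version(2_500_000, 720),
    Version(5_000_000, 1080),
]

if __name__ == "__main__":
    print(choose_version(VERSIONS, bandwidth_bps=4_000_000, max_height=1080))
```

With roughly 4 Mbit/s available, the sketch selects the 2.5 Mbit/s version, because the 5 Mbit/s version exceeds the budget after the safety margin is applied.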

During playback, the client monitors delivery conditions and can switch to a different version of the content. Because the segments are small, the client can switch to a different encoding at a segment boundary with little or no interruption to playback.
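A simple way to picture this monitoring is a loop that re-evaluates the bit rate before each segment request. The sketch below simulates such a loop. It assumes the client smooths its throughput measurements with an exponentially weighted moving average, which is one common heuristic rather than a method prescribed here. The function name, the smoothing factor `alpha`, and the safety margin are hypothetical.

```python
from typing import List, Optional, Sequence


def simulate_playback(
    bitrates_bps: Sequence[int],
    throughput_samples_bps: Sequence[float],
    alpha: float = 0.5,
    safety: float = 0.8,
) -> List[int]:
    """Return the bit rate chosen for each segment of a simulated session.

    throughput_samples_bps[i] is the throughput observed while segment i
    was downloading; it only influences the choice for later segments.
    """
    ladder = sorted(bitrates_bps)
    estimate: Optional[float] = None
    choices: List[int] = []

    for sample in throughput_samples_bps:
        if estimate is None:
            # No measurement yet, so start with the lowest bit rate.
            current = ladder[0]
        else:
            budget = estimate * safety
            fitting = [b for b in ladder if b <= budget]
            current = fitting[-1] if fitting else ladder[0]
        choices.append(current)

        # Fold the throughput seen during this download into the estimate.
        if estimate is None:
            estimate = sample
        else:
            estimate = alpha * sample + (1 - alpha) * estimate

    return choices


if __name__ == "__main__":
    ladder = [800_000, 2_500_000, 5_000_000]
    samples = [8e6, 8e6, 1e6, 1e6, 8e6]  # example measurements only
    print(simulate_playback(ladder, samples))
```

Because each decision applies to a whole segment, a switch always lands on a segment boundary. In the example, the simulated client steps down gradually as throughput drops instead of stalling.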

The ability to switch between different content streams also allows delivery of alternative versions of the same content (for example, to switch to other camera angles for video, or to an alternative language for the audio).