Simplify the Cloud: Applying the Policy Continuum to Unified Communications

The policy continuum concept is an important component of any policy-based management system. It states that policies can be defined at different levels of abstraction and that policies must be mapped and reconciled between those levels of abstraction. In an earlier post, I showed how the policy continuum concept can be applied to cloud services by using TOSCA to model service topologies and associated policies at various levels of abstraction and then recursively mapping these models into increasingly lower levels of abstraction, until they can ultimately be deployed onto available resources.

To further illustrate the policy continuum concept, I’ll use performance and quality management of Real-Time Media applications (such as Unified Communications, Video Conferencing, Video Streaming, etc.) as an example. In the discussion, I’ll use the following five levels of abstraction (as proposed by John Strassner in his DEN-ng work):

Business view
System view
Administrator view
Device view
Instance view

As we’ll see, managing performance of Real-Time Media applications isn’t as simple as just “turning on QoS”. Instead, different performance-related concepts are used at different levels of abstraction, and these concepts will ultimately cascade down into low-level QoS configurations on various network devices.

Business View

At the highest level of abstraction, the business view expresses policies in terms of business goals. These goals are typically expressed as Service Level Objectives (SLOs) that must be met by the service. For real-time media applications, SLOs will focus on the audio and video experience that must be delivered to the end user. This experience can generally be expressed along the following dimensions:

Fidelity: This dimension represents the inherent quality of the media streams. For example, video “fidelity” can be high-definition, standard definition, or lower. Video fidelity can be expressed in terms of video resolution and frame-rate (for example 1080p video at 30 frames per second).
Interactivity: This dimension focuses on how well the service enables interactive communications. High levels of interactivity are required for VoIP or video conferencing, but not for streaming video. Other applications may fall somewhere in between. For example, moderated interactive meetings that adopt a “pass the baton” model of floor control do not have the same interactivity requirements as “free-for-all” interactions. The level of interactivity can be expressed in terms of end-to-end delays in the media streams (typically measured in milliseconds). If audio and video streams experience large delays, people in the call will start talking over one another and interactivity will be limited.
Service “grade”: This dimension focuses on any noticeable artifacts or disruptions in the audio or video streams (typically as a result of lost or late packets). Whereas people will tolerate a certain amount of video artifacts or disruptions with consumer-grade video products, business-grade services are expected to deliver a more pristine experience. For video, this requirement can generally be expressed in terms of the percentage of video frames that are corrupted.

Note that while these SLOs can be expressed using quantifiable metrics, business-view policies generally do not use static numbers. As an example, consider the case of an enterprise video conferencing service. While the service might be configured to deliver high-definition video, the corresponding video resolution might differ depending on the users involved in the call, the location of those users, the time of day, etc. For example, the high-definition requirement may translate into 1080p video for executives and board meetings, but regular users may be limited to 720p. In addition, 1080p might be OK for calls within the continental US, but for cost reasons trans-Pacific calls may always be limited to 720p or lower, independent of the user.

Business-level policies for real-time media dynamically produce quantifiable metrics for fidelity, interactivity, and “grade” of service.
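
As a rough illustration of such a dynamic business-level policy, the sketch below resolves the "high-definition" goal into a concrete resolution based on the call context. The names (CallContext, resolve_fidelity), the role labels, and the list of resolutions are hypothetical; the rules simply mirror the executive, regular-user, and trans-Pacific examples above.

```python
from dataclasses import dataclass

# Hypothetical ordering of fidelity levels, lowest to highest.
RESOLUTIONS = ["480p", "720p", "1080p"]


@dataclass(frozen=True)
class CallContext:
    role: str            # illustrative role names: "executive" or "regular"
    trans_pacific: bool  # True if the call crosses the Pacific


def resolve_fidelity(ctx: CallContext) -> str:
    """Turn the 'high-definition' business goal into a concrete resolution."""
    # Executives get 1080p; everyone else is limited to 720p.
    resolution = "1080p" if ctx.role == "executive" else "720p"
    # Cost rule: trans-Pacific calls are capped at 720p, independent of the user.
    if ctx.trans_pacific:
        resolution = min(resolution, "720p", key=RESOLUTIONS.index)
    return resolution
```

The same business goal therefore yields different numbers for different calls, which is why the policy is expressed as rules rather than as a fixed value.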

System View

The system view describes the service in terms of the architectural components that make up the service functionality. It does so using descriptions that are largely technology-independent. For example, a video conferencing service could be modeled as a number of media servers hosted in a cloud data center and accessed from enterprise endpoints using one or more access networks.
Using this system view of the service, it is then possible to translate the business goals specified at the higher level of abstraction into system requirements for each of the components in the functional architecture. For example, let’s examine how business goals might translate into requirements on the access network used to access video conferencing services.

The fidelity requirement translates almost directly into bandwidth capacity requirements on the network. For example, 720p video at 30 fps might require at least 1 Mb/s per video stream.
The interactivity requirement drives propagation delay tolerance as well as jitter tolerance (since adaptive jitter buffers add delays to deal with variable packet delays). For example, most VoIP applications require propagation delays of less than 150 ms.
The service grade dimension (business-grade vs. consumer-grade) translates into packet loss tolerance as well as jitter tolerance (since jitter buffers may discard late packets, which makes these packets equivalent to lost packets). For example, business-grade video might need less than 0.1% packet loss.

Just as business-level policies tend to be dynamic, the mapping function between business-level objectives and system-level metrics tends to be dynamic rather than static as well. For example, bandwidth requirements for a given video resolution might depend on the type of content delivered by the video application (since video with lots of motion tends to need more bandwidth than video that is more static, such as board room video conferencing).

System-level policies for real-time media dynamically produce quantifiable metrics for bandwidth requirements, propagation delay and jitter, and packet loss rates.
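
A minimal sketch of this mapping might look as follows. The 1 Mb/s, 150 ms, and 0.1% figures come from the examples above; everything else (the function and type names, the high-motion multiplier, and the values used for non-interactive or consumer-grade traffic) is a placeholder assumption rather than a measured value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkRequirements:
    min_bandwidth_mbps: float
    max_delay_ms: float
    max_packet_loss_pct: float


# Baseline from the example in the text: 720p at 30 fps needs about 1 Mb/s.
BASE_BANDWIDTH_MBPS = {"720p30": 1.0}
# Placeholder multiplier for high-motion content (assumed, not measured).
HIGH_MOTION_FACTOR = 1.5


def system_requirements(fidelity: str, interactive: bool,
                        business_grade: bool, high_motion: bool) -> NetworkRequirements:
    """Map business-level metrics onto access-network requirements."""
    bandwidth = BASE_BANDWIDTH_MBPS[fidelity]
    if high_motion:
        bandwidth *= HIGH_MOTION_FACTOR
    # 150 ms is the interactive bound quoted in the text; 1000 ms for
    # non-interactive streaming is an illustrative placeholder.
    max_delay = 150.0 if interactive else 1000.0
    # 0.1% is the business-grade figure from the text; 1.0% is a placeholder.
    max_loss = 0.1 if business_grade else 1.0
    return NetworkRequirements(bandwidth, max_delay, max_loss)
```

Making the content type an input is one way to capture the point that the mapping itself is dynamic: the same resolution can produce different bandwidth requirements.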

Administrator View

The administrator view drills down into the specific technologies that can be used to deliver each of the functional components specified in the system architecture. To meet the bandwidth, latency, jitter, and packet loss requirements specified at the system level, network administrators typically employ the following three different technologies:

Class-of-Service (“Diff-Serv”): Network administrators can request differentiated treatment of traffic on the network by assigning a different Class of Service to each type of traffic. All traffic belonging to the same class will receive the same treatment on the network. For example, traffic belonging to the Expedited Forwarding (EF) class is forwarded as quickly as possible to ensure low delay, low loss and low jitter. These characteristics are suitable for voice, video and other real-time services. Traffic belonging to the Assured Forwarding (AF) class is guaranteed to be delivered as long as the traffic does not exceed some subscribed rate.
Traffic Engineering: Using Traffic Engineering, network administrators can allocate specific amounts of bandwidth to different classes of service. For example, networks may be configured to reserve 20% of link capacity for interactive voice, and another 20% for video traffic. In addition, performance-based routing may be used to route different classes of traffic along different links depending on network topology or network load.
Admission Control: Traffic engineering is often used in conjunction with Admission Control to enforce the configured bandwidth limits for each of the different classes of traffic. Admission control prevents links from getting overloaded, which would result in packets getting dropped.

Mapping mechanisms are required to translate bandwidth requirements as well as latency, jitter, and packet loss tolerances into CoS, Traffic Engineering, and Admission Control configurations. Again, this process needs to be dynamic. For example, it may be necessary to manage traffic engineering allocations dynamically to adapt to changing call volumes in a video conferencing service.

Administrator-level policies dynamically produce configurations for class of service, traffic engineering, and admission control.
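
To make the interplay between these three technologies more concrete, here is a small sketch of per-class admission control on a single link. The class name AdmissionController and its methods are hypothetical. The DSCP values for EF (46) and AF41 (34) are the standard code points, but marking voice as EF and video as AF41 is a common convention assumed here, not something prescribed above.

```python
from typing import Dict

# Class-of-service marking (assumed convention: voice -> EF, video -> AF41).
DSCP = {"voice": 46, "video": 34}


class AdmissionController:
    """Tracks per-class bandwidth on one link and admits or rejects new flows."""

    def __init__(self, link_capacity_mbps: float, class_shares: Dict[str, float]):
        # Traffic-engineering allocation: fraction of link capacity per class.
        self.limits = {c: link_capacity_mbps * s for c, s in class_shares.items()}
        self.in_use = {c: 0.0 for c in class_shares}

    def admit(self, traffic_class: str, bandwidth_mbps: float) -> bool:
        """Admit the flow only if its class stays within its allocation."""
        if self.in_use[traffic_class] + bandwidth_mbps > self.limits[traffic_class]:
            return False
        self.in_use[traffic_class] += bandwidth_mbps
        return True

    def release(self, traffic_class: str, bandwidth_mbps: float) -> None:
        """Return bandwidth to the class when a flow ends."""
        remaining = self.in_use[traffic_class] - bandwidth_mbps
        self.in_use[traffic_class] = max(0.0, remaining)
```

With a 10 Mb/s link and the 20% shares from the example above, the video class can carry two 1 Mb/s streams; a third is rejected until one of the existing calls ends.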

Device View

The device view describes the device-specific configurations required to implement the CoS, Traffic Engineering, and Admission Control policies spelled out in the administrator view. Devices from different vendors may have different capabilities, but most networking equipment supports the following general set of QoS features that can be used for this task:

Low-latency queues: Expedited Forwarding traffic is often sent through low-latency queues that have strict priority over all other traffic classes.
Priority queuing: to provide relative priority between different traffic classes. Traffic in higher-priority queues will generally be serviced before traffic in lower-priority queues.
Configurable queue depths and policing to limit the amount of traffic allowed for the different priority queues: Because an overload of EF traffic will cause queuing delays and affect the jitter and delay tolerances within the class, EF traffic is often strictly controlled through admission control, policing and other mechanisms.
Traffic shaping: to spread out inter-packet arrival times, which can prevent congestion and the associated packet loss.

The exact mechanisms for configuring these features are device and vendor-specific.

Device-level policies dynamically produce configurations for low-latency and priority queuing, for queue depths, and for policing and shaping traffic.
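
Policing is commonly implemented with some form of token bucket. The sketch below shows a single-rate token bucket policer as one possible way to limit the traffic admitted into a queue; it is a generic illustration with hypothetical names, not the configuration model of any particular vendor. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
class TokenBucketPolicer:
    """Single-rate policer: packets that find too few tokens are dropped."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # the bucket starts full
        self.last_time = 0.0

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if the packet conforms; 'now' is a timestamp in seconds."""
        # Refill tokens for the elapsed time, capped at the burst size.
        elapsed = now - self.last_time
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_time = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # non-conforming: a policer drops rather than delays
```

A shaper uses the same accounting but queues non-conforming packets until enough tokens are available instead of dropping them, which is what spreads out packet departures.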

Instance View

Once configurations have been created for the various types of devices on the network, these configurations can then be applied to all instances of that type. The exact configuration of each instance may be adjusted individually to account for actual traffic patterns and traffic loads. In addition, this is the level at which the actual traffic flowing through network devices is monitored and performance information is collected. The instance view then provides a starting point for aggregating usage information that can be used to dynamically adapt configurations at higher levels of abstraction.

Instance-level policies describe the specific device configurations as tailored to each individual device instance.
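
One simple way to picture this tailoring is as a device-type template with per-instance overrides layered on top. The sketch below assumes configurations are flat key/value settings; the function name and all keys are hypothetical.

```python
from typing import Any, Dict


def instance_config(template: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return the device-type template with per-instance overrides applied."""
    config = dict(template)   # copy so the shared template is not mutated
    config.update(overrides)  # instance-specific values win
    return config


# Hypothetical template shared by all devices of one type.
edge_router_template = {"voice_share_pct": 20, "video_share_pct": 20, "ef_queue_depth": 64}

# One instance carries more video traffic, so only its video share is adjusted.
branch_router = instance_config(edge_router_template, {"video_share_pct": 30})
```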

In summary, the policy continuum concept is an important construct that can help rationalize policies at different levels of abstraction. Applying the policy continuum to Unified Communications helps to demonstrate the relationships between various QoS concepts and as a result can dramatically simplify performance management of real-time media on any network.

Sunday, September 4, 2016
