Wednesday, October 17, 2012

Content Delivery Networks (CDN) federation 101


Content Delivery Network (CDN) federation is the current buzzword in the network service provider industry. This blog post tries to answer the basic questions that someone may have about CDN federation.

CDN Federation Architecture


What is CDN federation?

It is a collection of CDNs that are operated independently by different service providers but are interconnected through open interfaces/APIs. The combined CDN looks like ONE LARGE content delivery infrastructure.

CDN federation agreements are very similar to the "roaming" agreements between the service providers in the wireless world.

For example,

Service Provider A, operating in Country A, has a hosting/CDN service agreement with CDN Customer A.
Service Provider B, operating in Country B, has a hosting/CDN service agreement with CDN Customer B.

Assuming that Service Providers A & B have a CDN federation agreement,

CDN Customer A's media assets can be cached/served by Service Provider B when a user from Country B requests such content.

CDN Customer B's media assets can be cached/served by Service Provider A when a user from Country A requests such content.

They'll settle their bills later :)

Why do we need CDN federation?

In the past, CDN service providers (such as Akamai, Edgecast & Limelight) dominated the CDN service market. Network Service Providers allowed these CDN service providers to deploy their caches in their networks. Network Service Providers saved network bandwidth, which meant they did not have to upgrade their network capacity as often. CDN service providers got power, cooling and real estate for (almost!) free. This model didn't allow Network Service Providers to monetize the business, though they saved significant money on trans-continental bandwidth. Some of the CDN service providers also engaged in a revenue sharing model with the Network Service Providers. However, Network Service Providers weren't satisfied with that.

Today, more and more Network Service Providers have started rolling out CDN services in order to monetize the content that is sitting in their networks. In this model, we'll soon have a world with lots of "disconnected" small CDNs, as opposed to ONE LARGE CDN (similar to what Akamai has). The Network Service Providers want to get the maximum benefit out of their investment and also want to leverage the best practices from the wireless industry.

Thus, the need for CDN federation was born.

What are the benefits of CDN federation?
  • Service Providers can increase the footprint for their CDN customers without a lot of capital investment
  • Improved user experience and faster content downloads for end users, who are accessing content from different parts of the world 
  • Service Providers can now monetize the content sitting in their network (For ex., by offering service differentiation for premium vs. standard users)
  • Service Providers can reduce the network bandwidth costs by caching content and serving them locally (especially videos!) 

How is CDN federation achieved?

CDN federation is achieved by interconnecting the CDNs. CDN service providers use established interfaces/APIs for the interconnection.

The following are some of the items that need to be considered for CDN Interconnection:
  • Exchange of policies for content distribution (Which caches need to be populated with this content? At what time should the content be ingested into the caches? What should the cache do in case of a cache miss?)
  • Maintaining knowledge about "which content resides where". This information can be used in making effective request routing decisions
  • Routing user requests to the appropriate cache (i.e., to the cache that is closer to the user's geography and can serve the user's request effectively)
  • Exchanging policy information or performing policy lookups against a centralized policy database (e.g., content that is permissible for public viewing in Country A may not be permissible in Country B)
  • Management APIs for integration (for purging content from the CDN, for logging/reporting, for accounting/billing, for service monitoring)
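
To make the request routing and policy lookup items above a little more concrete, here is a minimal sketch in Python. It is not taken from any CDNI specification or vendor API: the FederatedCDN class, the BLOCKED policy table and the route_request function are all made-up names, and real federations exchange this information over interconnection interfaces rather than in-process objects.

```python
from dataclasses import dataclass, field


@dataclass
class FederatedCDN:
    name: str
    country: str
    # Content IDs currently held in this CDN's caches ("which content resides where")
    cached: set = field(default_factory=set)


# Hypothetical policy table: content ID -> countries where delivery is not permitted
BLOCKED = {"movie-42": {"B"}}


def route_request(content_id, user_country, home_cdn, partners):
    """Return the name of the CDN that should serve the request, or None if policy forbids it."""
    if user_country in BLOCKED.get(content_id, set()):
        return None
    # CDNs (home or partner) operating in the user's country
    local = [c for c in [home_cdn, *partners] if c.country == user_country]
    # Prefer a local CDN that already holds the content
    for cdn in local:
        if content_id in cdn.cached:
            return cdn.name
    # Otherwise use any local CDN (it fetches on a cache miss), else fall back to the home CDN
    return local[0].name if local else home_cdn.name
```

With a home CDN in Country A and a partner in Country B, a Country B user asking for an unrestricted clip is routed to the partner, while a request for "movie-42" from Country B is refused by the policy lookup.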

Who are the vendors/service providers that claim CDN federation support?

Cisco Systems claims to have conducted a CDN federation pilot/trial with service providers such as British Telecom, KDDI, Orange, SFR and Telecom Italia.

http://blogs.cisco.com/sp/on-transitions-cdn-interconnects-and-why-you-should-not-miss-scott-puopolos-keynote-at-cdn-world-summit/

EdgeCast claims to have conducted an OpenCDN trial with a tier-one North American operator as well as one of Asia’s largest carriers, Pacnet.

http://www.edgecast.com/company/news/edgecast-announces-opencdn

An American telco CDN was integrated with the European StreamZilla CDN.

http://jet-stream.com/blog/cdn-federation-pilot/

Are there any standards for CDN federation?

The work of standardising CDN federation and CDN interconnection APIs is in progress. A number of IETF standards are evolving for CDN federation. The key focus of the working group is to come up with standards for establishing CDN Interconnection (CDNI).

http://datatracker.ietf.org/wg/cdni/

CDN Reporting and Analytics

Reporting and Analytics is a crucial component of a Content Delivery Network (CDN). In simple words, Reporting/Analytics software helps the CDN operator find out whether the delivery network is performing well and how the business can be improved.

The CDN reports/analytics can be used to:
  • Monitor the performance of the servers and caches
  • Plan for capacity management / upgrades
  • Monetize by creating advertisements or by delivering premium content to users
  • Fine-tune the caching and delivery policies to optimize the media delivery (for ex., how much is served from cache vs. origin server)
  • Detect attacks or unauthorized access to content

Reporting/Analytics Servers produce two types of reports:

  • Real time reports (mostly generated based on statistics / counters retrieved via SNMP or through CSV files from the caching/media servers)
  • Historical reports (mostly generated based on access logs)

Reports are organized/presented based on:

  • Time (hourly, daily, weekly, monthly & yearly media delivery performance reports)
  • Content (reports based on content popularity, content type, server domain(s), server response status and content referrers)
  • Geography (reports based on the user's Country/State/City/Region and ISP)
  • Media client / player (number of pauses made, number of seeks made, buffering etc.)
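
As a rough illustration of how a historical report might be derived from access logs, here is a small Python sketch. The record fields (hour, url, cache_hit, bytes) are assumptions made for this example; real access log formats differ from product to product and would need to be parsed first.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed access log records
records = [
    {"hour": "2012-10-17T10", "url": "/video/a.mp4", "cache_hit": True, "bytes": 500},
    {"hour": "2012-10-17T10", "url": "/video/a.mp4", "cache_hit": True, "bytes": 500},
    {"hour": "2012-10-17T10", "url": "/video/b.mp4", "cache_hit": False, "bytes": 300},
    {"hour": "2012-10-17T11", "url": "/video/a.mp4", "cache_hit": False, "bytes": 500},
]


def historical_report(records):
    """Summarize access log records by cache efficiency, content popularity and time."""
    hits = sum(1 for r in records if r["cache_hit"])
    popularity = Counter(r["url"] for r in records)
    bytes_per_hour = defaultdict(int)
    for r in records:
        bytes_per_hour[r["hour"]] += r["bytes"]
    return {
        # Share of requests served from cache rather than from the origin
        "cache_hit_ratio": hits / len(records) if records else 0.0,
        # Most requested content first
        "top_content": popularity.most_common(3),
        # Delivery volume organized by time
        "bytes_per_hour": dict(bytes_per_hour),
    }
```

The same grouping idea extends to geography or player events by adding those fields to the records and counting on them.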

There are a number of off-the-shelf CDN reporting/analytics software products available in the market.

1) Skytide

http://www.skytide.com

2) Sawmill (requires some customization)

http://www.sawmill.net/

3) Casterstats

http://www.casterstats.com/

4) Ooyala (video-specific analytics/reports based on information/statistics obtained from the video player)

http://www.ooyala.com/

Vendors such as Akamai, Cisco, Edgecast, Limelight, Media Melon, NSA CDN, Azion, Jetstream, Internap, Verivue, and Velocix provide integrated CDN reporting and analytics software.

Tuesday, October 16, 2012

Components of a Content Delivery Network (CDN)

A number of service providers are getting into the market of offering CDN services. Many of them sign partnership agreements and simply resell services from popular CDNs such as Akamai, Limelight and Edgecast. If you are a newbie to the CDN space and are wondering what components are required to build an end-to-end Content Delivery Network (CDN), you have come to the right place.

1) Media Library - The Media Library is a storage repository where the content from the CDN's customers gets published. Media Libraries are typically large file servers (such as NetApp, EMC/Isilon file servers). Media Libraries are optimized for storage and not for content delivery. Hence, they support only a small number of concurrent connections. Media Libraries are front-ended by media servers or caches that can scale to a large number of concurrent users. Media Libraries sit in the CDN's origin data centers.

2) Media servers - Media servers are responsible for serving content to the users (or to the caches in the network). They are optimized for media delivery and hence can simultaneously support a large number of concurrent users.

3) Caching servers - Caching servers are used for storing/serving popular content to the users. Caching servers are typically deployed throughout the Content Delivery Network in a hierarchical fashion. There are typically 2 or 3 tiers of caches deployed in a CDN. The first tier is called the origin tier, which refers to the group of caches that are deployed near the Media Library. The second tier is the mid-tier, which refers to the group of caches that are deployed at the core of the network. The third tier is the edge-tier, which refers to the group of caches that are deployed at the edge of the network. Media gets served from the caches in the edge-tier to the users. If a given request cannot be fulfilled by an edge-tier cache, the request gets forwarded to the caches in the upper tier. Once the edge-tier cache receives the content from the upper tier caches or from the media library, it serves the content to the end user.
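
The tiered lookup described above can be sketched in a few lines of Python. This is a toy model only: the CacheTier class is a hypothetical name, the media library is just a dictionary, and there is no eviction, expiry or error handling for content that is missing from the library.

```python
class CacheTier:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # next tier up; None means "fetch from the media library"
        self.store = {}

    def get(self, key, media_library):
        """Return (content, source), filling this tier's store on a miss."""
        if key in self.store:
            return self.store[key], self.name
        if self.parent is not None:
            # Cache miss: forward the request to the upper tier
            content, source = self.parent.get(key, media_library)
        else:
            content, source = media_library[key], "media-library"
        self.store[key] = content
        return content, source


# Three tiers as in the text: edge -> mid -> origin -> media library
library = {"/video/trailer.flv": b"flv-bytes"}
origin = CacheTier("origin")
mid = CacheTier("mid", parent=origin)
edge = CacheTier("edge", parent=mid)
```

The first call to edge.get("/video/trailer.flv", library) walks all the way up to the media library and fills every tier on the way back; a second call for the same key is answered by the edge tier itself.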

4) Content Routing System - When a user requests content, the request typically goes to the Content Routing System first, before getting forwarded to the appropriate cache in the network. The duty of the Content Routing System is to route the user to the appropriate cache, which can serve the content. The Content Routing System selects a cache from the network based on a number of criteria, such as
  • load metrics of the caching server
  • service availability of the caching server
  • geographical proximity of the user to the cache 

There are multiple ways of routing a user request to the appropriate cache. The most popular mechanism is DNS based request routing. When a user requests www.example.com/video/trailer.flv, www.example.com is resolved to the IP address of the edge-tier cache that is closest to the user. Assuming that the media delivery protocol is HTTP, the browser/player establishes a TCP connection with the cache. The HTTP request from the browser/media player goes to the cache and the content gets served from the cache. The Content Routing System should be highly available, because it is a crucial component of a Content Delivery Network.

Content Routing System gets the service availability and load information of the caches, through the Load Monitoring System.
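
Here is a minimal Python sketch of the cache selection step, using the three criteria listed above (availability, load and geographical proximity). The Cache fields, the region labels and the max_load threshold are assumptions for illustration; a real Content Routing System would answer DNS queries with the chosen IP address rather than return it from a function.

```python
from dataclasses import dataclass


@dataclass
class Cache:
    ip: str
    region: str
    available: bool
    load: float  # 0.0 (idle) to 1.0 (saturated), as reported by load monitoring


def select_cache(caches, user_region, max_load=0.9):
    """Return the IP of the cache a user in user_region should be resolved to, or None."""
    # Drop caches that are down or too heavily loaded
    candidates = [c for c in caches if c.available and c.load < max_load]
    if not candidates:
        return None
    # Same-region caches sort first (False < True), then the least loaded one wins
    best = min(candidates, key=lambda c: (c.region != user_region, c.load))
    return best.ip
```

Ranking by region before load is a design choice made for this sketch; an operator could just as well weight the two criteria differently.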

5) Load Monitoring System - Load Monitoring System periodically sends probes to the caches and the media servers in the network to monitor their health and service availability. The load metrics that are collected from the caches/media servers are

  • Device load metrics such as CPU, memory, disks, and network ports utilization
  • Application load metrics such as # of TCP connections, amount of throughput served etc.,

The load metrics are collected using SNMP, HTTP APIs or scripts that log in to the devices directly.

The Load Monitoring System also collects the service availability information of the caches/media servers. The load monitoring system shares the load information with the Content Routing System.
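
One polling round of such a monitoring system could look roughly like the Python sketch below. The probe callable stands in for whatever collection mechanism is used (SNMP, HTTP API or a script), and the rule of marking a device unavailable after three consecutive failed probes is an assumption of this example, not something prescribed by any product.

```python
def poll(targets, probe, failures, max_failures=3):
    """Probe each target once and return {target: {"available": bool, "metrics": dict}}.

    probe(target) is a caller-supplied function that returns a metrics dict or
    raises OSError (which includes ConnectionError and TimeoutError) on failure.
    failures tracks consecutive failed probes per target across polling rounds.
    """
    status = {}
    for target in targets:
        try:
            metrics = probe(target)
            failures[target] = 0  # a successful probe resets the failure count
        except OSError:
            metrics = {}
            failures[target] = failures.get(target, 0) + 1
        status[target] = {
            "available": failures[target] < max_failures,
            "metrics": metrics,
        }
    return status
```

The returned status dictionary is the kind of information that would be shared with the Content Routing System after each round.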

6) Logging Server - In a CDN network, hundreds or even thousands of log messages are generated every second. The log messages that get generated include service logs (such as access logs, which are transactional) and system logs (such as syslogs, which are generated when an important event occurs). In order to offload logging activity from the media delivery servers, CDNs deploy dedicated logging and log aggregation servers. The function of logging servers is to perform log collection, storage, and aggregation. The logs that are collected are used for generating historical reports. System logs are used for notifying administrators in case of a critical failure (such as a network link failure).

7) Reporting / Analytics Server - The key differentiator of any CDN service is the reporting/analytics capability. Reporting/Analytics Servers produce two types of reports:

- Real time reports (mostly generated based on statistics / counters retrieved via SNMP or through CSV files from the caching/media servers)
- Historical reports (mostly generated based on access logs)

Reports/Analytics are used
  • To monitor the health of the CDN network including service utilization, and service availability.
  • To perform capacity management and to plan for capacity upgrades
  • To gather information for marketing / promotion of new services
  • To gather insights to expand/provide additional services

8) Provisioning & Management Systems - Provisioning and Management is an important component of any CDN.

There are two types of provisioning / management systems:
  • Service provisioning/management system 
  • Network provisioning/management systems

In a CDN network, service provisioning/management systems are responsible for provisioning/managing services such as live streaming, on-demand streaming, and SSL offload. If a CDN customer wants to sign up for a live streaming service, the service provisioning system gets all the required information from the CDN customer (such as domain, live publishing point, # of concurrent viewers, bandwidth etc.) and uses that information to enable services by configuring various network equipment in the CDN network. The service provisioning system takes the service related configuration and converts it to device configuration commands. The device configuration commands are then sent to the network provisioning/management systems. The network provisioning/management systems configure devices such as routers/switches, load balancers and the caching/media servers.

In most cases, the Service & Network Provisioning / Management may very well be done by the same system in the CDN network.
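
The translation from a service request to device configuration commands can be pictured with the Python sketch below. Everything in it is made up for illustration: the field names of the service request, the cdn.example.net domain and the command strings do not correspond to any real device CLI or provisioning product.

```python
def render_device_commands(service):
    """Translate a live streaming service request into per-device command lists.

    The command syntax here is purely illustrative; a real system would emit
    whatever each device type actually understands.
    """
    domain = service["domain"]
    return {
        "dns": [
            f"add-record {domain} CNAME {service['publishing_point']}.cdn.example.net"
        ],
        "load-balancer": [
            f"set-max-connections {domain} {service['max_viewers']}"
        ],
        "cache": [
            f"enable-live-stream {domain} bandwidth={service['bandwidth_mbps']}mbps"
        ],
    }


# Hypothetical request captured from the customer
request = {
    "domain": "live.example.com",
    "publishing_point": "event1",
    "max_viewers": 5000,
    "bandwidth_mbps": 800,
}
commands = render_device_commands(request)
```

Each list in the result would then be handed to the network provisioning/management system responsible for that device type.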

9) Customer Portal - Customer Portal is the front-end used by the CDN customer to sign-up for services and enter service configuration information. Customer Portal also provides reporting/analytics that are specific to the customer.

10) Network Operations Control (NOC) Portal - The NOC Portal is used by the CDN administrators/employees to manage/monitor the CDN network. The NOC Portal can be used for configuring the network equipment deployed in the CDN and for monitoring the health of the equipment. When a device generates an SNMP trap or syslog message, it becomes available at the NOC Portal. The administrator can take an action based on the criticality of the event. The NOC Portal provides an overall view of the network, whereas the Customer Portal provides only the customer specific view.

11) Load Balancers - Load Balancers are deployed at the POPs to distribute the load across the available caches in the POP. They also monitor the health of the caches before forwarding the request to the cache. Load Balancers deployed at the POP can distribute requests based on L4 or L7 based load distribution mechanisms. Many routers/switches support L4 based load distribution techniques, avoiding the need for costly Load Balancers. When L4 load balancers are used, load distribution decisions are made based on L4 parameters such as the IP address of the source/destination, TCP/UDP port numbers etc. When L7 load balancers are used, load distribution decisions can be made using L7 parameters (such as HTTP request URL, HTTP URL query parameters and HTTP request headers).
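
To show the difference between the two approaches, here is a small Python sketch that picks a cache by hashing either L4 or L7 parameters. Hashing is only one possible distribution mechanism, chosen here because it is easy to demonstrate; the cache names and function names are hypothetical.

```python
import hashlib

CACHES = ["cache-1", "cache-2", "cache-3"]


def _bucket(key, n):
    """Map a string key to a stable index in range(n)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def pick_l4(src_ip, src_port, dst_ip, dst_port, caches=CACHES):
    # L4 decision: only addresses and port numbers are visible
    return caches[_bucket(f"{src_ip}:{src_port}->{dst_ip}:{dst_port}", len(caches))]


def pick_l7(url_path, caches=CACHES):
    # L7 decision: the HTTP request URL is inspected, so every request for
    # the same object lands on the same cache regardless of who asks
    return caches[_bucket(url_path, len(caches))]
```

With pick_l7, requests for /video/trailer.flv from different users always reach the same cache, which tends to help the cache hit ratio; pick_l4 spreads connections without any knowledge of the content being requested.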

12) Dynamic Site Accelerators - Dynamic Site Accelerators (DSA) are required for providing service to CDN customers such as e-commerce portals and banking websites. DSA servers support a number of optimizations such as TCP connection pooling, TCP connection persistence, compression of content, and SSL offloading. CDNs that offer DSA services deploy Dynamic Site Accelerators in their network.

13) Digital Rights Management Server - DRM servers are required for authorizing users before serving content to users. Premium video delivery services and services that provide per-user specific media viewing experience would need DRM servers. Licensed Media assets are served to users only after authorizing their eligibility for the service.

Thursday, October 4, 2012

Video Encoders & Cloud based Encoding Services

The following are some popular video encoders, encoding servers and cloud based video encoding services.


Envivio 4Caster

http://www.envivio.com/

Inlet Spinnaker (acquired by Cisco)

http://www.cisco.com/en/US/products/ps11804/index.html

Digital Rapids StreamZ

http://www.digital-rapids.com/

Viewcast Niagara Streaming Media Encoder

http://www.viewcast.com/

Sorenson - Squeeze Server

http://www.sorensonmedia.com/

ffmpeg

http://ffmpeg.org/

Cloud based encoding services

Aviberry 

http://www.aviberry.com/ 

Zencoder 


Encoding.com 


Transcode.it 



There are a number of other software products that can be installed on PCs to encode videos. They aren't included here.

Wednesday, October 3, 2012

List of mobile platforms and supported media players/formats

Here is a list of mobile platforms and the default media players and supported video/audio formats.

Popular Mobile OS / Devices | Default Players | Video / Audio formats | Containers
--- | --- | --- | ---
Symbian (Nokia, LG, Samsung, Sony Ericsson, Fujitsu, Motorola, BenQ) | Real Media Player | H.264, AAC, MPEG4, Real Media, Windows Media, AMR, VC1 | MP4, M4A, MOV, 3GP, AVI, ASF, WMV, RM, RMVB, MPG
Symbian (same devices) | Flash Media Player | H.264, AAC, MPEG4, VP6, Sorenson Spark | FLV, F4V
Blackberry OS (RIM Blackberry) | Proprietary Media Player | H.264, AAC, MPEG4, H.263, AMR, Windows Media | MP4, M4A, MOV, 3GP, M4V, AVI, ASF, WMV, WMA
Windows Mobile (HTC, Samsung, Motorola, LG etc.) | Windows Media Player | Windows Media, MPEG4, H.263 (3GP supported only by v12+ players) | WMV, WMA, ASF, MPG, AVI, WAV
Android OS (HTC, Samsung, Motorola, etc.) | Open source Media players | H.264, AAC, MPEG4, H.263 | MP4, M4A, 3GP, OGG
iPhone OS (Apple iPhone) | Quick Time | H.264, AAC | MP4, M4V, MOV

Tuesday, October 2, 2012

Transcoding, Transrating or Transcontainerization - What do they mean?

If you are in the video delivery space, you'll quite often run into jargon such as transcoding, transrating and transcontainerization. What do these terms mean? For a novice, they can be hard to understand. I've taken a stab at simplifying these terms for better understanding.



Transcontainerization - Transcontainerization refers to re-publishing a video in a new container format. For ex., taking an MP4 video and converting it into formats such as Adobe HTTP Dynamic Streaming format, Apple HTTP Live Streaming format or Microsoft Smooth Streaming format is called transcontainerization.

For example, say you buy a few oranges and apples and carry them home in a plastic bag. After reaching home, you transfer the oranges and apples from the plastic bag to a plastic basket. You are just changing the container without changing the properties of the oranges/apples. This is similar to transcontainerization, where just the wrapper of the MP4 video is changed.


Transcoding - Transcoding refers to modifying properties of video such as modifying resolution, aspect ratio, frame rate, and audio/video codec. For ex., converting the resolution of the video from 1920 × 1080 to 1024 × 768 for viewing in different digital displays is called transcoding.

For example, say you buy a few oranges and apples. Assume that you have the powers of a magician. If you convert an apple to an orange (or an orange to an apple), it is similar to transcoding. Here you are changing the video properties and codecs.

Transrating - Transrating refers to changing the bit rate of the video without altering other properties of the video. For ex., changing a 1 Mbps video stream to a 512 Kbps video stream is called transrating.

For example, say you buy a few oranges and apples. Assume that you again have the powers of a magician. If you convert a large apple to a small apple, it is similar to transrating. You are changing the bit rate without modifying the other properties of the video.

Encoding - Encoding refers to the process of preparing video for storage and transmission. For ex., capturing of human actions into digital format using camera or camcorder is encoding.

Well, that is self explanatory. So, no analogies :) 
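
For readers who prefer something hands-on, the three "trans" operations above map fairly directly onto ffmpeg invocations. The Python sketch below only builds the argument lists; the function names and file names are my own, libx264 is assumed to be available in your ffmpeg build, and the exact options you need will depend on your source material.

```python
def transcontainerize(src, dst):
    # "-c copy" remuxes the existing audio/video streams into the new
    # container (chosen from dst's file extension) without re-encoding
    return ["ffmpeg", "-i", src, "-c", "copy", dst]


def transcode(src, dst, width=1024, height=768):
    # Re-encode the video as H.264 at a new resolution; copy the audio stream as-is
    return ["ffmpeg", "-i", src, "-vf", f"scale={width}:{height}",
            "-c:v", "libx264", "-c:a", "copy", dst]


def transrate(src, dst, bitrate="512k"):
    # Re-encode the video at a lower target bit rate, leaving the resolution untouched
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-b:v", bitrate,
            "-c:a", "copy", dst]
```

Each list can be passed to subprocess.run(cmd, check=True) on a machine where ffmpeg is installed, for example transrate("in.mp4", "out_512k.mp4").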

Monday, October 1, 2012

DMCA guidelines for Transparent Proxy Caching


DMCA stands for Digital Millennium Copyright Act. Section 512(b) of DMCA limits the liability of service providers for the practice of retaining copies, for a limited time, of material that has been made available online by a person other than the provider, and then transmitted to a subscriber at his or her direction.

More details can be found at http://copyright.gov/legislation/dmca.pdf

DMCA provides guidelines for Service Providers in the United States who would like to deploy a caching solution. Service Providers from other parts of the world also expect their caching solution to be fully DMCA compliant.

The following are limitations imposed by DMCA:

  • The content of the retained material must not be modified.
  • The provider must comply with rules about “refreshing” material (replacing retained copies of material with material from the original location) when specified in accordance with a generally accepted industry standard data communication protocol.
  • The provider must not interfere with technology that returns “hit” information to the person who posted the material, where such technology meets certain requirements.
  • The provider must limit users’ access to the material in accordance with conditions on access (e.g., password protection) imposed by the person who posted the material.
  • Any material that was posted without the copyright owner’s authorization must be removed or blocked promptly once the service provider has been notified that it has been removed, blocked, or ordered to be removed or blocked, at the originating site.

DMCA does not outline how a service provider can adhere to the above guidelines. Different service providers/caching vendors use different mechanisms to adhere to the above guidelines.
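
As one illustration of the "refreshing" guideline, a cache might honor the HTTP Cache-Control response header sent by the origin. This is an assumption of this example rather than something DMCA spells out, and the Python sketch below handles only a few directives (no-store, no-cache, private and max-age); a real cache implements the full HTTP caching rules.

```python
import re


def is_fresh(cache_control, age_seconds):
    """Return True if a shared cache may serve its copy without revalidating with the origin.

    cache_control is the value of the origin's Cache-Control response header;
    age_seconds is how long the copy has been held in the cache.
    """
    directives = [d.strip().lower() for d in cache_control.split(",")]
    # The origin asked for no reuse, or restricted the response to a single user
    if any(d in ("no-store", "no-cache", "private") for d in directives):
        return False
    for d in directives:
        match = re.fullmatch(r"max-age=(\d+)", d)
        if match:
            return age_seconds < int(match.group(1))
    # No explicit lifetime given: revalidate to stay on the safe side
    return False
```

For example, is_fresh("public, max-age=3600", 100) allows the copy to be served, while the same copy held for 4000 seconds has to be refreshed from the original location.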

Transparent Proxy vs. Reverse Proxy Cache


What is a Transparent Proxy?

A cache/proxy that intercepts requests from users in a transparent manner and serves content from origin/cache without modifying the requests/responses.

What is a Reverse Proxy?

A cache/proxy that receives requests destined to certain origin servers and serves content from origin/cache with/without modifying the requests/responses. Both origin and user can be aware of the presence of cache/proxy in-between.

What are the characteristics of a Transparent Proxy Cache?

  • Requests are redirected to the cache using Policy Based Routing (PBR) or Filter Based Forwarding (FBF) / Static route modification / WCCP 
  • Typically deployed at Internet Service Provider / Enterprise / Campus Edges only
  • Requires routing policy changes at all the edge routers/switches that are handling user requests
  • Does not require any contracts to be signed with the content publisher 
  • Server / Clients do not know about the presence of a cache in-between

What are the characteristics of a Reverse Proxy Cache?

  • Requests are redirected to the cache using DNS based routing
  • Typically deployed at Origin or Content Publisher / CDN Edges. Can also be deployed at Internet Service Provider / Enterprise /Campus edges.
  • Requires changes in the DNS Server. Domains have to be modified to point to the cache's IP address.
  • May require contracts with the content publisher or content owner
  • Server / Client knows about the presence of a cache in-between


What are the advantages of deploying a cache in Transparent Proxy mode?

  •  Users need not modify browser/player settings to go through a proxy cache
  •  Client & Server are unaware of the presence of a cache in-between (assuming that the cache honors HTTP standards)
  •  Commonly used at ISP / Campus Edges


What are the limitations of Transparent Proxy caching?

  • Increases load on router/switch to inspect and forward requests to the proxy/cache
  • Cache has to be compliant with Digital Millennium Copyright Act (DMCA) regulations
  • Not suitable for CDN or Content Publisher Edges


What are the advantages of a reverse proxy cache?

  •  Users need not modify their browser settings to go through a proxy cache
  •  Request routing done at the DNS server (centralized configuration change)
  •   Commonly used at Origin locations or Content Provider Edges


What are the limitations of a reverse proxy cache?

  • Server/Users know about the presence of the cache in-between 
  •  DNS entries have to be modified for all the domains (to route the requests to the proxy/cache)

When is a caching appliance deployed as a transparent proxy?

  • The customer doesn’t want the users or the origin servers to know about the presence of a proxy/cache
  • The customer wants to do caching at the edge
  • The customer is an ISP or Campus Edge operator
  • The customer doesn’t want its users to modify their browser configuration to point to a proxy
  • The customer doesn’t want to change the DNS configuration
  • The customer is ready to turn-on Policy Based Routing/Filter Based Forwarding/Static Routing at Edge


When is a caching appliance deployed as a reverse proxy?

  • The customer wants to do caching at the origin or edge
  • The customer is a Content Publisher, CDN or an ISP/Campus Edge operator
  • The customer doesn’t want its users to modify their browser configuration to point to a proxy
  • The customer doesn’t want to change policies or configuration in the routers/switches
  • The customer is ready to change DNS configuration
  • The customer is OK, if the user/origin knows about the presence of a proxy/cache in-between
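
The two checklists above can be condensed into a small decision helper. The Python sketch below is a simplification for illustration only: the function name and its three yes/no inputs are my own reduction of the bullet points, and a real deployment decision involves more factors (contracts, DMCA compliance, where in the network the cache sits).

```python
def choose_proxy_mode(hide_cache, can_change_dns, can_change_routing):
    """Suggest "transparent", "reverse" or None based on the customer's constraints.

    hide_cache: users/origin must not know about the cache
    can_change_dns: customer is ready to change DNS configuration
    can_change_routing: customer is ready to turn on PBR/FBF/static routing at the edge
    """
    if hide_cache:
        # Only a transparent proxy keeps the cache invisible, and it needs routing changes
        return "transparent" if can_change_routing else None
    if can_change_dns:
        return "reverse"
    if can_change_routing:
        return "transparent"
    return None  # neither mode can be deployed without some configuration change
```

For instance, an ISP that wants the cache hidden and can enable Policy Based Routing gets "transparent", while a content publisher that is willing to change DNS and does not mind the cache being visible gets "reverse".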