Virtual IMS Core use case
Add N+k definition Change-Id: I0a4e0a566446f81528584f5921244030e0251272
usecases/Virtual_IMS_Core.rst

..
   This work is licensed under a Creative Commons Attribution 3.0 Unported
   License.

   http://creativecommons.org/licenses/by/3.0/legalcode

=============================
Virtual IMS Core
=============================

This use case is about deploying a virtual IMS core as an NFV function in
OpenStack, though it is more broadly representative of the issues involved in
deploying a scalable Telco-grade control plane function as a series of
load-balanced N+k pools.

The use case focuses on those elements rather than on more generic issues such
as supporting NFV orchestration and cloud platform HA.

This use case obsoletes the version previously hosted directly on the TelcoWG
wiki [1].

Glossary
========

* NFV - Network Functions Virtualisation, see
  http://www.etsi.org/technologies-clusters/technologies/nfv
* IMS - IP Multimedia Subsystem
* SIP - Session Initiation Protocol
* P/I/S-CSCF - Proxy/Interrogating/Serving Call Session Control Function
* BGCF - Breakout Gateway Control Function
* HSS - Home Subscriber Server
* WebRTC - Web Real-Time Communication

Problem description
===================

An IMS core [2] is a key element of Telco infrastructure, handling VoIP device
registration and call routing. Specifically, it provides SIP-based call
control for voice and video as well as for SIP-based messaging apps.

An IMS core is mainly a compute application with modest demands on storage and
network: it provides the control plane, not the media plane (packets typically
travel point-to-point between the clients), so it does not require high packet
throughput rates and is reasonably resilient to jitter and latency.

As a core Telco service, the IMS core must be deployable as an HA service
capable of meeting strict Service Level Agreements (SLAs) with users. Here HA
refers to the availability of the service for completing new call attempts,
not to the continuity of existing calls. As a control plane rather than media
plane service, the user experience of an IMS core failure is typically that
audio continues uninterrupted but any action requiring signalling (e.g.
conferencing in a third party) fails. However, it is not unusual for clients
to send periodic SIP "keep-alive" pings during a call, and if the IMS core
cannot handle them the client may tear down the call.

An IMS core must be highly scalable, and as an NFV function it will be
elastically scaled by an NFV orchestrator running on top of OpenStack. The
requirements that such an orchestrator places on OpenStack are not addressed
in this use case.

User Stories
------------

As a Telco, I want to deploy an IMS core as a Virtual Network Function running
in OpenStack.

Examples
--------

Project Clearwater [3] is an open-source implementation of an IMS core
designed to run in the cloud and be massively scalable. It provides
P/I/S-CSCF functions together with a BGCF and an HSS cache, and includes a
WebRTC gateway providing interworking between WebRTC and SIP clients.

Affected By
-----------

None.

Requirements
============

The problem statement above leads to the following requirements.

* Compute application

  OpenStack already provides everything needed; in particular, there are no
  requirements for an accelerated data plane, nor for core pinning or NUMA.

* HA

  Project Clearwater itself implements HA at the application level, as a
  series of load-balanced N+k pools with no single points of failure [4].

  To meet typical SLAs, the failure of any given host must not be able to
  take down more than k VMs in any N+k pool. More precisely, given that
  those pools are dynamically scaled, at no time may more than a certain
  proportion of any pool be instantiated on the same host. See Gaps below.

  That by itself is insufficient for offering an SLA, though: to be deployable
  in a single OpenStack cloud (even spread across availability zones or
  regions), the underlying cloud platform must be at least as reliable as the
  SLA demands. Those requirements will be addressed in a separate use case.

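The per-host cap described above can be sketched as a simple check. This is
an illustration only: the ``placement`` mapping and the VM/host names are
invented, and a real deployment would obtain them from the scheduler rather
than hard-code them.

```python
from collections import Counter

def host_failure_tolerable(placement, k):
    """Return True if no single host failure can remove more than k
    members of an N+k pool.

    placement maps pool-member VM name -> host name (illustrative data,
    not an OpenStack API); k is the pool's redundancy margin.
    """
    per_host = Counter(placement.values())
    # Losing a host loses every pool member placed on it, so each host
    # may carry at most k members of the pool.
    return max(per_host.values(), default=0) <= k

# A 4+2 pool spread over three hosts: any single host failure removes
# two members, which the k=2 margin absorbs.
placement = {
    "pool-vm-1": "host-a", "pool-vm-2": "host-a",
    "pool-vm-3": "host-b", "pool-vm-4": "host-b",
    "pool-vm-5": "host-c", "pool-vm-6": "host-c",
}
print(host_failure_tolerable(placement, k=2))  # True
print(host_failure_tolerable(placement, k=1))  # False
```
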
* Elastic scaling

  An NFV orchestrator must be able to rapidly launch new instances or
  terminate existing ones in response to applied load and service
  responsiveness. This is basic OpenStack Nova functionality.

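As a concrete reading of the N+k model driving that scaling, the sizing an
orchestrator might compute can be sketched as follows; the capacity and load
figures are invented purely for illustration.

```python
import math

def pool_size(offered_load, capacity_per_vm, k):
    """N+k sizing: N instances cover the offered load, k more provide
    redundancy (k is typically greater than 1 to tolerate multiple
    failures)."""
    n = math.ceil(offered_load / capacity_per_vm)
    return n + k

# Hypothetical figures: 50,000 SIP transactions/s offered, 6,000 per
# VM, k=2.  N = ceil(50000 / 6000) = 9, so the pool runs 11 VMs.
print(pool_size(50_000, 6_000, k=2))  # 11
```
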
* Placement zones

  In the IMS architecture there is a separation between access and core
  networks, with the P-CSCF component (Bono - see [4]) bridging the gap
  between the two. Although Project Clearwater does not yet support this, it
  would in future be desirable to support deploying Bono in a DMZ-like
  placement zone, separate from the rest of the service in the main MZ.

Related Use Cases
=================

* https://review.openstack.org/#/c/163399/ (placement zones)

Gaps
====

The above requirements currently suffer from these gaps.

* Affinity for N+k pools

  An N+k pool is a pool of identical, stateless servers, any of which can
  handle requests for any user. N is the number required purely for capacity;
  k is the additional number required for redundancy. k is typically greater
  than 1 to allow for multiple failures. During normal operation, N+k servers
  should be running.

  Affinity/anti-affinity can be expressed pair-wise between VMs, which is
  sufficient for a 1:1 active/passive architecture, but an N+k pool needs
  something more subtle. Specifying that all members of the pool must live on
  distinct hosts is clearly wasteful. Instead, availability modelling shows
  that the overall availability of an N+k pool is determined by the time to
  detect and spin up new instances, the time between failures, and the
  proportion of the pool that fails simultaneously. The OpenStack scheduler
  needs to provide some way to control the last of these by limiting the
  proportion of a group of related VMs that are scheduled on the same host.

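The availability modelling referred to above can be made concrete with a
back-of-the-envelope binomial model. It assumes member failures are
independent, which is exactly what placing too many members on one host
breaks; the MTBF/MTTR figures below are purely illustrative.

```python
import math

def pool_unavailability(n, k, mtbf_hours, mttr_hours):
    """Probability that more than k of the N+k members are down at
    once, assuming independent failures.

    mttr_hours covers the time to detect a failure and spin up a
    replacement instance; mtbf_hours is the mean time between failures
    of a single member.
    """
    # Steady-state probability that any one member is down.
    q = mttr_hours / (mtbf_hours + mttr_hours)
    total = n + k
    # The pool loses capacity only when more than k members are down
    # simultaneously.
    return sum(
        math.comb(total, i) * q**i * (1 - q)**(total - i)
        for i in range(k + 1, total + 1)
    )

# Illustrative figures: an 8+2 pool, one failure per VM every 2000
# hours, 15 minutes to detect and replace it.
p = pool_unavailability(n=8, k=2, mtbf_hours=2000, mttr_hours=0.25)
print(f"unavailability ~ {p:.1e}")
```

Shrinking the detect-and-respawn time, or the fraction of the pool that can
fail together, improves this figure directly, which is why a
members-per-host cap matters more than full anti-affinity.
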
References
==========

* [1] https://wiki.openstack.org/wiki/TelcoWorkingGroup/UseCases#Virtual_IMS_Core
* [2] https://en.wikipedia.org/wiki/IP_Multimedia_Subsystem
* [3] http://www.projectclearwater.org
* [4] http://www.projectclearwater.org/technical/clearwater-architecture/