Session Initiation Protocol (SIP) - extracts from Wikipedia

The Session Initiation Protocol (SIP) is an IETF-defined signaling protocol widely used for controlling communication sessions such as voice and video calls over Internet Protocol (IP). The protocol can be used for creating, modifying and terminating two-party (unicast) or multiparty (multicast) sessions. Sessions may consist of one or several media streams.

Other SIP applications include video conferencing, streaming multimedia distribution, instant messaging, presence information, file transfer and online games.

Protocol design

A motivating goal for SIP was to provide a signaling and call setup protocol for IP-based communications that can support a superset of the call processing functions and features present in the public switched telephone network (PSTN). SIP by itself does not define these features; rather, its focus is call setup and signaling. The features that permit familiar telephone-like operations (dialing a number, causing a phone to ring, hearing ringback tones or a busy signal) are performed by proxy servers and user agents. Implementation and terminology differ in the SIP world, but to the end user the behavior is similar.

Network elements

A SIP user agent (UA) is a logical network end-point used to create or receive SIP messages and thereby manage a SIP session. A SIP UA can perform the role of a User Agent Client (UAC), which sends SIP requests, and of a User Agent Server (UAS), which receives requests and returns SIP responses. The roles of UAC and UAS last only for the duration of a SIP transaction.[6]

A SIP phone is a SIP user agent that provides the traditional call functions of a telephone, such as dial, answer, reject, hold/unhold, and call transfer.[7][8] SIP phones may be implemented as a hardware device or as a softphone. As vendors increasingly implement SIP as a standard telephony platform, often driven by 4G efforts, the distinction between hardware-based and software-based SIP phones is being blurred and SIP elements are implemented in the basic firmware functions of many IP-capable devices. Examples are devices from Nokia and Research in Motion.[9]

Each resource of a SIP network, such as a User Agent or a voicemail box, is identified by a Uniform Resource Identifier (URI), based on the general standard syntax[10] also used in Web services and e-mail. A typical SIP URI is of the form: sip:username:password@host:port. The URI scheme used for SIP is sip:. If secure transmission is required, the scheme sips: is used and SIP messages must be transported over Transport Layer Security (TLS).[6]
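The URI form described above can be sketched as a small parser. This is a minimal illustration, not a complete implementation of the SIP URI grammar: it assumes only the scheme, optional user and password, host and optional port, and it ignores URI parameters, headers and IPv6 host literals. The names SipUri, parse_sip_uri and requires_tls are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SipUri(NamedTuple):
    scheme: str
    user: Optional[str]
    password: Optional[str]
    host: str
    port: Optional[int]


# Simplified pattern: scheme, optional user[:password]@, host, optional :port.
_SIP_URI_RE = re.compile(
    r"^(?P<scheme>sips?):"
    r"(?:(?P<user>[^:@]+)(?::(?P<password>[^@]*))?@)?"
    r"(?P<host>[^:;?]+)"
    r"(?::(?P<port>\d+))?"
)


def parse_sip_uri(text: str) -> SipUri:
    """Split a sip: or sips: URI into its main components."""
    match = _SIP_URI_RE.match(text)
    if match is None:
        raise ValueError(f"not a SIP URI: {text!r}")
    port = match.group("port")
    return SipUri(
        scheme=match.group("scheme"),
        user=match.group("user"),
        password=match.group("password"),
        host=match.group("host"),
        port=int(port) if port else None,
    )


def requires_tls(uri: SipUri) -> bool:
    """The sips: scheme signals that the message must travel over TLS."""
    return uri.scheme == "sips"
```

For example, parse_sip_uri("sip:alice:secret@example.com:5060") yields the user "alice", the host "example.com" and the port 5060, while a sips: URI makes requires_tls return True.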

Proxy server: An intermediary entity that acts as both a server and a client for the purpose of making requests on behalf of other clients. A proxy server primarily plays the role of routing, which means its job is to ensure that a request is sent to another entity "closer" to the targeted user. Proxies are also useful for enforcing policy (for example, making sure a user is allowed to make a call). A proxy interprets and, if necessary, rewrites specific parts of a request message before forwarding it.
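The two proxy duties named above, policy enforcement and routing, can be sketched in a few lines. Everything here is an assumption made for illustration: the LOCATIONS and ALLOWED_CALLERS tables, the proxy_forward function, and the idea of passing the caller as a separate argument are hypothetical, and a real proxy processes complete messages and does considerably more.

```python
from typing import Dict, Optional, Set

# Hypothetical data, for illustration only.
LOCATIONS: Dict[str, str] = {
    "sip:bob@example.com": "sip:bob@192.0.2.10:5060",
}
ALLOWED_CALLERS: Set[str] = {"sip:alice@example.com"}


def proxy_forward(request_line: str, caller: str) -> Optional[str]:
    """Return the rewritten first line to forward, or None if policy rejects it."""
    method, request_uri, version = request_line.split(" ", 2)

    # Policy: only known callers may place requests through this proxy.
    if caller not in ALLOWED_CALLERS:
        return None

    # Routing: rewrite the Request-URI to a location "closer" to the user,
    # or leave it unchanged when no better location is known.
    next_hop = LOCATIONS.get(request_uri, request_uri)
    return f"{method} {next_hop} {version}"
```

With these tables, a request from sip:alice@example.com for sip:bob@example.com is forwarded with the Request-URI rewritten to sip:bob@192.0.2.10:5060, and a request from any other caller is rejected.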

SIP messages

SIP is a text-based protocol with syntax similar to that of HTTP. There are two different types of SIP messages: requests and responses. The first line of a request has a method, defining the nature of the request, and a Request-URI, indicating where the request should be sent.[12] The first line of a response has a response code.
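Because the two message types differ in their first line, that line is enough to tell them apart. The sketch below assumes the RFC 3261 layout, in which a request line reads "METHOD Request-URI SIP-Version" and a status line reads "SIP-Version Status-Code Reason-Phrase". The function name parse_start_line and the dictionary keys are hypothetical.

```python
def parse_start_line(line: str) -> dict:
    """Classify the first line of a SIP message and split it into fields."""
    line = line.rstrip("\r\n")
    if line.startswith("SIP/"):
        # Status line: version, numeric code, free-text reason phrase.
        version, code, reason = line.split(" ", 2)
        return {
            "kind": "response",
            "version": version,
            "code": int(code),
            "reason": reason,
        }
    # Request line: method, Request-URI, version.
    method, request_uri, version = line.split(" ", 2)
    return {
        "kind": "request",
        "method": method,
        "request_uri": request_uri,
        "version": version,
    }
```

A line that does not have three space-separated fields raises ValueError, which is adequate for a sketch but stricter validation would be needed in practice.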

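For SIP requests, RFC 3261 defines the following methods:[13]
REGISTER: Registers a user's contact address with a registrar.
INVITE: Initiates a session or modifies an existing one.
ACK: Confirms receipt of a final response to an INVITE.
CANCEL: Cancels a pending request.
BYE: Terminates a session.
OPTIONS: Queries the capabilities of a server or user agent.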

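The SIP response types defined in RFC 3261 fall in one of the following categories:[14]
Provisional (1xx): The request was received and is being processed.
Success (2xx): The action was successfully received, understood and accepted.
Redirection (3xx): Further action needs to be taken to complete the request.
Client Error (4xx): The request contains bad syntax or cannot be fulfilled at this server.
Server Error (5xx): The server failed to fulfill an apparently valid request.
Global Failure (6xx): The request cannot be fulfilled at any server.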

Transactions

SIP makes use of transactions to control the exchanges between participants and to deliver messages reliably. Transactions maintain internal state and make use of timers. Client transactions send requests, and server transactions respond to those requests with one or more responses. The responses may include zero or more provisional (1xx) responses and one or more final (2xx-6xx) responses.
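The split between provisional and final responses follows directly from the status code. The sketch below maps a code to its RFC 3261 class by its first digit and reports whether it is final. The names RESPONSE_CLASSES and classify_response are hypothetical.

```python
from typing import Tuple

# Response classes keyed by the first digit of the status code.
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def classify_response(code: int) -> Tuple[str, bool]:
    """Return (class name, is_final) for a SIP status code."""
    if not 100 <= code <= 699:
        raise ValueError(f"status code out of range: {code}")
    # Only 1xx responses are provisional; everything from 200 up is final.
    return RESPONSE_CLASSES[code // 100], code >= 200
```

For instance, 180 is classified as ("Provisional", False) and 200 as ("Success", True).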

Transactions are further categorized as either Invite or Non-Invite. Invite transactions differ in that they can establish a long-running conversation, referred to as a Dialog in SIP, and so include an acknowledgment (ACK) of any non-failing final response (e.g. 200 OK).

Because of these transactional mechanisms, SIP can make use of unreliable transports such as User Datagram Protocol (UDP).
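To give a sense of how timers compensate for an unreliable transport, the sketch below computes when an unanswered INVITE would be retransmitted over UDP. It assumes the RFC 3261 defaults for an INVITE client transaction: the first retransmission after T1 (500 ms), a doubling of the interval after each retransmission, and abandonment of the transaction after 64 × T1. The function name is hypothetical.

```python
from typing import List


def invite_retransmit_schedule(t1: float = 0.5) -> List[float]:
    """Seconds after the first INVITE at which retransmissions are sent,
    assuming no response ever arrives."""
    timeout = 64 * t1      # the transaction gives up at this point
    interval = t1          # wait before the first retransmission
    elapsed = interval
    times = []
    while elapsed < timeout:
        times.append(elapsed)
        interval *= 2      # exponential back-off
        elapsed += interval
    return times


print(invite_retransmit_schedule())
# → [0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
```

With the default T1, the INVITE is retransmitted six times before the transaction times out at 32 seconds.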

Diagram showing colour coded SIP system interactions

In the example above, User1’s UAC uses an Invite Client Transaction to send the initial INVITE (1) message. If no response is received after a timer-controlled wait period, the UAC may terminate the transaction or retransmit the INVITE. Once a response is received, User1 can be confident that the INVITE was delivered. User1’s UAC must then acknowledge the response. On delivery of the ACK (2), both sides of the transaction are complete, and in this case a Dialog may have been established.[15]
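The sequence just described can be traced with a small simulation. This is a deliberately simplified sketch, not the full RFC 3261 state machine: events are scripted rather than read from the network, None stands for a timer expiring with no response, and the function name and the max_retransmits parameter are hypothetical.

```python
from typing import Iterable, List, Optional


def run_invite_transaction(
    events: Iterable[Optional[int]], max_retransmits: int = 6
) -> List[str]:
    """Return the actions a UAC takes for a scripted series of events.

    Each event is a status code, or None when the wait timer expires.
    """
    log = ["send INVITE"]
    retransmits = 0
    got_provisional = False
    for event in events:
        if event is None:
            if got_provisional:
                # A 1xx response showed the INVITE arrived; stop resending.
                continue
            if retransmits == max_retransmits:
                log.append("terminate: timeout")
                return log
            retransmits += 1
            log.append("retransmit INVITE")
        elif 100 <= event <= 199:
            got_provisional = True
            log.append(f"provisional {event}: keep waiting")
        else:
            # Any final response to an INVITE is acknowledged.
            log.append(f"final {event}: send ACK")
            return log
    return log
```

Calling run_invite_transaction([None, 180, 200]) records one retransmission, the provisional 180 and an ACK for the final 200, while a run of seven timer expiries ends in a timeout.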