The AirPlay 2 sender communicates with receivers through the RTSP protocol defined in RFC 2326. However, Apple's RTSP implementation has been customized to behave like both an RTSP and an HTTP server.

The request line ends with RTSP/1.0, e.g. GET /info RTSP/1.0.

An RTSP HTTP-like request can have the following headers:

GET /info RTSP/1.0
X-Apple-ProtocolVersion: 1
Content-Length: 70
Content-Type: application/x-apple-binary-plist
CSeq: 0
DACP-ID: 93842EE68464F5B9
Active-Remote: 1149412352
User-Agent: AirPlay/409.16

Instead, a standard RTSP request can have the following headers:

SETUP rtsp:// RTSP/1.0
Content-Length: 819
Content-Type: application/x-apple-binary-plist
CSeq: 1
DACP-ID: 93842EE68464F5B9
Active-Remote: 1149412352
User-Agent: AirPlay/409.16
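
The two request shapes above can be told apart after a plain header parse. The following is a minimal sketch, not the implementation of any particular receiver: it assumes CRLF line endings as in RFC 2326, and the helper names parse_rtsp_request and is_http_like are hypothetical.

```python
def parse_rtsp_request(raw: bytes):
    """Split a raw request into (method, uri, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    # The request line has three space-separated parts and ends with RTSP/1.0.
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are stored lowercased so lookups are case-insensitive.
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body


def is_http_like(method: str, uri: str) -> bool:
    # Assumption: HTTP-like requests use GET/POST with a path such as /info,
    # while standard RTSP requests use RTSP methods such as SETUP.
    return method in ("GET", "POST") and uri.startswith("/")


raw = (
    b"GET /info RTSP/1.0\r\n"
    b"CSeq: 0\r\n"
    b"User-Agent: AirPlay/409.16\r\n"
    b"\r\n"
)
method, uri, version, headers, body = parse_rtsp_request(raw)
```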

RTSP headers

Header                   Description
X-Apple-ProtocolVersion  Protocol version
Content-Length           Length of the content/body after the headers
Content-Type             Type of content
CSeq                     Specifies the sequence number for an RTSP request/response pair. Receiver must reply with the same CSeq. Incremented by one in every request.
DACP-ID                  64-bit value identifying the DACP server (remote control of the sender)
Active-Remote            Authentication token for the DACP server (remote control of the sender)
RTP-Info                 Sent with FLUSH for RTP synchronization
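
Since the receiver must reply with the same CSeq, a response builder can simply copy it from the request. This is a minimal sketch: the function name build_response is hypothetical, and the status line "RTSP/1.0 200 OK" follows RFC 2326 rather than anything stated in this document.

```python
def build_response(request_headers: dict, body: bytes = b"",
                   content_type: str = "application/x-apple-binary-plist") -> bytes:
    """Build a 200 OK reply that echoes the CSeq of the request."""
    lines = ["RTSP/1.0 200 OK", f"CSeq: {request_headers['CSeq']}"]
    if body:
        # Content headers are only added when there is a body to describe.
        lines.append(f"Content-Type: {content_type}")
        lines.append(f"Content-Length: {len(body)}")
    return "\r\n".join(lines).encode("ascii") + b"\r\n\r\n" + body
```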

RTSP requests

The receiver must implement the endpoints reported in the following subsections. Their behavior could be different depending on the content type.

GET /info

Sender needs info from the receiver

Content-type: application/x-apple-binary-plist

The sender asks for specific info. The request body is encoded as a binary plist. Usually the sender initiates the communication by requesting the txtAirPlay qualifier of the receiver with the following binary plist:

{'qualifier': ['txtAirPlay']}

This request actually asks for the TXT record of the _airplay._tcp mDNS service as a binary plist. This also suggests that a receiver can advertise a minimal mDNS TXT record and then declare further information at this stage.
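
As an illustration, the qualifier body can be encoded and decoded with Python's standard plistlib module. This sketch only shows the round trip. It does not claim to reproduce the exact bytes an Apple sender emits.

```python
import plistlib

# Encode the request body as a binary plist, as a sender would.
body = plistlib.dumps({"qualifier": ["txtAirPlay"]}, fmt=plistlib.FMT_BINARY)

# A receiver decodes it back into a dictionary; loads() detects the format.
decoded = plistlib.loads(body)
wants_txt_record = "txtAirPlay" in decoded.get("qualifier", [])
```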


No content type

The sender sends a second /info request to ask for additional information. The request has no body. The receiver replies with a binary plist which may contain the following key:value entries:

Key            Type     Description
initialVolume  Integer  A value from -144 to 0 corresponding to the initial volume of the receiver [dB]
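
A reply carrying initialVolume could be built as follows. This is a sketch under assumptions: build_info_reply is a hypothetical helper, and clamping out-of-range values to the documented -144 to 0 interval is a design choice of this example, not documented receiver behavior.

```python
import plistlib


def build_info_reply(initial_volume_db: int) -> bytes:
    """Encode the reply to the body-less /info request as a binary plist."""
    # Clamp to the documented range of -144 dB to 0 dB.
    volume = max(-144, min(0, initial_volume_db))
    return plistlib.dumps({"initialVolume": volume}, fmt=plistlib.FMT_BINARY)
```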


POST /auth-setup


POST /fp-setup


POST /pair-setup


POST /pair-verify


POST /command


POST /feedback

Probably a heartbeat to ensure the receiver is alive. Sent until the receiver is disconnected.


POST /audioMode



SETUP

This is the setup request used to establish the communication with the receiver and configure the time, event, control and data channels between the two. The time channel is used only with NTP synchronization and stays down when using PTP.

1) SETUP info and event

The sender communicates generic info about the device, the timing protocol, the timing peers and values related to encryption: ekey, eiv and et (encryption type). timingPeerInfo and timingPeerList are needed if the receiver supports PTP and the sender announces timingProtocol=PTP.

The receiver sets up an event channel (TCP) and communicates its port, together with its timing info. If the receiver declares PTP time synchronization, then timingPort won't be used. If sender and receiver use NTP instead, the receiver must open a timing channel and declare its port in timingPort.

The event channel must be open or the RTSP session won't continue.
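
The receiver side of this first SETUP can be sketched as below. Several names are assumptions made for illustration: the reply key eventPort, the value "NTP" for timingProtocol, and the helpers open_event_channel and build_setup_reply. Only timingPort and timingProtocol=PTP come from the description above.

```python
import plistlib
import socket


def open_event_channel(host: str = "127.0.0.1"):
    """Open a listening TCP socket on an OS-assigned port for the event channel."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))  # port 0 lets the OS pick a free port
    srv.listen(1)
    return srv, srv.getsockname()[1]


def build_setup_reply(request: dict, event_port: int, ntp_timing_port: int) -> bytes:
    """Build the binary plist reply to the first SETUP request."""
    reply = {"eventPort": event_port}  # key name is an assumption
    # timingPort is only declared when NTP is used; with PTP it stays unused.
    if request.get("timingProtocol") == "NTP":
        reply["timingPort"] = ntp_timing_port
    return plistlib.dumps(reply, fmt=plistlib.FMT_BINARY)
```

The event socket has to stay open for the rest of the session, so a real receiver would keep the returned socket object alive and accept the sender's connection on it.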


2) SETUP control and data

This SETUP request is sent as soon as audio streaming must start. The sender declares the audio format, latencies, its control port and the following parameters:

  • audioFormat - the audio format;
  • ct - compression type;
  • shk - shared encryption key;
  • spf - frames per packet.

The key type is used to declare the type of streaming.

Type  Description
96    General audio - Real time
103   General audio - Buffered
110   Screen
120   Playback
130   Remote control

The key ct stands for compression type.

Compression type  Description
1                 LPCM (Linear Pulse Code Modulation)
2                 ALAC (Apple Lossless)
4                 AAC (Advanced Audio Coding)
8                 AAC ELD (Enhanced Low Delay)
32                OPUS [1]
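
The type and ct values listed above map naturally onto enumerations. The following sketch uses hypothetical names (StreamType, CompressionType, describe_stream) and assumes the stream description arrives as a dictionary with the keys type, ct and spf. The value 352 in the usage line is only an illustrative frame count.

```python
from enum import IntEnum


class StreamType(IntEnum):
    REALTIME_AUDIO = 96
    BUFFERED_AUDIO = 103
    SCREEN = 110
    PLAYBACK = 120
    REMOTE_CONTROL = 130


class CompressionType(IntEnum):
    LPCM = 1
    ALAC = 2
    AAC = 4
    AAC_ELD = 8
    OPUS = 32


def describe_stream(stream: dict) -> str:
    """Return a readable summary; raises ValueError on unknown codes."""
    stype = StreamType(stream["type"])
    ct = CompressionType(stream["ct"])
    return f"{stype.name} / {ct.name}, {stream['spf']} frames per packet"


summary = describe_stream({"type": 96, "ct": 2, "spf": 352})
```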

The receiver prepares its control and data channels (UDP or TCP depending on the type) and communicates the respective ports in controlPort and dataPort. The control channel will receive RTCP [2] packets, while the data channel will receive the actual streaming payload over RTP [3].

The value of audioFormat is encoded as described in the section Audio codecs.



SET_PARAMETER

Used to set parameters on the receiver end or to communicate something, depending on Content-Type.

Content-Type               Body               Description
text/parameters            "volume: N"        Volume to set on the receiver
text/parameters            "progress: X/Y/Z"  Progress of the current track (start/current/end)
image/jpeg                 data               JPEG image of the artwork
application/x-dmap-tagged  data               Now playing info using DAAP
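
A text/parameters body can be parsed line by line. This is a minimal sketch with a hypothetical helper name. Treating the volume as a float and the three progress fields as integers are assumptions of this example.

```python
def parse_text_parameters(body: str) -> dict:
    """Parse 'key: value' lines from a text/parameters body."""
    params = {}
    for line in body.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "volume":
            params[key] = float(value)  # assumed to be numeric
        elif key == "progress":
            # Documented order is start/current/end.
            start, current, end = (int(x) for x in value.split("/"))
            params[key] = {"start": start, "current": current, "end": end}
        else:
            params[key] = value  # unknown keys are kept as raw strings
    return params
```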

Example: volume


GET_PARAMETER

Gets a parameter from the receiver.

Example: volume


SETPEERS

Used to communicate the other possible receivers in the multi-room group. The body is a binary plist containing a list of the IPv4 and IPv6 addresses of the devices in the group.
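
Decoding such a peer list could look like the sketch below. It assumes the plist root is a flat list of address strings, which the description suggests but does not spell out. The helper name parse_peers is hypothetical.

```python
import ipaddress
import plistlib


def parse_peers(body: bytes):
    """Split the peer addresses of a group into IPv4 and IPv6 lists."""
    peers = plistlib.loads(body)  # assumed: a flat list of address strings
    v4, v6 = [], []
    for text in peers:
        addr = ipaddress.ip_address(text)  # raises ValueError if malformed
        (v4 if addr.version == 4 else v6).append(addr)
    return v4, v6
```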

Example: single receiver

Example: multi-group join


RECORD

The sender wants to start streaming.



FLUSH

Sent every time the audio streaming is about to start. The RTSP request includes the header RTP-Info: seq=X;rtptime=Y, where X is the first RTP sequence number and Y is the first RTP timestamp.
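
The RTP-Info value can be split into its two fields with a few lines of code. This is a sketch with a hypothetical helper name. It assumes both fields are always present and decimal.

```python
def parse_rtp_info(value: str):
    """Return (seq, rtptime) from an RTP-Info value like 'seq=X;rtptime=Y'."""
    fields = dict(
        item.strip().split("=", 1) for item in value.split(";") if item.strip()
    )
    return int(fields["seq"]), int(fields["rtptime"])
```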



TEARDOWN

Sent when audio is paused or AirPlay is stopped. The body is a binary plist containing the active streams if audio is paused, or is empty if AirPlay is disconnected.
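
A receiver can tell the two cases apart by looking at the decoded body. In this sketch the key name streams is an assumption, and both an empty body and a plist without streams are treated as a disconnect, since the description leaves open which form "empty" takes.

```python
import plistlib


def teardown_kind(body: bytes) -> str:
    """Classify the request as 'pause' or 'disconnect' from its body."""
    plist = plistlib.loads(body) if body else {}
    # Assumed key name: a non-empty 'streams' list means audio is only paused.
    return "pause" if plist.get("streams") else "disconnect"
```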

Example: Pause

Example: Disconnect

  1. Definition of the Opus Audio Codec - https://tools.ietf.org/html/rfc6716

  2. RTP: A Transport Protocol for Real-Time Applications - SR: Sender Report RTCP Packet - https://tools.ietf.org/html/rfc3550#section-6.4.1

  3. RTP: A Transport Protocol for Real-Time Applications - https://tools.ietf.org/html/rfc3550