The AirPlay 2 sender communicates with receivers through the RTSP protocol defined in RFC 2326. However, Apple's RTSP implementation has been customized to behave like both an RTSP server and an HTTP server.
The request line ends with RTSP/1.0 even when the request-URI is an HTTP-like path, for example GET /info RTSP/1.0.
An RTSP HTTP-like request can have the following headers:
GET /info RTSP/1.0
X-Apple-ProtocolVersion: 1
Content-Length: 70
Content-Type: application/x-apple-binary-plist
CSeq: 0
DACP-ID: 93842EE68464F5B9
Active-Remote: 1149412352
User-Agent: AirPlay/409.16
Instead, a standard RTSP request can have the following headers:
SETUP rtsp://192.168.1.12/15566703517752576217 RTSP/1.0
Content-Length: 819
Content-Type: application/x-apple-binary-plist
CSeq: 1
DACP-ID: 93842EE68464F5B9
Active-Remote: 1149412352
User-Agent: AirPlay/409.16
|Header||Description|
|Content-Length||Length of the content/body after the headers|
|Content-Type||Type of content|
|CSeq||Sequence number of an RTSP request/response pair. It is incremented by one in every request, and the receiver must reply with the same CSeq.|
|DACP-ID||64-bit value identifying the DACP server (remote control of the sender)|
|Active-Remote||Authentication token for the DACP server (remote control of the sender)|
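To make the header handling concrete, here is a minimal receiver-side sketch that splits a raw request into its parts and builds a reply that echoes the CSeq. The function names `parse_request` and `build_response` are hypothetical, and the CRLF framing and the `RTSP/1.0 200 OK` status line are taken from RFC 2326 rather than from Apple's implementation.

```python
def parse_request(raw: bytes):
    """Split a raw request into (method, uri, headers, body)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    # Works for both "GET /info RTSP/1.0" and "SETUP rtsp://... RTSP/1.0".
    method, uri, _version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # case-insensitive names
    length = int(headers.get("content-length", 0))
    return method, uri, headers, rest[:length]


def build_response(request_headers: dict, body: bytes = b"",
                   content_type: str = "") -> bytes:
    """Build a 200 OK reply that carries the same CSeq as the request."""
    lines = ["RTSP/1.0 200 OK", f"CSeq: {request_headers['cseq']}"]
    if body:
        lines.append(f"Content-Length: {len(body)}")
        if content_type:
            lines.append(f"Content-Type: {content_type}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body
```

Lower-casing header names is a design choice of this sketch, so that lookups do not depend on how the sender capitalizes them.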
The receiver must implement the endpoints described in the following subsections. Their behavior may differ depending on the content type of the request.
Sender needs info from the receiver
The sender asks for specific information, and the request body is encoded as a binary plist. The sender usually opens the communication by requesting the qualifier of the receiver with the following binary plist:
This request actually asks for the TXT record of the _airplay._tcp mDNS service as a binary plist. This also suggests that a receiver can advertise a minimal TXT mDNS record and then declare further information at this stage.
No content type
The sender sends a second /info request to ask for additional information. The request has no body. The receiver replies with a binary plist which may contain the following key:value entries:
|Key||Type||Description|
|initialVolume||Integer||Initial volume of the receiver in dB, from -144 to 0|
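As an illustration of the binary plist bodies used here, the following sketch encodes and decodes a reply with Python's standard `plistlib`. The helper names are hypothetical, and the only key used is `initialVolume` from the table above; a real reply would carry more entries.

```python
import plistlib


def encode_info_reply(info: dict) -> bytes:
    """Serialize the reply dictionary as a binary plist."""
    return plistlib.dumps(info, fmt=plistlib.FMT_BINARY)


def decode_info_reply(body: bytes) -> dict:
    """Parse a binary plist reply and sanity-check initialVolume if present."""
    info = plistlib.loads(body)  # the plist format is detected automatically
    volume = info.get("initialVolume")
    if volume is not None and not -144 <= volume <= 0:
        raise ValueError(f"initialVolume out of range: {volume}")
    return info


# Example value only; -20 is not taken from a real capture.
body = encode_info_reply({"initialVolume": -20})
reply = decode_info_reply(body)
```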
Probably a heartbeat to ensure the receiver is alive. Sent until the receiver is disconnected.
The SETUP request establishes the communication with the receiver and configures the time, event, control and data channels between the two. The time channel is used only with NTP synchronization and is not opened when PTP is used.
1) SETUP info and event
The sender communicates generic information about the device, the timing protocol, the timing
peers and values related to encryption:
et (encryption type).
timingPeerList are needed if receiver supports PTP and the sender announces
The receiver sets up an event channel (TCP) and communicates its port, together with its timing information. If the receiver declares PTP time synchronization, timingPort is not used. If sender and receiver use NTP instead, the receiver must open a timing channel and declare its port in timingPort.
The event channel must be open, or the RTSP session won't continue.
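The rule above can be sketched as a small function that decides which ports go into the reply. This is only an illustration under assumptions: `build_setup_reply` is a hypothetical name, the `eventPort` key and the `"NTP"`/`"PTP"` strings are assumed for the example, and leaving `timingPort` out under PTP is one possible reading of "not used".

```python
from typing import Optional


def build_setup_reply(timing_protocol: str, event_port: int,
                      timing_port: Optional[int] = None) -> dict:
    """Return the dictionary to serialize as the binary plist reply."""
    if not 0 < event_port < 65536:
        # Without an open event channel the session cannot continue.
        raise ValueError("event channel must be open before replying")
    reply = {"eventPort": event_port}  # key name is an assumption
    if timing_protocol == "NTP":
        if timing_port is None:
            raise ValueError("NTP requires an open timing channel")
        reply["timingPort"] = timing_port
    # With PTP no timing channel is opened, so timingPort is left out here.
    return reply
```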
2) SETUP control and data
This SETUP request is sent as soon as audio streaming must start. The sender declares the audio format, latencies, its control port and the following parameters:
audioFormat - the audio format;
ct - compression type;
shk - shared encryption key;
spf - frames per packet.
type is used to declare the type of streaming.
|type||Description|
|96||General audio - Real time|
|103||General audio - Buffered|
ct stands for compression type.
|ct||Description|
|1||LPCM (Linear Pulse Code Modulation)|
|2||ALAC (Apple Lossless)|
|4||AAC (Advanced Audio Coding)|
|8||AAC ELD (Enhanced Low Delay)|
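The two value tables above map naturally onto enumerations. The sketch below shows one way a receiver might decode `type` and `ct` from the parsed plist; the class and function names are hypothetical, while the numeric values are those listed in the tables.

```python
from enum import IntEnum


class StreamType(IntEnum):
    REALTIME = 96    # General audio - Real time
    BUFFERED = 103   # General audio - Buffered


class CompressionType(IntEnum):
    LPCM = 1
    ALAC = 2
    AAC = 4
    AAC_ELD = 8


def describe_stream(params: dict) -> str:
    """Turn the numeric type/ct fields into a readable summary.

    Raises ValueError for values that are not in the tables.
    """
    stream = StreamType(params["type"])
    codec = CompressionType(params["ct"])
    return f"{stream.name.lower()} audio, {codec.name}"
```

Using `IntEnum` means an unknown value fails loudly with `ValueError` instead of being silently accepted.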
The receiver prepares its control and data channels (UDP or TCP depending on the type) and communicates the respective ports in
dataPort. The control channel receives RTCP packets, while the data channel carries the actual streaming payload over RTP.
The value of audioFormat is encoded as described in the Audio codecs section.
Used to set parameters on the receiver or to communicate information to it, depending on the content type:
|Content-Type||Body||Description|
|text/parameters||"volume: N"||Volume to set on the receiver|
|text/parameters||"progress: X/Y/Z"||Progress of the current track (start/current/end)|
|image/jpeg||data||JPEG image of the artwork|
|application/x-dmap-tagged||data||Now playing info using DAAP|
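For the text/parameters case, a receiver could parse the body along these lines. This is a minimal sketch: the function names are hypothetical, and treating the body as UTF-8 `key: value` lines and the progress fields as integers are assumptions based on the two formats in the table.

```python
def parse_text_parameters(body: bytes) -> dict:
    """Parse 'key: value' lines from a text/parameters body."""
    params = {}
    for line in body.decode("utf-8").splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank or malformed lines
            params[key.strip()] = value.strip()
    return params


def parse_progress(value: str) -> tuple:
    """Split 'X/Y/Z' into (start, current, end) integers."""
    start, current, end = (int(part) for part in value.split("/"))
    return start, current, end


# Example values only; they are not taken from a real capture.
params = parse_text_parameters(b"volume: -20.5\r\n")
volume = float(params["volume"])
```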
Get a parameter from the receiver.
Used to communicate the other possible receivers in the multi-room group. The body is a binary plist containing a list of the IPv4 and IPv6 addresses of the devices in the group.
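A receiver might decode such a body as in the sketch below, which separates the IPv4 and IPv6 peers using the standard `ipaddress` module. It assumes the plist is a top-level list of address strings; `parse_peers` is a hypothetical name, and the IPv6 address in the usage line is made up for the example.

```python
import ipaddress
import plistlib


def parse_peers(body: bytes):
    """Return (ipv4_peers, ipv6_peers) from a binary plist address list."""
    addresses = plistlib.loads(body)  # assumed: a list of address strings
    v4, v6 = [], []
    for text in addresses:
        addr = ipaddress.ip_address(text)  # raises ValueError if malformed
        (v4 if addr.version == 4 else v6).append(addr)
    return v4, v6


# Illustrative body built locally, not captured traffic.
example = plistlib.dumps(["192.168.1.12", "fe80::1"], fmt=plistlib.FMT_BINARY)
v4_peers, v6_peers = parse_peers(example)
```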
Example: single receiver
Example: multi-group join
The sender wants to start streaming.
Sent every time the audio streaming is about to start. The RTSP request includes the header RTP-Info: seq=X;rtptime=Y, where X is the first RTP sequence number and Y is the first RTP timestamp.
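Extracting the two values from the RTP-Info header is straightforward. The sketch below uses a hypothetical helper name and made-up numbers, and assumes both fields are decimal integers.

```python
def parse_rtp_info(value: str) -> tuple:
    """Parse 'seq=X;rtptime=Y' into (first_sequence, first_timestamp)."""
    fields = {}
    for part in value.split(";"):
        key, sep, number = part.strip().partition("=")
        if sep:
            fields[key] = int(number)
    # KeyError here means the header lacked one of the two fields.
    return fields["seq"], fields["rtptime"]


# Example values only.
first_seq, first_ts = parse_rtp_info("seq=1234;rtptime=567890")
```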
Sent when audio is paused or AirPlay is stopped. The body is a binary plist containing the active streams if audio is paused, or it is empty if AirPlay is disconnected.