Skype for Business Media Relay Authentication Service (MRAS) and troubleshooting guidelines

In Skype for Business, every time you want to talk with a remote user, the clients need to find an available path for the SRTP media traffic.
When direct peer-to-peer connectivity is not available, the client can use either its STUN or its TURN candidate to start the media flow.

The following diagram shows a scenario in which the STUN address is used and the traffic flows through the NAT device to reach the remote user.

 

Sometimes NAT devices don’t allow this kind of traffic through. In that situation, the Edge Server (TURN candidate) is used for the media traffic.

 

Skype for Business is an enterprise product, and the security of each communication is extremely important.
This is why relaying media traffic over the Edge Server requires a mechanism that can guarantee this security standard.

That’s where the Media Relay Authentication Service (MRAS) comes into play.
MRAS is a service running on the Edge Server that provides the client with the credentials it uses to request ports and establish media sessions through the Edge Server. Without these credentials, clients cannot use the Edge Server as a candidate when trying to establish an SRTP session.

The following diagram highlights the MRAS flow for an external user (internal users follow the same mechanism, but bypass the Edge Server).

 

After the client sends the initial REGISTER (1) and receives the acknowledgment from the Front End server (2), it continues the conversation by sending a SERVICE request to get the MRAS token (3).
At this point, the Front End communicates with the Edge on port 5062 to get the credentials (4). These credentials are generated by using the internal certificate installed on the Edge Server and are sent back to the Front End (5). After the Front End receives the credentials, it sends them to the user in the 200 OK message.

The credentials provided to the user are valid for 8 hours; after that period, they have to be refreshed.
The MRAS response also contains the FQDN of the Edge media relay service for both internal and external users. This is the information the client uses to send its media stream when the Edge acts as media relay.
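As a rough illustration of the 8-hour validity, the sketch below computes when a set of credentials expires and whether it is time to refresh them. The function names and the 10-minute safety margin are assumptions made for this example; they are not taken from the product.

```python
from datetime import datetime, timedelta, timezone

# Validity of the MRAS credentials, as stated in the text above.
MRAS_CREDENTIAL_LIFETIME = timedelta(hours=8)


def credentials_expiry(issued_at):
    """Return the moment at which credentials issued at `issued_at` expire."""
    return issued_at + MRAS_CREDENTIAL_LIFETIME


def needs_refresh(issued_at, now, safety_margin=timedelta(minutes=10)):
    """Return True once `now` is within `safety_margin` of the expiry.

    The safety margin is a hypothetical value chosen for illustration.
    """
    return now >= credentials_expiry(issued_at) - safety_margin


# Example: credentials issued at 08:00 UTC.
issued = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
print("Expires at:", credentials_expiry(issued))
print("Refresh needed after 7 hours:",
      needs_refresh(issued, issued + timedelta(hours=7)))
```

When reading logs, comparing the timestamp of the last successful MRAS response with the time of a failed call can tell you whether stale credentials are a plausible cause.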

 

This is an example of an MRAS request and the response received by the client.

 

Clients are not the only ones that request MRAS credentials: other services, such as the MCUs, use the same mechanism.
If you are connected externally and want to participate in a conference hosted by an internal user, your media traffic flows as shown in the diagram below.

 

This is possible because the MCU service received a username and password in response to the MRAS request it sent after service start-up.

If internal or external clients are not able to obtain a valid MRAS token, they won’t be able to use the Edge Server as a relay.
This affects every call with external users or federated partners in which the Edge is in the media path.

 

In real life, there are different reasons why the MRAS request, and consequently calls with remote users, can fail. Here are a few of the most common:

Edge is not configured as Media Relay for the Pool
In an on-premises deployment, the Edge has to be configured as Media Relay for the Front End pool. If it is not configured, the client doesn’t even send a SERVICE request for the MRAS.
If you don’t see the MRAS SERVICE request at all, you need to associate the Edge pool with the Front End pool (see “Changing the Edge pool associated with a Front End pool”).

Firewall Issue blocking port 5062
As explained before, the Front End communicates with the Edge on port 5062 to obtain the MRAS token.
If the connection from the Front End to the Edge internal interface on port 5062 is denied, the user might not receive any answer to the MRAS SERVICE request.
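A quick way to rule this out is to test, from the Front End server, whether a TCP connection to the Edge internal interface on port 5062 can be opened at all. The following is a minimal sketch; the function name and the host name in the usage comment are hypothetical. It only proves that the TCP handshake completes and says nothing about the TLS or SIP layers on top of it.

```python
import socket


def can_connect(host, port=5062, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    A False result means the connection was refused, timed out, or the
    name could not be resolved; a firewall drop usually shows up as a timeout.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage, run on the Front End server (hypothetical Edge internal FQDN):
#   can_connect("edge-internal.contoso.local", 5062)
```

If the function returns False from the Front End but True from a host in the same subnet as the Edge, a firewall rule between the two networks is the most likely culprit.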

Edge routing issue
In enterprise environments, the client and server subnets can be numerous.
The Edge Server is placed in the DMZ and should be configured with the Internet breakout as its default gateway: any reply to a request coming from an “unknown subnet” is sent to the Internet gateway.
It’s important to configure static routes correctly on the Edge Server, so that replies to requests coming from “known subnets” (the internal subnets) are routed to the internal gateway.
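To make the routing decision concrete, here is a small sketch that mimics a longest-prefix-match lookup: given the static routes defined on the Edge and a client IP address, it returns the gateway the reply would be sent to. The subnets, gateway addresses, and function name are illustrative assumptions, not values from a real deployment.

```python
import ipaddress


def next_hop_for(client_ip, static_routes, default_gateway):
    """Return the gateway used to reach `client_ip`.

    `static_routes` maps a CIDR string to a gateway address. The most
    specific matching route wins; if none matches, the default gateway
    (the Internet breakout) is used.
    """
    address = ipaddress.ip_address(client_ip)
    best_network = None
    best_gateway = default_gateway
    for cidr, gateway in static_routes.items():
        network = ipaddress.ip_network(cidr)
        if address in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network = network
            best_gateway = gateway
    return best_gateway


# Hypothetical configuration: only 10.0.0.0/8 has a static route.
routes = {"10.0.0.0/8": "192.168.10.1"}
internet_gateway = "203.0.113.1"

print(next_hop_for("10.20.30.40", routes, internet_gateway))
print(next_hop_for("172.16.5.9", routes, internet_gateway))
```

In this example the second client sits in an internal subnet (172.16.0.0/12) that has no static route, so its replies would be sent toward the Internet gateway. That is the kind of gap to look for when only some internal subnets are affected.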

MRAS works fine, but calls fail
Once the client has received the MRAS information, it uses it for every call the user initiates.
This means that the client resolves the FQDN of the Media Relay Service (Edge FQDN) and sends traffic to it. The FQDN used by the client depends on its location (internal or external).
As a basic troubleshooting guideline, make sure that the Edge FQDN you see in the MRAS response resolves to the correct IP address of the Edge or VIP, and that communication on ports 443 TCP and 3478 UDP is allowed.
If you have a proxy in the environment, it’s better to exclude the Media Relay Service FQDN from it so that the media traffic does not flow through the proxy.
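The name-resolution part of this check can be sketched as follows: resolve the FQDN taken from the MRAS response on the affected client and compare the result with the addresses you expect for the Edge or its VIP. The function names and the sample FQDN and addresses are assumptions for illustration. Note that UDP 3478 cannot be verified with a simple connection attempt, so a network trace remains the more reliable tool for that port.

```python
import socket


def resolve_ips(fqdn):
    """Return the sorted, de-duplicated IP addresses that `fqdn` resolves to."""
    infos = socket.getaddrinfo(fqdn, None)
    return sorted({info[4][0] for info in infos})


def unexpected_ips(resolved, expected):
    """Return the resolved addresses that are not in the expected list."""
    return sorted(set(resolved) - set(expected))


# Usage on the affected client (hypothetical FQDN and Edge addresses):
#   resolved = resolve_ips("edge-internal.contoso.local")
#   print(unexpected_ips(resolved, ["10.0.0.5", "10.0.0.6"]))
```

An empty list from `unexpected_ips` means every address returned by DNS belongs to the Edge pool; anything else points to a stale or wrong DNS record.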

MRAS works fine, but some calls fail
This usually happens when you have multiple Edge servers, Front End servers, and client subnets. In complex environments, the network rules might not be implemented correctly for specific client subnets, or the routing might not be configured correctly in the organization.
Good knowledge of network troubleshooting helps in this case.

Another cause of similar problems, and one that is harder to identify from a network point of view, is related to certificates.
Take, for instance, a scenario with two Edge servers balanced by DNS load balancing that have different internal certificates installed.
In this setup, if the client receives the MRAS credentials from Edge Server A but uses Edge Server B for the media stream, the call fails.
As explained before, the MRAS username and password are generated using the Edge internal certificate, and the same certificate is used to validate the credentials when a client presents them.
If one certificate is used to generate the credentials, a different certificate cannot be used to read them.
When different certificates are installed on Edge Servers belonging to the same Edge pool, every call that doesn’t use the same server for the MRAS request and the media stream fails.
It’s mandatory that every Edge Server belonging to a given pool has the same internal certificate installed.
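One way to spot this mismatch is to compare certificate thumbprints, that is, the SHA-1 hash of the DER-encoded certificate, which is the value Windows displays as the thumbprint. The sketch below assumes you have already exported the internal certificate of each Edge server as DER bytes; the function names and server names are hypothetical.

```python
import hashlib


def thumbprint(der_bytes):
    """Return the SHA-1 thumbprint of a DER-encoded certificate (uppercase hex)."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def servers_by_thumbprint(certificates):
    """Group server names by the thumbprint of their certificate.

    `certificates` maps a server name to the DER bytes of its internal
    certificate.
    """
    groups = {}
    for server, der_bytes in certificates.items():
        groups.setdefault(thumbprint(der_bytes), []).append(server)
    return groups


def pool_is_consistent(certificates):
    """Return True if every server in the pool presents the same certificate."""
    return len(servers_by_thumbprint(certificates)) <= 1


# Usage with exported .cer files (hypothetical names, DER encoding assumed):
#   certs = {name: open(path, "rb").read() for name, path in files.items()}
#   print(servers_by_thumbprint(certs))
```

More than one group in the output of `servers_by_thumbprint` means the pool members do not share the same internal certificate, and the group membership tells you which servers need to be fixed.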

 

These are some of the most common misconfigurations I’ve seen in customer deployments. Checking these common issues before proceeding with deep network analysis might help you save hours of troubleshooting.

 
