By Mitrasingh Chetlall, Product Manager, Qosmos Division, Enea
Why is traffic encryption on the rise?
Encryption on the public Internet is constantly rising, with current estimates showing that over 80% of web traffic will be encrypted by the end of 2019. A few content providers (e.g. Facebook, YouTube, and Netflix) are responsible for most of the encrypted traffic. Overall, this is a positive development for privacy protection on the Internet, a trend that has accelerated since Snowden’s revelations about NSA interception activities.
Similar encryption trends can be observed for data centers, with Yahoo, Google, and Microsoft encrypting all their data center traffic. In the enterprise space, more than 40% of traffic is currently encrypted due to a major shift towards cloud-based applications in the last 2 years.
The encryption adoption rate on the public Internet has remained steady at around 30% for the last 3 years. This pace is likely to increase following encryption-friendly initiatives like “Let’s Encrypt” and “Encryption Everywhere” by the Internet Security Research Group (ISRG) and Symantec respectively.
In the enterprise space, the current shift towards a cloud-native approach suggests that cloud-native adoption will reach 30% by 2020 (from 15% today). This, coupled with upcoming 2018 compliance regulations such as the General Data Protection Regulation (GDPR) [1], the Network and Information Security (NIS) Directive [2] and ePrivacy [3], will speed up the move towards encryption, which is forecast to reach close to 50% by 2020.
How to classify encrypted traffic
It is important to remember that encryption does not mean that the traffic is undetectable; it just means that the content remains private. Advanced techniques, such as Deep Packet Inspection (DPI), can still classify encrypted traffic, enabling service providers to continue to perform policy enforcement, optimize traffic and ensure a good user experience.
The primary aim of a DPI engine is to classify applications and services independently of the underlying (transport) protocols, whether or not these are encrypted. To achieve this goal, Enea’s Qosmos Labs have reverse engineered the most important L4 and L7 protocols to identify key invariants that are used to classify the top applications. Among these protocols are those adding encryption layers, such as TLS (1.1, 1.2 and 1.3), DTLS, QUIC, SPDY and HTTP/2. Furthermore, Qosmos technology uses DNS caching as well as statistical traffic models to identify the remaining traffic.
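As a rough illustration of how DNS caching can help label a flow that carries no readable name of its own, the sketch below remembers which hostname each server IP address was resolved from and looks the address up again when a flow to it appears. The class name `DnsCache` and its methods are hypothetical and chosen for this example only; the article does not describe how Qosmos implements this step.

```python
import time
from typing import Dict, Optional, Tuple


class DnsCache:
    """Maps server IP addresses to the hostname seen in a DNS answer (illustrative only)."""

    def __init__(self) -> None:
        # ip -> (hostname, expiry time in seconds)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def observe_answer(self, hostname: str, ip: str, ttl: float,
                       now: Optional[float] = None) -> None:
        """Record a DNS answer 'hostname resolves to ip' for ttl seconds."""
        now = time.monotonic() if now is None else now
        self._entries[ip] = (hostname, now + ttl)

    def lookup(self, ip: str, now: Optional[float] = None) -> Optional[str]:
        """Return the hostname last resolved to ip, or None if unknown or expired."""
        now = time.monotonic() if now is None else now
        entry = self._entries.get(ip)
        if entry is None or entry[1] <= now:
            return None
        return entry[0]


cache = DnsCache()
cache.observe_answer("video.example.com", "203.0.113.10", ttl=300, now=0.0)
print(cache.lookup("203.0.113.10", now=10.0))   # hostname is still cached
print(cache.lookup("203.0.113.10", now=301.0))  # entry has expired
```

A later encrypted flow to 203.0.113.10 could then be attributed to the service behind video.example.com, as long as the cached answer has not expired.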
Here are a few examples of encrypted traffic classification techniques, with indications of their accuracy and limitations.
Example 1: Classifying traffic encrypted with SSL/TLS (e.g. HTTPS)
Typical protocols: Google, Facebook, WhatsApp
Classification method: Read name of service in SSL/TLS certificate or in Server Name Indication (SNI)
Accuracy: Deterministic method – 100% accurate
Limitations: If SNI does not appear at the start of the handshake, the SSL/TLS certificate may only be available after 5 or 6 packets, which can cause a slight delay. Depending on the content provider, the same certificate may be used for different services (like email, news etc.).
Important: It is commonly believed that TLS 1.3 makes traffic undetectable with current techniques. In the latest draft of TLS 1.3 (draft-ietf-tls-tls13-23), however, the SNI remains in clear text and is even exchanged in the session resumption processes (0-RTT or 1-RTT). This means that current classification techniques, including those used by Qosmos ixEngine®, will continue to be effective.
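To make the SNI method concrete, here is a minimal sketch that reads the server name from a TLS ClientHello. It assumes the whole ClientHello arrives in a single, unfragmented TLS record, and it is not the Qosmos ixEngine implementation. The function names `extract_sni` and `build_client_hello` are hypothetical; the second one only builds a small test message.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name found in a TLS ClientHello record, or None."""
    try:
        if record[0] != 0x16:              # record type 0x16 = handshake
            return None
        pos = 5                            # skip record header: type, version, length
        if record[pos] != 0x01:            # handshake type 0x01 = ClientHello
            return None
        pos += 4                           # handshake type (1) + length (3)
        pos += 2 + 32                      # client version + random
        pos += 1 + record[pos]             # session id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len              # cipher suites
        pos += 1 + record[pos]             # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0x0000:         # server_name extension
                # layout: list length (2), name type (1), name length (2), name
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                name = record[pos + 5:pos + 5 + name_len]
                if name_type != 0 or len(name) != name_len:
                    return None
                return name.decode("ascii")
            pos += ext_len                 # skip extensions we do not need
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                        # truncated or malformed input
    return None


def build_client_hello(host: str) -> bytes:
    """Build a minimal ClientHello carrying only an SNI extension (demo helper)."""
    name = host.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni = struct.pack("!H", len(entry)) + entry
    ext = struct.pack("!HH", 0x0000, len(sni)) + sni
    body = (b"\x03\x03" + bytes(32) + b"\x00"        # version, random, empty session id
            + struct.pack("!H", 2) + b"\x13\x01"     # one cipher suite
            + b"\x01\x00"                            # null compression only
            + struct.pack("!H", len(ext)) + ext)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


print(extract_sni(build_client_hello("www.example.com")))
```

Because the name is read from the first client message, this path avoids the delay of waiting for the server certificate mentioned under Limitations.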
Example 2: Classifying encrypted P2P
Typical protocols: BitTorrent (RC4 Encrypted), Ares
Classification method: Statistical Protocol Identification (SPID)
Statistical protocol identification is a technique based on measuring the divergence between the traffic being analyzed and a statistical traffic model. Qosmos technology uses a statistical traffic model specially developed in-house.
Accuracy: Typically more than 90% of P2P sessions are identified for RC4 encrypted BitTorrent.
Additional info: This technique requires more packets than others (typically 15 for BitTorrent) in order to have a correct traffic model for matching.
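The divergence measurement described above can be sketched as follows. This is a simplified illustration, not the in-house Qosmos model: it assumes the only attribute measured is the byte-value frequency of the payloads, and it uses the Kullback-Leibler divergence as the distance. The function names, the smoothing constant and the threshold are all assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Optional

SMOOTHING = 1e-6  # assumed constant; keeps log() defined for byte values never seen


def byte_distribution(payloads: Iterable[bytes]) -> List[float]:
    """Smoothed probability of each byte value (0..255) over the given payloads."""
    counts: Counter = Counter()
    for payload in payloads:
        counts.update(payload)             # iterating bytes yields ints 0..255
    denom = sum(counts.values()) + 256 * SMOOTHING
    return [(counts.get(b, 0) + SMOOTHING) / denom for b in range(256)]


def kl_divergence(p: List[float], q: List[float]) -> float:
    """Kullback-Leibler divergence D(p || q) between two distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def classify(payloads: Iterable[bytes], models: Dict[str, List[float]],
             threshold: float) -> Optional[str]:
    """Return the model closest to the observed traffic, or None if none is close enough."""
    observed = byte_distribution(payloads)
    best_name, best_div = None, float("inf")
    for name, model in models.items():
        div = kl_divergence(observed, model)
        if div < best_div:
            best_name, best_div = name, div
    return best_name if best_div <= threshold else None


# Toy models built from synthetic payloads (not real protocol captures).
models = {
    "text-like": byte_distribution([b"GET /index.html HTTP/1.1\r\n"] * 10),
    "random-like": byte_distribution([bytes(range(256))] * 4),
}
print(classify([b"GET /html.index HTTP/1.1\r\n"], models, threshold=1.0))
```

The observed distribution only becomes stable once enough payload bytes have been collected, which is consistent with this technique needing more packets than the others.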
Example 3: Classifying Skype
Classification method: Search for binary patterns in traffic flows
This pattern is usually found in the first 2 or 3 packets.
Accuracy: 90–95% accurate
Additional info: In addition to the search for binary patterns, a statistical method is used to identify different services within Skype such as Skype voice, Skype video, and Skype chat. This method uses a combination of jitter, delay, length of packets, spacing of packets, etc.
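The two steps above can be sketched as a pattern search over the first packets of a flow, followed by the computation of simple flow statistics. The signature bytes below are invented for illustration and are not a real Skype pattern. The function names are hypothetical, and the feature set (mean packet length, mean spacing, jitter taken as the standard deviation of the spacing) is an assumption about what such a method might measure.

```python
from statistics import mean, pstdev
from typing import Dict, Optional, Sequence

# Hypothetical signature table: NOT a real Skype pattern.
SIGNATURES: Dict[str, bytes] = {"example-app": bytes.fromhex("17f1a0")}


def match_signature(packets: Sequence[bytes],
                    signatures: Dict[str, bytes] = SIGNATURES,
                    max_packets: int = 3) -> Optional[str]:
    """Search the first max_packets payloads of a flow for a known binary pattern."""
    for payload in packets[:max_packets]:
        for name, pattern in signatures.items():
            if pattern in payload:
                return name
    return None


def flow_features(sizes: Sequence[int], timestamps: Sequence[float]) -> Dict[str, float]:
    """Compute simple per-flow statistics; needs at least two packets."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return {
        "mean_size": mean(sizes),   # average packet length in bytes
        "mean_gap": mean(gaps),     # average spacing between packets in seconds
        "jitter": pstdev(gaps),     # variation of the spacing
    }


flow = [b"\x00\x01", b"\xaa\x17\xf1\xa0\xbb", b"\x02"]
print(match_signature(flow))
print(flow_features([100, 200, 300], [0.0, 0.5, 1.0]))
```

In a real classifier, features like these would be compared against per-service profiles (voice, video, chat) rather than printed.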
Thanks to advanced classification techniques, traffic optimization, policy enforcement, and user experience are largely unaffected by encryption. This means that communication service providers can continue to leverage network intelligence to ensure service quality and manage resource utilization, while respecting subscriber privacy.
About the Author
(Mitrasingh) Danny Chetlall is Product Manager at the Qosmos Division of Enea. He has over 14 years’ experience in the architecture, design and development of IP technologies with specific expertise in Deep Packet Inspection and networking applications. His responsibilities include the protocol signature and protocol bundle updates for Qosmos ixEngine.