Recognition of traffic generated by WebRTC communication

Network traffic recognition is a basic condition for network operators to differentiate and prioritize traffic for a number of purposes, from guaranteeing Quality of Service (QoS) to security monitoring and anomaly detection. Web Real-Time Communication (WebRTC) is an open-source project that enables real-time audio, video, and text communication among browsers. Since WebRTC does not include any characteristic pattern for semantically based traffic recognition, this paper proposes models for recognizing traffic generated during WebRTC audio and video communication based on statistical characteristics and machine learning in the Weka tool. Five classification algorithms have been used for model development: Naive Bayes, J48, Random Forest, REPTree, and BayesNet. The results show that J48 and BayesNet perform best in this experimental case of WebRTC traffic recognition. Future work will focus on comparing a wider range of machine learning algorithms using a dataset large enough to improve the significance of the results.


Introduction
Web Real-Time Communication (WebRTC) is an open-source project that enables direct real-time communication in web browsers and mobile applications through a simple Application Programming Interface (API) [1]. It contains the basic building blocks for high-quality communication on the web, such as the network, audio, and video components used in audio and video applications. WebRTC aims to create a secure media connection between two or more web browsers without the need to install plugins or download native applications. Using existing protocols and APIs, WebRTC enables audio and video communication between users via a peer-to-peer connection. It is currently supported by all modern web browsers, including Chrome, Mozilla Firefox, Safari, Opera, and other Chromium-based browsers [2] [3].
In recent years, real-time network traffic classification has been a major challenge and an increasingly important area, with applications ranging from Quality of Service (QoS) to security monitoring and anomaly detection. The main goal of classification is to automatically recognize the application that generated a given packet flow by direct or passive observation of individual packets or packet flows passing through a network [4]. In the past, traffic classification was largely based on well-known protocol ports. WebRTC-based applications use random or non-standard ports, which makes port-based approaches much less effective than in the past. In addition, this mode of communication does not include any characteristic pattern for semantically based recognition [5] [6].
Newer techniques classify traffic by recognizing statistical patterns in externally observable traffic attributes, such as packet length and packet arrival time. The statistical method aims to group or classify network traffic flows into groups that have identical statistical properties. The need to classify or group large data sets is one of the reasons for the introduction of Machine Learning (ML) techniques [7]. Statistical methods for accurate and efficient traffic recognition can be divided according to the type of machine learning used, supervised or unsupervised. The aim of this paper is to propose a model for recognizing traffic generated during WebRTC audio and video communication based on statistical characteristics and the use of machine learning.
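As a minimal sketch of the externally observable attributes mentioned above, the following Python snippet summarizes one flow by the mean and standard deviation of its packet lengths and inter-arrival times. The `Packet` tuple, the function name `flow_features`, and the sample values are illustrative assumptions and are not taken from the experiment described in this paper.

```python
from statistics import mean, pstdev
from typing import NamedTuple


class Packet(NamedTuple):
    timestamp: float  # capture time in seconds (hypothetical field name)
    length: int       # packet length in bytes (hypothetical field name)


def flow_features(packets: list[Packet]) -> dict[str, float]:
    """Summarize one flow by packet-length and inter-arrival statistics."""
    if len(packets) < 2:
        raise ValueError("at least two packets are needed for inter-arrival times")
    ordered = sorted(packets, key=lambda p: p.timestamp)
    lengths = [p.length for p in ordered]
    # Inter-arrival time: difference between consecutive capture times
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    return {
        "mean_length": mean(lengths),
        "std_length": pstdev(lengths),
        "mean_gap": mean(gaps),
        "std_gap": pstdev(gaps),
    }


# Hypothetical flow of four packets
example_flow = [
    Packet(0.00, 120),
    Packet(0.02, 1100),
    Packet(0.04, 1100),
    Packet(0.06, 120),
]
features = flow_features(example_flow)
```

A feature vector of this kind, one per flow, is what a supervised classifier would be trained on once each flow carries a class label.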
The rest of the paper is structured as follows. Section 2 provides a brief review of issues, methods, tools, and algorithms for network traffic recognition. Section 3 gives an insight into the methodology used to perform the experimental study. Section 4 provides the results of the study in the form of a model for recognizing traffic generated during WebRTC audio and video communication. Section 5 concludes the paper and proposes the direction for future work.

Related work
This section provides an insight into related work that suggests models for network traffic recognition. Table 1 presents a non-exhaustive review of related work, which considers types of traffic, methods, tools, and algorithms for network traffic recognition. Also, Table 1 shows the main conclusions of the considered works. We have chosen 12 papers according to their relevance to a given topic.
In recent literature, several ways have been introduced to recognize the traffic generated during WebRTC audio and video communication based on statistical characteristics and the usage of machine learning. Two papers, [5] and [8], have proposed models based on decision theory that enable recognition of encrypted WebRTC traffic using machine learning techniques in the Weka tool. Both papers also compare the most important classification algorithms, namely J48, Simple Cart, Naive Bayes, and Random Forest. The evaluation shows that the J48, Simple Cart, and Random Forest algorithms achieve mutually comparable performance that is better than that of the Naive Bayes algorithm. The experiment also suggests that J48 offers the best results in terms of False Positive Rate (FPR), whereas Random Forest performs better in terms of True Positive Rate (TPR).
Bayesian analyses, implemented in the Weka environment, are discussed in [9], [10], and [11]. Compared to other network traffic classification algorithms, the obtained results show the efficiency of the Naive Bayes algorithm in terms of accuracy.
Models for recognizing Skype traffic based on statistical characteristics and the usage of machine learning have been proposed in [12] and [13]. An assessment of classification algorithms, namely J48, Simple Cart, and Naive Bayes, has been presented in [12]. The comparison of algorithms shows that J48 and Simple Cart achieve the best results. The authors of reference [13] propose appropriate machine learning tools for recognizing Skype traffic, implement a system that separates Voice over Internet Protocol (VoIP) calls made to Skype, and define functions to eliminate repetitive and redundant information, while providing a way to filter out records based on Internet Protocol (IP) address.
A framework based on two complementary techniques to reveal Skype traffic in real time is presented in [14]. The first approach is based on statistical recognition of Skype traffic using Pearson's Chi-Square test. In contrast, the second approach is based on a stochastic recognition of Skype traffic in terms of packet arrival speed and packet length, which are used as characteristics of a decision process based on the Naive Bayes algorithm. Experimental results obtained from measurements in different networks show that the combination of these techniques is very effective in identifying Skype traffic.
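For readers unfamiliar with the test named above, the following Python sketch computes the generic Pearson chi-square statistic, the sum of (observed - expected)^2 / expected over all categories. It is not the procedure of [14]: the function name, the choice of a uniform expected distribution, and the counts are hypothetical and serve only to illustrate the statistic.

```python
def pearson_chi_square(observed: list[int], expected: list[float]) -> float:
    """Return the Pearson chi-square statistic sum((O - E)^2 / E)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of the four values of a 2-bit field over 400 packets,
# compared against a uniform distribution (an assumption for illustration)
observed_counts = [90, 110, 105, 95]
uniform_share = sum(observed_counts) / len(observed_counts)
expected_counts = [uniform_share] * len(observed_counts)
statistic = pearson_chi_square(observed_counts, expected_counts)
```

In a test of this kind, the statistic would be compared with a critical value of the chi-square distribution with one degree of freedom fewer than the number of categories; a small value indicates that the observed counts are consistent with the expected distribution.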
The authors of reference [15] compare the performance of machine learning algorithms for network traffic recognition. The performance assessment covers five IP traffic classification algorithms: Naive Bayes with Discretization (NBD), Naive Bayes Kernel Estimation (NBKE), J48, BayesNet, and Naive Bayes Tree (NBTree). In terms of classification speed, the J48 algorithm identified network flows faster than the remaining algorithms. The experimental results show that the NBKE algorithm has the slowest classification speed, followed by NBTree, BayesNet, NBD, and J48. The time taken to build the model shows that NBTree is the slowest by a considerable margin. The remaining algorithms were more uniform: a classifier is built fastest by NBKE, followed by NBD, BayesNet, and J48.
A method for recognizing peer-to-peer network traffic among BitTorrent, PPLive, Skype, and MSN Messenger, based on the Support Vector Machine (SVM) algorithm, has been proposed in [16]. The experiment shows that this method can effectively classify peer-to-peer flows, even when the application layer protocol is encrypted and for some network flows that are otherwise difficult to classify.
Previous related works rely on the classification of network traffic using statistical characteristics computed over the full flow. The authors of references [17] and [18] propose a novel approach that trains the machine learning classifier using statistical features calculated over multiple short subflows extracted from the full flow generated by the target application. Using the Naive Bayes algorithm, this approach achieved excellent performance even when classification is initiated mid-way through a flow.
Most related work from Table 1 is based on statistical methods (91.67%) and on machine learning algorithms using the Weka tool (75%). A number of classification algorithms were tested in the related works, among which the most common are Naive Bayes (75%), J48 (50%), and Simple Cart (25%). The overall analysis, which considers methods, tools, and algorithms for the classification of network traffic, served to define the research methodology used in this paper.
As stated in Table 1, only two references (16.67%) study the classification of traffic generated during WebRTC communication, and they compare only four classification algorithms. Therefore, the aim is to examine a number of algorithms for classifying WebRTC traffic and to propose a model for recognizing such traffic based on statistical characteristics and the usage of machine learning in the Weka tool.

A. Experiment Design
In order to achieve the aforementioned aim, an experiment environment was set up and configured on an HP laptop with Ubuntu virtual machines. One virtual machine was used to install Jitsi Meet [19], a WebRTC open-source media server that achieved the best performance for a relatively small number of participants [20], and another one was used to install Ostinato [21], a tool for generating additional network traffic. This laptop was connected to an Innbox F60 FTTH wireless router, which provided access to a Wireless Fidelity (Wi-Fi) network. To conduct the WebRTC audio and video call over the Wi-Fi network and collect all generated traffic, two laptops with Google Chrome browser version 85.0.4183.121 installed were used.
Two users participated in this experiment. They performed a free conversation task between participants who know each other and are located in different rooms, as recommended in ITU-T P.805 [22].

B. Experiment Procedure
Both participants, after connecting to the server, had to perform eight steps, i.e., (i) launch Google Chrome browser, (ii) enter the domain jitsitest.mms.com, (iii) enter the name of pre-arranged common room and start audio and video call over Jitsi Meet, (iv) start the network traffic generator Ostinato, (v) start recording network traffic via Wireshark, (vi) participate in an audio and video call lasting 3 minutes, (vii) after the expiration of the defined time, and before the communication is interrupted, stop recording and save recorded network traffic via Wireshark [23], (viii) stop audio and video call and close the web browser.
The recorded network traffic was processed using the Weka tool [24], and 10-fold cross-validation was used for evaluation. From the recorded traffic, features such as time, source address, source port, destination address, destination port, protocol, length, and ID can be extracted. The resulting .csv file contains 720 samples with the 8 previously mentioned attributes, on the basis of which the packets were classified as WebRTC (89 samples) or Normal (631 samples). The different sizes of these classes result in the prior probabilities $P(\text{Normal}) = 631/720 \approx 0.8764$ and $P(\text{WebRTC}) = 89/720 \approx 0.1236$, which are the values required by different algorithms for attribute selection and decision making.
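The class proportions above follow directly from the reported sample counts (89 WebRTC and 631 Normal samples out of 720). The Python sketch below reproduces them, computes the entropy of the class distribution as one example of a quantity that tree-based attribute selection builds on, and shows one possible way of splitting the samples into 10 folds. The function `stratified_folds`, the round-robin assignment, and the assumption that folds are stratified by class are illustrative choices and do not describe Weka's internal implementation.

```python
import math
import random

# Class sizes reported in the text
class_counts = {"WebRTC": 89, "Normal": 631}
total = sum(class_counts.values())

# Prior probability of each class
priors = {label: count / total for label, count in class_counts.items()}

# Shannon entropy of the class distribution, in bits
class_entropy = -sum(p * math.log2(p) for p in priors.values())


def stratified_folds(labels: list[str], k: int, seed: int = 0) -> list[list[int]]:
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds: list[list[int]] = [[] for _ in range(k)]
    by_class: dict[str, list[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin over the folds,
        # continuing where the previous class stopped to balance fold sizes
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


labels = ["WebRTC"] * 89 + ["Normal"] * 631
folds = stratified_folds(labels, k=10)
```

In each of the 10 rounds, one fold would serve as the test set and the remaining nine as the training set, so that every sample is used for testing exactly once.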

Results and discussion
Since the aim of this paper is to propose a model for recognizing traffic generated by WebRTC communication, models have been created using classification algorithms based on machine learning. The following algorithms have been used: Naive Bayes, J48, Random Forest, REPTree, and BayesNet. Table 2 shows a comparison of these algorithms based on the time taken to build the model and its accuracy. The J48 algorithm has the largest number of correctly classified instances (accuracy: 93.8889%), and the Naive Bayes algorithm has the largest number of incorrectly classified instances (accuracy: 79.8611%). The J48 algorithm requires the least time to build the model (0s), and the Random Forest algorithm requires the most time (0.19s).
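The accuracy percentages above can be related back to instance counts. The following Python sketch assumes that accuracy is the share of correctly classified instances among all 720 samples, aggregated over the cross-validation folds; the helper names are hypothetical, and the resulting counts are derived from the reported percentages rather than quoted from Table 2.

```python
TOTAL_SAMPLES = 720  # dataset size reported in the text

# Accuracy values (in percent) quoted in the text
reported_accuracy = {"J48": 93.8889, "Naive Bayes": 79.8611}


def correct_instances(accuracy_percent: float, total: int) -> int:
    """Recover the number of correctly classified instances from an accuracy."""
    return round(accuracy_percent / 100 * total)


def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy as the percentage of correctly classified instances."""
    return 100 * correct / total


counts = {
    name: correct_instances(acc, TOTAL_SAMPLES)
    for name, acc in reported_accuracy.items()
}
```

Under this assumption, the difference between the two algorithms corresponds to roughly one hundred of the 720 instances.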
A comparative analysis of the classification algorithms on the same dataset is presented in Table 3. The BayesNet algorithm achieves the best precision, whereas the REPTree algorithm has a low precision. Recall is the same quantity as the TPR, so the results shown in Table 3 are identical for these two metrics. The F1-measure, which combines precision and recall, is highest for the J48 algorithm and lowest for the REPTree algorithm.
Figure 1 shows the comparison of all metrics for the five considered classification algorithms. Comparing the quality metrics of the different classification algorithms, the best results were obtained by the J48 and BayesNet algorithms, with accuracies of 93.8889% and 92.5%, respectively. The time taken to build the model is 0 s for the J48 algorithm and 0.03 s for the Naive Bayes algorithm. Consequently, a model for recognizing traffic generated during WebRTC communication can be implemented using either of two algorithms, J48 or BayesNet.
The existing models for traffic recognition, presented in Table 1, have a comparable or lower classification accuracy than the models proposed in this paper. In [5] and [8], four models for WebRTC traffic recognition were proposed, based on the J48, Simple Cart, Naive Bayes, and Random Forest classification algorithms. These four models were analyzed using the same quality metrics as in this paper. The evaluation has shown that the J48, Simple Cart, and Random Forest algorithms perform better than the Naive Bayes algorithm. As in this paper, the J48 algorithm performs better in those works, with an accuracy of 95.0652%, and the Naive Bayes algorithm performs worse, with an accuracy of 85.9404%.
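The precision, recall, and F1-measure discussed above can be computed from the confusion-matrix counts of a single class, as in the following minimal Python sketch. The function name and the counts are hypothetical and do not reproduce the values in Table 3; averaging the per-class values over both classes is not shown here.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall (TPR), and F1-measure for one class.

    tp: instances of the class that were classified correctly
    fp: instances of other classes that were assigned to this class
    fn: instances of the class that were assigned to another class
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for the WebRTC class
precision, recall, f1 = precision_recall_f1(tp=60, fp=15, fn=20)
```

Because the F1-measure is a harmonic mean, it stays low whenever either precision or recall is low, which is why an algorithm with weak precision cannot reach a high F1-measure through recall alone.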

Conclusion
Applications based on WebRTC technology, which provides real-time audio and video communication via a web browser, represent a significant innovation in web telephony. Communication based on WebRTC technology is difficult to detect because it can use dynamic port allocation and does not include any characteristic pattern that allows semantic-based recognition. The focus of this paper was on statistically based methods for recognizing traffic generated during WebRTC communication. Based on the related work, the tool and the machine learning algorithms used to create the models for recognizing WebRTC traffic were selected. The main contributions of this paper are therefore the new models for recognizing traffic generated by WebRTC communication, based on the J48 and BayesNet classification algorithms. These proposed models provide the ability to recognize WebRTC traffic with greater accuracy than previously proposed models.
The results in this paper are a good starting point for future research activities, which will include a comparison of a wide range of machine learning algorithms using a large enough dataset to improve the relevance of the results. Furthermore, future work will include consideration of additional features used for classification purposes, such as flags, headerChecksum, timeToLive.