Optimizing Classification Decision Trees by Using Weighted Naïve Bayes Predictors to Reduce the Imbalanced Class Problem in Wireless Sensor Network

Standard classification algorithms are often inaccurate when used in a wireless sensor network (WSN), where the observed data occur in imbalanced classes. The imbalanced data classification problem occurs when the number of samples in one class, usually the class of interest, is much lower than the number in the other classes. Many classification models have been studied in the data-mining research community. However, they typically assume that the input data are stationary and bounded in size, so that resampling techniques and postadjustment by measuring the classification cost can be applied. In this paper, we devise a new scheme that extends a popular stream classification algorithm to the analysis of WSNs, reducing the adverse effects of imbalanced classes in the data. This new scheme is resource-light at the algorithm level and does not require any data preprocessing. It uses weighted naïve Bayes predictors at the decision tree leaves to effectively reduce the impact of imbalanced classes. Experiments show that our modified algorithm outperforms the original stream classification algorithm.


Introduction
A wireless sensor network (WSN) is a distributed platform that collects data over a broad area. It has a wide variety of practical military, medical, and industrial applications [1]. The brain of a WSN is usually a decision-making algorithm that is capable of correctly mapping a set of newly collected observations from the sensors to one or more predefined categories. It uses a machine-learning algorithm to recall the classification of old data and classify the new data accordingly. There is no shortage of machine-learning algorithms available for decision making in WSNs [2, 3]. However, imbalanced classification is a common problem. This problem occurs when the classifier algorithm is trained with a dataset in which one class has only a few samples and there is a disproportionately large number of samples in the other classes. This kind of imbalanced data causes classifiers to be overfitted (i.e., to produce redundant rules that describe duplicate or meaningless concepts) and, as a result, to perform poorly, particularly in the identification of the minority class. In WSN applications, these rare minority classes are often critical. WSN examples include, but are not limited to, transaction fraud detection, machine fault monitoring, environmental anomalies, atypical medical conditions, and abnormal habitual behaviors: situations where the class of interest is a small sample of unusual readings. Studies [4] have shown that using standard classification algorithms to analyze these imbalanced class distributions leads to poor performance. An imbalanced class problem may have another implication in a WSN, where it can be a symptom of a traffic "hot spot". The energy consumption of the sensors may then become imbalanced as well, leading to premature drain-out of some local nodes. A solution [5] has been proposed to better cluster the nodes and traffic, although it is aimed at the energy level.
Most standard classification algorithms assume that training examples are evenly distributed among the different classes. In practical applications where this is known to be untrue, researchers have addressed the problem by either manipulating the training data or adjusting the misclassification costs. Resizing the training datasets is a common strategy that downsizes the majority class and oversamples the minority class; many variants of this strategy have been proposed [6-8]. A second strategy is to adjust the costs of misclassification errors to be biased against the majority class and in favor of the minority class. Using the feedback from the altered error information, researchers [9, 10] then fine-tune their cost-sensitive classifiers and postprune the decision trees in the hope of establishing a balanced treatment of each class in the new imbalanced data collected by the network.
The authors of this paper argue that replacing the traditional classifier with an optimized stream classifier is another effective solution. As mentioned above, the current techniques for dealing with imbalanced data require additional data preprocessing, or feedback learning and pruning of a trained decision tree. Though they may be useful in minimizing the impact of imbalanced data, these pre- and postprocessing mechanisms require working through a whole database, and their operations incur overheads in the data-mining environment that may not be acceptable in a WSN. In a wireless sensor network, data mining is done in real time by a compact device with limited memory and processing power, called a sink, and, most importantly, the incoming data for classification training and testing are streaming in nature. These data streams are nonstationary data that may be read only once at the intermediate nodes of a sensor network and are then forgotten. Furthermore, these nodes may be required to perform real-time classification as the data flow along the WSN. Fast prediction results with satisfactory accuracy must be propagated from node to node. In this dynamic environment, techniques based on data storage and feedback-style after-learning cannot be used to correct imbalanced data. On the other hand, it has been demonstrated that a stream classifier is a good candidate for WSN applications [11].

The contribution of this paper is a set of simple modifications that optimize an existing stream classification algorithm, called the Very Fast Decision Tree, to handle the imbalanced class problem at the algorithmic level. One important extension is the use of weighted naïve Bayes predictors installed at the decision tree leaves. The assigned weights have the effect of countering the "biases" that are introduced by the problems of imbalanced classes found in imperfect WSN data.

The paper is organized as follows. Section 2 describes in detail the modifications that tackle the imbalanced class problem. In Section 3, a range of experiments is described, the "biased" datasets with imbalanced classes are introduced, and the experimental results are discussed. Section 4 concludes the paper.

Optimizing the Very Fast Decision Tree
2.1. Motivation and Overview. Three special modifications are proposed to enhance the Very Fast Decision Tree (VFDT) algorithm. These modifications are embedded in line with the code that implements the classification logic of the stream classifier. The modifications that reduce the imbalanced class problem are made in four phases: the training phase, where new nodes are created if the statistical criteria established while learning from the labeled samples are met; the prepruning phase, in which the qualified nodes and branches are tested to see whether they can indeed improve the prediction accuracy (before they are added to the decision tree); the prediction phase, where unseen samples are categorized into predefined classes; and the pruning phase, which uses the functional tree leaf [12]. The modifications take the form of simple computations and conditional checks that do not incur heavy resource consumption at the sensor nodes.
The improved version of VFDT is called the Optimized Very Fast Decision Tree with Functional Leaves (OVFDT-FL). Our previous work [13] shows that the OVFDT-FL prototype can classify data streams with the maximum possible accuracy and the minimum tree size. In this paper, OVFDT-FL is tested with imbalanced class data. The design of OVFDT-FL is given as follows.
OVFDT, which is based on the original VFDT design, is implemented using a test-then-train approach for classifying a continuously arriving data stream, even when N, the total number of training instances, grows without bound, as shown in Figure 1. The whole test-then-train process is synchronized so that when the data stream arrives one segment at a time, the decision tree is first tested for its prediction output, and training (also known as updating) of the decision tree model then occurs incrementally. Suppose that X is a vector of I attributes and K is the number of classes included in the data stream. When a new data sample (X, y) arrives, it travels from the root of the decision tree to an existing leaf via the current decision tree structure, provided that a root exists initially. Otherwise, a heuristic function is used to construct a tree model with a single root node, using the procedure shown in Pseudocode 1. Suppose that a decision tree model HT can give a prediction of a class y' according to the functional tree leaf F, such that HT(X) → y'. Comparing the predicted class y' to the actual class y, the statistics of the true (C) and false (W) predictions are updated immediately. Meanwhile, the sufficient statistics n_ijk, which count the instances of attribute x_i taking value j that belong to class y_k, are updated for each node. This series of actions is called the testing phase.

The training phase immediately follows the testing phase. Node-splitting estimation is used to decide whether HT should be updated or not, depending on the number of samples received that can potentially be represented by additional underlying rules in the decision tree. In principle, the node-splitting estimation should apply to every single new sample that arrives. However, this would be too resource expensive and would slow down the tree-building process. Instead, VFDT introduces a parameter n_min so that the node-splitting estimation is carried out only when n_min examples have been observed at a leaf. In the node-splitting estimation, the tree model is updated when a heuristic function G(·) chooses the most appropriate attribute, with the highest heuristic function value G(x_a), as a splitting node, according to Hoeffding's bound and the tie-breaking threshold. The heuristic function is implemented here as information gain. This in situ system of node-splitting estimation constitutes our training phase.
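To make the test-then-train cycle concrete, the following is a minimal Java sketch of the loop described above. The Instance and HoeffdingTree types and their methods are hypothetical simplifications introduced for illustration only; they are not the MOA API or the authors' exact implementation. The grace-period check with nMin follows the n_min description in the text.

```java
import java.util.List;

/** Illustrative test-then-train loop; all types here are hypothetical sketches. */
public final class TestThenTrain {

    /** A labeled sample: attribute values plus its true class index. */
    public record Instance(double[] attributes, int trueClass) {}

    /** Minimal view of an incrementally trained Hoeffding tree. */
    public interface HoeffdingTree {
        int predict(Instance x);                       // HT(X) -> y'
        void updateStatistics(Instance x, int yHat);   // update C, W, and n_ijk
        void estimateSplit();                          // info gain + HB + tie check
    }

    public static void run(HoeffdingTree ht, List<Instance> stream, int nMin) {
        int seen = 0;
        for (Instance sample : stream) {
            // Testing phase: predict first, then absorb the ground truth.
            int predicted = ht.predict(sample);
            ht.updateStatistics(sample, predicted);

            // Training phase: estimate a split only once every nMin samples.
            if (++seen >= nMin) {
                ht.estimateSplit();
                seen = 0;
            }
        }
    }
}
```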
Two modifications are proposed for the training phase of OVFDT to manage imbalanced data classes. The first is to dynamically adjust the tie-breaking threshold in the splitting-node determination using the mean value of Hoeffding's bound. The growth of the tree is then influenced by the mean value of the traffic fluctuation (which was found to correlate with Hoeffding's bound in our previous work) rather than by the imbalanced data class. The second modification is to use prepruning to test whether the leaf chosen to be split, and therefore to increase tree growth, is indeed a valid choice given the imbalanced data class. In this way, we can assume that the expansion of the tree is a result of genuinely accurate predictions. Thus, postpruning of the decision tree is not necessary. Section 2.2 presents the details of the functional leaf strategy for handling imbalanced data classes, and the details of the modifications to the training phase are given in Section 2.3.

2.2. Functional Tree Leaf Prediction in the Testing Phase. The sufficient statistic n_ijk is an incremental count stored in each node of the OVFDT. Suppose that a node Node_i in HT is an internal node labeled with attribute x_i, and suppose that K is the number of classes distributed in the training data. A vector V_i = {n_1, n_2, ..., n_K} is constructed from the sufficient statistics n_ijk in Node_i. V_i is the observed class distribution (OCD) vector of Node_i. The OCD stores the count of each distributed class at each tree node in OVFDT. It helps to keep track of the occurrences of the instances of each attribute.
For the actual classification, OVFDT uses HT(X) → y' to predict the class label when a new sample (X, y) arrives. The predictions are made according to the OCD in the leaves; this mechanism is called the functional tree leaf F. Originally, the prediction in VFDT used only the majority class functional tree leaf F_MC. The majority class considers only the counts of the class distribution, not decisions based on combinations of attributes. The naïve Bayes functional tree leaf F_NB was proposed to compute, by naïve Bayes, the conditional probabilities of the attribute values given a class at the tree leaves. As a result, the prediction at the leaf is refined by considering the probabilities of each attribute. To handle imbalanced class distributions in a stream, a weighted naïve Bayes functional tree leaf F_WNB and an adaptive functional tree leaf F_Adaptive are proposed in this paper.

2.2.1. Majority Class Functional Tree Leaf. In the OCD vector, the majority class functional tree leaf F_MC chooses the class with the maximum count as the predictive class in a leaf, where F_MC: arg max {n_1, ..., n_r, ..., n_K} and r = 1, ..., K.

2.2.2. Naïve Bayes Functional Tree Leaf. In the OCD vector V = {n_1, ..., n_r, ..., n_K}, where K is the number of observed classes and r = 1, ..., K, the naïve Bayes functional tree leaf F_NB chooses the class with the maximum probability, as computed by naïve Bayes, as the predictive class in a leaf. Each n_r is updated to n'_r by the naïve Bayes function such that n'_r = P(X | y_r) · P(y_r) / P(X), where X is the newly arrived instance. Hence, the prediction class is F_NB: arg max {n'_1, ..., n'_r, ..., n'_K}.

2.2.3. Weighted Naïve Bayes Functional Tree Leaf. In the OCD vector V = {n_1, ..., n_r, ..., n_K}, where K is the number of observed classes and r = 1, ..., K, the weighted naïve Bayes functional tree leaf F_WNB chooses the class with the maximum probability, as computed by the weighted naïve Bayes, as the predictive class in a leaf. Each n_r is updated to n'_r by the weighted naïve Bayes function such that n'_r = w_r · P(X | y_r) · P(y_r) / P(X), where X is the newly arrived instance and the weight w_r is the probability of class r amongst all the observed samples, that is, w_r = n_r / Σ_k n_k, where n_r is the count of class r. Hence, the prediction class is F_WNB: arg max {n'_1, ..., n'_r, ..., n'_K}.

2.2.4. Adaptive Functional Tree Leaf. In a leaf, suppose that V_MC is the observed class distribution vector under the majority class functional tree leaf F_MC, that V_NB is the OCD vector under the naïve Bayes functional tree leaf F_NB, and that V_WNB is the OCD vector under the weighted naïve Bayes functional tree leaf F_WNB. Suppose that y is the true class of a new instance X, and that E_F is the prediction error rate using a functional tree leaf F. E_F is calculated as the average error_F / n, where n is the number of examples and error_F is the number of examples mispredicted using F. The adaptive functional tree leaf chooses the prediction given by the strategy with the minimum error rate among the other three, where F_Adaptive: arg min {E_MC, E_NB, E_WNB}.

According to the functional tree leaf strategy, the current HT sorts a newly arrived sample (X, y) from the root to a leaf and predicts a class y'. Comparing the predicted class y' to the actual class y, the statistics of true (C) and false (W) predictions are updated immediately. C and W are used in the model-training phase. Pseudocode 2 shows the modified testing phase.
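As an illustration of how the functional tree leaves above score the OCD vector, here is a small, self-contained Java sketch. It is a simplified reading of the formulas, not the authors' implementation: the class-conditional likelihood P(X | y_r) is abstracted behind a hypothetical likelihood function, and the WNB weight is taken as the observed class frequency n_r / Σ n_k, as defined in Section 2.2.3.

```java
import java.util.function.BiFunction;

/** Leaf predictors over an observed class distribution (OCD) vector. */
public final class FunctionalLeaf {

    /** Majority class leaf F_MC: index of the largest count in the OCD. */
    public static int predictMC(double[] ocd) {
        int best = 0;
        for (int r = 1; r < ocd.length; r++) if (ocd[r] > ocd[best]) best = r;
        return best;
    }

    /**
     * Naive Bayes leaf F_NB: argmax over P(X|y_r) * P(y_r). The normalizer
     * P(X) is identical for all classes, so it is dropped from the argmax.
     * With weighted == true, each score is further multiplied by the class
     * frequency n_r / sum(n_k), giving the weighted naive Bayes leaf F_WNB.
     */
    public static int predictNB(double[] ocd, double[] x,
                                BiFunction<double[], Integer, Double> likelihood,
                                boolean weighted) {
        double total = 0;
        for (double n : ocd) total += n;
        if (total == 0) return 0;                      // no evidence observed yet
        int best = 0;
        double bestScore = -1;
        for (int r = 0; r < ocd.length; r++) {
            double prior = ocd[r] / total;             // P(y_r)
            double score = likelihood.apply(x, r) * prior;
            if (weighted) score *= ocd[r] / total;     // WNB weight w_r
            if (score > bestScore) { bestScore = score; best = r; }
        }
        return best;
    }
}
```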
2.3. Dynamic Splitting Test and Prepruning in the Training Phase. The node-splitting control is modified to use a dynamic tie-breaking threshold τ, which restricts attribute splitting at a decision node. The τ parameter is traditionally preconfigured with a default value defined by the user. The optimal value is usually not known until all of the possibilities in an experiment have been tried, and longitudinal testing of different values in advance is certainly not favorable in real-time applications. Instead, we assign a dynamic tie threshold, equal to the running mean of the Hoeffding bound (HB) value over each pass of the stream data, as the splitting threshold that controls node splitting during the tree-building process. Tie breaking close to the HB mean can effectively narrow the variance distribution. The HB mean is calculated dynamically whenever new data arrive and the HB value is updated. The estimation of splits and ties is executed only once for every n_min (a user-supplied value) samples that arrive at a leaf. Instead of a preconfigured tie, OVFDT uses an adaptive tie that is calculated incrementally. At the k-th node-splitting estimation, the Hoeffding bound HB estimates whether the statistics from the samples seen so far are sufficient to split a new node at leaf l. Let T_l be the adaptive tie corresponding to leaf l over the k estimations seen so far, and let φ be a binary variable that takes the value 1 if an HB value relates to leaf l and 0 otherwise. T_l is computed by (1). To constrain HB fluctuation, an upper bound T_UPPER and a lower bound T_LOWER are proposed in the adaptive tie mechanism; the formulas are presented in (2) and (3). For resource-light operation, we also propose an error-based prepruning mechanism for OVFDT, which stops noninformative node splitting before a new node is created. The prepruning takes into account both global and local node-splitting errors.

Lemma 1 (Monitoring Global Accuracy). The model's accuracy varies whenever a node splits and the tree structure is updated. The overall accuracy of the current tree model is monitored during node splitting by comparing the numbers of correctly and incorrectly predicted samples. The numbers of correctly predicted instances and otherwise are recorded as the current global performance indicators. This monitoring allows the determination of global accuracy.
When a new instance arrives, it is sorted to a leaf by the current HT structure before the node-splitting estimation. This is the "testing" phase in OVFDT. Suppose that C is the number of correctly predicted instances in the current HT and W is the number of incorrectly predicted instances. After the k-th node-splitting estimation, let Δ_k be the difference between C and W, as computed by (4); Δ_k reflects the global accuracy of the current HT's predictions on the newly arrived data streams. If Δ_k ≥ 0, the number of correct predictions is no less than the number of incorrect predictions in the current tree structure; otherwise, the current tree needs to be updated by node splitting.

Lemma 2 (Monitoring Local Accuracy). The global accuracy can be tracked by comparing the number of correctly predicted samples with the number of incorrectly predicted samples. Likewise, comparing the global accuracy measured at the current node-splitting estimation with the global accuracy measured at the previous splitting means that the variation in accuracy is tracked dynamically. This monitoring allows us to check whether the current node splitting is advantageous at each step by comparing it with the previous step.
Suppose that Gain_Accu is the gain in accuracy between the k-th and the (k−1)-th estimations, as calculated in (5); it reflects the local accuracy of changes. If Gain_Accu(HT_k) ≥ 0, the accuracy of the HT structure at the k-th splitting is no worse than the accuracy at the (k−1)-th splitting; otherwise, the old tree structure needs to be updated. The splitting estimation is implemented once for every n_min samples that arrive at a leaf, and the tree size increases when a new node splits. The number of samples that meets the first pruning condition is n_min · p, where p is the probability of the optimal node splitting calculated in (8). Only one attribute can be chosen at each splitting estimation. The calculation of tree size S_k at estimation k is given in (6). C and W in the k-th splitting estimation give feedback on the tree's current classification accuracy. By continually comparing these with the (k−1)-th estimation, the pruning maintains accuracy sequentially. In other words, the optimal result is obtained by comparing the current tree status with its previous status, as follows:

Gain_Accu(HT_k) = Δ_k − Δ_{k−1},   (5)
Gain_TreeSize(HT_k) = S_k − S_{k−1},  with S_0 = 1.   (6)

Figure 2 shows why our proposed prepruning takes both the local and the global accuracy into account in the incremental pruning. At the k-th node-splitting estimation, the difference between correctly and incorrectly predicted classes is Δ_k, and it is Δ_{k+1} at the (k+1)-th estimation. In the example, Gain_Accu(HT_{k+1}) is negative, indicating that the local accuracy at the (k+1)-th estimation is worse than that at the previous node splitting, although both are on a globally increasing trend. Thus, if accuracy is declining locally, it is necessary to update the HT structure even if accuracy is increasing globally.

The optimal node-splitting control consists of a dynamic tie for node splitting and a prepruning mechanism that tries to hold the tree's growth neutral with respect to the imbalanced class distribution. In each node-splitting estimation, the Hoeffding bound (HB) value that relates to leaf l is recorded. The recorded HB values are used to compute the adaptive tie, which uses the mean of the values for each leaf l instead of a fixed user-defined value as in VFDT. Using all the prediction statistics gathered in the testing phase to implement prepruning, Pseudocode 3 presents the pseudocode of the training phase used by OVFDT for building an upright tree.
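Putting Lemmas 1 and 2 together, the prepruning gate can be expressed as a simple conditional. The sketch below is an interpretation of the description above rather than the authors' Pseudocode 3: a split is allowed only when the global balance Δ_k or the local gain indicates that the current structure is no longer adequate.

```java
/** Error-based prepruning gate combining global and local accuracy checks. */
public final class Prepruning {
    private long prevDelta = 0;   // Delta_{k-1} = C - W at the previous estimation

    /**
     * Decide whether the candidate split should proceed. correct (C) and
     * wrong (W) are the cumulative prediction counts from the testing phase.
     * A split is allowed when the global balance Delta_k = C - W is negative
     * (Lemma 1) or when the local gain Delta_k - Delta_{k-1} is negative
     * (Lemma 2); otherwise the tree is already predicting well enough and
     * the split is suppressed.
     */
    public boolean allowSplit(long correct, long wrong) {
        long delta = correct - wrong;          // Delta_k, as in (4)
        long gainAccu = delta - prevDelta;     // Gain_Accu(HT_k), as in (5)
        prevDelta = delta;
        return delta < 0 || gainAccu < 0;
    }
}
```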

Bias Generator.
For our experiments, we adopted and customized Massive Online Analysis (MOA), one of the most popular data stream-mining toolkits, by including the aforementioned modifications in the OVFDT algorithm. However, the latest version of the MOA simulation environment is not able to simulate a biased data stream with an imbalanced class. A bias generator was therefore written in Java and integrated into MOA for the purpose of evaluating the performance of stream-mining algorithms on imbalanced class data. Using either a simple command-line console or a graphical user interface, an example of which is shown in Figure 3, the generator injects biased instances of a specific imbalanced class into a given ARFF file. The input parameters are as follows:

(i) biased class index (BCI): the class index to which the bias-added instances belong;
(ii) bias change from class index (CCI): the class index that the biased instances will replace;
(iii) change percentage (CP): the proportion of instances that will be changed to biased instances.

After the generator is configured, the instances with the CCI class are replaced by BCI instances according to the CP setting. For example, in the snapshot in Figure 3, BCI = 5, CCI = 4, and CP = 80%; this means that 80% of the CCI instances are replaced by BCI instances (the class distributions are originally BCI = 10% and CCI = 10%, and afterwards BCI = 18% and CCI = 2%).
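The replacement logic itself is straightforward. Below is a hedged Java sketch of what such a generator might do with an already-parsed list of class labels; the real tool operates on ARFF files inside MOA, and the class and method names here are illustrative only.

```java
import java.util.List;
import java.util.Random;

/** Illustrative core of a bias generator: relabel a fraction of one class. */
public final class BiasGenerator {

    /**
     * Replace a cp fraction (0.0-1.0) of the instances labeled cci with the
     * label bci. labels is mutated in place; a fixed seed keeps runs
     * reproducible for repeated experiments.
     */
    public static void inject(List<Integer> labels, int bci, int cci,
                              double cp, long seed) {
        Random rnd = new Random(seed);
        for (int i = 0; i < labels.size(); i++) {
            if (labels.get(i) == cci && rnd.nextDouble() < cp) {
                labels.set(i, bci);   // this CCI instance becomes a BCI instance
            }
        }
    }
}
```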

Experiment Datasets and Visualization.
Six datasets were used to test the performance of OVFDT + FL against ordinary VFDT. The datasets included those generated by the bias generator and naturally imbalanced real-life data downloaded from the UCI machine-learning archive (http://www.ics.uci.edu/~mlearn). Table 1 describes these experimental datasets in detail, and Figure 4 provides a group of class distribution visualizations.
The following charts visualize the bias-injected datasets with imbalanced classes. The pie charts on the left show the class distribution in the full experimental datasets, and the charts on the right show the class distribution being progressively updated as new data streams arrive. For example, from Table 1 we see that the biased classes in the LED24 dataset are Class 2 (18%) and Class 4 (18%); there are 80% more data samples for these two classes than for the other classes, whereas originally the data were distributed equally over all classes. The charts for the other datasets likewise show at least one class with a larger share of the data distribution than the others.

Experiment Results Comparing VFDT and OVFDT.
VFDT is deemed a suitable candidate for real-time classification in wireless sensor networks because of its incremental learning nature, based on a test-then-train approach. In this paper, we extend the design of VFDT to OVFDT, which has superior mechanisms for dealing with imbalanced data classes. The following comparison is between VFDT and OVFDT using the same types of functional tree leaf on the imbalanced datasets. The goal is to observe the comparative impact of the imbalanced classes on VFDT and OVFDT. For VFDT, the fixed tie-breaking threshold τ (ranging from 0 to 1) is an important predefined parameter that controls the node-splitting speed. In the experiment, τ was set at different values from 0.1 to 1.0 to test several different trials of VFDT, as a priori information about good τ values is unavailable until the model is actually put to the test. Accuracy is measured as the number of correctly classified instances over the total number of arrived instances.
The results show, on the one hand, that OVFDT_Adaptive performs better than any other method on imbalanced data streams. The highlighted areas in Figure 5 show that OVFDT consistently outperformed VFDT. OVFDT_MC had lower accuracy than the other functional tree leaf strategies in OVFDT. The advantage of the functional tree leaf approach is more apparent in the analysis of imbalanced data streams that have a significantly large bias in the class distribution. This means that the modification to the testing phase is substantially effective, even when processing highly imbalanced data classes. On the other hand, the other two modifications, to the training phase, prepruning and dynamic node splitting, show their usefulness in reducing the overfitting problem caused by imbalanced class data streams. The radar chart in Figure 6 demonstrates that OVFDT produces a much smaller tree than VFDT in all cases. A small tree size means lower runtime memory requirements, which makes the algorithm suitable for the sensor node devices operating in WSNs. Tree size is measured by the number of leaves in the decision tree. Ideally, there should be just enough leaves and corresponding branch paths to correctly classify the samples; having too many leaves is a symptom of overfitting, which results in a decision tree that cannot make meaningful predictions and uses up memory space.
As these experimental results show, OVFDT with a functional tree leaf handles imbalanced data streams more effectively than VFDT. For this reason, VFDT is not considered in the following experiments; instead, we analyze in detail the experimental results of OVFDT using the different types of functional tree leaves.

Experiment Results Comparing OVFDT Functional Tree Leaf Accuracy.
Comparing the classification accuracy of the four different types of functional tree leaves, we find that OVFDT_MC always obtains the lowest accuracy and OVFDT_Adaptive has consistently better accuracy than the other methods. In addition, OVFDT_WNB is better than OVFDT_NB in these experiments because it weights the probabilities of each attribute occurrence (see Figure 8). Another useful performance benchmark is the receiver operating characteristic (ROC), a standard method for analyzing and comparing classifiers when the costs of misclassification are unknown. In a stream-mining scenario, it is not possible to know the misclassification costs, because the mining process is incremental over running data streams and does not analyze a full dataset. The ROC provides a convenient graphical display of the trade-off between the true and false positive classification rates for two-class problems [11]. In decision tree classification, however, there are often more than two classes. Therefore, we extend the standard ROC model to a multiclass ROC analysis to evaluate the tree learning algorithm's performance. Suppose that there is a K-class classification system whose K classes need to be distinguished by the tree learning algorithm. A K × K confusion matrix (contingency table) M, which summarizes the results of the classifications, presents the true positives and false positives for the multiclass analysis. Each diagonal entry M_rr of the matrix gives the number of examples whose true class was y_r and that were actually assigned to y_r, where 1 ≤ r ≤ K. Each off-diagonal entry M_rs gives the number of examples whose true class was y_r but that were assigned to y_s, where r ≠ s and 1 ≤ r, s ≤ K. Each class y_r can then be converted into a two-class problem, with the corresponding values of true positives (10), false positives (11), false negatives (12), and true negatives (13) (see Figure 7).
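A compact way to express equations (10)-(13), which are not reproduced here, is in terms of row and column sums of the confusion matrix. The sketch below computes the one-versus-rest counts for a chosen class r under that standard interpretation.

```java
/** One-vs-rest counts for class r from a K x K confusion matrix m,
 *  where m[i][j] counts examples of true class i assigned to class j. */
public final class MultiClassRoc {

    public record Counts(long tp, long fp, long fn, long tn) {}

    public static Counts forClass(long[][] m, int r) {
        long tp = m[r][r], fp = 0, fn = 0, tn = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m.length; j++) {
                if (i == r && j != r) fn += m[i][j];      // true r, predicted other
                else if (i != r && j == r) fp += m[i][j]; // true other, predicted r
                else if (i != r && j != r) tn += m[i][j]; // neither true nor predicted r
            }
        }
        return new Counts(tp, fp, fn, tn);
    }
}
```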

Precision-Recall is a well-known method of analysis related to ROC. In pattern recognition, precision is the fraction of retrieved instances that are relevant, while recall is the fraction of relevant instances that are retrieved. The values of precision and recall range from 0 to 1. A precision score of 1 for a class c means that every item labeled as belonging to class c does indeed belong to class c. A recall score of 1 means that every item from class c was labeled as belonging to class c. Precision and recall scores are not analyzed in isolation: the F-measure [12] is a weighted harmonic mean of precision and recall, and the F1-measure weights the two evenly, so that F1 = 2 · Precision · Recall / (Precision + Recall). The best value of the F1-measure is 1 and the worst is 0. In addition, the true positive rate (TPR) and the false positive rate (FPR) are common benchmarks in ROC analysis, with TPR = TP / (TP + FN) and FPR = FP / (FP + TN). We analyze the Precision-Recall for each class in all the imbalanced class datasets. Due to limited space, the detailed charts are given in the Appendix; the average Precision-Recall values are shown in Figures 9 and 10.
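Continuing the sketch above (with its hypothetical Counts record), per-class precision, recall, and F1 follow directly from the one-versus-rest counts.

```java
/** Precision, recall, and F1 for one class, from one-vs-rest counts. */
public final class PrfMetrics {

    public static double precision(long tp, long fp) {
        return tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    public static double recall(long tp, long fn) {
        return tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    /** F1 is the harmonic mean of precision and recall. */
    public static double f1(double p, double r) {
        return p + r == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }
}
```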
These charts illustrate that the average precision of OVFDT_MC is worse than those of the other methods. OVFDT_Adaptive obtains the highest precision on the homogeneous (nominal-only and numeric-only) datasets. All methods have nearly the same average recall values (the lines appear to overlap). We then apply the F1-measure to evaluate the experimental results. As the chart shows, the value ranges from 0 to 1, and OVFDT_MC again has the lowest F1-measure value. However, because the datasets contain biased instances in imbalanced classes, the average Precision-Recall analysis alone is not sufficient to determine accuracy; we must consider the Precision-Recall for the distributed classes in every different data stream. From these experiments, we observe that OVFDT_Adaptive always achieves higher precision, recall, and F1-measure values than OVFDT_MC, but this is not always the case for OVFDT_NB and OVFDT_WNB.

F 2 :
Example of incremental pruning.
(i) biased class index (BCI): the class index that the biasadded instances belong to, (ii) bias change from class index (CCI): the class index that the bias instances will replace, (iii) change reduction percentage (CP): the proportion of instances that will change to biased instances.

F 3 :
Snapshot of the Bias generator for generating data with imbalanced class, on MOA platform.T 1: Datasets with imbalanced class used in the experiment.

F 4 :
A collection of visualizations of the datasets that have different degrees of imbalanced class distribution.

F 5 :F 6 :
Accuracy of the classi�cation experiments by VFDT and OVFDT with datasets of imbalanced data class.Tree size of the classi�cation experiments by VFDT and OVFDT with datasets of imbalanced data class.

F 9 :F 10 :
-class ROC statistics, each class  to  in the multi-class ROC is assigned a negative or positive value.Samples with class  are positive; otherwise, negative.True positives (TP) are examples correctly labeled as positives.False positives (FP) refer to negative examples incorrectly labeled as positive.True negatives (TN) are negatives correctly labeled as negative.Finally, false negatives (FN) refer to positive examples incorrectly labeled as negative.Each Precision and Recall values of the classi�cation experiments by di�erent FL types of OVFDT with datasets of imbalanced data class.F1-measure of the classi�cation experiments by di�erent FL types of OVFDT with datasets of imbalanced data class.
Conclusion

Imbalanced data classification is a challenging problem that generally refers to a learning model created for a dataset that has far more samples in one class than in the others. In a ubiquitous environment such as a wireless sensor network, it is not uncommon for the data of interest to fall into a small minority class. Previous researchers have tackled this problem by using techniques that inevitably create additional computation overheads. These techniques usually include resampling the observations from a bounded archive so as to balance the imbalance; others resort to postpruning the decision tree and redistributing the classification costs in a backward-learning process. All of these proposed techniques work well in traditional data mining but might not suit a real-time stream-mining scenario, where all the data arrive in a single pass; at a sensor sink it is neither practical nor feasible to archive a stationary set of data, let alone to resample it. In this paper, a novel solution is introduced at the algorithmic level, based on a popular stream-mining algorithm called the Very Fast Decision Tree (VFDT). Three modifications are proposed for VFDT as a means of reducing the effect of imbalanced class data. The modifications are implemented in the training phase, prior to expanding the decision tree, and in the testing phase, where prediction accuracy is fine-tuned by weighting the leaves of the decision tree according to the probabilities of the arriving data. The overall solution is called the Optimized VFDT with Functional Tree Leaf (OVFDT + FL). The functional tree leaf mechanism is implemented by using weighted naïve Bayes predictors installed at the decision tree leaves of OVFDT. Perturbed datasets that include "biased" class distributions are used in experiments illustrating the efficacy of the new algorithm. OVFDT + FL is shown to outperform VFDT in a series of experiments where the datasets are deliberately biased by a custom-made data generator program. In particular, the two variants of FL called adaptive and weighted naïve Bayes performed consistently better than the other techniques. OVFDT succeeded in minimizing the impact of imbalanced class data while maintaining high accuracy and a compact decision tree size. This contrasts with the known overfitting problems of poor accuracy and huge tree size usually caused by imbalanced class data. OVFDT + FL is thus validated as a good classification model for wireless sensor networks.