Two-Layer Hidden Markov Model for Human Activity Recognition in Home Environments

Activities of Daily Living (ADLs) are the activities that an individual carries out for everyday living. Recognition of ADLs is a key element for building intelligent and pervasive environments. We propose a two-layer HMM to build an ADL recognition model that can represent the mapping between low-level sensor data and high-level activity based on binary sensor data. We used sensors embedded in appliances or objects to obtain object-use sequence data as well as object name, type, interaction time, and location. In the first layer, we use the location data of the object-use sensors to predict the activity class, and in the second layer we use the object-use sequence data to determine the exact activity. We compare the proposed model with other activity recognition models using three real datasets. The results show that the proposed model achieves significantly better recognition performance than the other models.


Introduction
Activity recognition is an important task for many ubiquitous computing applications. In a home environment, activity recognition systems observe the resident's activities and can remind users to perform missed activities or complete actions, which helps in providing automated services. In a hospital environment, an activity recognition system can remind a doctor or nurse to perform certain tests before operating. In a factory environment, it can monitor the activities of the workers and encourage them to act more safely. Activity recognition systems use several types of sensors, such as microphones and video cameras, RFID readers, wearable sensors, and sensors embedded in appliances (or objects), to determine the state of the physical world. Based on the state of the physical world, activity recognition systems observe the behavior of persons and, if necessary, take actions in response. Activity recognition based on microphones and video cameras is complex to implement because it requires processing of multidimensional data and may violate users' privacy [1][2][3][4][5][6]. Wearable sensors are effective for recognizing primitive sequences of movements (walking, sitting, standing, etc.) but make it difficult to recognize ADLs (showering, grooming, preparing meals, etc.) [7][8][9]. Home users perform an activity by interacting with appliances (or objects) in a nearby location at a given time. Information about appliance or object use can be obtained by attaching embedded sensors to them [10][11][12][13]. Embedded sensors are inexpensive, invisible, and nonintrusive, having no impact on regular life activities. Tapia et al. have shown that simple binary sensors have solid potential for solving the ADL recognition problem in the home [11]. Binary sensors can also be applied to human-centric problems such as health and elder care [14][15][16]. Van Kasteren et al. use binary sensors and motion sensors for performing ADL recognition in a house setting [17]. However, sensor-based activity recognition is challenging due to the inherently noisy nature of the input [18]. In this context, temporal probabilistic reasoning and machine learning approaches are very effective. Several object-use-based models have been used to recognize ADLs from binary sensors, such as Bayesian Networks [11], Conditional Random Field (CRF) [19], Hidden Markov Model (HMM) [17], and Hidden semi-Markov Model [17]. In [20], the authors proposed Triplet Markov Chains (TMC) to deal with nonstationary HMM by introducing an auxiliary process governing the regime switching of the hidden process.
International Journal of Distributed Sensor Networks
The location of a person in an environment can provide important context information for activity recognition [21,22]. A location-based model is able to provide a high-level abstraction for an activity. For example, the bedroom is for sleeping, the bathroom for showering, and the kitchen for preparing meals. However, the same location can be used for different activities, such as showering, brushing teeth, and toileting in the bathroom, or preparing meals, taking meals, and dishwashing in the kitchen. Location information alone is therefore not sufficient for recognizing individual ADLs. Our aim is to build an activity recognition model that can represent the mapping between low-level sensor data and high-level activity based on binary sensor data. We want to combine the location of the object-use sensors with the object-use data to overcome the limitations of each approach. Using sensors embedded in appliances or objects, we can obtain object-use data as well as object name, type, interaction time, and location. We propose a two-layer Hidden Markov Model (HMM) for recognition of ADLs. In the first layer, we use the location data of the object-use sensors to predict the activity class, and in the second layer we use the object-use sequence data to determine the exact activity from the selected class. In our 2-layer HMM, the first layer selects one activity class, so the second layer considers only the activities (variables) related to that class. This approach reduces the time complexity of the model. We evaluate and compare the activity recognition performance of the proposed model on three fully annotated real-world datasets generated by Van Kasteren [23]. In our experiments, the 2-layer HMM outperforms the other activity recognition models. Using location data together with object-use data results in a significant increase in recognition performance.
The rest of the paper is organized as follows: Section 2 presents an overview of the data used in this study. Section 3 illustrates the details of the proposed model. Section 4 provides the experimental settings and the results obtained, followed by conclusion and future work in Section 5.

Dataset
We evaluate the recognition performance of the proposed model and compare it with naïve Bayes, Conditional Random Field (CRF), and Hidden Markov Model (HMM) on three fully annotated real-world datasets generated by Van Kasteren et al. Wireless sensor networks are used to observe the behavior of inhabitants inside the houses by collecting binary temporal data. The wireless network nodes are equipped with various kinds of sensors: reed switches to measure whether doors and cupboards are open or closed, pressure mats to measure sitting on a couch or lying in bed, mercury contacts to detect the movement of objects, passive infrared (PIR) sensors to detect motion in a specific area, and float sensors to measure the toilet being flushed. A wireless network node sends an event to the base station when the state of the digital input changes or when the analog input crosses some predefined threshold. An overview of the datasets is presented in Table 1.
Time series data is discretized into a set of time slices of constant length $\Delta t$. A sensor event for time $t$ is denoted by $x^i_t$, indicating whether sensor $i$ fired at least once between time $t$ and time $t + \Delta t$, with $x^i_t \in \{0, 1\}$. If $K$ sensors are installed in a home, then the observation vector for each time slice will be $x_t = [x^1_t, x^2_t, \dots, x^{K-1}_t, x^K_t]^T$. Each time slice corresponds to a single data instance during data representation. The class of each data instance is defined by the activity label of the corresponding time slice. The activity at time slice $t$ is denoted by $y_t$. It is the task of a classifier to find a mapping between a sequence of observations $\Theta = \{x_1, x_2, \dots, x_{T-1}, x_T\}$ and a sequence of labels $Y = \{y_1, y_2, \dots, y_{T-1}, y_T\}$ for a total of $T$ time intervals, as shown in Figure 1.
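The discretization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the preprocessing used for the datasets: the function name `discretize`, the `(sensor_index, time_in_seconds)` event format, and the 60-second default slice length are assumptions made for the example.

```python
from typing import List, Tuple


def discretize(firings: List[Tuple[int, float]], n_sensors: int,
               n_slices: int, dt: float = 60.0) -> List[List[int]]:
    """Build one binary observation vector per time slice.

    firings: hypothetical (sensor_index, time_in_seconds) events.
    Entry [t][i] is 1 if sensor i fired at least once in
    [t * dt, (t + 1) * dt), and 0 otherwise.
    """
    x = [[0] * n_sensors for _ in range(n_slices)]
    for sensor, time_sec in firings:
        t = int(time_sec // dt)          # index of the slice containing the event
        if 0 <= t < n_slices:            # ignore events outside the horizon
            x[t][sensor] = 1
    return x


# Three sensors observed for three minutes.
observations = discretize([(0, 5.0), (2, 59.9), (1, 60.0), (0, 130.0)],
                          n_sensors=3, n_slices=3)
```

Each row of `observations` is one data instance; pairing it with the activity label of the same slice gives the training pairs used by the classifiers.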

Proposed Model
We propose a 2-layer HMM for recognition of ADLs. In the first layer, we use the location information of the used objects and select the group with the maximum joint probability. In the second layer, we consider the data of the used objects and select one activity from the selected group. Activities are grouped according to the location of the objects used to perform them. Figure 2 shows an example of the grouping of activities. Sensors are also classified based on the location of the objects involved in each activity group. Figure 3 presents an example of the classification of the location of the sensors installed with different objects for each activity group. We consider $\mathcal{A} = \{a_1, a_2, \dots, a_n\}$ as the set of activities, where $n$ is the total number of activities, and these activities are divided into $m$ groups $\mathcal{G} = \{G_1, G_2, \dots, G_m\}$ with $G_i \subset \mathcal{A}$ for all $i$. Let $\mathcal{S} = \{s_1, s_2, \dots, s_K\}$ be the set of sensors installed and $\mathcal{L} = \{l_1, l_2, \dots, l_M\}$ the set of locations in an environment, on deploying $K$ sensors in $M$ locations; each sensor is then associated with the location in which it is installed. $\Theta^L = \{x^L_1, x^L_2, \dots, x^L_T\}$ represents the sequence of locations of the observed sensors associated with the sensor observation sequence $\Theta = \{x_1, x_2, \dots, x_T\}$, where $x^L_t$ is the location observation at a given time $t$. Each group corresponds to a location.

First Layer of HMM.
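In the first layer, the classifier uses the location information of the observed sensors to classify the group of activities. The first layer of this HMM is characterized by $\lambda_1 = \{\pi, A, B\}$, where $\pi$ is the initial state distribution, $A$ is the transition probability matrix, and $B$ is the observation probability matrix for this layer. We assume each group of activities $G_i$ is a hidden state and the location information of the observed sensor $x^L_t$ is the observation for the first layer of HMM. The graphical representation of the first layer of HMM is shown in Figure 4. Writing $g_t$ for the hidden group at time $t$, the initial state distribution is denoted by

$$\pi_i = P(g_1 = G_i), \quad 1 \le i \le m, \tag{1}$$

where $m$ is the number of probability distributions (one per activity group). The state transition probability distribution represents the probability of transition from state $i$ to state $j$ and is represented by

$$a_{ij} = P(g_t = G_j \mid g_{t-1} = G_i), \quad 1 \le i, j \le m. \tag{2}$$

The observation probability distribution indicates the probability that the state $j$ would generate observation $x^L_t$:

$$b_j(x^L_t) = P(x^L_t \mid g_t = G_j), \quad 1 \le j \le m. \tag{3}$$

The hidden state $g_t$ at time $t$ depends only on the previous hidden state $g_{t-1}$. The observation variable $x^L_t$ at time $t$ depends only on the hidden variable $g_t$. The goal is to find the joint probability distribution:

$$P(g_{1:T}, x^L_{1:T}) = P(g_1)\, P(x^L_1 \mid g_1) \prod_{t=2}^{T} P(g_t \mid g_{t-1})\, P(x^L_t \mid g_t). \tag{4}$$

The observation distribution $P(x^L_t \mid g_t)$ represents the probability that the activity group $g_t$ would generate $K$ distinct observation symbols. We apply the naïve Bayes assumption for the observation distribution as

$$P(x^L_t \mid g_t) = \prod_{i=1}^{K} P(x^{L,i}_t \mid g_t), \tag{5}$$

where $x^{L,i}_t$ is the $i$th component of the location observation at time $t$.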
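To make the factorization concrete, the following sketch evaluates the joint log-probability of a hidden state sequence and a sequence of binary observation vectors, using a naïve Bayes observation model. It is an illustration under stated assumptions: each observation component is modeled as an independent Bernoulli variable (the text only states the naïve Bayes assumption), and the names `log_bernoulli_nb`, `log_joint`, and the parameter layout are hypothetical.

```python
import math
from typing import List, Sequence


def log_bernoulli_nb(x: Sequence[int], p: Sequence[float]) -> float:
    """log P(x | state) with independent components; p[i] = P(x_i = 1 | state).

    Assumes every p[i] lies strictly between 0 and 1.
    """
    return sum(math.log(pi if xi else 1.0 - pi) for xi, pi in zip(x, p))


def log_joint(states: Sequence[int], obs: Sequence[Sequence[int]],
              pi: Sequence[float], A: List[List[float]],
              P: List[List[float]]) -> float:
    """log of: initial prob * first emission * product of (transition * emission)."""
    lp = math.log(pi[states[0]]) + log_bernoulli_nb(obs[0], P[states[0]])
    for t in range(1, len(states)):
        lp += math.log(A[states[t - 1]][states[t]])       # transition term
        lp += log_bernoulli_nb(obs[t], P[states[t]])      # emission term
    return lp


# Two hypothetical groups, two observation components, two time slices.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]
P = [[0.8, 0.1], [0.3, 0.6]]
joint = math.exp(log_joint([0, 1], [[1, 0], [0, 1]], pi, A, P))
```

Working in log space avoids numerical underflow when the sequence spans thousands of time slices, as it does in the datasets used here.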

Second Layer of HMM.
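In the second layer, the classifier uses the sensor data to classify the individual activity. For each activity group $G_k$, a separate HMM is used. Let $\lambda_2 = \{\lambda^1_2, \lambda^2_2, \dots, \lambda^m_2\}$ be the set of all HMMs in the second layer, in which each HMM is characterized by $\lambda^k_2 = \{\pi^k, A^k, B^k\}$, where $k$ is the index that represents the HMM to be used, $1 \le k \le m$ with $m = |\mathcal{G}|$. The first layer calculates the joint probability distribution and selects the most probable group; the index of that group determines which second-layer HMM is used to classify the individual activity. The graphical representation of the second layer of HMM is shown in Figure 5. We assume all the activities in a group $G_k = \{a^k_1, a^k_2, \dots, a^k_{n_k}\}$ are the hidden states of the corresponding HMM, whose observations are the sensor data. Writing $y^k_t$ for the hidden activity and $x^k_t$ for the observation at time $t$, the initial state distribution is denoted by

$$\pi^k_i = P(y^k_1 = a^k_i), \quad 1 \le i \le n_k, \tag{6}$$

where $n_k$ is the number of probability distributions (one per activity in $G_k$). The state transition probability distribution represents the probability of transition from state $i$ to state $j$ and is represented by

$$a^k_{ij} = P(y^k_t = a^k_j \mid y^k_{t-1} = a^k_i), \quad 1 \le i, j \le n_k. \tag{7}$$

The observation probability distribution indicates the probability that the state $j$ would generate observation $x^k_t$:

$$b^k_j(x^k_t) = P(x^k_t \mid y^k_t = a^k_j), \quad 1 \le j \le n_k. \tag{8}$$

The hidden state $y^k_t$ at time $t$ depends only on the previous hidden state $y^k_{t-1}$. The observation variable $x^k_t$ at time $t$ depends only on the hidden variable $y^k_t$. The goal is to find the joint probability distribution:

$$P(y^k_{1:T}, x^k_{1:T}) = P(y^k_1)\, P(x^k_1 \mid y^k_1) \prod_{t=2}^{T} P(y^k_t \mid y^k_{t-1})\, P(x^k_t \mid y^k_t). \tag{9}$$

The observation distribution $P(x^k_t \mid y^k_t)$ represents the probability that the activity $y^k_t$ would generate $K$ distinct observation symbols. We apply the naïve Bayes assumption for the calculation of the observation distribution as

$$P(x^k_t \mid y^k_t) = \prod_{i=1}^{K} P(x^{k,i}_t \mid y^k_t). \tag{10}$$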
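The way the first layer narrows the candidates for the second layer can be illustrated with a small sketch. It is a deliberate simplification: it works on the scores of a single time slice and ignores transitions, and the group table `GROUPS`, the function `select_activity`, and the score values are hypothetical, loosely based on the bathroom, kitchen, and bedroom examples in the Introduction.

```python
from typing import Dict, List, Tuple

# Hypothetical location-based grouping of activities.
GROUPS: Dict[str, List[str]] = {
    "bathroom": ["showering", "brushing teeth", "toileting"],
    "kitchen": ["preparing meal", "taking meal", "dishwashing"],
    "bedroom": ["sleeping"],
}


def select_activity(group_logp: Dict[str, float],
                    activity_logp: Dict[str, float]) -> Tuple[str, str]:
    """Pick the most probable group, then the best activity inside that group."""
    group = max(group_logp, key=lambda g: group_logp[g])      # first layer
    candidates = GROUPS[group]                                # restrict search
    activity = max(candidates, key=lambda a: activity_logp[a])  # second layer
    return group, activity


group_scores = {"bathroom": -1.0, "kitchen": -0.2, "bedroom": -3.0}
activity_scores = {
    "showering": -0.1, "brushing teeth": -2.0, "toileting": -2.5,
    "preparing meal": -0.5, "taking meal": -1.5, "dishwashing": -2.0,
    "sleeping": -4.0,
}
result = select_activity(group_scores, activity_scores)
```

In this example "showering" has the best activity score overall, but it is never considered because the first layer has already selected the kitchen group, so only three of the seven activities are evaluated.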

Learning the Model Parameters.
The model parameters are learned from training samples. To estimate the parameters from the sequences of observations, a Baum-Welch (BW) based algorithm is used [24,25]. In the BW method, for a given sequence of observations, an HMM with $N$ states, and an initial model $\lambda_0$, the forward-backward (FB) algorithm computes the expected number of state transitions and state emissions based on the current model (E-step) and then reestimates the model parameters (M-step) using the reestimation formulas during each iteration. After each iteration of the E-step and M-step, the likelihood of the observations increases until convergence to a stationary point occurs. The reestimation formulas are as follows.
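The probability of being in state $G_i$ at time $t$ and state $G_j$ at time $t + 1$, given the model $\lambda_1$ and the observation sequence $\Theta^L$, can be estimated using the forward ($\alpha$) and backward ($\beta$) variables as follows:

$$\xi_t(i, j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(x^L_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_t(i)\, a_{ij}\, b_j(x^L_{t+1})\, \beta_{t+1}(j)}. \tag{11}$$

The probability of being in state $G_i$ at time $t$, given the observation sequence $\Theta^L$ and the model $\lambda_1$, can be estimated as follows:

$$\gamma_t(i) = \sum_{j=1}^{m} \xi_t(i, j). \tag{12}$$

The reestimation of the first layer of HMM ($\lambda_1 = \{\pi, A, B\}$) is

$$\bar{\pi}_i = \gamma_1(i), \qquad \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad \bar{b}_j(v) = \frac{\sum_{t=1,\; x^L_t = v}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}.$$

Similarly, the probability of being in state $a^k_i$ at time $t$ and state $a^k_j$ at time $t + 1$, given the model $\lambda^k_2$ and the observation sequence $\Theta$, can be estimated using (11). Moreover, the probability of being in state $a^k_i$ at time $t$, given the observation sequence $\Theta$ and the model $\lambda^k_2$, can be estimated using (12).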
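One E-step and M-step of this procedure can be sketched as follows. The sketch is a textbook Baum-Welch iteration for a small HMM with a single discrete observation symbol per time step, not the authors' implementation: it omits the scaling needed for long sequences and does not use the naïve Bayes observation model, and all function and variable names are assumptions.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def forward_backward(obs: List[int], pi: List[float], A: Matrix,
                     B: Matrix) -> Tuple[Matrix, Matrix]:
    """Unscaled forward (alpha) and backward (beta) variables."""
    N, T = len(pi), len(obs)
    alpha = [[0.0] * N for _ in range(T)]
    beta = [[1.0] * N for _ in range(T)]
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
    return alpha, beta


def baum_welch_step(obs: List[int], pi: List[float], A: Matrix, B: Matrix):
    """One EM iteration; returns (new_pi, new_A, new_B, likelihood of old model)."""
    N, T, M = len(pi), len(obs), len(B[0])
    alpha, beta = forward_backward(obs, pi, A, B)
    like = sum(alpha[T - 1])
    # E-step: state posteriors (gamma) and transition posteriors (xi).
    gamma = [[alpha[t][i] * beta[t][i] / like for i in range(N)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / like
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    # M-step: reestimate parameters from expected counts.
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1)) for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == v) /
              sum(gamma[t][j] for t in range(T)) for v in range(M)] for j in range(N)]
    return new_pi, new_A, new_B, like


obs = [0, 1, 0, 0, 1, 1, 0]
pi1, A1, B1, like0 = baum_welch_step(obs, [0.6, 0.4],
                                     [[0.7, 0.3], [0.4, 0.6]],
                                     [[0.5, 0.5], [0.1, 0.9]])
_, _, _, like1 = baum_welch_step(obs, pi1, A1, B1)
```

Here `like1` is the likelihood of the observations under the reestimated model, which is never lower than `like0`; iterating until the improvement falls below a tolerance gives the stationary point mentioned above.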

Inferring the Best State Sequence.
To infer the best sequence of groups as well as activities, we use the Viterbi algorithm, which has been successfully applied with HMMs to solve many activity recognition problems [24]. The Viterbi algorithm finds the best sequence efficiently using dynamic programming, discarding a number of paths at each time step. Its computational cost is far lower than that of direct calculation over all possible state sequences and is of the same order as that of the forward-backward algorithm. Inference is done in a simultaneous manner: the groups and the activities are estimated simultaneously based on the Viterbi algorithm.
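A compact log-space version of the Viterbi recursion is sketched below for a generic discrete HMM. It shows the dynamic programming step only; the function name `viterbi`, the parameter layout, and the toy model are assumptions, and the coupling between the two layers is not modeled.

```python
import math
from typing import List


def viterbi(obs: List[int], pi: List[float], A: List[List[float]],
            B: List[List[float]]) -> List[int]:
    """Most probable hidden state sequence for a discrete HMM (log space).

    Assumes all probabilities used are strictly positive.
    """
    N = len(pi)
    delta = [math.log(pi[i]) + math.log(B[i][obs[0]]) for i in range(N)]
    back: List[List[int]] = []                 # back-pointers per time step
    for t in range(1, len(obs)):
        prev, delta, ptr = delta, [], []
        for j in range(N):
            # Keep only the best predecessor of state j; other paths are discarded.
            best_i = max(range(N), key=lambda i: prev[i] + math.log(A[i][j]))
            ptr.append(best_i)
            delta.append(prev[best_i] + math.log(A[best_i][j]) + math.log(B[j][obs[t]]))
        back.append(ptr)
    state = max(range(N), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(back):                 # follow back-pointers to the start
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Toy model with two states and three observation symbols.
best_path = viterbi([0, 1, 2], [0.6, 0.4],
                    [[0.7, 0.3], [0.4, 0.6]],
                    [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
```

Each time step compares every pair of states once, so the running time grows with the square of the number of states times the sequence length, whereas enumerating every state sequence grows exponentially with the sequence length.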

Experimental Setup and Results
We evaluate the performance of the proposed 2-layer HMM in recognizing ADLs and compare it with three well-known classifiers, naïve Bayes, CRF, and HMM, using three real datasets. Sensor data were segmented into time slices of length $\Delta t = 60$ seconds, giving a total of 36,000 time slices for the "Kasteren-A" dataset, 21,600 time slices for "Kasteren-B", and 28,800 time slices for "Kasteren-C". The activities and sensors are grouped based on location and tabulated in Table 3. The sensor networks deployed in the home environment generate a raw data stream. The sensor data streams are presented in three different feature representations [10].
Raw. This feature uses the sensor data directly as it was collected from the sensor network. The value is 1 when the sensor fires and 0 otherwise.

Change Point (CP). This feature indicates when a sensor changes value. The value is 1 when a sensor state goes from zero to one or vice versa and 0 otherwise.

Last-Fired (LF). This feature indicates which sensor fired last. The sensor that changed state last keeps the value 1 and changes to 0 when another sensor changes state.
We consider three distinct features, Raw, Change Point (CP), and Last-Fired (LF), and four combined features: Raw + CP, Raw + LF, CP + LF, and Raw + CP + LF. A combined feature representation is a concatenation of the feature matrices. Figure 6 shows the three distinct features.
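The three representations and their concatenation can be derived from the raw binary matrix as sketched below. The helper names are hypothetical, and two details the text leaves open are filled with explicit assumptions: every sensor is taken to be 0 before the first slice, and when several sensors change in the same slice the one with the highest index is treated as the last to fire.

```python
from typing import List

Matrix = List[List[int]]


def change_point(raw: Matrix) -> Matrix:
    """1 where a sensor's value differs from the previous slice, else 0."""
    prev = [0] * len(raw[0])        # assumption: all sensors start at 0
    cp = []
    for cur in raw:
        cp.append([int(a != b) for a, b in zip(prev, cur)])
        prev = cur
    return cp


def last_fired(raw: Matrix) -> Matrix:
    """1 for the sensor that changed state most recently, 0 for all others."""
    lf, last = [], None
    for row in change_point(raw):
        changed = [i for i, v in enumerate(row) if v]
        if changed:
            last = changed[-1]      # assumption: highest index wins ties
        lf.append([int(i == last) for i in range(len(row))])
    return lf


def concat(*features: Matrix) -> Matrix:
    """Concatenate feature matrices slice by slice (e.g. CP + LF)."""
    return [sum(rows, []) for rows in zip(*features)]


raw = [[0, 0], [1, 0], [1, 0], [0, 1], [0, 1]]
cp = change_point(raw)
lf = last_fired(raw)
cp_lf = concat(cp, lf)
```

With two sensors, the combined CP + LF matrix has four columns per slice: the two change-point values followed by the two last-fired values.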
The data were split into a test and a training set using a "leave one day out" approach, in which one day of sensor data is used for testing and the remaining days are used for training. The process is repeated for each day and the average performance is measured. The performance of the proposed 2-layer HMM is evaluated using precision, recall, and F-measure. F-measure data are expressed as mean ± standard deviation, and significance is analyzed using Student's $t$-test. Statistical significance is considered as $p < 0.05$. F-measure, precision, recall, and accuracy can be calculated using the confusion matrix shown in Table 4. The precision and recall are defined as

$$\text{Precision} = \frac{1}{n} \sum_{i=1}^{n} \frac{TP_i}{TI_i}, \qquad \text{Recall} = \frac{1}{n} \sum_{i=1}^{n} \frac{TP_i}{TT_i},$$

and the F-measure and accuracy as

$$\text{F-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \qquad \text{Accuracy} = \frac{\sum_{i=1}^{n} TP_i}{\text{Total}},$$

where $n$ is the number of activity classes and Total is the total number of time slices.

The recognition accuracy of each activity for the three datasets is shown in separate confusion matrices. The highest accuracy for each activity is achieved with the "Change Point + Last-Fired" feature. Tables 5-7 show the confusion matrices for the 2-layer HMM using the "Change Point + Last-Fired" feature on the three datasets. The average F-measure values for the three datasets are tabulated in Tables 8-10. Rows in the tables correspond to the different distinct and combined features, whereas columns represent the experimental results for each activity recognition model. The highest F-measure value is achieved with the "Change Point + Last-Fired (CP + LF)" feature.
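The evaluation measures can be computed from a confusion matrix as in the following sketch. It assumes rows hold the true labels and columns the inferred labels, averages precision and recall over classes with equal weight, and counts a class with no inferred (or no true) instances as contributing zero; the function name `evaluate` and the example matrix are hypothetical.

```python
from typing import List, Tuple


def evaluate(cm: List[List[int]]) -> Tuple[float, float, float, float]:
    """Return (precision, recall, f_measure, accuracy) from a confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[i][i] for i in range(n)]                             # true positives
    tt = [sum(cm[i]) for i in range(n)]                           # total true labels
    ti = [sum(cm[r][i] for r in range(n)) for i in range(n)]      # total inferred labels
    precision = sum(tp[i] / ti[i] for i in range(n) if ti[i]) / n
    recall = sum(tp[i] / tt[i] for i in range(n) if tt[i]) / n
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = sum(tp) / total
    return precision, recall, f_measure, accuracy


# Two hypothetical classes with ten time slices each.
precision, recall, f_measure, accuracy = evaluate([[8, 2], [1, 9]])
```

Because every class carries the same weight in the averages, a rare activity influences the F-measure as much as a frequent one, which accuracy alone would hide on datasets dominated by a few classes.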
The experimental results for House-A of the Kasteren datasets are similar to those for House-B. The 2-layer HMM achieves the best F-measure value for all datasets using the distinct feature "Change Point (CP)" and the combined feature "Change Point + Last-Fired (CP + LF)." The average performance of each activity model over all datasets is shown in Figure 7. From Figure 7 we can conclude that the 2-layer activity model outperforms the other activity models.
Time complexity arises while calculating the joint probability of each state sequence with the observed series of events. For example, an HMM having $N$ states needs $N^2$ state transition probabilities and $2N$ output probabilities, and deriving the probability of an output sequence of length $T$ costs on the order of $N^2 T$ operations. We calculate the output sequence length $T$ by multiplying the number of symbols emitted in the corresponding layer by the number of time slices, each with a duration of 60 s (1440 slices per day). In House-A of the Kasteren dataset, the number of hidden states and the number of observation symbols in each state for the HMM are 10 and 14, respectively. The time complexity can then be calculated as $N^2 T = 10^2 \times 14 \times 1440 = 2{,}016{,}000$. For the 2-layer HMM we consider 5 hidden states and 14 observation symbols in the first layer. The second layer is composed of 5 sublayers, and the numbers of hidden states and observation symbols for each sublayer are 1 and 1, 2 and 1, 1 and 1, 4 and 7, and 1 and 2, respectively. The time complexity for the 2-layer HMM can be calculated as $N^2 T = (5^2 \times 14 + 1^2 \times 1 + 2^2 \times 1 + 1^2 \times 1 + 4^2 \times 7 + 1^2 \times 2) \times 1440 = 676{,}800$. Similarly, the time complexity of the HMM and the proposed 2-layer HMM can be calculated for House-B and House-C of the Kasteren datasets; the results are tabulated in Table 11.
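The House-A figures can be reproduced with a few lines of arithmetic. The helper `hmm_cost` is a hypothetical name; it simply evaluates the squared number of states times the number of observation symbols times the 1440 one-minute slices in a day, using the state and symbol counts quoted in the text.

```python
from typing import List, Tuple


def hmm_cost(n_states: int, n_symbols: int, slices_per_day: int = 1440) -> int:
    """Cost estimate: states squared times sequence length (symbols * slices)."""
    return n_states ** 2 * n_symbols * slices_per_day


# Single-layer HMM for House-A: 10 hidden states, 14 observation symbols.
flat_cost = hmm_cost(10, 14)

# 2-layer HMM: first layer (5 states, 14 symbols) plus five second-layer
# sub-HMMs given as (hidden states, observation symbols).
sublayers: List[Tuple[int, int]] = [(1, 1), (2, 1), (1, 1), (4, 7), (1, 2)]
two_layer_cost = hmm_cost(5, 14) + sum(hmm_cost(n, k) for n, k in sublayers)

ratio = two_layer_cost / flat_cost
```

The resulting `ratio` is roughly one third, which is the reduction reported for House-A; most of the remaining cost comes from the first layer, since it still sees all 14 observation symbols.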

Conclusion
We have presented a new approach to recognizing ADLs using binary sensors in home environments. We proposed a 2-layer HMM that uses location information in the first layer and object-use information in the second layer. We used three real datasets to evaluate the recognition performance of the proposed model and compared it with other activity recognition models. For evaluation we considered the F-measure, a measure that treats the correct classification of each class as equally important. F-measure data were expressed as mean ± standard deviation, and significance was analyzed using Student's $t$-test. Statistical significance was considered as $p < 0.05$. The comparison results show that combining location data with object-use data can lead to significantly better performance in real-world activity recognition. This work demonstrates that ADL recognition can be done effectively by a binary sensor network deployed in a home environment. In the future we can apply machine learning schemes for learning the model parameters.

Figure 4: The graphical representation of the HMM used in the first layer of the activity model.

Table 5: Confusion matrix of two-layer HMM using "Change Point + Last-Fired" feature for House-A of Kasteren dataset.
Table 6: Confusion matrix of two-layer HMM using "Change Point + Last-Fired" feature for House-B of Kasteren dataset.
Activity labels: Breakfast, Brushing teeth, Dinner, Drinking, Dressing, Leaving house, Others, Preparing Breakfast, Preparing Dinner, Sleeping, Showering, Toileting, Using dishwasher. Columns give the predicted activity, followed by recall.

Figure 7: Performance comparison with the existing systems.
Table 2 shows percentage (%) of instances in each activity class of the used datasets.
Figure 3: Classification of the location of sensor installed with different objects for each activity group.
Table 2: Instances of each activity class in Kasteren dataset measured in percentage (%).
Figure 5: The graphical representation of the HMM used in the second layer of the activity model.
In the second layer, the activities of group $G_k$ are the hidden states, where $1 \le k \le m$, and the sensor data $x_t$ at time $t$ are the observations for the second-layer HMM. $\Theta^k = \{x^k_1, x^k_2, \dots, x^k_T\}$ is the observed sensor sequence, where $k$ represents the index of the second-layer HMMs and $x^k_1 \in \Theta$. The initial state distribution is denoted by $\pi^k$.

Table 3: Details of group information for Kasteren datasets: activity classification based on location data.

Table 4: Confusion matrix showing the true positives (TP), total of true labels (TT), and total of inferred labels (TI) for each class of activity.

Table 7: Confusion matrix of two-layer HMM using "Change Point + Last-Fired" feature for House-C of Kasteren dataset.

Table 10: F-measure expressed in percentage (%) for House-C of Kasteren datasets.
Deriving the probability of an output sequence of length $T$ with an HMM of $N$ states has a time complexity of order $N^2 T$ [26]. Time complexity becomes unmanageable for realistic problems, as the number of possible hidden state sequences is typically extremely high. To reduce the time complexity of the probabilistic model, we proposed the 2-layer HMM. Using the 2-layer HMM we can reduce the number of hidden states as well as the number of symbols in each hidden state. A comparison of the time complexity of the HMM and the 2-layer HMM is tabulated in the following table.

Table 11: Comparison of time complexity ($N^2 T$) for different models.