In-vehicle localization based on multi-channel Bluetooth Low Energy received signal strength indicator

High-precision in-vehicle localization is the basis for both in-vehicle location-based services and the analysis of driver and passenger behavior. However, interference such as multipath and signal reflection poses great challenges to positioning accuracy in the in-vehicle environment. This article presents a novel high-precision in-vehicle localization method, LOC-in-a-Car, which exploits and makes full use of the multi-channel received signal strength indicator of Bluetooth Low Energy. To achieve higher positioning precision, our method uses a hierarchical computation algorithm based on Adaboost and support vector machine. We also propose a device calibration method to deal with the heterogeneity of different smartphone terminals. We developed an Android app that implements the channel time-sharing acquisition method, enabling smartphones to distinguish data from multiple channels. The system performance is verified via intensive experiments, whose results show that our method can distinguish the locations of the driver and passengers with an accuracy ranging from 86.80% to 92.02% for each seat on a Nexus phone, with an overall accuracy of 89.86% and a standard deviation of 2.64%. On a Huawei phone, the accuracy ranges from 85.43% to 93.33%, with an overall accuracy of 89.75% and a standard deviation of 3.07%. Both outperform the existing methods.


Introduction
Location-based service (LBS) has been a very popular topic in recent years, as it brings great convenience to daily life. As branches of LBS, indoor and in-vehicle LBS have demonstrated huge potential. For example, indoor LBS is necessary for navigation in markets, while in-vehicle LBS is expected to offer a more comfortable environment inside a car, including dynamic temperature adaptation based on tracking where passengers are seated. Hence, the demand for indoor and in-vehicle LBS technology has been growing as well.
Typically, outdoor localization depends on GPS. Unfortunately, the GPS signal is usually weakened indoors because it is blocked by buildings. For this reason, wireless sensors based on Bluetooth, ZigBee, and Wi-Fi are widely used for indoor localization. However, compared with indoor conditions, the in-vehicle environment is extremely limited in space, and the position and posture of the driver and passengers interfere with signal strength. Hence, those wireless sensors cannot be directly used in the in-vehicle environment. Other challenges that in-vehicle localization faces include the following:
1. Limited in-vehicle resources. The devices for in-vehicle localization must already be available aboard or be easy to install.
2. Precise localization requirement. Localization error should be much less than 1 m. Given the size of a typical vehicle, most existing methods cannot satisfy this requirement. New mechanisms for localization with higher precision are required.
3. Little interference. Safety is the top priority. The mechanism should minimize users' involvement and must not interfere with the driver.
Author affiliation: 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
To address these challenges and provide better in-vehicle LBS, we propose a hierarchical computation algorithm for in-vehicle localization, LOC-in-a-Car, which utilizes multi-channel Bluetooth Low Energy (BLE) received signal strength indicators (RSSIs) to achieve localization in a car. The localization process is as follows. In the training phase, three BLE transmitters are installed in a car and transmit signals. We then record the collected BLE RSSIs with multi-channel information, together with the position of the receiver, as the training data. Note that some receiving terminals, for example, smartphones, cannot differentiate multiple channels. For these terminals, we propose a time-sharing broadcasting approach in which the Bluetooth transmitters transmit data via different channels in different time periods. In this way, the terminal can collect and recombine data from different channels and record the corresponding RSSIs. To enhance the overall accuracy, we deploy hierarchical classifiers, comprising a coarse classifier and a fine one, both trained on the training data. In the testing phase, we first collect testing data using another transmitter and receiving terminal to simulate a real scenario. To reduce the impact of differences between transmitters and receiving terminals, data filtering and device calibration are conducted. We then apply the trained models to the testing data to predict locations. To improve accuracy, the estimate is repeated several times and the position with the highest probability is declared the localization result. Experiments show that the proposed method can determine a passenger's location in a car with an accuracy in the range of 86%-92%, with an average of 89.86%.
The remainder of this article is organized as follows: Section "Related works" introduces the related works and device calibration algorithms. The proposed in-vehicle localization method, including the data processing, location-identifying algorithms, and device calibration method, is detailed in section "Method and algorithm design." Section "Experiments and results" contains a validation experiment and the corresponding results. Section "Conclusion and future works" concludes this article.

Related works
An ideal localization system should work in a fast, convenient, and reliable way. Among the previous studies on indoor localization, the traditional localization methods include the following:
1. Three-point localization. 1 This method measures and estimates the distances between the point to be located and three different pivot points. Its error is approximately 1-3 m, which is not accurate enough.
2. Methods based on time of arrival (TOA), 2 time difference of arrival (TDOA), and angle of arrival (AOA). 3 TOA records the arrival time of signals transmitted from at least three points. TDOA records the arrival time of signals from at least two points and the propagation delays. Similar to TDOA, AOA records the arrival time of signals from at least two points, and it also records the angles between the signals' headings. Given the distances between transmitters and receiver, localization can be realized using the recorded information. Note that time synchronization is essential for these methods.
3. Fingerprint method. 4 This method requires recording the data packets' RSSI at different places.
4. The method introduced in Yang et al., 5 which combines the signal with sensor information and the original location to determine the current location.
Besides these traditional methods, Wi-Fi-based localization 6 has become popular in recent years. Some related works take advantage of channel state information (CSI), which describes the propagation of Wi-Fi signals, 7 to realize localization, with encouraging results.
However, as mentioned above, localization in a car is even more challenging than indoor localization because of the limited space and available signal intensity. In-vehicle localization mechanisms must, therefore, conserve space and power. Furthermore, the special environment in a car rules out some indoor localization methods for the following reasons:
1. The errors of traditional localization methods are approximately 1-3 m, 8 which satisfies the requirement of indoor conditions but not that of a car (which typically requires an accuracy within 40 cm).
2. The differences between the sensor signals detected from different passengers are so small that they cannot be used in multiple-sensor processing.
3. A Wi-Fi-based localization system is hard to deploy in a car. Moreover, the in-vehicle environment has its own particularity: vehicles are highly dynamic and move over a wide range, which makes Wi-Fi-based localization uncertain.
To achieve in-vehicle localization, many methods have been proposed. The simplest way is to install pressure sensors under the seats. By detecting pressure changes, it is easy to know which seats are loaded. 9 However, this method cannot tell whether a human being or goods are on the seat. Wang et al. 10 introduce another method that uses acceleration sensors to detect the speed and acceleration differences between the two sides of a car. When the car turns left, the speed and acceleration of the left side are smaller than those of the right side. However, this method depends on the turning operation, and it can only distinguish the left side from the right side rather than all the locations in a car. Yang et al. 11 describe a novel method to identify the driver's phone using car speakers (it requires four extra speakers that can emit beeps at a specific frequency). The experimental results are encouraging; however, installing so many speakers in a car is inconvenient and costly, especially when all the other facilities have already been mounted.
In this article, inspired by the Wi-Fi CSI-based method, we propose a novel method to achieve in-vehicle localization. Instead of Wi-Fi, we adopt BLE as the wireless signal and replace CSI with RSSI. BLE is a relatively new Bluetooth protocol, introduced with Bluetooth 4.0, which can broadcast the information used for localization. Moreover, most BLE transmitters are small, which makes the BLE RSSI-based method well suited to in-vehicle localization.
In our recent research, 12 we used multi-channel BLE-based localization; however, that work did not take the differences between receiving terminals and between BLE transmitters into consideration. Device calibration is an important part of localization. When positioning, users may use different receiving terminals, such as personal computers (PCs), personal digital assistants (PDAs), and smartphones. The hardware chips and underlying software used by the receiving terminals lack a uniform standard, which may degrade the localization performance. More specifically, the differences between devices, namely, the device heterogeneity problem, can result in a mismatch between the data collected in the model's training phase and those collected in the testing phase. Therefore, device calibration is an essential procedure of the whole localization process.
To solve this problem, the most common approach is to adjust the RSSI manually. 13 It requires data to be collected at some specific locations with both the training devices and the testing devices, and data for the other places are obtained by interpolation with linear models. However, this method is time-consuming when the number of devices is large. Other papers 14-16 on device calibration in the field of indoor localization mostly use a first-order linear relationship to achieve the mapping. The relationship between different devices can be written as $\mathrm{RSSI}_{\text{training}} = a \times \mathrm{RSSI}_{\text{testing}} + b$, where the parameters a and b must be determined. With the least squares approach, these two parameters can be determined easily.
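The least squares fit of the first-order mapping can be sketched as follows. This is a minimal illustration, not the procedure of the cited papers: the function name `fit_linear_calibration` and the paired RSSI readings are hypothetical.

```python
from statistics import mean


def fit_linear_calibration(testing, training):
    """Return (a, b) minimising sum((a * x + b - y) ** 2) over paired readings.

    x runs over the testing-device RSSIs and y over the training-device RSSIs
    measured at the same spots.
    """
    if len(testing) != len(training) or len(testing) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mx, my = mean(testing), mean(training)
    sxx = sum((x - mx) ** 2 for x in testing)
    if sxx == 0:
        raise ValueError("testing RSSI values must not all be identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(testing, training))
    a = sxy / sxx          # slope of the closed-form least squares solution
    b = my - a * mx        # intercept
    return a, b


# Hypothetical paired readings (dBm), not data from the article.
testing_rssi = [-70.0, -65.0, -60.0, -55.0]
training_rssi = [-66.0, -62.0, -58.0, -54.0]

a, b = fit_linear_calibration(testing_rssi, training_rssi)
calibrated = [a * x + b for x in testing_rssi]
```

With these made-up readings the fit maps every testing value exactly onto its training counterpart, because the sample was constructed to be perfectly linear; real device pairs would leave a residual.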

Method and algorithm design
To determine the passengers' positions in a car, we propose a multi-channel BLE RSSI localization algorithm called "LOC-in-a-Car," which makes full use of the multi-channel RSSI of BLE and is based on hierarchical probabilistic computation models. The models are first built on the training data and then used for online location determination. To illustrate the principle and procedure of the algorithm, we first introduce some basic theory of the Bluetooth localization method.

Multi-channel BLE RSSI
Localization based on signal frequency has been used in previous works, 17,18 which exploit the fact that the RSSI is influenced by signal frequency. Roughly, the RSSI can be estimated by equation (1), where $P_r$ denotes the received power, which is determined not only by the transmitted power ($P_t$) and the distance between transmitter and receiver but also by the frequency of the wireless signal. Ceylan et al. 17 and Capriglione et al. 18 used different wireless signals to obtain the frequency difference, aiming to enhance the accuracy of localization; however, their wireless signals at different frequencies are resolved using a frequency demultiplier and a frequency multiplier, which cannot be deployed in vehicles.
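As an illustration of how frequency enters the received power, the free-space (Friis) model is often used. It is an assumption made here for illustration and may not be the exact form of equation (1); the antenna gains $G_t$ and $G_r$, the distance $d$, the carrier frequency $f$, and the speed of light $c$ are symbols introduced for this sketch.

```latex
P_r = P_t \, G_t \, G_r \left( \frac{c}{4 \pi d f} \right)^{2}
\qquad\Longleftrightarrow\qquad
P_r^{\mathrm{dBm}} = P_t^{\mathrm{dBm}} + G_t^{\mathrm{dB}} + G_r^{\mathrm{dB}}
  - 20 \log_{10} d - 20 \log_{10} f - 20 \log_{10} \frac{4 \pi}{c}
```

Under this idealized model, a higher carrier frequency gives a lower received power at the same distance. Inside a car, multipath adds further frequency-dependent variation that the model does not capture.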
BLE4.0 has 40 channels in total and 3 of them are used for broadcast. Table 1 shows the distribution of frequency and type.
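The three broadcast (advertising) channels of Bluetooth 4.0 are channels 37, 38, and 39, centred at 2402, 2426, and 2480 MHz. The sketch below lists them and, assuming an idealized free-space link (an assumption of this example, as are the function name and the 0.5 m distance), estimates how much the path loss differs between the lowest and highest of them.

```python
import math

# Advertising channel index -> centre frequency in MHz (Bluetooth 4.0).
ADVERTISING_CHANNELS_MHZ = {37: 2402, 38: 2426, 39: 2480}

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_path_loss_db(distance_m, freq_mhz):
    """Free-space path loss in dB; ignores multipath and antenna effects."""
    freq_hz = freq_mhz * 1e6
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


# Path loss on each advertising channel over a hypothetical 0.5 m link.
loss_db = {
    channel: free_space_path_loss_db(0.5, freq)
    for channel, freq in ADVERTISING_CHANNELS_MHZ.items()
}

# Gap between the highest and lowest advertising channel.
spread_db = loss_db[39] - loss_db[37]
```

In free space alone the gap between channels 37 and 39 is only a fraction of a decibel, so larger per-channel RSSI differences observed in practice would have to come from frequency-dependent effects such as multipath, which this sketch does not model.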
From Table 1, we can find the frequency differences between broadcast channels. When broadcasting, all the three broadcast channels are used; however, most receivers just record the RSSI and ignore the channel information. In this article, we propose a method that can make full use of the BLE broadcast channels. We adopt the RSSIs in different channels as the source of localization data. When the transmitters broadcast, the receivers record not only the values of RSSIs but also the channel information (named discrete channel) for further analysis. To verify the assumption that the RSSIs in different channels can result in different values, we collect the RSSI in the same setting and record the discrete channel and integral channel (just record the RSSI and ignore the channel information), and the result is shown in Figure 1.
As shown in Figure 1, the discrete-channel and integral-channel RSSIs in the same setting are quite different. For broadcast data packets with the same setting, the RSSIs from different channels differ considerably. Using discrete RSSI values for localization has the following advantages:
1. Differences between channels cause differences in RSSI, which carry more useful information for identifying the locations.
2. Propagation effects may cause the integral RSSIs collected from different locations to be identical. Discrete RSSI increases the data dimension and decreases the probability of receiving indistinguishable RSSIs in this situation.
3. Discrete channels help the classifiers focus on the differences between the RSSIs from different locations rather than the differences between the channels belonging to the same broadcast packet.
Based on the above considerations, we choose discrete-channel BLE RSSIs as the source of data for further fusion.

Data filtering and device calibration
Data filtering. The RSSI is vulnerable to multipath effects. The experiments conducted by Gaertner and Cahill 19 showed that signal intensity is affected by blocking by the human body, the user's orientation, and so on. Although passengers and facilities in a vehicle are relatively fixed in position, the movements of the driver and passengers are unpredictable, producing outliers that can cause inaccurate positioning results.
Here, the autoregressive moving average (ARMA) model is introduced to implement the filtering. The ARMA model is an important method for studying time series and has been used for long-term tracking in market research. It consists of an autoregressive (AR) model of order p and a moving average (MA) model of order q, so that ARMA(p, q) is composed of AR(p) and MA(q). We then select an ARMA model according to the least mean square (LMS) between the raw data $X_i$ and the filtered data $Y_i$: the model with the least LMS is used for data filtering. The values of p and q should not be too large, so p and q are searched over the range [1, 5], and the pair with the least LMS is used for filtering. An ARMA model is constructed for each channel of each transmitter.
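For reference, a textbook form of the ARMA(p, q) model is sketched below. The symbols $c$, $\varphi_i$, $\theta_j$, and $\varepsilon_t$ follow the usual convention and are assumptions here, since the article's own notation for this model is not available; $X_t$ denotes the RSSI sample at time step $t$.

```latex
X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}
```

The first sum is the AR(p) part and the second sum is the MA(q) part. With p and q each searched over [1, 5], there are 25 candidate (p, q) pairs for every channel of every transmitter.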
Device calibration. Regarding device calibration, many papers 13,20,21 assume that a linear relationship exists between different types of terminals. However, based on the data collected with different smartphones, the calculated Pearson correlation between the training data and testing data is 0.2752, indicating that there is almost no linear relationship between them. To solve the calibration problem, a mapping function f is defined to map the testing RSSI to the training RSSI for different types of terminals, as given in equation (4)
$$\mathrm{RSSI}^{k}_{\text{training}} = f\left(\mathrm{RSSI}^{k}_{\text{testing}}\right) + \mathrm{err} \quad \text{for every } k \tag{4}$$
where k stands for the terminal and f stands for the mapping relation. A mapping relationship is constructed for each testing terminal.
To represent the non-linear relationship, we use support vector regression (SVR) to map the data. Three kernel functions are commonly used with SVR: linear, polynomial, and radial basis function (RBF). Here, we use the RBF kernel, which performs better for non-linear relationships. It is given by $K(x_i, x_j) = \exp(-g \lVert x_i - x_j \rVert^2)$. Besides the kernel function, the penalty factor C and the kernel parameter g determine an SVR mapping.
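The RBF kernel above can be evaluated directly, as in the following minimal sketch. The function name `rbf_kernel` and the two three-channel RSSI vectors are hypothetical; the value of g is chosen only for illustration.

```python
import math


def rbf_kernel(x_i, x_j, g):
    """Return exp(-g * ||x_i - x_j||^2) for two equally long feature vectors."""
    if len(x_i) != len(x_j):
        raise ValueError("feature vectors must have the same length")
    if g <= 0:
        raise ValueError("g must be larger than 0")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-g * squared_distance)


# Hypothetical RSSI readings (dBm) on the three broadcast channels.
sample_a = [-60.0, -62.0, -65.0]
sample_b = [-61.0, -62.0, -63.0]

similarity = rbf_kernel(sample_a, sample_b, g=0.1)   # squared distance is 5
self_similarity = rbf_kernel(sample_a, sample_a, g=0.1)
```

A larger g makes the kernel value fall off faster with distance, so each training sample influences a narrower neighbourhood of the RSSI space.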
Ideally, the testing data plotted against the training data lie near the straight line y = x. To evaluate the validity of SVR models, we introduce the dispersibility d, which denotes the average distance between the data and this ideal distribution. Here, x stands for the training data, y stands for the testing data, and n is the number of data packets. Lower dispersibility means better calibration performance, that is, the distribution of the mapped data is closer to the ideal distribution.
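One plausible reading of the dispersibility is the mean perpendicular distance from each point (x, y) to the line y = x, which equals |x - y| / sqrt(2). The sketch below implements that reading; the exact formula used in the article is not available here, so this definition, the function name, and the sample values are all assumptions.

```python
import math


def dispersibility(training, testing):
    """Mean perpendicular distance of the points (x, y) from the line y = x."""
    if len(training) != len(testing) or not training:
        raise ValueError("need two non-empty samples of equal length")
    n = len(training)
    return sum(abs(x - y) for x, y in zip(training, testing)) / (math.sqrt(2) * n)


# Hypothetical RSSIs (dBm): reference device, raw phone data, calibrated phone data.
training = [-60.0, -55.0, -70.0]
testing_raw = [-66.0, -52.0, -79.0]
testing_calibrated = [-61.0, -55.0, -72.0]

d_before = dispersibility(training, testing_raw)
d_after = dispersibility(training, testing_calibrated)
```

Comparing `d_before` with `d_after` mirrors the before and after comparison used to judge a calibration model: the mapping is useful when the second value is clearly smaller.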

Hierarchical classifiers
After discrete RSSI of BLE transmitters is collected, the discrete RSSI is characterized to model the positions inside the vehicle. Standard models for different locations are constructed, which can be used to determine the locations.
Here, we use coarse and fine classifiers to determine the locations: the coarse classifier distinguishes the first row from the second row, while the fine classifier determines the final position. We extract features from the RSSIs of the three BLE transmitters, each with three channels. To avoid overfitting, principal component analysis (PCA) is used for dimensionality reduction.
Coarse classifier. The coarse classifier is a pre-classifier. It is used to identify the row in which a specific location lies within a car. Adaboost is chosen as the coarse classifier because it is regarded as one of the best classifiers 22 and can mine information from different perspectives. The main idea of this classifier is to combine many weak classifiers through iterative training. After this process, the row in which passengers are seated is known, and this information is used in the following step.
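The idea of combining weak classifiers through iterative reweighting can be sketched with threshold rules (decision stumps) as the weak learners. This is a minimal illustration of discrete Adaboost, not the implementation used in the experiments; the two-feature RSSI samples, the labels (+1 for the first row, -1 for the second row), and all function names are hypothetical.

```python
import math


def train_stump(X, y, w):
    """Find the single-feature threshold rule with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[f] >= thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best


def stump_predict(stump, row):
    _, f, thr, sign = stump
    return sign if row[f] >= thr else -sign


def adaboost_fit(X, y, rounds):
    """Train `rounds` weighted stumps; returns a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the others.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical features: [RSSI from a front transmitter, RSSI from a rear one].
X = [[-50, -70], [-52, -68], [-55, -72], [-70, -50], [-68, -55], [-72, -52]]
y = [1, 1, 1, -1, -1, -1]  # +1 = first row, -1 = second row

model = adaboost_fit(X, y, rounds=5)
row_of_new_sample = adaboost_predict(model, [-53, -69])
```

On real RSSI data no single threshold separates the rows perfectly, which is where the reweighting across rounds matters; the toy data here is separable by the first stump.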
Fine classifier. After the row is determined, the fine classifier is used to determine the final position of the passengers. Here, the support vector machine (SVM) is used for the classification. It can be seen as a supplement that provides information beyond that of Adaboost. With the help of a hyperplane and a kernel function, the SVM can handle non-linear as well as linear classification. Typically, the SVM is used for binary classification. To classify our training data into the two or three classes of each row of a typical car, we therefore perform the binary procedure $C_n^2$ times, where n is the number of positions in a row, resolving each pair of positions in the row before the final location is obtained.
After a model is constructed for every location, the models are used in localization testing.
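The pairwise scheme described above is commonly called one-vs-one classification: one binary decision per pair of positions, followed by a vote. The sketch below shows the pair count and the vote; `nearest_centroid_decide` is a hypothetical stand-in for a trained binary SVM, and the seat names and centroid values are made up.

```python
from collections import Counter
from itertools import combinations
from math import comb


def one_vs_one_predict(positions, pairwise_decide, sample):
    """Run one binary decision per pair of positions and return the majority."""
    votes = Counter(
        pairwise_decide(a, b, sample) for a, b in combinations(positions, 2)
    )
    return votes.most_common(1)[0][0]


# Hypothetical one-dimensional "models": a typical RSSI (dBm) per rear seat.
CENTROIDS = {"rear_left": -50.0, "rear_middle": -60.0, "rear_right": -70.0}


def nearest_centroid_decide(a, b, sample):
    """Stand-in for a binary SVM: pick whichever of a or b is closer."""
    return a if abs(sample - CENTROIDS[a]) <= abs(sample - CENTROIDS[b]) else b


rear_row = ["rear_left", "rear_middle", "rear_right"]
pairs_in_rear_row = comb(len(rear_row), 2)   # number of binary classifiers
pairs_in_front_row = comb(2, 2)
seat = one_vs_one_predict(rear_row, nearest_centroid_decide, -61.0)
```

A row with three positions therefore needs three binary classifiers, and a row with two positions needs only one.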

Experiments and results
To check the accuracy and validity of the proposed method, a series of experiments are conducted to verify the proposed localization procedure.

System equipment and model construction
In our experiments, the BLE transmitters were made by Zhongke Tianhe Technology, with a package size of 3.5 cm × 3.5 cm × 1.5 cm, and were based on the BLE 4.0 protocol; they were set to broadcast every 500 ms (the left device in Figure 2). When broadcasting, the RSSI, packet number, and channel information can be resolved from the data packet. The receiver used for collecting the BLE information is a professional receiver, a sniffer produced by Xunlian Electronics, whose output can be displayed in Wireshark. The receiver records the RSSI as well as the channel information of each Bluetooth data packet. The devices are shown in Figure 2.
To collect the training data used for determining the locations of passengers inside a car, three BLE transmitters were first mounted at different places in the car, and one BLE receiver was deployed under each seat, for a total of five receivers. Figure 3 shows the deployment of the transmitters and receivers. We chose three transmitters because fewer transmitters did not provide enough information, while more transmitters produced redundant data. The BLE transmitter near the driver-side door handle is called Trans1, the one near the front passenger-side door handle is called Trans2, and the one near the light above the rear middle seat is called Trans3. To reduce the influence of antenna position, the pointing directions of the transmitter antennas are controlled: the antennas of Trans1 and Trans2 point downward, while that of Trans3 points forward. To construct an accurate model that better reflects driving conditions, we collected data both in a fully occupied vehicle and in a vehicle with only one person inside.
The data were collected twice in a five-seat BYD to prevent correlations between the training data and testing data, so that we can conduct cross-validation to build an optimal model for every location. Each time, the data were collected at five locations with one person in the vehicle and with fully occupied vehicle. The data were collected for 30 min for every seat, and more than 40,000 data packets were obtained in total, with each data packet containing the RSSI from three channels of three BLE transmitters.
After the RSSIs from different positions were collected, the ARMA model was used for data filtering. An ARMA(p, q) model was constructed for every channel of each BLE transmitter, with the parameters chosen separately for each transmitter and channel. For example, Figure 4 shows one set of the original data and the data filtered by ARMA(3, 5). The filter eliminates the outliers without over-smoothing the data. The average of the original data is −51.63, while that of the filtered data is −51.57, which are quite close. The LMS between the original values and the filtered values is 1816.30, which is the least value among the candidate models.
Hierarchical classification models are then trained to determine the positions. Evaluating a classifier requires a balance between computing time and accuracy. To find the best trade-off, the accuracy is plotted against the number of iterations in Figure 5, together with the area under the curve (AUC), which evaluates the classifier result. The receiver operating characteristic (ROC) curve is also presented in Figure 5. Figure 5 shows that 40 iterations generate good results: the accuracy reaches 94.8%, with an AUC of 0.957. From the ROC curve, we find that the detection rate reaches 97.3% at a false-positive rate (FPR) of 0.07. The models are stored as JSON arrays. Then, the SVM classifier was used as the fine classifier to construct models for different locations, and LIBSVM 23 was adopted to build the best-matching model. The results show that the RBF kernel gives the best result with C = 8.0 and g = 0.5.

Positioning process
In the actual positioning process, smartphones are more likely to be used, so in the experiments, models were installed and run on smartphones. The flowchart of the positioning process is shown in Figure 6.
Experimental equipment. As illustrated above, LOC-in-a-Car requires the BLE discrete-channel RSSI. However, common receiving terminals cannot distinguish the channels of BLE broadcast data packets. To solve this problem, we propose a time-sharing broadcast method, in which the BLE transmitters broadcast on one channel at a time and the channel information and packet sequence number are added to the broadcast packet. By parsing the received packet, the terminal learns the packet sequence number and the channel information. A sample data packet is shown in Figure 7. Here, the packet sequence number is 0x49b2, while the channel number is 0x0025 (decimal 37).
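Recovering the two added fields on the receiving side can be sketched as follows. The byte layout is an assumption made for this example (a 2-byte sequence number followed by a 2-byte channel number, both big-endian, at the start of the custom payload); the actual position of these fields in the broadcast packet is shown only in Figure 7. The function name `parse_channel_tag` is hypothetical.

```python
import struct

ADVERTISING_CHANNELS = (37, 38, 39)


def parse_channel_tag(payload: bytes):
    """Return (sequence_number, channel) from the assumed 4-byte tag."""
    if len(payload) < 4:
        raise ValueError("payload too short for sequence and channel fields")
    # ">HH": two unsigned 16-bit integers, big-endian (assumed layout).
    sequence, channel = struct.unpack(">HH", payload[:4])
    if channel not in ADVERTISING_CHANNELS:
        raise ValueError(f"unexpected channel number {channel}")
    return sequence, channel


# The sample values quoted in the text: sequence 0x49b2, channel 0x0025.
sequence, channel = parse_channel_tag(bytes.fromhex("49b20025"))
```

Because a smartphone only reports one RSSI per received packet, tagging each packet with its channel in this way is what lets the app attribute the RSSI to channel 37, 38, or 39.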
An entire broadcasting package is split into three parts, with one packet for each part, and the packets are received sequentially on different channels. The data packets in Figure 8 show an entire broadcasting package. The upper part is the parsed information and the lower part is the original broadcast data. It can be seen that the packets have channel numbers from 37 to 39 with the same sequence number.
Each broadcast package was used to make one position estimate. To increase the accuracy, we used six consecutive packages to make six estimates and chose the position that occurred most often as the resulting position. We built a 9 × 6 virtual table for the six packages and the three transmitters with three channels each, as shown in Figure 9. When a receiver got a data packet, its RSSI was filled into the cell determined by its transmitter number, packet sequence number, and channel number.
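The virtual table and the vote over six packages can be sketched as below. This is an illustrative reconstruction under stated assumptions: the class and function names are hypothetical, sequence numbers are assumed to be aligned across the three transmitters, 16-bit wrap-around is ignored, and `toy_classify` stands in for the trained hierarchical classifiers.

```python
from collections import Counter

TRANSMITTERS = (1, 2, 3)
CHANNELS = (37, 38, 39)
PACKAGES = 6


class VirtualTable:
    """9 rows (transmitter x channel) by 6 columns (consecutive packages)."""

    def __init__(self, first_sequence):
        self.first_sequence = first_sequence
        rows = len(TRANSMITTERS) * len(CHANNELS)
        self.cells = [[None] * PACKAGES for _ in range(rows)]

    def add(self, transmitter, sequence, channel, rssi):
        """Store one RSSI; returns False if the packet is outside the window."""
        col = sequence - self.first_sequence
        if not 0 <= col < PACKAGES:
            return False
        row = TRANSMITTERS.index(transmitter) * len(CHANNELS) + CHANNELS.index(channel)
        self.cells[row][col] = rssi
        return True

    def complete_columns(self):
        """Yield each 9-value column in which every cell has been filled."""
        for col in range(PACKAGES):
            column = [row[col] for row in self.cells]
            if None not in column:
                yield column


def vote(table, classify):
    """Classify every complete column and return the most frequent position."""
    votes = Counter(classify(column) for column in table.complete_columns())
    return votes.most_common(1)[0][0] if votes else None


def toy_classify(column):
    # Hypothetical rule: compare transmitter 1 and transmitter 2 on channel 37.
    return "driver" if column[0] > column[3] else "front_passenger"


table = VirtualTable(first_sequence=100)
for seq in range(100, 106):
    disturbed = seq == 104  # one package with made-up outlier readings
    for ch in CHANNELS:
        table.add(1, seq, ch, -65 if disturbed else -50)
        table.add(2, seq, ch, -55 if disturbed else -60)
        table.add(3, seq, ch, -70)

position = vote(table, toy_classify)
```

In this toy run five of the six packages vote for the same seat, so the single disturbed package does not change the result, which is the purpose of voting over several packages.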
Data preprocessing. The received data were filtered with the established ARMA model to limit the influence of outliers. Then, as mentioned above, different receiving terminals went through different calibration models. We first used the sniffer (the device used for model construction), a Nexus phone (Nexus 5, Android 6.0), and a Huawei phone (Honor 6, Android 6.0) to collect data at every location in the car at the same time, and then analyzed them statistically. Figure 10 shows the boxplot of these data, from which the upper quartile, lower quartile, distribution, and bias of the data can be read. From this figure, we can conclude that device calibration must be applied.
In this article, we used five-fold cross-validation to establish the best SVR with the least dispersibility (Figure 11). Two parameters determine an SVR model with the Gaussian kernel: C, which controls the emphasis on outliers, and g, which should be larger than 0. The greater the value of g, the fewer the support vectors. LIBSVM is used to find the best parameters. The results show that the best calibration is obtained with C = 1000, g = 0.1 for Nexus and C = 500, g = 0.5 for Huawei. Figure 11 shows the scatter diagram of the standard data and received data before and after calibration. The dispersibility of Nexus is reduced from 7.425 to 3.758, while the dispersibility of Huawei is reduced from 9.646 to 2.358. These results show that the SVR mapping calibrates both devices effectively. These calibration models can be used in the localization stage.
Besides the differences between the receiving terminals, the differences between BLE transmitters also need to be considered. To check these differences, three different BLE transmitters were used at the same time, with the receiving terminals collecting RSSI at 15 cm, 40 cm, and 70 cm from the BLE transmitters in an indoor environment. We use these distances because they represent the estimated distances from each of the five seats to the three BLE transmitters during in-vehicle positioning. To obtain the RSSI from different BLE transmitters more accurately, the RSSIs were recorded for 20 min and then averaged. Two receiving terminals were used to ensure reliability. Table 2 shows the means of the RSSIs received from different BLE transmitters with Nexus and Huawei.
Differences between the BLE transmitters are shown in Table 3. Differences of 15 cm, 40 cm, and 70 cm are calculated, respectively.
It can be found that the differences between BLE transmitters at 15 cm, 40 cm, and 70 cm are nearly the same. For the same pair of BLE transmitters, the difference between the maximum and the minimum gap is at most about 2.17 (gap between BLE transmitter 1 and BLE transmitter 2, data collected by Nexus, 4.513 − 2.340 = 2.173). Therefore, in this article, it is assumed that the gap between different transmitters is the same at every distance and can be used for error elimination.
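The check behind this assumption can be sketched as follows: compute the gap between two transmitters at each distance, look at how much the gap varies, and use a single averaged gap as the correction. The mean RSSI values below are hypothetical, and averaging the three gaps is an assumption of this sketch rather than the article's stated formula.

```python
from statistics import mean

DISTANCES_CM = (15, 40, 70)


def transmitter_gap(rssi_a, rssi_b):
    """Return the per-distance gaps (a minus b) and their mean."""
    gaps = {d: rssi_a[d] - rssi_b[d] for d in DISTANCES_CM}
    return gaps, mean(list(gaps.values()))


# Hypothetical mean RSSIs (dBm) of two transmitters at 15, 40, and 70 cm.
reference_tx = {15: -45.0, 40: -55.0, 70: -62.0}
other_tx = {15: -48.5, 40: -57.5, 70: -66.0}

gaps, mean_gap = transmitter_gap(reference_tx, other_tx)
gap_spread = max(gaps.values()) - min(gaps.values())

# Shift a reading from the other transmitter onto the reference scale.
raw_reading = -60.0
aligned_reading = raw_reading + mean_gap

# The spread quoted in the text for transmitters 1 and 2 on the Nexus phone.
quoted_spread = round(4.513 - 2.340, 3)
```

When `gap_spread` is small compared with the RSSI differences between seats, treating the gap as a constant offset is a reasonable simplification.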
When positioning, users should provide the RSSI of the BLE transmitters to be used, measured at 15 cm, 40 cm, and 70 cm in an indoor environment. The device calibration model has been constructed and stored in the system, and the gaps at the different distances ($d_{15}$ at 15 cm, $d_{40}$ at 40 cm, and $d_{70}$ at 70 cm) are calculated to obtain the gap d of the BLE transmitters by equation (3). Then, the final RSSI is obtained from the standard RSSI ($\mathrm{RSSI}_{\text{std}}$) and d.
Localization system. To test the localization model and the methods, we set up a localization system comprising the proposed time-sharing method deployed on the BLE transmitters, the ARMA filter, the device calibration method, the hierarchical classification models, the virtual table method, and a localization app deployed on smartphones. The app user interfaces (UIs) are shown in Figure 13; they require the user to provide information about the BLE transmitters, for example, the media access control (MAC) address and the RSSI measured at 15 cm, 40 cm, and 70 cm, to reduce the error caused by different BLE transmitters. The k-nearest neighbors (KNN) and three-point localization methods were also implemented for comparison with LOC-in-a-Car. When positioning, users can choose the localization method, and the result is shown on the second page of the app as text and a graph. We carried out experiments in which the cell phone was placed 15 cm to the left, 15 cm to the right, and 15 cm to the front of the subject's body to simulate the real situation of people in a car.

Result
To evaluate the performance of the proposed method on different smartphone terminals, we defined two evaluation criteria: overall accuracy and accuracy for each seat. The overall accuracy $\mathrm{Acc}_{\text{overall}}$ is defined as equation (7)
$$\mathrm{Acc}_{\text{overall}} = \frac{\sum \mathrm{Res}_{\text{corr}}}{\sum \mathrm{Res}} \tag{7}$$
where $\sum \mathrm{Res}_{\text{corr}}$ is the number of correct tests and $\sum \mathrm{Res}$ is the number of total tests. The accuracy for each seat, on the other hand, is the ratio of the number of correct tests to the number of total tests for that seat, which is denoted as equation (8).
In our experiment, we randomly placed a Nexus 5 as well as a Huawei phone on a seat of a five-seat BYD electric vehicle. If the location shown in the app was the same as the actual location, we recorded it as a correct test. We conducted 700 rounds of tests for each seat using those two types of phones with different methods. The corresponding results are shown in Tables 4 and 5, respectively. From Tables 4 and 5, we can see that our proposed method has a higher accuracy and a smaller standard deviation on both the Nexus phone and the Huawei phone, showing that our method outperforms the KNN and three-point methods in accuracy and stability. In addition, for each seat, we obtained similar results on the two different types of terminals, which validates the effectiveness of our calibration method.
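The two criteria can be computed as in the sketch below. The counts of correct tests are hypothetical and are not the values of Tables 4 and 5; only the 700 tests per seat come from the text. Using the sample standard deviation across seats is an assumption, since the article does not state which variant it reports.

```python
from statistics import stdev

TESTS_PER_SEAT = 700  # number of test rounds per seat, as stated in the text

# Hypothetical counts of correct tests per seat (not the article's data).
correct = {
    "driver": 630,
    "front_passenger": 644,
    "rear_left": 616,
    "rear_middle": 609,
    "rear_right": 637,
}


def seat_accuracy(correct_counts, tests_per_seat):
    """Correct tests divided by total tests, separately for each seat."""
    return {seat: n / tests_per_seat for seat, n in correct_counts.items()}


def overall_accuracy(correct_counts, tests_per_seat):
    """All correct tests divided by all tests, pooled over the seats."""
    total_tests = tests_per_seat * len(correct_counts)
    return sum(correct_counts.values()) / total_tests


per_seat = seat_accuracy(correct, TESTS_PER_SEAT)
overall = overall_accuracy(correct, TESTS_PER_SEAT)
spread = stdev(list(per_seat.values()))  # sample standard deviation (assumed)
```

Because every seat has the same number of tests, the pooled overall accuracy equals the plain average of the per-seat accuracies; with unequal test counts the two would differ.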

Conclusion and future works
In this article, we propose an in-vehicle localization method based on BLE RSSI, named LOC-in-a-Car, which determines the passengers' locations from RSSIs received on multiple channels. In LOC-in-a-Car, the BLE transmitters use a time-sharing broadcast method and the receiving terminals use a virtual table method to solve the problem that some smartphones cannot distinguish the channels of BLE broadcast data. Moreover, ARMA is used to filter the data to reduce the interference of outliers, and SVR is used to reduce the differences between receiving terminals. To provide better in-vehicle services, we also implemented a localization app. From the results, it can be concluded that LOC-in-a-Car and our app can determine the locations of the driver and passengers with high precision. In addition, the method makes full use of the communication characteristics of BLE and offers a new approach to localization in small spaces. It builds a low-cost and relatively stable network environment on demand in a small-scale environment and addresses the lack of resources in small spaces.
So far, the device calibration method is not automatic because it requires training models for different devices. Hence, in future studies, a better device calibration method will be investigated. In addition, the current experiments were conducted in a five-seat vehicle, and applying the proposed method to other kinds of cars needs further research.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.