Incentive mechanism based on Stackelberg game under reputation constraint for mobile crowdsensing

Encouraging a certain number of users to participate continuously in sensing tasks to collect high-quality sensing data under a limited budget is a new challenge in mobile crowdsensing. A user's historical reputation reflects past performance in completing sensing tasks, and users with high historical reputation have performed well in historical tasks. Therefore, this study proposes a reputation constraint incentive mechanism algorithm based on the Stackelberg game to solve this problem. First, the users' historical reputation is applied to select trusted users for collecting high-quality sensing data. Then, a two-stage Stackelberg game is used to analyze the users' resource contribution levels in the sensing task and the optimal incentive mechanism of the server platform. The existence and uniqueness of the Stackelberg equilibrium are verified by determining the users' optimal response strategies. Finally, two conversion methods for the users' total payoff are proposed to ensure flexible application of the payoff in the mobile crowdsensing network. Simulation experiments show that the historical reputation of the selected trusted users is higher than that of randomly selected users, and that both the server platform and the users obtain good utility.


Introduction
A mobile crowdsensing (MCS) network usually includes a cloud-based server platform (SP) and numerous users with smart sensor devices. 1 When the SP publishes a set of sensing tasks, several users are selected to perform them. MCS has recently been used in various fields, such as traffic information, 2 noise pollution, 3 WiFi coverage information, 4 and water pollution. 5 However, the selected users must spend their time and limited resources (such as energy, internal storage, and CPU computing power) when they perform sensing tasks. Thus, voluntary participation of users in sensing tasks may be unsustainable. To encourage users with smart devices to participate actively, the SP provides them with rewards that compensate for the cost of performing the sensing tasks.
Designing a justifiable incentive mechanism is challenging in the MCS network. The amount of information in the collected sensing data will be insufficient when the reward given by the SP is too small. However, the utility of the SP will decrease and its cost will increase when the SP gives users larger rewards. Therefore, the core problem of the MCS network is designing a rational and valuable incentive mechanism. 6 Existing incentive mechanisms can be roughly divided into platform-centric and user-centered incentive mechanisms. 7,8 The platform-centric incentive mechanism is designed to improve the information increment of the SP and reduce the reward paid to users. 9 The user-centered incentive mechanism mainly increases the motivation of users to participate in sensing tasks and encourages users to collect sensing data on their own initiative. 10 Therefore, this study designs an incentive mechanism that considers both the SP and the users. In this mechanism, the SP receives high-quality sensing data while the users acquire considerable payoff.
Users can submit random sensing data to obtain more rewards at the lowest cost in the MCS network system. 11 Moreover, dishonest users may deliberately send false sensing data to mislead the network system, which makes the sensing task's result inaccurate. Therefore, the reputation of the users is a crucial parameter in the MCS system. 12 Users need to spend time and the limited resources of their sensor devices to complete the sensing task. Thus, no rational user will actively upload sensing data if the reward given by the SP is less than the cost incurred in collecting it.
The reputation constraint incentive mechanism algorithm (RCIMA) is proposed. This study aims to design an incentive mechanism that maximizes the utility of the users and the SP, while encouraging users with high reputation to participate in the sensing task and collect high-quality sensing data. The primary contributions of this study are summarized as follows: (a) A resource contribution game algorithm (RCGA) based on Stackelberg game theory is proposed. The SP and the users choose their optimal strategies to maximize their utility, and the existence of the Nash equilibrium point is proven in the Stackelberg game. (b) A reputation update method for the users is proposed. After the users upload the sensing data, the expectation-maximization (EM) algorithm is applied to evaluate the quality of the sensing data collected by the users, and the SP updates the historical reputation of the users participating in the sensing task. (c) Two methods of reward conversion are proposed so that users can select how to apply their rewards. The first method uses the user's total payoff as the total reward when the user needs to publish tasks in the MCS. The second method converts the user's total payoff into real currency. Thus, the payoff of users circulates more flexibly in the MCS.
The rest of the article is organized as follows. Section ''Related works'' presents various incentive mechanisms proposed in recent years. The MCS system model is introduced in section ''System model.'' Section ''Details of RCIMA'' describes RCIMA, which has four parts: selecting trusted users, RCGA, updating the reputation of each user, and incentive allocation. Section ''Simulation results and analysis'' presents the performance evaluation. The conclusion is presented in section ''Conclusion.''

Related works
In recent years, incentive mechanisms have become a research hotspot in the field of MCS. 7 Several researchers have applied distinct game models to design incentive mechanisms in the MCS system. 13,14 The auction model is a universal mathematical method for designing incentive mechanisms. A good auction model needs to satisfy individual rationality, incentive compatibility, and budget feasibility. 15 A long-term dynamic quality incentive mechanism is proposed to capture the dynamic nature of users' data quality in Wang et al. 16 Incentive mechanisms based on the auction model are studied considering privacy protection and social cost minimization in Lin et al. 17 The SP selects users using a predefined scoring function, and the computational efficiency, individual rationality, truthfulness, and differential privacy of the algorithm can be guaranteed. An incentive mechanism based on a Sybil-proof auction is studied to prevent Sybil attacks in Lin et al. 18 A reverse auction-based incentive mechanism (RAIN) is proposed in Ji et al., 19 which considers participants' potential contributions when recruiting new workers. An online auction algorithm combining multiattribute auction and reverse auction to dynamically select users is studied in Wang et al. 15 Beyond the auction model, different game models pursue distinct goals in designing the incentive mechanism. Some scholars design incentive mechanisms based on the Stackelberg game in MCS. An incentive mechanism considering the social network effect, based on Stackelberg game theory, is applied to analyze the relationship between users and service providers in Nie et al. 1 Stackelberg game theory is applied to design an incentive mechanism with user resource requirements as parameters, and a dynamic incentive mechanism based on deep reinforcement learning is studied that does not need to learn the users' private information in Zhan et al. 20 A delay-sensitive MCS network technology is designed based on the Stackelberg game in Cheung et al. 21 A three-stage Stackelberg game is proposed for the continuous time-varying scenario of the MCS incentive mechanism in Li et al. 22 Most traditional incentive mechanisms only consider the utility of the SP and the users. However, other factors also affect the sensing task results, such as the interests and historical reputation of the users. The reliability of the collected sensing data in the MCS system is also a concern. 23 According to reports, users can submit random sensing data to obtain more payoff when performing the sensing task at minimum cost. 11 Moreover, users with low reputation may upload false sensing data that distorts the result of the sensing task. 12 Therefore, the SP should select trusted users to collect sensing data. The MCS system in Amintoosi and Kanhere 24 considers the contribution quality and reputation level of each user in the social network to obtain the user's reputation level. In Huang et al., 25 the Gompertz function is used to evaluate the contributions of participating devices, and the reputation system calculates the new reputation based on the location and time of the users. However, the incentive mechanism, which is the core of the MCS network system, is ignored in Amintoosi and Kanhere 24 and Huang et al. 25 The historical reputation of a user reflects its previous behavior 26 and is used as a parameter for selecting users to minimize the threat from dishonest users. Therefore, the historical reputation of users is incorporated into the algorithm design of our incentive mechanism.
In addition, scholars have also proposed some multi-attribute incentive mechanisms. A hybrid incentive mechanism based on blockchain technology is proposed, and this mechanism integrates data quality, reputation, and money factors to encourage users to collect sensing data while preventing malicious behavior in Wei et al. 27 However, the application problem of the reward obtained by the users when they perform the sensing task is always ignored in the MCS system. The users obtain the reward accordingly after performing a sensing task. The reward application of the users can enhance the flexibility of the MCS system.
On the basis of the abovementioned analysis, this study designs an RCIMA incentive mechanism based on the Stackelberg game in the MCS network. The SP selects trusted users to ensure the quality of the collected sensing data. Then, the Stackelberg game is employed to analyze the balance problem of the SP and the users. The EM algorithm is also utilized to evaluate the quality of the collected sensing data by users, and the SP updates the reputation of each user. Finally, two conversion methods of users' total reward are proposed.

System model
The MCS network is mainly composed of the task publisher (TP), the SP, and the users. As shown in Figure 1, the execution process of the sensing task is as follows. First, the TP publishes the sensing task information and total reward R to the SP. The SP broadcasts the sensing task information to users equipped with the smart sensor device. The users interested in the task sign up for the sensing task, and the users' set is U = {u 1 , u 2 , ..., u n }. Then, the SP selects some trusted users to participate in the task, and the selected users choose the optimal strategy to perform the sensing task. After the users complete the sensing task, they upload the sensing data to the SP. Finally, the SP updates the reputation of each user; the users are allocated reward in the sensing task. Besides, each user chooses a conversion method of reward to deal with the obtained reward.
The detailed process is presented as follows:
1. The TP publishes a sensing task and total reward R to the SP;
2. Users with mobile smart sensor devices who are interested in the sensing task sign up to participate. The users' set is U = {u_1, u_2, ..., u_n};
3. The SP uses the users' historical reputation to select the trusted users W = {w_1, w_2, ..., w_m} (m ≤ n);
4. The SP and the users choose their optimal strategies through RCGA. A user performs the sensing task and submits data to the SP when the user selects the optimal strategy and the user's utility is greater than zero;
5. The SP evaluates the quality of the sensing data and updates the reputation of the users;
6. The users receive the reward allocated by the SP, and each user selects a method to convert the virtual currency.
The relationship between the SP and the users is constructed as a Stackelberg game model. The selected users' set is W = {w_1, w_2, ..., w_m}, and each user w_i ∈ W selects its resource contribution level X_i, where X_i ≥ 0. The user w_i chooses an optimal strategy X_i* according to the total reward R provided by the SP in the sensing task. The resource contribution level strategy set of the users is X = (X_1, ..., X_m), and X_{-i} = (X_1, ..., X_{i-1}, X_{i+1}, ..., X_m) represents the strategy profile excluding w_i.

Definition 1. Resource contribution level. The resource contribution level X_i of user w_i quantifies the resources (time, energy, storage, and computing power) that w_i devotes to the sensing task.

Definition 2. Energy consumption ratio. The energy consumption ratio E_i' is the ratio of the energy consumed by user w_i in transmitting the sensing data to the SP to the remaining energy:

E_i' = E_i / (E_0 - E_i),

where E_i' ∈ (0, 1), E_i is the energy consumed by user w_i in performing the sensing task, E_0 is the initial energy before performing the sensing task, and E_0 - E_i is the remaining energy after completing it. The energy consumption of each user mainly comes from sending and receiving data while performing the sensing task; thus, the other energy consumed by users is ignored. 28 Equation (3) gives the energy consumption of transmitting and receiving k bits of sensing data over distance d:

E_tx(k, d) = k·E_elect + k·e_fs·d^2 if d < d_0, E_tx(k, d) = k·E_elect + k·e_amp·d^4 otherwise, and E_rx(k) = k·E_elect, (3)

where k·E_elect represents the electronics energy consumed when sending or receiving k bits of sensing data, d_0 is the distance threshold equal to 87 m, and e_fs and e_amp represent the amplifier power consumption of the free-space and multipath attenuation models, respectively. The free-space model is employed when the distance between the user and the SP is less than d_0, and the transmission power attenuates as d^2. Otherwise, the multipath attenuation model is used, and the transmission power attenuates as d^4.
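The first-order radio model above can be sketched in a few lines of Python. The numeric parameter values below are common defaults from the radio-model literature, not the values in this paper's Table 1:

```python
E_ELEC = 50e-9        # J/bit, electronics energy for sending or receiving
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier coefficient
EPS_AMP = 0.0013e-12  # J/bit/m^4, multipath amplifier coefficient
D0 = 87.0             # m, distance threshold

def tx_energy(k_bits, d):
    """Energy to transmit k bits over distance d (equation (3))."""
    if d < D0:
        return k_bits * E_ELEC + k_bits * EPS_FS * d ** 2
    return k_bits * E_ELEC + k_bits * EPS_AMP * d ** 4

def rx_energy(k_bits):
    """Energy to receive k bits."""
    return k_bits * E_ELEC

def energy_ratio(E_i, E_0):
    """Energy consumption ratio E_i' = consumed / remaining (Definition 2)."""
    return E_i / (E_0 - E_i)
```

Note that with these standard coefficients the two branches of `tx_energy` nearly coincide at d = 87 m, which is why that value is the usual crossover threshold.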
The utility function of the user w_i is defined as

u_i = p_i - a_i·X_i, (4)

which is composed of two parts. The first part is the user's payoff p_i, which is determined by the resource contribution level X_i. The second part is the user's cost function a_i·X_i, the cost spent by the user in performing the sensing task, where a_i is the unit cost of the user w_i.
The resource contribution levels of the users performing the sensing task are converted into the SP's payoff through the function u(·). The SP's utility is this payoff minus the total reward R, that is,

u_0 = λ·u(Σ_{j∈W} X_j) - R,

where the function u(·) converts the users' resource contribution levels into the SP's payoff and reflects the law of diminishing payoff: the payoff of the SP increases with the resource contribution level of the users, but the marginal payoff decreases. λ is a system parameter that represents the equivalent monetary value of the resources contributed by the users.
The game theory model is employed to construct the relationship between the SP and the users as a noncooperative game. 29 The strategy of the user w i is to determine the resource contribution level X i , and the strategy of the SP is to determine the total reward R to maximize their utility. The Stackelberg game can solve the benefit conflict between the SP and the users and find their optimal strategy. 29,30 Therefore, a two-stage Stackelberg game is applied to solve the incentive allocation problem of the relationship between the SP and the users.

Definition 3
Two-stage Stackelberg game. In the first stage (leader game), the SP determines the total reward R to obtain more utility, that is,

max_{R > 0} u_0(R, X).

In the second stage (follower game), each user chooses a strategy according to the total reward R announced by the SP and the resource contribution levels of the other users, so as to maximize his own utility, that is,

max_{X_i ≥ 0} u_i(X_i, X_{-i}).

The second stage is regarded as a non-cooperative game and is called RCGA. This study analyzes the Nash equilibrium of the Stackelberg game, as discussed in section ''Analysis of RCGA.''

Details of RCIMA
Publishing the sensing task and selecting users to collect sensing data

The sensing task is published by the TP, and the TP uploads the task information (such as name, function, number of users m, and total reward R) to the SP. The SP broadcasts the sensing information to the users, and the users U = {u_1, u_2, ..., u_n} with smart devices sign up for the sensing task. The SP selects the m users with the highest historical reputation among the n registered users to ensure the quality of the collected data. A user is removed when the user's initial energy is insufficient to complete the sensing task. Finally, the selected users' set is W = {w_1, w_2, ..., w_m} (m ≤ n). Furthermore, the selected users choose their optimal response strategies based on RCGA to decide whether to continue participating in the sensing task and collect sensing data.
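This selection step can be sketched as follows; the dictionary keys `reputation` and `energy` and the example values are illustrative, not taken from the paper:

```python
def select_trusted_users(users, m, task_energy):
    """Select the m registered users with the highest historical reputation,
    after dropping users whose initial energy cannot cover the task."""
    eligible = [u for u in users if u["energy"] >= task_energy]
    # Highest historical reputation first, as in the trusted-user selection step.
    eligible.sort(key=lambda u: u["reputation"], reverse=True)
    return eligible[:m]

# Example: four registered users, two to be selected.
registered = [
    {"id": 1, "reputation": 4.6, "energy": 1.0},
    {"id": 2, "reputation": 3.1, "energy": 1.0},
    {"id": 3, "reputation": 4.9, "energy": 0.1},  # energy too low: removed
    {"id": 4, "reputation": 2.0, "energy": 1.0},
]
selected = select_trusted_users(registered, 2, 0.5)
```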

Analysis of RCGA
The relationship between the SP and the users is modeled as the Stackelberg game. The SP is the leader, and its strategy is to announce the total reward R of the sensing task. The users are the followers, and their strategy is to choose their resource contribution levels. Each user looks for his optimal response strategy given the SP's strategy, and the SP further adjusts its strategy to maximize its utility. Each user is rational in performing the sensing task. Thus, a user uploads the sensing data only when his utility is greater than zero. If the utility obtained by the users is less than zero, then no user will participate in the sensing task.
Follower game. Once the users participate in the sensing task, the total reward R given by the SP is allocated to the users according to the weight of each user's resource contribution level. The utility of user w_i from equation (4) is

u_i(X_i, X_{-i}) = R·X_i / Σ_{j∈W} X_j - a_i·X_i. (8)

When all users choose their optimal strategies, a steady state is reached in the RCGA. As a result, no participant can change its strategy to obtain more utility, which is the Nash equilibrium of the non-cooperative game. 31 The following defines the Nash equilibrium and the optimal response strategy in RCGA.
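The follower utility and the best-response move it induces can be sketched in Python. The closed-form best response used here is the one derived later in the proof for the follower game; the reward and cost values, the starting point, and the number of sweeps are illustrative:

```python
import math

def utility(i, X, R, a):
    """Follower utility from equation (8): proportional share of R minus cost."""
    S = sum(X)
    return (R * X[i] / S if S > 0 else 0.0) - a[i] * X[i]

def best_response(i, X, R, a):
    """Optimal response of user i given the other users' contributions."""
    s_minus = sum(X) - X[i]
    if s_minus <= 0:
        return 0.0  # degenerate case: no other contributor
    return max(0.0, math.sqrt(R * s_minus / a[i]) - s_minus)

# Iterated best response with illustrative unit costs; in this strictly
# concave game the updates settle at the Nash equilibrium, where no user
# can gain by deviating unilaterally.
R, a = 100.0, [1.0, 1.2, 1.5, 3.0]
X = [1.0] * len(a)
for _ in range(200):
    for i in range(len(a)):
        X[i] = best_response(i, X, R, a)
```

At the fixed point, each winner's contribution equals its own best response, and the most expensive user drops out entirely, matching the winner-set conditions of Theorem 2.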

Definition 4
Optimal response strategy. Given X_{-i}, a strategy X_i ≥ 0 is the optimal response strategy if it maximizes u_i(X_i, X_{-i}); it is denoted by X_i*.
Definition 5

Nash equilibrium. A strategy set X* = (X_1*, ..., X_m*) is the Nash equilibrium of the RCGA when each user w_i satisfies u_i(X_i*, X_{-i}*) ≥ u_i(X_i, X_{-i}*) for any X_i ≥ 0.

Theorem 1. A unique Nash equilibrium point exists in the follower game when the SP provides the total reward R to the users in the RCGA.
Proof. To study the optimal strategy maximizing the utility of the user w_i, the first and second derivatives of the utility function u_i with respect to its resource contribution level X_i are calculated from equation (8):

∂u_i/∂X_i = R·(Σ_{j∈W} X_j - X_i) / (Σ_{j∈W} X_j)^2 - a_i, (9)

∂^2 u_i/∂X_i^2 = -2R·(Σ_{j∈W} X_j - X_i) / (Σ_{j∈W} X_j)^3 < 0. (10)

The utility function is strictly concave with respect to the strategy of the user w_i because the second derivative is negative. Suppose the SP provides the total reward R > 0 and the other users' strategies are X_{-i}. If an optimal strategy X_i* exists, then the optimal response strategy of user w_i is unique.
Setting the first derivative to zero gives R·(Σ_{j∈W} X_j - X_i) = a_i·(Σ_{j∈W} X_j)^2. Once the user w_i uploads the sensing data, user w_i is a winner and X_i > 0; otherwise, X_i = 0. The selected users form the set W, and the set of winners is defined as W̄ = {j ∈ W | X_j > 0}, with m' = |W̄| the number of winners. Considering that Σ_{j∈W} X_j = Σ_{k∈W̄} X_k, we have

X_i = Σ_{k∈W̄} X_k - (a_i / R)·(Σ_{k∈W̄} X_k)^2. (13)

By summing over all the elements of W̄ in equation (13), we obtain

Σ_{k∈W̄} X_k = m'·Σ_{k∈W̄} X_k - (Σ_{k∈W̄} a_k / R)·(Σ_{k∈W̄} X_k)^2. (14)

By solving equation (14) for Σ_{k∈W̄} X_k, we get

Σ_{k∈W̄} X_k = (m' - 1)·R / Σ_{k∈W̄} a_k. (15)

Substituting equation (15) into equation (13) yields

X_i* = ((m' - 1)·R / Σ_{k∈W̄} a_k)·(1 - (m' - 1)·a_i / Σ_{k∈W̄} a_k). (16)

The strategy X_i* is the optimal strategy for the user w_i when X_i* is positive in equation (16). If X_i* is negative, then the user w_i does not participate in the sensing task. Therefore, the optimal strategy for user w_i is X_i* as in equation (16) if a_i < Σ_{k∈W̄} a_k / (m' - 1), and X_i* = 0 otherwise.

Theorem 2. Given the total reward R > 0, if the optimal strategy set of all users X* = (X_1*, X_2*, ..., X_m*) is the unique Nash equilibrium of RCGA, then the following conditions are met.
(a) |W̄| ≥ 2; (b) for each i ∈ W̄, X_i* = sqrt(R·Σ_{j≠i} X_j / a_i) - Σ_{j≠i} X_j > 0, and X_i* = 0 for i ∉ W̄; (c) if a_k ≤ max_{j∈W̄} {a_j}, then k ∈ W̄; (d) sort the users' costs in a non-decreasing sequence such that a_1 ≤ a_2 ≤ ... ≤ a_m, and set h as the largest integer satisfying a_h < R / Σ_{j∈W̄} X_j; then W̄ = {1, 2, ..., h}.

Condition (a) is proven as follows. If |W̄| = 0, then no user is currently participating in the sensing task. Thus, any user can change the strategy X_i = 0 to X_i > 0 in the MCS to obtain utility, which contradicts the definition of Nash equilibrium. Therefore, |W̄| ≠ 0. When |W̄| = 1, the utility of the single winner w_i is u_i = R - a_i·X_i. User w_i can unilaterally decrease the strategy X_i to obtain more utility, and this condition is again opposed to the definition of Nash equilibrium. Therefore, |W̄| ≥ 2. We next prove Condition (b). The user w_i participates in the sensing task when i ∈ W̄. Under the condition that the other users' strategies are fixed, user w_i's optimal strategy is X_i* = sqrt(R·Σ_{j≠i} X_j / a_i) - Σ_{j≠i} X_j, with which user w_i obtains the greatest utility. The user w_i does not participate in the sensing task when i ∉ W̄, and then the strategy is X_i = 0.
Condition (c) is proven as follows. When i ∈ W̄, equation (16) gives X_i* > 0, which requires a_i < Σ_{j∈W̄} a_j / (m' - 1) = R / Σ_{j∈W̄} X_j. Now suppose a_k ≤ max_{j∈W̄} {a_j} and k ∉ W̄. By Condition (b), the strategy of user w_k is X_k = 0, and substituting it into equation (9) yields

∂u_k/∂X_k |_{X_k = 0} = R / Σ_{j∈W̄} X_j - a_k ≥ R / Σ_{j∈W̄} X_j - max_{j∈W̄} {a_j} > 0.

Therefore, the user w_k can improve his utility by unilaterally changing the strategy to X_k > 0, which contradicts the Nash equilibrium.
Condition (d) is proven as follows. The costs a_1, a_2, ..., a_m of the m users are sorted in a non-decreasing sequence. From Conditions (a) and (c), the winner set consists of the cheapest users, so W̄ = {1, 2, ..., k} for some integer k in [2, m]. Suppose k < h, so that a_{k+1} < R / Σ_{j∈W̄} X_j and k + 1 ∉ W̄. Substituting X_{k+1} = 0 into equation (9) gives a positive first derivative. Therefore, user k + 1 can obtain more utility by increasing his resource contribution level in the sensing task, and this contradicts the Nash equilibrium. Hence W̄ = {1, 2, ..., h}.
In Algorithm 1, the SP first initializes the set W̄ of users who are willing to upload sensing data, the resource contribution level set {X_i}, and the payoff set {p_i} of all users. Then, the unit costs a_i of all users in W are sorted in a non-decreasing sequence, and the first two users are added to the set W̄. Next, every further user whose utility would be greater than zero is added to W̄ by the SP. Finally, the resource contribution level of each user is calculated, and the reward is allocated to each user by the SP.
The time complexity of Algorithm 1 is O(n log n). The time required to sort all users is O(n log n), while the time required for the while loop (lines 6-8) and the for loops (lines 9-12 and 14-17) is O(n).
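A compact Python sketch of this winner-determination step, using the closed-form equilibrium of Theorem 2 (equations (15) and (16)); function and variable names are illustrative, and at least two users are assumed:

```python
def rcga_equilibrium(costs, R):
    """Compute the follower-game Nash equilibrium and reward allocation
    from the users' unit costs and the total reward R (sketch of Algorithm 1)."""
    n = len(costs)
    order = sorted(range(n), key=lambda i: costs[i])
    # Condition (a): at least the two cheapest users win.
    h = 2
    cost_sum = costs[order[0]] + costs[order[1]]
    # Condition (d): extend the winner set while a_{h+1} < sum_{j<=h+1} a_j / h.
    for k in range(2, n):
        a_next = costs[order[k]]
        if a_next < (cost_sum + a_next) / k:
            cost_sum += a_next
            h += 1
        else:
            break
    total = (h - 1) * R / cost_sum  # winners' total contribution, equation (15)
    X = [0.0] * n
    pay = [0.0] * n
    for idx in order[:h]:
        X[idx] = total * (1 - (h - 1) * costs[idx] / cost_sum)  # equation (16)
        pay[idx] = R * X[idx] / total  # proportional reward share
    return X, pay

# Example: four users; the most expensive one is excluded from the winner set.
X, pay = rcga_equilibrium([1.0, 1.2, 1.5, 3.0], 100.0)
```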
Leader game. The SP and the users are participants in the RCGA, and the SP is the leader and the users are the followers. All users have a unique Nash equilibrium point when the SP provides the users with the total reward R. Therefore, the SP can determine the value of R to maximize its utility.
Theorem 3. An optimal strategy R Ã exists in the RCGA and constitutes a unique Stackelberg equilibrium point (R Ã , X Ã ), where X Ã is the optimal strategy set for all users. The utility of the SP is the maximum when the total reward is R Ã .
Proof. By substituting equation (15) into equation (13), we obtain each winner's strategy as a function of R:

X_i(R) = ((m' - 1)·R / Σ_{j∈W̄} a_j)·(1 - (m' - 1)·a_i / Σ_{j∈W̄} a_j). (22)

Substituting equation (22) into the utility function of the SP yields

u_0(R) = λ·u((m' - 1)·R / Σ_{j∈W̄} a_j) - R. (23)

Because u(·) reflects diminishing payoff, the second derivative of the SP's utility function satisfies

∂^2 u_0/∂R^2 = λ·((m' - 1) / Σ_{j∈W̄} a_j)^2 · u''((m' - 1)·R / Σ_{j∈W̄} a_j) < 0. (24)

Therefore, the utility function of the SP in the RCGA is strictly concave by equation (24), and the RCGA has a unique Stackelberg equilibrium. A unique R* exists such that the SP's utility function u_0(R, X*) reaches its maximum under the condition (R*, X*). The unique R* is calculated by the Newton method. 32
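The leader's Newton step can be sketched as below. The concave payoff u(x) = log(1 + x) is an assumption made for the sketch (the paper only requires diminishing marginal payoff), and the parameter values in the comments are illustrative:

```python
def optimal_reward(lam, winner_costs, iters=50):
    """Find the SP's optimal total reward R* by Newton's method on u0'(R) = 0,
    assuming u0(R) = lam * log(1 + c * R) - R with c = (m' - 1) / sum(a_j)
    from equation (15). `lam` is the system parameter lambda."""
    m = len(winner_costs)
    c = (m - 1) / sum(winner_costs)
    R = 1.0                                    # initial guess
    for _ in range(iters):
        d1 = lam * c / (1 + c * R) - 1         # u0'(R)
        d2 = -lam * c * c / (1 + c * R) ** 2   # u0''(R) < 0 (strict concavity)
        R -= d1 / d2                           # Newton step
    return max(R, 0.0)
```

With this particular u(·), the stationary point has the closed form R* = λ - 1/c, which makes the Newton iteration easy to sanity-check.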

Evaluating reputation
After the users upload the sensing data, the SP will evaluate the reputation of the selected users. First, the EM algorithm 33 is employed to evaluate the quality of the upload sensing data by the users. Then, the SP evaluates the selected users' reputation based on the sensing data quality result. Finally, the user's historical reputation is updated after the user's reputation is evaluated.
Quality evaluation. The quality of the sensing data submitted by the users reflects the quality of the sensing task they completed. Here, urban noise sensing is taken as an example, as in Peng et al. 34 Each user w_i is characterized by a quality evaluation matrix e^{w_i}, an m × m matrix with elements e^{w_i}_{rs} ∈ [0, 1], r = 1, 2, ..., m, s = 1, 2, ..., m. The quality of the sensing data is mapped from the quality evaluation matrix by the function q_i = g(e^{w_i}). The readings of the sensing data are divided into m discrete intervals, expressed as a set D = {d_1, d_2, ..., d_m}, which represents the quality levels of the collected sensing data. Given the set of collected sensing data S, the set P of missing true indicators, the probability matrix E, and the probability density function f are to be estimated. To find the maximum likelihood estimate of E, the following two steps are run iteratively by the EM algorithm until convergence (assuming that Ê_t is the current value of the probability matrix E after t iterations). E-step: according to the conditional distribution of P given the observations S under the current estimate of E, the expected value of the likelihood function is calculated. M-step: the estimate Ê that maximizes this expectation function is determined. The E-step and M-step are iterated until the estimate converges. The converged estimate of each user's quality evaluation matrix indicates the quality of the sensing data, and the noise interval distribution implies the noise pollution level.
The specific steps are given as follows:
Step 1: For each task t, the indicator function I(d^t_k = d_j) equals 1 when user k's sensing datum d^t_k falls into the real interval d_j, and the probability distribution p̂_t of the real noise interval is initialized from the users' reports.
Step 2: The likelihood of the quality evaluation matrix is estimated, with ê^{w_i}_{rs} denoting the value after t iterations, and the real noise interval distribution is re-estimated accordingly.
Step 3: The real noise interval is estimated. Given the sensing data S, the quality evaluation matrix E, and the noise interval distribution P, the true noise interval is estimated using Bayesian inference.
Step 4: Convergence. Steps 2-3 are iterated until the two estimates converge, that is, |Ê_{t+1} - Ê_t| < ε and |P̂_{t+1} - P̂_t| < η, with ε > 0 and η > 0.
With the estimate of the quality evaluation matrix e^{w_i}, the quality of user w_i's sensing data is obtained through the mapping function g(e^{w_i}). Therefore, the quality of the sensing data collected by user w_i is q_{w_i} = g(e^{w_i}).
Updating reputation. Through the abovementioned quality evaluation process, the quality of the sensing data collected by user w_i is q_{w_i}. The reputation value of user w_i is then normalized and converted to [0, 5]:

rep_i = 5·q_{w_i} / q_max,

where q_max is the highest data quality value in the sensing task. The user's historical reputation is updated as the running average

Rep_i = (o·Rep_{i0} + rep_i) / (o + 1),

where o is the number of historical tasks in which user w_i has participated, and Rep_{i0} is the historical reputation value of user w_i.
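The steps above can be sketched as a compact Dawid-Skene-style EM in Python. The mapping g(·) from a user's quality evaluation matrix to a scalar quality is assumed here to be the mean of the matrix diagonal, and the example data are synthetic:

```python
def em_quality(labels, m, iters=50, smooth=1e-6):
    """labels[t][i]: interval index reported by user i on task t; m intervals.
    Returns (per-user quality, per-task posterior over true intervals)."""
    T, U = len(labels), len(labels[0])
    # Step 1: initialize the true-interval posterior by vote counting.
    post = [[0.0] * m for _ in range(T)]
    for t in range(T):
        for i in range(U):
            post[t][labels[t][i]] += 1.0
        s = sum(post[t])
        post[t] = [v / s for v in post[t]]
    conf = None
    for _ in range(iters):
        # Step 2 (M-step): re-estimate each user's quality evaluation
        # (confusion) matrix and the interval prior.
        prior = [sum(post[t][c] for t in range(T)) / T for c in range(m)]
        conf = [[[smooth] * m for _ in range(m)] for _ in range(U)]
        for i in range(U):
            for t in range(T):
                for c in range(m):
                    conf[i][c][labels[t][i]] += post[t][c]
            for c in range(m):
                s = sum(conf[i][c])
                conf[i][c] = [v / s for v in conf[i][c]]
        # Step 3 (E-step): Bayesian update of the true-interval posterior.
        for t in range(T):
            p = prior[:]
            for i in range(U):
                for c in range(m):
                    p[c] *= conf[i][c][labels[t][i]]
            s = sum(p)
            post[t] = [v / s for v in p]
    quality = [sum(conf[i][c][c] for c in range(m)) / m for i in range(U)]
    return quality, post

def update_reputation(q, q_max, rep_hist, o):
    """Normalize quality to [0, 5] and fold it into the historical reputation
    as a running average over the o previous tasks (assumed update rule)."""
    rep_new = 5.0 * q / q_max
    return (o * rep_hist + rep_new) / (o + 1)

# Synthetic example: three truthful users and one who always reports the
# neighboring interval; EM should rate the truthful users far higher.
truth = [0, 1, 2] * 4
reports = [[c, c, c, (c + 1) % 3] for c in truth]
quality, post = em_quality(reports, 3)
```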

Reward distribution
The reward provided by the SP to user w_i is expressed as Re_i, and the user w_i chooses the optimal resource contribution level X_i* through the RCGA. Thus, following the proportional allocation in equation (8), the final reward is Re_i = R·X_i* / Σ_{j∈W̄} X_j*. The total payoff ReT_i of the user w_i is the accumulated reward for performing one task or several tasks in the MCS network system. When the total payoff ReT_i is greater than a certain threshold V_min, the virtual currency ReT_i can be converted and applied through the following two methods.
In the first method, the user w_i uses the virtual currency of his total payoff ReT_i as the reward for publishing a sensing task. The total reward R differs when the user w_i publishes different sensing tasks. The user can successfully publish a sensing task only when his total payoff is not less than the reward R required for publishing it. When the user w_i's total payoff is insufficient to serve as the reward for publishing the sensing task, he converts real currency (do) into virtual currency (V_p) by equation (38) and then publishes the sensing task he needs. In the second method, the user w_i directly converts the virtual currency of his total payoff ReT_i into real currency. The SP converts the virtual currency into real currency successfully when the total payoff of user w_i is greater than the threshold V_min, that is, when V_p > V_min, where V_p and do are the virtual currency and real currency, respectively. c and c' are system parameters determined by the MCS network system, and c' is slightly larger than c. Therefore, if the user w_i needs to publish a sensing task, he will be more willing to convert his virtual currency directly into the total reward of the sensing task.
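The two conversion paths can be sketched as below. The linear exchange rates and the threshold value are assumptions for illustration; the paper states only that c' is slightly larger than c, which makes round-tripping through real currency unfavorable:

```python
V_MIN = 10.0     # illustrative minimum convertible balance
C = 1.0          # real currency obtained per unit of virtual payoff (rate c)
C_PRIME = 1.05   # real currency charged per unit of virtual payoff bought (rate c')

def cash_out(total_payoff):
    """Method 2: convert the virtual total payoff into real currency,
    allowed only when the balance exceeds the threshold V_min."""
    if total_payoff <= V_MIN:
        return None
    return C * total_payoff

def topup_for_task(total_payoff, R_needed):
    """Method 1: publish a task funded by the virtual balance; any shortfall
    is bought with real currency at the less favorable rate c'."""
    shortfall = max(0.0, R_needed - total_payoff)
    return shortfall * C_PRIME  # real currency the publisher must pay in

# Round-tripping 1 unit of virtual payoff through real currency returns
# C / C_PRIME < 1 units, so spending the virtual payoff directly is preferable.
```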

Simulation results and analysis
Simulation experiments are conducted with MATLAB R2016a, and the following network topology is established to evaluate the performance of RCIMA. One TP, one SP, and 1000 users are randomly distributed in a 1 × 1 km² target area, and the TP can successfully publish tasks to the SP. The parameters and experimental values used in this study are shown in Table 1.

Average payoff of users

Figure 2 shows the relationship between the average payoff obtained by the users and the total reward R given by the SP in RCGA. The reward given by the SP to a user is the user's payoff when the user completes the sensing task. The users' payoff is related to the total reward R of the SP and the users' optimal resource contribution levels. As shown in Figure 2, the average payoff of the users increases when the number of selected users decreases. The average payoff also increases with R when the number of selected users is fixed. Therefore, a small number of selected users is more beneficial to the users when the SP provides a fixed total reward, and a greater total reward from the SP is more favorable to the users for a given number of users.

Average utility of users

Figure 3 shows the relationship between the average utility of the users and the total reward R given by the SP in the algorithm. The utility of a user is the payoff minus the cost when the user completes the sensing task. In Figure 3, the average utility of the users increases with the total reward R. Moreover, the total reward R is linearly related to the average utility of the users. For a given R, the average utility of the users decreases as the number of selected users increases, because as the number of users goes up, the weight of each user's optimal strategy goes down. Thus, the payoff of each user is smaller, and the user's utility is reduced correspondingly.

Utility of the SP

Figure 4 shows the relationship between the utility of the SP and the total reward R paid by the SP to the users. The utility of the SP is related to the resource contribution levels X_i of the users and the total reward R of the SP. The experimental result shows that the utility of the SP decreases as the total reward R increases. The resource contribution levels of the users are fixed when the number of selected users is constant. Thus, the utility of the SP is smaller when the total reward R increases. Given a certain total reward R, the SP obtains more utility when the number of users is larger.

Resource contribution coefficient b_i of the user

Figure 5 shows the relationship between the resource contribution coefficient b_i of the user w_i and the total reward R of the SP. The resource contribution coefficient b_i of the user w_i is related to the resource contribution level X_i and the user's energy ratio E_i'. The experimental results show that the smaller the number of users, the larger the b_i of each user, and that the resource contribution coefficient b_i fluctuates. The reason is that when R is fixed and the number of users is smaller, each user receives more reward, so the weight of the user's optimal strategy increases. b_i does not change significantly with increasing R because the optimal strategy of the user differs when R is distinct.

Reputation evaluation

Figure 6 analyzes the relationship between the user w_i's reputation and the quality evaluation matrix e_ii of user w_i in collecting sensing data. The quality evaluation matrix e_ii of a user submitting sensing data approximately follows the same normal distribution, 34 with μ = 0.75 and σ = 0.125. Figure 6 shows an approximately linear relationship between the quality evaluation matrix and the reputation. When the quality evaluation matrix e_ii of user w_i is smaller, the user w_i's reputation is correspondingly lower in the sensing task. However, if the quality evaluation matrix of the user w_i is larger in the sensing task, then the reputation of the user w_i is higher.

Analyzing the reputation value of selected users

Table 2 compares the historical reputation values of randomly selected users and of the selected trusted users. First, the trusted users chosen by the SP have a higher average historical reputation value than randomly selected users. Second, the average historical reputation value varies little when users are selected randomly. However, when trusted users are selected, the average historical reputation value is higher if the number of selected users is small.

Conclusion
In this study, an incentive mechanism (RCIMA) is proposed on the basis of the Stackelberg game that considers the benefits of both the SP and the users in MCS. The overall mechanism includes choosing trusted users, the RCGA, the reputation update, and the reward distribution method. The credibility of the collected sensing data improves markedly because the users are selected by reputation. Compared with randomly selected users, the users selected by the proposed model have a higher average historical reputation value. The SP and the users both obtain good utility in the RCGA. Meanwhile, two conversion methods between virtual currency and real currency are used to ensure flexible application of the users' total payoff in the MCS system. However, this article does not consider the user task selection problem when multiple tasks are released. Therefore, an incentive mechanism considering multiple TPs will be investigated in future work. Moreover, the submission of sensing data by users will be studied further to prevent the leakage of private information.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figure 6. Relationship between the quality evaluation matrix of sensing data and the user's reputation.