Evolution of coordination in pairwise and multi-player interactions via prior commitments

Upon starting a collective endeavour, it is important to understand your partners’ preferences and how strongly they commit to a common goal. Establishing a prior commitment or agreement in terms of posterior benefits and consequences from those engaging in it provides an important mechanism for securing cooperation. Resorting to methods from Evolutionary Game Theory (EGT), here we analyse how prior commitments can also be adopted as a tool for enhancing coordination when its outcomes exhibit an asymmetric payoff structure, in both pairwise and multi-party interactions. Arguably, coordination is more complex to achieve than cooperation since there might be several desirable collective outcomes in a coordination problem (compared to mutual cooperation, the only desirable collective outcome in cooperation dilemmas). Our analysis, both analytically and via numerical simulations, shows that whether prior commitment would be a viable evolutionary mechanism for enhancing coordination and the overall population social welfare strongly depends on the collective benefit and severity of competition, and more importantly, how asymmetric benefits are resolved in a commitment deal. Moreover, in multi-party interactions, prior commitments prove to be crucial when a high level of group diversity is required for optimal coordination. The results are robust for different selection intensities. Overall, our analysis provides new insights into the complexity and beauty of behavioural evolution driven by humans’ capacity for commitment, as well as for the design of self-organised and distributed multi-agent systems for ensuring coordination among autonomous agents.


Introduction
Achieving a collective endeavour among individuals with their own personal interests is an important social and economic challenge in various societies (Barrett, 2016; Hardin, 1968; Ostrom, 1990; Pitt et al., 2012; Sigmund, 2010). From coordinating individuals in the workplace to maintaining cooperative and trust-based relationships among organisations and nations, its success is often jeopardised by individual self-interest (Barrett, 2007; Perc et al., 2017). The study of mechanisms that support the evolution of such collective behaviours has been of great interest in many disciplines, ranging from Evolutionary Biology and Economics to Physics and Computer Science (Andras et al., 2018; Han, 2013; Kumar et al., 2020; Nowak, 2006; Perc et al., 2017; Sigmund, 2010; Tuyls & Parsons, 2007; West et al., 2007). Several mechanisms responsible for the emergence and stability of collective behaviours among such individuals have been proposed, including kin and group selection, direct and indirect reciprocity, spatial networks, and reward and punishment (Nowak, 2006; Okada, 2020; Perc et al., 2017; Skyrms, 1996; West et al., 2007).
Recently, establishing prior commitments has been proposed as an evolutionarily viable strategy inducing cooperative behaviour in the context of pairwise and multi-player cooperation dilemmas (Arvanitis et al., 2019; Frank, 1988; Han et al., 2017; Nesse, 2001; Ohtsuki, 2018; Sasaki et al., 2015); namely, the Prisoner's Dilemma (PD; Hasan & Raja, 2013) and the Public Goods Game (PGG; Kurzban et al., 2001). It provides an enhancement to different forms of punishment against inappropriate behaviours and of rewards to stimulate the appropriate ones (X. Chen et al., 2014; Cimpeanu et al., 2019; Martinez-Vaquero et al., 2015; Powers et al., 2012; Sasaki et al., 2015; Szolnoki & Perc, 2012; Wang et al., 2019), allowing one to efficiently avoid free-riders (Han & Lenaerts, 2016; Han, Santos, et al., 2015) and resolve the antisocial punishment problem (Han, 2016). These works have primarily focused on modelling prior commitments for improving mutual cooperation among self-interested agents. In the context of cooperation dilemma games (i.e. PD and PGG), mutual cooperation is the only desirable collective outcome to which all parties are required to commit if an agreement is to be formed. The same argument applies to other pairwise and multi-player social dilemmas such as the Stag-Hunt and Chicken games: although the nature of these games differs from that of the PD and PGG, mutual cooperation is still the only desirable outcome to be achieved (Pacheco et al., 2009; F. C. Santos et al., 2006; Skyrms, 2003). In other contexts such as coordination problems, this is no longer the case, since there might be multiple optimal or desirable collective outcomes and players might have distinct, incompatible preferences regarding which outcome a mutual agreement should aim to achieve (e.g. due to asymmetric benefits).
Such coordination problems are abundant in nature, ranging from collective hunting and foraging to international climate change actions and multi-sector coordination (Barrett, 2016; Bianca & Han, 2019; Ohtsuki, 2018; Ostrom, 1990; F. P. Santos et al., 2016; F. C. Santos & Pacheco, 2011; Skyrms, 1996).
Hence, we explore how arranging a prior agreement or commitment can be used as a mechanism for enhancing coordination and the population social welfare in this type of coordination problem, in both pairwise and multi-player interaction settings. Before individuals embark on a joint venture, a pre-agreement makes the motives and intentions of all parties involved more transparent, thereby enabling an easier coordination of personal interests (Cohen & Levesque, 1990; Han, 2013; Han, Santos, et al., 2015; Nesse, 2001). Although our approach is applicable to a wide range of coordination problems (e.g. single market product investments), we frame our models within the technology investment strategic decision-making problem, allowing us to describe the models clearly. Namely, we describe technology adoption games capturing the competitive market and decision-making process among firms adopting new technologies (Bardhan et al., 2004; Zhu & Weyant, 2003), with a key parameter $\alpha$ representing how competitive the market is (thus describing how important coordination is). Similar to previous commitment models, we perform theoretical analysis and numerical simulations resorting to stochastic methods from Evolutionary Game Theory (EGT) (Hofbauer & Sigmund, 1998; Sigmund et al., 2010).
We start by modelling pairwise technology adoption decision making, where two investment firms (or players) competing within the same product market need to make a strategic decision on which technology to adopt (Chevalier-Roignant et al., 2011; Zhu & Weyant, 2003), a low-benefit (L) or a high-benefit (H) technology. Individually, adopting H would lead to a larger benefit. However, if both firms invest in H, they end up competing with each other, leading to a smaller accumulated benefit than if they coordinated to choose different technologies. Given the asymmetry in the benefits of such a coordinated outcome, clearly no firm would want to commit to the outcome where its option is L, unless some form of compensation from the one selecting H can be ensured.
We then extend and generalise the pairwise model to a multi-player one, capturing the strategic interaction between more than two investment firms. In the multi-player model, a key parameter $m$ describes the market demand for the high technology, that is, the optimal number of firms in a group that should adopt H. We analytically examine how players can be coordinated when there is a market demand for a particular technology. We show that, differently from the two-player game, the newly defined parameter $m$ leads to a new kind of complexity when trying to achieve group coordination. When there is a high level of diversity in demand (i.e. intermediate values of $m$), as can be seen in different technology adoption contexts (Beede & Young, 1998; Schewe & Stuart, 2015), introducing prior commitment can lead to significant improvement in the levels of coordination and population social welfare.
The next section discusses related work, which is followed by a description of our models and details of the EGT methods for analysing them. Results of the analysis and a final discussion will then follow.

Related work
The problem of explaining the emergence and stability of collective behaviours has been actively addressed in different disciplines (Nowak, 2006; Sigmund, 2010). Among other mechanisms, such as reciprocity and costly punishment, closely related to our present model is the study of cooperative behaviours and pre-commitment in cooperation dilemmas, for both two-player and multi-player games (Hasan & Raja, 2013; Quillien, 2020; Sasaki et al., 2015). It has been shown, both by means of theoretical analysis and of behavioural experiments, that to enhance cooperation, commitments need to be sufficiently enforced and the cost of setting up the commitments must be justified with respect to the benefit derived from the interactions (Arvanitis et al., 2019; X.-P. Chen & Komorita, 1994; Cherry & McEvoy, 2013; Kurzban et al., 2001; Ostrom, 1990). Our results show that the same observation holds for coordination problems. However, arranging commitments for enhancing coordination is more complex, exhibiting a larger behavioural space. Furthermore, the outcomes strongly depend on new factors that appear only in coordination problems: a successful commitment deal needs to take into account the fact that multiple desirable collective outcomes exist for which players have incompatible preferences, and thus how benefits can be shared through compensations in order to resolve the issue of asymmetric benefits is crucially important (Bianca & Han, 2019).
Here, we extend the two-player game of our previous work to a multi-player model, whose outcomes are more complex as more players are involved. We again investigate how coordination and cooperation can be improved using prior commitment deals when there are multiple players involved and when there is a particular market demand (Bianca & Han, 2019). How prior commitment enhances cooperation in social dilemmas has also been investigated experimentally (X.-P. Chen & Komorita, 1994). A good level of cooperation was observed in a PGG experiment when a binding agreement was made during the prior communication stage among members of the group. The authors hypothesised that if members of a group are allowed to make a pledge (with some degree of binding commitment) before their actual decisions, they will be able to communicate their intentions, which will increase the overall cooperation rate in the population. As predicted, their results clearly demonstrate that making a pledge improves cooperation, although the degree of commitment required in the pledge differentially affected the cooperation rate (X.-P. Chen & Komorita, 1994; Cherry & McEvoy, 2013; Kurzban et al., 2001).
There have been several other works studying the evolution of coordination, using the so-called Stag-Hunt game (see, for example, Pacheco et al., 2009;F. C. Santos et al., 2006;Sigmund, 2010;Skyrms, 2003). However, to the best of our knowledge, there has been no work studying how prior commitments can be modelled and used for enhancing the outcome of the evolution of coordination. As our results below show, significant enhancement of coordination and population welfare can be achieved via the arrangement of suitable commitment deals.
Furthermore, it is noteworthy that commitments have been studied extensively in Artificial Intelligence and Multi-agent systems literature (see, for example, Castelfranchi & Falcone, 2010;Chopra & Singh, 2009;Harrenstein et al., 2007;Rzadca et al., 2015;Singh, 1991;Winikoff, 2007). Different from our work, these studies utilise commitments for the purpose of regulating individual and collective behaviours, formalising different aspects of commitments (such as norms and conventions) in multi-agent systems. However, our results and approach provide important new insights into the design of such systems as these require commitments to ensure high levels of efficient collaboration and coordination within a group or team of agents. For example, by providing suitable agreement deals, agents can improve the chance that a desirable collective outcome (which is best for the systems as a whole) is reached even when benefits provided by the outcome are different for the parties involved.

Models and methods
In the following, we first describe a two-player technology adoption game and then extend it with the option of arranging prior commitments before playing the game. We then present a multi-player version of the model, with and without commitments. Finally, we describe the methods, based on EGT for finite populations, that are used to analyse the resulting models.
3.1. Two-player tech adoption game
3.1.1. Two-player tech adoption without commitments. We consider the scenario in which two firms (players) compete for the same product market and need to make a (strategic) decision on which technology to invest in, a low-benefit (L) or a high-benefit (H) technology. The outcome of the interaction can be described in terms of costs and benefits of investments by the following payoff matrix (for the row player):

$$\begin{array}{c|cc} & H & L \\ \hline H & \alpha b_H - c_H & b_H - c_H \\ L & b_L - c_L & \alpha b_L - c_L \end{array} \;=\; \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad (1)$$

where $c_L$, $c_H$ and $b_L$, $b_H$ ($b_L \le b_H$) represent the costs and benefits of investing in L and H, respectively; $\alpha \in (0, 1)$ indicates the competitive level of the product market: the firms receive only a partial benefit if they both choose to invest in the same technology. Collectively, the smaller $\alpha$ is (i.e. the higher the market competitiveness), the more important it is that the firms coordinate to choose different technologies. For simplicity, the entries of the payoff matrix are denoted by $a$, $b$, $c$, $d$, as above. We have $b > a$ and $c > d$. Without loss of generality, we assume that H generates a greater net benefit, that is, $c = b_L - c_L < b = b_H - c_H$. Note that although we describe our model in terms of technology adoption decision making, it is generally applicable to many other coordination problems, for instance wherever there are strategic investment decisions to make (in competitive markets of any products) (Chevalier-Roignant et al., 2011; Zhu & Weyant, 2003).
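As a minimal sketch of the pairwise game, the snippet below builds the row player's payoffs from the costs, benefits and competitiveness level. It assumes that the four entries are $a = \alpha b_H - c_H$, $b = b_H - c_H$, $c = b_L - c_L$ and $d = \alpha b_L - c_L$; the function name `td_payoff_matrix` and the value of `alpha` are illustrative choices, not part of the model description.

```python
def td_payoff_matrix(b_H, b_L, c_H, c_L, alpha):
    """Row player's payoffs in the pairwise game, rows/columns ordered (H, L)."""
    a = alpha * b_H - c_H  # both adopt H: only a fraction alpha of b_H is earned
    b = b_H - c_H          # row adopts H, column adopts L: full benefit
    c = b_L - c_L          # row adopts L, column adopts H: full benefit
    d = alpha * b_L - c_L  # both adopt L: only a fraction alpha of b_L is earned
    return [[a, b], [c, d]]


# Costs and benefits as in the figure captions; alpha = 0.5 is an arbitrary example.
(a, b), (c, d) = td_payoff_matrix(b_H=6, b_L=2, c_H=1, c_L=1, alpha=0.5)

# Choosing different technologies yields a larger joint payoff than any shared choice.
coordination_pays = (b + c) > 2 * max(a, d)
```

With these values the joint payoff of coordinating on different technologies ($b + c$) exceeds twice the payoff of either shared choice, which is the situation in which coordination matters.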
3.1.2. Two-player tech adoption in presence of commitments. We now extend the model, allowing players the option to arrange a prior commitment before a tech adoption (TD) interaction. A commitment proposal asks the co-player to adopt a different technology. That is, a strategist intending to use H (resp., L) would ask the co-player to adopt L (resp., H). We denote these commitment proposing strategies as HP and LP, respectively. Similar to previous models of commitments (for PD and PGG), to make the commitment deal reliable, a proposer pays an arrangement cost $E$. If the co-player agrees with the deal, then the proposer assumes that the opponent will adopt the agreed choice, yet there is no guarantee that this will actually be the case. Whenever a co-player refuses to commit, HP and LP play H in the game. When the co-player accepts the commitment but later does not honour it, she has to compensate the honouring co-player at a personal cost $\delta$.
Different from previous models on PD and PGG, where an agreed outcome leads to the same payoff for all parties in the agreement (the mutual cooperation benefit), in the current model such an outcome leads to different payoffs for those involved. Therefore, as part of the agreement, HP compensates, after the game, an amount $u_1$ to an accepting player who honours the agreement, while LP requests a compensation $u_2$ from such an accepting co-player.
Besides HP and LP, we consider a minimal model with the following (basic) strategies in this commitment version:
• Non-proposing acceptors, HC and LC, who always commit when being proposed a commitment deal, wherein they are willing to adopt any technology proposed (even when it is different from their intended choice), honour the adopted agreement, but do not propose a commitment themselves. They play their intended choice, that is, H and L, respectively, when there is no agreement in place;
• Non-acceptors, HN and LN, who do not accept commitment, play their intended choice during the game and do not propose commitments;
• Fake committers, HF and LF, who accept a commitment proposal yet play the choice opposite to what has been agreed whenever the game takes place. These players assume that they can exploit the commitment proposing players without suffering the consequences.
Note that, similar to the commitment models for the PD game, some possible strategies have been excluded from the analysis since they are dominated by at least one of the strategies in any configuration of the game: they can be omitted without changing the outcome of the analysis. For example, those who propose a commitment (i.e. paying a cost $E$) but then do not honour it (thus having to pay the compensation when facing an honouring acceptor) would be dominated by the corresponding non-proposers.
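Together the model consists of eight strategies that define the following payoff matrix, capturing the average payoffs that each (row) strategy receives upon interaction with each of the strategies:

$$\begin{array}{c|cccccccc} & HP & LP & HN & LN & HC & LC & HF & LF \\ \hline HP & \frac{b+c-E}{2} & \frac{2b-E-u_1-u_2}{2} & a & b & b-E-u_1 & b-E-u_1 & a-E+\delta & a-E+\delta \\ LP & \frac{2c-E+u_1+u_2}{2} & \frac{b+c-E}{2} & a & b & c-E+u_2 & c-E+u_2 & d-E+\delta & d-E+\delta \\ HN & a & a & a & b & a & b & a & b \\ LN & c & c & c & d & c & d & c & d \\ HC & c+u_1 & b-u_2 & a & b & a & b & a & b \\ LC & c+u_1 & b-u_2 & c & d & c & d & c & d \\ HF & a-\delta & d-\delta & a & b & a & b & a & b \\ LF & a-\delta & d-\delta & c & d & c & d & c & d \end{array} \qquad (2)$$

Note that when two commitment proposers interact, only one of them needs to pay the cost of setting up the commitment. Yet, as either one of them can take this action, each pays this cost only half of the time (on average). In addition, the average payoff of HP when interacting with LP is given by $\frac{1}{2}\left[(b - E - u_1) + (b - u_2)\right] = \frac{2b - E - u_1 - u_2}{2}$.
We say that an agreement is fair if both parties obtain the same benefit when they honour it (after having taken into account the cost of setting up the agreement). For that, we can show that $u_1$ and $u_2$ must satisfy $u_1 = (b - c - E)/2$ and $u_2 = (b - c + E)/2$, and thus both parties obtain $(b + c - E)/2$. Indeed, these values are obtained by equating the payoffs of a proposer and an acceptor when they interact: for HP and HC, $b - E - u_1 = c + u_1$ gives $u_1 = (b - c - E)/2$; similarly, for LP and an acceptor, $c - E + u_2 = b - u_2$ gives $u_2 = (b - c + E)/2$. These conditions also ensure that the payoffs of HP and LP when interacting with each other are equal. Our analysis below first focuses on whether and when fair agreements can lead to improvement in terms of coordination and the overall social welfare (i.e. average population payoff). We then discuss how different kinds of agreements (varying $u_1$ and $u_2$) affect the outcome, with additional results provided in Appendix 1.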
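The fair compensations and the eight-strategy payoffs can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the entries of `commitment_payoff_matrix` are one reading of the strategy descriptions above (a refused proposer plays H, a fake committer pays $\delta$ to the proposer, and two proposers each bear the arrangement cost half of the time), and all function and variable names are hypothetical.

```python
STRATEGIES = ["HP", "LP", "HN", "LN", "HC", "LC", "HF", "LF"]


def fair_compensations(b, c, E):
    """Compensations u1 (paid by HP) and u2 (requested by LP) that equalise payoffs."""
    return (b - c - E) / 2, (b - c + E) / 2


def commitment_payoff_matrix(a, b, c, d, E, delta, u1, u2):
    """Average payoff of the row strategy against the column strategy (assumed reading)."""
    shared = (b + c - E) / 2  # two proposers of the same type, averaged over both roles
    rows = {
        "HP": [shared, (2 * b - E - u1 - u2) / 2, a, b,
               b - E - u1, b - E - u1, a - E + delta, a - E + delta],
        "LP": [(2 * c - E + u1 + u2) / 2, shared, a, b,
               c - E + u2, c - E + u2, d - E + delta, d - E + delta],
        "HN": [a, a, a, b, a, b, a, b],  # never commits, always plays H
        "LN": [c, c, c, d, c, d, c, d],  # never commits, always plays L
        "HC": [c + u1, b - u2, a, b, a, b, a, b],  # accepts and honours any deal
        "LC": [c + u1, b - u2, c, d, c, d, c, d],
        "HF": [a - delta, d - delta, a, b, a, b, a, b],  # accepts, then defects
        "LF": [a - delta, d - delta, c, d, c, d, c, d],
    }
    return [rows[s] for s in STRATEGIES]


# Example values: b = 5, c = 1 as in the figure captions; alpha = 0.5 gives a = 2, d = 0.
a, b, c, d, E, delta = 2.0, 5.0, 1.0, 0.0, 1.0, 6.0
u1, u2 = fair_compensations(b, c, E)
matrix = commitment_payoff_matrix(a, b, c, d, E, delta, u1, u2)
idx = {name: n for n, name in enumerate(STRATEGIES)}
```

Under a fair agreement, the entries for HP against HC, HC against HP, HP against LP and LP against HP all coincide with $(b + c - E)/2$, which gives a quick consistency check on the matrix.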
3.2. Multi-player TD game
3.2.1. Multi-player TD without commitments. We now describe an N-player ($N > 2$) version of the TD model. As before, we introduce the model in the context of technology investment market decision making. In a group (of size $N$) with $k$ players of type H (i.e. $N - k$ players of type L), the expected payoffs of playing H and L can be written as follows:

$$\Pi_H(k) = \alpha_H(k)\, b_H - c_H, \qquad \Pi_L(k) = \alpha_L(k)\, b_L - c_L,$$

where $\alpha_H(k)$ and $\alpha_L(k)$ represent the fraction of the benefit obtained by H and L players, respectively, which depend on the composition of the group, $k$. For two-player TD, both are equal to $\alpha$.
The rationale of these definitions is that whenever $k \le m$, full benefits from adopting H can be obtained, and moreover, if $k > m$, the larger $k$ is, the stronger the competition among H-adopters. The same holds, correspondingly, for L-adopters. The parameters $\alpha_1$ and $\alpha_2$ stand for the intensities of competition for investing in H and in L, respectively. For simplicity, we assume in this article $\alpha_1 = \alpha_2 = \alpha$. Note that for $N = 2$ we recover the two-player model given in equation (1), given that the current $\alpha$ is scaled (by 2) compared to the value of $\alpha$ in the pairwise game, solely for the purpose of a clear presentation.
The optimal group payoff is achieved when there are exactly $m$ players adopting H and the rest adopting L, leading to an average payoff for each member given by

$$A = \frac{m\,(b_H - c_H) + (N - m)\,(b_L - c_L)}{N}.$$

3.2.2. Multi-player TD in presence of commitments. We can define the N-player game version with prior commitments in a similar fashion as in the two-player game. Commitment proposing strategists (i.e. HP and LP players) propose before an interaction that the group play the optimal arrangement (so that every player obtains an average payoff $A$). For simplicity, we assume that the committed players adopt the fair agreement, that is, every member obtains the same payoff after compensation is made to those adopting L. As such, we do not need to consider who will adopt H or L, as all receive the same payoff in the end. Moreover, whenever a player in the group refuses to commit, commitment proposers adopt H. Details of the payoff calculation are provided in the 'Results' section (cf. Table 1).
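As a small worked example of the optimal arrangement, the sketch below computes the per-member payoff $A$, assuming that with exactly $m$ H-adopters every player earns the full benefit of its technology, so that $A = [m(b_H - c_H) + (N - m)(b_L - c_L)]/N$. The function name and the group values are illustrative.

```python
def optimal_group_payoff(N, m, b_H, b_L, c_H, c_L):
    """Average payoff per member when exactly m of the N players adopt H."""
    if not 0 <= m <= N:
        raise ValueError("m must lie between 0 and N")
    total = m * (b_H - c_H) + (N - m) * (b_L - c_L)  # full benefits, no competition
    return total / N


# Hypothetical group of 5 firms with a market demand of m = 2 high-technology adopters.
A = optimal_group_payoff(N=5, m=2, b_H=6, b_L=2, c_H=1, c_L=1)
```

For $N = 2$ and $m = 1$ this reduces to $(b + c)/2$, the average payoff of two coordinated players in the pairwise game before arrangement costs.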

Evolutionary dynamics
In this work, we perform theoretical analysis and numerical simulations (see next section) using EGT methods for finite populations (Hauert et al., 2007; Imhof et al., 2005; Nowak et al., 2004). Let $Z$ be the size of the population. In such a setting, individuals' payoffs represent their fitness or social success, and evolutionary dynamics is shaped by social learning (Hofbauer & Sigmund, 1998; Sigmund, 2010), whereby the most successful individuals tend to be imitated more often by the other individuals. In the current work, social learning is modelled using the so-called pairwise comparison rule (Traulsen et al., 2006), a standard approach in EGT, assuming that an individual $A$ with fitness $f_A$ adopts the strategy of another individual $B$ with fitness $f_B$ with probability $p$ given by the Fermi function

$$p = \left[1 + e^{-\beta (f_B - f_A)}\right]^{-1}.$$

The parameter $\beta$ represents the 'imitation strength' or 'intensity of selection', that is, how strongly the individuals base their decision to imitate on the fitness difference between themselves and the opponents. For $\beta = 0$, we obtain the limit of neutral drift, in which the imitation decision is random. For large $\beta$, imitation becomes increasingly deterministic.
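The pairwise comparison rule can be illustrated with a short function, assuming the standard Fermi form $p = [1 + e^{-\beta(f_B - f_A)}]^{-1}$ from the cited literature. The function name is illustrative, and the two-branch evaluation is only a numerical safeguard against overflow for large $\beta$.

```python
import math


def fermi_imitation_probability(f_A, f_B, beta):
    """Probability that A (fitness f_A) adopts the strategy of B (fitness f_B)."""
    x = beta * (f_B - f_A)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0 here, so exp(x) cannot overflow
    return e / (1.0 + e)


neutral = fermi_imitation_probability(f_A=1.0, f_B=3.0, beta=0.0)  # random imitation
weak = fermi_imitation_probability(f_A=1.0, f_B=3.0, beta=0.1)
strong = fermi_imitation_probability(f_A=1.0, f_B=3.0, beta=10.0)  # nearly deterministic
```

The three calls show the effect of the intensity of selection: at $\beta = 0$ the fitter individual is copied with probability one half, and the probability approaches one as $\beta$ grows.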
In the absence of mutations or exploration, the end states of evolution are inevitably monomorphic: once such a state is reached, it cannot be escaped through imitation. We thus further assume that, with a certain mutation probability, an individual switches randomly to a different strategy without imitating another individual. In the limit of small mutation rates, the dynamics will proceed with, at most, two strategies in the population, such that the behavioural dynamics can be conveniently described by a Markov Chain, where each state represents a monomorphic population, whereas the transition probabilities are given by the fixation probability of a single mutant (Hauert et al., 2007;Imhof et al., 2005;Nowak et al., 2004). The resulting Markov Chain has a stationary distribution, which characterises the average time the population spends in each of these monomorphic end states. It has been shown to have a range of applicability which goes well beyond the strict limit of very small mutation (or exploration) rates (Han et al., 2012;Hauert et al., 2007;Rand et al., 2013;Sigmund, 2010;Sigmund et al., 2010).
Before describing how to calculate this stationary distribution, we need to show how payoffs are calculated, which differ for two-player and N-player settings, as below.
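• Average payoff for the two-player game. Let $\pi_{ij}$ represent the payoff obtained by strategist $i$ in each pairwise interaction with strategist $j$, as defined in the payoff matrices in equations (1) and (2). Suppose there are at most two strategies in the population, say, $x$ individuals using $i$ ($0 \le x \le Z$) and $(Z - x)$ individuals using $j$. Thus, the average payoffs of an individual that uses $i$ or $j$ can be written, respectively, as follows:

$$\Pi_i(x) = \frac{(x - 1)\,\pi_{ii} + (Z - x)\,\pi_{ij}}{Z - 1}, \qquad \Pi_j(x) = \frac{x\,\pi_{ji} + (Z - x - 1)\,\pi_{jj}}{Z - 1}.$$

• Expected payoff in the multi-player game. In the case of N-player interactions, suppose the population includes $x$ individuals of type $i$ and $Z - x$ individuals of type $j$. The probability to select $k$ individuals of type $i$ and $N - k$ individuals of type $j$, in $N$ trials, is given by the hypergeometric distribution as follows (Gokhale & Traulsen, 2010; Sigmund, 2010):

$$H(k, N, x, Z) = \frac{\binom{x}{k}\binom{Z - x}{N - k}}{\binom{Z}{N}}.$$

Hence, in a population of $x$ $i$-strategists and $(Z - x)$ $j$-strategists, the average payoffs of $i$ and $j$ are given by

$$\Pi_{ij}(x) = \sum_{k=0}^{N-1} \frac{\binom{x-1}{k}\binom{Z-x}{N-1-k}}{\binom{Z-1}{N-1}}\, \Pi_{ij}(k+1), \qquad \Pi_{ji}(x) = \sum_{k=0}^{N-1} \frac{\binom{x}{k}\binom{Z-1-x}{N-1-k}}{\binom{Z-1}{N-1}}\, \Pi_{ji}(k),$$

where $\Pi_{ij}(k)$ and $\Pi_{ji}(k)$ denote the payoffs of an $i$-strategist and a $j$-strategist, respectively, in a group with $k$ players of type $i$ and $N - k$ players of type $j$.
Now, for both two-player and N-player settings, the probability to change the number $x$ of individuals using strategy $i$ by $\pm 1$ in each time step can be written as (Traulsen et al., 2006)

$$T^{\pm}(x) = \frac{Z - x}{Z}\,\frac{x}{Z}\left[1 + e^{\mp\beta\,[\Pi_i(x) - \Pi_j(x)]}\right]^{-1},$$

with $\Pi_i(x)$ and $\Pi_j(x)$ replaced by $\Pi_{ij}(x)$ and $\Pi_{ji}(x)$ in the N-player case. The fixation probability of a single mutant with a strategy $i$ in a population of $(Z - 1)$ individuals using $j$ is given by (Nowak et al., 2004; Traulsen et al., 2006)

$$\rho_{j,i} = \left(1 + \sum_{l=1}^{Z-1}\prod_{x=1}^{l}\frac{T^{-}(x)}{T^{+}(x)}\right)^{-1}.$$

Considering a set $\{1, \ldots, q\}$ of different strategies, these fixation probabilities determine a transition matrix $M = \{T_{ij}\}_{i,j=1}^{q}$, with $T_{ij, j \neq i} = \rho_{ji}/(q - 1)$ and $T_{ii} = 1 - \sum_{j=1, j \neq i}^{q} T_{ij}$, of a Markov Chain. The normalised eigenvector associated with the eigenvalue 1 of the transpose of $M$ provides the stationary distribution described above (Imhof et al., 2005), describing the relative time the population spends adopting each of the strategies.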
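The small-mutation procedure described above can be sketched in pure Python. This is a minimal illustration, not the authors' implementation: it assumes the Fermi update rule, for which the ratio $T^{-}(x)/T^{+}(x)$ simplifies to $e^{-\beta[\Pi_i(x) - \Pi_j(x)]}$, uses a row-stochastic convention in which entry `[i][j]` is the probability of moving from the all-$i$ state to the all-$j$ state, and solves for the stationary distribution with a small linear solver instead of an eigen-decomposition. All function names and the toy game are hypothetical.

```python
import math


def average_payoffs(x, Z, p_ii, p_ij, p_ji, p_jj):
    """Average payoffs of i and j when x individuals use i and Z - x use j (pairwise game)."""
    pi_i = ((x - 1) * p_ii + (Z - x) * p_ij) / (Z - 1)
    pi_j = (x * p_ji + (Z - x - 1) * p_jj) / (Z - 1)
    return pi_i, pi_j


def group_average_payoff(x, Z, N, payoff_in_group):
    """Average payoff of an i-player over hypergeometrically sampled groups of size N.

    payoff_in_group(k) is the payoff of an i-player in a group containing k i-players.
    """
    total = 0.0
    for k in range(N):  # k = number of i-players among the N - 1 co-players
        weight = math.comb(x - 1, k) * math.comb(Z - x, N - 1 - k) / math.comb(Z - 1, N - 1)
        total += weight * payoff_in_group(k + 1)
    return total


def fixation_probability(p_ii, p_ij, p_ji, p_jj, Z, beta):
    """Probability that a single i-mutant takes over Z - 1 residents using j."""
    total, log_ratio = 1.0, 0.0
    for x in range(1, Z):
        pi_i, pi_j = average_payoffs(x, Z, p_ii, p_ij, p_ji, p_jj)
        log_ratio += -beta * (pi_i - pi_j)  # log of T^-(x) / T^+(x) under the Fermi rule
        total += math.exp(min(log_ratio, 700.0))  # clamp to avoid float overflow
    return 1.0 / total


def transition_matrix(payoff, Z, beta):
    """Row-stochastic matrix over monomorphic states: [i][j] is the move from all-i to all-j."""
    q = len(payoff)
    M = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for j in range(q):
            if i != j:  # a j-mutant appears among i-residents and fixates
                rho = fixation_probability(payoff[j][j], payoff[j][i],
                                           payoff[i][j], payoff[i][i], Z, beta)
                M[i][j] = rho / (q - 1)
        M[i][i] = 1.0 - sum(M[i])
    return M


def stationary_distribution(M):
    """Solve pi M = pi with sum(pi) = 1 by Gaussian elimination with partial pivoting."""
    q = len(M)
    A = [[M[j][i] if i != j else -sum(M[i][k] for k in range(q) if k != i)
          for j in range(q)] for i in range(q)]
    rhs = [0.0] * q
    A[-1], rhs[-1] = [1.0] * q, 1.0  # replace one balance equation by the normalisation
    for col in range(q):
        piv = max(range(col, q), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, q):
            f = A[r][col] / A[col][col]
            for col2 in range(col, q):
                A[r][col2] -= f * A[col][col2]
            rhs[r] -= f * rhs[col]
    pi = [0.0] * q
    for r in reversed(range(q)):
        pi[r] = (rhs[r] - sum(A[r][k] * pi[k] for k in range(r + 1, q))) / A[r][r]
    return pi


toy_payoff = [[3.0, 3.0], [1.0, 1.0]]  # hypothetical game: strategy 0 always earns more
dist = stationary_distribution(transition_matrix(toy_payoff, Z=50, beta=1.0))
```

In the toy game the population spends almost all of its time in the state where everyone uses strategy 0, as expected when one strategy always earns more. Passing an eight-strategy matrix in place of `toy_payoff` would yield strategy frequencies of the kind plotted as a function of the competitiveness level.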
Risk-dominance: An important measure to determine the evolutionary dynamics of a given strategy is its risk-dominance against others. For two strategies $i$ and $j$, risk-dominance is a criterion which determines which selection direction is more probable: an $i$ mutant fixating in a homogeneous population of agents using $j$, or a $j$ mutant fixating in a homogeneous population of individuals playing $i$. When the former is more probable than the latter, we say that $i$ is risk-dominant against $j$ (Nowak et al., 2004; Sigmund, 2010), which holds for any intensity of selection and in the limit of large population size $Z$ when

$$\sum_{k=1}^{N} \Pi_{ij}(k) \ge \sum_{k=0}^{N-1} \Pi_{ji}(k). \qquad (10)$$

This condition is applicable both for two-player games, $N = 2$, where it reads $\pi_{ii} + \pi_{ij} \ge \pi_{ji} + \pi_{jj}$, and for N-player games with $N > 2$ (Gokhale & Traulsen, 2010; Sigmund, 2010). It allows us to derive analytical conditions such as when commitment proposing is an evolutionarily viable strategy, being risk-dominant against all other strategies in the population.
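The risk-dominance test is easy to evaluate numerically. The sketch below assumes the condition takes the form $\sum_{k=1}^{N} \Pi_{ij}(k) \ge \sum_{k=0}^{N-1} \Pi_{ji}(k)$ (Gokhale & Traulsen, 2010), which for $N = 2$ reads $\pi_{ii} + \pi_{ij} \ge \pi_{ji} + \pi_{jj}$. The example compares HP with HN under a fair agreement, using the payoffs stated in the model description: two HP players obtain $(b + c - E)/2$ each, and any pairing involving HN ends with both playing H. Function names are illustrative.

```python
def is_risk_dominant(payoff_i, payoff_j, N=2):
    """True if strategy i is risk-dominant against strategy j.

    payoff_i(k): payoff of an i-player in a group with k i-players (k = 1..N).
    payoff_j(k): payoff of a j-player in a group with k i-players (k = 0..N-1).
    """
    lhs = sum(payoff_i(k) for k in range(1, N + 1))
    rhs = sum(payoff_j(k) for k in range(0, N))
    return lhs >= rhs


def pairwise(p_ii, p_ij, p_ji, p_jj):
    """Wrap 2x2 payoffs as the group-payoff functions expected above (N = 2)."""
    return (lambda k: p_ii if k == 2 else p_ij), (lambda k: p_ji if k == 1 else p_jj)


def hp_beats_hn(alpha, b_H=6.0, b_L=2.0, c_H=1.0, c_L=1.0, E=1.0):
    """Risk-dominance of HP against HN under a fair agreement (pairwise game)."""
    a = alpha * b_H - c_H                     # both players end up adopting H
    shared = ((b_H - c_H) + (b_L - c_L) - E) / 2  # fair payoff of two committed players
    payoff_hp, payoff_hn = pairwise(shared, a, a, a)
    return is_risk_dominant(payoff_hp, payoff_hn)


competitive_market = hp_beats_hn(alpha=0.5)  # strong competition
mild_market = hp_beats_hn(alpha=0.7)         # weak competition
```

With the example parameters, proposing commitments is risk-dominant against refusing them when competition is strong (`alpha=0.5`) but not when it is mild (`alpha=0.7`).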

Results
We first describe results for two-player games and then proceed to those for the N-player version. Table 1 summarises the key parameters in both versions, for ease of reference.

Two-player TD game results
4.1.1. Analytical conditions for the viability of commitment proposers. To begin with, using the condition given in equation (10), we obtain that if $u_1 + u_2 < b - c$ then HP is risk-dominant (see Methods) against LP. Otherwise, LP is risk-dominant against HP.
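Similarly, we derive the conditions regarding the commitment parameters for which HP and LP are evolutionarily viable strategies, that is, when they are risk-dominant against all other non-proposing ones. Indeed, HP and LP are risk-dominant against all other six non-proposing strategies, respectively, if and only if

$$E < \min\left\{ b + c - 2a,\; 3b - c - 2d,\; \frac{3b - c - 2a - 4u_1}{3},\; \frac{3b - c - 2d - 4u_1}{3},\; \frac{b + c - 2a + 4\delta}{3},\; \frac{b + c - 2d + 4\delta}{3} \right\},$$

$$E < \min\left\{ b + c - 2a,\; 3b - c - 2d,\; \frac{3c - b - 2a + 4u_2}{3},\; \frac{3c - b - 2d + 4u_2}{3},\; \frac{b + c - 2a + 4\delta}{3},\; \frac{b + c - 2d + 4\delta}{3} \right\}.$$

Note that each element in the min expressions above corresponds to the condition for one of the six non-proposing strategies HN, LN, HC, LC, HF, LF, respectively.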
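Thus, we can derive the conditions for $u_1$, $u_2$ and $\delta$:

$$u_1 < \frac{3b - c - 2\max\{a, d\} - 3E}{4}, \qquad u_2 > \frac{3E + b - 3c + 2\max\{a, d\}}{4}, \qquad \delta > \frac{3E - b - c + 2\max\{a, d\}}{4}.$$

In particular, for fair agreements, that is, $u_1 = (b - c - E)/2$ and $u_2 = (b - c + E)/2$, the conditions reduce to

$$E < b + c - 2\max\{a, d\}, \qquad \delta > \frac{3E - b - c + 2\max\{a, d\}}{4}. \qquad (13)$$

This is because $3b - c - 2d > b + c - 2\max\{a, d\}$, which is due to $b > c$ and $\max\{a, d\} \ge d$.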
In general, these conditions indicate that for commitments to be a viable option for improving coordination, the cost of arrangement $E$ must be sufficiently small while the compensation associated with the contract needs to be sufficiently large (see Figure 2 for numerical validation). Furthermore, for the first condition to hold, it is necessary that $b + c > 2\max\{a, d\}$. It means that the total payoff of the two players in the TD game is always greater when they coordinate to choose different technologies than when they both choose the same technology.
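The fair-agreement conditions can be checked directly for given parameters. The sketch below assumes they take the form $E < b + c - 2\max\{a, d\}$ and $\delta > (3E - b - c + 2\max\{a, d\})/4$, which is what the pairwise risk-dominance condition yields against non-acceptors, acceptors and fake committers under the payoffs described in the model section; the function name and example values are illustrative.

```python
def fair_commitment_viable(a, b, c, d, E, delta):
    """Check the (assumed) viability conditions for proposers under a fair agreement."""
    best_same_choice = max(a, d)  # best payoff when both players pick the same technology
    cost_ok = E < b + c - 2 * best_same_choice
    compensation_ok = delta > (3 * E - b - c + 2 * best_same_choice) / 4
    return cost_ok and compensation_ok


# b = 5, c = 1 as in the figure captions; a = 2, d = 0 corresponds to alpha = 0.5.
viable = fair_commitment_viable(a=2.0, b=5.0, c=1.0, d=0.0, E=1.0, delta=6.0)
weak_compensation = fair_commitment_viable(a=2.0, b=5.0, c=1.0, d=0.0, E=1.0, delta=0.1)
mild_competition = fair_commitment_viable(a=3.2, b=5.0, c=1.0, d=0.4, E=1.0, delta=6.0)
```

The three cases illustrate the qualitative statement above: the deal is viable with a small arrangement cost and a large compensation, fails when the compensation is too small, and fails when competition is too mild for coordination to pay.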
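Moreover, the first condition in equation (13) can be expressed in terms of $\alpha$ and the costs and benefits of investment, as follows (see again the payoff matrix in equation (1)):

$$E < b_H + b_L - c_H - c_L - 2\max\{\alpha b_H - c_H,\; \alpha b_L - c_L\},$$

which can be rewritten as

$$\alpha < \min\left\{ \frac{b_H + b_L + c_H - c_L - E}{2 b_H},\; \frac{b_H + b_L - c_H + c_L - E}{2 b_L} \right\}. \qquad (14)$$

Ogbo et al.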
This condition indicates for which levels of market competitiveness, and for which costs and benefits of investing in the available technologies, commitments can be an evolutionarily viable mechanism. Intuitively, for given costs and benefits of investment (i.e. fixing $c_L$, $c_H$, $b_L$, $b_H$), a larger cost of arranging a (reliable) agreement, $E$, leads to a smaller threshold of $\alpha$ below which commitment is viable. Moreover, given a commitment system (i.e. fixing $E$ and $\delta$), and assuming similar costs of investment for the two technologies, a larger ratio of the benefits obtained from the two technologies, $b_H/b_L$, leads to a smaller upper bound for $\alpha$ for which commitment is viable.
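The threshold on the competitiveness level can be evaluated numerically. The sketch assumes the bound is obtained by solving $E < b + c - 2\max\{a, d\}$ for $\alpha$, with $a = \alpha b_H - c_H$ and $d = \alpha b_L - c_L$; the function name is illustrative.

```python
def alpha_upper_bound(b_H, b_L, c_H, c_L, E):
    """Largest competitiveness level alpha below which fair commitments are viable."""
    bound_from_H = (b_H + b_L + c_H - c_L - E) / (2 * b_H)  # from E < b + c - 2a
    bound_from_L = (b_H + b_L - c_H + c_L - E) / (2 * b_L)  # from E < b + c - 2d
    return min(bound_from_H, bound_from_L)


# Parameters of the figures: c_H = c_L = 1, b_L = 2, b_H = 6.
bounds = [round(alpha_upper_bound(6, 2, 1, 1, E), 3) for E in (0.1, 1, 2)]
```

For $E = 0.1$, 1 and 2, this gives bounds of 0.658, 0.583 and 0.5, and the bound shrinks as the arrangement cost grows.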
Remarkably, our numerical analysis below (see Figure 1) shows that the condition in equation (14) accurately predicts the threshold of $\alpha$ where commitment proposing strategies (i.e. HP and LP) are highly abundant in the population, leading to improvement in terms of the average population payoff compared to when commitment is absent (Figure 3). For example, when $E = 0.1$, 1 and 2, the upper bounds for $\alpha$ are 0.658, 0.583 and 0.5, respectively.
On the contrary, when $\alpha$ is sufficiently large, little improvement can be achieved, especially when $b_H/b_L$ is large (which is in accordance with the analytical results above).
Figure 1. In general, the commitment proposing strategies HP and LP dominate the population when $\alpha$ is small, while HN and HC dominate when $\alpha$ is sufficiently large in all cases, which is robust for different values of the intensity of selection, $\beta$. HN and HC dominate the population as the market competition decreases (i.e. when $\alpha$ increases). Larger values of $\beta$ increase the difference between the strategies' frequencies but do not change the outcomes in general. Parameters: in all panels, $c_H = 1$, $c_L = 1$, $b_L = 2$ (i.e. $c = 1$), $b_H = 6$ (i.e. $b = 5$). Other parameters: $\delta = 6$; $\beta = 0.01$, 0.1 and 1; population size $Z = 100$; fair agreements are used, where $u_1$ and $u_2$ are given by $u_1 = (b - c - E)/2$ and $u_2 = (b - c + E)/2$.

4.1.2. Numerical results for pairwise TD game. We calculate the stationary distribution in a population of eight strategies, HP, LP, HN, LN, HC, LC, HF and LF, using the methods described above. In Figure 1, we show the frequency of these strategies as a function of $\alpha$, for different values of $E$ and game configurations. In general, the commitment proposing strategies HP and LP dominate the population when $\alpha$ is small, while HN and HC dominate when $\alpha$ is sufficiently large, for all values of $\beta$ used in the comparison. That is, commitment proposing strategies are viable and successful whenever the market competitiveness is high, leading to the need for efficient coordination among the competing players/firms to ensure high benefits. Notably, we observe that the thresholds of $\alpha$ below which HP and LP are dominant closely corroborate the analytical condition described in equation (14), in all cases. This observation is also robust for different values of the intensity of selection, $\beta$.
This observation is robust for varying commitment parameters, that is, the cost of arranging commitment, $E$, and the compensation cost associated with commitment, $\delta$ (see Figure 2). Namely, we show the total frequency of commitment strategies (i.e. the sum of the frequencies of HP and LP) for varying values of these parameters and for different values of $\alpha$. It can be seen that, in general, the commitment strategies dominate the population whenever $E$ is sufficiently small and $\delta$ is sufficiently large. This observation is in accordance with previous commitment modelling works for the cooperation dilemma games. In addition, we observe that in the current coordination problem, the smaller $\alpha$ is, the wider the range of $E$ and $\delta$ for which these commitment strategies dominate the population. Our additional results show that these observations are robust with respect to other game configurations, including $\beta$ (comparing the three rows in Figure 2).
Now, in order to determine whether and when commitments can actually lead to meaningful improvement, in Figure 3 we compare the average population payoff, or social welfare, when a commitment is present and when it is absent. In general, it can be seen that when $\alpha$ is sufficiently small (below a threshold), the smaller it is, the greater the improvement in social welfare achieved through the presence of a commitment deal. Moreover, the smaller the cost of arranging commitments, $E$, the greater the improvement obtained. When $\alpha$ is sufficiently large, commitment leads to no improvement or might even be detrimental for social welfare, especially when $b_H/b_L$ is large (which is in accordance with the analytical results above). The detriment is further increased when $\beta$ is small. We can observe that the threshold below which a notable improvement can be achieved is the same as the one for the viability of HP and LP (i.e. as described in equation (14)).

Multi-player game results
4.2.1. Payoff derivation in N-player TD game. As mentioned above, compared to cooperation dilemmas such as PD and PGG, fake strategies make less sense in the context of coordination games since they would not earn the temptation payoff by adopting a different choice from what is being agreed. As shown in the two-player game analysis, the fake strategies (i.e. HF and LF) are not viable options in TD games and can be ignored. To focus on the group effect and the effect of the newly introduced parameter m, we therefore consider a population consisting of HP, LP, HN, LN, HC and LC (i.e. excluding fake strategies). This is equivalent to considering the full set of strategies with a sufficiently large d.
First of all, we derive the payoffs received by each strategy when encountering specific other strategies (see a summary in Table 2). Namely, P_ij(k) and P_ji(k) denote the payoffs of a strategist of type i and j, respectively, in a group consisting of k players of type i and N − k players of type j. The first column of the table lists all possible strategies that can be used by player i (the focal player), whereas the second column shows the strategies of the co-players (opponents). The third column shows the payoffs of the focal player.

4.2.2. Analytical conditions for the viability of commitment proposers in N-player TD game. We now derive the conditions under which HP is risk-dominant against the rest of the strategies. Since we assume fair agreements, the conditions for LP are equivalent to those for HP in terms of risk-dominance. To make the derivations below easier to follow, we recall that A denotes the optimal group payoff achieved when there are exactly m players adopting H and the rest adopting L, that is, [equation missing].

HP is risk-dominant against HC if [equation missing], which can be written as [equation missing], where [definition missing]. Similarly, HP is risk-dominant against LC if [equation missing]. For risk-dominance of HP against HN, [equation missing], which equivalently can be written as [equation missing]. Finally, HP is risk-dominant against LN if [equation missing], which can be rewritten as [equation missing].

Figure 2. Total frequency of commitment strategies (i.e. sum of the frequencies of HP and LP), as a function of E and d, for different values of a and b. Primarily, the commitment proposing strategies dominate the population whenever E is sufficiently small and d is sufficiently large. Furthermore, the smaller a is, the wider the range of E and d for which these commitment strategies dominate. These observations are robust for different values of b. Nevertheless, a larger b leads to a greater frequency of commitment proposing strategies where they are evolutionarily viable and a lower frequency otherwise. Parameters: in all panels, c_H = 1, c_L = 1, b_L = 2 (i.e. c = 1) and b_H = 6 (i.e. b = 5). Other parameters: intensity of selection b = 0.01 in the first, b = 0.1 in the second and b = 1 in the third row; population size Z = 100; fair agreements are used, where u_1 and u_2 are given by u_1 = (b − c − E)/2 and u_2 = (b − c + E)/2.

Figure 3. Average population payoff as a function of a, when commitment is absent and when it is present, for different values of E and b. When a is small, significant improvement in terms of the average population payoff can be achieved through prior commitment. When a is sufficiently large, commitment leads to no improvement or might even be detrimental for social welfare, especially when b is small. That is, at a = 0.7 in panel (a) and a = 0.9 in panel (d), the absence of commitment is more beneficial. Parameters: in all panels, c_H = 1, c_L = 1, b_L = 2 (i.e. c = 1); in panels (a, b and c), b_H = 6 (i.e. b = 5) with intensity of selection b = 0.01, 0.1 and 1, respectively; in panels (d, e and f), b_H = 3 (i.e. b = 2) with intensity of selection b = 0.01, 0.1 and 1, respectively. Other parameters: d = 6; population size Z = 100; fair agreements are used, where u_1 and u_2 are given by u_1 = (b − c − E)/2 and u_2 = (b − c + E)/2.
In short, for commitment proposers to be risk-dominant against all other strategies, E must be sufficiently small, namely smaller than the minimum of the right-hand sides of equations (15) to (18).
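The criterion behind these conditions can be checked numerically once the payoffs P_ij(k) are available. The sketch below assumes the usual N-player risk-dominance condition from the EGT literature, namely that the sum of P_ij(k) over k = 1, ..., N is at least the sum of P_ji(k) over k = 0, ..., N − 1. The helper names, the toy payoff functions and the numeric bounds are hypothetical; they do not reproduce Table 2 or equations (15) to (18).

```python
def is_risk_dominant(P_ij, P_ji, N):
    """Check whether strategy i is risk-dominant against strategy j.

    P_ij(k): payoff of an i-player in a group with k i-players (k = 1..N).
    P_ji(k): payoff of a j-player in a group with k i-players (k = 0..N-1).
    Assumed criterion: sum_{k=1}^{N} P_ij(k) >= sum_{k=0}^{N-1} P_ji(k).
    """
    lhs = sum(P_ij(k) for k in range(1, N + 1))
    rhs = sum(P_ji(k) for k in range(0, N))
    return lhs >= rhs

def proposers_viable(E, upper_bounds):
    """E must lie below the smallest of the analytical upper bounds."""
    return E < min(upper_bounds)

# Hypothetical toy payoffs: a proposer earns 3 minus the arrangement cost E,
# while the competing strategy always earns 2 (placeholders only).
N = 5

def toy_payoffs(E):
    return (lambda k: 3.0 - E), (lambda k: 2.0)

cheap = is_risk_dominant(*toy_payoffs(0.5), N)   # cost below the toy bound of 1
costly = is_risk_dominant(*toy_payoffs(1.5), N)  # cost above the toy bound of 1

# Placeholder numbers standing in for the four right-hand sides.
bounds = [1.0, 2.0, 0.8, 3.0]
viable = proposers_viable(0.5, bounds)
```

In this toy setting the condition reduces to E ≤ 1, so the cheap arrangement passes and the costly one fails; with the actual payoffs the bounds would depend on a, m and the benefit parameters.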

Numerical results for N-player TD game
We compute stationary distributions in a population of six strategies, HP, LP, HN, LN, HC and LC, for the N-player TD game, using the payoffs in Table 2 and the methods described above. To begin with, in Figure 4 (see also Figure 9 in Appendix 1), we provide numerical validation of the analytical conditions obtained in the previous section regarding when commitment proposing strategies are evolutionarily viable (i.e. risk-dominant against the others). Similar to the pairwise TD game, we observe that there is a threshold for E below which this is the case. Moreover, Figure 5 shows that the frequencies of these strategies (HP and LP) decrease for increasing a. They dominate the population whenever E is sufficiently small (e.g. E = 0.1 and 1). That is, it is more beneficial to engage in a prior commitment deal when the market competition is harsher (i.e. small a). These results are robust for different intensities of selection (see Figure 10 in Appendix 1). In general, our results confirm the observations regarding the effects of E and a on the evolutionary outcomes obtained in the pairwise game above.
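For the N-player game, the average payoffs that enter the stationary distribution computation require averaging over randomly formed groups. The following is a minimal sketch, assuming groups of size N are sampled without replacement from a population of Z players (hypergeometric sampling, as is common in N-player EGT models) and that only two strategies i and j are present at a time. The function name and the toy payoff functions are illustrative assumptions and are not taken from the payoff table.

```python
from math import comb

def average_payoffs(P_ij, P_ji, k, Z, N):
    """Average payoffs of i- and j-players for 1 <= k <= Z - 1 i-players.

    P_ij(x): payoff of an i-player in a group containing x i-players.
    P_ji(x): payoff of a j-player in a group containing x i-players.
    The other N - 1 group members are drawn without replacement
    from the remaining Z - 1 players.
    """
    norm = comb(Z - 1, N - 1)
    # Focal i-player: x of its N - 1 co-players are also of type i.
    pi_i = sum(comb(k - 1, x) * comb(Z - k, N - 1 - x) * P_ij(x + 1)
               for x in range(N)) / norm
    # Focal j-player: x of its N - 1 co-players are of type i.
    pi_j = sum(comb(k, x) * comb(Z - 1 - k, N - 1 - x) * P_ji(x)
               for x in range(N)) / norm
    return pi_i, pi_j

# Toy check: if the payoff equals the number of i-players in the group,
# the averages are the expected group compositions.
pi_i, pi_j = average_payoffs(lambda x: float(x), lambda x: float(x),
                             k=50, Z=100, N=5)
```

These averages would replace the pairwise average payoffs in a fixation probability computation; the rest of the small-mutation procedure is unchanged under the stated assumptions.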
We now focus on understanding the effect of the new parameter in the N-player game, m, on the evolutionary outcomes. Recall that m indicates the demand for high technology (H) in the group, describing the maximal number of players in the group that can adopt H without reducing their benefit due to competition. Figure 4 shows the effect of different values of m on the frequency, or evolutionary success, of all strategies as a function of E. When m is small to intermediate and the cost of arranging prior commitment is also small, the commitment proposing strategies are dominant. This suggests that arranging prior commitments might be more beneficial in such instances. These results also imply that m is essential in determining when commitment should be initiated. Apparently, the greater the need for a group mixture or market diversity of technologies, indicating a more difficult coordination situation, the greater the need for commitment to enhance coordination among players. This observation is even more evident in Figure 6, where we examine the success of commitment for varying m and E, for two different game configurations. It can be observed that an intermediate value of m leads to the highest frequency of commitment strategies, especially in the more difficult coordination situation (i.e. the right panel).

Figure 4. In the N-player game, the new parameter m describes the market demand for a high technology, which was set to 1 in the pairwise game. HP and LP have a high frequency for sufficiently small E for m = 2 in both games and also when m = 1 for the first, easy coordination situation (first row). When m = 5, that is, when all players can adopt H without benefit reduction, HC always dominates and commitment strategies are not successful. This means that when there is a need for a diversity of technology adoption, initiating prior commitments to enhance coordination is important. Parameters: in panels (a, b and c), b_H = 6 (i.e. b = 5) with m = 1, 2, 5, respectively; in panels (d, e and f), b_H = 3 (i.e. b = 2) with m = 1, 2, 5, respectively. Other parameters: N = 5; intensity of selection b = 0.1; a = 0.5; c_H = 1, c_L = 1, b_L = 2 (i.e. c = 1).

Figure 5. Frequency of the six strategies HP, LP, HN, LN, HC and LC, as a function of a in a multi-player game with commitment, for different values of E and for two different game configurations. In general, the commitment proposing strategies (HP and LP) decrease in frequency for increasing a. They dominate over other strategies for sufficiently small a and E. That is, it is more beneficial to engage in a prior commitment deal when the market competition is fierce and the cost of arranging the commitment is minimal. Parameters: in all panels, c_H = 1, c_L = 1, b_L = 2 (i.e. c = 1); in panels (a, b and c), b_H = 6 (i.e. b = 5) with E = 0.1, 1 and 2, respectively; in panels (d, e and f), b_H = 3 (i.e. b = 2) with E = 0.1, 1 and 2, respectively. Other parameters: N = 5; intensity of selection b = 0.1; m = 2.
We now closely examine the gain in social welfare when using prior commitments. As shown in Figure 7, whenever m < N (N = 5), that is, when the group players need to coordinate to avoid competition that induces benefit reduction, prior commitments lead to an increase in social welfare. This increase is more significant in the more difficult coordination situation (i.e. the lower row) and when the cost of arranging commitment is low; it is also slightly more significant for intermediate values of m and for higher values of the intensity of selection, b.

Conclusions and further discussion
We have described in this article novel EGT models showing how prior commitments can be adopted as an efficient mechanism for enhancing coordination, in both pairwise and multi-player interactions. For that, we described technology adoption (TD) games where technology investment firms would achieve the best collective outcome if they can coordinate with each other to adopt a mixture of different technologies. To this end, a parameter a was used to capture the competitiveness level of the product market and how beneficial it is to achieve coordination, while another parameter, m, was used to capture the optimal coordination mixture or diversity of technology adopters in a group (in the pairwise case, we assume the optimal mixture is where the two firms adopt different technologies to avoid conflict).
In the coordination settings, there are multiple desirable outcomes and players have distinct preferences in terms of which outcome should be agreed upon, thus leading to a larger behavioural space than in the context of cooperation dilemmas (Hasan & Raja, 2013;Sasaki et al., 2015). We have shown that whether commitment is a viable mechanism for promoting the evolution of coordination strongly depends on a: when a is sufficiently small, prior commitment is highly abundant, leading to significant improvement in terms of social welfare (i.e. population average payoff), compared to when commitment is absent. Importantly, we have derived the analytical condition for the threshold of a below which the success of commitments is guaranteed, for both pairwise and multi-player TD games. Furthermore, moving from the pairwise to the multi-player setting, it was shown that m plays an important role in the success of commitment strategies as well. In general, when m is intermediate, equivalent to a high level of diversity in group choices, arranging prior commitments proved to be highly important. It led to significant improvement in terms of social welfare, especially in a harsher coordination situation.
In the main text, we have considered that a fair agreement is arranged. In Appendix 1 (Figure 8), we have shown that, when commitment proposers are allowed to freely choose which deal to propose to their co-players, they should be strict (i.e. sharing less benefits) in a highly competitive market (i.e. small a) and more generous when the market is less competitive.
In both pairwise and multi-player coordination settings, our analysis has shown that the cost of arranging an agreement must be sufficiently small, relative to the costs and benefits of coordination, for commitment to be justified. This is in line with previous works in the context of PD and PGG (Han et al., 2013). This is because those who refuse to commit can escape sanction or compensation. Solutions to this problem have been proposed in the context of PD and PGG, namely, to combine commitment with peer punishment, intention recognition, apology or social exclusion to address non-committers (Han & Lenaerts, 2016;Han, Santos, et al., 2015;Martinez-Vaquero et al., 2017;Quillien, 2020) or to delegate the costly process of arranging commitment to an external party (Cherry & McEvoy, 2013). Our future work will investigate how to combine prior commitments with such mechanisms to provide a more adaptive and efficient approach for coordination enhancement in complex systems.
Prior commitments and agreements have been used extensively in the context of distributed and self-organising multi-agent systems, for modelling and engineering a desirable correct behaviour, such as cooperation, coordination and fairness (Chopra & Singh, 2009;Singh, 1991;Winikoff, 2007). These works, however, consider neither the dynamical aspects of the systems nor the conditions under which commitment proposing strategies can actually promote a high level of desirable system behaviour, for instance, regarding the relation between the costs and benefits of coordination and those of arranging a reliable commitment. Thus, our results provide important insights into the design of such distributed and self-organising (adaptive) systems to ensure high levels of coordination, in both pairwise and multi-party interactions (Bonabeau et al., 1999;Pitt et al., 2012).
In future work, we will consider how commitments can solve more complex collective problems, for example, in a technological innovation race (Han et al., 2020), bargaining games (Rand et al., 2013;Zisis et al., 2015), climate change actions (Barrett, 2007;F. P. Santos et al., 2020) and cross-sector coordination (F. P. Santos et al., 2016), where there might be a large number of desirable outcomes or equilibriums, especially when the number of players in an interaction increases (Duong & Han, 2016;Gokhale & Traulsen, 2010).
Overall, our work has demonstrated that commitment is a viable tool for promoting the evolution of diverse collective behaviours among self-interested individuals, beyond the context of cooperation dilemmas where there is only one desirable collective outcome (Barrett, 2007;Skyrms, 1996). It thus provides new insights into the complexity and beauty of behavioural evolution driven by humans' capacity for commitment (Frank, 1988;Nesse, 2001).

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: T.A.H. is supported by a Leverhulme Research Fellowship (RF-2020-603/9). T.A.H. and A.E. are also supported by Future of Life Institute (grant RFP2-154).

ORCID iD
The Anh Han https://orcid.org/0000-0002-3095-7714

Note
1. Compared to cooperation dilemmas such as Prisoner's Dilemma (PD) and Public Goods Game (PGG), fake strategies make less sense in the context of coordination games since they would not earn the temptation payoff by adopting a different choice from what is being agreed. Moreover, in the presence of an agreement, players obtain an additional compensation when adopting the disadvantageous choice (i.e. L). We will keep the fake strategies in the analysis of pairwise games for confirmation of these intuitions but will exclude them from multi-player settings for simplicity, without being detrimental to the results.

Numerical confirmation of risk-dominant conditions in the N-player game
See Figure 9 for numerical results confirming the risk-dominant conditions in the N-player game in the main text.

Figure 8. Average population payoff as a function of u_1 and u_2, for different values of a and b (for pairwise TD games). When a is small (panels a and b), the highest average payoff is achieved when u_1 is sufficiently small and u_2 is sufficiently large, while for large a (panel c), it is the case when u_1 is sufficiently large and u_2 is sufficiently small. The figure also shows that for a small value of the intensity of selection b, the highest average payoff is achieved when a is very small, compared to other panels with a higher value of b (compare panels a, d and g). Parameters: in all panels, c_H = 1, c_L = 1, b_L = 2 (i.e. c = 1) and b_H = 6 (i.e. b = 5). Other parameters: d = 4, E = 1; intensity of selection b = 0.01, 0.1 and 1; population size Z = 100.
Results for other intensities of selection in the N-player game