Effects of Gehl’s urban design guidelines on walkability: A virtual reality experiment in Singaporean public housing estates

Walkability has become an important theme of urban design research and practice. Evidence suggests that environmental attractiveness can have a significant impact on the amount of walking activities that take place, but relatively little research exists on which environmental features linked to attractiveness increase walkability. Using a virtual reality experiment, the present study examined the effects on walkability of three key features, as defined by Jan Gehl, an influential urban planning practitioner and theorist: liveliness, high-quality façades and low buildings. A virtual reality simulation allowed isolating the effects of these features, while avoiding confounding factors, such as the presence of shops, which has been difficult to do in past field studies. Our study confirmed that the combination of features recommended by Gehl promoted walking activity in the study’s context. Further exploratory analyses suggested that improved façade quality was positively linked to walking activity, and that building height and liveliness had negligible effects. Our findings contribute to the existing understanding of walkability, which may benefit urban planning practice and models of walkability. Further research is necessary to confirm our results regarding the effects of specific features on walking activity in different contexts.

implemented urban design guidelines similar to Gehl's, without citing Gehl specifically (City of Melbourne, 2015; City of Stockholm, 2021; for Singapore, see Urban Redevelopment Authority, 2020). In the academic world, a good example is the most-cited series of studies on the relationship between urban design variables and walkability by Reid Ewing (see e.g. Ameli et al., 2015; Ewing and Clemente, 2013). Ewing and his team asked a group of urban design experts to determine metrics for assessing urban design quality, and then to rank street scenes according to these metrics. In the studies that followed, the team correlated the urban design ratings with pedestrian counts to see which urban design features contribute most to walkability. In many ways, the experts' urban design quality metrics are similar to those of Gehl, including liveliness, façade transparency, complexity and human scale (Ewing and Clemente, 2013).
What distinguishes Gehl's guidelines for improving walkability through urban design from these and others is that Gehl's are more specific, which is the second reason why we chose to focus on them. For example, Gehl specifies façade dimensions that encourage walkability. Such specific guidelines regarding the shape and properties of objects that form the built environment are straightforward to implement unambiguously in a 3D modelled VR environment, and are hence more suitable for our study than similar but broader and less specific guidelines, for example, those by Ewing and Clemente (2013), described above.

Gehl's urban design guidelines to promote walking activities
According to Gehl, urban designers should aim to create lively environments with high-quality façades and low buildings. He argues that these features encourage walking activities because they meet human social, psychological and physiological needs, as follows.
First, Gehl argues that liveliness is the cornerstone of attractive walking environments because it creates a virtuous cycle where the presence of people attracts more people (Gehl, 1980; Gehl, 1987: 25; Gehl et al., 2006: 37-9; Gehl, 2010: 65). In lively places, many people engage in what Gehl calls 'optional activities', such as sitting or slowly strolling, as opposed to 'necessary activities', such as 'going to work or school, waiting for the bus, [or] bringing goods to customers' (Gehl, 2010: 20, 71; for a similar distinction between necessary and optional walking activities, see Nakamura, 2020). Lively places are attractive because they provide a promise of social interaction and add to positive experiences (Gehl, 2010: 63, 67). Thus, according to Gehl, liveliness encourages more optional walking-related activities.
Second, Gehl argues that high-quality façades encourage walking because they meet people's need for stimulation and social interaction. Firstly, high-quality façades (one feature of which is transparency) allow mutual visibility and interaction between people inside and outside a building. This increases the sense of liveliness outside the façade and thus increases walking activity, as described in the previous paragraph. Additionally, Gehl argues that high-quality façades meet people's basic need for stimulation at 5-second intervals as they contain small-scale details and consist of visually distinct units that are 5-6 m long (Gehl, 1987: 93-7;Gehl et al., 2006;Gehl, 2010: 44, 77-9). For example images of façades Gehl rated, consult Gehl (2010: 241).
Third, despite advocating for compact cities, Gehl argues that tall buildings discourage walking activity. Partly this is because they reduce thermal comfort and impede the connection between building occupants and passers-by (Gehl, 2010: 171, 41-2, 68). However, Gehl also implies that tall buildings are simply visually unattractive to pedestrians (Gehl, 1987: 69, 99). He further suggests that tall buildings can be made more attractive by creating a setback, so that the building is L-shaped (Gehl, 2010: 69).

Empirical studies on Gehl's guidelines
Empirical studies on the relationship between walking activity and the features included in Gehl's guidelines have yielded equivocal results. Existing studies have been conducted in the field, where it is difficult to isolate the effects of features from one another, and from the effect of active functions on people's willingness to walk in the environment. We are not aware of any study that has comparatively examined all the environmental features included in our study.
Reviewing Gehl's most important publications (Gehl, 1980, 1986, 1987, 1989, 2010; Gehl et al., 2006), we did not find empirical evidence for the claim that low rather than tall buildings encourage walking activity. Other studies have reached inconclusive results (e.g. Ameli et al., 2015; Ewing and Clemente, 2013; Isaacs, 2000). Several studies have examined the relationship between walking activity and enclosure (Ameli et al., 2015; Isaacs, 2000; Ewing and Clemente, 2013), a commonly used measure that is positively correlated with building height (for a definition, see Ameli et al., 2015: 397). While Isaacs (2000) found that pedestrians prefer enclosed environments, Ameli et al. (2015) and Ewing and Clemente (2013) found no relation between enclosure and walking activity. Hence, the existing evidence on the relationships between building height and walking activity is limited and inconclusive.
Several studies have investigated the effect of liveliness on walking activity. Gehl's field research suggests that in lively environments, pedestrians move more slowly than in non-lively environments, thereby spending more time in the environment and creating a virtuous cycle of increasing liveliness and attractiveness (Gehl et al., 2006: 37-9). However, Gehl's lively environments also contained higher-quality façades and more active ground floor functions (such as shops and cafes) than the control environments, leaving uncertain the effect of liveliness itself on walking activity. Although Isaacs (2000) and Mehta (2008), for example, conclude that people prefer lively places, it remains unclear how, and to what degree, this preference is linked to walking activities.
Lastly, important gaps remain in research on the effects of façade quality on walking activities. Again, Gehl's research is unable to disentangle the effects of façade quality from those of liveliness and active ground floor functions. This is because high-quality façades were defined as transparent and active, thereby making them indistinguishable from active functions. Liveliness was another confounding factor because active functions correlate closely with it, too. Field studies by other authors have faced similar challenges. Although several studies indicated that transparent façades are associated with more walking activity, only one study (Ewing and Clemente, 2013) adequately controlled for the presence of active ground floor functions. Thus, it is still uncertain to what extent façade quality itself, as opposed to function, encourages pedestrian movement.
In sum, there is a lack of clarity regarding the link between walking activity and each of the three commonly implemented urban design features (i.e. building height, liveliness and façade quality). Mostly this is due to the difficulty of isolating the effects of features from one another, and from external confounding factors.

Justification for the current experiment
Given these gaps in the literature, we conducted a VR simulation to further investigate the effects of building height, liveliness and façade quality on walking activity. Together, VR and the context of Singapore public housing estates allowed us to address the gaps in the literature through a controlled experiment in a relevant, but previously less-studied context.
Compared to real-life observations, the benefit of a VR simulation is that researchers can vary environmental features (e.g. Kuliga et al., 2015), and thereby isolate the effect of independent variables on the dependent variable. This is especially useful for the present study, given that it has previously been difficult to isolate the effects of urban design features on walking activity, as described in the two sections above ('Gehl's Urban Design Guidelines to Promote Walking Activities' and 'Empirical Studies on Gehl's Guidelines'). Although the question of validity is important in VR research, recent studies using head-mounted displays have indicated that people behave in and perceive virtual environments in similar ways as real ones (see e.g. Higuera-Trujillo et al., 2017;Kisker et al., 2019;Kuliga et al., 2020). Another benefit of VR for our experiment was that it enabled automatic measurement of the speed at which participants moved through the environments. Thus, we could test their behavioural responses to the environment in addition to questionnaire responses (according to Gehl (2010: 79), people move more slowly in attractive and walkable environments).
Singapore, a high-density, high-rise urban environment, was deemed a relevant context for the study. First, research in the Singapore context might be especially relevant for rapidly-growing Asian cities (United Nations, Department of Economic and Social Affairs, Population Division, 2019), as these tend to be particularly car-oriented, and may benefit from evidence on measures that promote walkability (Nakamura, 2020). Second, Singapore is simply a novel context for walkability research, which has mostly focused on North America and Europe (e.g. Ameli et al., 2015; Ewing and Clemente, 2013; Mehta, 2008). As Singaporeans are accustomed to higher population density and taller buildings, and familiarity is known to affect environmental preferences (Pedersen, 1978), it is interesting to study how liveliness and building height affect walkability in Singapore. Within Singapore, our experiment focused on residential neighbourhoods, specifically publicly-owned Housing Development Board (HDB) estates, which housed 80% of Singaporeans in 2020 (Government of Singapore, 2021). A residential context allowed studying the effect of façade quality on walkability without the confounding factor of active ground floor functions which had been present in previous research (see above, section 'Empirical Studies on Gehl's Guidelines').

Aims and hypotheses
Our main goal was to examine whether façade quality, building height and liveliness, as defined by Gehl, together affect walking activity, given that existing research has remained inconclusive on the link between these features and walking activity. Thus, we tested one primary hypothesis (H1): (1) People are willing to engage in more walking activities, and walk more slowly, in environments containing the following features: (a) a high degree of liveliness, (b) high façade quality and (c) low buildings, compared to environments that are not lively, have low façade quality, and tall buildings.
In addition, the study aimed to tentatively examine which of these features is most strongly linked to walking activities. Thus, we also explored two secondary hypotheses (H2 and H3): (2) Each feature (liveliness, façade quality and building height) by itself is positively linked to walking activities. (3) Liveliness has the largest effect on walking activities.

The virtual environments
The virtual environments (Figure 1) were based on two real-world environments. The virtual versions were experimentally manipulated with respect to the independent variables façade quality, building height and number of people present. Table 1 presents all the levels of the independent variables and their descriptions. In total, the experiment contained 14 distinct virtual environments, of which each participant experienced nine. Each environment contained a yellow line which participants were instructed to follow, to ensure that everyone experienced the environments similarly.
The real-life bases for the virtual environments were chosen based on their façade quality, out of a selection of Singaporean public housing estate façades which were reviewed according to Gehl's criteria: transparency/relief, articulation and detail (Gehl, 2010: 78). For more details on the selection of base environments and the façade ratings, see Supplementary Material, Section 1. Basing the study's high-quality façades on real-life façades improved the validity and realism of the experiment (helping to ensure that the high-quality façades exemplified a realistic level of Gehl's guidelines, such as level of detail). More specifically, the two chosen real-life environments (based on estates in the Singaporean districts Simei and Bedok) were selected because they exemplified Gehl's conception of high-quality façades in different ways. The Bedok façade used colour as a predominant way of bringing about detail and façade articulation, while the façade in Simei had less dramatic changes in colour, but more variety in shapes. Using two different case studies helped test the concept of stimulation more broadly, reducing the risk of merely testing whether one particular type of façade (e.g. a colourful one) was more attractive than another.
Next, we created virtual models of the chosen estates using Rhinoceros 6, a 3D modelling software (McNeel and Other, 2018) and Unity (version 2019.1), a game development platform. These virtual models were manipulated to represent variations of the independent variables: façade quality, building height and liveliness.
When deciding the number of levels of each variable, and the differences between them, we relied as far as possible on Gehl's guidelines for walkable environments, summarised above in the section 'Gehl's Urban Design Guidelines to Promote Walking Activities'. Two levels were considered appropriate for liveliness, since Gehl does not specify detailed categories of different amounts of liveliness. Three levels were chosen for building height because in this case, Gehl identifies three clear categories (low, tall and L-shaped buildings). From a practical point of view, it is interesting to include all three building height levels in the experiment. Density requirements in urban areas often mean that replacing tall buildings with low ones to improve walkability is unfeasible. Thus, it is interesting to know whether L-shaped buildings can bring benefits similar to those of low ones, yet without compromising density.
While Gehl provides a more nuanced classification of façades, we only chose two levels in the façade variable for two main reasons. First, since existing research had not established the effect of façade quality on walking activity, it seemed appropriate to first study whether a clear difference in façade quality has an effect on walking activity. Second, limiting the number of levels in the experiment reduced the number of environments that each participant needed to experience, therefore reducing participants' fatigue, which in turn might negatively influence the questionnaire or speed measurement results (Bradley and Daly, 1994; Cherchi and Hensher, 2015; see also Supplements S2 for a discussion of how our pilot study results informed our decision). Our pilot study furthermore helped to ensure that the contrasts between each level of each variable were noticeable to participants on the one hand, and not too extreme on the other (for details, see Supplementary Material, Section 2).

Table 1. All levels of the independent variables. Except in location, numbers (0-2) represent the hypothesised contribution of each level to environment quality. Zero represents the hypothesised low-quality version of a feature and one (or, in the case of building height, two), represents the highest-quality version.

Liveliness (L)
0 - Low: Few virtually simulated agents present. Agents only engaged in walking (not staying) activities.
1 - High: Many virtually simulated agents engaging in both walking and staying activities (sitting on benches, or standing in groups and gesticulating as if in a conversation).
Location
Bedok: High façade quality version has more variability in colour and less articulation and variability in shapes.
Simei: High façade quality version has less variability in colour and more articulation and variability in shapes.

As creating all possible combinations of two locations and three features would have required 2 * 2 * 2 * 3 = 24 environments, some combinations were excluded. Figure 2 shows the combinations included in the experiment. Participants were semi-randomly placed in one of two groups (A or B), experiencing distinct, but partly overlapping sets of environments. As our goal was to reduce fatigue (as explained above), each participant experienced only nine of the 14 environments, presented in a randomised order. The specific number of environments was based on the pilot study findings on fatigue, discussed in Supplements, S2.
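The count of 24 can be checked by enumerating the full factorial design. The following is a minimal sketch, not the authors' procedure: the variable names and the numeric coding of the building height levels are assumptions made for illustration, and the sketch does not reproduce which 14 combinations were actually retained (those are shown in Figure 2).

```python
from itertools import product

# Levels as described in the text; labels and numeric codes are illustrative.
locations = ["Bedok", "Simei"]
liveliness = [0, 1]          # 0 = low, 1 = high
facade_quality = [0, 1]      # 0 = low, 1 = high
building_height = [0, 1, 2]  # three levels (tall, L-shaped, low); coding assumed

# Every combination of location and the three manipulated features.
full_factorial = list(
    product(locations, liveliness, facade_quality, building_height)
)

print(len(full_factorial))  # 24 = 2 * 2 * 2 * 3
```

A fractional design such as the one used here would keep a chosen subset of `full_factorial` rather than all 24 entries.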
All participants experienced the hypothesised highest and lowest-quality versions of the Bedok and Simei environments (see Figure 2). The hypothesised highest-quality environments were lively and contained low buildings and high-quality façades, while the hypothesised lowest-quality environments were not lively and contained tall buildings and low-quality façades.
A pilot study confirmed the validity of the environments (for details, see Supplementary Material, Section 2).

Participants and power calculation
Recruitment for our experiment targeted Singaporean students who were permanent residents or citizens. Arguably, a student sample is unlikely to distort results (Stamps, 1999). Based on Cohen (1988: 273, 384), we required a sample size of 45 participants to detect medium to large effect sizes (f = 0.3, α = 0.05, power = 0.8). This can be considered an upper limit since Cohen's calculation applies to 2x2 analyses of variance (ANOVAs) with between-subjects designs. The within-subjects design, used in our study, increases power (Cohen, 2013: 519). Participants were assigned to experimental conditions semi-randomly, according to their chosen experimental time slot. For sample sizes in each individual analysis, see Statistical Analyses and Sample Sizes section.
Participants were between 19 and 32 years old, and the sample was gender-balanced (24 males, 24 females). Participants had almost no formal training in urban design and architecture, and were used to high-rise environments (see Supplementary Material, Section 3 for details).

Measured variables
The dependent variable 'walking activities' was measured, firstly, by recording the speed of participants as they moved through the virtual environments, and secondly, through a customised, environmental evaluation questionnaire (referred to as 'activity questionnaire') with a seven-point scale. Besides pure walking activities, the questionnaire measured staying activities because according to Gehl (2010: 65), they are related to optional walking activities and contribute to the liveliness of an environment, potentially making it more walkable for others (see Introduction).
Section 4 in the Supplementary Material contains the full questionnaire, alongside additional details on the scale and choice of items.
Lastly, we issued a custom-made post-experiment questionnaire to understand participants' characteristics and perceptions of the experiment, including perceptions of realism and experiences of cybersickness (motion sickness in VR). For details, see Supplementary Material, Section 5.

Experimental procedure
After signing informed consent, participants experienced the VR environments through a head-mounted display (HTC Vive). To reduce the risk of tripping and cybersickness, participants were seated on a chair that could be rotated, as shown in the Supplementary Material, Section 6. Participants could virtually walk backwards and forwards using the HTC Vive remote controller. The direction of movement was determined by the direction the headset was facing. Participants could move forward by touching the touchpad; to move faster, they could simultaneously pull the trigger at the back of the controller. Three different speeds were available: 0 km/h, 3.9 km/h and 7.9 km/h. Throughout the experiment, a custom-made script recorded the speed of each participant at 0.25 s intervals.
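To illustrate how a speed log sampled at 0.25 s intervals could be summarised into a mean speed per environment, a minimal sketch follows. The function name, the log format (a plain list of km/h readings) and the example values are hypothetical; the study's own recording script is not described in further detail.

```python
SAMPLE_INTERVAL_S = 0.25  # sampling interval reported in the text


def summarise_speed(samples_kmh, interval_s=SAMPLE_INTERVAL_S):
    """Summarise a list of speed readings (km/h) taken at fixed intervals."""
    if not samples_kmh:
        raise ValueError("speed log is empty")
    duration_s = len(samples_kmh) * interval_s
    mean_kmh = sum(samples_kmh) / len(samples_kmh)
    # Convert each reading to m/s and integrate over the sampling interval.
    distance_m = sum(v / 3.6 * interval_s for v in samples_kmh)
    return {
        "duration_s": duration_s,
        "mean_kmh": mean_kmh,
        "distance_m": distance_m,
    }


# Hypothetical one-second log using the three available speeds.
log = [0.0, 3.9, 3.9, 7.9]
summary = summarise_speed(log)
print(summary)
```

Because only three discrete speeds were available, the mean speed in such a summary reflects the share of time spent standing, walking and walking fast.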
After a practice scene teaching the participant how to steer their movement, the environments were presented to each participant in a random order. After experiencing a virtual environment, participants removed the head-mounted display and filled in the activity questionnaire. Finally, at the end of the experiment, participants filled in the post-experiment questionnaire.

Dimensionality reduction
Dimensionality reduction of the questionnaire data was necessary before hypothesis testing (see Supplementary Material, section 7 for details). The analysis suggested that resting and walking activities were not separate constructs, and that activity questionnaire items could be combined into a single composite activity score. This score was calculated as a weighted average of activity questionnaire responses.
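A weighted average of this kind can be sketched as follows. The weights and responses below are invented for illustration only; the actual item weights are given in the Supplementary Material and are not reproduced here.

```python
def composite_activity_score(responses, weights):
    """Weighted average of questionnaire item responses."""
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(r * w for r, w in zip(responses, weights))
    return weighted_sum / total_weight


# Hypothetical responses on the seven-point scale and hypothetical weights.
example_responses = [5, 6, 4]
example_weights = [1, 1, 2]
score = composite_activity_score(example_responses, example_weights)
print(score)  # 4.75
```

Dividing by the sum of the weights keeps the composite on the same seven-point scale as the individual items.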

Statistical analyses and sample sizes
The hypotheses were tested through ANOVAs. Three separate ANOVAs per dependent variable (activity score and walking speed) were necessary due to the fractional study design (Figure 2). After checking that the data met ANOVA's sphericity assumption through Mauchly's test, each analysis was carried out for the two dependent variables.
First, a 2x2 ANOVA tested the primary hypothesis (H1). This analysis involved four environments: the hypothesised highest and lowest-quality versions of each location. Thus, it investigated the link between urban design qualities as a whole and walking activity. This analysis had the largest sample size: 48 and 34 participants for the questionnaire and speed data analyses, respectively. Sample sizes were smaller than 52 (the total number of participants) because some participants gave extremely and uniformly low questionnaire ratings across environments, or did not follow instructions properly, and because of speed measurement errors (see Supplementary Materials, section 8).
Two secondary, exploratory, analyses were also conducted, based on a smaller subsample. These analyses tested whether building height, liveliness and façade quality each individually contributed to walking activity (H2) and whether liveliness had the largest positive effect (H3). These analyses were conducted separately for group A and B data (cf. Figure 2 for the groupings). The sample sizes were 26 and 22 participants for groups A and B, respectively, for the questionnaire analysis (due to the measurement errors mentioned above, sample sizes were not large enough for speed analyses).
A one-way ANOVA was carried out for each group, using planned contrasts to compare pairs of environments that each differed with respect to one feature (cf. Table 2). For example, participants in each group experienced two pairs of environments which differed with respect to façade quality, but had the same location, liveliness and building height. Thus, the difference in walking activity between the paired environments could be attributed to the difference in façade quality. Since the planned contrasts were non-orthogonal and seven contrasts in each analysis were carried out, the significance threshold was adjusted to p = 0.007 using the Bonferroni correction.
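The adjusted threshold follows from dividing the conventional significance level by the number of contrasts. A minimal sketch of this arithmetic, with a hypothetical helper function for checking individual p-values:

```python
ALPHA = 0.05      # conventional significance level
N_CONTRASTS = 7   # planned contrasts per analysis, as stated in the text

bonferroni_threshold = ALPHA / N_CONTRASTS
print(round(bonferroni_threshold, 3))  # 0.007


def is_significant(p_value, threshold=bonferroni_threshold):
    """Return True if a contrast passes the Bonferroni-corrected threshold."""
    return p_value < threshold
```

Under this threshold, a contrast with p = 0.015 would be significant at the uncorrected 0.05 level but not after correction.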

Results
Analyses of variance
Primary analysis. As described in the Statistical Analyses and Sample Sizes section, we tested H1 through 2x2 ANOVAs on speed and activity questionnaire data.

Secondary exploratory analysis: effects of individual features on walking activity
The secondary, exploratory one-way ANOVAs tested H2 and H3 to explore the relative individual importance of each independent variable in encouraging walking activity. In total, four analyses were carried out for questionnaire and speed data for groups A and B (see Figure 4).

Figure 3. Two-way analyses of variance (ANOVA) results on questionnaire data (left) and speed data (right). High-quality environments received statistically significantly higher mean composite activity scores and significantly lower mean speeds in both locations.
In Group A questionnaire data, Mauchly's test showed that the assumption of sphericity was violated, χ²(35) = 82.47, p < 0.05; thus, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (ε = 0.58). Activity scores were statistically significantly associated with the environment, F(4.67, 116.77) = 6.00, p < 0.001, ηp² = 0.194. This means that at least one environment was significantly different from others, with environmental differences accounting for 19.4% of variation in the data.
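The reported degrees of freedom and effect size can be related to one another with two standard formulas: the Greenhouse-Geisser correction multiplies both degrees of freedom by ε, and partial eta squared can be recovered from F and its degrees of freedom. The sketch below assumes a one-way repeated-measures design with nine environments and the 26 Group A participants stated in the text; the function names are illustrative.

```python
def greenhouse_geisser_df(epsilon, n_levels, n_participants):
    """Scale the one-way repeated-measures ANOVA dfs by epsilon."""
    df_effect = epsilon * (n_levels - 1)
    df_error = epsilon * (n_levels - 1) * (n_participants - 1)
    return df_effect, df_error


def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Group A: nine environments, 26 participants, epsilon as reported (rounded).
df_effect, df_error = greenhouse_geisser_df(0.58, 9, 26)
print(df_effect, df_error)

# Effect size from the reported F(4.67, 116.77) = 6.00.
eta = partial_eta_squared(6.00, 4.67, 116.77)
print(round(eta, 3))  # 0.194
```

With ε rounded to 0.58, the corrected degrees of freedom come out near 4.64 and 116, slightly below the reported 4.67 and 116.77; the small gap is consistent with ε having been rounded for reporting.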
The pairwise comparisons for Group A showed a statistically significant increase in activity scores from environments with low-quality façades to those with high-quality façades (see Table S6). Façade quality was studied through two contrasts, both of which had large effect sizes (ηp²). The first contrast was between environments 8 and 10. The environment with high façade quality showed an increase in walking activity of 1.23 points out of seven, F(1, 25) = 18.46, p = 0.001, ηp² = 0.425. The second contrast, between environments 11 and 12, showed a mean increase of 1.10 points, F(1, 25) = 16.72, p = 0.007, ηp² = 0.401. Façade quality had an almost equal effect size in each contrast (ηp² difference of 0.024), despite one contrast containing a lively context and another containing a non-lively context. This suggests there was no statistically significant interaction between façade quality and liveliness based on our data.
Liveliness and the height of buildings were not significant in Group A data, contrary to our hypotheses H2 and H3 (cf. Table S6 in Supplementary Material).
In Group B questionnaire data, Mauchly's test indicated that the assumption of sphericity had been violated, χ²(35) = 76.15, p = 0.015; therefore, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (ε = 0.46). The results show that the activity scores were statistically significantly linked to the environment, F(3.71, 77.80) = 2.97, p = 0.028, ηp² = 0.124. This means that at least one environment was significantly different from others, with environment accounting for 12.4% of variation in the data.
Hypotheses H2-H3 were again tested using planned contrasts for Group B data (cf. Table S7 in Supplementary Material). Contrary to H2, there were no statistically significant differences between pairs of environments at the Bonferroni-corrected significance level. The planned contrast between Simei environments 1 and 3 (with high buildings, a high degree of liveliness and low and high-quality façades, respectively) was nominally statistically significant, F(1, 21) = 6.96, p = 0.015, ηp² = 0.249. The other contrast involving façade quality, between environments 4 and 5, was not significant, F(1, 21) = 2.95, p = 0.101, ηp² = 0.123. Contrasts involving high-quality façades had smaller p-values and larger effect sizes than contrasts involving liveliness and building height. This again suggests that façade quality had a greater effect on composite activity scores than liveliness, contrary to H3.
Speed analyses did not show statistically significant differences between environments in either Group A or Group B data due to reduced sample sizes (see Supplementary Material, Section 10 for details).

Post-experiment questionnaire
The post-experiment questionnaire results can be summarised as follows (for details, see section 5 in Supplementary Materials). First, participants found the VR environments 'fairly realistic'. Second, participants generally felt little cybersickness, nor did nausea significantly affect their speed of movement in VR (for details, see Table S5 in Supplementary Materials). Third, participants felt that the number of virtual agents in the environments had the biggest impact on their activity questionnaire ratings, while building height had the fourth largest impact (of nine features). This implies that participants noticed the manipulations, but raises questions about the lack of importance of these two features in the ANOVAs.
Although the post-experiment questionnaire is of lesser importance than the activity questionnaire, it is worth elaborating on the participants' own perception that the number of agents influenced them most, as this is in tension with the secondary analyses' results, based on activity questionnaire data. Importantly, in the post-experiment questionnaire, participants often misunderstood the term 'façade', and therefore could not reliably evaluate the relative importance of the number of agents vs. façade quality on their own activity questionnaire ratings (which did not share this problem, as the activity questionnaire did not contain the word 'façade'). More generally, the post-experiment questionnaire relied more on participants' memory of the different environments and the features that changed. It is plausible that participants, almost all of whom had no background in architecture or urban design, would over-emphasise the importance of virtual agents, simply because they were more conscious of this manipulation than the façade manipulation, if they were not actively paying attention to façades.

Discussion
This VR experiment tested how three urban design variables recommended by Gehl (building height, façade quality and liveliness) encourage walking activity. The virtually simulated environments were based on two existing HDB estates in Singapore, located in the Bedok and Simei neighbourhoods. These estates exemplified two different ways of achieving Gehl's conception of high-quality façades, with Bedok scoring higher on narrowness of units and Simei on the level of detail. Walking activity was measured through walking speeds and a questionnaire-based composite activity score. We tested three hypotheses: that higher façade quality, higher liveliness and lower buildings together increase the measured walking activities (H1); that each of these features individually increases walking activities (H2); with liveliness having the largest positive effect (H3).
Our analyses confirmed H1: environments with all three features (low buildings, high-quality façades and a high degree of liveliness) encouraged walking-related activities compared to environments with none of these features. Increasing the hypothesised environmental quality from lowest (none of these features present) to highest (all features present) significantly increased both composite activity scores and lowered mean speeds in both Bedok and Simei. Thus, four comparisons (lowest versus highest-quality combinations in Bedok and Simei measured through questionnaire and speed data) showed consistent results, suggesting that the result is not an artefact. Furthermore, neither location nor the interaction between location and overall environment quality were significant in either analysis, indicating that location did not confound the results. These findings are also in line with past field studies suggesting that urban design features can increase walkability (Ameli et al., 2015;Ewing and Clemente, 2013;Gehl, 2010;Isaacs, 2000;Mehta, 2008). The unique contribution of our analysis was its ability to control for confounding factors, such as active functions, which is difficult to do in the real-world studies cited above.
The additional, exploratory results suggested that façade quality was the biggest determinant of high composite activity scores, contradicting H3. In all four contrasts involving high-quality and low-quality façades, environments with high-quality façades gained higher composite activity scores. Two contrasts were statistically significant when accounting for the multiple-testing burden, and a third reached the set level for statistical significance. Although not every contrast was significant, the medium to large effect sizes and the existing literature lead us to speculate that the effect of façade quality is relevant. However, improving façade quality had a greater effect on walkability in Bedok than in Simei, as discussed in the next paragraph. Future studies might consider examining these effects further.
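To illustrate how a contrast can reach the set significance level and still fail once the multiple-testing burden is accounted for, the following minimal sketch applies a Holm-Bonferroni step-down correction to four p-values. Both the choice of correction method and the p-values are assumptions made purely for illustration; they are not the procedure or the values reported in this study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is
    rejected under Holm's step-down correction at family-wise level alpha."""
    m = len(p_values)
    # Visit the p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared against alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Hypothetical p-values for four facade-quality contrasts (NOT the study's data).
hypothetical_p = [0.004, 0.011, 0.049, 0.20]

uncorrected = [p <= 0.05 for p in hypothetical_p]
corrected = holm_bonferroni(hypothetical_p)

print("significant without correction:", uncorrected)
print("significant after Holm correction:", corrected)
```

In this invented example, three contrasts fall below the nominal 0.05 level but only two survive the correction, which is the kind of pattern described above.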
One possible explanation for the difference in mean activity scores for high-quality Bedok and Simei façades is that some of Gehl's recommendations for creating high-quality façades are more important than others. The Bedok environment had both narrower units and more variation in colour than the Simei environment. However, given that the Bedok and Simei façades differed in multiple ways, which indeed was a reason to include them in the study, we cannot discern which exact feature(s) caused the difference. Nevertheless, our study strongly indicates that façade quality itself, even without a lively or commercial context, increased intended walking activity within our study context. This has not been conclusively shown by past studies, which have defined high-quality façades as ones that provide views of other people, and had difficulty controlling for the presence of active or commercial functions (Ameli et al., 2015;Gehl, 2010).
Our study provides mixed support for the idea that low (and L-shaped) buildings are preferable to tall ones (H2). In all four contrasts involving low and high buildings, environments with low buildings received higher scores, with effect sizes ranging from small to medium. On the other hand, contrasts comparing low buildings to L-shaped buildings were inconclusive, with low buildings gaining higher scores in Simei and lower scores in Bedok. Neither of these effects was statistically significant. The effect of building height on walking activity therefore remains an open question: it seems likely that effects are small, and detecting them would have required a larger sample size. However, the result is perhaps strengthened when viewed alongside research by Ewing and Clemente (2013) and Ameli et al. (2015), who did not find a relationship between enclosure and walking activity in New York City and Salt Lake City. Similarly, Frölich et al. (2015) found that people find tall buildings unpleasant mainly when they are combined with very narrow spaces.
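The point that small effects require larger samples can be made concrete with a standard power calculation. The sketch below uses the normal approximation for a two-sided comparison of two independent groups, with effect size expressed as Cohen's d; the design, the 0.05 significance level and the 80% power target are all illustrative assumptions and do not describe the design or analysis of this study.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a standardised
    mean difference (Cohen's d) in a two-sided, two-sample comparison,
    using the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Conventional benchmarks for small, medium and large effects.
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label} effect (d = {d}): about {n_per_group(d)} per group")
```

Under these assumptions, the required sample grows with the inverse square of the effect size, so detecting a small effect takes several times as many participants as detecting a medium one.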
The link between liveliness and walking activity was not conclusively settled in the current study. However, our study suggests that, if anything, this effect was negative (contrary to H2). In all four contrasts involving liveliness, lively environments received lower activity scores than non-lively ones, with effect sizes ranging from small to medium while not reaching statistical significance.
There are at least two possible explanations for the finding that liveliness did not encourage walking activities. First, participants could not interact with the virtually simulated agents, and therefore the agents may not have represented the 'promise of social interaction' that Gehl argued encourages walking (Gehl, 2010: 63). However, in the post-experiment questionnaire, participants reported being influenced by the number of virtual agents more than by their realism or style, suggesting that the lack of interaction did not have a major impact on results. Another possible explanation is that the finding is specific to the context of Singapore or of HDB estates. Perhaps the Singaporean participants preferred less lively places during their leisure time, given Singapore's high population density, or preferred less lively residential environments. It is likely that participants recognised the residential HDB setting, as most had themselves lived in an HDB estate, and due to the presence of distinctive features such as void decks (common areas on the building's ground floor, visible to the outside), uniform storeys above the void deck, and the absence of gates surrounding the estates. Nevertheless, the link between liveliness and residential settings is a suggestive result that would benefit from further study.

Limitations
The main limitation of our study was the small sample size of the exploratory analyses of H2 and H3 (not the primary analysis), which examined the effects of individual features on walking activity. Nonetheless, we argue that a larger study would likely have yielded similar results. This is because we observed consistent trends: improving façade quality was consistently associated with higher activity scores, while the association was both weaker and less consistent in the case of building height and liveliness. Regardless, we argue that the smaller sample size in the exploratory analyses was justified, as we expected fatigue to worsen the quality of the data collected (see section 'The Virtual Environments'). We faced a trade-off between having possibly worse data for the entire study and having a sub-study with a smaller sample size. Since existing literature has not established the effect of any of our features on walking activity, we considered it justified to conduct a stronger main study on the combined effects of our three features, and secondary, exploratory studies on the individual effects.
A second limitation of the study is that it took place in virtual environments that had no sound, where walking happened through a remote controller. Although moving by actual walking would have been more realistic, arguably it was most important that participants could move in an intuitive way that allowed them to focus on observing the environment. The remote controller was deemed an intuitive way to move that fulfilled this condition.

Conclusions
In sum, this study contributes to knowledge about urban design features encouraging walkability, specifically adding quantitative evidence from the context of a highly dense Asian city (Singapore). Our two aims were to validate existing walkability guidelines, and to contribute to walkability indexes and models, which are used in a variety of fields from real estate development to transport planning and public health.
The study contributes two main conclusions. Firstly, it suggests that urban design features are important for encouraging walking activity, even when commercial functions are not included among them. Secondly, our results tentatively suggest that façade quality has a greater positive impact on walkability than building height or liveliness.
The conclusion that non-commercial urban design features can improve walkability contributes to existing literature on walkability. This is something that past field studies have had difficulty showing, as it is hard to distinguish the effect of urban design variables from the effect of commercial functions.
Accordingly, urban design features are generally underweighted or disregarded in walkability models such as WalkScore (2020) and even the more holistic Q-PLOS (Quality of Pedestrian Level of Service index, see Talavera-Garcia and Soria-Lara, 2015). WalkScore focuses exclusively on the contribution of convenience (i.e. the proximity to attractive destinations) to walkability. Q-PLOS also takes into account the ability to meet the other major walking needs (safety, comfort and attractiveness); however, it assumes that the presence of commercial functions is the only environmental feature that increases attractiveness. Hence, our findings (and the findings of future studies on the impact of non-commercial urban design features) may motivate walkability analysts and researchers to incorporate urban design variables in walkability metrics.
Such research may also benefit from our other main (though tentative) conclusion that, of the features studied, façade quality had the largest impact on walkability. In this regard, the benefit of our study is that it tested the effect of concrete, measurable façade features and estimated effect sizes. Both of these are needed in efforts to improve walkability metrics through, for example, street image data (Hasan, 2020;Miranda et al., 2021;Shen et al., 2018).
Our findings on the individual effects of the three features studied may also have practical implications. Firstly, they suggest that Gehl's façade guidelines are broadly valid, and that cities are justified in using them in building codes. Secondly, visual considerations do not seem to be a good reason to limit building height (although other aspects, which were not the subject of this VR study, such as microclimate conditions, may naturally still justify limiting building heights). Thirdly, and more tentatively, even if previous research in non-residential areas correctly demonstrated that people prefer lively public spaces, this may not always be the case. We speculate that our study's context of a residential area in a high-density city may have influenced the finding that liveliness tended to be negatively associated with walking activity, though further research on this would be needed.

Future work
As implied above, an interesting line of future work may be incorporating our findings into models of walkability, and using such models to analyse large urban areas.
Alternatively, future research may have a narrower scope than our study, by focussing on how the individual features, or even subfeatures, included in our study affect walkability. This could mean examining which subfeatures of façade quality affect walkability, or whether context (such as that of a dense city) affects the link between liveliness and walkability. Future studies could also examine whether building height has more substantial effects on walkability when accounting for factors that our study did not focus on, such as the microclimate or other ambient factors like soundscapes.
At least some of this further research would benefit from VR, a promising approach to gathering empirical knowledge about the effects of design features on behaviour. Virtual reality is of particular benefit for urban researchers seeking to avoid confounding factors found in field studies.