Fitness testing at police academies: Optimal fitness for duty

Fitter police recruits are more likely to graduate, are less injury-prone, and fatigue less rapidly. Although most police academies implement fitness testing to ensure a minimum standard of job-specific fitness, academies may benefit from utilizing fitness tests that challenge recruits beyond the minimum fitness demand. The current study evaluated whether a police academy's fitness test, called Professional Fit (ProfFit), aligned with the academy’s purpose of challenging its recruits to become optimally fit. We evaluated whether the ProfFit measures an all-round range of fitness characteristics. Moreover, we evaluated whether the ProfFit measures higher fitness levels than the minimum fitness demand for duty. Police recruits (N = 103) were tested on the ProfFit, six extra fitness tests known to measure one (isolated) aspect of fitness, and a standard practice job-specific fitness test. Results showed that the ProfFit measures an all-round range of fitness characteristics: anaerobic power, strength of the lower extremities, strength of the upper extremities, isometric strength, and muscular endurance of the trunk muscles. The results also showed that recruits reported a higher rating of perceived exertion during the ProfFit than during the job-specific fitness test, indicating that they experienced the ProfFit as more demanding than the minimum fitness demand for duty. It was concluded that the ProfFit helps to challenge police recruits physically, as was the aim of the police academy. These findings provide empirical support for fitness tests that aim to improve police recruits’ fitness levels to be not just fit for duty, but rather optimally fit for duty.


Introduction
Police academies worldwide test whether their recruits are fit for duty. They use job-specific fitness tests to test, monitor, and promote the physical fitness needed to be a police officer (Anderson et al., 2001; Koedijk et al., 2020; Payne and Harvey, 2010). Research has shown that individual fitness characteristics are associated with job-specific tasks (Beck et al., 2015; Canetti et al., 2021; Dawes et al., 2016). Accordingly, studies have established which fitness characteristics and tasks should be included in the tests to create a manifest relation with on-duty performance (i.e. the validity of job-specific fitness tests for police officers; Cesario et al., 2018; Lindsay et al., 2021; Lockie et al., 2019). As such, police academies can develop fitness tests that are relevant for, and directly related to, fitness for duty.
However, job-specific fitness standards are commonly developed to secure only a minimum level of physical conditioning needed to perform job-specific tasks (i.e. criterion-referenced; Mol and Visser, 2002, 2004; Petersen et al., 2016; Tipton et al., 2013; Zumbo, 2016). Consequently, job-specific fitness tests do little to optimize the fitness levels of officers beyond the minimum requirement to perform the job. An additional issue with job-specific fitness tests is that recruits can compensate for relatively weak fitness characteristics. In many job-specific fitness tests, recruits who are fast but not strong will be able to compensate for their limited strength with agility and speed (as is possible in the execution of police work). A number of studies have postulated that fitness tests should require all-round fitness, meaning that recruits must demonstrate a sufficient fitness level on a wide range of characteristics, in which weakness in one of the fitness characteristics relevant for police work cannot be compensated for by relative strength in another (Arvey et al., 1992; Cesario et al., 2018; Crawley et al., 2016; Lockie et al., 2018, 2019). Dawes et al. (2016) argued that although having all-round fitness does not necessarily serve as a predictor of job performance, it does contribute to overall fitness, health, and injury prevention.
Fitness tests that set performance standards beyond the minimum requirements to perform the job can increase recruits' motivation to maintain and (further) improve their fitness. Lagestad and Van den Tillaar (2014) indicated that a decrease in motivation and commitment to physical activity might occur if officers are not required to undergo regular physical testing. Research has shown that recruits' learning dependence increases when testing is the last activity in a learning program compared with spending an equal amount of time practicing skills without a test at the end (Kromann et al., 2009; Roediger et al., 2011). Furthermore, Ross et al. (2006) showed that recruits self-regulate their learning in line with the demands of the tests. Together, these studies illustrate that testing and evaluation drive recruits' engagement with training activities. As such, one could argue for the inclusion of more intensive (beyond the minimum fitness demand) and all-round tests at police academies, because recruits would then be motivated to train for these more demanding tests and thus develop more all-round fitness.
Police academies would benefit from preparing and graduating optimally fit recruits for a number of reasons. First, police recruits who optimize their fitness characteristics have increased chances of successfully graduating from the police academy (Korre et al., 2019; Lockie et al., 2018; Shusko et al., 2017). Shusko et al. (2017) found that improved performance on the 1.5-mile run and push-up components was strongly associated with successful graduation and that these may be effective tests to optimize the fitness of recruits. Next, higher fitness standards can be used to identify recruits at risk of injury or failure. Suboptimal fitness may be a risk factor for injury, illness, and lost time during police academy (MacMillan et al., 2017). As a last reason, we argue that fitter officers are able to perform job-specific tasks for longer, recover faster, and fatigue less rapidly (Knapik et al., 2015). Thus, police academies would benefit from stimulating and facilitating recruits to optimize their fitness levels. Importantly, this requires fitness tests that set performance standards beyond the bare minimum standards in current common practice job-specific tests and appeal to an all-round range of fitness characteristics important for police work.
There has been little research on fitness tests to challenge police recruits physically beyond the minimum job requirements and improve their fitness levels to be not just fit for duty, but rather fitter for duty. So far, surveys have shown that police recruits expressed a positive attitude towards fitness testing to challenge them to improve their fitness further (Bissett et al., 2012; Lagestad, 2012). The current study evaluates a fitness test called Professional Fit (ProfFit). This test includes components such as push-ups and sit-ups, which are utilized globally in test practices of police organizations. For more practical examples of international fitness tests, we refer to Marins et al. (2018), Orr et al. (2021) and Zulfiqar et al. (2021), among others.
The police academy that participated in the current study aims to deliver recruits with a higher level of fitness at graduation than when they entered the academy. Against this background, the police academy formulated two purposes for their fitness test: (a) it should measure whether recruits possess an all-round range of fitness characteristics, and (b) it should measure higher fitness levels than the minimum fitness demand for duty. Thus, the study aims to answer two questions:
• Does the ProfFit measure an all-round range of fitness characteristics?
• Does the ProfFit measure higher fitness levels than the minimum fitness demand for duty?
To answer the first research question, we related performance on the ProfFit to a wide range of fitness characteristics relevant for police work. To answer the second research question, we compared the objective and subjective load of performing the ProfFit test with the load of a job-specific test, namely the Physical Competence Test (PCT). The PCT is a job-specific obstacle course that represents tasks police officers encounter on duty and is utilized by the police academy to test the minimum fitness demand for duty. The PCT was developed on the basis of a task analysis of the daily working activities and the minimum demands on physical fitness of the Dutch Police (Mol and de Vries, 2007; Mol and Visser, 2002). The minimum physical demands in the PCT were quantified in terms of the duration, frequency, and intensity (%HRR) of physical tasks and activities on duty (Anderson et al., 2001; Mol and Visser, 2002). The minimum physical demands for duty tested in the PCT have been evaluated and validated several times (Strating and Bakker, 2008; Strating et al., 2010). The current study thus evaluates whether the police academy's fitness test assesses an all-round range of fitness characteristics and measures higher levels of fitness than the minimum fitness demand for duty.

Design
The current study utilized a cross-sectional within-subjects design. Figure 1 shows an overview of all the tests performed by participants: the ProfFit test, the PCT, and six extra fitness tests. Participants' fitness characteristics were measured using fitness tests known to record one (isolated) aspect of fitness (see the section Extra Fitness Tests). This formed a two-step process. First, we searched for all fitness characteristics important for regular street patrol officers to perform work-related tasks (Beck et al., 2016; Marins et al., 2019). We then selected the fitness tests most valid for each particular fitness characteristic. Within the organizational constraints of the study, it was not possible to include a test that separately measured the fitness characteristic aerobic endurance, because this would pose too much load or fatigue on participants within the time frame of the study and the full testing protocol.
To prevent overload and fatigue, each participant performed the tests over two separate test sessions, at least six days apart. To prevent order effects, half of the participants completed the ProfFit test during the first test session, and the PCT and extra fitness tests during the second test session; the other half first completed the PCT and the extra fitness tests in the first session, and the ProfFit test during the second test session. Test sessions were organized in a carousel style, meaning that subgroups of participants started the test session with different test components, further preventing systematic bias caused by a build-up of fatigue over the test components. All recruits performed a self-selected 5-10-min warm-up before beginning the fitness testing.

ProfFit test
The ProfFit is a fitness test consisting of seven components: 500 m rowing, squats, push-ups, sit-ups, burpees, pull-ups, and a 1,600-m run. The components were performed as they are administered in the test practice at Dutch police academies. The components have also been described in previous studies investigating fitness among police officers (Lockie et al., 2018). For a few components, additional reference aids were used to make the assessment of correct execution easier (see the detailed description of test components and Figure 2). Rest periods between test components ranged from 1 to 6 minutes. Participants had to meet standards for each test component that are specific to their gender and age. For all standards of test components, see the online Supplemental Material. Each test component was assessed by a police instructor or researcher who ensured that the movement performance was correct, counted the number of repetitions and/or kept time. If participants' movement performance did not meet the set requirements, the execution was rejected and not counted. This is congruent with the assessment of performance of the ProfFit test components at Dutch police academies. The procedures for the components are detailed hereafter.
500 m Rowing. Participants' 500 m rowing performance was assessed using a Concept2 Model E PM5 rowing ergometer. The recorded time stopped as soon as the participant completed 500 m. Participants had to meet the time standard that applied to their gender and age.
Squats. Participants were instructed to place their feet slightly straddled, bend their knees 90 degrees, and extend their hips fully when coming up after the squat. To standardize the execution of squats, reference lines (elastics), pre-aligned to the participants' anthropometric measurements, were used to check whether participants performed the squat with a 90-degree knee bend and whether the participants were stretching the hips enough to be fully upright after performing the squat (Figure 2(a)). Participants had to meet the number of squats within 1 minute that applied to their gender and age.
Push-ups. Participants were instructed to place their hands shoulder-width apart, align their heel, hip, and shoulders (plank), and bend their elbows 90 degrees (shoulders were in line with the elbow joint). To standardize whether participants' hands were placed shoulder-width apart, a tool consisting of movable surfaces adjusted to shoulder width was used (Figure 2(b)). Participants were instructed to place their hands on these surfaces during the performance of the exercise. Participants had to meet the number of push-ups within 1 minute that applied to their gender and age.
Sit-ups. Participants were instructed to bend their knees 90 degrees, tap their hands on their foot's instep, and lift their back off the ground. Adjustable reference lines were used to standardize the sit-ups' execution, pre-aligned to the participant's anthropometric measurements (Figure 2(c)). The bottom reference line was placed at the height of the participant's foot instep. The top reference line was placed at the height of the participant's fingertips while lying with their back and head fully on the ground (starting position of the sit-up). Participants were then instructed to bring the fingers of both hands from the top line to the bottom line simultaneously with each sit-up. Participants had to meet the number of sit-ups within 1 minute that applied to their gender and age.
Burpees. Participants were instructed to start the burpee from an upright position, then bend their knees, and place their hands on the floor. At the same time, participants had to jump back and assume a plank position and perform a push-up. From the push-up, the participant came back up and jumped, clapping their hands above their head. Participants had to extend their hips when jumping up (standing completely upright). To ensure that participants fully extended their hips, a reference line (elastic) was used (Figure 2(d)). This reference line was placed 15 cm above the recruit's height with arms raised. When jumping up and clapping their hands, the hands had to reach the height of the reference line. Participants had to meet the number of burpees within 1 minute that applied to their gender and age.
Pull-ups. Participants were instructed to start the pull-up hanging from a bar with their arms fully extended. Participants had to pull up until their chin was above the bar. Female participants were allowed to put one or two feet against the wall for support (in line with the test practice and requirements for female recruits at the academies). Participants had to meet the number of pull-ups within 1 minute that applied to their gender and age.
1,600 m Run. Participants performed a 1,600-m run on a treadmill. Research has shown that the endurance required to run at a certain speed for a certain amount of time on a firm surface is comparable with the endurance required for the same speed and duration on a treadmill with a 1% incline (Jones and Doust, 1996). Therefore, the incline was set at 1%. The recorded time stopped as soon as the participant completed the 1,600 m. Participants had to meet the time standard that applied to their gender and age.

Extra fitness tests
The extra fitness tests used in this study have been found in the literature to measure one (isolated) aspect of fitness and are therefore good reference tests that can be interpreted unambiguously. Moreover, the procedures adopted during testing are reflective of research incorporating law enforcement populations in the published scientific literature (Cocke et al., 2016; Crawley et al., 2016; Dawes et al., 2016, 2017; Orr et al., 2021).
Wingate test. Anaerobic capacity was assessed via a Wingate test (Medbø and Tabata, 1989). Participants performed the test on a Monark 894E cycle ergometer (Monark, Stockholm). We set the test's resistance at 7.5% of body weight in kg (Bar-Or, 1987). Participants could pedal for 5 s to reach maximum cadence and were then instructed to maintain their maximum speed throughout the 30-s period once the resistance was applied.
Vertical jump test. The strength of the lower extremities was assessed via a vertical jump test. This test protocol has previously been used in studies of police officers (Crawley et al., 2016; Dawes et al., 2017). The vertical jump height was measured using a wooden board. The participant put on a harness with a measuring tape attached, which ran through an indicator on the board. The participant was instructed to stand straight and look straight ahead, and the measuring tape was pulled taut to record standing height. As the participant jumped, the measuring tape ran through the indicator on the way up and went slack as the participant came down. The measuring tape and indicator thus recorded the highest point of the jump. Participants were instructed to jump as high as possible with their hands on their hips and legs stretched, and to land within the plane indicated on the platform. The values in centimeters on the measuring tape were read before and after the jump. We calculated the jump height by subtracting the value before the jump (standing height) from the value after the jump (highest point of the jump).
Sit-and-reach test. Flexibility was assessed via a sit-and-reach test. This method of testing has been used in other studies in this population (Lockie et al., 2019). We used a box with a wooden shelf extending 15 cm over the edge of the box. A ruler was attached to the shelf. The participant was asked to take off their shoes and sit in front of the box with their feet flat against the front of the box. The participant's knees were fully extended throughout the test. The participant was instructed to bend forward with two arms and try to push a lightweight block as far as possible, with the fingertips of both hands. We noted the distance in centimeters to where the fingertips reached on the box. Participants performed three attempts to reach as far as possible.
Handgrip strength test. Maximal isometric muscle strength was assessed via the handgrip strength test (Schlüssel et al., 2008). The participant performed the test with a Takei Grip Strength Dynamometer (Takei Kiki Kogyo, Tokyo, Japan). The testing protocol used in this research has been used in this population and is described in several studies (Dawes et al., 2017). Participants were instructed to squeeze the handle maximally and to sustain this for 3-5 seconds. Participants performed three maximum attempts for each grip strength measurement.
Plank test. Trunk muscle endurance was assessed via the plank test. This test has previously been used in studies of police officers (Grani et al., 2021; McGill et al., 2015). The participant was instructed to maintain a prone position in which the toes and forearms supported the bodyweight. We terminated the test when participants could not maintain their posture, or their pelvis moved up or down by 5 cm or more.
1RM bench press. Strength in the upper extremities was assessed via the 1RM bench press test. The test protocol followed the standards outlined by Hoffman and Collingwood (2015). The participant was instructed to grip the barbell at a width slightly wider than the shoulders and to lightly touch the chest with the barbell before returning to the top position while keeping the feet on the floor and the hips, upper back, and head on the bench.

PCT
The PCT is a well-standardized job-specific test that has been evaluated and validated several times (Mol and Visser, 2004; Strating et al., 2010). Participants performed various physical tasks that represent those police officers encounter on duty, such as running, moving over obstacles, and transferring (heavy) objects. In each of the five rounds that formed the entire course, the participant had to run from task to task and then return to the starting point to begin the next round. Participants had to meet the time standard for their gender and age to complete the five rounds. These time standards were based on the minimum fitness demand for duty. The following tasks are included in the PCT:
• moving over a vaulting box in a broad direction;
• moving over a vaulting box (length) in a longitudinal direction;
• moving over Swedish gymnastic benches;
• pushing a handcart of 200 kg for 6 m;
• pulling a handcart of 200 kg for 6 m;
• moving three medicine balls weighing 5 kg.

Measurements
Performance scores of the ProfFit included the time in seconds for the components 500 m rowing and 1,600 m running, and the number of repetitions in 1 minute for the strength components (squat, push-up, sit-up, burpee, pull-up). Performance scores on the extra fitness tests included: the Wingate 30-s power output (W/kg), highest jump height (cm) over three jump trials, mean sit-and-reach distance (cm) over three attempts, mean handgrip strength (decanewton (daN)) over three attempts, holding time (s) on the plank test, and weight (kg) on the 1RM test.
The intensity on the ProfFit and PCT was evaluated using ratings of perceived exertion (RPE) and percentage of maximum heart rate (%HRmax). A Borg Rating of Perceived Exertion Scale was used to determine participants' RPE (Borg, 1998). Participants indicated how much physical effort it took to perform the test components on a scale of 1 (very light) to 10 (very hard). Participants indicated their RPE score by marking it on a hard copy scale immediately after performing a component of the ProfFit. During the PCT, participants indicated their RPE scores by calling out their effort score to the researcher on passing the end point of each round. A large poster showing the Borg scale was put on the wall at the end/start point for reference. %HRmax was determined using a Polar H10 heart rate monitor (Polar Electro Oy, Finland). This is a physiological monitoring device that was strapped to the participant's chest. The Polar H10 heart rate monitor has demonstrated good accuracy in heart rate measurements during exercise (Gilgen-Ammann et al., 2019). Heart rate data were recorded in the field on the Polar device and downloaded for each participant as a time-stamped, second-by-second heart-rate data sheet to be analyzed offline. Participants' maximum heart rates were estimated using the 220 − age formula (Astrand and Ryhming, 1954), which allowed the heart-rate values of each participant to be converted into %HRmax values. Members of the research team used time records to identify the start and end times for each of the components in ProfFit and each round in PCT. For each of these events, we calculated the average %HRmax by averaging %HRmax values between the start and end times.
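The %HRmax calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis code: the function names, the `(timestamp, bpm)` sample layout, and the example values are assumptions. Only the 220 − age estimate and the averaging between recorded start and end times come from the text.

```python
def percent_hrmax(hr_bpm, age_years):
    """Convert one heart-rate sample to %HRmax using the 220 - age estimate."""
    hr_max = 220 - age_years
    return 100.0 * hr_bpm / hr_max


def mean_percent_hrmax(samples, start_s, end_s, age_years):
    """Average %HRmax over one event (a ProfFit component or a PCT round).

    samples: list of (timestamp_s, hr_bpm) pairs, one per second (assumed layout)
    start_s, end_s: event boundaries taken from the time records (inclusive)
    """
    window = [hr for t, hr in samples if start_s <= t <= end_s]
    if not window:
        raise ValueError("no heart-rate samples inside the event window")
    return sum(percent_hrmax(hr, age_years) for hr in window) / len(window)


# Hypothetical 20-year-old recruit (estimated HRmax = 200 bpm).
example = [(0, 100), (1, 150), (2, 160), (3, 170), (4, 120)]
component_mean = mean_percent_hrmax(example, start_s=1, end_s=3, age_years=20)
```

Samples outside the start and end times, such as those recorded during rest periods, are excluded by the window filter.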

Statistical analyses
Relationship between ProfFit components and fitness characteristics. Multiple linear regression analyses were used to determine to what extent anaerobic capacity, strength of the lower extremities, strength of the upper extremities, flexibility, maximum isometric strength, and muscle endurance of the trunk muscles were predictive of the score for each ProfFit test component. Sex and age were used as control variables because there are corrections for sex and age in the ProfFit and numerous studies have documented sex and age differences in the physical performance of law enforcement populations (Dawes et al., 2017; Lockie et al., 2018, 2019; Strating et al., 2010). The alpha level for significance was set at .05. Regression analyses were performed using a forward stepwise method (for more information, see Twisk, 2003, 2007). All possible predictors were first examined regarding their relationship with the dependent variable using simple regression analyses. The predictor that best predicted the dependent variable, based on the lowest p-value (provided that p < .10), was then included in an initial model with only the constant and this predictor. The next step was to include a second predictor, which was retained in the model if it significantly increased the proportion of the variance in the dependent variable explained by the model. This procedure continued until all variables were examined and included or rejected.
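The forward selection logic can be illustrated with a small, dependency-free Python sketch. It is not the procedure as run in the study: to avoid needing an F-distribution routine, it swaps the p-value criteria for a fixed partial-F "F-to-enter" threshold (`f_to_enter`, an assumed value of 4.0, roughly the .05 critical value for one added predictor with about 60 residual degrees of freedom). All function and variable names are hypothetical.

```python
def ols_rss(columns, y):
    """Residual sum of squares for y regressed on an intercept plus `columns`."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    # (assumes the columns are not collinear).
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def forward_select(predictors, y, controls=(), f_to_enter=4.0):
    """Return predictor names in order of entry.

    predictors: dict mapping name -> list of values (candidate fitness tests)
    controls:   columns always kept in the model (e.g. sex and age)
    f_to_enter: assumed partial-F threshold standing in for the p-value rule
    """
    n = len(y)
    in_model = list(controls)
    selected = []
    remaining = dict(predictors)
    rss_current = ols_rss(in_model, y)
    while remaining:
        best = None  # (partial F, name, RSS with this predictor added)
        for name, col in remaining.items():
            df_resid = n - (len(in_model) + 1) - 1
            rss_new = ols_rss(in_model + [col], y)
            if df_resid <= 0 or rss_new <= 0:
                continue
            f_stat = (rss_current - rss_new) / (rss_new / df_resid)
            if best is None or f_stat > best[0]:
                best = (f_stat, name, rss_new)
        if best is None or best[0] < f_to_enter:
            break  # no remaining predictor adds enough explained variance
        _, name, rss_current = best
        in_model.append(remaining.pop(name))
        selected.append(name)
    return selected
```

At each step the candidate with the largest partial F enters, which for a single added predictor is the same candidate that has the lowest p-value. A statistics package would normally report the exact p-values that the text refers to.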
Independent t-tests were used to determine whether recruits who had passed a particular ProfFit component scored higher on the relevant extra fitness test than recruits who had not passed that component. If the analysis showed that a certain aspect of fitness (measured with the extra fitness tests) was predictive of the score on the ProfFit test component, and we found a significant difference in that fitness characteristic between recruits who passed the ProfFit test component and those who did not, we concluded that the ProfFit component taxes the fitness characteristic concerned.
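As an illustration of the pass/fail comparison, the sketch below computes an independent-samples t statistic in Python. The text does not state whether equal variances were assumed, so the pooled-variance (Student) form used here is an assumption, and the function name and example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(passed, failed):
    """Pooled-variance t statistic and degrees of freedom for two groups.

    passed / failed: scores on an extra fitness test for recruits who did
    and did not meet the standard on a ProfFit component.
    """
    n1, n2 = len(passed), len(failed)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(passed) + (n2 - 1) * variance(failed)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(passed) - mean(failed)) / standard_error, df


# Hypothetical scores, for illustration only.
t_stat, df = independent_t([10, 12, 14], [6, 8, 10])
```

A positive t statistic here means the group that passed scored higher on the extra fitness test; the p-value would then be read from the t distribution with the returned degrees of freedom.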
Comparison of objective and subjective load between ProfFit and PCT. We conducted one paired samples t-test to compare mean %HRmax over all five rounds (entire test) of the PCT with mean %HRmax over all components of

Relationship between ProfFit components and fitness characteristics
Descriptive data for the ProfFit components and extra fitness tests are shown in Table 1. Multiple linear regression data and independent t-test data of the ProfFit components and extra fitness tests are shown in Table 2.
Multiple linear regression data and independent t-test data were used to investigate whether recruits should possess an all-round range of fitness characteristics to meet the standards of the ProfFit. Analyses revealed that for each ProfFit component, at least one extra fitness test significantly increased the proportion of the variance in the dependent variable explained by the model, on top of the control variables sex and age.
The multiple regression analyses showed that anaerobic capacity and isometric strength were significant predictors for 500 m rowing, explaining 61.0% of the variance in rowing scores, F(2,67) = 24.51, p < .001. In addition, the independent t-tests revealed that recruits who had passed the 500 m rowing test had a significantly higher anaerobic capacity and isometric strength than those who had not passed this component (Table 2). Strength in the lower extremities was the single significant predictor for performance on the ProfFit component squats, explaining 31.7% of the variance in squat scores, F(1,55) = 24.11, p < .001. In addition, recruits who had passed the squat component had significantly greater strength in the lower extremities than recruits who had not passed (Table 2). Strength in the upper extremities was the single significant predictor for push-ups, explaining 21.1% of the variance in push-up scores, F(1,56) = 14.05, p < .001. In addition, recruits who had passed the push-up component had significantly greater strength in the upper extremities than recruits who had not passed (Table 2). Endurance of the trunk muscles was the single significant predictor for sit-ups, explaining 29.2% of the variance in sit-up scores, F(1,55) = 10.94, p < .001. There was, however, no significant difference in endurance of the trunk muscles between recruits who passed the sit-up component and those who did not (Table 2). Strength in the upper extremities, flexibility, and endurance of the trunk muscles were significant predictors for 1,600 m running, explaining 22.1% of the variance in the 1,600 m running scores, F(3,52) = 4.60, p < .05. Only endurance of the trunk muscles was significantly better for recruits who passed the 1,600 m running component compared with those who did not. None of the reported extra tests were predictive of the ProfFit components burpees and pull-ups.
Collectively, the results indicate that the ProfFit measures an all-round range of fitness characteristics, namely anaerobic capacity, isometric strength, strength lower extremities, strength upper extremities, and endurance of the trunk muscles.

Comparison of objective and subjective load between ProfFit and PCT
Participants' physiological variables are presented in Table 3, which shows intensity expressed in %HRmax and RPE scores for each component of the ProfFit as well as for rounds 1 to 5 of the PCT.
Table 2. Multiple regression analysis with forward selection method and means, standard deviations and t-test statistics for recruits who passed the predicting components and those who did not.

To compare participants' intensity of performing the ProfFit and that of performing the PCT, paired samples t-tests were conducted. The analysis revealed that mean %HRmax during the performance over all five rounds (entire test) of the PCT (M = 89.55, SD = 5.59) was significantly higher than %HRmax during the performance over all components (excluding rest periods) of the ProfFit (M = 86.34, SD = 6.09), t(30) = −17.53, p < .001. This indicates that the PCT poses a higher intensity compared with the ProfFit. By contrast, the RPE scores during the ProfFit (M = 5.83, SD = 1.36) were significantly higher than the mean RPE scores during the PCT (M = 5.13, SD = 1.05), t(30) = −2.88, p < .005. This indicates that the ProfFit was experienced as more demanding by the participants compared with the PCT.
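For readers less familiar with the paired samples t-test, the following Python sketch shows how the statistic is formed from per-participant differences. The function name and example values are hypothetical and are not study data. With 31 paired observations the test has 30 degrees of freedom, which matches the t(30) reported above.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired-samples t statistic and degrees of freedom.

    first / second: per-participant values under two conditions
    (e.g. mean %HRmax on the ProfFit and on the PCT), in the same order.
    """
    if len(first) != len(second):
        raise ValueError("both conditions need one value per participant")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical values for three participants, for illustration only.
t_stat, df = paired_t([5, 7, 9], [4, 5, 6])
```

The sign of the statistic depends only on which condition is entered first, so a negative t value indicates that the second condition had the higher mean.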

Discussion
Police academies may value optimal fitness levels for recruits and graduates beyond the minimum fitness demand for duty. This study aimed to investigate the extent to which a fitness test (ProfFit) demands all-round fitness and measures higher fitness levels than the standard practice job-specific fitness test (PCT), which benchmarks the minimum fitness demand. Results showed that the ProfFit test measures several characteristics of fitness in different test components, namely anaerobic power, explosiveness/strength of the lower extremities, maximum strength of the upper extremities, isometric strength, and muscular endurance of the trunk muscles. This indicates that the ProfFit measures an all-round range of fitness characteristics, as is the aim of the police academy. Yet, flexibility is not tested in the ProfFit. Research has shown that increased flexibility leads to more efficient movement performance during physical exertion and a reduced risk of injury (Rahnama et al., 2005; Watsford et al., 2010). Therefore, police academies may consider including a component that measures flexibility to make their test even more all-round. When interpreting these results, it must be considered that aerobic endurance was not assessed as a separate fitness characteristic, which is a limitation of the study. However, there is compelling evidence from the literature that both the 1,600 m running and 500 m rowing components measure aerobic power (endurance) (Anderson, 1992; Tomkinson and Olds, 2007), and as such we are confident that aerobic endurance is also taxed in the ProfFit test.
None of the separate fitness characteristics were related to performance on the ProfFit components burpees and pull-ups. For the burpees, this may be explained by their composite nature. A burpee consists of several components, basically a push-up and a jump (Soro et al., 2019). As a result, many different muscle groups are involved in the performance, namely the arms, chest, trunk, and leg muscles, but none of these muscle groups seems to play a dominant role in the performance of the burpees. The lack of predictors for the pull-up component may be a consequence of anthropometric properties that influence performance. Forearm length has been shown to affect performance (Johnson et al., 2009), and this anthropometric influence may obscure or override the influence of fitness characteristics on performance. Moreover, literature and practical experience showed that relatively many individuals cannot perform a single correct repetition of a pull-up (Sanchez-Moreno et al., 2016). This leads to null scores and is associated with decreased training motivation (Koedijker et al., 2011). Based on the impact of unmalleable anthropometric characteristics and the issue of null results, the pull-up does not seem to be the most suitable component for a test that aims to promote fitness in recruits.
Recruits experience the ProfFit as more demanding than the PCT, although the PCT poses a higher cardiovascular intensity. In the case of the ProfFit, there might be a systematic underestimation of the cardiovascular intensity because recruits had rest periods (albeit short ones) between the components. Their heart rates will have recovered slightly in these periods. In the PCT, there is only one phase where the heart rate lags slightly behind the oxygen requirement (namely at the start of the first round). In the ProfFit, there are six such phases (beginning of each test component after a short rest period), possibly explaining why the PCT posed a higher cardiovascular intensity than the ProfFit.
There are several explanations why the ProfFit is experienced as more demanding. A few specific components in the ProfFit seem to be especially demanding for recruits. The 500 m rowing, burpees, and 1,600 m run components had the highest intensity in the ProfFit (see Table 3). These components involve many muscle groups, and as such contain an endurance component, thus placing a large cardiovascular load on the recruits (Anderson, 1992). The RPE scores showed a similar pattern to the % HRmax values because recruits experienced 500 m rowing, burpees, and 1,600 m running as most intensive. Taken together, these results suggest that 500 m rowing, burpees, and 1,600 m running are objectively and subjectively challenging for recruits, indicating that these components are notably suitable for police academies aiming to challenge their recruits physically. Another interesting suggestion that may explain why the ProfFit is perceived as more demanding than the PCT is that recruits seem to have less self-control over their effort and cannot compensate for their relatively weak fitness characteristics with strong fitness characteristics. The PCT is an obstacle course that requires continuous effort, but which recruits can regulate. As a result, recruits feel that they have control over speed and effort during the test. For example, recruits may choose to run faster from obstacle to obstacle to compensate for lost time on obstacles on which they are relatively weak (Tucker and Noakes, 2009). The ProfFit, however, has an interval character in which isolated parts are performed within a standardized time. Recruits must show satisfactory performance on every component. In other words, the recruits must show satisfactory fitness levels on all fitness characteristics relevant for police work, with no possibility to compensate between them.
In practice, police academies may have different motives for challenging recruits to maintain and (further) improve their fitness, such as a higher likelihood of recruits graduating, less likelihood of recruits getting injured or sick, and more efficient training at the academy (Korre et al., 2019;MacMillan et al., 2017;Orr et al., 2016). Based on the idea that testing and evaluation stimulate recruits' engagement in training activities (Kromann et al., 2009;Roediger et al., 2011;Ross et al., 2006), it can be suggested that police academies that offer challenging tests on a regular basis encourage recruits to continue working toward better individual fitness levels. It should be borne in mind that to explicitly test improvements in fitness, recruits would need to be tested twice (before and after a fitness test or training paradigm), or two groups that did and did not participate in the test that challenged them to optimize their fitness would need to be compared. Nonetheless, we argue that if police academies want to encourage optimum fitness, a first step can be taken by offering fitness tests that go beyond the minimum fitness standards in current job-specific tests and appeal to a diverse range of fitness characteristics relevant to police work.
Police (fitness) instructors and police academies that want to use fitness testing for the purpose of challenging recruits physically have to critically consider whether their test facilitates and stimulates this. The current study provides a comprehensive approach for such an evaluation. Tests that aim to challenge recruits physically can be further enhanced by identifying, on the one hand, elements that increase the challenging character of the test (e.g. adding physically demanding components to the test practice, such as 500 m rowing in this study) and, on the other hand, elements that hinder recruits in their motivation (such as the pull-up component in this study).
Police recruits who engage in higher levels of physical activity and are more physically fit have reduced risk of sustaining an injury (Nabeel et al., 2007;Orr et al., 2021). These findings advocate for better officer health and fitness standards to reduce the risk of on-the-job injuries and absenteeism (Boyce et al., 1991;Orr et al., 2021). However, setting higher standards or introducing complex new tasks may result in higher injury rates among recruits due to increased physical conditioning requirements, reduced opportunity for recovery, and increased risk of overtraining (Orr and Pope, 2015). It is thus about finding an optimal balance. As such, further work is required to determine the effects of further promoting physical fitness levels using fitness testing on the risk of injury, risk of overtraining, and engagement in training of recruits.

Conclusion
The current study thoroughly examined a fitness test that aims to evaluate fitness levels beyond the minimum demand for duty. Overall, we found clear indications that the ProfFit physically challenges police recruits to become optimally fit. The results showed that the test measured an all-round range of fitness characteristics in which weakness in one characteristic cannot be compensated for by strength in others. Moreover, the ProfFit was experienced as physically more demanding than the current practice job-specific test, indicating that the test, at least in the perception of recruits, requires a higher level of fitness than the minimum fitness demand for duty. These findings provide empirical support for tests that aim to challenge police recruits and improve their fitness levels, not only to get them just fit for duty, but rather to get them optimally fit for duty.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.