Arbitration between model-free and model-based control is not affected by transient changes in tonic serotonin levels

Background: Serotonin has been suggested to modulate decision-making by influencing the arbitration between model-based and model-free control. Disruptions in these control mechanisms are involved in mental disorders such as drug dependence or obsessive-compulsive disorder. While previous reports indicate that lower brain serotonin levels reduce model-based control, it remains unknown whether increased serotonergic availability in turn increases model-based control. Moreover, the mediating neural mechanisms have not yet been studied.
Aim: The first aim of this study was to investigate whether increased or decreased tonic serotonin levels affect the arbitration between model-free and model-based control. Second, we aimed to identify the underlying neural processes.
Methods: We employed a sequential two-stage Markov decision task and measured brain responses with functional magnetic resonance imaging in 98 participants in a randomized, double-blind, cross-over within-subject design. To investigate the influence of serotonin on the balance between model-free and model-based control, we used a tryptophan intervention with three intervention levels (loading, balanced, depletion). We hypothesized that model-based behaviour would increase with higher serotonin levels.
Results: We found evidence that neither model-free nor model-based control was affected by changes in tonic serotonin levels. Furthermore, our tryptophan intervention did not elicit relevant changes in blood-oxygenation-level-dependent (BOLD) activity.


Exclusion criteria for screening and during study
• Any condition posing a safety issue for the fMRI scan (e.g. metallic implants, cardiac pacemakers, pregnancy)
• Corrected binocular visual acuity below 0.8 (to ensure a sufficient view of the visual cues in the paradigm)
• Any psychiatric disorder requiring pharmacological treatment within the last year
• Lifetime history of one of the following: organic psychiatric disorders (F0), opiate, cocaine, stimulant, hallucinogen, inhalant, or poly-substance dependence, schizophrenia or related disorders (F2), affective disorders (F3)
• Current somatic disease requiring treatment with drugs affecting the central nervous system
We further excluded participants who failed to show up sober, who did not tolerate the mixture (strong aversion, vomiting) or who were currently planning to become pregnant. Light and heavy smokers were accepted; we asked them not to smoke for at least one hour before the fMRI session. Participants were allowed to take regular medication (except psychiatric and antiepileptic drugs, opiates or any other drugs affecting the brain).

Exclusion for behavioural analyses:
For analysis, we had to exclude five participants due to task unresponsiveness (more than 30% missing trials in at least one session), two due to technical issues during data acquisition, and two due to an intervention randomization error in one session. This resulted in a sample of 98 participants for behavioural analysis.

Exclusion for fMRI analyses because of bad data quality:
For fMRI analysis, we excluded another 10 participants due to insufficient data quality. We initially performed a visual quality control of the acquired data and excluded 5 participants with strong unrepairable artifacts such as severe ghosting, spiking or cut-off brain images. All other fMRI images were repaired with the ArtRepair toolbox. The repair was data-driven (i.e. the extent of repair depended on the actual data quality). We excluded four participants who had more than 30% repaired slices in at least one session and one further participant with very high scan-to-scan motion.
This resulted in a sample of 88 participants for the fMRI analyses.

Demographic data
The sample used for the behavioural analyses included 61 male (62.2%) and 37 female (37.8%) participants, aged 20 to 42 years (mean = 32.2 years, SD = 6.1 years). During the baseline visit, we measured the participants' body weight, which was needed to calculate the amino acid dosage (cf. Supplement section "Tryptophan intervention and blood sampling"). The participants' weight ranged between 50 and 139 kg, with a mean body weight of 75.1 kg (SD = 15.8 kg). The mean body mass index was 24.2 kg/m². As expected, women were lighter (mean weight = 65.0 kg, SD = 9.8 kg) than men (mean weight = 81.2 kg, SD = 15.7 kg). As seen in Table S1, 53 participants (54%) had an academic degree or higher. A majority of 76% were non-smokers or ex-smokers (nicotine abstinence for at least three months prior to baseline).
The sample used for the fMRI analyses included 52 male (59%) and 36 female (41%) participants, aged 20 to 42 years (mean = 31.8 years, SD = 6.1 years). 50 participants (57%) had an academic degree.

Genotyping
It has been hypothesized that serotonergic functioning is influenced by the expression of a serotonin transporter in the cell membrane. This expression is genetically controlled (Meltzer and Arora, 1988). The gene coding for the serotonin transporter is the SLC6A4 gene on the long arm of chromosome 17. This gene has a short (S) and a long (L) allele.
Homozygous carriers of the short allele are thought to have a lower transcription rate (and therefore lower expression) of the serotonin transporter than homozygous carriers of the long allele. More recent studies found that the L allele itself varies through an adenine-guanine substitution (SNP rs25531), resulting in a triallelic locus (S, LA, LG). The LG variant has been shown to have the same transcriptional properties as the S allele in vitro (Hu et al., 2006).
Hence, we can group the genotypes by their transcriptional efficiency into three categories: high transcriptional efficiency (LA/LA), low transcriptional efficiency (S/S, S/LG, LG/LG) and heterozygous participants with a mixed serotonin transporter expression (S/LA, LA/LG). Here, we refer to this classification as the functional genotype. We decided to use the triallelic locus for the genotype. Ten participants were misassigned to the study group. Nine participants were homozygous only under biallelic testing but not at the triallelic locus (LA/LG) and should not have been included. One participant was heterozygous at both loci (S/LA) and was invited to the main study only because of a miscommunication with the laboratory. Including these participants does not influence the interpretation of our results.
All participants were genotyped at the time of the baseline visit. The samples were collected in 9 ml tubes coated with ethylenediaminetetraacetic acid (EDTA) (Sarstedt, Nümbrecht, Germany) and then immediately frozen in a -80°C freezer. The distribution of the different allelic variations is shown in Figure S2.
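The grouping into functional genotypes can be written as a small lookup. The following is a minimal sketch and not the authors' code: the function name `functional_genotype` and the labels "high", "mixed" and "low" are hypothetical, and the rule simply counts how many of the two alleles belong to the low-expressing set (S or LG).

```python
# Minimal sketch of the functional-genotype grouping (hypothetical helper,
# not taken from the study's analysis code).

LOW_EXPRESSING = {"S", "LG"}   # LG behaves like S in vitro (Hu et al., 2006)
HIGH_EXPRESSING = {"LA"}


def functional_genotype(allele_1: str, allele_2: str) -> str:
    """Return 'high', 'mixed' or 'low' for a triallelic 5-HTTLPR genotype."""
    alleles = (allele_1.upper(), allele_2.upper())
    for allele in alleles:
        if allele not in LOW_EXPRESSING | HIGH_EXPRESSING:
            raise ValueError(f"unknown allele: {allele!r}")
    # Count low-expressing alleles: 0 -> LA/LA, 1 -> S/LA or LA/LG, 2 -> S/S, S/LG, LG/LG
    n_low = sum(allele in LOW_EXPRESSING for allele in alleles)
    return {0: "high", 1: "mixed", 2: "low"}[n_low]
```

Under this rule, an LA/LG carrier falls into the mixed group even though a biallelic test would report L/L, which is the situation described for nine of the misassigned participants.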

Tryptophan intervention and blood sampling
We asked our participants to abstain from protein-rich foods for 24 hours before the start of the study and to fast overnight. All amino acid mixtures contained the same (body-weight-adapted) amount of large neutral amino acids (LNAA) but differed in the amount of tryptophan. The amino acids were body-weight adapted from the outset (as detailed in Table S2).
To ensure the effectiveness of the intervention, we took blood samples four times during the study. The first sample was taken before ingestion of the amino acids as a reference (T0), the second 1 h later (T1), the third shortly before the fMRI scan (approximately 3 to 3.5 h after ingestion of the amino acids, T2) and the fourth after the fMRI scan (approximately 6 h after drinking the mixture, T3). If blood sampling was not possible or only partially possible, we continued the study without blood samples. The samples were collected in 9 ml tubes coated with ethylenediaminetetraacetic acid (EDTA) (Sarstedt, Nümbrecht, Germany).
Immediately after collection, the sample was centrifuged at 4000 g and 4°C for 10 minutes; the plasma was then separated and stored in the -81°C freezer until analysis.
Analyses were conducted at the Department of Chemistry and Food Chemistry of the Technische Universität Dresden as described previously (Henle et al., 1991).
For the analysis of the tryptophan blood levels, we used the same procedure as described by Neukam et al. (2018). In short, we calculated the ratio between the tryptophan plasma level and the sum of the other large neutral amino acids (ΣLNAA), which we consider more relevant for the interpretation of our results than the tryptophan levels alone (cf. Fernstrom and Wurtman, 1971). To account for intraindividual differences in baseline blood levels, all obtained blood levels were normalised by subtracting the reference measure (T0) from the other time points (T1, T2, T3). We integrated over all time points to compute area under the curve (AUC) scores, which were then entered into a repeated-measures ANOVA as the dependent variable, with intervention as a within-subject factor.
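The ratio, baseline correction and AUC steps can be illustrated as follows. This is a sketch under stated assumptions rather than the procedure of Neukam et al. (2018): the text does not name the integration rule, so the trapezoidal rule is assumed here, and the function names and the sampling times passed in by the caller are hypothetical.

```python
# Sketch of the TRP/ΣLNAA ratio and a baseline-corrected AUC.
# Assumption: trapezoidal integration; the source only says the values
# were "integrated over all time points".
from typing import Sequence


def trp_lnaa_ratio(trp: float, competitors: Sequence[float]) -> float:
    """Ratio of plasma tryptophan to the sum of its competing LNAAs."""
    return trp / sum(competitors)


def baseline_corrected_auc(times_h: Sequence[float], ratios: Sequence[float]) -> float:
    """Subtract the T0 value from every time point, then integrate over time."""
    if len(times_h) != len(ratios) or len(ratios) < 2:
        raise ValueError("need matching time points and at least two samples")
    baseline = ratios[0]                          # reference measure T0
    deltas = [r - baseline for r in ratios]       # change relative to T0
    auc = 0.0
    for i in range(1, len(deltas)):
        width = times_h[i] - times_h[i - 1]
        auc += 0.5 * (deltas[i] + deltas[i - 1]) * width   # trapezoid area
    return auc
```

One AUC value per participant and session would then serve as the dependent variable of the repeated-measures ANOVA; a negative AUC indicates that the ratio fell below its pre-drink level.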

Computational model:
We set up the model as detailed by Otto et al. (2013). The paradigm consists of three states: one first stage (grey boxes, denoted $S$) and two possible second stages (either pair of colored boxes, denoted $S'_1$ and $S'_2$). At each stage, the participant can choose between two actions (denoted $a_1$ and $a_2$). The transition contingency between the first stage $S$ and the second stage $S'$ is predefined as follows: $P(S'_1 \mid S, a_1) = 0.7$ and $P(S'_2 \mid S, a_2) = 0.7$, or $P(S'_1 \mid S, a_2) = 0.3$ and $P(S'_2 \mid S, a_1) = 0.3$.
In the first-stage choice rule, $\mathrm{rep}(a)$ is an indicator function that equals one if action $a$ repeats the previous first-stage choice and zero otherwise.
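The hybrid learner described in this supplement can be sketched in a few lines. The code below is an illustration under assumptions, not the fitted model: all function and variable names are hypothetical, and because the first-stage choice equation is not reproduced in the text, the softmax is assumed to take the standard form exp(β_MF·Q_MF + β_MB·Q_MB + π·rep(a)), normalised over both actions.

```python
# Illustrative sketch of the hybrid MF/MB learner (hypothetical names).
import math

# Fixed transitions from the text: P(S'1|S,a1) = P(S'2|S,a2) = 0.7.
TRANSITION = [[0.7, 0.3],   # action a1 -> (S'1, S'2)
              [0.3, 0.7]]   # action a2 -> (S'1, S'2)


def model_based_values(q2):
    """Q_MB(S,a): transition-weighted best second-stage value."""
    return [sum(TRANSITION[a][s] * max(q2[s]) for s in (0, 1)) for a in (0, 1)]


def first_stage_choice_probs(q_mf, q_mb, beta_mf, beta_mb, pi, prev_choice):
    """Assumed softmax over weighted MF and MB values plus perseveration."""
    logits = []
    for a in (0, 1):
        rep = 1.0 if prev_choice == a else 0.0   # rep(a) indicator
        logits.append(beta_mf * q_mf[a] + beta_mb * q_mb[a] + pi * rep)
    peak = max(logits)                           # subtract max for stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def update_values(q1_mf, q2, a1, s2, a2, reward, alpha1, alpha2, lam):
    """Temporal-difference updates for one trial, modifying the lists in place."""
    q2_old = q2[s2][a2]
    # First stage: move toward the second-stage value, plus the
    # eligibility-weighted second-stage reward prediction error.
    q1_mf[a1] += alpha1 * (q2_old - q1_mf[a1]) + alpha1 * lam * (reward - q2_old)
    # Second stage: standard reward-prediction-error update.
    q2[s2][a2] += alpha2 * (reward - q2_old)
```

With λ = 0 the first-stage value learns only from the second-stage value; with λ = 1 the reward prediction error is passed back to the first stage in full.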

ROI masks
The masks were selected as described by Kroemer et al. (2014). In short, the masks of vmPFC, ventral striatum and dlPFC were taken from meta-analysis software (http://old.neurosynth.org/terms/) and the BrainMap database (Nielsen and Hansen, 2002), respectively. The vmPFC mask from neurosynth.org was smoothed and parts of the ACC were removed. The dlPFC mask was also smoothed. The ventral striatum mask from BrainMap was thresholded at p > .6.

Figure S3: ROI masks
Image despiking and motion correction
Procedure: For image repair, we used the ArtRepair toolbox for SPM. All volumes were subjected to a bad-slice correction step, which repairs slices with an unusual amount of data scattered outside the head (threshold: T = 7%) by interpolation, followed by slice-time correction (reference: middle slice) and realignment to the first volume of the run to correct for motion. Distortion correction based on the field map was then applied to the realigned EPI images. All distortion-corrected images were subjected to despiking (threshold: T = 4% signal change relative to the mean image) and an additional repair of large scan-to-scan motion (threshold: T = 0.5 mm/TR).

Pharmacological results of the tryptophan intervention
As seen in Table S3 and Figure S4, the tryptophan blood levels reached their maximum 3 hours after ingestion of the amino acid mixture in the loading condition and their minimum 3 hours after ingestion in the depletion condition. In the loading condition, we observed an increase in tryptophan peak levels of 339% relative to the baseline (pre-drink) measure.
After ATD, the tryptophan levels decreased by 31% relative to the baseline measure. To check the comparability of our study and interventional procedure with other dietary tryptophan loading and depletion studies, we compared the measured tryptophan and LNAA values with the literature. Dougherty et al. (2008) tested the effectiveness of either a 50 g or a 100 g amino acid drink, based on the formulation of Young et al. (1985), in 112 healthy adults. The primary finding of this study was that both the 100 g and the 50 g formulation led to robust changes in plasma tryptophan. Because a 75 kg participant (the mean weight of our participants) in our study received 48.75 g of large neutral amino acids plus a maximum of 5.25 g tryptophan in the loading condition, we can only compare our values with the 50 g formulation in the Dougherty study.
In the ATD condition, Dougherty et al. measured a tryptophan level of 22 µM with the 50 g total formulation, while we observed a mean tryptophan level of 22.56 µM. The sum of the tryptophan competitors was 1532 µM in the above cited paper and 1360 µM (excl. threonine, lysine and methionine) in our study. For BAL, Dougherty et al. measured a tryptophan level of 143 µM, while we observed a mean tryptophan level of 39.75 µM. The sum of the tryptophan competitors was 1431 µM in the above cited paper and 1334 µM (excl. threonine, lysine and methionine) in our study. For ATL, Dougherty et al. measured a tryptophan level of 491 µM with the 50 g total formulation, while we observed a mean tryptophan level of 147.62 µM. The sum of the tryptophan competitors was 1223 µM in the above cited paper and 1157 µM (excl. threonine, lysine and methionine) in our study. The fasting tryptophan concentration was between 56 µmol/l and 74 µmol/l in the Dougherty paper, which is higher than in our study (between 32.62 and 33.62 µmol/l).
The ANOVA revealed a significant main effect of the tryptophan intervention on TRP/ΣLNAA AUC values, F(2,166) = 658.99, p < .001. An additional contrast analysis showed a significant decrease of TRP/ΣLNAA AUC values in the tryptophan depletion condition compared to the balanced condition, F(1,83) = 75.22, p < .001, while we observed an increase in the acute tryptophan loading condition compared to the balanced condition, F(1,83) = 601.63, p < .001. In conclusion, our findings with the Moja-De formulation are in the same range as those in the above cited paper using the Young formulation, and the ANOVAs demonstrate significant changes in the TRP/ΣLNAA ratio as a parameter for brain serotonergic availability.

Behavioural data: Effect of genotype and order on 1st-stage choice repetition
To analyse the effect of the tryptophan intervention, we set up two repeated-measures ANOVAs, one with the model-free effect and one with the model-based effect as the dependent variable, and with intervention as the factor of interest. Furthermore, we added the 5-HTTLPR genotype and the order of the three interventions as between-subject factors to each ANOVA.

Figure S2: Distribution of different genetic variations in our sample. Frequency of allelic variations in all genotyped participants (A): S: short allele; L: long allele. Pie chart of allelic variations in all genotyped participants (B): Heterozygous participants were excluded from the study. Participants with S/S, LG/LG or S/LG genotype were merged into a group with low transcriptional activity of the serotonin transporter. L/L and LA/LA participants were considered as the group with high transcriptional activity of the serotonin transporter (functional genotype). Tables with the distribution of genotypes in the final sample for behavioral analyses (n = 98) (C, D).
The transition probabilities, including $P(S'_1 \mid S, a_2) = 0.3$ and $P(S'_2 \mid S, a_1) = 0.3$, remain fixed during the whole task. After each trial, the participant is rewarded according to a diffusing probability that stays within the boundaries of 0.25 and 0.75 for each second-stage action. The model is computed as a hybrid reinforcement learning model with weighted components for model-free (MF) and model-based (MB) control. The MF system updates state-action values on each trial by temporal difference learning, whereas the MB component takes the transition contingencies between the two stages into account. Notably, at the second stage there is only an MF component, which is driven by the reward in the current trial (denoted $r_t$) and the current state-action value $Q(S'_t, a)$. For the second stage it follows that
$$Q(S'_t, a) = Q(S'_{t-1}, a) + \alpha_2 \left[ r_t - Q(S'_{t-1}, a) \right]$$
Here, $r_t - Q(S'_{t-1}, a)$ represents the reward prediction error (RPE) and $0 < \alpha_2 < 1$ the learning rate at the second stage. At the first stage, value updating works similarly but with a different learning rate $0 < \alpha_1 < 1$ and an additional eligibility parameter $0 < \lambda < 1$ for stage-skipping value updating:
$$Q_{MF}(S_t, a) = Q_{MF}(S_{t-1}, a) + \alpha_1 \left[ Q_{MF}(S'_{t-1}, a) - Q_{MF}(S_{t-1}, a) \right] + \alpha_1 \lambda \left[ r_{t-1} - Q_{MF}(S'_{t-1}, a) \right]$$
The MB component learns by mapping state-action pairs to transition probabilities:
$$Q_{MB}(S_t, a) = P(S'_1 \mid S, a) \max_{a'} Q_{MF}(S'_{1,t-1}, a') + P(S'_2 \mid S, a) \max_{a'} Q_{MF}(S'_{2,t-1}, a')$$
The values are connected to choices via a softmax choice rule, which assigns a probability to each action according to a combination of the MF and MB components. The two components are weighted by the parameters $\beta_{MB}$ and $\beta_{MF}$. The tendency to repeat the previous first-stage option is captured by the perseveration parameter $\pi$. Together, these terms determine the probability of a single choice at the first stage.

Figure S4: Pharmacokinetics of the tryptophan intervention. (A) Percent change of blood tryptophan concentration over the course of the study relative to the baseline measurement (before ingestion of the amino acids). The graph shows an increase in blood tryptophan levels in the loading condition, while in the balanced and tryptophan depletion conditions the tryptophan concentration remained nearly stable during the study. (B) Percent change of the concentration of the tryptophan competitors (LNAA). Peak levels of LNAA were reached 3 h after ingestion. (C) Percent change of the TRP/ΣLNAA ratio as a measure of central serotonergic availability. Serotonergic availability peaked three hours after ingestion of the amino acids in the acute tryptophan loading condition (ATL) and reached its minimum after three hours in the depletion condition (ATD). Error bars: standard error of the mean.

Table S1: Percentage of highest reached education. For two participants the data is missing.

With the body-weight adaptation detailed in Table S2 below, a 75 kg participant (the mean body weight of our participants) received a total amount of 48.75 g of large neutral amino acids (LNAA). As women were lighter in our study, they generally received a smaller amount of LNAAs (42.25 g for a 65 kg female participant). In the depletion condition the mixture was completely tryptophan-free; in the balanced condition 7 mg tryptophan per kg body weight (0.525 g for a 75 kg participant) was added, whereas in the loading condition we added 70 mg tryptophan per kg body weight (5.25 g for a 75 kg participant). The three amino acid mixtures were provided by a commercial manufacturer (Amino-Factory, Lindenberg, Germany) in powder form and were dissolved in water and Sprite (without carbonation) about 1 hour before the start of the study. All included amino acids are listed in Table S2 below. We instructed our participants to drink the mixture as fast as possible. The mean time needed for drinking the mixture was 11 minutes (SD: 8 minutes).
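The dosing arithmetic can be checked with a short calculation. This is a sketch, not the study's dosing protocol: the per-kilogram LNAA factor of 0.65 g/kg is not stated in the text but is derived here from the two worked examples (48.75 g at 75 kg and 42.25 g at 65 kg), and the function and constant names are hypothetical.

```python
# Sketch of the body-weight-adapted dosing described in the text.
# Assumption: LNAA dose scales linearly with body weight at 0.65 g/kg,
# a factor derived from the examples 48.75 g / 75 kg and 42.25 g / 65 kg.

LNAA_G_PER_KG = 48.75 / 75.0

# Tryptophan added per kg body weight in each condition (from the text).
TRP_MG_PER_KG = {"depletion": 0.0, "balanced": 7.0, "loading": 70.0}


def mixture_dose(weight_kg: float, condition: str) -> tuple[float, float]:
    """Return (LNAA grams, tryptophan grams) for one participant and condition."""
    if condition not in TRP_MG_PER_KG:
        raise ValueError(f"unknown condition: {condition!r}")
    lnaa_g = LNAA_G_PER_KG * weight_kg
    trp_g = TRP_MG_PER_KG[condition] * weight_kg / 1000.0   # mg -> g
    return lnaa_g, trp_g


if __name__ == "__main__":
    for cond in ("depletion", "balanced", "loading"):
        lnaa, trp = mixture_dose(75.0, cond)
        print(f"{cond:>9}: {lnaa:.2f} g LNAA + {trp:.3f} g tryptophan")
```

For a 75 kg participant this reproduces the amounts given in the text: 48.75 g of LNAA in every condition, with 0 g, 0.525 g and 5.25 g of tryptophan for depletion, balanced and loading.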

Table S2: Composition of the amino acid mixture. The amino acid mixture consists of the same amount of LNAAs in each condition (depletion, balanced, loading). The amount of tryptophan varies between the conditions and is added separately to the mixture. Bold: All amino acids that are tryptophan competitors.

Table S3: Concentrations (µmol/l) of the amino acids for all three interventions with standard deviation. Bold: All amino acids that are tryptophan competitors.