Visualisation of Brain Statistics with R-packages ggseg and ggseg3d

There is an increased emphasis on visualizing neuroimaging results in more intuitive ways. Common statistical tools for dissemination, such as bar charts, lack the spatial dimension that is inherent in neuroimaging data. Here we present two packages for the statistical software R, ggseg and ggseg3d, that integrate this spatial component. The ggseg and ggseg3d packages visualize pre-defined brain segmentations as 2D polygons and 3D meshes, respectively. Both packages are integrated with other well-established R-packages, allowing great flexibility. In this tutorial, we present the main data and functions in the ggseg and ggseg3d packages for brain atlas visualization. The main highlighted functions display brain segmentation plots in R. Further, the accompanying ggsegExtra-package includes a wider collection of atlases, and is intended for community-based efforts to develop more atlases compatible with ggseg and ggseg3d. Overall, the ggseg-packages facilitate parcellation-based visualizations in R, improve and ease the dissemination of the results, and increase the efficiency of the workflows.


INTRODUCTION
Visualization is increasingly important for accurate guidance and interpretation of neuroimaging results, as current research is able to generate a large amount of data and outcomes. For Magnetic Resonance Imaging (MRI), neuroimaging software provides whole-brain information by using many small units of space (>100,000). Nonetheless, this data is often grouped and summarized into a limited number of regions using predefined brain parcellation atlases. Brain parcellations segment the brain into a finite set of meaningful neurobiological components, which reflect one or more brain features based on either structural or connectivity properties (Eickhoff et al. [2018]). The use of brain atlases is widespread as these facilitate interpretation and minimize the amount of data, hence reducing problems with multiple comparisons. This enables replicability and data sharing in otherwise computationally expensive analyses, which are often performed in specialized software environments such as R (R Core Team [2019]).
MRI data provides good spatial resolution and thus an optimal representation has to respect spatial relationships across regions. Results from brain atlas analyses are most meaningfully visualized when projected onto a representation of the brain, thus it is desirable that any visual representation takes this relation into account. The projection of data onto brain representations provides clear points of reference (especially when the reader is unfamiliar with the atlas), eases readability, guides interpretation, and conveys the spatial patterns of the data. Adopting the grammar of graphics implemented in ggplot2 (Wickham [2016]), one can plot neuroimaging data directly in R with several tools such as ggBrain (Fisher [2019]) and ggneuro (Muschelli [2017]; see neuroconductor [2018] for curated neuroimaging packages for R). Yet these tools display whole-brain image files and are not well-suited for representing brain atlas data.
In this tutorial, we introduce two packages for visualizing brain atlas data in R. The ggseg and ggseg3d packages, plus the complementary ggsegExtra package, include pre-compiled data sets for different brain atlases that allow for 2D and 3D visualization. The two-dimensional functionality in ggseg is based on polygons and the ggplot2-based grammar of graphics (Wickham [2016]), while the 3D functionality in ggseg3d is based on tri-surface mesh plots and plotly (Sievert [2018]). Both packages present compiled data sets, tailored functions that allow brain data integration and plotting, and other minor features such as custom colour palettes. The data featured in the packages are derived from two well-known parcellations: the Desikan-Killiany cortical atlas (DKT; Desikan et al. [2006]), which covers the cortical surface of the brain, and the Automatic Segmentation of Subcortical Structures (aseg; Fischl et al. [2002]), which covers the subcortical structures. Both atlases are implemented in several neuroimaging software packages, such as FreeSurfer (Fischl and Dale [2000]), and are commonly used in relation to developmental changes, disease biomarkers, genomic data, and cognition (Amlien et al. [2019], Walhovd et al. [2005], Pizzagalli et al. [2009]). The ggsegExtra package contains a collection of precompiled atlases (currently 15 additional atlases) and it is frequently updated. A summary of all available atlases compatible with the ggseg-packages as of December 2019 can be found in Table 1.

TUTORIAL
This tutorial will introduce the ggseg, ggseg3d, and ggsegExtra packages and familiarize the reader with the main functions and the general use of the packages. The tutorial will focus on the two main functions: ggseg() for plotting 2D polygons and ggseg3d() for plotting 3D brains based on tri-surface mesh plots.

Plotting polygon data (ggplot2)
ggseg() is the main function for plotting 2D data. By default, the function automatically plots the DKT atlas (see Figure 1). The ggseg()-function is a wrapper for geom_polygon() from ggplot2, and it can be built upon and combined like any ggplot-object. The image plot consists of a simple brain representation containing no extra information. Hence, ggseg plots can be easily complemented with any of the available ggplot2 features and options. We recommend that users familiarize themselves with ggplot2 (Wickham [2016]). The package is currently only available through GitHub, but we expect to submit the ggseg-package to The Comprehensive R Archive Network (Hornik [2012]) in 2020.
In addition to the standard options for ggplot2 polygon geoms, the function also has several options for plotting the main brain representations. These options are atlas-specific. For cortical atlases, such as the dkt, one can stack the hemispheres, display only the medial or lateral side, choose either one or both hemispheres, or any combination of hemisphere and view (see Figure 2 for examples). For subcortical atlases, such as the aseg atlas, the options are more limited but one can often choose between axial, sagittal, and coronal views.
For colour palettes corresponding to those used in the original neuroimaging software, one can use atlas-specific 'brain' palette scales (Figure 3).
# Figure 3
ggseg(mapping=aes(fill = area), colour="black") +
  scale_fill_brain("dkt") +
  theme(legend.justification=c(1,0),
        legend.position="bottom",
        legend.text = element_text(size = 5)) +
  guides(fill = guide_legend(ncol = 3))
Most users will use ggseg() to display, using a colour scale, some descriptive or inferential statistics, such as mean thickness or brain-cognition relationships across the different brain regions. Yet, before projecting the statistics onto the segments, one should explore the structure of the atlas data sets. The atlas data set structure will help users understand what incoming statistical data needs to look like. Note that each atlas corresponds to a unique data set. All data sets have a similar structure and contain key information regarding the atlas, the region names, and the coordinates for the segment polygons.
# Look at the top 5 rows of the dkt dataset
head(dkt, 5)
In any atlas, the column 'label' is particularly useful for combining the data of interest with the ggseg-polygons. The column 'label' contains the label (region) names as in the original neuroimaging software. For example, the DKT atlas label column matches the region names from FreeSurfer statistics table outputs. Yet, the data in ggseg is in a long format, that is, each region has a single row, and any data of interest needs to be in this same format. Often data sets are organized in wide format, in which subjects are represented by rows and each different data variable is represented in a separate column, and thus need to be rearranged to work with ggseg. See below an example of wide-to-long conversion.
Data in long format can then be used directly with the ggseg()-function, as the 'label' column corresponds in name and content with the 'label' column in the atlas data of dkt. The data must include a column that has the same column name and at least some data matching the values in the corresponding column in the atlas data. In the next example we create some data with 4 rows, with an 'area' and 'p' column, representing the results of a hypothetical analysis. The ggseg()-function will recognise the matching column 'area', and merge the supplied data into the atlas using dplyr-joins. We use the 'p' column as the column flooding the segment with colour.
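The wide-to-long conversion described above can be sketched as follows. This is a minimal illustration, assuming the tidyr package (version 1.0.0 or later) is installed; the data frame `wide_data`, its column names, and the value name `thickness` are made up for the example and are not taken from any atlas.

```r
library(tidyr)

# Hypothetical wide-format data: one row per subject, one column per region.
# The region column names only mimic FreeSurfer-style labels.
wide_data <- data.frame(
  id = c("sub-01", "sub-02"),
  lh_insula = c(3.1, 2.9),
  lh_precentral = c(2.5, 2.6),
  rh_insula = c(3.0, 3.2),
  stringsAsFactors = FALSE
)

# Gather every region column into two columns: 'label' holds the former
# column name and 'thickness' holds the value, giving one row per
# subject and region.
long_data <- pivot_longer(
  wide_data,
  cols = -id,
  names_to = "label",
  values_to = "thickness"
)

long_data
```

Naming the new column 'label' is a deliberate choice here, so that it can line up with the atlas column of the same name. For a single brain plot one would typically summarise the long data further (for example, one mean per label) so that each region ends up with one row.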
The appearance of the plot can then be modified similarly to any other ggplot2 graph using functions such as scales, labs, themes, etc., as seen in Figure 4.
# Make some mock data
someData = data.frame(
  area = c("transverse temporal", "insula", "pre central", "superior parietal"),
  p = sample(seq(0,.5,.001), 4),
  stringsAsFactors = FALSE)
# Figure 4
ggseg(.data=someData, mapping=aes(fill=p)) +
  labs(title="A nice plot title", fill="p-value") +
  scale_fill_gradient(low="firebrick", high="goldenrod")
If the results are only in one hemisphere, but you still want to plot both of them, make sure your data.frame includes the column 'hemi' with either 'right' or 'left'. In this case, data will be merged into the atlas both by 'area' and by 'hemi'. For more information about adapting data and viewing only one hemisphere or side, see the package vignettes.
Figure 4. Supplying data through the '.data' option in ggseg() enables the use of columns in the supplied data as aesthetic arguments (such as 'fill'). The ggseg plot can be used with any other polygon-compatible function from ggplot2 or ggplot2 extensions, for instance adding a title, or changing the legend name and the colour scheme with standard ggplot2 functions.

Creating subplots
There is often the need to plot a statistic of interest in different groups (e.g. thickness or brain-cognition relationships in young or older adults). This may also be obtained with ggseg(), using ggplot2's facet_wrap or facet_grid, with three guiding rules:
1) As before, data needs to be in long format with a column indexing which group the row corresponds to (group data should appear in separate rows, not in separate columns).
2) The data needs to be grouped using dplyr's group_by() function before providing the data to the ggseg()-function. The ggseg()-function will detect grouped data, and adapt it to facet's requirements.
3) Apply facet_wrap or facet_grid to the plot having used the above two rules.
An example of this can be seen in Figure 5, where a mock data set including summary statistics for two groups ('Young' and 'Old') is used when faceting a ggseg-plot.
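The data preparation behind the first two rules can be sketched as below. This is only an illustration, assuming dplyr is installed; the grouping column name `AgeG` and the mock values are hypothetical. The plotting call is left as a comment in the call style used elsewhere in this tutorial, since its exact behaviour depends on the installed ggseg version.

```r
library(dplyr)

set.seed(1)  # only so that the mock values are reproducible

# Rule 1: long format, one row per region and group, with a column
# ('AgeG', a made-up name) indexing the group each row belongs to.
regions <- c("transverse temporal", "insula", "pre central", "superior parietal")
someData <- data.frame(
  area = rep(regions, times = 2),
  p = sample(seq(0, .5, .001), 8),
  AgeG = rep(c("Young", "Old"), each = 4),
  stringsAsFactors = FALSE
) %>%
  # Rule 2: group the data before handing it to the plotting function.
  group_by(AgeG)

# Rule 3 (not run here): facet the plot by the grouping column, e.g.
# ggseg(.data = someData, mapping = aes(fill = p)) +
#   facet_wrap(~AgeG, ncol = 1)
```

Calling `group_vars(someData)` is a quick way to confirm that the grouping was registered before plotting.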

Plotting 3D mesh data
Representing brains as 2D polygons is a good solution for fast, efficient, and flexible plotting, and can be easily combined with interactive apps such as Shiny (Chang et al. [2019]). Yet, brains are intrinsically 3-dimensional and it can be challenging to recognize the location of a region in a flattened image. This problem is exacerbated in atlases that represent subcortical features, as they are 3-dimensional, while cortical structures, such as grey matter structures, can be flattened to 2 dimensions. In the 2D subcortical plot, for instance, there is no option to show only a single hemisphere. Furthermore, rather than showing lateral and medial surfaces, it shows an axial and a sagittal slice.
Hence, here we also provide the ggseg3d package to plot, view, and print 3D-atlases in R. ggseg3d is based on tri-surface mesh plots using plotly (Sievert [2018]). The data structure is more complex than the ggplot2 polygons, and includes additional options for brain inflation, glass brains, camera locations, etc. As ggseg3d is based on plotly, the resulting brain atlases are interactive, which guides interpretation and is useful for public dissemination. We recommend that users familiarize themselves with plotly (Sievert [2018]) when using this function.
Out-of-the-box, ggseg3d() plots the dkt_3d atlas on the 'LCBC' surface, but there are two more surfaces available for cortical atlases (Figure 7). The 'LCBC' surface consists of a semi-inflated white matter surface based on the fsaverage5 template subject. All [...] 3d atlases include a colour column that is based on the colour scheme used in the source neuroimaging software. The 3D-atlas data is stored in nested tibbles. Each cortical atlas has data sets for three different surfaces (see Figure 7) and the two hemispheres. Only one surface is available for subcortical atlases as inflation procedures are irrelevant. The 'ggseg_3d' column includes all necessary information for ggseg3d() to create a 3D mesh-plot, and should not be modified by the user. The additional 3D-atlases in ggsegExtra have the same data structure. It is important to note that the coordinates in the plot (X, Y, Z) are not any type of radiological coordinate system, but arbitrary Cartesian plot coordinates.
# remotes::install_github("LCBC-UiO/ggseg3d")
library(ggseg3d)

External data supply
As with the 2D atlases, the user will use ggseg3d() to display some descriptive or inferential statistic through a colour scale. If the data is not already in the correct long format, or does not use the same naming as the atlas, users should inspect the atlas data for a specific surface (and hemisphere, if desired), and then unnest(ggseg_3d) it to see how the atlas data is organised.
# Select surface and hemisphere, and then unnest to inspect the atlas data
dkt_3d %>%
  filter(surf == "inflated" & hemi == "right") %>%
  unnest(ggseg_3d) %>%
  head(5)
Note the mesh column, which contains lists. Each list corresponds to a region and contains 6 vectors required to create the mesh of the tri-surface plot. It should also be noted that the 'label', 'annot' and 'area' columns could provide matching values for your own data. Similarly to the ggseg()-function, the 'label' column should match the region names used in the original neuroimaging software, while 'area' and 'annot' provide alternative/secondary names. It is thus important to match your regional identifiers with those used in the atlas. To colour the segments using a column from the data, a column name from the data needs to be supplied to the colour option; providing it to the text option will add another line to the plotly hover information.
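One way to check that regional identifiers match those in an atlas before plotting is sketched below. It assumes dplyr and tidyr (1.0.0 or later) are installed. To keep the example self-contained it uses a small mock nested tibble, `mock_atlas`, that only imitates the nested layout described above; the region values and the user data `my_data` are hypothetical, and the real atlas columns may differ.

```r
library(dplyr)
library(tidyr)

# Mock stand-in for a nested 3D atlas: one row per surface and hemisphere,
# with the per-region information stored in a list-column.
mock_atlas <- tibble(
  surf = "inflated",
  hemi = c("left", "right"),
  ggseg_3d = list(
    tibble(area  = c("insula", "superior parietal"),
           label = c("lh_insula", "lh_superiorparietal")),
    tibble(area  = c("insula", "superior parietal"),
           label = c("rh_insula", "rh_superiorparietal"))
  )
)

# Select one surface and hemisphere, then unnest to get one row per region
atlas_regions <- mock_atlas %>%
  filter(surf == "inflated", hemi == "right") %>%
  unnest(cols = ggseg_3d)

# Hypothetical user data; the second label is deliberately misspelled
my_data <- tibble(
  label = c("rh_insula", "rh_superiorParietal"),
  p = c(0.01, 0.20)
)

# Rows of the user data that have no counterpart in the atlas
unmatched <- anti_join(my_data, atlas_regions, by = "label")
unmatched
```

Any rows returned in `unmatched` would not be coloured in a plot, so they point to identifiers that need renaming before the data is supplied.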

Customizing colours and the colour bar
You can provide custom colour palettes either as hex codes or R colour names, as seen in Figure 8. Colours will be evenly spaced when creating the colour-scale. A palette may also be supplied as a named numeric vector, where the vector names are the colours that users wish to use, and the numeric values are the breakpoints for each colour (e.g. c("red" = 0, "white" = 0.5, "blue" = 1)). This way the users can control the minimum and maximum values of the colour scale, and also how the gradient is applied. To use a colour other than the default gray for the NA regions, supply 'na.colour', either as a hex colour or a colour name. This option only takes a single colour.
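To illustrate how breakpoints in a named numeric palette shape a gradient, the following base-R sketch linearly interpolates a colour for any value. The helper `colour_at()` is hypothetical and written only for this illustration; it is not part of ggseg3d, and the interpolation that ggseg3d and plotly apply internally may differ.

```r
# Named numeric palette as described in the text:
# names are the colours, values are the breakpoints.
palette <- c("red" = 0, "white" = 0.5, "blue" = 1)

# Hypothetical helper: linear interpolation between the palette colours.
# Values outside the breakpoint range are clamped to the end colours.
colour_at <- function(x, palette) {
  breaks <- unname(palette)
  channels <- grDevices::col2rgb(names(palette))  # 3 x n matrix: red, green, blue
  rgb_values <- vapply(
    1:3,
    function(ch) stats::approx(breaks, channels[ch, ], xout = x, rule = 2)$y,
    numeric(length(x))
  )
  rgb_values <- matrix(rgb_values, ncol = 3)  # one row per value, one column per channel
  grDevices::rgb(rgb_values[, 1], rgb_values[, 2], rgb_values[, 3],
                 maxColorValue = 255)
}

colour_at(c(0, 0.5, 1), palette)
# → "#FF0000" "#FFFFFF" "#0000FF"
```

Moving the middle breakpoint (for example to 0.05 for p-values) would compress the red-to-white part of the gradient into a narrow range, which is the kind of control the named-vector form is meant to give.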

Adding a glass brain
Subcortical atlases include cortical surfaces and other landmark structures for visualization purposes only. One can control the opacity of these NA structures to improve visualization. Glass brains can be added to provide a frame of reference for the subcortical structures with the function add_glassbrain() (Figure 9), which takes three extra arguments: hemisphere, colour, and opacity.
# Figure 9
ggseg3d(atlas = aseg_3d, na.alpha= .5) %>%
  add_glassbrain("left") %>%
  pan_camera("left lateral") %>%
  remove_axes()
Figure 9. For subcortical structure visualization, one can add a glass brain to the plot. This will help with locating the structures relative to the cortex, and make the plot easier to interpret. The glass brain is controlled by three options: opacity, hemisphere, and colour.
ggseg3d() is based on plotly and thus additional plotly functionalities can be used to modify and improve the 3D atlas representations. In addition to Carson Sievert's book on plotly in R ([2018]), we recommend resources for modifying axes in 3D plots (Damiba [2019a]), the basic introduction to tri-surface plots (Damiba [2019b]), and this tutorial on tri-surface plots with plotly in R (Riddihiman [2016]). Finally, we recommend the orca command-line tool to save ggseg3d atlas snapshots.

Additional atlases
The ggseg and ggseg3d packages have two atlases each, which are 2D and 3D variations of the same main atlases: the dkt (Desikan et al. [2006]) and aseg (Fischl et al. [2002]) atlases. These are, however, only two among many meaningful ways of segmenting the brain into different regions. Thus, the ggsegExtra package is a repository containing additional data sets for plotting with the ggseg and ggseg3d packages. There is an ever-increasing number of new atlases being created, as research and methods in neuroimaging analysis progress. The ggsegExtra-package is intended to be expanded as a community effort, as new and informative atlases are published. A small collection of the atlases currently in the ggsegExtra package may be viewed in Figure 10.

DISCUSSION
The main aim of the ggseg, ggseg3d, and ggsegExtra packages is to ease and streamline visualization of brain atlas data in R, by gathering a collection of atlases from several scientific sources and providing customized plotting functions. In this tutorial, we introduced the packages to the readers by presenting some use examples and highlighting the main functions and options that are available. As visualization tools, these packages add to the range of existing options, such as ggBrain (Fisher [2019]) and ggneuro (Muschelli [2017]) in R, and software-specific image viewers such as FSLeyes (McCarthy [2019]) and Freeview. In this regard, we do not aim to compete with software-specific visualizations or advocate for the superiority of the ggseg-packages as visualization tools. After all, flattened 2D polygons do not rely on a meaningful brain coordinate system and the units of information in 3D meshes are limited to the number of parcellations. On the contrary, we believe the ggseg niche among visualization tools resides in its simplicity and its ability to be combined with statistical analysis pipelines. The possibility to serve as an interactive tool for dissemination and reproducibility when combined with other technologies, such as Binder (Project Jupyter et al. [2018]) or Shiny (Chang et al. [2019]), is an added benefit. This is exemplified in the online supplementary information of Vidal-Piñeiro et al. ([2019]).
The three ggseg-packages contain three main features:
1) A collection of 2D-polygon and 3D-mesh brain parcellation atlases. The atlas data include the necessary coordinates for plotting, and include other information that should be recognizable for users.
2) The ggseg() and ggseg3d() functions for visualization. Both functions are flexible and well-adapted to their environment and can be combined with any additional argument from ggplot2 and plotly, respectively. ggseg() is a wrapper for geom_polygon() from ggplot2 and returns a ggplot2 object. ggseg3d() is a plotly wrapper function for tri-surface mesh plots which prints 3D atlases.
3) Complementary features, e.g. colour scales, and functions such as as_ggseg_atlas() and as_ggseg3d_atlas() to convert data into the correct atlas format.
These functions provide users with the possibility of adapting the plots to their wishes, and also make it possible to create and contribute to the atlas repository in ggsegExtra.
The foundations of the ggseg-packages trace back to the necessity of visualizing and exploring the lifespan trajectories of cortical thickness across different brain regions (see supplementary information in Vidal-Piñeiro et al. [2019]). That is, ggseg arose from the need to inspect and display brain information over time, i.e. including a spatial dimension and a time-varying factor, overcoming the constraints of printed journals and classical 2D plots (e.g. bar plots). The current state of science requires researchers to share the results of studies both in high detail and in an intuitive manner, as it permits communication to wide audiences and facilitates reproducibility. Hence, we believe this tool conforms to the essence of open science and invite users to improve the code, provide examples or tutorials, and contribute to the atlas collection according to their own interest and needs via the public ggseg GitHub repository, ggseg3d GitHub repository, and ggsegExtra GitHub repository.
Finally, while the ggseg-packages are circumscribed to brain parcellations, we believe that the structure and functions of the packages can be easily applied to any scientific field that benefits from data being displayed across the spatial dimension. We encourage readers to borrow the package functionalities and adapt them to their respective fields and structures of interest, as has already been done with the gganatogram-package (Maag [2018]).

PLANNED PACKAGE IMPROVEMENTS
In the ggsegExtra GitHub wiki, we offer a pipeline to create and supply atlases for 2D plotting. At the moment, the creation of an atlas for ggseg is convoluted and difficult, and requires manual intervention. We are in the process of designing a simple, straightforward pipeline to facilitate the creation of new ggseg (2D) atlases. The aim is to create a set of functions that will call specialized tools like FSL (Woolrich et al. [2009]), FreeSurfer (Fischl and Dale [2000]) and ImageMagick (Ooms [2019]) to detect the polygon vectors or the mesh segments given an MRI image containing a parcellation specification, and organize these into valid ggseg and ggseg3d atlases. We encourage users to contribute to the ggsegExtra brain atlas repository by including additional brain atlases.

CONCLUSION
Visualization is a fundamental aspect of neuroimaging to explore and understand data, guide interpretation, and communicate with colleagues and the general audience. In this tutorial, we have introduced the ggseg-packages, tools for visualizing brain statistics through brain parcellation atlases in R. This visualization tool easily combines with interactive routines as well as with diverse statistical analysis pipelines. We hope this tool and tutorial prove useful to neuroscientists and inspire others to apply the functions in a wide variety of fields and structures.

AUTHOR CONTRIBUTIONS
Didac Vidal-Piñeiro generated the idea for the tool, and the initial scripts for plot visualization. He has also been responsible for converting images from neuroimaging data to ggseg-like data (e.g. polygons and mesh data). Athanasia M. Mowinckel adapted the initial scripts and made the functions into package format and has continued developing the functions with the aim of increasing user-friendliness. She is also responsible for conceiving and adding the mesh-plot functionality through plotly, and developing the pipeline for making that possible. A. M. Mowinckel wrote the first draft of the paper, and both have since critically edited it.

CONFLICTS OF INTEREST
The authors declare that there were no conflicts of interest with respect to the authorship or the publication of this article.