Multivariate Analysis of Variegated Expression in Neurons
Hannah M Shoenhard, Michael Granato
Abstract
Behavioral screens in model organisms have greatly facilitated the identification of genes and genetic pathways that regulate defined behaviors. Identifying the neural circuitry via which specific genes function to modify behavior remains a significant challenge in the field. Tissue- and cell type-specific knockout, knockdown, and rescue experiments serve this purpose, yet in zebrafish screening through dozens of candidate cell-type-specific and brain-region specific driver lines for their ability to rescue a mutant phenotype remains a bottleneck. Here we report on an alternative strategy that takes advantage of the variegation often present in Gal4-driven UAS lines to express a rescue construct in a neuronal tissue-specific and variegated manner. We developed and validated a computational pipeline that identifies specific brain regions where expression levels of the variegated rescue construct correlate with rescue of a mutant phenotype, indicating that gene expression levels in these regions may causally influence behavior. We termed this unbiased correlative approach Multivariate Analysis of Variegated Expression in Neurons (MAVEN). The MAVEN strategy advances the user’s capacity to quickly identify candidate brain regions where gene function may be relevant to a behavioral phenotype. This allows the user to skip or greatly reduce screening for rescue and proceed to experimental validation of candidate brain regions via genetically targeted approaches. MAVEN thus facilitates identification of brain regions in which specific genes function to regulate larval zebrafish behavior.
Steps
Perform the mating cross for your experiment. Here, we describe an example cross for Gal4 x UAS-induced rescue of the loss-of-function phenotype. Depending on your strategy, your cross may differ. Cross fish that are heterozygous (or homozygous mutant, if available) for a loss-of-function mutation in the gene of interest. At least one fish should carry a Gal4 for the cell type of interest, and at least one should carry a UAS construct that expresses a tagged version of the target gene.
Assay phenotype of interest
Assay your phenotype of interest in your larvae (See Figure 2 of the associated paper). Sort the larvae according to their phenotype and keep larvae with different phenotypes separate.
Storage of Larvae (Optional)
After phenotyping but before fixation, larvae can be stored in individual wells in 100% methanol for 1-2 days while phenotypic analysis steps are completed. It is entirely possible to skip the methanol step and proceed straight to fixation if desired. Larvae should not be allowed to dry out at any point.
Labeling larval phenotype
Larvae should be physically labeled such that their phenotype is clearly identifiable (e.g. by cutting their tails bluntly vs. at an angle, or by pulling off pectoral fins). We have not attempted removing an eye because we were concerned this would affect registration to the 3D atlas.
All larvae from all phenotypes should be fixed at the same time and stained in the same tube to minimize artifacts.
Fixation
Fix larvae in 4% PFA in PBS + 0.25% Triton (PBS-T) overnight (O/N) at 4 °C. Note that with MAP-mapping, quick fixation is essential due to the rapid kinetics of ERK phosphorylation; because this protocol uses only tERK, exact timing of fixation is not essential for success. We have found it best to use room-temperature PFA and begin fixation at RT, NOT on ice, then move the larvae to 4 °C after about 5-10 minutes.
Wash off PFA with three 5-minute PBS-T washes.
Larvae can be stored for 1-2 weeks at 4 degrees in PBS (not PBS-T), or you may proceed directly to "preparation for immunostaining" section.
Preparation for Immunostaining
Bleach larvae (skip this step if you raised larvae in PTU-- this is only to bleach pigment cells so brains can be imaged without obstruction). After this point, be very careful not to lose larvae in pipetting and washing steps. You may wish to use a glass pipette rather than plastic both for better visibility and to avoid larvae sticking to the sides.
Original PTU protocol:
PTU alters visual behaviors:
PTU induces autophagy:
Prepare bleaching solution fresh every time: for 1 mL, combine 700 uL PBS-T, 200 uL 5% KOH, and 100 uL 30% H2O2.
Incubate on a rocker at RT for about 10 minutes, or at 55 °C for about 5 minutes, until the eyes are light orange in color. Then immediately rinse off the bleach with 2-3 quick PBS-T washes, followed by one 5-minute PBS-T wash. Larvae will continue to bleach until they are completely washed, so begin washing when they are slightly darker than needed.
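If you need more than 1 mL of bleaching solution, the per-mL recipe above can be scaled linearly. The following is a minimal Python sketch of that arithmetic; the function and variable names are our own for illustration and are not part of the original protocol.

```python
# Per-1-mL recipe stated in the protocol:
# 700 uL PBS-T, 200 uL 5% KOH, 100 uL 30% H2O2.
BLEACH_RECIPE_UL_PER_ML = {"PBS-T": 700, "5% KOH": 200, "30% H2O2": 100}


def scale_bleach(total_ml):
    """Return the volume (in uL) of each component for total_ml of solution."""
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    return {name: ul * total_ml for name, ul in BLEACH_RECIPE_UL_PER_ML.items()}


if __name__ == "__main__":
    # Example: 5 mL of bleaching solution (an arbitrary illustrative volume).
    for name, ul in scale_bleach(5).items():
        print(f"{name}: {ul:g} uL")
```

By the same arithmetic, the working solution contains 1% KOH and 3% H2O2.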
Antigen retrieval
Incubate in 150 mM Tris-HCl (pH 9) for 5 min at RT
Transfer to a 70 °C water bath for 15 min
Wash with PBS-T-- one quick rinse, then 2x 5 min
Permeabilize larvae
Thaw 0.05% Trypsin-EDTA on ice.
For 6 dpf larvae, incubate in trypsin on ice for 45 minutes
Rinse the trypsin off with two quick washes in PBS-T, visually confirming that all the pink is gone, then wash for 10 min with PBS-T.
Immunostaining
Block larvae
Make blocking solution in PBS-T (can make in large batches and freeze aliquots):
2% Normal Goat Serum (only use NGS if no primary antibodies are made in goat or sheep)
1% BSA
1% DMSO
Incubate for 1 hour on a rocker at room temperature. In a 1.5 mL Eppendorf tube, use ~1 mL of blocking solution. Do not re-use blocking solution.
Apply primary antibody
Dilute antibodies 1:200 for anti-GFP and 1:500 for anti-tERK in 1% BSA, 1% DMSO in PBS-T
Incubate overnight on a gentle rocker or rotator at 4 degrees C
Wash off primary 3X 15 min in PBS-T on rocker at room temperature.
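The primary antibody dilutions above can be converted to pipetting volumes with a short calculation. This is a sketch only: it assumes that "1:N" means one part antibody in N parts final volume, and the 1000 uL final volume in the example is an arbitrary choice, not a value from the protocol.

```python
# Dilution factors stated in the protocol.
DILUTIONS = {"anti-GFP": 200, "anti-tERK": 500}


def antibody_volume_ul(final_volume_ul, dilution_factor):
    """Antibody volume for a 1:dilution_factor dilution (assumed v/v in final volume)."""
    return final_volume_ul / dilution_factor


def primary_mix(final_volume_ul):
    """Volumes (uL) of each antibody and of diluent for one tube."""
    mix = {name: antibody_volume_ul(final_volume_ul, factor)
           for name, factor in DILUTIONS.items()}
    # Diluent is 1% BSA, 1% DMSO in PBS-T, as in the protocol.
    mix["diluent"] = final_volume_ul - sum(mix.values())
    return mix


if __name__ == "__main__":
    for name, ul in primary_mix(1000).items():
        print(f"{name}: {ul:g} uL")
```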
Apply secondary antibody
We used
Incubate overnight in the dark (for example, by placing in an opaque box or covering with aluminum foil) on a gentle rocker or rotator at 4 degrees C
Wash off secondary in the dark 3X 15 min in PBS-T on rocker at room temperature.
Store at 4 degrees in total darkness in 2:1
Imaging
Mount Vectashield-soaked larvae in 1.1%-1.25% low-melt agarose in a glass-bottomed petri dish, dorsal side down. Multiple larvae can be mounted in the same dish for higher efficiency.
As larvae are mounted, be sure to examine their tails carefully to determine what phenotype they had. Document this information (and their positions in the dish, if you are mounting multiple larvae in the same dish).
For our experiments, we used a Zeiss 880 microscope with a 20X 0.8 NA air lens at 0.8 zoom. The "tile" function of Zen was used to capture and stitch together two images, one including the forebrain, midbrain, and rostral part of the hindbrain and the other including the caudal hindbrain and anterior spinal cord. Step size was 2 microns. A brain usually comprised around 130-150 slices. Laser intensity and gain were calibrated such that the brightest neurons in the brain were saturated, because otherwise signal in the dimmest neurons was lost. (Note that saturated pixels exist in some portions of the reference brain as well-- it is best to attempt to match the staining of the reference brain as closely as possible.) 3-5 larvae were inspected before final settings were chosen, due to the variability in brightness of the GFP signal between brains. Ideally, the full range of each channel should be utilized. Once settings were determined, the same imaging settings were used for every brain in a staining batch. Images were saved in 8-bit, because they will be downsampled to 8-bit at a later step anyway.
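When choosing settings on the 3-5 test larvae, it can help to put numbers on "the brightest neurons are saturated" and "the full range of each channel is utilized". The sketch below summarises a list of 8-bit pixel values; how you export pixel values from your images is left open, and the function name and example values are hypothetical.

```python
def channel_usage(pixels, max_value=255):
    """Summarise how a channel uses its intensity range.

    pixels: iterable of integer intensities (e.g. a flattened 8-bit stack).
    Returns the minimum, the maximum, and the fraction of saturated pixels.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixel values supplied")
    saturated = sum(1 for p in pixels if p >= max_value)
    return {
        "min": min(pixels),
        "max": max(pixels),
        "saturated_fraction": saturated / len(pixels),
    }


if __name__ == "__main__":
    # Tiny made-up example; real stacks contain millions of values.
    print(channel_usage([0, 10, 255, 255]))
```

A channel whose maximum is well below 255 is not using the full range, while a large saturated fraction suggests the gain or laser power is too high for that batch.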
Equipment
| Value | Label |
|---|---|
| LSM880 with Airyscan | NAME |
| Confocal microscope | TYPE |
| Zeiss | BRAND |
| LSM880 | SKU |
After imaging, place each individually-identified larva in a well of a genotyping plate, keeping careful track of which larva corresponds to which image and phenotype.
Once imaging is complete, genotype the larvae according to your own protocol.
Preparing images for registration in FIJI
Software
| Value | Label |
|---|---|
| FIJI (Image J) | NAME |
| NIH | DEVELOPER |
| https://fiji.sc/ | LINK |
If you have not already installed FIJI on your computer, see the installation guide + downloads here: https://imagej.net/software/fiji/downloads
Orient brains exactly vertically in FIJI using Image → Transform → Rotate. Even slight deviations from perfectly vertical can cause registration errors, so correcting even a small tilt (about 2 degrees) is worthwhile.
Split channels using Image → Color → Split Channels.
If you imaged on the Zeiss 880 with the dorsal side of the fish closest to the coverslip, you must flip the Z orientation using Image → Transform → Flip Z. The slice number should increase as you approach the dorsal side of the brain; compare with the reference brain to be sure you have it right. If your brain is not in the same orientation as the reference brain, the registration will fail.
Save individual channels as .nrrd files with _01 suffix for the tERK channel and _02 for the GFP (or other marker of your expression construct) channel.
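Because registration expects both channels for every fish, it may be worth checking the file names before starting. The following Python sketch lists any fish that has only one of the two channel files; the function name and the example file names are hypothetical.

```python
def find_unpaired(filenames):
    """Return base names that have a _01 (tERK) or _02 (GFP) .nrrd file but not both."""
    seen = {"_01": set(), "_02": set()}
    for name in filenames:
        if not name.endswith(".nrrd"):
            continue  # ignore anything that is not an .nrrd file
        stem = name[: -len(".nrrd")]
        for suffix in seen:
            if stem.endswith(suffix):
                seen[suffix].add(stem[: -len(suffix)])
    # Symmetric difference: present in one channel set but not the other.
    return sorted(seen["_01"] ^ seen["_02"])


if __name__ == "__main__":
    files = ["Fish1_01.nrrd", "Fish1_02.nrrd", "Fish2_01.nrrd"]
    print(find_unpaired(files))
```

To run this on a real folder, pass it the file names from that folder, for example `[p.name for p in pathlib.Path(folder).iterdir()]`.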
Registration to the reference brain
Registration to reference brain
For alternative instructions using the command line, see Randlett et al. (2015)
In order to facilitate this step for those who are not comfortable using the command line, we strongly recommend using the CMTK Registration Runner GUI by Sándor Kovács. There are detailed instructions to install and use this program at the link below. Install the version appropriate for your operating system.
https://github.com/sandorbx/Fiji-CMTK-registration-runner-GUI#readme
The image registration parameters, taken from Randlett et al. 2015, are as follows:
-awr 010203 -T 8 -X 52 -C 8 -G 80 -R 3 -A '--accuracy 0.4' -W '--accuracy 1.6'
In CMTK Registration Runner, this translates to:
- a run affine transformation - CHECK
- w run warp transformation - CHECK
- c channels for registration - CHECK the number of channels in your images
- r run reformat on those channels - CHECK
- T (threads) default auto - Number of compute threads to use -- user's choice, depends on computer's capabilities
- X (exploration) 52
- C (coarsest) 8
- R (refine) 3
- G (grid spacing) 80
- Accuracy 1.0
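Since one mistyped value can silently change the registration, you may want to record the values you entered and compare them against the list above. This sketch does not run CMTK or interact with the GUI; it only compares a dictionary of values you type in by hand against the numeric settings listed in this step. The dictionary keys are our own labels.

```python
# Numeric settings as listed above for CMTK Registration Runner.
EXPECTED_SETTINGS = {
    "exploration": 52,
    "coarsest": 8,
    "refine": 3,
    "grid spacing": 80,
    "accuracy": 1.0,
}


def mismatched_settings(entered):
    """Return {setting: (entered_value, expected_value)} for every disagreement."""
    return {
        key: (entered.get(key), expected)
        for key, expected in EXPECTED_SETTINGS.items()
        if entered.get(key) != expected
    }


if __name__ == "__main__":
    # Example with one deliberate typo in grid spacing.
    entered = {"exploration": 52, "coarsest": 8, "refine": 3,
               "grid spacing": 40, "accuracy": 1.0}
    print(mismatched_settings(entered))
```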

Click "OK" to run your registration. This may take some time (e.g. hours), depending on how many brains you are registering and how many threads you are using.
Once the brains are done registering, open the registered brains in Image J along with the reference brain. Carefully compare the tERK channel in the registered brain with the reference brain. If the brain has not registered correctly, there are two options. The first is to exclude it from the analysis. The second option is to take all of the incorrectly-registered brains from a single batch and register them to a brain from the same batch that did register correctly, as an intermediate step, then re-register all of those brains to the reference brain. This sometimes succeeds if the tERK staining pattern varies only slightly from the reference brain's, likely due to batch effects in staining. If the fish were poorly positioned during imaging, it is unlikely they can be saved.
Make sure your registered brains are in their own folder with no other files or subfolders in it. At this point, you can set aside the registered tERK channels (with the suffix _01). All future steps are for the GFP channels (with the suffix _02) only.
Smooth and reformat your registered brains using the PrepareStacksForMAPMapping.ijm macro in FIJI. This macro can be found at Owen Randlett's GitHub page, https://github.com/owenrandlett/Z-Brain, or here: PrepareStacksForMAPMapping.ijm. Make sure that the maximum pixel intensity ("max =" __, line 3) is correct: if you're using 8-bit images, the correct number is 256. Once you click "Run" you will be asked to direct the computer to the folder containing your .nrrd output images as well as a new folder where the smoothed and reformatted images should go. Running this step should be substantially faster than registration (e.g. 15 minutes or less).
<img src="https://static.yanyin.tech/literature_test/protocol_io_true/protocols.io.14egn78jyv5d/Smoothing_Reformatting_Screenshot.png" alt="The ImageJ macro 'PrepareStacksforMAPMapping.ijm'" loading="lazy" title="The ImageJ macro 'PrepareStacksforMAPMapping.ijm'"/>
After this step, in your output folder there should be a new .tiff file corresponding to each of your registered .nrrd files. If you've allowed the default naming of each step, your filenames should be something like "Ref20131120pt14pl2_Fish1_02_warp_m0g80c8e1e-1x52r3.nrrdGauSmooth.tiff"
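If you later need to recover which fish and channel a smoothed file belongs to, the default file name can be split into its parts. The pattern below is inferred from the single example file name above and is an assumption: it expects the reference-brain name to contain no underscores and the channel to be 01 or 02. File names produced with other settings may need a different pattern.

```python
import re

# Pattern inferred from the example name in the protocol (an assumption).
REGISTERED_NAME = re.compile(
    r"^(?P<reference>[^_]+)_(?P<fish>.+)_(?P<channel>0[12])_warp_"
    r"(?P<params>.+)\.nrrdGauSmooth\.tiff$"
)


def parse_registered_name(filename):
    """Return reference, fish, channel and parameter string, or None if no match."""
    match = REGISTERED_NAME.match(filename)
    return match.groupdict() if match else None


if __name__ == "__main__":
    example = "Ref20131120pt14pl2_Fish1_02_warp_m0g80c8e1e-1x52r3.nrrdGauSmooth.tiff"
    print(parse_registered_name(example))
```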
Once installation is finished, register brains. Note that these instructions are for a Windows user; Mac / Linux users will need to modify. Begin by opening MobaXterm, the Linux emulator you just installed from the link above.
In the left side menu, click WSL-Ubuntu-20.04. This should open a new tab in MobaXterm with the header "WSL-Ubuntu-20.04." In that tab, type pcmanfm and press enter.
<img src="https://static.yanyin.tech/literature_test/protocol_io_true/protocols.io.14egn78jyv5d/MobaXterm_Screenshot.png" alt="A screenshot of what MobaXterm will look like just before pressing 'enter.'" loading="lazy" title="A screenshot of what MobaXterm will look like just before pressing 'enter.'"/>
You will need to download the reference brain from the ZBrain 2.0 atlas before you can register your brains. To do this, go to this link: https://zebrafishatlas.zib.de/downloads. On the right side, "Other", there will be a button for "Reference brain." Move the reference brain file to an easy-to-access place in your file structure, but do not put it in the same folder as the brains that you will be registering.
For "reference brain (file)" navigate to the reference brain image file.
For "images to register (directory)" navigate to the folder in which all of your brain images have been saved. They should be .nrrds with the suffix _01 for the tERK file and _02 for the GFP file.
For "output selection" we recommend making a new folder for your registered brains.
Quantification of signal in each brain region
This step, quantification of GFP signal in each brain region, requires the use of Matlab. https://www.mathworks.com/products/matlab.html
You will need to download the file 'AnatomyLabelDatabaseDownsampled.hdf5' from https://zebrafishatlas.zib.de/downloads (under the "Others" header at the far right).
You will need to download the file 'MaskDatabaseDownsampled.hdf5' here: MaskDatabaseDownsampled.mat
Since the Matlab section of this code will likely only take <1 hour to run, assuming you have gathered all the necessary information in advance, it may be possible to run on a shared computer or using the Matlab free trial, if purchasing the program is not an option.
The code in this section was modified from code originally written by Owen Randlett.
The function QuantifySignalMultipleBrains.m will take as an argument your chosen output file name. Within the function, you will point Matlab to a folder containing the .tiff files of your aligned, smoothed brains. The function also requires the files 'AnatomyLabelDatabaseDownsampled.hdf5' and 'MaskDatabaseDownsampled' to be on the computer, and you will be asked to direct Matlab to the folder containing these files.
For those new to Matlab -- your command will look like this: QuantifySignalMultipleBrains("YourDesiredFileNameHere"). Don't forget the quotes! There is no need to have a ".csv" in the filename.
The function will loop through all of your .tiff files, asking you to input values for key variables that will go into the column name corresponding to that brain. The output of the function will be a .csv file. Each column of the file corresponds to a single brain. Each row corresponds to the signal intensity in one of the 293 brain regions in the anatomy database.
Each column header contains some metadata about the fish that you provided.
If a given piece of metadata is not provided, its space is filled with an "x". If the metadata is out of order or not marked by an "x" the R code in the next step will not function as intended.
<img src="https://static.yanyin.tech/literature_test/protocol_io_true/protocols.io.14egn78jyv5d/Quantified_Signal_Excel.png" alt="Screenshot from Excel showing example output from the QuantifySignalMultipleBrains function. In the left column, 'ROIname,' (Region of Interest Name) are the names of brain regions from the anatomy database. In the right column, we have a brain name that describes, from left to right, the filename, the genotype, the phenotype ('LLC' refers to a fish that performed predominantly long-latency C-bends on a decision-making assay), the date the fish was stained, and the date the fish was imaged. Values in the cells are raw GFP signal intensity." loading="lazy" title="Screenshot from Excel showing example output from the QuantifySignalMultipleBrains function. In the left column, 'ROIname,' (Region of Interest Name) are the names of brain regions from the anatomy database. In the right column, we have a brain name that describes, from left to right, the filename, the genotype, the phenotype ('LLC' refers to a fish that performed predominantly long-latency C-bends on a decision-making assay), the date the fish was stained, and the date the fish was imaged. Values in the cells are raw GFP signal intensity."/>
Import data into R
This section requires RStudio, which is open source and free.
I have provided an example analysis in the following files: MAVEN_Whole_Project_220222.Rmd and MAVEN_Whole_Project_220222.html
For examples of data analysis, see the RMarkdown files (html file is an interactive "tour" of analysis and figures, while RMarkdown file contains editable code for performing all the analyses and generating all the figures in the .html file).
Before you can analyze your own data, you will need to install packages, load some custom functions (attached below) and modify the data import section of this code. Instructions for each step are provided below.
R allows users to develop "packages" which contain specialized user-created functions. I employ several packages in my analysis. One option is to install the following packages manually in R using the Packages--> Install menu in the lower right corner of RStudio. Once a package is installed, it must be loaded using the library() command.
- tidyverse
- glmnet
- readxl
- janitor
- hablar
- ggpubr
- here
- gt
- ComplexHeatmap
- circlize
- corrplot
- renv
Alternatively, to run the code with exactly the same versions of these packages that I used:
- Install the renv package
- Make sure the renv.lock file is in your working directory: renv.lock
- Run the command renv::restore()
Next, following the R code, we load some custom functions and associated files for the analysis. Download these files and place them in the same folder as the R Project you're working with. More details on what these functions do are available in the comments of the R code (step 19.5).
- export_top_PC_regions.R
- exportStrongestCorrelatedRegions.R
- filterCaudalBrain.R
- graph_PC_components.R
- normalizeSignalBy.R
- tidyImportedDataUnderscore.R
- 251_BrainRegions.xlsx
- 275_BrainRegions.xlsx
- 293_brain_regions_formatted.csv
Example code to load a function:
source("exportStrongestCorrelatedRegions.R")
Next, following the R code, we load into R the raw brain region signal intensity data that we generated using Matlab.
We assume that each sample name contains some metadata about the fish: namely, its number, the pair it came from, its genotype, phenotype, the date it was collected (e.g. the date behavior was performed) and the date it was imaged. Thus, the FishName column contains entries that look like this:
Fish10_41_mut_LLC_210314_210320
In the Matlab code, if a given piece of metadata is not provided by the user, its space is filled with an "x". If the metadata is out of order or not marked by an "x" the R code will not function as intended.
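To illustrate why field order and the "x" placeholder matter, here is a minimal Python sketch that splits a sample name into the six fields described above. It is not the authors' R import code; the field labels are our own, and the second example name is made up.

```python
# Field order as described in the protocol; the labels themselves are our own.
FIELDS = ("fish", "pair", "genotype", "phenotype", "date_collected", "date_imaged")


def parse_fish_name(name):
    """Split a sample name into its metadata fields; "x" marks a missing value."""
    parts = name.split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} underscore-separated fields, got {len(parts)}: {name!r}"
        )
    return {field: (None if value == "x" else value)
            for field, value in zip(FIELDS, parts)}


if __name__ == "__main__":
    print(parse_fish_name("Fish10_41_mut_LLC_210314_210320"))
    print(parse_fish_name("Fish3_x_mut_x_210314_x"))
```

A name with a missing field and no "x" placeholder has too few parts and is rejected, which mirrors the failure mode described above.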
You will need to modify the code to point it to your Matlab output file.

We also load in a list of the names of all 293 brain regions plus corresponding abbreviations for these brain regions. We often use the abbreviated forms in our graphs because some of the full anatomical brain region names are long enough to interfere with axis labels for figures. If you want to move between different forms of brain region names, you can use a join function to combine your data with this key, then the dplyr select function to retain only the form of the names that you want to work with.
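The join described above can be pictured with a small Python sketch (the actual analysis uses R and dplyr). It reads a table with one row per brain region and one column per brain, as produced by the Matlab step, reshapes it to one record per region and brain, and attaches an abbreviation from a lookup key. The "ROIname" column name comes from the example output; the region name, abbreviation, and signal value in the example are placeholders.

```python
import csv
import io


def to_long(csv_text, abbreviations):
    """Reshape region-by-brain CSV text into one record per (region, brain).

    abbreviations: dict mapping full region names to short labels; regions
    missing from the key keep their full name.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        region = row["ROIname"]
        for brain, value in row.items():
            if brain == "ROIname":
                continue
            records.append({
                "region": region,
                "abbreviation": abbreviations.get(region, region),
                "brain": brain,
                "signal": float(value),
            })
    return records


if __name__ == "__main__":
    # Placeholder data: one region, one brain.
    text = "ROIname,Fish1_41_mut_LLC_210314_210320\nRegion A,12.5\n"
    print(to_long(text, {"Region A": "RA"}))
```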
In Section 7.1 of the R code ("Correlational analysis: which other regions correlate with the DCR6?"), the user generates a list of brain regions whose signal is most highly correlated with that of a chosen brain region. This list can be used to identify alternative candidate phenotype-causative regions if the first region identified by LASSO regression fails to validate. The R code exports a .csv file in a format that can be read by the Matlab function CustomBrainRegionStack.m, which can be used to generate customized images showing the anatomical arrangement of the set of brain regions specified. A detailed description of how to use that function is contained in its header.
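As a rough illustration of this correlational step, the sketch below ranks regions by Pearson correlation with a chosen target region across brains. It is not the authors' exportStrongestCorrelatedRegions.R function; ranking by the absolute value of the correlation is our assumption, and the region names and values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def strongest_correlates(signals, target, top_n=5):
    """Rank regions by |r| with the target region.

    signals: dict mapping region name to a list of per-brain signal values,
    with brains in the same order for every region.
    """
    scores = [(region, pearson(signals[target], values))
              for region, values in signals.items() if region != target]
    scores = [(region, r) for region, r in scores if not math.isnan(r)]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)[:top_n]


if __name__ == "__main__":
    demo = {"target": [1, 2, 3, 4], "up": [2, 4, 6, 8],
            "down": [4, 3, 2, 1], "noise": [1, 3, 2, 2]}
    print(strongest_correlates(demo, "target", top_n=2))
```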
From this point forward, use the comments and instructions within the RMarkdown code chunks to guide your own analysis. After every step, be sure to stop and think about the results and what they imply for future steps. It is likely that every analysis will be slightly different, since the underlying structure of the variation in gene expression will be slightly different. This code should provide a useful framework, but a good grasp on multivariate statistics is also essential to help interpret results as you go. If you need help, refer to resources in the note of Step 18. Good luck!


