Generation of Library for Sequencing

Clémentine Delan-Forino, David Tollervey

Published: 2021-09-03 DOI: 10.17504/protocols.io.bntimeke

Abstract

The RNA exosome complex functions in both the accurate processing and rapid degradation of many classes of RNA in eukaryotes and Archaea. Functional and structural analyses indicate that RNA can either be threaded through the central channel of the exosome or more directly access the active sites of the ribonucleases Rrp44 and Rrp6, but in most cases, it remains unclear how many substrates follow each pathway in vivo. Here we describe a method that uses a UV cross-linking technique termed CRAC to generate stringent, transcriptome-wide mapping of exosome–substrate interaction sites in vivo and at base-pair resolution.

We present a protocol for the identification of RNA interaction sites for the exosome, using UV cross-linking and analysis of cDNA (CRAC) [1, 2]. A number of related protocols for the identification of sites of RNA–protein interaction have been reported, including HITS-CLIP, CLIP-Seq, iCLIP, eCLIP, and others [3, 4, 5, 6]. These all exploit protein immunoprecipitation to isolate protein–RNA complexes. CRAC is distinguished by the inclusion of tandem affinity purification and denaturing purification, allowing greater stringency in the recovery of authentic RNA–protein interaction sites.

To allow CRAC analyses, strains are created that express a “bait” protein with a tripartite tag. This generally consists of His6, followed by a TEV-protease cleavage site, then two copies of the z-domain from Protein A (HTP). The tag is inserted at the C terminus of the endogenous gene within the chromosome. The fusion construct is the only version of the protein expressed and this is under the control of the endogenous promoter. Several alternative tags have been successfully used, including a version with N-terminal fusion to a tag consisting of 3× FLAG-PreScission protease (PP) cleavage site-His6 (FPH) [7]. This is a smaller construct and is suitable for use on proteins with structures that are incompatible with C-terminal tagging. An additional variant is the insertion of a PP site into a protein that is also HTP tagged. This allows the separation of different domains of multidomain proteins. Importantly, the intact protein is cross-linked in the living cell, with domain separation in vitro. This has been successfully applied to the exosome subunit Rrp44/Dis3 to specifically identify binding sites for the PIN endonuclease domain [8].

Briefly, during standard CRAC analyses, covalently linked protein–RNA complexes are generated in vivo by irradiation with UV-C (254 nm). This generates RNA radicals that rapidly react with proteins in direct contact with the affected nucleotide (zero-length cross-linking). The cells are then lysed and complexes with the bait protein are purified using an IgG column. Protein–RNA complexes are specifically eluted by TEV cleavage of the fusion protein and cross-linked RNAs are trimmed using RNase A/T1, leaving a protected “footprint” of the protein binding site on the RNA. Trimmed complexes are denatured using 6 M guanidinium, immobilized on Ni-NTA affinity resin and washed under denaturing conditions to dissociate copurifying proteins and complexes. The subsequent enzymatic steps are all performed on-column, during which RNA 3′ and 5′ ends are prepared, labeled with 32P (to allow RNA–protein complexes to be followed during gel separation) and linkers ligated. Note, however, that alternatives to using 32P labeling have been reported (e.g., [6]). The linker-ligated RNA–protein complexes are eluted from the Ni-NTA resin and size selected on a denaturing SDS-PAGE gel. Following elution, the bound RNA is released by degradation of the bait protein using treatment with Proteinase K. The recovered RNA fragments are identified by reverse transcription, PCR amplification and sequencing using an Illumina platform.

Relative to CLIP-related protocols, CRAC offers the advantages of stringent purification, which substantially reduces background, and on-bead linker ligation, which simplifies separation of reaction constituents during successive enzymatic steps. It also avoids the need to generate the high-affinity antibodies required for immunoprecipitation. A potential disadvantage is that, despite their ubiquitous use in yeast studies, tagged constructs may not be fully functional. This can be partially mitigated by confirming the ability of the tagged protein to support normal cell growth and/or RNA processing, or by comparing the behavior of N- and C-terminal tagged constructs. Additionally, because linkers are ligated to the protein–RNA complex, UV cross-linking of the RNA at, or near, the 5′ or 3′ end may sterically hinder on-column (de)phosphorylation and/or linker ligation. With these caveats, CRAC has been successfully applied to >50 proteins in budding yeast, and in other systems ranging from pathogenic bacteria to virus-infected mouse cells [7, 9].

Before start

Appropriate negative controls and experimental replicates are required to determine the background signal and true positive binding sites. We routinely use the (untagged) yeast parental strain as a negative control, performing a minimum of two biological and technical replicates for each sample. It is commonly observed that technical replicates (even samples from the same culture) processed in two independent CRAC experiments show more differences than two biological replicates (independent cultures) processed together.

All steps should be performed wearing disposable gloves and materials should be free of DNase and RNase. Prior to each CRAC experiment, pipettes should be cleaned with DNAZap (ThermoFisher; AM9890) to avoid DNA contamination at the PCR step, followed by RNaseZAP (ThermoFisher; AM9890) treatment, and rinsed with deionized water. All the buffers should be prepared with deionized water and free of RNases; however, DEPC treatment is not normally essential. To minimize buffer contamination, adjust the pH by taking small aliquots for measurements. Filter-sterilize stock solutions following preparation, and store at 4 °C. Where required, add β-mercaptoethanol and protease inhibitors to the buffers shortly before use. Wash buffers should be prepared immediately before starting the CRAC experiment.

All steps must be carried out on ice, unless stated otherwise. For troubleshooting, it is a good idea to monitor the course of the experiment by retaining samples at points during the CRAC protocol. This allows potential problems with protein–RNA purification steps to be identified. Three aliquots per sample are taken during the purification (Subheading 3.2.2, “Crude Lysate” and “IgG supernatant”; Subheading 3.2.3, “TEV Eluate”). These can be analyzed by Western blot.

Steps

Reverse Transcription of Purified RNA

1.

Note
To increase the efficiency of this step, prepare a fresh dNTP dilution prior to RT, or aliquot and store at -20°C to avoid repeated thawing.

2.

Resuspend the RNA pellet in 11µL. Add 1µL and 1µL.

3.

Heat the samples to 80°C for 3 min, then chill on ice for 5 min. Collect the contents by brief centrifugation.

4.

To each sample, add 4µL (Invitrogen), 1µL, and 1µL.

5.

Incubate at 50°C for 3 min and add 1µL (Invitrogen). This step will help dissociate any nonspecifically annealed primers from the RNA.

6.

Incubate at 50°C for 1 h.

7.

Inactivate the Superscript III by incubating the samples at 65°C for 15 min.

8.

Add 2µL and incubate for 30 min at 37°C.

PCR Amplification of cDNA Libraries

9.

Note
The number of cycles used to prepare cDNA libraries should be optimized for the template and limited to minimize artifacts due to overamplification, that is, a high frequency of PCR duplicates. Generally, 21–22 cycles have been sufficient to produce complex libraries from cDNA generated from exosome subunit-bound RNA. However, we typically vary between 19 and 24 cycles and increase the number of independent PCR reactions (up to 5) for samples with a low abundance of cDNA.
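
As a rough illustration of how the frequency of PCR duplicates might be monitored after sequencing, the sketch below computes the fraction of reads that are exact copies of another read. It assumes that identical sequences are duplicates, which is a simplification: abundant RNAs can yield identical fragments from independent molecules, so the value is an upper estimate. The function name and the example reads are hypothetical.

```python
from collections import Counter


def duplicate_fraction(reads):
    """Fraction of reads that are exact repeats of a sequence already seen.

    Simplifying assumption: identical sequence == PCR duplicate.
    """
    if not reads:
        return 0.0
    unique = len(Counter(reads))
    return 1.0 - unique / len(reads)


# Hypothetical reads: 6 in total, 3 distinct sequences.
example_reads = ["ACGTACGT", "ACGTACGT", "TTGCA", "GGCAT", "TTGCA", "ACGTACGT"]
print(f"duplicate fraction: {duplicate_fraction(example_reads):.2f}")
```

Comparing this value between libraries amplified with different cycle numbers gives one simple indication of whether overamplification is reducing library complexity.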

10.

To 3µL, add 47µL containing:

  • 5µL
  • 1µL
  • 1µL
  • 5µL
  • 0.5µL
  • 37.5µL

Note
We prepare three or more PCR reactions per sample to increase the complexity of our libraries.
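
When several PCR reactions are set up per sample, the per-reaction volumes listed above can be scaled into a single master mix. The sketch below is only an illustration: the component names are not given in this section, so the components are labelled by their position in the list, and the 10% pipetting overage is an assumption rather than part of the protocol. As listed, the six volumes total 50 µL per reaction.

```python
# Per-reaction volumes (µL) in the order listed in the step above.
# Component identities are not stated in this section, so they are
# referred to by position only.
PER_REACTION_UL = [5.0, 1.0, 1.0, 5.0, 0.5, 37.5]


def scale_master_mix(n_reactions, overage=0.10):
    """Scale each component for n reactions plus a pipetting overage.

    The 10% default overage is an assumption, not a protocol value.
    """
    factor = n_reactions * (1.0 + overage)
    return [round(volume * factor, 2) for volume in PER_REACTION_UL]


mix = scale_master_mix(3)
for position, volume in enumerate(mix, start=1):
    print(f"component {position}: {volume} µL")
print(f"listed per-reaction total: {sum(PER_REACTION_UL)} µL")
```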

11.

The reaction is run with the following cycling conditions:

Temp     Time     Cycle
95 °C    2 min
98 °C    20 s     21 cycles
52 °C    20 s
68 °C    20 s
72 °C    5 min

12.

Pool PCR reactions into a clean microcentrifuge tube and precipitate with 0.1 volume sodium acetate (pH 5.2) and 2.5 volumes of ice cold absolute ethanol.
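
The precipitation volumes follow directly from the pooled sample volume. The sketch below assumes that both "volumes" are taken relative to the pooled PCR volume before any additions, and uses three pooled 50 µL reactions as a hypothetical example.

```python
def precipitation_volumes(sample_ul):
    """Return sodium acetate and ethanol volumes (µL) for a pooled sample.

    Assumption: 0.1 volume and 2.5 volumes both refer to the pooled
    sample volume before anything is added.
    """
    return {
        "sodium_acetate_ul": sample_ul / 10,
        "ethanol_ul": 2.5 * sample_ul,
    }


pooled_ul = 3 * 50.0  # hypothetical: three pooled 50 µL reactions
vols = precipitation_volumes(pooled_ul)
total_ul = pooled_ul + vols["sodium_acetate_ul"] + vols["ethanol_ul"]
print(vols, f"total: {total_ul} µL")
```

For this example the total is 540 µL, which is a useful check that the mixture fits in the chosen microcentrifuge tube.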

Note
Alternatively, you can concentrate cDNA libraries using MinElute PCR purification kit as indicated in the manufacturer’s instructions. Elute your samples with 20µL.

13.

Incubate at -20°C for 30 min (it is better not to precipitate for longer, to avoid recovering too much salt). Centrifuge at 16000x g, 4°C.

14.

Remove the supernatant and air dry the pellet. Resuspend in 15µL.

Size Selection of cDNA Libraries on Gel

15.

Note
At this stage, it is possible to adjust library size distribution and enrich the DNA library for cDNA of a certain length before sequencing. This size selection depends on the sequencing read length that will be used, the protein, and the biological questions CRAC is intended to answer. If 50 bp sequencing length is planned, it is not useful to recover extra-long cDNAs; moreover, longer sequences will decrease the resolution of protein binding sites. On the other hand, for most proteins, it is preferable to avoid overpopulation of the library by short sequences (shorter than 20 nt), which are difficult to map confidently. In some cases, these general guidelines have to be adjusted for biological relevance: for instance, cDNA libraries from Rrp44-HTP are cut just above 130 nt to also recover short sequences enriched in cDNAs corresponding to RNAs bypassing the long exosome channel and directly accessing Rrp44.
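
The relationship between the desired insert length and the region to excise can be sketched as below. This assumes that the adapter dimer band at around 120 nt, described in the imaging note later in this section, approximates the combined length that adapters and primers add to every cDNA insert; the function name and the 20–50 nt example range are illustrative only.

```python
# Approximate length contributed by adapters and primers, taken from the
# ~120 nt adapter dimer band mentioned in the gel imaging note (assumption).
ADAPTER_DIMER_NT = 120


def excision_window(min_insert_nt, max_insert_nt, adapter_nt=ADAPTER_DIMER_NT):
    """Return (lower, upper) product sizes on the gel for an insert range."""
    if min_insert_nt > max_insert_nt:
        raise ValueError("min_insert_nt must not exceed max_insert_nt")
    return adapter_nt + min_insert_nt, adapter_nt + max_insert_nt


# Inserts of 20-50 nt correspond to products of roughly 140-170 nt.
print(excision_window(20, 50))
```

Under the same assumption, cutting just above 130 nt, as described for Rrp44-HTP, corresponds to recovering inserts from about 10 nt upward.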

16.

Prepare a 3% Metaphor agarose gel using 1× TBE buffer (with 1:1000 SYBR Safe) and store it at 4°C for a minimum of 30 min.

Note
Preparing a Metaphor gel takes longer than preparing a standard agarose gel, and it is common for the agarose to form “lumps” which are hard to dissolve. One option is to let the Metaphor powder soak for 30 min in 1× TBE before agitating it on a magnetic stirrer hot plate. A second option is to microwave the mixture before agitating it on a magnetic stirrer hot plate. The gel can be prepared the day before and stored at 4 °C wrapped in cling film.
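
The amounts needed for a 3% (w/v) gel with SYBR Safe at 1:1000 can be worked out as in the sketch below. The 100 mL gel volume is a hypothetical example; the appropriate volume depends on the casting tray used.

```python
def metaphor_gel_recipe(gel_volume_ml, percent_w_v=3.0, stain_dilution=1000):
    """Return (agarose in g, SYBR Safe in µL) for a gel of the given volume."""
    agarose_g = gel_volume_ml * percent_w_v / 100
    stain_ul = gel_volume_ml * 1000 / stain_dilution
    return agarose_g, stain_ul


agarose_g, stain_ul = metaphor_gel_recipe(100)  # hypothetical 100 mL gel
print(f"{agarose_g} g Metaphor agarose, {stain_ul} µL SYBR Safe in 100 mL 1x TBE")
```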

17.

Add 5µL to precipitated sample and load the entire volume onto the prepared 3% Metaphor agarose gel along with 50 bp DNA ladder.

18.

Run the gel at 80 V for approximately 2 h or until the bromophenol blue dye front reaches 2 cm from the edge of the gel.

19.

Image the gel.

Note
We use a Typhoon FLA9500 laser scanner (GE Life sciences) for increased sensitivity and print the gel images at 1:1 scale. A lower band around 120 nt corresponding to the amplified sequencing adapter dimers is sometimes visible and should be avoided when cutting. The cDNA libraries appear as a smear running above primer dimers that should be apparent in the negative control samples. The presence of a sharp band may indicate excessive RNA digestion. For other proteins, this can simply indicate the presence of a highly abundant binding target. However, this has not been observed with exosome components. Lack, or small amounts, of PCR products on the agarose gel (despite strong signal by autoradiography) suggests inefficient enzymatic reactions.

20.

Place the gel on a transparent film and align it to the 1:1 scan of the gel. Excise the libraries using a sterile scalpel by cutting from the bottom of the smear to the predefined upper limit.

21.

Transfer the gel slices to 2 ml microcentrifuge tubes. Rescan the gel afterward to check that the intended regions have been excised.

22.

Add 1mL from the MinElute Gel Extraction purification kit (QIAgen) and incubate the gel slices at 42°C for 15–20 min to dissolve the agarose.

23.

Transfer the volume to a MinElute column fitted to collection tubes and spin at 16000x g. Discard the flowthrough.

Repeat with the leftover of buffer/agarose to bind all the sample to the same column.

24.

Add 750µL and spin at 16000x g. Discard the flowthrough.

25.

Add 750µL (QIAgen) to the columns and incubate for 10 min at room temperature. Spin at 16000x g and discard the flowthrough.

26.

Dry the columns by spinning at 16000x g. Transfer the columns to clean 1.5 ml microcentrifuge tubes.

27.

Add 20µL to the membrane and let stand for 2–5 min. Elute the purified cDNA by spinning at 16000x g.

28.

Quantify the cDNA library using a Qubit high sensitivity DNA assay kit and fluorometer and store the libraries at -20°C.
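
Sequencing facilities usually ask for library concentration in molar units, so the Qubit reading may need converting. The sketch below uses the standard approximation of 660 g/mol per base pair of double-stranded DNA; the example concentration (2 ng/µL) and mean library length (160 bp) are hypothetical values, not results from this protocol.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, mean_length_bp):
    """Convert a dsDNA concentration from ng/µL to nM.

    Uses the common approximation of 660 g/mol per base pair.
    """
    if mean_length_bp <= 0:
        raise ValueError("mean_length_bp must be positive")
    return conc_ng_per_ul * 1e6 / (660.0 * mean_length_bp)


# Hypothetical library: 2 ng/µL with a mean product length of 160 bp.
library_nM = ng_per_ul_to_nM(2.0, 160)
print(f"{library_nM:.2f} nM")
```

Because the size-selected library is a smear rather than a single band, the mean length is itself an estimate, and the resulting molarity should be treated as approximate.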

Sequencing

29.

Note
The samples can be submitted for single-end sequencing on Illumina MiSeq, HiSeq, MiniSeq, or NextSeq platforms. The read depth required for sufficient coverage of binding sites will depend on the number of RBP binding sites and the complexity of the library generated (i.e., the number of PCR duplicates). The exosome binds a huge diversity of targets. Since the highest proportion of the reads align to ribosomal RNA, it is necessary to sequence deeply enough to detect less frequently bound targets. We generally aim to generate 17–35 nt trimmed RNA fragments, which contain enough sequence information for a unique alignment and are short enough to ensure that the protein interaction site is contained within the sequenced portion. We routinely use Illumina 50 bp single-end sequencing, which is long enough to sequence into the 3′ adapter sequence.
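
Because 50 bp reads extend into the 3′ adapter, the adapter has to be removed before alignment, and inserts outside the 17–35 nt target range can then be flagged. The sketch below illustrates the idea with exact string matching only. The adapter sequence is a hypothetical placeholder (the linker sequence is not given in this section), and dedicated trimming tools that tolerate sequencing errors would normally be used instead.

```python
# Hypothetical placeholder; substitute the 3' linker actually used.
ADAPTER_3P = "TGGAATTCTC"


def trim_3p_adapter(read, adapter=ADAPTER_3P, min_overlap=5):
    """Remove the 3' adapter from a read using exact matching only."""
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]  # full adapter found: keep what precedes it
    # Otherwise look for a partial adapter running off the end of the read.
    for k in range(len(adapter) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def in_target_range(insert, min_len=17, max_len=35):
    """True if the trimmed insert falls in the 17-35 nt range aimed for."""
    return min_len <= len(insert) <= max_len


# Hypothetical 50 nt read: 25 nt insert + adapter + downstream sequence.
insert = "ACGTTGCAAGCTTAGCCGATAGCTA"
read = insert + ADAPTER_3P + "GGGTGCCAAGGAACT"
trimmed = trim_3p_adapter(read)
print(len(read), len(trimmed), in_target_range(trimmed))
```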
