ABSTRACT
With the expanding catalogue of novel disease-genes, there is an increasing need to establish the clinical significance of potential disease-causing variants. Based on the idea that pathogenic variants in structured protein domains disturb folding and association with macromolecular assemblies, we employed Fluorescence Correlation and Cross-Correlation Spectroscopy (FCS and FCCS) to assess in vivo protein complex formation. Since the molecular underpinning of BRCA-associated breast and ovarian cancers is well defined and data from a recent genome editing screening allowed us to compare the binding data with a reliable functional HRD test, we examined the binding of BRCA1 to BARD1 and RBBP8, respectively. The results demonstrate that FCCS, whether applied to full-length BRCA1 in live cells or to isolated domains in cellular lysates, reliably identified pathogenic BRCA1 RING or BRCT variants. We moreover demonstrate the feasibility of employing FCCS for analysis of HNPCC-related factor MSH2 and MEN1 factor Menin variants in combination with DNA mismatch repair factor MSH6 and transcription factor JUND, respectively. Because the procedure can be completed within a clinically relevant time frame, FCCS is an appealing complement to current clinical procedures for classifying variants. Given its generic nature and design, the approach can be applied to a variety of monogenic diseases.
INTRODUCTION
The rapid advancements in genomics have enhanced our comprehension of the genetic underpinnings of various rare diseases. With an ever-expanding catalogue of novel disease-associated genetic variations, there is a pressing need to establish the clinical significance of potential disease-causing variants. Traditionally, variant classification depended on the nature of the variant and its known association or co-segregation with a specific disease. However, co-segregation is not always feasible, leading to the contemporary utilization of ACMG/AMP guidelines (1), which based on data from functional studies, segregation analyses, or clinical correlations categorize variants as pathogenic, likely pathogenic, variants of uncertain significance (VUS), likely benign, or benign. VUS, in particular, pose a significant clinical challenge as they leave patients in a state of uncertainty. To address this challenge, the development of large-scale variant databases, enhanced computational predictive tools like AlphaMissense (2), and the advancement of functional analyses have represented significant steps forward in characterizing VUS. Two notable examples of functional analyses involve the early use of minigenes (3) to investigate mutations affecting mRNA splicing, and more recently, systematic screening approaches such as the BRCA1 saturation editing screening (4), aimed at assessing homologous recombination deficiency. However, there is currently no single, universally applicable approach for functional testing of protein variants, which impedes the application of functional analyses in the clinical environment.
Proteins are fundamental to cellular processes, and it is estimated that the majority of proteins within a given proteome exert their function as part of complexes with other factors (5). Assembly is governed by conserved structured domains within the proteins and pathogenic variants frequently disrupt the structure of these domains, leading to misfolding of the proteins and destabilization of the functional units (6-9). Various methods, such as affinity purification coupled with mass spectrometry (AP-MS), yeast two-hybrid (Y2H) assays, and computational approaches, have effectively been employed to characterize protein-protein complexes. While these methods have contributed significantly to our understanding of cellular protein networks, they are labour-intensive and may not be readily adaptable for clinical analyses. Fluorescence Correlation and Cross-Correlation Spectroscopy (FCS and FCCS) offer alternative generic approaches to rapidly and precisely assess protein diffusion, stoichiometry, and complex formation (Figure 1) (10,11). FCS is based on the recording of fluorescence fluctuations produced by labelled molecules or proteins entering and exiting a small focal volume and analysing these fluctuations through autocorrelation. FCCS, on the other hand, leverages cross-correlation between molecules or proteins labelled with distinct spectral markers to determine if two factors associate or are part of the same macromolecular complex (10). FCS and FCCS are versatile and have been applied to analysis of various cellular processes, such as, protein-protein interactions, protein-nucleic acid interactions, receptor-ligand binding, and molecular diffusion dynamics. They offer single-molecule sensitivity and can analyse proteins and their interactions both in live cells and in vitro. From a clinical perspective, they may also be appreciated for their speed and reproducibility.
Schematic representation of fluorescence correlation spectroscopy (FCS) and fluorescence cross-correlation spectroscopy (FCCS) methods used in the study. Upper panel - FCS measurements are performed either in cells expressing a protein of interest fused to a GFP tag or in cell lysates from cells expressing the GFP fusion protein. Focal volume is positioned in a specific cellular location (live cell FCS) or in a fluorescent protein solution (cell lysate FCS). Fluorescence fluctuations are recorded and analysed by the autocorrelation function, which generates an autocorrelation curve, that is used to determine the diffusion time of the examined GFP-tagged protein. Lower panel - FCCS is based on the combined FCS measurements of two spectrally-distinct fluorescent proteins, in this case GFP and mCherry. Autocorrelation function is used for analysing diffusion of GFP- and mCherry-tagged proteins and also applied between channels, generating the cross-correlation curve. Interacting GFP- and mCherry-tagged proteins diffuse synchronously through the focal volume, so the cross-correlation curve (grey) is positive. In contrast, non-interacting GFP- and mCherry-tagged proteins diffuse independently from each other yielding a flat cross-correlation curve. The cross-correlation curves shown in the Figures reflect the interaction between the different fluorescently labelled molecules. The maximum possible correlation occurs when a molecule labeled with one fluorophore interacts with a molecule labelled with the other fluorophore. This interaction cannot happen more frequently than the presence of the most abundant molecule. Therefore, the FCCS curve can only reach a maximum value corresponding to the concentration of the most abundant molecule.
Based on the idea that pathogenic variants in structured protein domains disrupt macromolecular assemblies, we employed FCS and FCCS to rapidly and precisely assess in vivo protein complex formation. We used hereditary breast and ovarian cancer (HBOC) as a model disease because the molecular underpinning is well defined (12). Moreover, data from the recent saturation genome editing (SGE) screening allowed us to compare the binding of variants with a functional test (4). HBOC susceptibility genes are involved in homologous recombination repair (HRR), replication fork stability, DNA replication, and cell cycle checkpoint pathways. HRR is governed by a large assembly of factors, including BRCA1 and BRCA2, as well as PALB2, ATM, CHEK2, RAD51C, RAD51D, BARD1, RBBP8 and BRIP1. BRCA1 serves as scaffold for a number of these factors and pathogenic mutations are frequently found in the BRCA1 RING and BRCT domains that among other factors associate with BARD1 and RBBP8, respectively (13-15). We report that FCCS, whether applied to full-length BRCA1 in live cells or to isolated domains in cellular lysates, reliably identify BRCA1 RING or BRCT pathogenic variants. We also demonstrate the feasibility of employing the method for analysis of hereditary non-polyposis colorectal cancer (HNPCC)-related factor MSH2 (16) and MEN1 factor Menin (17) in combination with DNA mismatch repair factor MSH6 (18) and transcription factor JUND (19), respectively. Since the analysis can be completed in just a few hours, FCCS is a potential useful complement to current clinical procedures for classifying genetic variants. Given its generic nature and design, we moreover envision that FCCS could serve as a valuable tool for variant testing in other monogenic diseases as well.
RESULTS
Fluorescence Correlation Spectroscopy (FCS) of nuclear BRCA1
In order to assess whether FCS was able to portray the nuclear diffusion of BRCA1, HeLa cells were transiently transfected with a plasmid encoding GFP-BRCA1 or GFP for comparison. Since the autocorrelation is inversely correlated with the concentration of the fluorescent molecule under scrutiny, recordings were obtained from cells exhibiting low expression of the factors (Figure 2A). As illustrated by the normalized autocorrelation curves, nuclear GFP-BRCA1 diffuses about 8 times slower than GFP (Figure 2B) indicating that GFP-BRCA1, in contrast to GFP, is likely to be part of a larger assembly. In agreement with this, a subsequent fitting of the autocorrelation curve to models of 1, 2 or 3 components showed that the GFP-BRCA1 autocorrelation curve did not fit a 1-component diffusion model (blue line) (Figure 2, panels C and D). The poor fitting is illustrated by the residuals, which show the difference between the experimental data and the model fit. A better fit was observed after employing a free (3D) 2-component diffusion model (red line), although the fluctuation of the residuals was not completely random and fluctuating around zero. In contrast, fitting to a 3-component diffusion model (orange line) generated an almost random distribution of the residuals indicating that nuclear GFP-BRCA1 associates with a number of nuclear proteins.
Complex formation of GFP-BRCA1 in live cells A. Confocal image of HeLa cells expressing GFP and GFP-tagged BRCA1. GFP exhibits a diffused fluorescent pattern throughout the cell while GFP-BRCA1 shows a prominent nuclear localization of the tagged protein. Crosses in the nuclear space show how focal volumes were arbitrarily positioned in the nucleoplasm to record the fluorescent measurements. B. Normalized autocorrelation curves for GFP and GFP-BRCA1 showing an averaged slower diffusion of GFP-BRCA1 compared to free GFP. C. Fitting of the GFP-BRCA1 autocorrelation curve to 1-, 2- and 3-component diffusion models (top) and their corresponding residuals (bottom). D. Diffusion time and percentage (%) of GFP-BRCA1 fitting to a 3-component diffusion model.
Fluorescence Cross-Correlation Spectroscopy of BRCA1 binding to BARD1 and RBBP8
BRCA1 is an essential scaffold in the assembly of the HRR complex and the N-terminal RING domain of BRCA1 interacts strongly with BARD1, whereas the C-terminal BRCT domain, among other factors, associates with RBBP8 (CtIP) (13-15). We therefore exploited the possibility to detect and quantify complex formation with these two factors by Fluorescence Cross-Correlation Spectroscopy (FCCS). Cells were co-transfected with plasmids encoding GFP-BRCA1 and mCherry-BARD1 or mCherry-RBBP8, respectively. For comparison, we also included results from a GFP-mCherry fusion construct (20) (Supplemental Figure 1). Nuclear FCCS measurements were performed in the nucleoplasm avoiding the nucleolus, matrix and dense foci with conceivably immobile GFP-BRCA1. For each measurement, the GFP-BRCA1 and mCherry-BARD1 or mCherry-RBBP8 autocorrelation curves and the cross-correlation curve were obtained. Note that cross-correlation always relate to the most abundant fusion protein (Figure 1, legend). Measurements were routinely obtained from 5-10 different cells and in each cell, we performed at least 3 recordings of 30 s. As mentioned, we also attempted to identify cells exhibiting low levels of the expressed factors in order to reach an acceptable level of correlation. Only correlations exceeding 1.005 were considered for analysis.
Cross-correlation of EGFP and mCherry fusion proteins. HeLa cells were transfected with either separate EGFP-C1 and mCherry-C1 vectors (right panel) or with a EGFP-mCherry fusion construct (left panel) (20) before FCCS recordings were performed in live cells (upper panels) or in lysates (lower panels). Both in live cells and lysates we observed a cross-correlation of ∼35% in cells transfected with the fusion protein, whereas no cross-correlation was observed in the cells transfected with separate EGFP and mCherry vectors. Cross-correlation is not expected to reach 100% since a fraction of the molecules will undergo photo-bleaching and quenching, moreover some fluorescent proteins may be ‘off’ or in dark states.
GFP-BRCA1, mCherry-BARD1 and mCherry-RBBP8 were mainly nuclear, although a significant fraction of BRCA1 and BARD1 also remained in the cytoplasm. In cells exhibiting nuclear speckles the factors co-existed indicating they were part of the same complexes (Figure 3, panels A and B). Figure 3, panels C and D, shows the averaged autocorrelation curves for GFP-BRCA1, mCherry-BARD1 or mCherry-RBBP8 and the corresponding cross-correlation, respectively. The averaged cross-correlation for mCherry-BARD1 and - RBBP8 were 28% (STDEV 8%) and 36% (STDEV 11%), respectively. For comparison the mCherry-GFP fusion protein exhibited a cross-correlation of 33% (STDEV 5%) (Supplemental Figure 1). Panel D shows the variation of the correlation amplitudes among different cells. Since the autocorrelation readings obtained in live cells are relatively low and noisy compared to measurements in solutions (see below), we averaged correlation values over a series of readings in order to establish the plateau of the curves as indicated in panels C and D. At the described STDEV the assay should in principle be able to detect a reduced cross-correlation (i.e.: binding) of at least 35% with a P value equal to or less than 0.05.
Association of BRCA1 with BARD1 and RBBP8/CtIP in live cells. A. Confocal images of HeLa cells expressing GFP-BRCA1 (left) and mCherry-BARD1 (middle), together with the merged (right) image. B. Confocal images of HeLa cells expressing GFP-BRCA1 (left) and mCherry-RBBP8/CtIP (middle), together with the merged (right) image. C. Autocorrelation and cross-correlation curves obtained from GFP-BRCA1 and mCherry-BARD1 nuclear measurements. D. Autocorrelation and cross-correlation curves obtained from GFP-BRCA1 and mCherry-RBBP8 nuclear measurements. The pink columns in C and D show the range of autocorrelation values averaged to determine the plateau of the curves. E. The graph shows the raw distribution and variation among the averaged autocorrelation values for BARD1 and RBBP8/CtIP and their corresponding cross-correlation curves in 8 different cells (labelled 1-8), respectively.
FCCS analyses of BRCA1 RING and BRCT domain variants in live cells
We subsequently examined binding of a series of known benign (ACMG 1 and 2), VUS (ACMG 3) and pathogenic variants (ACMG 4 and 5) in the BRCA1 RING and BRCT domains, respectively. The predicted AlphaFold structure (21) of the two domains in the context of the entire BRCA1 is depicted in Figure 4 and the position of the included variants is outlined in the blow-up of the two domains. As shown, the RING and BRCT domains are positioned in the core of BRCA1 surrounded by what are presumably large stretches of intrinsically disordered regions. Variants were selected so they covered the domains and represented a balanced distribution of benign and pathogenic variants. Moreover, they should be listed in ClinVar or in the recent SGE analyses (4). The RING module is considered to extend from amino acids 2 to 101 (BRCA1 VECP), which includes two flanking alpha helixes governing heterodimerization with BARD1 (amino acids 2-23 and 65-101), as well as, the catalytic RING domain (amino acids 24-64). The BRCA1 BRCT domain is composed of two tandem BRCT modules from amino acids 1650-1857 (BRCA1 VECP) and variants were chosen in both modules positioned from amino acids 1650 to 1857. Benign variants located in immediate proximity to the BRCA1 RING (Ala102Gly and Tyr105Cys) and BRCT (Pro1859Arg) domains were also included as controls.
A. Space filling representation of the predicted AlphaFold structure of BRCA1 (UniProt identifier P38398). The amino acids are coloured according to the AlphaFold per-residue model confidence score (pLDDT) from very high (pLDDT > 90) (dark blue), high (90 > pLDDT > 70) (light blue) to low (70 > pLDDT > 50) (yellow) and very low (pLDDT < 50) (orange). The conserved RING and BRCT domains are located in the core of the protein and embedded by presumably unstructured stretches. B. Shows a ribbon of the RING domain composed of the BARD1-interacting alpha helices (lower part) and the Zn2+ coordinated ligase (upper part). The position of the examined variants are shown in red together with the corresponding amino acid. C. Shows a ribbon of the two BRCT domains with the tested variants labelled in red. The position of selected amino acids is shown to facilitate identification of the labelled variants.
Figure 5, panels A and B, shows the autocorrelation curves of wild-type BRCA1 in combination with two RING or BRCT variants exhibiting reduced cross-correlation. The results from all tested variants are shown below in panels C and D. The percentage of cross-correlation was calculated from the averaged correlation values of the plateau ranging from ∼1E-5 to 2E-5 seconds. With the exception of T37K and R71G variants, the pathogenic RING mutations reduced cross-correlation between GFP-BRCA1 and mCherry-BARD1 (P<0.05), while the benign mutations had no effect on the binding (Figure 5, panels A and C, and Table 1). C44F reduced binding of∼80%, whereas the observed reduction of remaining pathogenic variants ranged from 25%-45% of the wild-type. C44 amino acid contributes to the coordination of Zn2+ site I and alterations such as C44Y and C44F significantly impaired the binding of BRCA1 and BARD1. The phenylalanine substitution at C44 was clearly more severe than the tyrosine substitution, probably due to change in the properties of amino acid side chains. The pathogenic R71G substitution is considered to affect mRNA splicing (22,23) (Table 1) and is not expected to display impaired binding between BRCA1-BARD1 in our assay. With respect to T37K, which is located in a pocket in the proximity of RING ligase, it was noteworthy that T37R exhibited moderate but significant reduction in binding. This variant was therefore examined in more detail employing isolated domains (see below). We also examined M18T variant but were unable to obtain data because the variant was not expressed at a sufficient level for analysis. We surmised that the mutant was unstable and therefore decided to examine it in the context of isolated domain as described below. Pathogenic variants in the BRCT module clearly reduced RBBP8/CtIP binding, and the binding data were concordant with both ClinVar classification, SGE screening and AlphaMissense (Figure 5, panels B and D, and Table 1) with the exception the T1684A variant that was classified as functional in the SGE but exhibited a 50% reduction (P<0.002) in binding in the FCCS. Variants such as T1699Q and R1699W, almost completely abolished binding to RBBP8 and were indistinguishable from the truncating L726fs and K991* controls. Table 1 delineates the results from the FCCS analysis and shows the corresponding classifications from ClinVar, SGE and AlphaMissense. With the exception of T37K and the T1684A, the live cells analysis of variants was in accordance with these data.
Impaired binding of BRCA1 variants to BARD1 and RBBP8/CtIP in live cells. A. Normalized mCherry-BARD1 correlation and cross-correlation curves with GFP-BRCA1 wild-type or GFP-BRCA1-C44F (curves not shown). B. Normalized mCherry-RBBP8 autocorrelation and cross-correlation curves with GFP-BRCA1 wild-type and GFP-BRCA1 M1775R (curves not shown). C. Normalized cross-correlation values of RING variants and D. Normalized cross-correlation values of BRCT variants. STDEV and * P<0.05 (t-test, two-tailed, unequal variance) are indicated on each column.
FCCS analyses of isolated BRCA1 RING and BRCT domain variants in cell lysates
We recently developed a FCS and FCCS protocol for analysis of ribonucleoprotein (RNP) assemblies employing cell lysates that complements live cell readings (20). Obviously, lysates do not reconcile the spatial resolution of live cells, but they have less spatial constraints and provide optimal readings because they can be diluted, which allows you to discern even small changes in binding. The procedure is moreover simple and exhibits a high reproducibility (Supplemental Figure 2), due to the use of trimmed RING and BRCT domains. Briefly, cells were transfected with the trimmed BRCA1 GFP-RING or GFP-BRCT domains in combination with mCherry-BARD1 or mCherry-RBBP8, respectively, followed by iso-osmolar lysis in buffer containing non-ionic detergent to release the cytoplasmic content. Measurements were performed on a drop of lysate positioned at the bottom of the same type of glass dishes that were used for live recordings. For each variant, we performed 12 readings in 3 biological replicates, resulting in autocorrelation ranging from 1.5-2.2, which is about 50 times higher than in live cells. The averaged results from the analysis of selected variants are shown in Figure 6C and the associated raw data from three independent biological replicates which illustrates the assay reproducibility is shown in Supplemental Figure 2, panels B and C. For the wild-type RING domain, cross-correlation was 25% of the mCherry-BARD1 correlation and for the BRCT domain the cross-correlation with mCherry-RBBP8 was 17%. The isolated domains reconciled to a large extent the results from the full-length BRCA1, and as illustrated in Figure 6A, the higher accuracy of the assay was able to depict small differences in binding of the T37K, T37R and T37A variants of the RING domain. We noted a minor but significant decrease in the binding of pathogenic T37K variant that could not be distinguished in live cells, and the results further underscored the more severe effect of arginine substitution. Moreover, unlike live cells, M18T variant was adequately expressed in cell lysates and found to exhibit reduced binding, which allowed us to classify it as damaging. For the BRCT domain, we were also able to identify pathogenic variants in concordance with the live cell analysis (Figure 6B). Compared to the RING domain, BRCT variants in general had slightly more severe effects on cross-correlation and even benign variants appeared to invoke a minor reduction in binding to RBBP8. Taken together, we conclude that isolated structural modules may serve as a convenient complement to live cell analysis to classify genetic variants.
A. Representative averaged correlation curve from analyses of cell lysates from cells transfected with RING (left panel) or BRCT (right panel) domains, respectively. Correlations are indicated in absolute values. B. and C. Inter-assay variation of RING or BRCT domain FCCS in cellular lysates. The columns show the percent of cross-correlation from three independent biological replicates. Each replicate comprise 12 averaged recordings. The SDEV represents the variation among the individual recordings. Each recording takes about 10 minutes and including cell transfection and lysis the assay of a variant may be completed over 2 days. Q1857 is located just outside of the BRCT and was included as control.
BRCA1 variant analysis employing isolated RING and BRCT domains in cell lysates. A. Cells were transfected for 24 h with GFP-RING or with GFP-RING-T37A, GFP-RING-T37K or GFP-RING-T37R in combination with mCherry BARD1 before they were solubilized and the lysates examined by FCCS. The panel shows the normalized mCherry-BARD1 correlation curve (red) and corresponding wild-type RING cross correlation (grey) as well as the mCherry-BARD1 correlation and cross-correlation curves (orange) from the analyses of the GFP-tagged RING-T37A (dark blue), T37K (light blue) or T37R (purple) variants. The right panel is a blow up of the cross-correlation curves from each variant. B. Correlation curves of GFP-BRCT (green) and mCherry-RBBP8/CtIP (red) and the corresponding wild-type cross-correlation curve (grey) as well as the GFP-BRCT autocorrelation (blue) and cross-correlation curves (orange) from the analyses of the GFP-tagged BRCT-Q1846K and R1699W variants. C. Normalized cross-correlation values of a series of RING and BRCT variants. STDEV and * P<0.05 (t-test, two-tailed, unequal variance) are indicated on each column.
Application of FCCS analyses to MSH2-MSH6 and Menin-JUND heterodimers
To explore if FCCS could be applied to the characterisation of variants predisposing to other diseases and proteins involving other structural domains, we finally performed assays with MSH2 and MSH6, that are involved in hereditary non-polyposis colorectal cancer (HNPCC), and Menin, that is encoded by the MEN1 gene, in combination with JUND. The analyses were performed as described for BRCA1, and in live cells we observed a strong cross-correlation of 34% between MSH2 and MSH6 in both the nucleus and in the cytoplasm (Figure 7A). Cross-correlation was markedly reduced by insertion of pathogenic variants MSH2 P622L and C697F (9,24) (Figure 7A). In lysates cross-correlation was slightly lower (27%), but as in live cells the two pathogenic variants caused a significant reduction in cross-correlation. In lysates, we could moreover observe a left shift in the correlation curves, indicating that the variants are likely to be excluded from an even larger protein assembly. The interaction between Menin and JUND was examined in lysates and this revealed a cross-correlation of 21%. Two variants - A237V and A242V (25), that in ClinVar were categorized as VUS and likely pathogenic, respectively, and according to AlphaMissense are likely pathogenic, reduced binding to approximately 50% of the wild-type. Taken together, we infer that FCCS is feasible for variant classification in various structural domains.
FCCS analyses of MSH2 and Menin. A. Nuclear correlation (green and red) and cross-correlation curves (dark grey) from cells transfected with GFP-MSH2 and mCherry-MSH6. The light grey curves show the same analyses where the focal volume was positioned in the cytoplasm. The blow up to the right shows the analyses of two pathogenic variants P622L and C697F. B. FCCS analyses of the same MSH2 constructs in cellular lysates (left) and with GFP-Menin and mCherry-JUND (right) or with JUND in combination with Menin A237V (NM_001370259.2(MEN1):c.710C>T) or A242V (NM_001370259.2(MEN1):c.725C>T) variants (same as A242V, A247V - MEN1_HUMAN ENST00000337652.5). C. Normalized cross-correlation results from three independent experiments. STDEV and * P<0.05 (t-test, two-tailed, unequal variance) are indicated on each column.
DISCUSSION
Key cellular processes, including transcription, mRNA processing and transport, protein synthesis, signalling events, and metabolism, hinge upon the orchestration of protein assemblies with varying degrees of complexity. While protein complexes may exhibit homomeric structures, more frequently they involve an interplay of distinct proteins. Binding is typically governed by conserved domains characterized by well-defined structures, and less commonly by intrinsically disordered regions (26,27). Many monogenic diseases are associated with variants that disrupt the formation of specific functional assemblies (28-30), e.g. hereditary cancer syndromes, developmental abnormalities resulting from disruptions in cytoskeletal or ciliary architecture, as well as metabolic, mitochondrial diseases, and late-onset conditions like ALS (31). Recent data moreover suggest that pathogenic mutations are concentrated at protein interfaces, underscoring the potential utility of analysing protein interactions to characterize the pathogenic significance of missense variants (26). Generally, missense variants impact protein structure by inducing misfolding and impaired binding to partners, occasionally leading to the degradation of the involved factors. Based on the idea that analysis of protein assemblies could represent a sensitive and generic way of identifying pathogenic variants, we employed fluorescence correlation and cross-correlation spectroscopy to rapidly and precisely assess in vivo protein complex formation.
The limitations of FCS and FCCS are mainly regarding fluorophore tagging that may affect the folding and bioactivity of some proteins. BRCA1 can be tagged without major loss of biological activity and GFP-BRCA1 has previously been employed in the characterisation of its role in double-strand break (DSB) sites (32). In agreement with earlier studies, our initial characterisation of the employed GFP-BRCA1 construct showed that it located to characteristic nuclear foci and complexed with other proteins. Association between BRCA1 and BARD1 and RBBP8/CtIP was close to 30%, which is similar to that obtained with a GFP-mCherry fusion protein, showing that binding is regular and stable (Supplemental Figure 1). Due to photobleaching, misfolding or fluorescent proteins being “off” or in dark states (33), cross-correlation is not expected to be 100%. In our experience it is useful to separate the protein under investigation and the fluorophore by a short flexible linker sequence (20). We have moreover experienced that the relative abundance of the factors should be as similar as possible and care should obviously be taken to work in the linear range of the photodetector. Transient expression systems provide relatively high concentrations of the expressed proteins and cells exhibiting low levels of expression should in general be chosen for analysis. Whenever possible and meaningful, we prefer lysates since the concentration of the factors can be accurately controlled to achieve the best possible autocorrelation amplitude and accuracy of the assay. Finally, it is important to underscore that the assay can only be considered for its positive (reduced binding) predictive value since protein binding in principle may occur flawlessly despite loss of e.g. enzymatic activity or incorrect splicing as described below.
In general, FCCS findings aligned with the current SGE, ClinVar, and AlphaMissense classifications of the examined variants. Reduced binding was associated with pathogenic or ambiguous variants in both live cells and after analysis of the isolated domains in lysates. Two known pathogenic variants, T37K and R71G, exhibited normal binding in live cells. The solvent exposed RING T37 locates to a small cavity between the BRCA1-BARD1 heterodimerization interface and the catalytic site (34). There are three known variants at this position (T37K, T37R and T37A) and the results provide an example of the limitations, as well as the capacity of FCCS. In contrast to the T37K variant, the T37R reduced binding in live cells, whereas the T37A had no effect on the binding in agreement with the classification as benign. In lysates using the isolated RING domain, it was apparent that T37K had a small but significant effect on binding, whereas the T37R exhibited a greater loss of binding, likely reflecting that arginine, which forms a larger number of electrostatic interactions, perturbed the overall structure to a larger extent. R71G has been documented to reduce splicing (22) and the reduced mRNA expression is also reflected in the SGE analysis (Table 1). Since the employed constructs are intron-less it was expected to be similar to the wild-type. For BRCT variants, we observed a single discrepancy with the SGE in the case of the T1684A variant, that reduce binding moderately by ∼50%, whereas the SGE shows a reduced activity (score -0.62) that was insufficient to classify the variant as pathogenic. The difference may be explained by the thresholds of the assays and in agreement with a small but significant functional reduction ClinVar depicts the variant as a VUS and AlphaMissense find it is ambiguous.
The domain-specific analyses reconciled the results from live cells but also revealed that variants overall had a larger impact on binding in the isolated domains. This was particularly evident for variants in the conserved BRCT domains, where even benign variants reduced binding to RBBP8/CtIP. The observation underscores the biological significance of molecular assemblies, in the sense that the concerted action of the factors may stabilise binding of each individual factor. Moreover, the results may point at a role for the long stretches of intrinsically disordered sequence. The analyses of isolated domains e.g. the T37 variants also provide a direct illustration of the deleterious effect of genetic variants as a continuum in the same way as demonstrated by the quantitative SGE screening (4). In this way pathogenic variants should perhaps be regarded as variants that fall below a certain threshold in the majority of the population, whereas benign variants remain over the threshold in the majority of the population. Consequently, particular variants may only be harmful for susceptible persons and in the future this may warrant the use of quantitative variant analysis such as FCCS, perhaps in combination with the analysis of other host factors e.g. chaperone activity. A recent study have shown that protein-folding chaperone binding patterns may affect the pathogenicity of variants in the BRCA1 C-terminal (BRCT) domain. HSP70 selectively binds pathogenic BRCA1-BRCT variants and the magnitude of binding was correlated to loss of folding and function (35). So chaperone levels may be envisioned to reduce the penetrance of particular variants. Along this vein it may also be added that chemical chaperones have successfully been applied to re-establish the function of mutant proteins in monogenic diseases (35-38). In this scenario, FCCS may be helpful to monitor the effect of the chemical chaperone.
In order to generalize the results and examine other functional domains, we also examined the feasibility of employing the method for analysis of HNPCC and MEN1 variants in MSH2 and Menin, respectively. MSH2 is composed of five structural domains including a DNA mismatch binding domain, a connector and a lever domain with an incorporated Clamp domain and a C-terminal ATPase domain (39). The pathogenic P622L and C697F variants are located in the ATPase domain in two distinct beta strands. MSH2 and MSH6 exhibited a high cross-correlation (37%) and binding was clearly reduced by the two variants. Interestingly, we also observed that impaired heterodimerization leads to a significant increase in the diffusion constant of cytoplasmic MSH2, suggesting that other factors may be part of the complex even at this stage. Menin is a single domain scaffold protein regulating gene transcription and cell signalling (40). The protein has been described to attain the shape of a curved hand and JUND binds the central pocket spanning residues 27–47, which is relatively far from the mutated R237 and R242. The observed decrease in binding serves to underscore that the entire amino acid composition of this type of proteins has been fine-tuned during evolution and may support the concept of structural based analysis for classification of genetic variants.
FCS and FCCS are recognized as reproducible, fast, and cost-effective methods. They only require a confocal microscope and lab space for cell transfection. Vectors and cDNAs are moreover widely available and can be obtained within a few days. Including the time for transfection and recordings, the analyses can be completed in just two workdays, making FCCS particularly appealing for clinical settings. While we do not envision FCCS for systematic functional screens like those conducted for BRCA1 (4) or MSH2 (41) variants, it holds significant promise for the functional assessment of individual patient variants in a wide range of monogenic diseases. We tested four different protein interactions involving various structured domains and successfully identified pathogenic variants in all cases. Thus, FCCS could serve as a generic approach offering a valuable supplement to current diagnostic procedures in genetics and genomic medicine.
Data Availability
All data produced in the present study are available upon reasonable request to the authors
MATERIAL AND METHODS
Cell lines
HeLa cells (ATCC® CCL-2) were cultured in Dulbecco’s Modified Eagle Medium (DMEM), high glucose, no glutamine, no phenol red (Thermo Fisher Scientific, Cat. No. 31053028), supplemented with 1X GlutaMAX (Thermo Fisher Scientific, Cat. No. 35050038), 1 mM sodium pyruvate (Thermo Fisher Scientific, Cat. No. 11360039), 10% FBS (Biowest, Cat. No. BWSTS1810) and 1% penicillin/streptomycin (Thermo Fisher Scientific, Cat. No. 15070063) and were grown in a humidified incubator at 37 C and 5% CO2.
Plasmids
The coding sequence of BRCA1 was PCR-amplified and cloned into pEGFP-C1 (Clontech) by restriction site digestion (SalI/SacII) and ligation. BARD1 and RBBP8 coding sequences were also PCR-amplified and cloned into pmCherry-C1 (Clontech). The sequences encoding for amino acids 1-109 and 1651-1864 of BRCA1 VCEP were separately cloned into pEGFP-C1 to generate vectors expressing the EGFP-tagged RING and BRCT trimmed domains, respectively. EGFP was PCR-amplified from pEGFP-C1 and cloned into pmCherry-C1 by restriction enzyme digestion in order to obtain the pmCherry-EGFP fusion protein. All inserts were confirmed by Sanger sequencing. Scale-up of plasmid DNA was performed with GeneJET Plasmid Midiprep Kit (Thermo Fisher Scientific, Cat. No.: K0481).
Cell seeding and plasmid transfection
For experiments in live cells, 200,000 HeLa cells were seeded in 4-well chambered glass-bottom coverslips (µ-Slide 4 Well Glass Bottom, Ibidi, Cat. No.: 80427). 4-5 hours after, cells were transfected using FuGene HD transfection reagent (Promega, Cat. No.: E2311). Briefly, 3 µl of FuGENE® HD Transfection reagent was added to 50 µl of Opti-MEMTM (Thermo Fisher Scientific, Cat. No.: 31985062). Next, 500 ng of plasmid DNA were added, and the mixture was incubated for 15 minutes before addition to the wells. To achieve similar but slightly higher mCherry-tagged protein expression levels, GFP:mCherry vectors were mixed at a 2:1 ratio. For experiments in lysates, 250,000 cells were seeded per well in 6-well plastic-bottom plates (Nunc Cell-Culture Treated Multidishes, Thermo Fisher Scientific, Cat. No.: 140675). 4-5 hours after, cells were transfected using FuGene HD transfection reagent. Briefly, 18 µl of FuGENE® HD Transfection reagent were added to 600 µl of Opti-MEMTM. Next, 1 μg of plasmid DNA was added and the mixture was incubated for 15 minutes before 100 µl were added to each well. To achieve similar protein expression levels, GFP:mCherry vectors were mixed either at a 1:2 ratio (RING domain:RBBP8), 1:4 ratio (BRCT domain:RBBP8), 1:1 ratio (MSH6:MSH2), 1:1 (Menin:JUND).
Confocal microscopy imaging
HeLa cells were seeded in 35 mm glass bottom dishes (No. 1.5 Coverslip 14mm G, uncoated, MatTek Corporation, Cat. No. P35G-1.5-14-C), transfected with GFP or/and mCherry vectors and subsequently imaged ∼24h after transfection. Confocal images were obtained using a Zeiss LSM780 confocal microscope with a Plan-Apochromat 63°ø/1.4 NA oil objective.
Fluorescence Correlation Spectroscopy (FCS)
FCS measurements were recorded the day after transfection, selecting cells where the expression level of GFP and mCherry-tagged proteins was low. FCS measurements were performed with a Zeiss LSM780 confocal microscope using a C-Apochromat 40×/1.2 W Corr M27 objective and Immersion oil Immersol W 2010 (Zeiss). GFP or mCherry measurements were performed with an Argon laser with a 488 nm excitation wavelength (0.1% laser power) and a DPSS laser with a 561 nm excitation wavelength (0.1% laser power), respectively. GFP fluorescence was captured with a detection window of 482–553 nm and mCherry fluorescence was captured with a detection window of 590–695 nm, as described (20). Before each measurement, average molecular count rate (kHz per molecule) was checked at different laser powers to ensure that fluorescence count signal was linear with laser power and not in saturation. Measurements were performed at randomly picked volumes in the cell nucleus for 1.5 minutes in 30 s intervals. The coverslip was taken from the incubator (37° C) and sealed with parafilm, before they were mounted on the microscope. The microscope measurements were taken at room temperature and coverslips were kept in the incubator until analyses. The experimental autocorrelation curves were obtained and analysed in ZEN 2011 software (Zeiss). The fits of the different models to the experimental data were also done in ZEN 2011, using their in-built models as described (20).
Fluorescence Cross-Correlation Spectroscopy (FCCS)
FCCS measurements in live cells were taken the day after transfection, selecting cells where the expression level of GFP and mCherry-tagged proteins was low. Recordings were performed in a Zeiss LSM780 confocal microscope using a C-Apochromat 40×/1.2 W Corr M27 objective and Immersion oil Immersol W 2010 (Zeiss). GFP or mCherry measurements were performed with an Argon laser with a 488 nm excitation wavelength (0.1% laser power) and a DPSS laser with a 561 nm excitation wavelength (0.1% laser power), respectively. GFP fluorescence was captured with a detection window of 482–553 nm and mCherry fluorescence was captured with a detection window of 590–695 nm. Before each measurement, average molecular count rate (kHz per molecule) was checked at different laser powers to ensure that fluorescence count signal was linear with laser power and not in saturation. Measurements were performed at randomly picked volumes in the cell nucleus for 1.5 minutes in 30 s intervals. The coverslip was taken from the incubator (37 ° C) and sealed with parafilm, and measurements of two wells (2/4) were taken 5–30 min later. Coverslip was placed back to the incubator for 30 min before taking the FCCS measurements of the remaining 2 wells. Experimental autocorrelation and cross-correlation curves were obtained and analysed in ZEN 2011 software (Zeiss). Cross-correlation and autocorrelation amplitude values, needed to calculate the cross-correlation/autocorrelation ratios, were extracted from the average of the amplitude values G(τ) from either curve in the area of maximum amplitude. Cross-correlation (CC)/autocorrelation (AC) ratio was calculated with the following formula: CC/AC ratio = [G(τ)CC – 1]/[G(τ)AC – 1].
FCCS on lysates was performed the day after transfection. Cells were lysed at room temperature in 100 µl of lysis buffer containing 20 mM Tris-HCl pH 7.5, 140 mM KCl, 1.5 mM MgCl2, 1mM DTT and 0.5% NP-40 supplemented with 1:300 mammalian protease inhibitor cocktail (Sigma). Cell lysates were briefly centrifuged at 800 x g for 1 min and supernatant was transferred to a 35 mm glass bottom dish (No. 1.5 Coverslip 14mm G, uncoated, MatTek Corporation, Cat. No. P35G-1.5-14-C) before being subjected to FCCS. All the settings described above for FCCS analyses in live cells were also used for FCCS in lysates, except for the laser power, which was optimized as follows. For GFP-tagged RING domain variants and mCherry-tagged BARD1, GFP or mCherry measurements were performed with an Argon laser with a 488 nm excitation wavelength (0.3% laser power) and a DPSS laser with a 561 nm excitation wavelength (0.5% laser power), respectively. For GFP-tagged BRCT domain variants and mCherry-tagged RBBP8, GFP or mCherry measurements were performed with an Argon laser with a 488 nm excitation wavelength (0.4% laser power) and a DPSS laser with a 561 nm excitation wavelength (0.4% laser power), respectively. The same laser settings were used for GFP-tagged MSH2 and mCherry-tagged MSH6 and GFP-tagged Menin and mCherry-tagged JUND.
ACKNOWLEDGEMENTS
The study was supported by the Danish National Research Fund for Infrastructure. Lelde Kalnina is thanked for help with the generation of the MSH2 variants.