INTRODUCTION
Aging leads to progressive dysregulation of immune function affecting nearly all components of the immune system (Montecino-Rodriguez et al., 2013; Nikolich-Zugich, 2018 ), in parallel with the loss of physiological fitness (Lo´ pez-Otı´n et al., 2013) and increased vulnerability to multiple diseases (Niccoli and Partridge, 2012). Understanding the impact of aging on the immune system requires an integrative approach due to the complexity of its cellular ecosystem as defined by specific body location, overall health status, and age itself. Indeed, a number of observations reported distinct age-specific alterations in discrete immune cell populations both in mice and humans (Alpert et al., 2019; Dulken et al., 2019; Goronzy and Weyand, 2017; Hashimoto et al., 2019; Kimmel et al., 2019). Furthermore, recent unbiased single-cell analyses revealed complex age-related changes in cell populations in mouse tissues (Almanzar et al., 2020; Angelidis et al., 2019; Dulken et al., 2019; Kimmel et al., 2019; Martinez-Jimenez et al., 2017; Schaum et al., 2020; Ximerakis et al., 2019). However, common and tissue-specific changes in the components of the aged immune system as well as their cross-species conservation still remain poorly understood.
To unravel the complex age-related changes in the immune system, we sought to investigate major cellular determinants of aging in mice and humans. To that end, we used a combination of single-cell RNA sequencing (scRNA-seq) and T and B cell receptor (TCR, BCR) sequencing of immune cells, followed by extensive validation using flow cytometry in independent cohorts supported by the focused application of functional and imaging approaches. We paralleled our findings in mice with a detailed analysis of human blood cell subpopulations using flow cytometry, mass cytometry (CyTOF), and simultaneous epitope and transcriptome measurements in single cells (CITE-seq). As a result, we comprehensively described the cellular landscape of immune aging and discovered a novel cellular hallmark of aging—expansion of a clonal granzyme K (GZMK)-expressing CD8+ T cell population that is conserved between the two species. Functional experiments strongly suggest that, in mice, a distinct subset of GZMK+ CD8+ T cells develops under the impact of an aged environment and can contribute to an inflammaging phenotype via increased secretion of GZMK. Finally, in order to make the data and analyses readily available to readers, we set up a dedicated resource that provides data in both raw and processed format for visual exploration of immune cell populations from young and old mice and humans (https:// artyomovlab.wustl.edu/immune-aging).
RESULTS
Shared and Organ-Specific Changes in the Aging Immune System
We analyzed the immune system in young (3–6 months) and aged (17–24 months) C57BL/6J mice. In agreement with published results (Mahmoudi et al., 2019), aged mice showed increased blood levels of multiple cytokines and chemokines (in-flammaging) (Figure S1A). To characterize cellular components of the immune system, we first obtained scRNA-, TCR-, and BCR-seq profiles for 34,771 CD45+ immune cells simultaneously sorted from four organs of young and aged mice: an immune organ (spleen), peripheral immune compartment (peritoneal cavity), respiratory organ (lungs), and metabolically active organ (liver) (Figure 1A). Clustering of the data revealed the major immune cell populations in the individual organs (Figures 1B–1D and S1B–S1F). Comparative analysis identified a number of changes in the proportions of cells and gene expression levels altered in aging (Figures 1B–1D and S1E; Table S1). The most striking observation was the accumulation of GZMK+ CD8+ T cells in all four profiled organs. Organ-specific changes included an age-associated decrease in natural killer (NK) cells and naive T (Tn) cells; alterations in B cells in the spleen and peritoneum; changes in macrophage subpopulations in peritoneum, the lungs, and the liver; and an increase in splenic regulatory T (Treg) cells (Figures 1B–1D).
Organ-Specific Changes in the Aging Innate Immune System
We performed flow cytometry analysis of an independent cohort of young and aged mice to validate age-associated changes in myeloid and innate lymphoid cells (Figures 2A and 2B). These alterations included a significant decrease in peritoneal macrophage populations (Figures 2C and S1G–S1I) and a decrease in proportions of lung alveolar macrophages (AMF) (both resting and Ki67+ proliferatively active cells) (Figures 2D and S1J-S1L).
This process was counteracted by an increase in markers of monocyte-derived origin (CCR2 and CX3RC1) in both lung interstitial macrophages (IMF) and hepatic macrophages in aged mice (Figures 2D–2F and S1M). Lung and hepatic neutrophils also increased in aged mice (Figures S1N and S1O). We also validated a profound decrease of plasmacytoid dendritic cells (pDC) in the liver and spleen, as well as a decrease of splenic conventional dendritic cells 1 (cDC1) and peritoneal cDC2 (Figures S2A– S2E). In agreement with the scRNA-seq data, proportions of NK cells were decreased in aged mice across multiple organs (Figures 2G and S2F–S2H). Moreover, proportions of hepatic innate lymphoid cells type 1 (ILC1) and lung innate lymphoid cells type 2 (ILC2) were also decreased in aged mice (Figures S2G–S2J).
Aging Results in Organ-Specific Changes in B Cells
Although aging did not alter proportions of major populations of B cells (Figures S2K–S2M), it increased splenic and peritoneal Zbtb32+ B cell subsets (Figures 2H–2J). Zbtb32+ B cell subsets were characterized by distinct transcriptional profiles: splenic Zbtb32+ B cells expressed Fcrl5 and were phenotypically similar to previously described age-associated B cells (ABCs) (Hao et al., 2011), while peritoneal Zbtb32+ B cells expressed Zcwpw1 and can be broadly identified as B-1 cells (Baumgarth, 2011) (Figures 2H–2J and S2M). Interestingly, Zbtb32+ B cells did not highly express markers of plasma cells (Xbp1 and Derl3) and, thus, were distinct from a recently reported B cell population accumulating in aged mice (Schaum et al., 2020).
In the context of the adaptive immune system, single-cell profiling is particularly powerful due to its ability to resolve the repertoire of BCR sequences. In young mice, only peritoneal Zbtb32+ B-1 cells demonstrated clonal features of the BCR repertoire (Table S2). B-1 cells are known to have a restricted BCR repertoire and their BCRs recognize self-antigens (Baumgarth, 2017). The characteristic BCR for peritoneal Zbtb32+ B cells (VH11-2 [HCDR3: CMRYSNYWYFDVW] together with VK14-126 [LCDR3: CLQHGESPYTF]) has been shown to recognize phosphatidylcholine (Singh et al., 2017), a cellular membrane component, suggesting that these cells may be self-reactive. Global BCR clonality, class switching, and BCR mutation frequencies showed only a modest increase in aged compared to young mice in all four organs (Figures S2N–S2R). Two B cell subsets that expanded with age, peritoneal B-1 cells (cluster 3) and splenic Zbtb32+ B cells (cluster 7), demonstrated an increase in clonality in aged mice (Figures 2K, S2M, and S2N; Table S2). Moreover, splenic Zbtb32+ B cells, but not Zbtb32+ B-1 cells from peritoneum, demonstrated high levels of BCR mutation rates (Figure S2R).
CD4+ T Cells in Aged Mice: Non-clonal Proinflammatory Remodeling
The total fraction of CD4+ T cells did not change but proportions of Tn cells decreased, whereas effector memory (Tem) and activated CD4+ T cells increased across all studied organs in aged mice (Figures 2L–2N and S3S–S3X). In agreement with the recently reported data (Elyahu et al., 2019), proportions of FOXP3+ Treg cells increased in the spleen of aged mice, however, were unaffected in other organs (Figure 2N). This was accompanied by upregulation of a specific gene signature including checkpoint inhibitor genes Pdcd1, Tigit, and Lag3 and the transcription factor Batf (Figures S3X–S3Y). We found that the agingspecific transcriptional signature of splenic Treg cells was enriched among the Batf-expressing tissue-specific precursor KLRG1+ Treg cells (Delacher et al., 2020) (Figures S3Q and S3R) suggesting that a Batf-dependent regulatory program might be responsible for aging of splenic Treg cells.
A universal age-specific feature of CD4+ T cell remodeling was an increase in their activated status and PD-1 expression observed in six different tissues (Figures 2N and S2V). In line with that, splenic CD4+ T cells from aged mice produced more inflammatory cytokines such as IFNg and IL-17A (Figure 2O). To evaluate if this phenotype was associated with a clonal response, we analyzed the paired TCRa/TCRb repertoire in CD4+ T cells. We found that CD4 TCR diversity was not altered in aged compared to young mice within the studied tissues (Figures 2P and S3A–S3C; Table S3), although that may be due to a relatively small number of analyzed TCRs. Moreover, we found little to no sharing of CD4 TCR clones between tissues in both young and aged mice (Figures 2P and 2Q). These results indicate that accumulation of proinflammatory CD4+ T cells in various organs in aging is not associated with a profound alteration in their TCR diversity.
GZMK+ CD8+ T Cells Bear Markers of Exhaustion and Expand with Age in Multiple Organs
We next focused on the most profound age-associated remodeling of the CD8+ T cell compartment. Proportions of CD8+T cells decreased in the spleen but increased in the liver and the brain and were unchanged in peritoneum, the lungs, and kidney (Figures S3H, S3P, and S3R). Yet, the changes within CD8+T cell subsets were very consistent across all examined organs in aging: the fraction of CD8+ Tn cells decreased, while the fraction of PD-1+ CD8+ T cells increased (Figures 3B, S3I–S3K, S3M, S3P, and S3R). However, in contrast to CD4+ T cells, expression of PD-1 on CD8+ T cells marked the emergence of a distinct cell subset that was virtually not present at young age (Figures 3A–3C). These age-associated (Taa) CD8+ cells were characterized by expression of Gzmk as one of the most specific markers. To confirm that the emergence of Gzmk+ CD8+ Taa cells was not specific to the local housing or profiling conditions, we reanalyzed two recently published scRNA-seq datasets that pro-filed total tissues of aging mice (Almanzar et al., 2020; Kimmel et al., 2019). In these datasets, Gzmk-expressing CD8+ T cells were increased in aged mice as early as 18 months in multiple organs (Figures S4A and S4B). Notably, Gzmk expression was not detectable in cells other than the subpopulation of CD8+ T cells (Figures 1B–1D, S4A, and S4B).
CD8+ Taa cells were distinct from conventional Tem cells and were characterized by co-expression of the transcription factor TOX and checkpoint molecules including PD-1 (Figures 3C and S3L–M, S3S, and S3T). scRNA-seq data demonstrate robust co-expression of Gzmk and Pdcd1 transcripts specific to Taa cells, confirming that PD-1 (and TOX) can be used to identify this cell population in healthy mice. Accordingly, flow cytometry analysis using TOX and PD-1 as markers of CD8+ Taa cells confirmed that proportions of these cells increased in aged mice across all studied organs alongside with infiltration of CD45+ immune cells (Figures 3D–3E and S3N–S3R). Because CD8+ Taa cells expressed markers associated with exhausted T (Tex) cells and TOX, a transcriptional regulator of exhaustion (Alfei et al., 2019; Khan et al., 2019) (Figures 3C, S3S, and S3T), we compared their transcriptional signatures with CD8+ Tex cells generated in chronic lymphocytic choriomeningitis virus (LCMV) infection in mice (Doering et al., 2012). CD8+ Taa cells matched the exhaustion signature acquired at day 30 (that is terminally exhausted), but not day 8, post LCMV infection (Chen et al., 2019) including increased expression levels of Gzmk, Tox, and Eomes but decreased Tbx21 and Tcf7 (Figures S4C–S4H). Next, we compared Taa cells with the various subsets of Tex cells (Beltra et al., 2020; Chen et al., 2019) and found that they are phenotypically similar to terminal Tex cells (Figure S4H). Importantly, transcriptional analysis demonstrated that CD8+ Taa cells were different from the recently described virtual memory CD8+ Tvm cells that expand in aged mice (Chiu et al., 2013; Quinn et al., 2018) (Figure S4I).
Age-Associated GZMK+ CD8+ T Cell Population Is Highly Clonal and Has Unique Inflammatory Potential
Next, we evaluated TCR repertoires among CD8+ T cell subsets. Contrary to the CD4+ T cell data, we observed an increased clonality among aged CD8+ T cells that was concentrated within the CD8+ Taa cell clusters but not within the Tem subsets (Figures 3F–3H and S3A–S3E). Remarkably, identical TCR clones were found simultaneously in different organs in aged, but not in young, mice (Figure 3G; Table S3). We further examined TCR repertoires of three different aged mice and found that age-associated clones were distinct for each mouse, involving different TCR clonotypes of unknown origin (Figures S3F and S3G; Tables S3 and S4). Together, these results show that CD8+ Taa cells undergo clonal expansion and accumulate common TCR clones in various organs in aged mice.
The presence of a clonally expanded cell population is typically associated with a history of T cell activation. Indeed, expression levels of IFNg and CCL5 were increased in CD8+ Taa cells similar to the CD8+ Tem cells (Figures 3I, S4J, and S4L). However, when total CD8+ T cells from an aged mouse were stimulated in vitro, they secreted significantly higher amounts of GZMK, while production of granzyme B (GZMB) remained the same (Figure S4K). This phenotype was due to CD8+ Taa cells that secreted signifi-cantly more GZMK and similar levels of CCL5 compared to CD8+ Tem cells from aged mice but, unlike CD8+ Tem cells, failed to efficiently secrete IFNg and GZMB upon TCR activation (Figures 3J and 3K). These data demonstrate that CD8+ Taa cells have a specific proinflammatory profile uniquely characterized by enhanced production of GZMK.
CD49d Is a Marker of CD8+ Taa Cells that Is Associated with Tissue Localization
We next predicted cell-cell ligand-receptor interactions (Efremova et al., 2020) between CD8+ Taa cells and other cell types using recently published scRNA-seq profiles of total tissues of the spleen, the lungs, and kidney in young and aged mice (Kimmel et al., 2019) (Figures 4A and 4B). To this end, we first re-clustered these data at finer resolution, which yielded a distinct Gzmk+ CD8+ T cell cluster in all three organs (see STAR Methods). The analysis revealed a number of organ-specific interaction pairs involving the Gzmk+ CD8+ T cell cluster; however, interactions through integrins highly expressed by these cells (e.g., a4b1, a4b7, etc.) were found in all tested organs (Figures 4B and S4M). In line with that, CD8+ Taa cells had increased levels of membrane protein expression of the a4 integrin subunit (CD49d) both in tissues and in blood (Figures 4C–4E). In circulation, CD49d expression was highly specific to the CD8+Taa cell population identifying constitutive expression of CD49d as one of the key markers of these cells.
Various immune and stromal cells were predicted to interact with a4b1 expressed by CD8+ Taa cells via fibronectin (FN1), receptor for urokinase plasminogen activator (PLAUR), junctional adhesion molecule B (JAM2), vascular cell adhesion molecule-1 (VCAM-1), and osteopontin (SPP1) (Figures 4A and 4B). For further analysis and validation, we focused on FN1 as a prototypical molecule that can bind to components of the extracellular matrix and CD49d expressed by immune cells. In various organs in aged mice, we confirmed that PD-1+ CD8+ T cells can be found in FN1-enriched areas where they physically interact with FN1 (Figures 4F–4G and S4N). These results suggest that the FN1-CD49d axis may control tissue homing and interactions of CD8+ Taa cells.
FN1 expression was detected in various clusters of macrophages (Figure 4A) and specifically in Ccr2+ and Cx3cr1+ subsets, which were increased in aged mice (Figures 1B–1D and S4O–S4Q). In fibroblasts, expression of Fn1 was accompanied by a high basal gene expression level of the senescence-associated secretory phenotype (SASP) components, e.g., Il6, Ccl2, and Cxcl1, in various organs (Figure 4H). Because fibroblasts appear to be poised to express SASP genes, we speculated that, upon homing to FN1 rich areas, CD8+ Taa cells might affect the SASP of fibroblasts via GZMK. GZMK is a member of the granzyme family with a distinct proteolytic activity, and it has been reported to induce proinflammatory phenotypes in stromal cells (Bovenschen et al., 2009; Joeckel et al., 2011; 2017). First, we found that, in mouse 3T3 fibroblasts, exogenously added GZMK dramatically boosted IFNg induced secretion of IL-6 and CCL5, factors of inflammation increased in aging (Faget et al., 2019) (Figure 4I). Next, we evaluated if GZMK by itself can enhance the SASP. To that end, we induced in vitro senescence and the SASP in 3T3 fibroblasts utilizing a doxorubicin model (Figures S4R–S4U). Intriguingly, GZMK significantly increased SASP components such as IL-6, CCL2, and CXCL1 in this model (Figure 4J). This indicates that even in the absence of classical inflammatory cytokines, secretion of GZMK has the potential to exacerbate the SASP and suggests that GZMK released from CD8+ Taa cells, alone or in combination with IFNg, may be an important regulator of inflammatory processes in aging (Figure 4K).
Development of CD8+ Taa Cells Is Driven by Extrinsic Components of an Aged Environment
Next, we investigated whether the accumulation of CD8+ Taa cells in aged mice depends on a cell-intrinsic mechanism or can be driven by an aged environment. To this end, we transferred congenically labeled splenic CD8+ T cells from young mice into young or aged mice (Figure 5A). In aged mice, 1 month post transfer was sufficient to convert circulating donor CD8+ T cells into the TOX+ PD-1+ phenotype to the same extent as host CD8+ T cells (Figures 5A–5C), increase expression of CD49d by these cells, and promote their accumulation into various organs (Figures 5D–5F, S5A, and S5B). Importantly, the CD8+ T cell transfer into a young host did not result in such phenotypes (Figures 5A–5F, S5A, and S5B). We reproduced these results using aged CD8+ T cell-depleted hosts (Figures S5C– S5F) and performed scRNA-seq analysis of donor and host splenic CD8+ T cells (Figure S5G). Importantly, the conversion of CD8+ T cells into the TOX+ PD-1+ phenotype induced a global transcriptional program very similar to native CD8+ Taa cells in these mice (Figure S5G).
Finally, to study if the CD8+ Taa cells depend on an aged environment for their survival, we transferred CD8+ T cells (including the TOX+ PD-1+ cell population) from aged to young mice (Figure 5G). Interestingly, CD8+ Taa cells maintained the TOX+ PD- 1+ phenotype and preserved elevated levels of CD49d in a young environment (Figures 5G–5K). Moreover, 1 month in the young host was sufficient for TOX+ PD-1+ CD8+ Taa cells from aged donors to infiltrate multiple organs (Figures S5H–S5J). Taken together, these results show that the development and expansion of CD8+ Taa cells depend on cell-extrinsic age-related factors and that differentiated CD8+ Taa cells have a stable phenotype that is linked to high capacity for homing into tissues.
scATAC-Seq Identifies EOMES as a Potential Regulator of GZMK+ CD8+ T Cells
To study underlying gene regulatory programs in aged immune cell populations, we performed scATAC-seq profiling of 17,081 CD45+ splenocytes from young and aged mice and identified 15 distinct clusters (Figures 6A and S5K–S5Q). scATAC-seq peak annotations demonstrated that these clusters reasonably matched transcriptional signatures of the main immune cell populations identified in scRNA-seq data (Figures 6A, 6B, and S5R). Interestingly, scATAC-seq clusters 9, 11, and 12 were profoundly expanded in aged compared to young mice (Figures 6A and 6B). In-depth analysis identified these clusters as three immune cell subsets that changed with age: Zbtb32+ ABCs (cluster 9), CD8+ Taa cells (cluster 11), Treg and activated CD4+ T cells (cluster 12) (Figures 6B and 6C). This included cluster-specific accessibility at the key cell-specific loci. For example, Zbtb32 and Tbx21 promoters were more accessible in ABCs (cluster 9) compared to other B cell clusters (Figure 6D). Similarly, CD8+ Taa cells (cluster 11) showed an increase in accessible chromatin levels in the Gzmk and Eomes promoters and Treg, and activated CD4+ T cells (cluster 12) showed an epigenetic accessibility in the promoter of highly expressed Ccr8 (Figure 6D). De novo motif discovery identified consensus transcription factor (TF) binding motifs specific for each cluster (Figures 5E and S5S). Specifically, CD8+ Taa cells demonstrated a strong enrichment with T-box motifs (Figure 6E). Among T-box TF genes, differential gene expression analysis showed increased Eomes but decreased Tbx21 expression levels (Figure S4D) and protein expression of EOMES was elevated whereas T-BET was decreased in CD8+ Taa cells compared to Tem cells (Figure 6F). These results suggest differential roles of EOMES and T-BET in the development of CD8+ Taa cells. Similarly, we followed up on the AP-1 motif enrichment (Figure 6E) emerging as an ageassociated feature of CD4+ T cells and confirmed elevated expression of BATF at the RNA and protein levels in age-associated Treg cells (Figures 5C and 5G).
To define the epigenetic relationships between CD8+ Taa cells and Tn, Tem, and Tex cells, we compared regions of highly accessible chromatin in these cell populations (Pauken et al., 2016). The epigenetic landscape of CD8+ Taa cells was overall similar to the one in LCMV-induced CD8+ Tex cells, matching multiple, but not all, peaks of active chromatin near genes that are characteristic for the CD8+ Tex cell epigenetic signature (Bengsch et al., 2018) (Figure S5T). These results highlight the similarity between CD8+ Taa and Tex cells showing both transcriptional and epigenetic parallelism between these cell populations, yet also suggest a distinct epigenetic regulation of CD8+ Taa cells in aged mice.
GZMK+ CD8+ T Cells Is a Hallmark of Inflammaging in Humans
Previous studies have shown that CD8+ T cells are profoundly affected by aging in humans (Goronzy and Weyand, 2017). Given the emergence of GZMK as a key marker of aging mouse CD8+ T cells, we first analyzed circulating populations of CD8+ T cells from non-obese healthy young and old men (Figure 7A; Table S5). In aged individuals, total CD8+ T cells within the PBMCs decreased, proportions of mucosal-associated invariant T cells (MAIT) relative to total CD8+ T cells did not change, yet signifi-cant remodeling was observed within the conventional CD8+T cells (Figures S6A–S6E). Strikingly, the fraction of GZMK+ cells within the conventional CD8+ T cells increased in old male donors, whereas aging had only a moderate effect on GZMB+ CD8+ T cells (Figures 7B, 7C, S6D, and S6E). A similar difference in proportions of GZMK+ CD8+ T cells was observed in young compared to old female donors (Figure S6F). The expression of GZMB versus GZMK was attributed to largely distinct subpopulations of CD8+ T cells (Figure 7D). Furthermore, GZMK+ , but not GZMB+ , CD8+ T cells positively correlated with plasma levels of IL-6, TNFa, and IL-8, key markers of inflammaging that were elevated in old individuals (Figures S6N and S6O and data not shown).
Detailed analysis of the conventional CD8+ T cells revealed a significant decrease in proportions of CD8+ Tn cells and an increase in Tcm and Tem cell populations but no significant changes in the Temra cell population in aged individuals (Figures 7E and 7F). GZMK expression was highest among the Tcm and Tem cells (Figure S6G). However, Tem cells also demonstrated profound levels of GZMB expression (Figure S6G). Given our observation that GZMK and GZMB mostly localize to different cells, this suggested that the Tem cell compartment might harbor distinct subpopulations.
Thus, to increase the cellular resolution, we analyzed PBMCs from this cohort using CyTOF (Figures S6H–S6J). An unbiased clustering resolved a number of subpopulations among CD8+ T cells, including a separate cluster that consisted of GZMK+ cells (Figures S6H–S6J). The classically defined CD8+ Tem cell population consisted of two major subpopulations: one expressing GZMK and the other expressing GZMB (Figures S6H–S6J). Consistent with the flow cytometry data, the relative abundance of cells from the GZMK+ Tem cluster significantly increased in old compared to young individuals (Figure S6K). GZMK+ Tem cells expressed CD27 and CD28 and were CD57– , while GZMB+ Tem cells were characterized by the lack of CD27 or CD28 expression and a CD57+ phenotype. CD8+ Temra cells consisted of the GZMB+ cells and were mostly CD57+ (Figures S6H–S6J).
Using flow cytometry, we validated that the CD57 and CD28 expression pattern defined two distinct subsets within CD8+Tem cells (Figure 7G). Furthermore, the expansion of the Tem cell population in old individuals was specifically due to an increase in its CD57– CD28+ subset (Figures 7G and 7H). Notably, CD57– CD28+ Tem cells expressed GZMK, while CD57+ Tem cells were a key GZMB-expressing subset (Figures 7I and 7J). Taken together, these data demonstrate an age-associated dichotomy: T cell subsets with the highest GZMK protein expression levels (CD57– CD28+ Tem and Tcm cells) expanded within the CD8+ T cells in old individuals, while two GZMB-expressing subpopulations (CD57+ Tem and Temra) did not appreciably alter with age in healthy individuals.
Transcriptional Landscapes of Human Clonal GZMK+ and GZMB+ CD8+ T Cells
To unbiasedly resolve cellular complexity associated with the accumulation of GZMK+ CD8+ T cells in the broader context of the aging immune system in humans, we performed scRNAseq and CITE-seq of PBMCs from young and old men (11 versus 10 individuals, 99,762 cells in total) and annotated 25 clusters corresponding to the main immune cell populations (Figures S7A and S7B). To integrate the expression of protein markers with transcriptional data at the single-cell level of resolution, we used CITE-seq with 39 antibody markers (Figure S7C). Proportions of classical monocytes were increased whereas clusters with gdT cells, MAIT cells, and conventional CD8+ T cells were decreased with age relative to total PBMCs (Figure S7D). Within the CD4+ T cell compartment, a decrease of the Tn and IFN-activated clusters was paralleled by an increase of Tem cells (Figure S7E). The composition of B cell clusters was not affected by aging with the exception of a decrease in a distinct cluster of ZBTB32+ B cells that has a transcriptional similarity with mouse Zbtb32+ ABCs (Figures S7F and S7G).
CD8+ T cells comprised clusters of MAIT cells, Tn, Tcm, and two distinct clusters of Tem cells (Figures 7K–7M, S7H, and S7I). CD8+ Tem subsets were characterized by mutually exclusive expression of Gzmk and Gzmb transcripts (Figures 7M and 7N) and, therefore, denoted as Tem-k and Tem-b. GZMKexpressing clusters of human CD8+ T cells have been described previously, e.g., in the context of scRNA-seq analyses of inflammatory diseases (Corridoni et al., 2020; Zhang et al., 2019). Guided by the results from the CyTOF and flow cytometry analyses, we leveraged CITE-seq data to establish that Tem-k subpopulation expresses a high level of CD45RO and a low level of CD57 (Figure 7L). This indicates that the Tem-k cluster corresponds largely to CD57– Tem cells. The Tem-b subpopulation was CD57+ and partly CD45RO+ , while another portion of this subpopulation expressed a high level of CD45RA (Figure 7L).
This indicates that the Tem-b cluster encompasses two subpopulations of GZMB-expressing cells: GZMB+ Tem cells and Temra cells. Indeed, a dedicated computational subclustering of the scRNA-seq data of non-naive CD8+ T cells revealed the presence of two distinct subpopulations within the Tem-b cluster that match Tem and Temra cells (Figure S7J).
In line with the flow cytometry data, these unbiasedly defined CD8+ Tcm and Tem-k clusters were increased in old compared to young individuals (Figure S7L). Pseudotime analysis of the single-cell transcriptional trajectories (Qiu et al., 2017) revealed a transition from Tcm to Tem-k to Tem-b clusters associated with an initial increase in GZMK expression level, followed by a shift toward GZMB expression, and accompanied by gradual changes in expression levels of genes that control T cell development and homeostasis (Figure S7K). These results suggest that the age-associated increase of Tem-k cells might be a reflection of the break down in the differentiation routine, albeit the possibility of an altered development of the terminal branch (Tem-b) cannot be excluded by pseudotime differentiation analysis alone.
Paired TCRa/TCRb repertoire analysis revealed that both CD8+ Tem-k and Tem-b clusters are highly clonal compared to Tn and Tcm clusters, whereas clusters of CD4+ T cells did not show evidence of clonal expansion in the same individuals (Figure S7M; Table S6). Furthermore, the expanded TCR repertoire in GZMK+ Tem-k cells was specific to this cluster, showing only a partial overlap (around 15% or less) with the GZMB+ cluster (Figure S7N). Comparison with public TCR sequences showed that cells from both Tem-k and Tem-b clusters expressed some TCRs that recognize epitopes from cytomegalovirus (CMV) and Epstein-Barr virus (EBV) (Table S6). However, CMV and EBV serology profiles were similar in young and old groups (Table S5), ruling out their roles in the expansion of clonal Tem cells in old individuals. Taken together, these results establish that clonal Tem-k cells are a phenotypically and transcriptionally distinct cell population that increases within the CD8+T cells in the blood with healthy aging.
scATAC-Seq of Human PBMCs Shows a Distinct Epigenetic Landscape of GZMK+ Tem Cells
Lastly, we investigated epigenetic determinants associated with human immune cells by performing scATAC-seq profiling of 45,674 PBMCs from six individuals (three young and three old men). Iterative peak calling and clustering identified 21 epigenetically distinct clusters within the PBMCs (Figure S7O). Among NK and T cell clusters, differential profiles of accessible chromatin revealed populations of CD8+ Tn cells (cluster 3), Tem-k cells (cluster 7), Tcm cells (cluster 9), Tem-b cells (cluster 5), and MAIT cells (cluster 8) (Figure S7O). This unbiased identification of the main CD8+ cell populations led us to employ de novo motif discovery tools in order to identify TF associated with their differentiation programs (Figure 7O). MAIT cells relied on the nuclear receptor (NR) RORgt motif, Tn cells were enriched for the TCF7 motif and replaced by a RUNX1 motif in Tcm cells, whereas two distinct sets of T-BOX and bZIP motifs were epigenetic signatures of Tem-k and Tem-b subpopulations (Figure 7L). The emergence of these motifs was concordant with the differential expression pattern for the corresponding transcription factors: RORC (which encodes RORgt), TCF7, TBX21 and EOMES (T-BOX), and BATF (bZIP) in equivalent scRNA-seq clusters of CD8+ T cells (Figures 7P and 7Q). These results show that GZMK+ CD8+ Tem-k and Tcm cells have distinct epigenetic landscapes and, similar to Taa cells in mice, may rely on a balance between T-BET- and EOMES-driven transcriptional regulation.
DISCUSSION
Here, we provide an extensively validated description of ageassociated changes in immune cell populations and integrate our key findings with published resources on mouse aging (Almanzar et al., 2020; Kimmel et al., 2019). Our data reveal profound common and organ-specific changes in the aged immune system and highlight the emergence of clonal GZMK+ CD8+ Taa cells as one of the most striking cellular hallmarks of immune aging. The potential impact of this cell population on aging physiology can be underscored by the fact that in mice of advanced age nearly half of all CD8+ T cells in circulation acquire the GZMK+ Taa phenotype.
We show that the development of CD8+ Taa cells depends on yet-to-be-identified signals in an old environment. Once differentiated into the Taa phenotype, these cells upregulate CD49d expression and acquire a capacity for efficient tissue homing. While CD8+ Taa cells share phenotypical and transcriptional features with Tex cells, they are ubiquitously distributed and show an increased capacity for homing in various organs and GZMK production. This suggests a potential contribution to systemic inflammation and perturbed immune responses in old individuals. GZMK, a proinflammatory member of the granzyme family (Joeckel et al., 2011, 2017), alone and in combination with IFNg, induces secretion of inflammatory cytokines by mouse fibroblasts. Thus, CD8+ Taa cells may amplify the SASP (Faget et al., 2019) by engaging both non-senescent and senescent stromal cells that contribute to systemic inflammation.
Several features of CD8+ Taa cells have been previously recognized in the context of immune aging: a distinct age-associated clonal expansion process characterized by CD49d upregulation (Clambey et al., 2008) and an increase in expression of inhibitory receptors linked to an impaired anti-viral response (Decman et al., 2012). Additionally, our results are in line with recent publications showing an overall increase of TCR clonality in aged mice (Almanzar et al., 2020; Kimmel et al., 2019) and indicate that a phenotypically distinct CD8+ Taa cell subset is the main driver of the restricted TCR repertoire. Intriguingly, the same TCR clones of CD8+ Taa cells are found across multiple organs, indicating that these cells circulate throughout the body and may impact physiology of the whole organism. Taken together, our results unbiasedly identify clonal GZMK+ CD8+Taa cells as a bona fide distinct subpopulation within healthy old animals and highlight their potential contribution to inflammaging through GZMK secretion.
scATAC-seq profiling indicates that a T-BOX-regulated program may support the transition of CD8+ T cells into the GZMK+ phenotype. Crosstalk between T-BET- and EOMESdependent regulation was previously shown in the development of CD8+ Tex cells (Chen et al., 2019), suggesting that a similar regulatory program might regulate ‘‘exhausted-like’’ CD8+ Taa cells. Moreover, an aged environment appears to initiate the development of CD8+ Taa cells by triggering a transcriptional program that includes high expression levels of EOMES. This indicates that age-associated features of the host organism, but not T cell-intrinsic signals, trigger the development of a stable CD8+ Taa cell population, and one can speculate that this could be attributed to non-conventional interactions between senescent cells and the adaptive immune system (for example, see Pereira et al., 2019). Moreover, aged T cells experience replicative stress and mitochondrial disfunctions (Goronzy and Weyand, 2017), which might be involved in the development of CD8+ Taa cells. Curiously, transfer of CD8+ Taa cells into a young host is not sufficient to reverse their phenotype or compromise their survival. On the contrary, transferred CD8+ Taa cells are capable of circulating and infiltrating multiple tissue niches in young mice. Further studies are needed to identify the signals of an aged environment that are critical to trigger the development of CD8+ Taa cells in mice and their relevance to human aging.
Finally, working with a cohort of healthy non-obese individuals, we show that a clonal GZMK+ CD8+ Tem-k cell subpopulation increases with age in humans. Similar to their mouse counterparts, these cells have a distinct epigenetic landscape and elevated EOMES expression levels indicating the involvement of T-BOX transcription factors in their development. Human CD8+ T cells, however, also include a distinct GZMB+subpopulation of Tem-b cells, which does not have a direct equivalent in mice. This divergence might be related to the fact that mice were aged in pathogen-free conditions, while humans have an experienced immune system affected by a multitude of pathogenic exposures. Indeed, it has been shown that CD57+Tem cells (which are GZMB+ ) are enriched in T cells reactive to CMV and EBV (Goronzy and Weyand, 2017; Wertheimer et al., 2014) and may sometimes accumulate with age due to the increased antigen exposure (Alpert et al., 2019). Together, our results identify GZMK+ CD8+ T cells as a hallmark of immune aging and highlight these cells, as well as GZMK itself, as potential targets to address age-associated dysfunctions of the immune system.
Limitations of the Study
One of the limitations of our study is the relatively low age of individuals in the old cohort; it is possible that more advanced age may result in a significant increase of both GZMK+ and GZMB+CD8+ T cells. Therefore, it is plausible that human aging is associated with two distinct processes: an accumulation of intrinsic age-related GZMK+ CD8+ T cell subpopulations as well as expansion of common pathogen-specific GZMB+ T cell subsets. In line with this, another limitation of this study is that age-associated alterations in the immune system are addressed in specific-pathogen-free mice. Despite that our results are consistent across several mouse colonies (assuming potential variations in their microbiota status), both commensal and pathogenic microbiota are a crucial component of natural aging process and might have an effect to immune system alteration and its dynamics during the lifespan. Lastly, the most rigorous definition of the GZMK+ Taa cell subpopulation in mice should be made using flow cytometry with corresponding monoclonal antibody, which currently is not available. However, orthogonal approaches demonstrate co-expression of GZMK, PD-1, and TOX in this cell population.
STAR+METHODS
Detailed methods are provided in the online version of this paper
and include the following:
● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
Lead Contact
Materials Availability
Data and Code Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
Mice
Human samples
Cell lines and primary cell culture
● METHOD DETAILS
Isolation of mouse immune cells
Transfer of mouse CD8+ T cells
In vitro activation of mouse CD8+ T cells
Treatment of 3T3 cells with granzyme K
Isolation of human PBMCs
Flow cytometry
CyTOF
Bioplex and ELISA
Histology and immunostaining
RNA isolation and bulk RNA sequencing
Bulk RNA-seq analysis
Bulk TCR sequencing and analysis
Single-cell RNA-seq
Single-cell RNA-seq analysis
Removal of ambient RNA contamination from scRNAseq data
Monocle2 pseudotime transcriptional trajectory analysis
Processing and analysis of public scRNA-seq datasets
Cell-cell interaction network analysis
Single-cell BCR repertoire analysis
Paired scTCRa/TCRb repertoire analysis
Single-cell ATAC-seq
Single-cell ATAC-seq analysis
Analysis of public ATAC-seq datasets
De novo TF motif analysis
● QUANTIFICATION AND STATISTICAL ANALYSIS
Statistical analyses of biological data
SUPPLEMENTAL INFORMATION
Supplemental Information can be found online at https://doi.org/10.1016/j. immuni.2020.11.005.
ACKNOWLEDGMENTS
The work was supported by grant from Aging Biology Foundation (USA) (M.N.A.) and NIH grants CA217208-01A1 and AG059244-01A1 (S.A.S.). We would like to thank Dr. Gwendalyn Randolph and Peter Wang for providing Cx3cr1-GFP reporter mice, Dr. Diane Bender for assistance with multiplex techniques, and Dorjan Brinja and Dr. Erica Lantelme for assistance with cell sorting.
AUTHOR CONTRIBUTIONS
D.A.M. and M.N.A designed the study, analyzed the data, and wrote the manuscript with input from the other authors. S. Brioschi, I.S., Z.Y., A.L., L.A., M.K., A.S., S. Burdess, and D.A.M. performed experiments with mice and PBMCs. L.A. directed human subject recruitment and sample collection. D.A.M., I.S., and M.K. performed in vitro experiments. L.A., A.S., and D.A.M. prepared samples for single-cell techniques. M.B. and D.A.M. performed confocal microscopy experiments. P.S.A., D.A.M and M.N.A. analyzed scRNA-seq and CITE-seq data. O.S., D.A.M., and M.N.A. analyzed scATAC-seq data. E.E. analyzed scTCR-seq, scBCR-seq, and CyTOF data. O.S. developed scATAC-seq analysis pipeline. K.Z. developed the single-cell browser. Important mice, reagents, and experimental techniques were provided by M.C., S.A.S., and S.G. O.S. and P.S.A. contributed equally to this work.
DECLARATION OF INTERESTS
The authors declare no competing interests.
Received: March 3, 2020
Revised: August 13, 2020
Accepted: November 10, 2020
Published: December 2, 2020
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be addressed to the Lead Contact, Maxim N. Artyomov (该Email地址已收到反垃圾邮件插件保护。要显示它您需要在浏览器中启用JavaScript。).
Materials Availability
Further information and material requests should be addressed to Maxim N. Artyomov (该Email地址已收到反垃圾邮件插件保护。要显示它您需要在浏览器中启用JavaScript。).
Data and Code Availability
Raw and processed mouse bulk RNA-seq, bulk TCR-seq, scRNA-seq, and scATAC-seq data are deposited to GEO repository (GSE155006) and https://www.synapse.org repository (syn22255433). Raw and processed human scRNA-seq and scATAC-seq data are deposed to https://www.synapse.org repository (syn22255433). The pipeline with code to process and visualize the scATAC-seq data is available at https://github.com/JetBrains-Research/sc-atacseq-explorer. Processed data and additional analysis files with differential expression data are available at the dedicated online portal at https://artyomovlab.wustl.edu/immune-aging.
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Mice
Wild-type (WT) C57BL/6J mice were purchased from the Jackson Laboratory and bred at Washington University in St. Louis or purchased from National Institute of Aging (USA). B6.SJL-Ptprca Pepcb /BoyJ (CD45.1) mice were purchased from the Jackson Laboratory. CX3CR1gfp/+ (CX3CR1-GFP) knock-in mice, in which the Cx3cr1 gene has been replaced by green fluorescent protein (GFP) gene, where from Washington University in St. Louis. All mice were bred and housed in specific-pathogen-free conditions and co-housed inside the same environment for at least 3 weeks before experiments. Animal protocols used in this study were approved by the Institutional Animal Care and Use Committee at Washington University in St. Louis. Mice used for scRNA-seq (3 young and 3 aged males per group) and scATAC-seq (3 young and 3 aged males per group) were euthanized at 3-4 or 17-18 months of age after 8 h fasting. Mice used for flow cytometry and multiplex cytokine assay validations were the same as used for scRNA-seq and scATAC-seq or from additional cohorts (3 to 8 males and females per group) 3-6 or 17-24 months of age. Investigators were not blinded to experimental groups; however, samples and data were processed in a high-throughput manner. No animals were excluded from analysis.
Male mice used to generate data presented in Figures 1A–1D, 2A, 2B, 2H, 2I, 2K–2N, 2P–2R, 3A–3I, 6A–6E, S1A–S1O, S2I, S2J, S2M–S2R, S2X, S2Y, S3A–S3G, S3O, S3P, S3S, S3T, S4D, S4M, S4O, and S5R–S5T were from Jackson Laboratory and Washington University in St. Louis.
Female mice used to generate data presented in Figures 2J, 2O, 3J, 3K, 4C–4G, 5A–5K, 6F, 6G, S2A–S2H, S2K, S2L, S2S–S2W, S3H–S3R, S4C, S4E, S4J–S4L, S4N, S4P, S4Q, and S5A–S5J were from National Institute of Aging (USA).
Female B6.SJL-Ptprca Pepcb /BoyJ (CD45.1) mice used to generate data presented in Figures 5A–5J were from Jackson Laboratory.
Human samples
All human studies were approved by the Washington University in St. Louis School of Medicine Institutional Review Board (IRB-201804084). Written informed consent was obtained from all participants. Healthy, Caucasian, non-obese (BMI under 30) males and females were enrolled in the study in 2018-2019. Young male donors (25-29 years, BMI 21.2-28.9 kg/m2 , n = 14) and old non-frail male donors (BMI 17.8-29.8 kg/m2 , 62-70 years, n = 14) were included. Young female donors (30-34 years, BMI 19.5- 25.1 kg/m2 , n = 3) and old non-frail female donors (68-80 years, BMI 18.5-27.4 kg/m2 , n = 7) were included. Using a screening questionnaire, subjects were asked about lifestyle and health issues. Subjects with any previous history of cancer, inflammatory conditions (rheumatoid arthritis, Crohn’s disease, colitis, dermatitis, fibromyalgia or lupus) or blood borne infections (HIV, hepatitis B, C) were excluded. Current smokers were also excluded. Blood ( 100 ml) was collected by venous puncture in the morning (7–10 a.m.) after an overnight fast. IgG serotype for CMV and EBV was determined in plasma from these donors using ELISA kits (Abcam, ab108639 and ab108730). One sample (A15) was excluded from all analyses as an outlier based on an extremely low number of cells and an isolated clustering in CyTOF analysis. One sample (D18) was excluded from all analyses as an outlier based on an isolated clustering in scRNA-seq analysis. Subjects who reported cold or flu symptoms in the prior month were excluded.
Cell lines and primary cell culture
Mouse splenic CD8+ T cells were isolated from using EasySep Mouse CD8+ T Cells Isolation Kit (Stemcell, cat# 19853). 2 3 105 cells were plated in 0.5 mL of RPMI + 10% FCS in the presence of 2 mg/mL of plate-immobilized anti-CD3 antibody (Invitrogen, clone 17A2) and 1 mg/mL of solubilized anti-CD28 antibody (Invitrogen, clone 37.51) and incubated for 48 h. Supernatant media were collected and used for ELISA.
Mouse 3T3 fibroblast cell line (ATCC CRL-1658) was grown in DMEM + 5% FCS until 75% confluence was established.
METHOD DETAILS
Isolation of mouse immune cells
Mice were anesthetized using isoflurane inhalation and blood was collected through retro-orbital sinus bleeding. Upon tissue harvest, mice were deeply anesthetized, and right ventricular perfusion was performed with 30 mL of ice-cold PBS. Peritoneal cavity was washed with 5 mL of PBS + 1% BSA and peritoneal fluid cells collected. The spleen was dissected and splenocytes isolated by using mechanical dissociation protocol. The lungs, the liver, kidney, and epididymal white adipose tissue were dissected, washed in PBS, minced to small pieces with scissors and digested during 30 min at 37 C in DMEM + 0.5% BSA + 0.1% collagenase D. Cell suspension was filtered through a 70 mm strainer and cell debris was removed by centrifugation at 400 g for 20 min on 30% Percoll gradient.
The brain and dura mater were harvested and mechanically dissociated with a dounce homogenizer. Cell suspension was filtered through a 70 mm strainer. For brain samples only, the myelin fraction was removed by centrifugation at 1000 g for 20 min on 30% Percoll gradient.
For scRNA-seq and scATAC-seq assays, the cell suspension was filtered through a 70 mm filter to remove debris and stained with TruStain FcX (anti-mouse CD16/CD32) antibody (Biolegend, clone 93), CD45-APC/Cy7 antibody (Biolegend, clone 30-F11) and Live/ Dead Fixable Aqua Dead Cell Stain (ThermoFisher, cat# L34957). Live CD45+ cells were sorted into cold PBS + 1% BSA using BD Aria II cell sorter, and pelleted at 400 g for 10 min at 4 C. The cell pellet was resuspended in PBS + 0.04% BSA at 1000 cells per mL. For flow cytometry, filtered cell suspension was stained with antibodies as described in the Flow cytometry part of the Methods.
Transfer of mouse CD8+ T cells
CD45.1+ or CD45.2+ CD8 T cells were isolated from the spleen of young (3 months) or aged (20-24 months) C57BL/6J female mice using EasySep Mouse CD8+ T Cell Isolation Kit and cell purity was analyzed by flow cytometry (> 90%). In some experiments, we depleted CD8+ T cells from aged mice using depleting anti-CD8a antibody (BioXcell, clone 2.43, cat# BE0061) injected IP 0.4 mg per mouse followed by injections with 0.25 mg per mouse of anti-CD8a antibody every 4 days for 3 weeks. 2 3 106 CD45.1+ CD8 T cells from young mice were i.v. injected into 3 months-old or 20-24 months-old CD45.2+ female mice. In another experiment, 2 3 106 CD45.2+ CD8 T cells from 20 months-old female mice were i.v. injected into 3 months-old CD45.1+ female mice. One month post cell transfer, CD45.1+ and CD45.2+ CD8+ T cells were analyzed by flow cytometry in the blood and various organs.
In vitro activation of mouse CD8+ T cells
CD44– PD-1– , CD44+ PD-1– , and PD-1+ CD8+ T cells were sorted from the spleen of young (3-5 months) or aged (18-24 months) C57BL/6J female mice using BD Aria II cell sorter. 0.5-2 3 105 of cells from each population were incubated for 48 h in RPMI + 10% FCS in the presence of plate-bound anti-CD3 antibody (2 mg/mL), soluble anti-CD28 antibody (1 mg/mL), and mouse recombinant IL-2 (50 ng/mL). Cell supernatants were collected, cells removed by centrifugation, and levels of cytokines and cytotoxic molecules determined using ELISA kits.
Treatment of 3T3 cells with granzyme K
To induce cellular senescence, 3T3 cells were grown at 70% confluence and treated with 0.1 mM doxorubicin for 24 h, media was refreshed, and cells incubated 24 h, then treated with 0.1 mM doxorubicin for another 24 h followed by incubation in fresh medium for 7 days. The cycling 3T3 cells were treated with 50 ng/mL mouse recombinant IFNg (PeproThech, cat# 315-05) and/or 100 ng/mL of mouse recombinant granzyme K (Mybiosource, cat# MBS2029386) in DMEM without FCS for 6 h. Senescent 3T3 were treated with 100 ng/mL of mouse recombinant granzyme K in DMEM with 5% FCS for 24 h. Supernatant media were collected and analyzed by ELISA.
Isolation of human PBMCs
Venous blood was collected in Sodium-Heparin vacutainers. Plasma and blood cells were separated using Histopaque-1077 according to the manufacturer’s protocol (Sigma). Briefly, whole blood was diluted 1:1 with sterile DPBS–2mM EDTA and 30 mL diluted blood was overlaid on to 10 mL of Histopaque-1077 and centrifuged at 500 g for 30 min at room temperature. Peripheral blood mononuclear cells (PBMCs) were isolated from the plasma-Histopaque interface and washed twice in DPBS–2mM EDTA. Cells were cryopreserved in CryoStor CS10 (Biolife Solutions Inc) and stored at –80 C. Prior scRNA-seq and flow cytometry assays, cryopreserved PBMCs were thawed by gradual addition of an equal volume of DPBS + 0.5% BSA at +37 C starting with 0.5 mL and, ones the volume was equal 20 ml, the samples were centrifuged at 500 g for 10 min at room temperature. Samples with cell viability was more than 85% were used for the analyses. Before scRNA-seq and CITE-seq, 5 3 105 PBMCs were incubated with a cocktail of TotalSeq oligo-conjugated antibodies (Biolegend, see Table S7) [0.25 mg of each antibody per sample in the presence of human TruStain FcX (Biolegend, cat# 422302) at 4℃ for 30 min] followed by 3X wash in DPBS + 0.5% BSA, which enables 10x Feature Barcoding and, thus, measurement of proteins at a single cell level and integrate these data into the single-cell RNA sequencing workflow.
Flow cytometry
0.5 – 2 3 106 cells were stained with Zombie UV Fixable Viability Kit (Biolegend, cat# 423107), mouse TruStain FcX (Biolegend, cat# 101319) or human TruStain FcX (Biolegend, cat# 422302), followed by incubation with antibodies during 30 min at 4 C. For cytokine staining, cells were pre-incubated at 37 C and 5% CO2 in RPMI + 10% FCS + 10 ng/mL PMA + 100 ng/mL ionomycin in the presence of GolgiPlug (BD) during 3 h. For intracellular staining, Fixation/Permeabilization Solution Kit (BD, cat# 554714) or True-Nuclear Transcription Factor Buffer Set (Biolegend, cat# 424401) were used. Information about antibodies used for flow cytometry is shown in Table S7. Flow cytometry analyses were performed on a BD X20 and a BD Symphony A3. Results were acquired with the FACSDiva V 7.0 software (BD) and analyzed using FlowJo V 10.6 software (Tree Star).
CyTOF
Some metal conjugated antibodies were available from Fluidigm. Antibodies that were not available from Fluidigm were conjugated by labeling purified antibodies (Table S7) with the Maxpar Antibody Labeling kit (Fluidigm) according to the manufacturer’s protocol. Cryopreserved PBMCs were thawed in PBS with 0.5% BSA and washed in CyFACS buffer (PBS, 0.1% BSA, 0.02% NaN3, 2 mM EDTA). Cell were incubated at room temperature with Fc receptor blocking solution (Biolegend Human Trustain FcX) for 10 min. The surface staining antibody cocktail was then added for 1 h on ice. After surface staining, the cells were washed in PBS twice and stained for viability with 2.5 mM cisplatin. The cells were then resuspended in CyFACS buffer and fixed and permeabilized with the eBioscience Foxp3/Transcription factor staining buffer set (Thermofisher, cat# 00-5523-00). The intracellular antibody cocktail was added to the cells and incubated for 1 h on ice. Cells were then washed in PBS and resuspended in PBS containing 2% freshly diluted paraformaldehyde (Electron Microscopy Sciences) until acquisition at CHiiPs Immunomonitoring Lab (IML), Washington University School of Medicine in St. Louis. Samples were analyzed on a CyTOF 1 mass cytometer. Data were exported and individual samples were manually gated using Cytobank (https://www.cytobank.org) to exclude normalization beads, cell debris, dead cells, and doublets. Next, live CD3+ CD8+ cells were gated and exported for further downstream analyses. Dimensionality reduction analysis was performed in Cytobank with the viSNE tool(Amir et al., 2013). A batch effect between samples stained during two different days was identified, so we applied batch correction using the 95 percentile method (Schuyler et al., 2019). Data normalization was performed using arcshin method with CATALYST package, clustering analysis was performed with Phenograph method using K = 110 from Rphenograph R package. T-SNE plots and cluster visualization was performed using the R package ggplot2.
Bioplex and ELISA
Human and mouse plasma or conditioned media collected from mouse CD8+ T cells or 3T3 mouse fibroblast cell line were diluted 2-10 times with PBS + 1% BSA and analyzed with mouse CCL5 (R&D, cat# DY478), mouse IL-6 (R&D, cat# DY406), mouse CCL2 (R&D, cat#DY479DuoSet), mouse CXCL1 R&D, cat#DY453), mouse Granzyme B (R&D, cat# DY1865), and mouse Granzyme K (Biomatik, cat# EKU04569) ELISA kits or V-PLEX Proinflammatory Panel 1 Human Kit (Meso Scale Discovery, cat# K15004D) and V-PLEX Mouse Cytokine 19-Plex Kit (Meso Scale Discovery, cat# K15255D) using Meso Scale Discovery Quickplex-120 (Meso Scale Discovery).
Histology and immunostaining
The spleen, the lungs, the liver and kidney were dissected from isoflurane anesthetized mice after cardiac perfusion with cold PBS. The tissues were fixed in 4% PFA buffered with PBS for 24 h at 4 C, transferred into 30% sucrose in PBS for 24 h at 4 C, embedded in optimal cutting temperature compound (OCT compound) (Tissue-Tek) and kept at –80 C. Cryosections of the tissues (30 mM) were blocked in PBS + 5% BSA, incubated with antibodies and DAPI in PBS + 1% BSA and mounted on microscopic slides in ProLong Diamond antifade mountant (Thermofisher). Microscopic images were acquired with Zeiss LSM 880 inverted confocal microscope and analyzed with ZEN 3.2 software (Zeiss).
RNA isolation and bulk RNA sequencing
mRNA was isolated directly from cell pellets with RNeasy Micro Kit (QIAGEN). cDNA was synthesized using custom oligo-dT primer with a barcode and adaptor-linker sequence (CCTACACGACGCTCTTCCGATCT-XXXXXXXX-T15). Samples were pooled based on Actb qPCR values, RNA-DNA hybrids degraded with acid-alkali treatment and a second sequencing linker (AGATCGGAAGAGCA CACGTCTG) was ligated with T4 ligase (NEB). After SPRI-beads clean-up (Agencourt AMPure XP, BeckmanCoulter), the mixture was PCR-enriched (12 cycles) and SPRI-purified yielding final strand-specific RNA-seq libraries. Libraries were sequenced at the Centre for Applied Genomics (SickKids, Toronto) using a HiSeq 2500 (Illumina) using 50 3 25 bp pair-end sequencing.
Bulk RNA-seq analysis
Fastq files were demultiplexed with fastq-multx tool. Fastq files for each sample were aligned to mm10 mouse genome (Release M8 Gencode; GRCm38.p4) using STAR. Each sample was evaluated according to a variety of both pre- and post-alignment quality control measures with Picard tools. Aligned reads were quantified using a quant3p script (https://github.com/ctlab/quant3p) to account for specifics of 30 sequencing: higher dependency on good 30 annotation and lower level of sequence specificity close to 30 end. Actual read counts were calculated by HTSeq with enriched genome annotation and alignment with fixed multimapper flags. The counts are quantile normalized and log transformed, converted to expression matrix and used for downstream analysis. Visualization of gene expression levels and gene set enrichment analysis were performed using Phantasus (https://artyomovlab.wustl.edu/ phantasus) (Zenkova et al., 2020).
Bulk TCR sequencing and analysis
Total RNA was isolated from the spleen, peritoneal fluid cells, the lungs and the liver from the same young (n = 3) and old (n = 3) mice that were used for scRNA-seq and TCR-seq using RNeasy Mini Kit (QIAGEN, cat# 74104). 190 ng of total RNA was used for TCR libraries generation using SMARTer Mouse TCR a/b Profiling Kit (Takara, cat#634404). Libraries were sequenced on a MiSeq v3 (600 cycles) flow cell, targeting 3000000 read pairs per library. Fastq files were aligned with MIXCR tool (Bolotin et al., 2017) using default parameters for RNA-seq data from mouse. TCRa and TCRb sequences obtained from each tissue were compared with paired TCRa/TCRb repertoire generated by scTCR-seq in CD4 and CD8 T cells from the same tissues. Based on TCRa and TCRb sequences uniquely overlapping between bulk TCR-seq and scTCR-seq, we mapped individual CD8 T cells with available paired TCRa/TCRb sequences from the 4 organs to individual mice to compare similarity of expanded CD8 TCR clonotypes between mice.
Single-cell RNA-seq
Isolated single cell suspensions were subjected to droplet-based massively parallel single-cell RNA sequencing using Chromium Single Cell 50 Reagent Kit as per manufacturer’s instructions (10x Genomics). Briefly, cell suspensions were loaded at 1,000 cells/uL with the aim to capture 10,000 cells per well. The 10x Chromium Controller generated GEM droplets, where each cell was labeled with a specific barcode, and each transcript labeled with a unique molecular identifier (UMI) during reverse transcription. The barcoded cDNA was isolated via a Dynabeads MyOne SILANE bead cleanup mixture. The cDNA was amplified by PCR and purified via SPRI bead cleanup.
For gene expression libraries, 50 ng of amplified cDNA was used for library preparation, consisting of fragmentation, end repair, Atailing, adaptor ligation and sample index PCR as per the manufacturer’s instructions. Libraries were sequenced on a NovaSeq S4 (300 cycles) flow cell, targeting 45,000 read pairs per cell.
For mouse and human TCR and BCR repertoire libraries, 2 uL of amplified cDNA underwent two rounds of Target Enrichment using nested primer pairs specific for mouse and human TCR and BCR constant regions. 50 ng of the target enrichment PCR product was used for library preparation, consisting of fragmentation, end repair, A-tailing, adaptor ligation and sample index PCR as per the manufacturer’s instructions. Enriched libraries were sequenced on a NovaSeq S4 (300 cycles) flow cell, targeting 5,000 read pairs per cell.
Single-cell RNA-seq analysis
Sample demultiplexing, barcode processing, and single-cell 50 counting was performed using the Cell Ranger Single-Cell Software Suite (10x Genomics). Cell Ranger cell count was used to align samples to the reference genome (mm10 for mouse genome and GRCh38 for human genome), quantify reads, and filter reads with a quality score below 30.
Processing data with Seurat Package
The Seurat R package was used for subsequent analysis (Butler et al., 2018). Cells with mitochondrial content greater than 5% were removed. Clusters of cells expressing gene markers of platelets (PF4 and PPBP) with median levels > 2 were removed from human scRNA-seq PBMC dataset. Cells identified as doublets or mutliplets based on gene expression signatures, when more than one cell population-specific marker gene was highly expressed in one cell, were filtered out. Filtered data were normalized using a scaling factor of 10,000, and nUMI was regressed with a negative binomial model. Final filtered dataset included: in mouse scRNA-seq 11838 cells from the spleen, 8218 cells from peritoneal fluid, 9233 cells from the lungs and 4609 cells from the liver; 99761 cells in human PBMC scRNA-seq.
Clustering
The highly variable genes were selected using the FindVariableFeatures function with mean greater than 0.0125 or less than 3 and dispersion greater than 0.5. These genes were used in performing the linear dimensionality reduction. Principal component analysis was performed using the top 3000 most variable genes prior to clustering and number of the first principal components (PCs) was selected based on the ElbowPlot as described below for different datasets. Clustering was performed using the FindClusters function which works on K-nearest neighbor (KNN) graph model with the granularity (resolution) ranging from 0.1-1.5 and selected resolutions for downstream clustering are described below. The datasets were projected as t-SNE or UMAP plots. Resulting unbiased clustering of the cells was not affected by batch effects derived from ambient RNA (Young and Behjati, 2020) or depth and quality of sequencing but rather reflected transcriptional variances between age and tissue localization of the immune populations in mice (Figures S1B and S1C). We further examined expression levels of characteristic marker genes to sub-divide the dataset to isolate immune cell clusters with similar ontogeny (B cells, myeloid cells, CD4 and CD8 T cells) for downstream analysis (Figure S1D and S15C).
Clustering of CD45+ cells from the 4 mouse organs was performed using the following parameters: 30 PCs and resolution 0.9 for the spleen, 20 PCs and resolution 0.5 for the peritoneal cells, 30 PCs and resolution 0.9 for the lungs, 30 PCs and resolution 0.4 for the liver.
Clustering of immune cell populations from the 4 organs was performed using the following parameters: 15 PCs and resolution 0.3 for B cells, 20 PCs and resolution 1.3 for CD4 T cells, 10 PCs and resolution 0.7 for CD8 T cells, 15 PCs and resolution 0.3 for macrophages and DCs.
Clustering of human PBMCs was performed using the following parameters: 40 PCs and resolution 0.3.
Clustering of human CD3, CD4 and CD8 T cells was performed using the following parameters: 30 PCs and resolution 0.3.
Clustering of human non-naive CD8 T cells was performed using the following parameters: 10 PCs and resolution 0.3.
Gene marker and differential expression analysis
To identify gene that expression levels were different between clusters (gene markers), we performed differential expression analysis between each cluster to all other clusters using Wilcoxon rank sum test from Seurat R package. The following parameters were used: minimal percentage of cells per cluster expressing the gene equals 10% and gene expression fold change threshold equals 10%.
Differential expression analysis of all expressed genes between clusters in young compared to old mice and men was performed using the MAST algorithm (Finak et al., 2015) of the Seurat R package. Log2 fold changes of average expression and the percentage of cells (pct) expressing the genes in each condition (ptc1 for young, ptc2 for old) were generated. The bulk fold changes were calculated by adding log2(pct1 + 0.005) / (pct2 + 0.005) to the log2(fold change) generated from the Seurat R package. The adjusted P values were calculated using Bonferroni correction.
Removal of ambient RNA contamination from scRNA-seq data
To remove of ambient RNA contamination from scRNA-seq data and determine a potential batch effect resulting from it, we used SoupX (Young and Behjati, 2020). The raw feature matrices and filtered feature matrices were used as an input for SoupX. The contamination rate was calculated using autoEstCont() and the counts were corrected by adjustCounts() command. The resultant matrix was used for downstream analysis.
Monocle2 pseudotime transcriptional trajectory analysis
Human PBMC scRNA-seq clusters annotated as CD8+ Tcm, Tem-k and Tem-b were subset into a small dataset using the subset function in Seurat (Butler et al., 2018). The RNA counts of all cells from the subset was used as an input for Monocle2 (Qiu et al., 2017) for further downstream analysis. Gene expression levels less than 0.1 and expressing in less than 10 cells were removed from the analysis. Remaining cells were clustered in an unsupervised manner and differential expression was performed using the function differentialGeneTest. Genes with q-value < 0.01 were selected for the analysis. The DDRTree method was used to find the cell transition. To find the genes driving the transition we performed a statistical test Beam was used.
Processing and analysis of public scRNA-seq datasets
Tabula Muris Senis dataset (GSE109774) containing scRNA-seq of multiple tissues from mice of different ages (Almanzar et al., 2020) was used to: (1) select clusters of immune cells that express high levels of Ptprc gene from all samples and (2) form these clusters of immune cells, select clusters with high expression levels of CD3E and CD3D genes (CD3 T cell clusters) using Seurat. These selected cells were re-clustered as described above using PCs 20 and resolution 0.3 as described above to generate a dataset with T cells from multiple organs from mice of different ages.
GSE130879 scRNA-seq dataset with t-SNE coordinates, cluster information and gene expression from mouse CD4 Treg cells was used as described (Delacher et al., 2020).
GSE132901 scRNA-seq dataset that contains cells from the spleen, the lungs and kidney from young and old mice (Kimmel et al., 2019) was downloaded from and subsequent analysis was performed using R package Seurat as described above. PCA was performed using the top 3000 most variable genes and t-SNE analysis was performed with the top 20 PCAs. Clustering was performed using a resolution of 0.3. Cell clusters were annotated based on their transcriptional signatures (for more details see https:// artyomovlab.wustl.edu/immune-aging/explore.html). To identify rare cell populations including GZMK-expressing CD8+ T cells, we subclustered cells based on the tissue for further analysis.
Kidney
PCA was performed using the top 3000 most variable genes and t-SNE analysis was performed with the top 20 PCAs. Clustering was performed using a resolution of 0.8. Clusters with high expression levels of Cd3d (T cells) were re-clustered separately (PCs 10, cluster resolution 0.3) and resulting clusters projected back to the original clustering.
Lung
PCA was performed using the top 3000 most variable genes and t-SNE analysis was performed with the top 20 PCAs. Clustering was performed using a resolution of 0.3. Clusters with high expression levels of Cd3d (T cells) were re-clustered separately (PCs 20, cluster resolution 0.3) and resulting clusters projected back to the original clustering.
Spleen
PCA was performed using the top 3000 most variable genes and t-SNE analysis was performed with the top 20 PCAs. Clustering was performed using a resolution of 0.8.
Clusters from the resulting datasets were annotated using analysis of expression patterns of genes that are specifically expressed in different immune and non-immune cell types in the spleen, the lungs and kidney.
These public datasets were used to analyze and validate accumulation of age-associated populations identified in our scRNA-seq data by analyzing the expression of characteristic genes in cell clusters in young and aged mice. GSE132901 scRNA-seq dataset was used to predict cell-cell ligand receptor interaction between different cell types (see below).
Cell-cell interaction network analysis
To predict cell-cell ligand-receptor interactions from scRNA-seq data in young and aged mice CellPhoneDB v.2.0 was used (Efremova et al., 2020). Dataset GSE132901 containing scRNA-seq of the spleen, the lungs and kidney from young and old mice (Kimmel et al., 2019) was re-annotated to identify a separate cluster of GZMK-expressing CD8 T cells and information about RNA counts and cluster attribution for all cells was used as an input for CellPhoneDB. Mouse genes that are part of ligand-receptor database were converted to human geneID using BioMart orthologs (Smedley et al., 2015). To identify the pairs of ligand-receptor interactions, we performed CellPhoneDB analysis using the default parameters. For more details see https://artyomovlab.wustl.edu/immune aging/explore.html.
Single-cell BCR repertoire analysis
Sample demultiplexing and barcode processing was performed using the Cell Ranger Single-Cell Software Suite (10x Genomics). Cell Ranger VDJ v3.0.2 was used to align reads to the reference genome (vdj_GRCm38_alts_ensembl for mouse and vdj_GRCh38_alts_ensembl for human) and assemble BCRs. For downstream analysis, only BCRs with 1 productive rearrangement for heavy chain and 1 productive rearrangement for light chain (either kappa or lambda) were used. Frequencies of clonotypes were counted based on number of cells that passed general expression QC and share clonotype nucleotide sequence.
To analyze BCR mutation score for B cells from different tissues and clusters, docker immcantation/suite:4.0.0 was used. 10x Cell Ranger output data were processed with changeo-10x script supplied in the Docker container. Clonally related IgH sequences were identified using DefineClones.py (Gupta et al., 2015) with a nearest neighbor distance threshold of 0.1, determined by visual inspection of the output of distToNearest R function (Shazam) (Gupta et al., 2015). CreateGermlines.py was used to infer germline sequences for each clonal family; observedMutations R function (Shazam) was used to calculate somatic hypermutation frequencies for each IgH contig. Finally, the number of IgH chains were determined per unique cell barcode and the subsequent metadata table was integrated with the scRNA-seq gene meta table for downstream analysis.
Paired scTCRa/TCRb repertoire analysis
Sample demultiplexing and barcode processing was performed using the Cell Ranger Single-Cell Software Suite (10x Genomics). Cell Ranger VDJ v3.0.2 was used to align reads to the reference genome (vdj_GRCm38_alts_ensembl for mouse and vdj_GRCh38_alts_ensembl for human) and assemble TCRs. For downstream analysis, only TCRs with 1 productive rearrangement for TCRa chain and 1 productive rearrangement for TCRb chain were selected. Frequencies of clonotypes were calculated based on number of cells that pass quality control as described above and share both TCRa and TCRb nucleotide sequences. Gini coefficients, characterizing diversity and clonality of the TCR repertoire, were calculated for mouse CD4 and CD8 T cells from each organ individually in young and aged mice. For human CD8 T cells, Gini coefficients were calculated for T cells in each cluster separately for individual subjects. Gini coefficient for TCR from T cells from different organs and clusters was computed with ‘‘gini’’ function from tcR R package.
To cross-reference the TCR data with public data, the last version of VDJdb with paired TCR only was downloaded (July 2020). For each TCR from our dataset, we calculated distance with each VDJdb TCR for TCRa and TCRb (adist R function). TCR pairs with edit distance of 2 and less for each chain were considered a ‘‘hit’’ if they belong to the same T cell type (CD4/CD8).
Single-cell ATAC-seq
0.6 – 1.5 3 106 sorted cells were pelleted at 300 g for 5 min at 4 C. Cell pellets were resuspended in 110 uL of Nuclei Lysis Buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 0.1% Nonidet P40 Substitute, 0.01% Digitonin, 1% BSA) for 3 min on ice, followed by addition of 1 mL Wash Buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 1% BSA). Nuclei were pelleted at 600 g for 8 min at 4 C, resuspended at a concentration of 3,000 nuclei per mL in Wash Buffer, and filtered through a 40 mm Flow-Mi filter (Bel-Art) to remove any nuclei clumps. Approximately 15,000 nuclei were carried forward into the ATAC-seq transposition reaction as per the manufacturer’s instructions (10x Genomics).
Transposed single nuclei suspensions were subjected to droplet-based massively parallel single-cell ATAC sequencing using Chromium Single Cell ATAC Reagent Kit as per manufacturer’s instructions (10x Genomics). Briefly, transposed nuclei were loaded with the aim to capture 10,000 nuclei per well. The 10x Chromium Controller generated GEM droplets, where the transposed DNA for each nucleus was labeled with a specific barcode via 12 cycles of PCR. The barcoded DNA was isolated via a Dynabeads MyOne SILANE bead cleanup mixture and further purified via SPRI bead cleanup. The resulting DNA was barcoded for sequencing through 11 cycles of sample index PCR. Libraries were sequenced on a NovaSeq S4 (200 cycles) flow cell, targeting 25,000 read pairs per nucleus.
Single-cell ATAC-seq analysis
scATAC-seq datasets were obtained by using 10X Genomics techniques and Cell Ranger ATAC v 1.1.0 was used to process the data. First, count procedure was applied to process individual samples. Next, data were aggregated with using aggregate Cell Ranger ATAC procedure with the library depth normalization parameters.
The following count and aggregate commands were used:
cellranger-atac count–id = < id > –fastqs = < folder > –sample < sample > –reference refdata-cellranger-atac-mm10-1.1.02
cellranger-atac aggr–id = < id > –csv merged.csv–normalize = depth–reference = refdata-cellranger-atac-mm10-1.1.02
For the in-depth analysis we developed a specialized scATAC-seq analysis pipeline. Overall, the pipeline consisted of the following steps: (1) Quality control and filtration; (2) Cell calling; (3) Normalization and feature selection; (4) Dimensionality reduction; (5) t-SNE and UMAP 2D projections; (6) Clustering; (7) Differential markers; (8) Gene annotation; (9) Clusters visualization for the JBR Genome Browser and the Single Cell Explorer. The source code of the pipeline including step-by-step instructions is available at https://github.com/JetBrains-Research/sc-atacseq-explorer. We provide reproducible Jupyter notebook with predefined environment.
Quality control and filtration
Cell Ranger ATAC output file fragments.tsv.gz contains high-quality fragments that passed all quality control and filtration procedures. This file was used as a main reference for processing of UMI reads. Fragments were filtered against stop-list regions (Kundaje Lab, http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm10-mouse/mm10.blacklist.bed.gz) to avoid possible PCR or alignment artifacts, and fragment overlapping with this list were excluded from downstream analysis. Raw read profiles were visualized by deeptools v 3.1.3. Both bulk reads and individual cell populations can be explored online in the JBR Genome Browser viewer (https://artyomovlab.wustl.edu/jbr/).
Cell calling
Cell calling was performed in a similar way as implemented in the Cell Ranger ATAC. The threshold of 200 fragments per UMI and less was used to exclude noise. The thresholds of 8000 and more was considered as ‘‘barcode multiplets’’ and also used to filter out UMI from downstream analysis. The procedure of removing the noise and multiplets significantly improved both analysis robustness and overall computational cost.
Although some scATAC-seq analysis approaches do not require predefined peaks regions (e.g., SnapATAC), a correct bulk signal peak calling is a crucial part of most of the existing methods. Cell Ranger processing includes bulk signal peak calling based on a proprietary algorithm without any additional parameters to tune for the actual dataset. We noticed that some meaningful biological peaks were missing in the resulting peaks by Cell Ranger. To overcome this limitation and improve the quality of data processing, we reprocessed the bulk signal peak calling with a novel algorithm SPAN v 0.12.0.5096 (Shchukina et al., 2020) before downstream analysis. SPAN was used in a semi-supervised mode with the following parameters:
–fdr 1e-10 gap = 2
Normalization and feature selection
A standard Cell Ranger ATAC normalization was applied to median count per UMI, followed by peak length normalization to make DNA accessible sites of different widths having the same impact on the analysis. Sites with an extremely high coverage (> 99%) were ignored as they likely represent sequencing errors or contain housekeeping genes. Filtering out sites with a low variation (< 1%) improved the robustness of the biological signal.
Dimensionality reduction
A standard Cell Ranger ATAC method was applied for dimensionality reduction: Inverse Document Frequency normalization, IRLBA Singular Vector Decomposition followed by L2 normalization. The only used hyperparameters were the size of low dimensional space (D). D = 15 was used as a default value (recommended by Cell Ranger ATAC).Normalization, feature selection and dimensionality reduction all together provide robust clustering and representation.
t-SNE and UMAP projections
t-SNE and UMAP methods were used for low dimensional data visualization in 2d.
Clustering
Hierarchical clustering was performed on L2 normalized dataset after singular vector decomposition. We used Ward’s algorithm with Euclidian distance. The numbers of clusters were chosen empirically to be sufficient for a good data separation and generation of number of distinct clusters feasible for a biological interpretation.
Iterative analysis. Human PBMC scATAC-seq dataset is an aggregated dataset of six different donors. To analyze it, we applied an iterative analysis, described briefly in this section. Human PBMC dataset is a diverse combination of different blood cell populations with their own epigenetic structure, cell type specific open chromatin sites, etc. A high complexity of cell population structure of PBMCs leads to highly different influence of open chromatin footprints (single cell ATAC-seq reads) for different populations, which makes it difficult to detect rare populations specific peaks during a bulk peak calling procedure. To overcome this limitation, we proposed an iterative analysis procedure, where we chose an interesting group of cells based on hierarchical clustering dendrogram - NK and T cells and performed similar fine-grained analysis on them. We used filtered fragments file as a reference input and performed a dedicated bulk peak calling applied to filtered reads only. This allowed to detect rare subpopulationspecific open chromatin peaks crucial for subsequent analysis. Clustering results from this iteration were applied to the global analysis, yielding precise clustering result for cell types of interest, while keeping general structure of the whole dataset. This procedure can be generalized to make it feasible to analyze rich libraries combined of different cell types for many donors and conditions.
Information about the clusters is available in a raw and processed formats for visual exploration of different immune cell populations at https://artyomovlab.wustl.edu/immune-aging.
Differential markers
A procedure from Cell Ranger ATAC to compute differential markers for each cluster was applied for each peak individually. The test of mean value in a particular cluster versus others was performed with Negative Binomial GLM. Multiple hypothesis testing was performed with Benjamini-Hochberg procedure. Individual markers for each cluster are available at http://artyomovlab.wustl.edu/ publications/supp_materials/4Oleg/2019_sc_ATAC_seq_DT1634_Denis/markers/. Each file marker_N.bed is a BED6 file that contains the information about chromosome, offsets, marker number, and –log10 of adjusted P value.
Gene annotation
Peaks were annotated with corresponding genes using bedtools command: bedtools closest -D b
This associated arbitrary genome location to the closest gene reporting distance to transcription start site (TSS) upstream or down stream of the peak with respect to DNA strand. Genes markup v M22 was downloaded from the Gencode portal (ftp://ftp.ebi.ac.uk/ pub/databases/gencode/Gencode_mouse/release_M22/gencode.vM22.annotation.gtf.gz). Using the information of annotated peaks and clusters, tables containing mean peak values per cluster was created.
Cluster visualization for the JBR Genome Browser and the Single Cell Explorer
Using the pipeline and deeptools, dedicated bigwig files were created. The resulting dataset was uploaded for visualization at the JBR Genome Browser and the Single Cell Explorer. The Single Cell Explorer requires the following files: data.h5 – information about mean UMI per peak per cluster in binary HDF5 format, dataset.json – dataset description and metainformation, exp_data.json – all UMI and peak identifiers with total fragments per UMI, markers.json – results of differential cluster analysis, and plot_data.json –t-SNE and UMAP coordinates for visualization.
Analysis of public ATAC-seq datasets
ATAC-set data from CD8+ Tn, Tm, Te and Tex mouse cells (GSE86797) (Pauken et al., 2016) were uploaded for visualization into the JBR genome browser alongside with scATAC-seq clusters containing mouse splenic CD8+ T cells (clusters 3, 6 and 11). To this end, raw reads in fastq format were downloaded from NCBI SRA Run Selector (https://www.ncbi.nlm.nih.gov/Traces/study/) and processed with general purpose ChIP-seq and ATAC-seq snakemake-based pipeline available at https://github.com/JetBrainsResearch/chipseq-smk-pipeline. Raw read files were aligned to mm10 genome with bowtie2, reads coverage was visualized with deeptools v 3.1.3. We obtained peaks with MACS2 version v2.2.71 and default ATAC-seq settings: -q 0.05 -f BAMPE–nomodel–nolambda -B–call-summits. Both bigwig files and peaks in narrowPeak format are available to explore at https://artyomovlab.wustl.edu/ immune-aging. Peaks with accessible chromatin that were describes as specifically associated with mouse CD8+ Tex cells (Bengsch et al., 2018) were visually examined in samples from both datasets.
De novo TF motif analysis
TF binding motifs in candidate regulatory regions in each scATAC-seq cluster were identified and analyzed using the HOMER v4.10 (Heinz et al., 2010). 500 the most significant markers in each cluster were selected for downstream HOMER analysis using the following parameters:
findMotifsGenome.pl markers_N_500.bed mm10 motifs_N -size 200 -mask
As a result, enriched motifs in target regions were identified and compared to the HOMER custom database of known motifs based on published ChIP-Seq datasets. For each scATAC-seq cluster, the top identified TF motifs and associated transcription factors were analyzed.
QUANTIFICATION AND STATISTICAL ANALYSIS
Statistical analyses of biological data
No statistical method was used to predetermine the sample size. For comparison of groups, non-paired or paired two-tailed t test or Mann-Whitney U test were used. Parametric Pearson R correlation coefficients were calculated. In case of multiple comparisons, P values were adjusted using the Benjamini-Hochberg or Bonferroni corrections. In some cases, P values were transformed into Q values were calculated using Storey FDR procedure. Values of p < 0.05 were considered significant. Statistical analyses were performed with Prism V 8.3.0 (GraphPad Software) or R statistical language V 3.6.1. p < 0.05 was considered to be statistically significant.
This article is excerpted from the Immunity 54, 99–115.e1–e12, January 12, 2021 e12 by Wound World.