Cancer Research • 2024-04-07

An RNA-based model for tertiary lymphoid structure (TLS) prediction and classification in pancreatic adenocarcinoma (PDAC)


Alexandra Livanova, Andrey Tyshevich, Andrey Kravets, Stanislav Kurpe, Nadezhda Lukashevich, Dmitry Ivchenkov, Daniil Dymov, Anna Belozerova, Kirill Kryukov, Aleksandr Sarachakov, Viktor Svekolkin, Vladimir Kushnarev
  1. BostonGene, Corp., Waltham, MA, US


Background: Although TLS status has prognostic significance in PDAC and may affect chemotherapy outcomes, RNA sequencing (RNA-seq)-based models specialized for TLS identification and classification in PDAC are currently lacking. Here, we developed a model that predicts TLS status (high or low) from RNA-seq data.

Design: Hematoxylin and eosin (H&E) whole slide images of PDAC samples from The Cancer Genome Atlas (TCGA, n = 118) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC, n = 129) were used to detect intratumoral and borderline TLSs, followed by TLS density measurement (units/mm²) by an experienced pathologist. The samples were then stratified into TLS-high and TLS-low groups based on the median density value. Next, we used the Kassandra deconvolution algorithm to identify cell subtypes enriched in each TLS group based on gene expression (RNA-seq) data. We calculated ssGSEA scores for gene signatures corresponding to cell subtypes and TLSs and performed survival analysis. We then performed differential expression analysis between TLS-high and TLS-low samples, followed by functional enrichment (|logFC| > 2; padj < 0.01). A LightGBM gradient boosting classifier was trained on ranked expression data with sequential feature selection to predict the TLS-high and TLS-low groups. The model was trained on H&E-based annotations and ranked RNA expression data from TCGA and CPTAC samples (total n = 167). The remaining 80 samples were held out for validation. The weighted F1 score was used as the performance metric.
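The three simplest steps in the design (median-based stratification, rank transformation of expression values, and the weighted F1 metric) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy sample values, the handling of ties at the median, and the rank direction are all assumptions made for illustration only.

```python
from statistics import median


def stratify_by_median(densities):
    """Label each sample TLS-high or TLS-low relative to the cohort median density."""
    cutoff = median(densities.values())
    # Assumption: samples exactly at the median are assigned to TLS-low.
    return {s: ("high" if d > cutoff else "low") for s, d in densities.items()}


def rank_transform(expression):
    """Replace one sample's expression values with within-sample ranks (1 = lowest).

    Assumption: ties are broken by input order; the abstract does not say
    how ranking was performed.
    """
    ordered = sorted(expression, key=expression.get)
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    total = 0.0
    for label in set(y_true):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        total += f1 * y_true.count(label)
    return total / len(y_true)


# Hypothetical toy inputs, not data from the study.
toy_densities = {"S1": 0.0, "S2": 0.008, "S3": 0.02, "S4": 0.035}  # units/mm²
toy_expression = {"CD19": 5.2, "CD79A": 1.1, "CR2": 3.3}

print(stratify_by_median(toy_densities))
print(rank_transform(toy_expression))
print(weighted_f1(["high", "high", "low", "low"], ["high", "low", "low", "low"]))
```

Weighting each class's F1 by its support keeps the metric meaningful if the hold-out set is not perfectly balanced between TLS-high and TLS-low samples.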

Findings: The median density of detected TLSs in the TCGA and CPTAC samples was 0.012 units/mm². Kassandra deconvolution revealed B-cell enrichment and fibroblast and M2 macrophage depletion in the TLS-high group. ssGSEA scores of previously described TLS gene signatures, along with those of different B-cell subtypes and follicular dendritic cells, were significantly associated with the TLS-high group. Genes associated with B-cell proliferation, differentiation, and signaling (CD19, CD22, CD79A, CD79B, and CR2) were also upregulated in this group. Comparing the predictions of our RNA-based model on the validation dataset with manual TLS classification by a pathologist, we obtained a weighted F1 score of 0.72 and a ROC-AUC of 0.77. Thus, the model's TLS predictions were consistent with the TLS classification based on H&E annotations and pathological evaluation. Moreover, patients in the predicted TLS-low group had worse overall survival (OS) than those in the predicted TLS-high group (log(HR) = 0.76; 95% CI [0.04; 1.48]; p < 0.05).
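Because the survival result is reported on the log scale, it may help to convert it to a hazard ratio. The sketch below assumes the reported log(HR) is a natural logarithm and that it compares the TLS-low group against the TLS-high group; neither is stated explicitly in the abstract, and the function name is hypothetical.

```python
import math


def log_hr_to_hr(log_hr, ci_low, ci_high):
    """Exponentiate a log hazard ratio and its confidence bounds.

    Assumption: values are on the natural-log scale.
    """
    return math.exp(log_hr), math.exp(ci_low), math.exp(ci_high)


hr, lower, upper = log_hr_to_hr(0.76, 0.04, 1.48)
print(f"HR = {hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
# prints: HR = 2.14 (95% CI 1.04-4.39)
```

Under these assumptions, the interval excludes 1, which agrees with the reported p < 0.05.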
Conclusions: We present an RNA-based model that stratifies PDAC samples as TLS-high or TLS-low, with predictions consistent with pathological findings. Predicted TLS-low status was associated with worse OS, suggesting that the model offers an objective means of estimating the prognosis of PDAC patients based on TLS status.