Selecting appropriate cell lines to represent a specific disease is crucial to the success of a biological study, because less relevant cell lines can yield misleading results. However, no systematic guidance is yet available for cell line selection in either biomedical research or drug screening. Here, we developed a clinical genomics-guided prioritizing system (cGPS) that classifies and ranks cell lines according to the similarity of their gene expression profiles to those of tumors. cGPS bridges the gap between tumors and cell lines and provides a practical guide for selecting the most suitable cell line models for cancer studies.
By integrating tumor samples from TCGA and cancer cell lines from CCLE, we incorporated a panel of 720 cell lines and 7,308 tumor samples across 44 tumor subtypes into cGPS. These cover bladder cancer, breast cancer, bile duct cancer, colorectal cancer, esophageal cancer, glioma, head and neck squamous cell carcinoma, kidney cancer, liver cancer, lung cancer, mesothelioma, ovarian cancer, pancreatic cancer, prostate cancer, melanoma, stomach cancer, thyroid cancer, and endometrial cancer.
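To make the idea of ranking cell lines by expression similarity concrete, the following minimal sketch scores each cell line against the mean expression profile of a tumor subtype and sorts the results. It is an illustration only: the similarity measure (Spearman rank correlation), the use of a subtype centroid, the function names, and the toy data are all assumptions and are not taken from the cGPS method itself.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


def rank_cell_lines(tumor_samples, cell_lines):
    """Sort cell lines by similarity to the tumor centroid, best first.

    Assumption: similarity is Spearman correlation with the per-gene mean
    of the tumor samples; the actual cGPS metric may differ.
    """
    n_genes = len(tumor_samples[0])
    centroid = [mean(s[g] for s in tumor_samples) for g in range(n_genes)]
    scores = {name: spearman(profile, centroid)
              for name, profile in cell_lines.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 2 tumor samples and 2 cell lines over 4 genes.
tumor_samples = [
    [8.0, 2.0, 5.0, 1.0],
    [9.0, 3.0, 6.0, 2.0],
]
cell_lines = {
    "line_A": [7.5, 2.5, 5.5, 1.0],
    "line_B": [1.0, 6.0, 2.0, 9.0],
}

ranking = rank_cell_lines(tumor_samples, cell_lines)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy example, `line_A` orders the four genes exactly as the tumor centroid does and therefore ranks first, whereas `line_B` orders them in reverse and ranks last. A rank-based measure is used here because it is insensitive to scale differences between datasets, but that choice is a design assumption of the sketch.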