10X Single-Cell (10X Spatial Transcriptomics): Comparing Cell-Cell Communication Tools

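Hello everyone. This time I want to look at whether the results from different cell-cell communication (CCC) tools agree with one another. There are a great many CCC tools, and I have already covered quite a few of them, but sharing is not the goal; actually using them is. Which tool should you use, which one is better, and what are the strengths and weaknesses of each? Let's take a look today.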

Comparison of Resources and Methods to infer Cell-Cell Communication from Single-cell RNA Data

Abstract

1. There are many tools for cell-cell communication analysis. Each of them consists of a resource of intercellular interactions prior knowledge and a method to predict potential cell-cell communication events (every tool has its own ligand-receptor database and its own algorithm). Yet the impact of the choice of resource and method on the resulting predictions is largely unknown.

2. Comparing the tools: We found few unique interactions and a varying degree of overlap among the resources (differences between the ligand-receptor databases), and observed uneven coverage in terms of pathways and biological categories.

3. When tested on the same dataset: We found major differences among the highest ranked intercellular interactions inferred by each method even when using the same resources (the methods themselves also differ a lot).

4. The varying predictions lead to fundamentally different biological interpretations, highlighting the need to benchmark resources and methods (different tools give different results, so which one should we use?).

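Main conclusions

1. Different methods and resources provided notably different results (no surprise; anyone who has run enough projects noticed this long ago).

2. The observed disagreement among the methods could have a considerable impact on the interpretation of results (different results naturally mean different biological interpretations, so which one should we use?).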

Introduction

1. Why CCC matters: CCC commonly refers to interactions between secreted ligands and plasma membrane receptors. This picture can be broadened to include secreted enzymes, extracellular matrix proteins, transporters, and interactions that require the physical contact between cells, such as cell-cell adhesion proteins and gap junctions. CCC events are essential for homeostasis, development, and disease, and their estimation is becoming a routine approach in scRNA-seq data analysis (studying cell-cell communication really is important).

2.1. How the tools predict CCC: These CCC tools typically use gene expression information obtained by scRNA-Seq. In general, single cells are clustered by their gene expression profile and cell type identities are assigned to the clusters based on known gene markers (so the single-cell data are first clustered and annotated).

2.2. CCC tools can predict intercellular crosstalk between any pair of clusters, one cluster being the source and the other the target of a CCC event.

3. Every tool comes with a ligand-receptor database: The information about which transmitter binds to which receiver is extracted from diverse sources of prior knowledge (the ligand-receptor databases are accumulated prior knowledge).

4. Roughly, CCC tools then estimate the likelihood of crosstalk based on the expression level of the transmitter and the receiver in the source and target clusters, respectively (essentially all tools work this way).

5. Every tool has two main components: a resource of prior knowledge on CCC (interactions), and a method to estimate CCC from the known interactions and the dataset at hand.

6. Although each tool has its own ligand-receptor resource and its own method, in principle any resource could be combined with any method.

7. Methodological differences between the tools (six tools): In turn, these different approaches result in diverse scoring systems that are difficult to compare and evaluate (there are many methods, so which one should you choose? A good standard is missing).

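[Figure]
For CellChat, see my posts "10X Single-Cell (10X Spatial Transcriptomics) Communication Analysis with CellChat" and "10X Single-Cell (10X Spatial Transcriptomics) Communication Analysis with CellChat: Differential Communication Across Multiple Samples". For Squidpy, see "Distance Analysis of Cell Types in Spatial Transcriptomics, Part 2: Code Implementation" and section 3 of "10X Spatial Transcriptomics Communication Analysis". For Connectome, see "10X Single-Cell, the Cell Communication Chapter: Connectome". For iTALK, see "Cell Communication: How to Use iTALK". For NATMI, see "NATMI, a Cell Communication Analysis Tool for Single-Cell Data".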

8. Differences between the tools' ligand-receptor resources: The available prior knowledge resources are typically distinct but often show partial overlap. Some of these resources also provide additional details for the interactions such as information about protein complexes, subcellular localisation, and classification into signalling pathways and categories. CCC resources are often manually curated and/or built from other resources, with varying proportions of expert curation and literature support. Some databases gather and harmonize the information contained in the individual resources (the databases come from all sorts of sources, and the effect of the database is one of today's main topics).

[Figure]

Table note: We defined unique and shared interactions, receivers and transmitters between the CCC resources if they could be found in only one or at least two of the resources, respectively.
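To make the unique/shared definition in the table note concrete, here is a minimal Python sketch. The resource names and ligand-receptor pairs are toy values I made up for illustration, not entries taken from the paper, and the helper names are my own.

```python
from collections import Counter


def split_unique_shared(resources):
    """Split items into 'unique' (found in exactly one resource)
    and 'shared' (found in at least two resources)."""
    counts = Counter(item for items in resources.values() for item in set(items))
    unique = {item for item, n in counts.items() if n == 1}
    shared = {item for item, n in counts.items() if n >= 2}
    return unique, shared


def percent_unique(resources):
    """Percentage of each resource's items that no other resource contains."""
    unique, _ = split_unique_shared(resources)
    return {
        name: 100.0 * len(set(items) & unique) / len(set(items))
        for name, items in resources.items()
        if items
    }


# Toy resources (hypothetical content, for illustration only).
toy_resources = {
    "ResourceA": {("TGFB1", "TGFBR1"), ("CXCL12", "CXCR4"), ("WNT5A", "FZD5")},
    "ResourceB": {("TGFB1", "TGFBR1"), ("CXCL12", "CXCR4")},
    "ResourceC": {("TGFB1", "TGFBR1"), ("IL6", "IL6R")},
}

unique, shared = split_unique_shared(toy_resources)
pct = percent_unique(toy_resources)
```

The same function works for receivers or transmitters alone if you pass sets of gene names instead of pairs.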

9.1. Benchmarking the tools: First, we explored the degree of overlap among resources and whether certain resources are biased toward specific biological terms, such as pathways and functional cancer states.

9.2. We analysed how different combinations of resources and methods influence CCC inference, by decoupling the methods from their corresponding resources (databases and methods are taken apart and recombined; the strategy is shown in the figure below).

[Figure]
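The decoupling idea boils down to a cross product of methods and resources. Below is a rough Python sketch of that loop. The six method names are the ones discussed in this post and the three resources are a subset of those named in it; `run_one` is a hypothetical placeholder of mine, not a real function from any of these tools.

```python
from itertools import product

methods = ["CellChat", "Connectome", "iTALK", "NATMI", "SingleCellSignalR", "Squidpy"]
resources = ["CellChatDB", "CellPhoneDB", "OmniPath"]  # a subset, for brevity


def run_one(method, resource):
    """Hypothetical placeholder: a real implementation would call the given
    method with the given resource and return its ranked interactions."""
    return f"{method} scored with {resource}"


def run_all(methods, resources, runner):
    """Run every method with every resource, keyed by (method, resource)."""
    return {(m, r): runner(m, r) for m, r in product(methods, resources)}


results = run_all(methods, resources, run_one)
```

Keeping the results keyed by `(method, resource)` makes it easy to later compare combinations that share a method or share a resource.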

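Now for the results. Honestly, I was surprised: I knew that the tools give different results, but I did not expect the differences to be this large.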

Result 1. Resource Uniqueness and Overlap (comparing the databases: no two are the same, and the degree of similarity between them varies widely).

First, where the databases come from: Many of these resources share the same original data sources, including general biological databases such as KEGG, Reactome, and STRING, along with many other databases.
[Figure]
Now the differences in ligand-receptor pairs: As a consequence of their common origins, we noted limited uniqueness across the resources, with mean percentages of 4.6% unique receivers, 5.3% unique transmitters, and 16.8% unique interactions, for all resources (uniqueness is low overall; each resource does have its own one-of-a-kind pairs, but their share differs considerably between resources, as shown below).
[Figure]
Despite the sparse uniqueness among the resources, the pairwise overlap between them varied; some resources are highly similar to each other.
[Figure]
[Figure]

[Figure]

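For the Jaccard index, see the Baidu Baike entry on the Jaccard coefficient. Put simply, it is the size of the intersection of two sets divided by the size of their union.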
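As a quick illustration of the definition, here is a small Python sketch. The pair names are toy values, and returning 0.0 for two empty sets is a convention I am assuming here.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumed convention when both sets are empty
    return len(a & b) / len(union)


# Toy ligand-receptor pair sets (illustrative only).
pairs_a = {"TGFB1-TGFBR1", "CXCL12-CXCR4", "WNT5A-FZD5"}
pairs_b = {"TGFB1-TGFBR1", "CXCL12-CXCR4", "IL6-IL6R"}

score = jaccard_index(pairs_a, pairs_b)  # 2 shared pairs out of 4 distinct pairs
```

A value of 1 means the two sets are identical and 0 means they share nothing.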

[Figure]
[Figure]
[Figure]

Figure caption: Upset plots representing the shared Interactions, Receivers, and Transmitters between all resources (A-C) and all resources except OmniPath (D-F).

Each ligand-receptor resource contained on average more than 65% of the interactions present in the other resources.

[Figure]

[Figure]

Figure caption: A) Interactions, B) Receivers and C) Transmitters present in each resource when taken from the rest of the resources. Note these plots are asymmetric and represent the % of interactions from the resources on the X axis found in each resource on the Y axis.
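The asymmetry mentioned in the caption is easy to see in code. This is a minimal Python sketch under my own assumptions: the function names and the two toy resources are hypothetical, and only the "percentage of X found in Y" idea comes from the caption.

```python
def percent_contained(x_items, y_items):
    """Percentage of the items in resource X that are also found in resource Y."""
    x_items, y_items = set(x_items), set(y_items)
    if not x_items:
        return 0.0
    return 100.0 * len(x_items & y_items) / len(x_items)


def containment_matrix(resources):
    """matrix[y][x] = % of resource x's items that are found in resource y."""
    return {
        y: {x: percent_contained(resources[x], resources[y])
            for x in resources if x != y}
        for y in resources
    }


# Toy resources: a small one that is fully contained in a larger one.
toy = {
    "small": {"A-B", "C-D"},
    "large": {"A-B", "C-D", "E-F", "G-H"},
}
matrix = containment_matrix(toy)
```

Here everything in `small` is found in `large` (100%), while only half of `large` is found in `small` (50%), which is why the plot is not symmetric, unlike the Jaccard index.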

Summary of the resource differences: In summary, our results indicate that many of the transmitters, receivers, and interactions are not unique to any single resource, due to their common origins. However, different resources include varying proportions of the collective CCC prior knowledge (in short, they all differ; only the proportions vary).

Result 2. Resource Prior Knowledge Bias

First, Subcellular Localisation: On average 90% of transmitters and 79% of receivers were annotated as secreted and transmembrane proteins, respectively (so secreted ligands and transmembrane receptors dominate). The authors further used the localisations of transmitters and receivers to categorize the interactions as secreted or direct-contact signaling.

[Figure]

[Figure]

Figure caption: Numbers and Percentages of Subcellular location annotations of Receivers (A-B) and Transmitters (C-D) for each CCC resource. S, P and T stand for Secreted, Peripheral plasma membrane, and Transmembrane plasma membrane proteins, respectively.

The authors observed that all resources were predominantly (74% on average) composed of interactions associated with secreted signalling, while direct-contact signalling constituted a substantially smaller (16% on average) proportion of interactions (secreted signalling dominates).

[Figure]

[Figure]

Figure caption: Interactions categorized as neither secreted nor direct-contact were labeled as 'Other' and made up the remainder of the interactions.
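To show how such a three-way split could be computed, here is a rough Python sketch. Be careful: the classification rule below is my own simplifying assumption (secreted transmitter plus membrane receptor means secreted signalling; membrane-bound on both sides means direct contact), not the exact rule used in the paper. Only the S/P/T codes and the 'Other' bucket come from the captions above.

```python
from collections import Counter

MEMBRANE = {"P", "T"}  # peripheral or transmembrane plasma membrane


def categorize_interaction(transmitter_locs, receiver_locs):
    """Assumed rule, for illustration only (not the paper's exact definition)."""
    t, r = set(transmitter_locs), set(receiver_locs)
    if "S" in t and r & MEMBRANE:
        return "secreted"
    if t & MEMBRANE and r & MEMBRANE:
        return "direct-contact"
    return "other"


def category_percentages(interactions):
    """Percentage of interactions per category for one resource."""
    counts = Counter(categorize_interaction(t, r) for t, r in interactions)
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}


# Toy annotations: (transmitter locations, receiver locations).
toy_interactions = [
    ({"S"}, {"T"}),
    ({"S"}, {"T"}),
    ({"T"}, {"P"}),
    (set(), {"T"}),  # transmitter without a location annotation
]
shares = category_percentages(toy_interactions)
```

With a real resource you would run this once per database and compare the resulting shares side by side.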

The shares of secreted and direct-contact signalling differ from database to database: CellChatDB showed an overrepresentation of interactions matched to the category Other.

[Figure]

Conclusion on the localisation of ligands and receptors: Our results suggest that localisations of transmitters and receivers were largely uniformly distributed and that secreted signalling was predominant across all resources. Yet, differences were noted between the relative abundance of secreted and direct-contact signalling interactions (the proportions of secreted and direct-contact pairs differ in every database).

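Functional Term Enrichment (differences in the pathways covered by the ligand-receptor pairs): the pathways covered, and how many of them, differ between databases.

[Figure]

[Figure]

Interactions associated with innate immune pathways and T-cell receptor categories were under-represented in Guide to Pharmacology, Baccin2019, EMBRACE, Kirouac2010, ICELLNET, CellPhoneDB, and HPMR (immune-related pathways differ quite a lot, and some databases lack them entirely, although this also depends on the annotation database used).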

[Figure]

[Figure]

[Figure]

Figure caption: Number of matches to A) Interactions, B) Receivers and C) Transmitters, Enrichment Scores for their Receivers and Transmitters (D-E), and the Percentages of Interactions, Receivers and Transmitters (F-H) matched to the NetPath database per resource.

These observations for the WNT pathway were further supported by the relative abundance of HGNC (the choice of annotation database also introduces large differences).

[Figure]

[Figure]

Figure caption: Number of matches to A) Merged Sets of Receivers and Transmitters, B) Receivers and C) Transmitters, their corresponding Enrichment Scores (D-F), and Percentages (G-I) per resource matched to the HGNC database.

Functional cancer cell states from CancerSEA were also unevenly represented in sets of receivers and transmitters across the resources (the differences are so large that it is hard to know what to make of them).

[Figure]
[Figure]

Figure caption: Number of matches to A) Merged Sets of Receivers and Transmitters, B) Receivers and C) Transmitters and their corresponding (D-F) Enrichment Scores, and Percentages (G-I) per resource matched to the CancerSEA database.

A well-annotated dataset is then used to judge how consistent the tools' results are. Here the focus is on the interactions between tumour cells subclassified by their resemblance of CRC consensus molecular subtypes (CMS) and immune cells from tumour samples, reasoning that this subset of cell types represents a complex example where CCC events are known to have an important role.

First result: Interaction overlap

We then used each method-resource combination to infer CCC interactions, assuming that different methods should generally agree on the most relevant CCC events for the same resource and expression data (quite an assumption 😂). To measure the agreement between method-resource combinations, we looked at the overlap between the 500 highest ranked interactions as predicted by each method. Whenever available, author recommendations were used to filter out the false-positive interactions.
Conclusion 1. Our analysis showed considerable differences in the interactions predicted by each of the methods regardless of the resource used, as the mean Jaccard index per resource ranged from 0.01 to 0.06 (mean = 0.024) when using different methods (really low). These large discrepancies in the results were further supported by the pairwise comparisons between methods using the same resource, with mean Jaccard indices ranging from 0.063 (CellChat-SingleCellSignalR) to 0.110 (Connectome-NATMI) (also very low). The overlap among the top predicted interactions was slightly higher when using the same method but with different resources, as Jaccard indices ranged from 0.113 to 0.203 per method (mean = 0.167) (with the same method and different databases the agreement improves a little, but in absolute terms it is still low; it makes me wonder whether my earlier analyses were right at all 😄).
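The top-500 comparison can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the method names, scores, and helper functions are toy values and hypothetical names, and I use the top 2 instead of the top 500 so the example stays readable.

```python
from itertools import combinations


def top_n(scores, n=500, higher_is_better=True):
    """Return the n highest ranked interactions from {interaction: score}."""
    ranked = sorted(scores, key=scores.get, reverse=higher_is_better)
    return set(ranked[:n])


def jaccard(a, b):
    """Intersection over union of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_top_overlap(results, n=500):
    """Jaccard index between the top-n sets of every pair of runs."""
    tops = {name: top_n(scores, n) for name, scores in results.items()}
    return {(a, b): jaccard(tops[a], tops[b])
            for a, b in combinations(sorted(tops), 2)}


# Toy scores from two hypothetical runs on the same resource. The score scales
# differ between methods, which is fine because only the ranking is used.
toy_results = {
    "method_a": {"L1-R1": 0.9, "L2-R2": 0.8, "L3-R3": 0.1},
    "method_b": {"L1-R1": 5.0, "L3-R3": 4.0, "L2-R2": 1.0},
}
overlap = pairwise_top_overlap(toy_results, n=2)
```

In this toy case the two top-2 sets share one interaction out of three distinct ones, giving a Jaccard index of 1/3. For methods where a lower score means a better rank (for example a p-value), pass `higher_is_better=False` to `top_n`.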

[Figure]

Figure caption: Jaccard indices for the 500 highest ranked interactions obtained from each method-resource combination.

Conclusion 2. Consequently, the highest ranked interactions for each method-resource combination largely showed stronger clustering by method than resource, suggesting that the overlap between these combinations occurs predominantly when using the same method regardless of the resource (the method has the larger influence on the results).

[Figure]

Figure caption: Overlap in the 500 highest ranked CCC interactions between different combinations of methods and resources. Method-resource combinations were clustered according to binary (Jaccard index) distances. SCA refers to the SingleCellSignalR method.

On ligand-receptor complexes: This analysis showed that the proportion of complexes among the highest ranked hits was 2-23% for CellChat and 10-38% for Squidpy, largely reflecting the relative complex content in each resource (again a wide spread).

Summary 1 of conclusion 2: Our results suggest that the overlap between methods when using the same resource was low.

[Figure]

[Figure]

[Figure]

Figure caption: Upset plot showing the overlap between the 500 highest ranked interactions using the same method with all resources.

Summary 2 of conclusion 2: The overlap when using the same method with different resources, albeit higher than that between different methods, was also modest (even with the same method, different databases give fairly different results). Hence, our results indicate that both the method and the resource had a considerable impact on the predicted interactions.

[Figure]
[Figure]
[Figure]
[Figure]
[Figure]
[Figure]
[Figure]
[Figure]
[Figure]

Figure caption: Upset plot showing overlap of most relevant interactions for each method with the same resource.

Conclusion 3. Next, we asked whether the discrepancies observed between the methods stem from the differences in the cell types inferred as most active in terms of CCC interactions. To this end, we used the 500 highest ranked interactions to examine the cell type activities, defined as the proportion of interactions per cell type, separately as a source and a target of CCC events. The conclusion is much the same as above: the choice of method matters a great deal, as each method largely clustered by itself, regardless of the resource used, including the reshuffled resource. As a consequence, the disagreement between the methods in which cell types are the most active is expected to have a major impact on the biological interpretation of CCC communication predictions.
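The cell type activity defined above is just a proportion, so it is easy to sketch in Python. The tuple layout `(source, target, ligand, receptor)`, the function name, and the toy interactions are my own assumptions for illustration.

```python
from collections import Counter


def cell_type_activity(top_interactions):
    """Proportion of interactions per cell type, separately as source and target.

    top_interactions: iterable of (source, target, ligand, receptor) tuples,
    e.g. the highest ranked interactions of one method-resource combination.
    """
    rows = list(top_interactions)
    total = len(rows)
    if total == 0:
        return {}, {}
    as_source = Counter(row[0] for row in rows)
    as_target = Counter(row[1] for row in rows)
    return (
        {cell: n / total for cell, n in as_source.items()},
        {cell: n / total for cell, n in as_target.items()},
    )


# Toy top-ranked interactions (hypothetical, for illustration only).
toy_top = [
    ("Tumour", "T cell", "TGFB1", "TGFBR1"),
    ("Tumour", "Macrophage", "CSF1", "CSF1R"),
    ("Macrophage", "T cell", "CXCL9", "CXCR3"),
    ("Tumour", "T cell", "CD274", "PDCD1"),
]
source_activity, target_activity = cell_type_activity(toy_top)
```

Comparing these two dictionaries across methods shows directly whether the methods agree on which cell types send and receive most of the top-ranked signals.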

[Figure]

[Figure]

Figure caption: PCA of normalized average interaction rank frequencies per cell pair.

Current limitations of CCC inference

1. CCC events are mainly predicted based on the average gene expression at the cluster or cell type/state level. Such an assumption inherently suggests that gene expression is informative of the activity of transmitters and receivers. However, gene expression provided by scRNA-Seq is typically limited to protein coding genes and the cells within the dataset, and hence does not capture secreted signalling events driven by non-protein molecules or long-distance endocrine signalling events (single-cell data alone cannot fix this limitation).

2. CCC inference from scRNA-Seq data assumes that the product of the gene expression of a transmitter and a receiver is a good proxy for their joint activity, and thus does not consider any of the processes preceding transmitter-receiver interactions, including protein translation and processing, secretion, and diffusion (even harder to fix).
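To see how little goes into such a score, here is a bare-bones Python sketch of the expression-product idea described in the two points above. It is a generic illustration, not the scoring scheme of any particular tool; the cluster names, expression values, and function name are toy assumptions.

```python
def product_scores(mean_expr, pairs):
    """Score each (source, target, ligand, receptor) as the product of the
    ligand's mean expression in the source cluster and the receptor's mean
    expression in the target cluster.

    mean_expr: {cluster: {gene: mean expression}}
    pairs: iterable of (ligand, receptor) from a prior-knowledge resource
    """
    scores = {}
    for source in mean_expr:
        for target in mean_expr:
            for ligand, receptor in pairs:
                ligand_expr = mean_expr[source].get(ligand, 0.0)
                receptor_expr = mean_expr[target].get(receptor, 0.0)
                scores[(source, target, ligand, receptor)] = ligand_expr * receptor_expr
    return scores


# Toy cluster-level mean expression (made-up numbers).
toy_expr = {
    "Tumour": {"TGFB1": 2.0, "TGFBR1": 0.5},
    "T cell": {"TGFB1": 0.1, "TGFBR1": 3.0},
}
toy_pairs = [("TGFB1", "TGFBR1")]

scores = product_scores(toy_expr, toy_pairs)
best = max(scores, key=scores.get)
```

Nothing in this calculation knows about translation, secretion, or diffusion, and a molecule that is not a measured transcript never enters it at all, which is exactly the limitation described above.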

Conclusion

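The methods are not yet mature, and there is still work to do.

So which method is actually best for cell-cell communication analysis? I would like to hear what readers think.

Life is good, and even better with you~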
