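GO enrichment bubble plots in R: GO enrichment analysis in R

Visualization in R with ggplot2: KEGG pathway enrichment analysis

I previously shared how to visualize GO analysis results with ggplot2. Having done GO, KEGG naturally comes next.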

As before, we obtain the KEGG pathway results from DAVID.

For KEGG I prefer a bubble plot, so that the two kinds of plot used together give a richer, better-looking result.

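[R] Fixing overlapping labels in GO enrichment plots

I have previously covered these topics in detail:

☞ Introduction to GO and how to read GO enrichment results

☞ Reading four kinds of GO enrichment bar charts and bubble plots

☞ Four styles for presenting GO enrichment results: bar charts and bubble plots

☞ KEGG enrichment analysis: bar charts, bubble plots, and pathway maps

☞ DAVID GO and KEGG enrichment analysis and visualization of the results

I have also covered them in a video:

☞ Video walkthrough of GO and KEGG enrichment analysis

Recently some readers reported that when they draw GO enrichment bubble plots and bar charts with the clusterProfiler package, the GO term names all overlap.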

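Bubble plot

Bar chart

Never mind attractive; these plots are painful to look at. After looking into it carefully, I found that the problem is related to the R version. The plots I showed earlier were almost all made with R 3.6.3, while many readers are probably using the newer R 4.1.2.

R itself keeps being updated, and so do the R packages that go with it. I went through the functions that draw the bubble plot and the bar chart and finally found the cause.

The dotplot function has gained a label_format argument.

Let's see what this argument does. Its documentation says: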

label_format :

a numeric value sets wrap length, alternatively a custom function to format axis labels. by default wraps names longer that 30 characters

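So the default value of this argument is 30: any label longer than 30 characters is wrapped and shown on several lines. Now that we have found the problem, we adjust the argument and set it to 100 so that each label is shown on a single line.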
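The change described above can be sketched as follows. This is a minimal example rather than the original post's code: it assumes clusterProfiler, enrichplot, org.Hs.eg.db and DOSE are installed, that the installed enrichplot version has the label_format argument, and it uses the geneList example data from DOSE as a stand-in for your own genes. The names gene, ego and p_dot are placeholders.

```r
library(clusterProfiler)
library(enrichplot)
library(org.Hs.eg.db)

# Example input (assumption): geneList from DOSE, a named numeric vector
# of fold changes whose names are Entrez gene IDs.
data(geneList, package = "DOSE")
gene <- names(geneList)[abs(geneList) > 2]

# GO enrichment for the Biological Process ontology
ego <- enrichGO(gene, OrgDb = org.Hs.eg.db, ont = "BP", readable = TRUE)

# Raise the wrap width from the default 30 characters to 100
p_dot <- dotplot(ego, showCategory = 10, label_format = 100)
print(p_dot)
```

If some GO term names are longer than 100 characters, a larger value can be used in the same way.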

The plot now looks the way it used to.

In the same way, the bar chart can be restored to its original appearance.
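For the bar chart, the same argument can be passed to barplot. Again this is only a sketch under the same assumptions (the packages are installed, the enrichplot version supports label_format, and DOSE's geneList stands in for real data); ego and p_bar are placeholder names.

```r
library(clusterProfiler)
library(enrichplot)
library(org.Hs.eg.db)

# Example input (assumption): geneList from DOSE, names are Entrez gene IDs
data(geneList, package = "DOSE")
gene <- names(geneList)[abs(geneList) > 2]
ego <- enrichGO(gene, OrgDb = org.Hs.eg.db, ont = "BP", readable = TRUE)

# Same fix as for the bubble plot: wrap only labels longer than 100 characters
p_bar <- barplot(ego, showCategory = 10, label_format = 100)
print(p_bar)
```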

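For how to run GO and KEGG enrichment analysis in R, see:

Video walkthrough of GO and KEGG enrichment analysis

R data visualization 7: Bubble Plot

A bubble plot is a chart made up of bubble-like points. Unlike an ordinary scatter plot, it can show three or even four dimensions of information. In the example figure, the position of each point (its x and y coordinates) represents Weight and Height, the size of the bubble represents Age, and the colour distinguishes individuals.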
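The four-dimensional idea can be sketched with ggplot2. The Weight/Height/Age data from the figure are not available here, so this example uses the mtcars dataset that ships with R as a stand-in; the mapping of columns to aesthetics is my own choice for illustration.

```r
library(ggplot2)

# mtcars ships with R and stands in for the Weight/Height/Age example:
# position = wt and mpg, bubble size = hp, colour = number of cylinders.
p_bubble <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(size = hp, colour = factor(cyl)), alpha = 0.6) +
  scale_size(range = c(2, 10)) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon",
       size = "Horsepower", colour = "Cylinders") +
  theme_bw()

print(p_bubble)
```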

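A few more examples:

The figures above show GO or other enrichment results in different forms. In the top and bottom-right panels, colour represents the GO category, the two axes represent the p-value and the z-score, and size represents the count of enriched genes. In the bottom-left panel, colour represents the p-value, size represents GeneCount, the x axis represents GeneRatio, and the y axis lists the individual categories.

These examples show that a bubble plot lets us present more information from the data. With the rise of multi-omics studies, we urgently need to show multidimensional data in a single chart, and the bubble plot is a good choice.

1) What data format is needed

The data format depends on which dimensions you ultimately want to show on the bubble plot.

This post uses the EC dataset from the GOplot package, which is downstream analysis data from an RNA-seq experiment.

The data were normalized and then analysed statistically to identify differentially expressed genes. Gene annotation enrichment analysis of the differentially expressed genes (adjusted p-value < 0.05) was carried out with the DAVID functional annotation tool.

Two packages are used here: GOplot, which is designed for presenting downstream transcriptomics results, and ggplot2, our usual plotting package. Note that the plotting data for ggplot2 still need a small modification based on circ, the plotting data prepared for GOplot; see below for details.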
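The data preparation can be sketched as below, assuming GOplot is installed. It follows the usual GOplot workflow of combining the DAVID results with the gene list via circle_dat; I am assuming that this is how circ was built in the original post.

```r
library(GOplot)

# EC is the example list shipped with GOplot
data(EC)

# EC$david: DAVID enrichment results; EC$genelist: differential expression table.
# circle_dat() joins them into one long data frame, one row per gene-term pair.
circ <- circle_dat(EC$david, EC$genelist)

head(circ)
```

In the GOplot versions I am aware of, circ contains the columns category, ID, term, count, genes, logFC, adj_pval and zscore, which are the dimensions used by the plots that follow.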

2) How to draw the plot

The GOplot package provides a direct way to draw a bubble plot:
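A minimal sketch of that call, assuming GOplot is installed; the threshold labels = 3 is an illustrative choice, not necessarily the value used in the original post.

```r
library(GOplot)

data(EC)
circ <- circle_dat(EC$david, EC$genelist)

# labels = 3: only terms whose -log10(adjusted p-value) exceeds 3 are labelled
GOBubble(circ, labels = 3)
```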

With a few adjusted arguments, the layout, colours and other aspects of the plot can be changed:
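One possible set of adjustments is sketched below. The title and colours are arbitrary illustrative values; the arguments title, colour, display and labels are those of GOBubble.

```r
library(GOplot)

data(EC)
circ <- circle_dat(EC$david, EC$genelist)

# display = "multiple" draws one panel per GO category;
# colour takes one colour per category.
GOBubble(circ,
         title   = "Bubble plot",
         colour  = c("orange", "darkred", "gold"),
         display = "multiple",
         labels  = 3)
```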

Next, let's see how to draw the same plot with the general-purpose ggplot2 package.

First we process the data and drop the information we do not need:
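The original processing code is not shown, so the following is only one plausible version: it keeps the term-level columns of circ, collapses the gene-level rows to one row per term, and maps z-score, adjusted p-value, count and category to the plot. The names terms and p_gg are placeholders.

```r
library(GOplot)
library(ggplot2)

data(EC)
circ <- circle_dat(EC$david, EC$genelist)

# circ has one row per gene-term pair; keep only term-level columns
# and collapse to one row per term.
terms <- unique(circ[, c("category", "ID", "term", "count", "adj_pval", "zscore")])

p_gg <- ggplot(terms, aes(x = zscore, y = -log10(adj_pval))) +
  geom_point(aes(size = count, colour = category), alpha = 0.6) +
  scale_size(range = c(1, 12)) +
  labs(x = "z-score", y = "-log10(adjusted p-value)") +
  theme_bw()

print(p_gg)
```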

With a small change, we remove the legend and add facets.
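A sketch of that variation, under the same assumptions as before (GOplot installed, term-level data derived from circ in the way I guessed above): facet_grid splits the plot by GO category and legend.position = "none" hides the legends.

```r
library(GOplot)
library(ggplot2)

data(EC)
circ <- circle_dat(EC$david, EC$genelist)
terms <- unique(circ[, c("category", "ID", "term", "count", "adj_pval", "zscore")])

p_facet <- ggplot(terms, aes(x = zscore, y = -log10(adj_pval))) +
  geom_point(aes(size = count, colour = category), alpha = 0.6) +
  facet_grid(. ~ category) +               # one panel per GO category
  labs(x = "z-score", y = "-log10(adjusted p-value)") +
  theme_bw() +
  theme(legend.position = "none")          # panels already name the category

print(p_facet)
```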

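Earlier posts in the R data visualization series

R data visualization 6: Area Chart

R data visualization 5: Heatmap

R data visualization 4: PCA and PCoA plots

R data visualization 3: Histograms and bar charts

R data visualization 2: Boxplot

R data visualization 1: Volcano plot

Drawing a pathway enrichment bubble plot with ggplot2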

Term_Name GeneHitsInSelectedSet AllGenesInSelectedSet GeneHitsInBackground AllGenesInBackground p-value enrichFactor GeneListInSelectedSets Qvalue

00941 Flavonoid biosynthesis 14 492 41 3857 3.30E-04 2.676878842 "[FvH4_2g26480, FvH4_2g05780, FvH4_4g23870, FvH4_5g35170, FvH4_5g14010, FvH4_7g01160, FvH4_3g44420, FvH4_7g20870, FvH4_4g06180, FvH4_5g01170, FvH4_6g28410, FvH4_3g40570, FvH4_5g22390, FvH4_7g25890]" 0.04909626

00360 Phenylalanine metabolism 14 492 46 3857 0.001221701 2.38591375 "[FvH4_2g05780, FvH4_4g23870, FvH4_5g35170, FvH4_6g16060, FvH4_4g06180, FvH4_4g25490, FvH4_6g16460, FvH4_6g27650, FvH4_4g09340, FvH4_7g19130, FvH4_3g40570, FvH4_6g26610, FvH4_6g27940, FvH4_6g26600]" 0.091016736

00945 Stilbenoid, diarylheptanoid and gingerol biosynthesis 9 492 31 3857 0.012547314 2.275963808 "[FvH4_2g05780, FvH4_4g23870, FvH4_5g35170, FvH4_6g28410, FvH4_3g40570, FvH4_5g22390, FvH4_6g26800, FvH4_3g44420, FvH4_4g06180]" 0.467387431

00270 Cysteine and methionine metabolism 17 492 94 3857 0.083418875 1.417769417 "[FvH4_4g21340, FvH4_1g10540, FvH4_4g01140, FvH4_2g02530, FvH4_6g27650, FvH4_1g18690, FvH4_5g05120, FvH4_3g14020, FvH4_6g26610, FvH4_4g13980, FvH4_1g18490, FvH4_6g26600, FvH4_1g21920, FvH4_1g26460, FvH4_2g05040, FvH4_2g41260, FvH4_4g13280]" 0.654179598

04120 Ubiquitin mediated proteolysis 23 492 126 3857 0.04529262 1.431007227 "[FvH4_7g29370, FvH4_6g11010, FvH4_6g38720, FvH4_5g03910, FvH4_3g09200, FvH4_6g17370, FvH4_3g39370, FvH4_4g01260, FvH4_2g39250, FvH4_5g30320, FvH4_3g00910, FvH4_5g29350, FvH4_6g35920, FvH4_5g33030, FvH4_1g05910, FvH4_5g22570, FvH4_4g14790, FvH4_1g25030, FvH4_4g17530, FvH4_7g16630, FvH4_6g09540, FvH4_6g10930, FvH4_3g18500]" 0.674860033

00260 Glycine, serine and threonine metabolism 11 492 49 3857 0.0408107 1.759872242 "[FvH4_1g08890, FvH4_7g07540, FvH4_5g38450, FvH4_2g05310, FvH4_2g22570, FvH4_1g21920, FvH4_2g16830, FvH4_2g36660, FvH4_1g19090, FvH4_4g13290, FvH4_4g25490]" 0.675643816

00670 One carbon pool by folate 5 492 18 3857 0.069014744 2.177619693 "[FvH4_7g07540, FvH4_5g38450, FvH4_1g00040, FvH4_1g19090, FvH4_4g13290]" 0.685546458

03015 mRNA surveillance pathway 20 492 114 3857 0.082844862 1.375338753 "[FvH4_7g29390, FvH4_6g17300, FvH4_5g13570, FvH4_3g29340, FvH4_4g03530, FvH4_2g38640, FvH4_1g18700, FvH4_1g18000, FvH4_2g34040, FvH4_5g33710, FvH4_6g06810, FvH4_5g25490, FvH4_5g03260, FvH4_2g15670, FvH4_4g07000, FvH4_4g36800, FvH4_5g25550, FvH4_2g06580, FvH4_5g05510, FvH4_6g09230]" 0.685771358

00603 Glycosphingolipid biosynthesis - globo and isoglobo series 3 492 9 3857 0.096237762 2.613143631 "[FvH4_7g21240, FvH4_6g11740, FvH4_3g04760]" 0.71697133

00400 Phenylalanine, tyrosine and tryptophan biosynthesis 9 492 37 3857 0.038722924 1.906888596 "[FvH4_7g11530, FvH4_6g27650, FvH4_6g26610, FvH4_4g21980, FvH4_6g26600, FvH4_2g22570, FvH4_6g47770, FvH4_5g36810, FvH4_1g20450]" 0.721214462

00071 Fatty acid degradation 8 492 35 3857 0.068800169 1.791869919 "[FvH4_1g26810, FvH4_1g08890, FvH4_5g05130, FvH4_2g14760, FvH4_4g18500, FvH4_1g25230, FvH4_2g37760, FvH4_6g40560]" 0.732230372

04712 Circadian rhythm - plant 5 492 14 3857 0.024734738 2.799796748 "[FvH4_2g29440, FvH4_7g29370, FvH4_1g17250, FvH4_7g01160, FvH4_5g22570]" 0.737095202

03410 Base excision repair 8 492 34 3857 0.05939718 1.844571975 "[FvH4_4g29150, FvH4_4g36650, FvH4_2g21980, FvH4_6g11530, FvH4_2g39710, FvH4_4g35010, FvH4_2g40160, FvH4_4g35030]" 0.737514985

00130 Ubiquinone and other terpenoid-quinone biosynthesis 8 492 34 3857 0.05939718 1.844571975 "[FvH4_4g28800, FvH4_4g09340, FvH4_3g40570, FvH4_6g26610, FvH4_6g27940, FvH4_6g26600, FvH4_4g06180, FvH4_6g16460]" 0.737514985

00460 Cyanoamino acid metabolism 7 492 32 3857 0.103957465 1.714875508 "[FvH4_4g26180, FvH4_7g07540, FvH4_5g38450, FvH4_7g05220, FvH4_1g19090, FvH4_4g13290, FvH4_3g43510]" 0.737602967

00310 Lysine degradation 8 492 30 3857 0.030124137 2.090514905 "[FvH4_1g08890, FvH4_5g05130, FvH4_3g23070, FvH4_1g16260, FvH4_1g25230, FvH4_2g36660, FvH4_6g40560, FvH4_3g25420]" 0.748082742

00785 Lipoic acid metabolism 2 492 4 3857 0.081725815 3.919715447 "[FvH4_6g44960, FvH4_4g37350]" 0.761071655

00601 Glycosphingolipid biosynthesis - lacto and neolacto series 2 492 4 3857 0.081725815 3.919715447 "[FvH4_6g11740, FvH4_3g04760]" 0.761071655

00940 Phenylpropanoid biosynthesis 26 492 149 3857 0.056260767 1.367954384 "[FvH4_2g05780, FvH4_4g23870, FvH4_5g35170, FvH4_7g32980, FvH4_2g30540, FvH4_2g26620, FvH4_7g05220, FvH4_3g44420, FvH4_6g16060, FvH4_4g06180, FvH4_6g16460, FvH4_3g43510, FvH4_7g19130, FvH4_4g26180, FvH4_6g28410, FvH4_6g27940, FvH4_4g36130, FvH4_3g46010, FvH4_1g16790, FvH4_6g30610, FvH4_4g09340, FvH4_3g15230, FvH4_3g40570, FvH4_5g22390, FvH4_6g27610, FvH4_5g21320]" 0.762077663

00450 Selenocompound metabolism 4 492 15 3857 0.113762224 2.090514905 "[FvH4_2g38710, FvH4_7g04540, FvH4_6g24170, FvH4_2g41260]" 0.770480519

00563 Glycosylphosphatidylinositol(GPI)-anchor biosynthesis 3 492 13 3857 0.224775136 1.809099437 "[FvH4_5g04770, FvH4_2g15820, FvH4_1g19740]" 0.797416555

03008 Ribosome biogenesis in eukaryotes 6 492 33 3857 0.238324783 1.425351072 "[FvH4_1g27070, FvH4_1g17250, FvH4_1g16590, FvH4_2g38700, FvH4_3g27590, FvH4_1g22910]" 0.807054378

00860 Porphyrin and chlorophyll metabolism 8 492 47 3857 0.244407142 1.334371216 "[FvH4_3g20600, FvH4_5g33760, FvH4_7g25640, FvH4_2g27000, FvH4_3g20590, FvH4_1g04700, FvH4_2g23050, FvH4_4g37020]" 0.809259204

00053 Ascorbate and aldarate metabolism 8 492 47 3857 0.244407142 1.334371216 "[FvH4_1g08890, FvH4_5g05130, FvH4_3g33910, FvH4_6g20720, FvH4_7g08190, FvH4_7g13380, FvH4_1g25230, FvH4_5g20650]" 0.809259204

00944 Flavone and flavonol biosynthesis 2 492 5 3857 0.124932679 3.135772358 "[FvH4_6g17070, FvH4_5g14010]" 0.809346486

00040 Pentose and glucuronate interconversions 15 492 96 3857 0.236599083 1.224911077 "[FvH4_2g26010, FvH4_6g41430, FvH4_6g17310, FvH4_6g17430, FvH4_3g01680, FvH4_5g27090, FvH4_6g53340, FvH4_2g19540, FvH4_5g33570, FvH4_1g00260, FvH4_2g25970, FvH4_7g08190, FvH4_1g26360, FvH4_4g21500, FvH4_1g27720]" 0.819843336

03450 Non-homologous end-joining 2 492 7 3857 0.221439565 2.239837398 "[FvH4_4g35010, FvH4_4g35030]" 0.824862379

00942 Anthocyanin biosynthesis 2 492 7 3857 0.221439565 2.239837398 "[FvH4_3g19220, FvH4_7g33840]" 0.824862379

This is a reproduction of the following tutorials using my own data:

ggplot2 plotting series in R: Pathway enrichment bubble plot - 生信技能树

ggplot2 plotting tutorial in R: Pathway enrichment bubble plot - CSDN blog
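A bubble plot for the table above can be sketched as follows. To stay self-contained, the example types in only the first five rows of the table rather than reading a file; the choice of aesthetics (enrichment factor on the x axis, gene hits as size, Qvalue as colour) and the colour scale are my assumptions, not necessarily what the referenced tutorials use. The enrichFactor column is recomputed from the counts as (GeneHitsInSelectedSet / AllGenesInSelectedSet) / (GeneHitsInBackground / AllGenesInBackground), which reproduces the values listed in the table.

```r
library(ggplot2)

# First five rows of the pathway table above
pathway <- data.frame(
  Term_Name = c("Flavonoid biosynthesis",
                "Phenylalanine metabolism",
                "Stilbenoid, diarylheptanoid and gingerol biosynthesis",
                "Cysteine and methionine metabolism",
                "Ubiquitin mediated proteolysis"),
  GeneHitsInSelectedSet = c(14, 14, 9, 17, 23),
  AllGenesInSelectedSet = 492,
  GeneHitsInBackground  = c(41, 46, 31, 94, 126),
  AllGenesInBackground  = 3857,
  Qvalue = c(0.04909626, 0.091016736, 0.467387431, 0.654179598, 0.674860033)
)

# Enrichment factor: hit ratio in the selected set over hit ratio in the background
pathway$enrichFactor <- with(pathway,
  (GeneHitsInSelectedSet / AllGenesInSelectedSet) /
  (GeneHitsInBackground / AllGenesInBackground))

p_pathway <- ggplot(pathway,
                    aes(x = enrichFactor, y = reorder(Term_Name, enrichFactor))) +
  geom_point(aes(size = GeneHitsInSelectedSet, colour = Qvalue)) +
  scale_colour_gradient(low = "red", high = "blue") +
  labs(x = "Enrichment factor", y = "Pathway",
       size = "Gene hits", colour = "Qvalue") +
  theme_bw()

print(p_pathway)
```

To plot the full table, the same code can be applied to a data frame read from the complete file, for example with read.delim.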

