This page is an excerpt and archive of the corresponding thread on MITBBS (未名空间).
Biology board - Just noticed that DESeq2 is out, and it is so much better
x******m
Posts: 736
1
It is much faster.
Many functions are now bundled together, so it is much simpler to run. Nice.
z*********8
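Posts: 1203
2
I have used it and it seems good, though I have no idea what that negative
binomial model actually is. I don't publish in bioinfo journals anyway, so I
probably don't need to know too much. Hehe.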
x******m
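Posts: 736
3
It is just NB. If you don't do bioinfo, knowing that it is "NB" (Chinese
slang for "awesome") is enough. :)

[Quoting z*********8's post]
: I have used it and it seems good, though I have no idea what that negative
: binomial model actually is. I don't publish in bioinfo journals anyway, so I
: probably don't need to know too much. Hehe.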

N******n
Posts: 3003
4
Is it an R package? Besides finding differentially expressed genes, what else
is it useful for?
l**********1
Posts: 5204
5
Please refer to:
>
One issue with RNA-seq data, however, is that the variance of this
probability among different individuals of a group is substantially higher
than the mean, with respect to many genes (Anders, Huber 2010). A Poisson
distribution assumes an equal mean and variance, and is therefore not a good
fit. This issue, known as "overdispersion," has inspired statistical
software authors to adopt other models, particularly the negative binomial
(NB) distribution, which is characterized by an additional dispersion
parameter. Several popular differential expression packages, such as edgeR
(Robinson, McCarthy, Smyth 2010) and DESeq (Anders, Huber 2010) are based on
the NB distribution, but they differ extensively in how the dispersion
parameter is estimated, how normalization is performed, or how the
hypothesis test is carried out. For a nice tutorial on how DESeq works in
these respects, and its actual usage, see:
cgrlucb.wikispaces.com/Spring+2012+DESeq+Tutorial.
http://rnaseq.uoregon.edu/analysis.html
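
To make the overdispersion point above concrete, here is a minimal Python sketch. It uses the common NB parameterization in which the variance equals mean + dispersion * mean^2, so a dispersion of 0 reduces to the Poisson case. The replicate counts are made up for illustration, and the method-of-moments estimate is a deliberate simplification: as the quoted passage notes, real packages such as DESeq and edgeR estimate dispersion with more elaborate procedures.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Variance of a negative binomial with mean mu: mu + dispersion * mu^2."""
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Crude method-of-moments dispersion estimate for one gene.

    Returns 0.0 when the counts are no more spread out than a Poisson.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    return max((v - m) / m ** 2, 0.0)


# Hypothetical normalized counts for one gene across five replicates.
counts = [80, 95, 130, 160, 60]

alpha = moment_dispersion(counts)
print("mean:", mean(counts))
print("observed variance:", variance(counts))
print("variance a Poisson would predict:", mean(counts))
print("estimated dispersion:", round(alpha, 3))
print("NB variance at that dispersion:", nb_variance(mean(counts), alpha))
```

In this toy example the observed variance is far larger than the mean, which is exactly the situation a Poisson model cannot describe and the extra dispersion parameter can.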
The original hint came from an earlier post here:
Thread view: A Brief Introduction to Bioinformatics Tools for NGS Analysis
[Board: Biology] [Original poster: jcp], September 10, 2011
http://www.mitbbs.com/article_t/Biology/31569333.html

[Quoting N******n's post]
: Is it an R package? Besides finding differentially expressed genes, what else
: is it useful for?
l**********1
Posts: 5204
6
So what? Even if your findings hold up in solid wet-lab biology, can DESeq or
other software handle the non-linear trend between count level and variance?
Please refer to:
01-16-2013, 12:40 AM #1
JesperGrud
Junior Member
Location: Odense
Join Date: Aug 2012
Posts: 5
DESeq and independent filtering
Hi everyone
I know this topic has been up a few times, but yet there is a question. So
the basic idea about filtering is that it is done unsupervised to remove
genes that are too lowly expressed to become significant. This will in turn
reduce the number of tests made and therefore improve the multiple testing
correction of Benjamini-Hochberg. This is basically what Bourgon 2010 finds.
In the DESeq vignette, it is suggested to filter on the sum of reads for
each gene and remove the ones in the 40% bottom quantile. It is here the
question comes. Those 40% seem rather arbitrary to me and must depend on the
data set.
So the question is if it is statistically sound to just iterate through,
say, 20-60% cut-offs and determine which yields the best statistics and just
use that?
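
The filtering idea described in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the total counts and p-values are invented, `filter_by_total_count` and `count_hits` are hypothetical helper names, and the Benjamini-Hochberg adjustment is written out by hand rather than taken from DESeq. The filter uses only the total read count per gene, never the p-values, which is the "unsupervised" condition the post refers to.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_by_total_count(total_counts, fraction):
    """Indices of genes kept after dropping the lowest `fraction` by total count."""
    n_drop = int(len(total_counts) * fraction)
    order = sorted(range(len(total_counts)), key=lambda i: total_counts[i])
    return sorted(order[n_drop:])


def count_hits(pvalues, fdr=0.05):
    """Number of genes with BH-adjusted p-value below the FDR threshold."""
    return sum(p < fdr for p in bh_adjust(pvalues))


# Hypothetical data: summed reads per gene and raw p-values from some test.
totals = [2, 5, 8, 12, 300, 450, 700, 900, 1500, 4000]
pvals = [0.60, 0.45, 0.80, 0.30, 0.004, 0.03, 0.012, 0.20, 0.04, 0.001]

kept = filter_by_total_count(totals, 0.40)  # drop the bottom 40%
print("hits without filtering:", count_hits(pvals))
print("hits after filtering:", count_hits([pvals[i] for i in kept]))
```

With these made-up numbers, dropping the four lowest-count genes shrinks the number of tests from ten to six, and more genes pass the same FDR threshold. Whether scanning several cut-offs and keeping the best one is sound is the question the post raises; the sketch does not answer it.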
----
06-06-2013, 10:44 PM #70
sdriscoll
Senior Member
Location: La Jolla, CA, USA
Join Date: Sep 2009
Posts: 332
I do wonder why low-expressed genes tend to get the short end of the stick
in filtering. They aren't irrelevant...they are obviously expressed for some
reason unless we want to argue that we've got extra genes expressed doing
nothing at all. One explanation I tend to use is that when their count
values are low and additionally their coverage is very low across all
samples it's hard to say whether what we're seeing is noise or real evidence
that the gene is present but expressed very low. We can look at it as a
technical limitation of the sequencing run - we just didn't get enough reads
to test those genes.
Back to the filtering I do wonder one thing. Low count features tend to have
very high coefficients of variation but very low variance values. Highly
expressed genes tend to have very small coefficients of variation but very
high variance values. Is this maybe why Simon says they use the means
instead of the variances in their adaptation of what's outlined in the paper?
I fail to see how a linear cutoff could be applied when there's such a
clear non-linear trend between count level and variance.
http://seqanswers.com/forums/showthread.php?t=26560
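
The contrast drawn in this post between the coefficient of variation and the variance can be checked with a small Python sketch. It assumes a negative binomial mean-variance relationship (variance = mean + dispersion * mean^2) and a single dispersion value of 0.1 for all genes; both the value and the list of mean counts are assumptions chosen only for illustration.

```python
from math import sqrt


def nb_variance(mu, dispersion):
    """Variance of a negative binomial with mean mu: mu + dispersion * mu^2."""
    return mu + dispersion * mu ** 2


def coefficient_of_variation(mu, dispersion):
    """Standard deviation divided by the mean under the NB model above."""
    return sqrt(nb_variance(mu, dispersion)) / mu


DISPERSION = 0.1  # assumed value, shared by all genes for simplicity

for mu in (5, 50, 500, 5000):
    var = nb_variance(mu, DISPERSION)
    cv = coefficient_of_variation(mu, DISPERSION)
    print(f"mean={mu:>5}  variance={var:>12.1f}  CV={cv:.3f}")
```

Under these assumptions the variance grows roughly with the square of the mean while the CV falls toward sqrt(dispersion), which matches the non-linear trend the post describes: low-count genes have small variances but large CVs, and highly expressed genes the reverse.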

[Quoting x******m's post]
: It is just NB. If you don't do bioinfo, knowing that it is "NB" (Chinese
: slang for "awesome") is enough. :)