A question about genomics and bioinformatics - Biology board - 未名存档

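b******s (posts: 1089)	1 I don't know much about genomics and would like some advice. After building a transcriptional profile with microarray or RNA-seq, can one analyze the promoter elements of the genes whose expression changes, pick out the high-hit candidates, and then use those candidate elements in a yeast-1-hybrid screen to find the transcription factors that bind them? Is this infeasible in practice? If so, is it because it is hard to end up with reliable elements to screen with? These elements are not tied to one or two specific genes. They come from a treatment group that shows a phenotype, and the goal is to use the changes in gene expression to find the transcriptional regulators associated with that phenotype. My model system is a plant. Thanks!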
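A minimal sketch of the enrichment step described in the post above: count how many promoters of the changed genes contain a candidate element and compare that with all promoters using a hypergeometric tail probability. The function names, the dictionary input format, and the toy motif and sequences are assumptions made for illustration; only the forward strand is searched, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N items are drawn without replacement from M items,
    n of which are 'successes' (here: promoters containing the motif)."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def motif_enrichment(promoters, changed_genes, motif):
    """promoters: dict mapping gene id -> promoter sequence (hypothetical format).
    changed_genes: set of gene ids whose expression changed.
    Returns (number of changed genes with the motif, enrichment p-value)."""
    with_motif = {g for g, seq in promoters.items() if motif in seq.upper()}
    changed = set(changed_genes).intersection(promoters)
    k = len(with_motif & changed)
    p_value = hypergeom_sf(k, len(promoters), len(with_motif), len(changed))
    return k, p_value


# Toy data, invented for illustration only.
toy_promoters = {
    "g1": "AAGCCGCCTT", "g2": "TTGCCGCCAA", "g3": "ATATATATAT",
    "g4": "CGCGATATAT", "g5": "TTTTTTTTTT", "g6": "ACGTACGTAC",
}
hits, p = motif_enrichment(toy_promoters, {"g1", "g2", "g3"}, "GCCGCC")
```

A low p-value only says the element is over-represented in the changed set; as the replies point out, it does not show that the element is functional or which factor binds it.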
K**4 (posts: 1015)	2 Not feasible! You can do it, but the false positive rate is so high that the result has no practical meaning. The main problem is that current mRNA data are incomplete, so the real transcription start site (TSS) cannot be located correctly. By convention people extract the 500-4000 bp upstream of the TSS for promoter analysis, but the real one may well lie 50-100 kb or more away. [In reply to b******s, post 1]
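For reference, the conventional extraction mentioned above can be sketched as follows, assuming the chromosome sequence is already in memory as a string and the TSS is given as a 0-based coordinate with a strand. The function names and the default window of 1000 bp are illustrative choices, not something stated in the thread.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq, tss, strand, length=1000):
    """Return up to `length` bases immediately upstream of the TSS,
    written 5'->3' on the gene's own strand.
    tss is the 0-based index of the first transcribed base."""
    if strand == "+":
        start = max(0, tss - length)          # clip at the chromosome start
        return chrom_seq[start:tss]
    if strand == "-":
        end = min(len(chrom_seq), tss + 1 + length)  # clip at the chromosome end
        return reverse_complement(chrom_seq[tss + 1:end])
    raise ValueError("strand must be '+' or '-'")
```

The result is only as good as the TSS annotation: if the annotated TSS is wrong, the window is cut from the wrong place, which is the objection raised in the post.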
x******m (posts: 736)	3 In that case you would be better off doing ChIP-seq.
b******s (posts: 1089)	4 Isn't ChIP-seq for finding downstream targets? What I want to find now is the upstream TFs that bind. [In reply to x******m, post 3]
b******s (posts: 1089)	5 Could this approach still turn up some plausible elements, which I would then fuse to fluorescent proteins for further in vivo analysis, and only run the screen once they are confirmed? [In reply to K**4, post 2]
l**********1 (posts: 5204)	6 If it relates to stress response, you can try RAD-seq (Restriction site Associated DNA Sequencing):
http://www.molbio.uoregon.edu/facres/johnson.html
https://www.wiki.ed.ac.uk/display/RADSequencing/Home
If it relates to DNA methylation, you can try BS-Seq (Bisulphite Sequencing):
http://seqanswers.com/wiki/BS-Seq
The original hint came from a Nature job posting:
http://www.nature.com/naturejobs/science/jobs/344164-postdoctor
> Postdoctoral Fellow in Evolutionary Bioinformatics: Vienna, Austria
> A postdoctoral position in bioinformatics is immediately available in the research group of Ovidiu Paun at the University of Vienna (see http://www.botanik.univie.ac.at/systematik/personnel/Paun.htm).
(rest omitted)
> The candidate will play a lead role in analysing next generation sequencing data including RNA-seq, smRNA-seq, BS-seq and RAD-seq. The fellow will be also involved in identifying outliers and performing environmental correlations.
> We are looking for a highly self-motivated and independent candidate, yet willing to work in a team effort. The fellow should hold a relevant PhD degree in bioinformatics or related fields before starting this position. Fluency in a major programming language such as perl or python and a strong publication record are expected. The successful candidate should also be able to demonstrate experience with computational analyses of high-throughput genomic data.
> To be considered please send your application per email to ovidiu.paun’@‘ univie.ac.at including your CV, ........
> The latest preferred start date is March 1st, 2014.
or
http://evol.mcmaster.ca/~brian/evoldir/PostDocs/Vienna.Evolutio
http://evol.mcmaster.ca/cgi-bin/my_wrap/brian/evoldir/PostDocs/
[In reply to b******s, post 4]
u*********1 (posts: 2518)	7 We need to keep two things clearly apart:
a. RNA-seq shows that the transcription level of some genes changes.
b. Yeast hybrid, ChIP-seq and similar experiments give evidence that a particular TF binds certain elements of a gene.
My point is that the regulation of transcription level is extremely complex and goes far beyond the promoter. You can look directly at the ENCODE project's annotation of noncoding regions: many TFs bind within introns, and genes are also regulated by many distant enhancers (through chromatin structure such as looping). So a change in transcription level absolutely cannot be taken to mean that regulation at the promoter caused the change in expression.
Of course I don't fully trust the ENCODE data either. 1. I believe many other, rarer TFs also bind these sites but are not covered by the database. 2. Even with evidence that a TF binds a locus, that does not mean the binding has a biological function; this needs downstream validation.
All of this is still the simplest case. It ignores tissue specificity, epigenetics, splicing and so on. In short, far too many factors can change the transcription level.
So what I mean is that a and b have no necessary relationship, even though they look connected. [In reply to b******s, post 1]
u*********1 (posts: 2518)	8 For the genes you are interested in, just look them up on UCSC. There is a lot of ChIP-seq-based data now, so you can see which TFs bind your genes of interest (for example, those whose transcription changes) across the whole genome. If you find a stretch of sequence that is bound by many TFs and is also conserved, that element is very likely functional, and you can then test it in a luciferase system. [In reply to b******s, post 5]
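The "bound by many TFs" part of this suggestion can be approximated offline once peak calls have been downloaded. The sketch below assumes peaks are available as simple `(chrom, start, end, tf)` tuples with half-open coordinates; that layout, the function names, and the TF labels are hypothetical, and conservation scoring is not covered.

```python
def tfs_overlapping(region, peaks):
    """Return the set of TF names with at least one peak overlapping the region.
    region: (chrom, start, end); peaks: list of (chrom, start, end, tf).
    Coordinates are treated as half-open intervals [start, end)."""
    chrom, start, end = region
    return {tf for c, s, e, tf in peaks if c == chrom and s < end and e > start}


def rank_regions_by_tf_count(regions, peaks):
    """Sort candidate regions by the number of distinct TFs bound, highest first."""
    counted = [(len(tfs_overlapping(r, peaks)), r) for r in regions]
    return sorted(counted, key=lambda item: item[0], reverse=True)


# Toy peak list, invented for illustration only.
toy_peaks = [
    ("chr1", 100, 200, "TF_A"), ("chr1", 150, 250, "TF_B"),
    ("chr1", 190, 300, "TF_A"), ("chr2", 100, 200, "TF_C"),
    ("chr1", 500, 600, "TF_C"),
]
ranking = rank_regions_by_tf_count([("chr1", 480, 520), ("chr1", 180, 210)], toy_peaks)
```

This linear scan is fine for a handful of candidate regions; genome-wide use would call for an interval index or a dedicated tool.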
c***y (posts: 615)	9 Do those transcription factor binding site prediction programs make sense? I mean, if ChIP data are not available, what can we do about regulatory elements on the basis of sequence alone? [In reply to u*********1, post 8]
u*********1 (posts: 2518)	10 Makes NO sense at all, from my perspective. You can find plenty of prediction software and websites, and different sites give completely different predictions. TF binding motifs are, I think, all highly variable (http://en.wikipedia.org/wiki/Position-specific_scoring_matrix). Of course I am an outsider, and I would like to ask the insiders who work on TF binding: can we by now say precisely that the binding site of, say, MEF2A must be, say, ATGGCC (a sequence I just made up)? In my experience, even if MEF2A bound exclusively to ATGGCC, that would not mean every ATGGCC is targeted by MEF2A; you still have to do the experiment. What I trust more at this point is the following: if ChIP-seq data show that a TF binds a certain locus of a gene, and software predicts that a SNP at that locus changes the binding motif, then I believe this SNP regulates the gene's expression through binding of that TF. [In reply to c***y, post 9]
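To make the position-specific scoring matrix idea concrete, here is a minimal sketch that builds a log-odds matrix from a few aligned sites, scans a sequence for the best-scoring window, and reports how a single-base change shifts the score. The aligned sites are toy data built around the made-up ATGGCC from the post, and the pseudocount of 0.5 and uniform background of 0.25 are assumptions; input is expected to be uppercase A/C/G/T only.

```python
import math


def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Build a log2-odds position weight matrix from equal-length aligned sites."""
    pwm = []
    for i in range(len(sites[0])):
        column = {}
        for base in "ACGT":
            count = sum(1 for s in sites if s[i] == base)
            freq = (count + pseudocount) / (len(sites) + 4 * pseudocount)
            column[base] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def score(pwm, window):
    """Sum of per-position log-odds for a window the same length as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window on the forward strand.
    Assumes seq is at least as long as the matrix."""
    width = len(pwm)
    return max((score(pwm, seq[i:i + width]), i) for i in range(len(seq) - width + 1))


def snp_delta(pwm, ref_window, alt_window):
    """Score change caused by a variant; negative means a weaker predicted match."""
    return score(pwm, alt_window) - score(pwm, ref_window)


# Toy aligned sites, invented for illustration only.
toy_pwm = build_pwm(["ATGGCC", "ATGGCC", "ATGGCA", "TTGGCC"])
```

A high score only means the sequence resembles the motif. As the post argues, whether the factor actually binds there still has to be shown experimentally.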
b******s (posts: 1089)	11 Thank you very much. What I actually have in mind is doing this at the genome-wide level: analyze the promoter elements of the transcripts that change to find some candidates, then link them to reporters for further screening, and take the confirmed elements into a Y1H screen. So even with many false negatives and false positives, as long as I can find some genes that change, together with their regulatory sequences and regulators, that would be good enough. Also, my system is a plant, and I hear that many plant databases are poor and basically unusable. [In reply to u*********1, post 7]
l**********1 (posts: 5204)	12 Databases for non-plant or mammalian systems vary as well, and the plant ones are likewise still being improved. The key question is whether the original poster has a background in NGS and HMMs (Hidden Markov Models), or can find a collaborator who does.
Please refer to PMID 23435661: Van der Does D et al. (2013). Salicylic acid suppresses jasmonic acid signaling downstream of SCFCOI1-JAZ by targeting GCC promoter motifs via transcription factor ORA59. Plant Cell. 25: 744-61.
Abstract (excerpt): In silico promoter analysis of the SA/JA crosstalk transcriptome revealed that the 1-kb promoter regions of JA-responsive genes that are suppressed by SA are significantly enriched in the JA-responsive GCC-box motifs (rest omitted)
http://www.ncbi.nlm.nih.gov/pubmed/23435661
Full PDF link: http://www.plantcell.org/content/25/2/744.full.pdf
or
Wong KC et al. (2013). DNA motif elucidation using belief propagation. Nucleic Acids Res. 41: e153.
http://www.ncbi.nlm.nih.gov/pubmed/23814189
Abstract: Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k = 8~10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. (rest omitted)
http://www.cs.toronto.edu/~wkc/kmerHMM/downloads.html
or
http://www.cs.utoronto.ca/~wkc/
http://www.utoronto.ca/zhanglab/people.html
[In reply to b******s, post 11]
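One plausible reading of "median-binding intensities of individual k-mers" in the quoted abstract is sketched below: each probe contributes its intensity once to every distinct k-mer it contains, and each k-mer is summarized by the median. This is an illustration of that general idea, not the kmerHMM implementation; the input format and function name are assumptions, and reverse-complement merging is left out.

```python
from collections import defaultdict
from statistics import median


def kmer_median_intensity(probes, k=8):
    """probes: iterable of (probe_sequence, signal_intensity) pairs.
    Returns a dict mapping each k-mer to the median intensity of the
    probes that contain it (each probe counted once per k-mer)."""
    by_kmer = defaultdict(list)
    for seq, intensity in probes:
        distinct = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in distinct:
            by_kmer[kmer].append(intensity)
    return {kmer: median(values) for kmer, values in by_kmer.items()}


# Toy probes with k=3, invented for illustration only.
toy_medians = kmer_median_intensity([("ACGT", 10.0), ("ACGA", 2.0), ("TACG", 6.0)], k=3)
```

Using the median rather than the mean keeps a single unusually bright or dim probe from dominating the summary for a k-mer.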
l**********1 (posts: 5204)	13 Please refer to these two recent papers:
a. Morozov VY and Ioshikhes IP. (2013). Optimized Position Weight Matrices in Prediction of Novel Putative Binding Sites for Transcription Factors in the Drosophila melanogaster Genome. PLoS One. 8: e68712.
Abstract (excerpt): Position weight matrices (PWMs) have become a tool of choice for the identification of transcription factor binding sites in DNA sequences. (part omitted) In the present study, we extended this technique originally tested on single examples of transcription factors (TFs) and showed its capability to optimize PWM performance to predict new binding sites in the fruit fly genome. We propose refined PWMs in mono- and dinucleotide versions similarly computed for a large variety of transcription factors of Drosophila melanogaster. Along with the addition of many auxiliary sites the optimization includes variation of the PWM motif length, the binding sites location on the promoters and the PWM score threshold. To assess the predictive performance of the refined PWMs we compared them to conventional TRANSFAC and JASPAR sources. (rest omitted)
http://www.ncbi.nlm.nih.gov/pubmed/23936309
or
b. Radivojac P et al. (2013). A large-scale evaluation of computational protein function prediction. Nat Methods. 10: 221-7.
Abstract (excerpt): Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.
http://www.ncbi.nlm.nih.gov/pubmed/23353650
[In reply to c***y, post 9]
l**********1 (posts: 5204)	14 Please refer to this review: Madrigal P and Krajewski P. (2012). Current bioinformatic approaches to identify DNase I hypersensitive sites and genomic footprints from DNase-seq data. Front Genet. 3:230.
Quoted from p. 2:
> With sufficiently deep sequencing, the so-called "digital genomic footprinting" technique can reveal single protein-binding events (Hesselberth et al., 2009). Unlike ChIP-seq, which is specific for the protein under study, footprints identify narrow DNA regions that can be bound by any factor (Hager, 2009), showing significant enrichment for known motifs upstream of the transcription start sites (TSSs).
http://www.ncbi.nlm.nih.gov/pubmed/23118738
or
Zhang W et al. (2012). Genome-wide identification of regulatory DNA elements and protein-binding footprints using signatures of open chromatin in Arabidopsis. Plant Cell. 24: 2719-31.
http://www.ncbi.nlm.nih.gov/pubmed/22773751
[In reply to b******s, post 4]
