RECENT YALE CEGS TOOLS AND DATASETS AVAILABLE ON THE WEB (17-Sep-08)
II) Specific datasets
a) SVs
Sequence data were deposited in the NCBI Short Read Archive (SRA). Variants were deposited in the Database of Genomic Variants. Expression data were deposited in GEO. Accessions are listed at: http://sv.gersteinlab.org/index_files/data.html
Accession  | Data Type                     | Sample(s)
           | Array-CGH (microarray)*       | NA15510 vs. NA18505
SRA000197  | PEM paired-ends (DNA)         | NA15510
SRA000198  | PEM paired-ends (DNA)         | NA15510
SRA000199  | PEM paired-ends (DNA)         | NA18505
SRA000200  | PEM paired-ends (DNA)         | NA18505
SRA000201  | PEM paired-ends (DNA)         | NA18505
SRA000202  | PEM paired-ends (DNA)         | NA18505
SRA000203  | PEM paired-ends (DNA)         | NA18505
SRA000204  | Amplicon pool sequences (DNA) | NA15510
SRA000205  | Amplicon pool sequences (DNA) | NA18505
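For scripted access, the SRA rows of the accession listing above can be held in a small lookup structure. The following Python sketch copies those rows and groups accessions by sample. The names `SRA_RECORDS` and `accessions_by_sample` are illustrative only and are not part of any Yale or NCBI tool.

```python
from collections import defaultdict

# (accession, data type, sample) rows copied from the listing above.
# The Array-CGH row is omitted because its accession is not given there.
SRA_RECORDS = [
    ("SRA000197", "PEM paired-ends (DNA)", "NA15510"),
    ("SRA000198", "PEM paired-ends (DNA)", "NA15510"),
    ("SRA000199", "PEM paired-ends (DNA)", "NA18505"),
    ("SRA000200", "PEM paired-ends (DNA)", "NA18505"),
    ("SRA000201", "PEM paired-ends (DNA)", "NA18505"),
    ("SRA000202", "PEM paired-ends (DNA)", "NA18505"),
    ("SRA000203", "PEM paired-ends (DNA)", "NA18505"),
    ("SRA000204", "Amplicon pool sequences (DNA)", "NA15510"),
    ("SRA000205", "Amplicon pool sequences (DNA)", "NA18505"),
]


def accessions_by_sample(records, data_type=None):
    """Group accessions by sample, optionally keeping only one data type."""
    grouped = defaultdict(list)
    for accession, dtype, sample in records:
        if data_type is None or dtype == data_type:
            grouped[sample].append(accession)
    return dict(grouped)


if __name__ == "__main__":
    # Print one line per sample with its accessions in listing order.
    for sample, accessions in sorted(accessions_by_sample(SRA_RECORDS).items()):
        print(sample, ", ".join(accessions))
```

Passing `data_type="PEM paired-ends (DNA)"` restricts the grouping to the paired-end runs, which may be convenient when the amplicon pools are handled separately.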
SV data were loaded into our breakpoint database, BreakDB, located at http://sv.gersteinlab.org/breakdb. The Korbel et al. data, for example, can be accessed in two ways: a text file containing all the SVs can be downloaded by selecting Korbel Release 1, or individual breakpoint events can be viewed within BreakDB. To view breakpoint events, first select View by Source, then Korbel et al. (2007). Each breakpoint event is listed with information such as location, event type, flanking sequences and a suggested mechanism.
b) RNA Sequencing
Data were deposited in the NCBI GEO. Three types of files exist:
1. raw sequence reads/quality
2. processed data (exonic, junction and polyA reads)
3. a meta file describing the experiment
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE11209
Ours was the first file of this type sent to GEO.
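The GEO record above can also be addressed from a script. The sketch below only builds the query URL and performs no network access. The `targ`, `view` and `form` parameters follow GEO's documented URL scheme for `acc.cgi` as we understand it; the helper name `geo_record_url`, the chosen default values and the use of `https` are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Base endpoint taken from the GEO link above (https is assumed here).
GEO_ACC_ENDPOINT = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def geo_record_url(accession, targ="self", view="brief", form="text"):
    """Build a GEO accession query URL (hypothetical helper, no request is sent).

    targ: which records to return (e.g. "self" for the series itself)
    view: amount of detail (e.g. "brief")
    form: output format (e.g. "text")
    """
    query = urlencode(
        {"acc": accession, "targ": targ, "view": view, "form": form}
    )
    return f"{GEO_ACC_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(geo_record_url("GSE11209"))
```

The resulting URL can be opened in a browser or passed to any HTTP client; the raw reads, processed data and meta file are then reached from the returned record.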
In addition, the Science website contains a file with all new annotation and expression-level information (Figure S4).
In addition, our website has:
1. GBrowse track for the new annotations.
2. GBrowse track for novel transcribed regions.
3. Table of ORFs that have heterogeneous polyA sites.
4. List of introns confirmed by RNA-Seq.
http://www.yale.edu/snyder/Naga2008sup.html
Software:
The core of the software is maintained at:
https://sourceforge.net/projects/nxgview/
SourceForge is a widely used host for open-source software. The version used for our published yeast data is available there.
IV) Informatics Tools and Websites:
RNA-Seq is described above. Other technologies and websites are:
Jan O. Korbel, Alexander Eckehart Urban, Fabian Grubert, Jiang Du, Thomas E. Royce, Peter Starr, Guoneng Zhong, Beverly S. Emanuel, Sherman M. Weissman, M. Snyder & M. B. Gerstein (2007). Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome. PNAS 104: 10110-5.
[TOOL] WEBSITE: http://tiling.mbb.yale.edu/BreakPtr/index.html
TE Royce, JS Rozowsky, MB Gerstein (2007). Assessing the need for sequence-based normalization in tiling microarray experiments. Bioinformatics 23: 988-97.
[TOOL] WEBSITE: http://tiling.gersteinlab.org/sequence_effects
H Yu, K Nguyen, T Royce, J Qian, K Nelson, M Snyder, M Gerstein (2007). Positional artifacts in microarrays: experimental verification and construction of COP, an automated detection tool. Nucleic Acids Res 35: e8.
[TOOL] WEBSITE: http://bioinfo.mbb.yale.edu/ExpressYourself (COP submodule)
JE Karro, Y Yan, D Zheng, Z Zhang, N Carriero, P Cayting, P Harrison, M Gerstein (2007). Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res 35: D55-60.
[TOOL] WEBSITE: http://pseudogene.org
KY Yip, P Patel, PM Kim, DM Engelman, D McDermott, M Gerstein (2008). An integrated system for studying residue coevolution in proteins. Bioinformatics 24: 290-2.
[TOOL] WEBSITE: http://coevolution.gersteinlab.org/coevolution
TE Royce, NJ Carriero, MB Gerstein (2007). An efficient pseudomedian filter for tiling microarrays. BMC Bioinformatics 8: 186.
[TOOL] WEBSITE: http://tiling.gersteinlab.org/pseudomedian
ZD Zhang, J Rozowsky, M Snyder, J Chang, M Gerstein (2008). Modeling ChIP sequencing in silico with applications. PLoS Comput Biol 4: e1000158.
[TOOL] WEBSITE: http://www.gersteinlab.org/proj/chip-seq-simu