Documentation
Citation
For all resources provided on the AlleleDB website, including the allele-specific binding (ASB), allele-specific expression (ASE) and accessible SNVs, supplementary materials, personal genomes and scripts, please cite:
Chen J, Rozowsky J, Galeev TR, Harmanci A, Kitchen R, Bedford J, Abyzov A, Kong Y, Regan L, Gerstein M. A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals. Nat Commun. (2016) 7:11101
For more details on the original AlleleSeq and vcf2diploid, please visit the AlleleSeq website here.
Querying the database
This database interfaces with the UCSC genome browser search engine; all coordinates refer to the human reference genome hg19.
There are three basic query options that can be entered into the search box on the 'Query' page.
1) A region of the genome.
E.g. chr15:25247000-25365000. This will directly query the database with that region.
2) Name of a gene (HGNC symbol or otherwise).
E.g. EEF2. This will return the UCSC genome annotations (with positional information) associated with that gene name. Each annotation appears as a link that queries the database for that region.
3) Keywords.
E.g. tetratricopeptide repeat.
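As a minimal sketch of how the first query type differs from the other two, the snippet below checks whether a query string has the chrN:start-end form shown in option 1. The function name `parse_region` and the regular expression are illustrative assumptions, not part of AlleleDB or the UCSC search engine; gene names and keywords simply fail to match and would be handled by the search engine itself.

```python
import re

# Hypothetical pattern for region queries such as chr15:25247000-25365000.
# This is an assumption for illustration, not the pattern used by AlleleDB.
REGION_RE = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+)-(\d+)$")


def parse_region(query):
    """Return (chrom, start, end) if the query looks like a region, else None."""
    match = REGION_RE.match(query.strip())
    if match is None:
        # Not a region: treat as a gene name or keyword search.
        return None
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    if start > end:
        raise ValueError(f"region start {start} is greater than end {end}")
    return chrom, start, end


print(parse_region("chr15:25247000-25365000"))
print(parse_region("EEF2"))
```

A region string yields a tuple of chromosome, start and end; anything else returns None and can be passed on as a gene name or keyword.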
Output of AlleleDB
Upon successful query of the database, two files are produced: "out.bed" and "view.txt".
Sample contents of "out.bed" are shown below.
chr3 10192471 10192472 NA18486_ASE G/A 84 0 92 0 0 ASE
chr3 10192671 10192672 NA18486_ASE G/A 98 0 73 0 0 ASE
chr3 10191942 10191943 NA20505_ASE G/A 25 0 27 0 0 ASE
chr3 10192671 10192672 NA20505_ASE G/A 30 0 16 0 0 ASE
chr3 10192671 10192672 NA12878_ASB-Pol2 G/A 7 0 4 0 0 ASB-Pol2
chr3 10193682 10193683 NA12878_ASB-Pol2 T/G 0 0 8 6 0 ASB-Pol2
chr3 10184955 10184956 NA12878_ASB-SRF A/T 12 0 0 8 0 ASB-SRF
The columns are:
1. Chromosome of the SNV
2. Start position of the SNV (0-based)
3. End position of the SNV (1-based)
4. Identifier of the individual (as per the 1000 Genomes Project) with ASB/ASE annotation. For more information on the individuals, see here.
5. Reference allele / Alternate allele
6. Read counts for Adenine
7. Read counts for Cytosine
8. Read counts for Guanine
9. Read counts for Thymine
10. Was this SNV detected as allele-specific (AS)? (0:No | 1:Yes)
11. Additional annotation: ASB/ASE, followed by TF name if ASB.
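The column layout above can be read programmatically. The following is a minimal sketch, assuming the fields are whitespace-separated as in the sample and that every line has exactly 11 columns; the names `AlleleRecord` and `parse_out_bed_line` are hypothetical and are not part of the AlleleDB scripts.

```python
from typing import Dict, NamedTuple


class AlleleRecord(NamedTuple):
    """One line of out.bed, following the 11 columns described above."""
    chrom: str
    start: int               # 0-based start
    end: int                 # end position (1-based position of the SNV)
    sample: str              # individual identifier with ASB/ASE annotation
    ref: str
    alt: str
    counts: Dict[str, int]   # read counts keyed by base: A, C, G, T
    is_as: bool              # column 10: 1 means detected as allele-specific
    annotation: str          # ASE, or ASB followed by the TF name


def parse_out_bed_line(line):
    """Parse one out.bed line (assumed whitespace-separated) into a record."""
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, found {len(fields)}")
    ref, alt = fields[4].split("/")
    # Columns 6-9 hold the read counts in the order A, C, G, T.
    counts = dict(zip("ACGT", (int(value) for value in fields[5:9])))
    return AlleleRecord(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        sample=fields[3],
        ref=ref,
        alt=alt,
        counts=counts,
        is_as=fields[9] == "1",
        annotation=fields[10],
    )


record = parse_out_bed_line(
    "chr3 10192471 10192472 NA18486_ASE G/A 84 0 92 0 0 ASE"
)
# Reads supporting the reference and alternate alleles for this SNV.
print(record.counts[record.ref], record.counts[record.alt])
```

Keeping the four base counts in a dictionary makes it straightforward to look up the reference and alternate read counts from column 5 without hard-coding column positions.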
Sample contents of "view.txt" are below.
browser position chr3:10183319-10195354
track name="AS SNVs" description="AlleleDB Output" itemRgb="On"
chr3 10192471 10192472 NA18486_ASE 0 + 10192471 10192472 0,0,0
chr3 10192671 10192672 NA18486_ASE 0 + 10192671 10192672 0,0,0
chr3 10191942 10191943 NA20505_ASE 0 + 10191942 10191943 0,0,0
chr3 10192671 10192672 NA20505_ASE 0 + 10192671 10192672 0,0,0
chr3 10193682 10193683 NA12878_ASB-Pol2 0 + 10193682 10193683 0,0,0
The "view.txt" file is meant to be viewed in the UCSC genome browser as a track (the output page provides a link to do this). The file can also be downloaded and saved as a custom track for later use.
On the track, ASB SNVs are colored red and ASE SNVs are black.
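For readers who want to build a similar track line themselves, the sketch below formats one entry in the nine-column layout of the "view.txt" sample (chromosome, start, end, name, score, strand, thick start, thick end, item color). The function name `to_track_line` is hypothetical, and the RGB value 255,0,0 for red is an assumption: it is not taken from the sample, where every row shows 0,0,0.

```python
def to_track_line(chrom, start, end, name):
    """Format one SNV as a track line in the layout of the view.txt sample.

    Assumption: ASB entries are recognised by "_ASB" in the name and get
    red (255,0,0); everything else gets black (0,0,0).
    """
    rgb = "255,0,0" if "_ASB" in name else "0,0,0"
    # Score is fixed at 0 and strand at "+", as in the sample rows;
    # the thick start and end repeat the SNV coordinates.
    return f"{chrom} {start} {end} {name} 0 + {start} {end} {rgb}"


print(to_track_line("chr3", 10192471, 10192472, "NA18486_ASE"))
print(to_track_line("chr3", 10193682, 10193683, "NA12878_ASB-Pol2"))
```

For the track to honor the per-item colors, the track line needs itemRgb="On", as in the sample header.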
Downloading precompiled results
More complete data, with TF (for ASB) or gene (for ASE) and sample annotations, are available for download on our 'Download' page.
Data sources
DNA-seq:
1000 Genomes Project (Abecasis G. et al., Nature (2012)); PMID: 23128226
RNA-seq:
gEUVADIS (Lappalainen T. et al., Nature (2013)); PMID: 24037378
ENCODE (ENCODE Project Consortium, Nature (2012)); PMID: 22955616
Lalonde et al., Genome Res (2011); PMID: 21173033
Montgomery et al., Nature (2010); PMID: 20220756
Pickrell et al., Nature (2010); PMID: 20220758
Kilpinen et al., Science (2013); PMID: 24136355
Kasowski et al., Science (2013); PMID: 24136358
ChIP-seq:
ENCODE (ENCODE Project Consortium, Nature (2012))
McVicker et al., Science (2013); PMID: 24136359
Kilpinen et al., Science (2013); PMID: 24136355
Kasowski et al., Science (2013); PMID: 24136358
Scripts
The AlleleDB pipeline uses the following scripts from GitHub to filter reads with ambiguous mapping bias and to detect allele-specific SNVs using beta-binomial calculations. They are used in conjunction with the AlleleSeq pipeline (v1.2a; vcf2diploid tool v0.2.6 for personal genome construction):
alleleDB scripts v2.0
alleleDB scripts v1.0
Questions/Comments
Please contact J. Chen (jieming dot chen at yale dot edu) for questions, comments or feedback on AlleleDB.