Microsatellite markers – 1,486 members of 264 families. Marshfield Marker Set 16.
Ancestry Informative Marker Panel – (n=4,605)
- Description: Approximately 1,500 genome-wide markers that are highly differentiated in frequency between Europeans and Africans. These markers provide less information than Affymetrix 6.0 GWAS genotypes but are available in a larger sample, which may allow, for example, imputation of rare variants of interest into related individuals.
- Methods: Genotyping methods and quality control are described in Nalls, et al.1
IBC Cardiovascular Candidate Gene Array2, 3 – (n=2,948)
- Genotyping was performed through NHLBI’s Candidate Gene Association Resource (CARe) consortium.
- Description: Gene-centric array interrogating ~55,000 SNPs selected to tag 2,100 CVD candidate genes chosen based on biologic function, involvement in CVD-related Mendelian syndromes, GWAS results, and other criteria.
- Methods: Design of the IBC Array is described in Keating et al2; and genotyping and quality control as well as organizational features of the CARe consortium are described in Musunuru, et al.3 A list of the genes and SNPs on the IBC Array is available on request.
Affymetrix 6.0 GWAS Genotyping – (n=3,029)
- Genotyping was performed through NHLBI’s Candidate Gene Association Resource (CARe) consortium.3
- Description: >906,600 genome-wide tag SNPs and >946,000 probes for copy-number variation.
- Methods: Genotyping and quality control are described in Lettre, et al.4 Data have been imputed to the 1000 Genomes phase 1 v3 reference panel as described in Duan, et al.5
1000 Genomes Phase 3 – (n=3,029)
1000G Phase 3 Imputed Data: VCF files of dosages and likely genotypes for autosomal SNPs imputed from the 1000 Genomes Project (1000G) Phase 3 version 5 reference panel. Imputation was completed using Minimac3 on the Michigan Imputation Server (PMID 27571263). The reference panel includes 5,008 haplotypes from 26 populations across the world (http://www.internationalgenome.org). Imputed SNPs were filtered for minor allele frequency ≥ 1%, call rate ≥ 90%, and Hardy-Weinberg equilibrium (HWE) test p-value > 10⁻⁶; sites with invalid or mismatched alleles relative to the reference panel were excluded.
1) JHS Sample size (N) = 3,029 (includes 9 samples that are recommended to be excluded, based on quality control issues such as sex or pedigree mismatches)
2) Total SNPs imputed = 49,143,605 (not filtered for imputation quality or minor allele count)
These are the IDs that should be excluded:
• Sex mismatch: XT6HJD_XT6HJD, 572HY3_572HY3
• Duplicates (not twins): RP5EBV_RP5EBV, 9A42C8_9A42C8, 449BSP_449BSP
• Ambiguous sex check (F value for X chromosome): 199657_199657
• Pedigree issues: F64JI8_F64JI8, YR136I_YR136I, P3AWF4_P3AWF4
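The variant filters and the sample exclusion list above can be sketched in a few lines of Python. This is only an illustration: the function names (`passes_variant_qc`, `keep_samples`) are hypothetical, and it assumes that per-variant minor allele frequency, call rate, and HWE p-value have already been computed by other software. It is not the pipeline that was used to produce the distributed files.

```python
# Sample IDs recommended for exclusion, copied from the list above.
EXCLUDED_SAMPLE_IDS = {
    "XT6HJD_XT6HJD", "572HY3_572HY3",                   # sex mismatch
    "RP5EBV_RP5EBV", "9A42C8_9A42C8", "449BSP_449BSP",  # duplicates (not twins)
    "199657_199657",                                    # ambiguous sex check
    "F64JI8_F64JI8", "YR136I_YR136I", "P3AWF4_P3AWF4",  # pedigree issues
}

# Thresholds as stated in the text.
MIN_MAF = 0.01        # minor allele frequency >= 1%
MIN_CALL_RATE = 0.90  # call rate >= 90%
MIN_HWE_P = 1e-6      # HWE p-value must be strictly greater than 1e-6


def passes_variant_qc(maf, call_rate, hwe_p):
    """Return True if a variant meets all three stated thresholds."""
    return maf >= MIN_MAF and call_rate >= MIN_CALL_RATE and hwe_p > MIN_HWE_P


def keep_samples(sample_ids):
    """Drop the recommended exclusions, preserving input order."""
    return [s for s in sample_ids if s not in EXCLUDED_SAMPLE_IDS]


if __name__ == "__main__":
    # "AAAAAA_AAAAAA" is a made-up ID used only for this demonstration.
    print(keep_samples(["AAAAAA_AAAAAA", "XT6HJD_XT6HJD"]))
    print(passes_variant_qc(maf=0.05, call_rate=0.98, hwe_p=0.2))
```

Removing the 9 listed IDs from the 3,029 imputed samples leaves 3,020 samples for analysis.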
Targeted exon sequencing in 256 candidate genes – (n=1,963)
- Sequencing was supported by the NHGRI sequencing centers in response to an application by Dr. Christine Seidman and others.
- Description: Candidate genes were nominated by project investigators based on evidence (from Mendelian families, GWAS, etc.) of involvement in LV remodeling, diabetes, dyslipidemia, dysrhythmia, or hypertension. A custom capture array targeting exons of 256 candidate genes was developed. DNA of 1,637 members of the Framingham Offspring Cohort and 1,963 members of the Jackson Heart Study was sequenced. A list of the targeted genes is available.
- Methods: Sequencing and quality control methods are described in Bick, et al.6
Exome sequencing – (n=3,374)
- Exome sequencing of JHS samples has been performed under four separate projects. The total of 3,374 unique samples includes some samples that were sequenced in more than one project:
- Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples (T2D-GENES; NIDDK): n=1,036.
- NHLBI’s Exome Sequencing Project (ESP): n=1,518.
- Minority Health Genomics and Translational Research Bio-repository Database (MH-GRID; NHLBI): n=312.
- Cohorts for Heart and Aging Research in Genomic Epidemiology Sequencing Project (CHARGE-S; performed through the Atherosclerosis Risk in Communities [ARIC] study [NHLBI] among participants included in both JHS and ARIC): n=522.
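As a quick consistency check, the per-project counts above can be compared with the stated number of unique samples. The short Python sketch below assumes that each project's n counts distinct samples within that project; the variable names are illustrative only.

```python
# Per-project sample counts as listed above.
PROJECT_COUNTS = {
    "T2D-GENES": 1036,
    "ESP": 1518,
    "MH-GRID": 312,
    "CHARGE-S": 522,
}
UNIQUE_SAMPLES = 3374  # stated total of unique samples

total_sequenced = sum(PROJECT_COUNTS.values())
# Sequencing runs beyond one per unique sample, i.e. repeat sequencing
# of samples that appear in more than one project.
extra_runs = total_sequenced - UNIQUE_SAMPLES

print(total_sequenced, extra_runs)
```

The difference is an upper bound on the number of samples sequenced in more than one project; it equals that number only if no sample appears in three or more projects.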
Methods: Library preparation, target capture, sequencing, variant calling and quality control have been performed at the Broad Institute, the University of Washington, and the Baylor College of Medicine (CHARGE-S) using methods similar to those described for the Exome Sequencing Project (Tennessen et al7). Sample shotgun libraries were captured for exome enrichment using one of three in-solution capture products: CCDS 2008 (~26 Mb), Roche/Nimblegen SeqCap EZ Human Exome Library v1.0 (~32 Mb; Roche Nimblegen EZ Cap v1), or EZ Cap v2 (~34 Mb), and sequencing was performed on Illumina GAIIx or HiSeq 2000 machines.
Joint calling: Sequence data from the four projects listed above were called jointly in the Kathiresan Laboratory at the Broad Institute. Sequence data of all participants were aligned to a human reference genome (hg19) using the Burrows–Wheeler Aligner algorithm. Aligned non-duplicate reads were locally realigned and base qualities were recalibrated using the Genome Analysis ToolKit software. Variants were jointly called using the Genome Analysis ToolKit software and filtered using the Variant Quality Score Recalibration, quality over depth metrics, and strand bias among other metrics.
Exome Chip – (n=2,790)
- Exome Chip genotyping was supported by R01HL107816 to S. Kathiresan.
- Description: The Exome Chip (Illumina Human Exome BeadChip v. 1.0) was developed through the Exome Sequencing Project as a cost-effective method to follow up on low-frequency and rare coding variants observed in the ESP and other exome sequencing studies. Content of the chip was derived from the exomes of 12,031 samples from an array of projects, largely involving participants of European ancestry but also including ~2,000 African Americans.
Selected variants included (n=243,094 designed successfully):
- nonsynonymous variants
- splice variants, and
- stop gain/loss variants
- variants were observed in at least two studies, except 8,242 variants seen only once and included for ethnic diversity.
Additional content included (numbers represent variants that designed successfully):
- 5,325 GWAS top SNPs reported by the time of design
- a grid of common variants (n=5,286)
- 4,651 random synonymous variants (including 870 genotyped on both strands)
- 3,241 ancestry informative markers for African ancestry
- 998 ancestry informative markers for Native American ancestry
- 2,459 HLA tags
- 846 ESP “requests”
- 259 fingerprint SNPs
- 270 Micro RNA Target Sites
- 246 mitochondrial SNPs
- 128 Y chromosome markers
- 181 Indels
- Methods: Genotyping, variant calling, and quality control were performed as described in Grove et al.8
Whole Genome Sequencing – (in progress; n=3,461 have passed sample QC)
- Description: Whole genome sequencing is being performed through NHLBI’s Trans-Omics for Precision Medicine (TOPMed) project (www.nhlbiwgs.org), through a contract with Macrogen, under the direction of the Nickerson Laboratory at University of Washington. Phases I and II of the TOPMed project include >60,000 samples from multiple cohorts, being sequenced at >30x depth of coverage at one of six sequencing centers (Baylor Human Genome Sequencing Center, Broad Institute, Illumina, Macrogen, New York Genome Center, Northwest Genomics Center). Joint calling of all samples will be performed by the TOPMed Informatics Resource Center at the University of Michigan, under the Direction of Dr. Gonçalo Abecasis.
- Status: Sequencing and joint variant calling for the first 20,000 TOPMed samples, including all JHS samples, are expected to be completed during the first quarter of 2016.
Select Genetic Variants Available at JHS for Analysis
Note: Select genetic variants have been genotyped directly on commercial genotyping arrays such as the Exome Chip (Illumina Human Exome BeadChip v. 1.0), IBC Cardiovascular Candidate Gene Array, or Affymetrix 6.0 GWAS array, or assessed by Exome Sequencing (see references on the website). These include: APOL1 G1 and G2 variants, Duffy null variants of the DARC gene, Hemoglobin C, PCSK9 loss-of-function variants, sickle hemoglobin (rs334), and a functional SCN5A missense variant. Alpha thalassemia-associated deletions have been assessed from whole genome sequence. These data are NOT distributed with the VC package but are available to investigators with approved JHS manuscript proposals through the Data Coordinating Center. For additional details, go to the link provided.
I) APOL1: data on the Apolipoprotein L-1 (APOL1) gene. The derived allele of coding SNP rs73885319 (p.S342G), together with the derived allele of coding SNP rs60910145 (p.I384M), defines the APOL1 G1 allele. The derived allele of indel rs71785313 (p.NYK388K) defines the APOL1 G2 allele. Base pair positions are chr22:36265860(+) and chr22:36265988(+) for the G1 variants and chr22:36266000(+) for the G2 deletion, based on GRCh38.p7 assembly. JHS sample size: 3224
II) DUFFY: data on rs2814778 SNP (i.e. upstream-variant-2KB, utr-variant-5-prime) in Atypical Chemokine Receptor 1 (Duffy Antigen Receptor for Chemokines [DARC]; Duffy Blood Group antigen). Base pair position is chr1:159204893(-) based on GRCh38.p7 assembly. JHS sample size: 3027
III) HbC: data on rs33930165 SNP (i.e. reference, missense) in Hemoglobin Subunit Beta (HBB) gene. Base pair position is chr11:5227003(-) based on GRCh38.p7 assembly. Gives rise to a rare form of hemoglobin, ‘Hb C’. JHS sample size: 3027
IV) PCSK9: data on rs28362286 SNP (i.e. nc-transcript-variant, reference, stop-gained) in Proprotein convertase subtilisin/kexin type 9 (PCSK9) gene. Base pair position is chr1:55063542(+) based on GRCh38.p7 assembly. JHS sample size: 3027
V) Sickle cell trait: data on rs334 SNP (i.e. reference, missense) in Hemoglobin Subunit Beta (HBB) gene. Base pair position is chr11:5227002(-) based on GRCh38.p7 assembly. This dataset contains the sickle cell trait/disease SNP for 3224 JHS participants.
VI) SCN5A: data on rs7626962 SNP (i.e. intron-variant, reference, missense) in Sodium Voltage-Gated Channel Alpha Subunit 5 (SCN5A) gene. Base pair location chr3: 38579416(+) based on GRCh38.p7 assembly. JHS sample size: 3027
VII) TTR: data on rs76992529 SNP (i.e. coding sequence, missense) in Transthyretin (TTR) gene. Base pair position is chr18:31598655(+) based on GRCh38.p7 assembly. The variant results from a G to A transition at a CG dinucleotide in the codon for amino acid 122 of the mature TTR protein. Of 3,447 JHS individuals genotyped, 127 carry the minor allele (A). Genotyping and quality control in JHS are detailed in Grove et al. (2013).
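The positions above are written as chromosome, base pair position, and strand, for example chr1:55063542(+). A minimal Python sketch for parsing that notation into separate fields follows. The `Locus` type, the `parse_locus` function, and the `EXAMPLE_LOCI` table are hypothetical names introduced for illustration; only a subset of the listed variants is shown, with coordinates copied from the text (GRCh38.p7).

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    chrom: str   # e.g. "chr11"
    pos: int     # base pair position
    strand: str  # "+" or "-"


# Accepts optional whitespace after the colon and thousands separators,
# since the text uses both "chr1: 159204893(-)" and "31,598,655" styles.
_COORD = re.compile(r"^chr(\w+):\s*([\d,]+)\(([+-])\)$")


def parse_locus(text):
    """Parse 'chrN:position(strand)' into a Locus, or raise ValueError."""
    match = _COORD.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    chrom, pos, strand = match.groups()
    return Locus("chr" + chrom, int(pos.replace(",", "")), strand)


# rsID -> coordinate string as given in the text.
EXAMPLE_LOCI = {
    "rs2814778": "chr1: 159204893(-)",   # Duffy (DARC)
    "rs28362286": "chr1:55063542(+)",    # PCSK9
    "rs334": "chr11:5227002(-)",         # HBB, sickle hemoglobin
    "rs7626962": "chr3: 38579416(+)",    # SCN5A
    "rs76992529": "chr18:31598655(+)",   # TTR
}

parsed = {rsid: parse_locus(coord) for rsid, coord in EXAMPLE_LOCI.items()}

# Fraction of genotyped individuals carrying the TTR minor allele
# (127 carriers among 3,447 genotyped, as stated in the text).
ttr_carrier_fraction = 127 / 3447

if __name__ == "__main__":
    for rsid, locus in parsed.items():
        print(rsid, locus)
    print(f"TTR carrier fraction: {ttr_carrier_fraction:.3f}")
```

The TTR figure is a carrier fraction (roughly 3.7% of genotyped individuals), not an allele frequency, because the text reports the number of carriers rather than the number of minor alleles.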
1. Nalls MA, Wilson JG, Patterson NJ, Tandon A, Zmuda JM, Huntsman S, Garcia M, Hu D, Li R, Beamer BA, Patel KV, Akylbekova EL, Files JC, Hardy CL, Buxbaum SG, Taylor HA, Reich D, Harris TB, Ziv E. Admixture mapping of white cell count: genetic locus responsible for lower white blood cell count in the Health ABC and Jackson Heart studies. Am J Hum Genet. 2008 Jan;82(1):81-7. Erratum in: Am J Hum Genet. 2008 Feb;82(2):532. PMC2253985.
2. Keating BJ, Tischfield S, Murray SS, Bhangale T, Price TS, Glessner JT, Galver L, Barrett JC, Grant SF, Farlow DN, Chandrupatla HR, Hansen M, Ajmal S, Papanicolaou GJ, Guo Y, Li M, Derohannessian S, de Bakker PI, Bailey SD, Montpetit A, Edmondson AC, Taylor K, Gai X, Wang SS, Fornage M, Shaikh T, Groop L, Boehnke M, Hall AS, Hattersley AT, Frackelton E, Patterson N, Chiang CW, Kim CE, Fabsitz RR, Ouwehand W, Price AL, Munroe P, Caulfield M, Drake T, Boerwinkle E, Reich D, Whitehead AS, Cappola TP, Samani NJ, Lusis AJ, Schadt E, Wilson JG, Koenig W, McCarthy MI, Kathiresan S, Gabriel SB, Hakonarson H, Anand SS, Reilly M, Engert JC, Nickerson DA, Rader DJ, Hirschhorn JN, Fitzgerald GA. Concept, design and implementation of a cardiovascular gene-centric 50 k SNP array for large-scale genomic association studies. PLoS ONE. 2008;3(10):e3583. Epub 2008 Oct 31. PMC2571995.
3. Musunuru K, Lettre G, Young T, Farlow DN, Pirruccello JP, Ejebe KG, Keating BJ, Yang Q, Chen MH, Lapchyk N, Crenshaw A, Ziaugra L, Rachupka A, Benjamin EJ, Cupples LA, Fornage M, Fox ER, Heckbert SR, Hirschhorn JN, Newton-Cheh C, Nizzari MM, Paltoo DN, Papanicolaou GJ, Patel SR, Psaty BM, Rader DJ, Redline S, Rich SS, Rotter JI, Taylor HA Jr, Tracy RP, Vasan RS, Wilson JG, Kathiresan S, Fabsitz RR, Boerwinkle E, Gabriel SB; NHLBI Candidate Gene Association Resource. Candidate gene association resource (CARe): design, methods, and proof of concept. Circ Cardiovasc Genet. 2010 Jun 1;3(3):267-75. Epub 2010 Apr 17. PMID: 20400780.
4. Lettre G, Palmer CD, Young T, Ejebe KG, Allayee H, Benjamin EJ, Bennett F, Bowden DW, Chakravarti A, Dreisbach A, Farlow DN, Folsom AR, Fornage M, Forrester T, Fox E, Haiman CA, Hartiala J, Harris TB, Hazen SL, Heckbert SR, Henderson BE, Hirschhorn JN, Keating BJ, Kritchevsky SB, Larkin E, Li M, Rudock ME, McKenzie CA, Meigs JB, Meng YA, Mosley TH, Newman AB, Newton-Cheh CH, Paltoo DN, Papanicolaou GJ, Patterson N, Post WS, Psaty BM, Qasim AN, Qu L, Rader DJ, Redline S, Reilly MP, Reiner AP, Rich SS, Rotter JI, Liu Y, Shrader P, Siscovick DS, Tang WH, Taylor HA, Tracy RP, Vasan RS, Waters KM, Wilks R, Wilson JG, Fabsitz RR, Gabriel SB, Kathiresan S, Boerwinkle E. Genome-Wide Association Study of Coronary Heart Disease and Its Risk Factors in 8,090 African Americans: The NHLBI CARe Project. PLoS Genet. 2011 Feb 10;7(2):e1001300. PMCID: PMC3037413.
5. Duan Q, Liu EY, Auer PL, Zhang G, Lange EM, Jun G, Bizon C, Jiao S, Buyske S, Franceschini N, Carlson CS, Hsu L, Reiner AP, Peters U, Haessler J, Curtis K, Wassel CL, Robinson JG, Martin LW, Haiman CA, Le Marchand L, Matise TC, Hindorff LA, Crawford DC, Assimes TL, Kang HM, Heiss G, Jackson RD, Kooperberg C, Wilson JG, Abecasis GR, North KE, Nickerson DA, Lange LA, Li Y. Imputation of Coding Variants in African Americans: Better Performance using Data from the Exome Sequencing Project. Bioinformatics. 2013 Aug 16. [Epub ahead of print] PMID: 23956302
6. Bick AG, Flannick J, Ito K, Cheng S, Vasan RS, Parfenov MG, Herman DS, Depalma SR, Gupta N, Gabriel SB, Funke BH, Rehm HL, Benjamin EJ, Aragam J, Taylor HA Jr, Fox ER, Newton-Cheh C, Kathiresan S, O'Donnell CJ, Wilson JG, Altshuler DM, Hirschhorn JN, Seidman JG, Seidman C. Burden of rare sarcomere gene variants in the Framingham and Jackson Heart Study cohorts. Am J Hum Genet. 2012 Sep 7; 91(3):513-9. PMID: 22958901. PMCID:PMC3511985.
7. Tennessen JA, Bigham AW, O'Connor TD, Fu W, Kenny EE, Gravel S, McGee S, Do R, Liu X, Jun G, Kang HM, Jordan D, Leal SM, Gabriel S, Rieder MJ, Abecasis G, Altshuler D, Nickerson DA, Boerwinkle E, Sunyaev S, Bustamante CD, Bamshad MJ, Akey JM; Broad GO; Seattle GO; NHLBI Exome Sequencing Project. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science. 2012 Jul 6;337(6090):64-9. doi: 10.1126/science.1219240. Epub 2012 May 17. PMID: 22604720
8. Grove ML, Yu B, Cochran BJ, Haritunians T, Bis JC, Taylor KD, Hansen M, Borecki IB, Cupples LA, Fornage M, Gudnason V, Harris TB, Kathiresan S, Kraaij R, Launer LJ, Levy D, Liu Y, Mosley T, Peloso GM, Psaty BM, Rich SS, Rivadeneira F, Siscovick DS, Smith AV, Uitterlinden A, van Duijn CM, Wilson JG, O'Donnell CJ, Rotter JI, Boerwinkle E. Best Practices and Joint Calling of the HumanExome BeadChip: The CHARGE Consortium. PLoS One. 2013 Jul 12;8(7):e68095. doi: 10.1371/journal.pone.0068095. Print 2013. PMID: 23874508. PMCID: PMC3709915