BOLT-LMM

Loh et al.

Step 1: Upload your data

Upload .fam File

Drag your file(s) or upload

Your file can be in the following formats: fam, fam.gz

PLINK .fam file (note: file names ending in .gz are automatically decompressed). Sample information file accompanying a .bed binary genotype table. A text file with no header line and one line per sample.
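As a rough pre-upload check, the sketch below validates one line of a .fam file. It relies on the standard PLINK layout of six whitespace-separated fields per line (family ID, individual ID, father ID, mother ID, sex code, phenotype). The function name `parse_fam_line` and the sample values are illustrative only and are not part of BOLT-LMM.

```python
def parse_fam_line(line):
    """Split one .fam line into its six PLINK fields (hypothetical helper)."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    keys = ["FID", "IID", "father", "mother", "sex", "phenotype"]
    return dict(zip(keys, fields))


# Illustrative line: family 1, sample HG00096, no parents, male, missing phenotype.
record = parse_fam_line("1 HG00096 0 0 1 -9")
print(record["FID"], record["IID"])
```

Running this over every line of the file before uploading catches header lines or truncated rows early.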

Don’t have a file?

Use our demo data to run

Use Demo Data

Upload .bim File

Your file can be in the following formats: bim, bim.gz

PLINK .bim file. Extended variant information file accompanying a .bed binary genotype table.

Upload .bed File

Your file can be in the following formats: bed, bed.gz

PLINK .bed file. Primary representation of genotype calls at biallelic variants. Must be accompanied by .bim and .fam files. Do not confuse this with the UCSC Genome Browser's BED format, which is totally different.

Upload <phenoFile> File

Your file can be in the following formats: tab, tab.gz, txt, txt.gz

Phenotype file (header required; FID and IID must be the first two columns).
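The header requirement can be checked before uploading. The sketch below assumes a whitespace-delimited header line; `check_pheno_header` is a hypothetical helper, not a BOLT-LMM function. It returns the remaining column names, which are the candidates for the "Pheno Column" parameter in Step 2.

```python
def check_pheno_header(header_line):
    """Verify that FID and IID lead the header; return the other column names."""
    columns = header_line.split()
    if columns[:2] != ["FID", "IID"]:
        raise ValueError(f"first two columns must be FID IID, got {columns[:2]}")
    return columns[2:]


# Illustrative header with two made-up phenotype columns.
pheno_columns = check_pheno_header("FID IID height bmi")
print(pheno_columns)
```

The same check applies to the covariate file, which has the same header requirement.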

Upload LDScores File

Your file can be in the following formats: tab.gz

A table of reference LD scores is needed to calibrate the BOLT-LMM statistic. Reference LD scores appropriate for analyses of European-ancestry samples are provided in the example file. For analyses of non-European data, we recommend computing LD scores using the LDSC software on an ancestry-matched subset of the 1000 Genomes samples.

Upload <remove> File(s) (Optional)

Your file can be in the following formats: txt

File(s) listing individuals to ignore (no header; FID and IID must be the first two columns). The layout should match that of the .fam file.

Upload <exclude> File(s) (Optional)

Your file can be in the following formats: txt, txt.gz

File(s) listing SNPs to ignore (no header; SNP ID must be the first column).

Upload Covariate File (Optional)

Your file can be in the following formats: tab, tab.gz, txt, txt.gz

Covariate file (header required; FID and IID must be the first two columns).

Upload <modelSnps> File (Optional)

Your file can be in the following formats: txt

File(s) listing SNPs to use in the model (i.e., the GRM). By default, all non-excluded SNPs are used. Note that even when a --modelSnps file is specified, all SNPs in the genotype data are still tested for association; only the random effects in the mixed model are restricted to the --modelSnps. Also note that BOLT-LMM automatically performs leave-one-chromosome-out (LOCO) analysis, leaving out SNPs from the chromosome containing the SNP being tested in order to avoid proximal contamination.
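The LOCO idea can be illustrated with a small partitioning sketch: for each chromosome under test, the model SNPs are those located on every other chromosome. This only shows the bookkeeping and is not how BOLT-LMM implements LOCO internally. `loco_model_snps` and the SNP IDs are hypothetical.

```python
def loco_model_snps(snp_to_chrom):
    """Map each chromosome to the model SNPs that remain when it is left out."""
    chromosomes = sorted(set(snp_to_chrom.values()))
    return {
        held_out: [snp for snp, chrom in snp_to_chrom.items() if chrom != held_out]
        for held_out in chromosomes
    }


# Four made-up model SNPs spread over three chromosomes.
model_snps = {"rs1": 1, "rs2": 1, "rs3": 2, "rs4": 3}
loco = loco_model_snps(model_snps)
print(loco[1])  # SNPs used in the random effect when testing chromosome 1
```

A SNP on chromosome 1 is therefore never tested against a random effect that already contains it or its close neighbors, which is what "avoiding proximal contamination" refers to.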

Upload Dosage File(s) (Optional)

Your file can be in the following formats: txt, tab, gz

File(s) containing imputed SNP dosages to test for association. Line format: rsID chr pos allele1 allele0 [dosage = E[#allele1]] x N
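A minimal sketch of reading one line in this layout follows. It assumes whitespace-separated fields and that N matches the number of samples in the dosage ID file. Because a dosage is the expected count of allele1, each value should lie between 0 and 2. `parse_dosage_line` and the example values are illustrative only.

```python
def parse_dosage_line(line, n_samples):
    """Parse 'rsID chr pos allele1 allele0' followed by n_samples dosages."""
    fields = line.split()
    if len(fields) != 5 + n_samples:
        raise ValueError(f"expected {5 + n_samples} fields, got {len(fields)}")
    rs_id, chrom, pos, allele1, allele0 = fields[:5]
    dosages = [float(value) for value in fields[5:]]
    if any(not 0.0 <= d <= 2.0 for d in dosages):
        raise ValueError(f"dosage outside [0, 2] for {rs_id}")
    return {"rsID": rs_id, "chr": chrom, "pos": int(pos),
            "allele1": allele1, "allele0": allele0, "dosages": dosages}


# Made-up SNP with dosages for three samples.
snp = parse_dosage_line("rs123 17 12345 A G 0.0 1.2 2.0", n_samples=3)
print(snp["dosages"])
```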

Upload Dosage ID File (Optional)

Your file can be in the following formats: txt, txt.gz

Required when a dosage file is provided. File listing the FIDs and IIDs of the samples in the dosage file(s), one line per sample.

Upload IMPUTE2 File(s) (Optional)

Your file can be in the following formats: txt, gz, tab

Imputed SNPs as output by IMPUTE2. The IMPUTE2 genotype file format is as follows: snpID rsID pos allele1 allele0 [p(11) p(10) p(00)] x N
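To relate this layout to the dosage definition E[#allele1], the sketch below collapses each probability triple into an expected allele1 count, 2·p(11) + p(10). This is an illustration of the arithmetic under the column order shown above, not a description of BOLT-LMM's internal processing. `impute2_dosages` and the example line are hypothetical.

```python
def impute2_dosages(line):
    """Return (rsID, expected allele1 counts) from one IMPUTE2 genotype line."""
    fields = line.split()
    rs_id = fields[1]  # columns: snpID rsID pos allele1 allele0
    probs = [float(value) for value in fields[5:]]
    if len(probs) % 3 != 0:
        raise ValueError("probabilities must come in triples p(11) p(10) p(00)")
    dosages = [2.0 * probs[i] + probs[i + 1] for i in range(0, len(probs), 3)]
    return rs_id, dosages


# Three samples: homozygous allele1, heterozygous, homozygous allele0.
rs_id, dosages = impute2_dosages("--- rs1 100 A G 1 0 0 0 1 0 0 0 1")
print(rs_id, dosages)
```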

Upload IMPUTE2 List File (Optional)

Your file can be in the following formats: txt, gz

Required when IMPUTE2 file(s) are provided. List of [chr file] pairs containing IMPUTE2 SNP probabilities to test for association. Example:
17 EUR_subset.impute2.chr17second100
22 EUR_subset.impute2.chr22last100.gz
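A list file of this kind can be sanity-checked with a few lines of Python. The sketch assumes one whitespace-separated [chr file] pair per line. `parse_impute2_list` is a hypothetical helper. The file names are the ones from the example above.

```python
def parse_impute2_list(text):
    """Return (chromosome, file name) pairs, one per non-empty line."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"expected 'chr file', got: {line!r}")
        pairs.append((fields[0], fields[1]))
    return pairs


example = """17 EUR_subset.impute2.chr17second100
22 EUR_subset.impute2.chr22last100.gz
"""
pairs = parse_impute2_list(example)
print(pairs)
```

Every file named in the list should also be among the uploaded IMPUTE2 files.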

Upload IMPUTE2 ID File (Optional)

Your file can be in the following formats: txt, gz

Required when IMPUTE2 file(s) are provided. File listing the FIDs and IIDs of the samples in the IMPUTE2 files, one line per sample. Example:
1 HG00096
2 HG00097
3 HG00099

Upload 2-Dosage File(s) (Optional)

Your file can be in the following formats: txt, gz

You may also specify imputed SNPs as output by the Ricopili pipeline and plink2 --dosage format=2. This format consists of file pairs: (1) PLINK map files containing information about SNP locations; and (2) genotype probability files in the 2-dosage format. Example (2-dosage format): SNP A1 A2 [FID IID] x N

Upload 2-Dosage <.map> Format File(s) (Optional)

Your file can be in the following formats: map

Required when 2-Dosage file(s) are provided. PLINK .map file giving the SNP locations for the paired 2-dosage genotype probability file (as output by the Ricopili pipeline and plink2 --dosage format=2). The 2-dosage file has a header line (SNP A1 A2 [FID IID] x N) followed by one line per SNP (rsID allele1 allele0 [p(11) p(10)] x N). The .map file lists, for each SNP: chr rsID genetic-position physical-position

Upload 2-Dosage File List (Optional)

Your file can be in the following formats: txt

Required when 2-Dosage file(s) are provided. List of [map dosage] file pairs with 2-dosage SNP probabilities (Ricopili/plink2 --dosage format=2) to test for association.

Step 2: Set Parameters

Pheno Column

Covariate Column(s) (Optional)

Quantitative Covariate Column(s) (Optional)

MAF (Minor Allele Frequency) Threshold on IMPUTE2 Genotypes (Optional)

100

Number of Autosomes (Optional)

Min: 0

verboseStats Flag (Optional)

Genetic Map File (Optional)

None

Step 3: Complete run profile

Job name - Optional

The BOLT-LMM (v2.4.1) algorithm uses a linear mixed model (LMM) to compute statistics for testing association between a phenotype (observable trait) and genotypes (genetic information). By default, BOLT-LMM assumes a Bayesian mixture-of-normals prior for the random effect attributed to SNPs other than the one being tested. This model generalizes the standard "infinitesimal" mixed model used by previous mixed-model association methods (e.g., EMMAX, FaST-LMM, GEMMA, GRAMMAR-Gamma, GCTA-LOCO), giving increased power to detect associations while controlling false positives.
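The model described above can be written schematically as follows. The notation is a paraphrase chosen for this page (the symbols $x_{\text{test}}$, $X_{\text{GRM}}$, $p$, $\sigma_{\beta,1}^2$, $\sigma_{\beta,2}^2$ and $\sigma_e^2$ are labels assumed here); the cited article gives the exact formulation.

```latex
\begin{align*}
y &= x_{\text{test}}\,\beta_{\text{test}} + X_{\text{GRM}}\,\beta_{\text{GRM}} + e,
  & e &\sim \mathcal{N}\!\left(0,\ \sigma_e^2 I\right), \\
\beta_m &\sim p\,\mathcal{N}\!\left(0,\ \sigma_{\beta,1}^2\right)
  + (1-p)\,\mathcal{N}\!\left(0,\ \sigma_{\beta,2}^2\right)
  & &\text{for each model SNP } m .
\end{align*}
```

Here $y$ is the phenotype, $x_{\text{test}}$ the SNP being tested, and $X_{\text{GRM}}$ the genotypes of the model SNPs. When the two mixture components have the same variance, the prior reduces to a single normal distribution, which is the "infinitesimal" model.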

Example use case: GWAS (Genome-Wide Association Study)

Technology: Linear mixed model

Limitations:
- Currently, bgen format option is not available.
- BOLT-LMM is recommended for analyses of human genetic datasets with more than 5,000 samples.
- Association test statistics from BOLT-LMM are valid for quantitative traits and for (reasonably) balanced case-control traits.
- Like other mixed-model approaches, BOLT-LMM can lose statistical power when applied to large case-control datasets for rare diseases.
- The original study does not assess how much population structure or relatedness may affect the heritability parameter (h2g) estimated by BOLT-LMM, nor does it perform or evaluate genetic prediction in independent external validation samples.
- The performance of mixed-model techniques has not been examined in datasets where family structure plays a significant role.
- BOLT-LMM has only been evaluated on datasets consisting of human genetic data, which exhibit distinct genetic architectures and patterns of linkage disequilibrium compared to plant and animal data.

Metrics: Performance metrics from the study are reported in the article cited below.

Citation:

Loh, PR., Tucker, G., Bulik-Sullivan, B. et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet 47, 284–290 (2015). https://doi.org/10.1038/ng.3190

Released:
Mar-24-2023
