Project Description

Computational biology tools from Microsoft Research's eScience group: PhyloD (Phylogeny-Based Association Analysis), Epitope Prediction, HLA Assignment, HLA Completion

The Tools

Different tools have different licenses. Check each tool's license individually.

  • FaST-LMM
    • FaST-LMM (Factored Spectrally Transformed Linear Mixed Models) is a program for performing genome-wide association studies (GWAS) on large data sets. Versions of FaST-LMM run on either Windows or Linux systems and have been tested on data sets with over 120,000 individuals.
  • FaST-LMM-Set    -- COMING SOON --
    • FaST-LMM-Set extends the capabilities of FaST-LMM to handle associations between sets of variants and phenotype.
  • eLMM
    • eLMM (Eliminate Confounding in eQTL studies with Linear Mixed Models) is a program for performing eQTL analysis in the presence of two confounders: (1) population structure and (2) expression heterogeneity.
  • PhyloD
    • Pathogens live and reproduce inside the human host, whose immune system continually tries to rid the body of them. This leads to a tug-of-war in which the pathogen adapts to “escape” the immune system, while the immune system learns to recognize and eliminate new foreign pathogens. Key players in the immune system are the HLA proteins, each of which can recognize specific short fragments of foreign (e.g. HIV) proteins, called epitopes, in infected cells and then alert the immune system to their presence. For rapidly evolving pathogens like HIV, a key defense mechanism is to evolve mutations that prevent the HLA proteins from recognizing the viral epitopes. This evolution takes place anew in each patient, because each patient has a different set of HLA proteins that recognize different epitopes. PhyloD is a statistical tool that can identify HIV mutations that defeat the function of the HLA proteins in certain patients, thereby allowing the virus to escape elimination by the immune system. By applying this tool to large studies of infected patients, researchers can begin to decode the complex rules that govern HIV mutations, in the hope of one day creating a vaccine to which the virus cannot develop resistance.
  • Epitope Prediction
    • This tool computes the probability that a given k-mer is a T-cell epitope restricted to a given HLA allele. The tool can scan for 8-, 9-, 10-, and 11-mer epitopes across all common HLA alleles.
  • HLA Completion
    • HLA sequence typing sometimes yields uncertain results. For example, an allele may be identified as A6801/6802 or simply A02. This tool takes HLA typing data (loci A, B, and C) as input and probabilistically resolves the typing ambiguities (i.e., probabilistically “completes” the data to 4-digit resolution).
  • HLA Assignment
    • One way to find epitopes is to run lab studies such as ELISPOT. A problem with this approach is that, when you see a reaction in a patient, you don’t know which of the patient’s HLA genes is responsible for it. This tool takes lab data from a series of patients and determines (probabilistically) which HLA genes are responsible for the reactions.
  • Create Epitome
    • This tool takes, as input, a weighted list of amino acid sequences. It creates epitomes of all lengths.
  • False Discovery Rate
    • Estimates the false discovery rate for 2×2 contingency tables, based on Fisher's exact test.
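
To make the False Discovery Rate entry concrete, here is a minimal Python sketch, not the tool's own code. It computes a two-sided Fisher exact p-value for one 2×2 table and then turns a list of p-values into q-values. This page does not describe the tool's estimator, so the generic Benjamini-Hochberg procedure is swapped in as a stand-in. The function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg q-values (an assumed stand-in, not the tool's method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q
```

In use, one would compute a p-value per table (for example, one table per HLA allele and mutation pair) and pass the full list to `benjamini_hochberg`.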

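The scan described under Epitope Prediction can be pictured as a sliding window over a protein sequence. The sketch below only enumerates the candidate 8- to 11-mers; the probability model that scores each k-mer against an HLA allele is not described on this page and is left out. The function name and the example sequence are hypothetical.

```python
def enumerate_kmers(protein, lengths=(8, 9, 10, 11)):
    """Yield (start, kmer) for every window of each requested length."""
    for k in lengths:
        # range() is empty when the sequence is shorter than k.
        for start in range(len(protein) - k + 1):
            yield start, protein[start:start + k]


if __name__ == "__main__":
    # Hypothetical 12-residue sequence, used only to show the windows.
    for start, kmer in enumerate_kmers("ACDEFGHIKLMN"):
        print(start, kmer)
```

Each yielded k-mer would then be paired with each HLA allele of interest and passed to the scoring step.
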
Web Versions of the Tools

http://boson.research.microsoft.com/hla/
http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/

Pre-Compiled Programs for Windows

http://www.codeplex.com/MSCompBio/Release/ProjectReleases.aspx

Microsoft Research's eScience Research Group

General Practical Information





Last edited Jan 21 at 5:27 PM by CarlK, version 47