Released Tools Available to the Public




eleMAP is a dynamic, free tool that helps researchers harmonize their phenotype data and data dictionaries to existing metadata and terminology standards. eleMAP also houses a downloadable library of harmonized metadata from a variety of existing phenotype studies.


113 users representing 70 institutions


         Video Recording         Slide presentation





The Phenotype Knowledgebase website, PheKB, is a collaborative environment for building and validating electronic phenotype algorithms. PheKB has both a public component, which allows you to view existing phenotypes and implementations, and a collaborative component, which allows you to submit phenotypes, provide detail about implementations of a phenotype, or add to existing phenotypes. To use the collaborative component, you will need to register with the site.


233 users representing 24 institutions; 2,936 unique visitors to the website since release in early 2013


Slide Presentation



Natural Language Processing Tools (NLP)

Network Tools

Genotyping Tools

  • PennCNV - Implements a hidden Markov model that integrates multiple sources of information to infer CNV calls for individual genotyped samples. Widely used: 577 citations to date.
  • NGS Data Analysis Pipeline - Whole exome sequencing of over 1,700 subjects with over 100 different rare medical disorders, of which over 30 rare disorders have been resolved. The pipeline sequences an average of 50 exomes per week at 70x coverage, with an average turnaround time of four to six weeks from sample processing to variant identification.
  • Biofilter, Biobin - Provide methods for prioritizing and analyzing variants singly or in groups. Over 173 downloads since June 2013.
  • PLATO, ATHENA - Provide platforms for QC and for integrating multiple methods of analysis. Over 166 downloads since June 2013.
  • Synthesis-View, PheWAS-View, Phenogram - Provide visualization tools for genome-wide and phenome-wide data. Over 58 downloads since June 2013. These tools are also offered via a web interface, which has had over 1,840 unique visitors.

Phenotyping Tools

  • Phenome-Wide Association Studies (PheWAS) - PheWAS methods leverage EMR billing data (ICD-9 codes) to derive case and control populations. Using these data, a large number of disease phenotypes can be investigated simultaneously against a specified variant or variants. The software and code translation file are available for download.
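
The PheWAS idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-entry `ICD9_TO_PHENOTYPE` table is a hypothetical stand-in for the real code translation file, the function names are invented, and a simple odds ratio with a 0.5 continuity correction replaces whatever statistical model the released software actually uses. It is not the PheWAS package's API.

```python
from collections import defaultdict

# Hypothetical, tiny stand-in for the code translation file.
ICD9_TO_PHENOTYPE = {
    "250.00": "type 2 diabetes",
    "250.02": "type 2 diabetes",
    "401.9": "hypertension",
}


def build_phenotypes(billing_records):
    """Map each phenotype to the set of patients billed with a matching code."""
    cases = defaultdict(set)
    for patient_id, icd9 in billing_records:
        phenotype = ICD9_TO_PHENOTYPE.get(icd9)
        if phenotype is not None:
            cases[phenotype].add(patient_id)
    return cases


def odds_ratio(cases, carriers, all_patients):
    """Odds ratio of variant carriage in cases vs. controls.

    Adds 0.5 to every cell of the 2x2 table so empty cells do not
    cause a division by zero.
    """
    controls = all_patients - cases
    a = len(cases & carriers) + 0.5     # cases carrying the variant
    b = len(cases - carriers) + 0.5     # cases without the variant
    c = len(controls & carriers) + 0.5  # controls carrying the variant
    d = len(controls - carriers) + 0.5  # controls without the variant
    return (a * d) / (b * c)


def phewas_scan(billing_records, carriers, all_patients):
    """Test one variant against every phenotype derived from billing codes."""
    phenotypes = build_phenotypes(billing_records)
    return {
        name: odds_ratio(cases, carriers, all_patients)
        for name, cases in phenotypes.items()
    }
```

Calling `phewas_scan(records, carriers, all_patients)` with a list of `(patient_id, icd9_code)` pairs, a set of variant carriers, and the set of all patients returns one odds ratio per phenotype, which is the "many phenotypes against one variant" pattern in miniature.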

Consent Tools

CDS Tools

  • ANNOVAR - Annotates genetic variants detected from genomes. It can shortlist SNVs and insertions/deletions, examine and report functional consequences, infer cytogenetic bands, find variants in conserved regions, and identify variants reported in the 1000 Genomes Project and dbSNP. Widely used: 394 citations to date.
  • MyResearch, integration between Registrar and MyChart - 502 patients registered as of the end of August.

NLP Tools

cTAKES - Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing system for extracting information from clinical free text in electronic medical records. The development team is located at Mayo Clinic and Boston Children's Hospital.

66 subscribers to the user listserv; 75 subscribers to the developer listserv; 207 unique hits to download cTAKES in the last 30 days

         Slide Presentation         Video Recording