Citation Policy
PolySearch2 is offered to the public as a freely available resource. Use and re-distribution of the data, in whole or in part, for commercial purposes requires explicit permission of the authors and explicit acknowledgment of the source material (PolySearch2) and the original publication (see below). We ask that users who download significant portions of the database cite the PolySearch2 paper in any resulting publications. Thank you very much.
Downloads for the Legacy PolySearch can be found here.
|Genes and Proteins||TSV|
|Adverse Health Effects||TSV|
|Wishart Lab's Chemical Ontology||TSV|
|Gene Ontology Terms||TSV|
|PolySearch2 Filter Words||JSON|
The evaluation page summarizes the performance evaluation and feature comparison of PolySearch 2.0 versus the original PolySearch. Evaluations #1-#4 were conducted using the legacy PolySearch evaluation datasets. Evaluation #1 assesses PolySearch2's ability to identify disease-gene associations. Evaluation #2 assesses PolySearch2's ability to identify drug-gene/protein associations. Evaluation #3 assesses PolySearch2's ability to identify protein-protein interactions. Evaluation #4 assesses PolySearch2's ability to identify metabolite-gene associations. Evaluation #5 assesses PolySearch2's ability to identify drugs with significant adverse effects, or 'dangerous drugs'. Evaluation #6 assesses PolySearch2's ability to identify toxin-disease associations. Evaluation #7 assesses PolySearch2's ability to identify toxin-adverse effect associations. Finally, Evaluation #8 assesses PolySearch2's ability to find associated disease concepts when presented with biomedical question sentences.
All evaluation datasets are available below in zipped TSV format. The Complete version contains the full PolySearch2 validation results, including Z-scores, result assessments, and marked-up reference and text snippets. The Reduced version contains ground-truth-only datasets, limited to the associated entity pairs and plain-text reference snippets.
|Evaluation 1: Disease / Gene Associations||Complete||Reduced|
|Evaluation 2: Drug / Gene Associations||Complete||Reduced|
|Evaluation 3: Protein / Protein Interactions||Complete||Reduced|
|Evaluation 4: Metabolite / Enzyme Interactions||Complete||Reduced|
|Evaluation 5: Drug with Negative Health Effects||Complete||Reduced|
|Evaluation 6: Toxin with Negative Health Effects||Complete||Reduced|
|Evaluation 7: Toxin / Disease Associations||Complete||Reduced|
|Evaluation 8: BioASQ Question / Disease Associations||Complete||Reduced|
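Because the evaluation datasets are distributed as zipped TSV files, they can be read without unpacking them to disk. The following is a minimal sketch using only the Python standard library. The function name `read_zipped_tsv`, the UTF-8 encoding, and the example member name and rows are assumptions made for illustration; the actual file names and column layout inside the PolySearch2 archives are not described here and should be checked after download.

```python
import csv
import io
import zipfile


def read_zipped_tsv(zip_source):
    """Return {member_name: list_of_rows} for every .tsv member of a zip archive.

    zip_source may be a filesystem path or a binary file-like object.
    Assumption: members are UTF-8 encoded, tab-separated, and unquoted.
    """
    tables = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".tsv"):
                continue  # skip anything that is not a TSV file
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
                tables[name] = [row for row in reader]
    return tables


if __name__ == "__main__":
    # Build a tiny in-memory archive with invented rows (not real PolySearch2 data).
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("example.tsv", "entity_a\tentity_b\nfoo\tbar\n")
    buffer.seek(0)
    for name, rows in read_zipped_tsv(buffer).items():
        print(name, len(rows), "rows; first row:", rows[0])
```

To use it on a downloaded archive, pass the local path of the zip file in place of the in-memory buffer. Quoting is disabled so that quotation marks inside text snippets are kept verbatim rather than interpreted as field delimiters.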
Liu Y., Liang Y., Wishart D.S. (2015) PolySearch 2.0: A significantly improved text-mining system for discovering associations between human diseases, genes, drugs, metabolites, toxins, and more. Nucleic Acids Res. 2015 Jul 1;43(Web Server Issue):W535-42.
Cheng D., Knox C., Young N., Stothard P., Damaraju S., Wishart D.S. (2008) PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites. Nucleic Acids Res. 2008 Jul 1;36(Web Server Issue):W399-405.
This project is supported by the Canadian Institutes of Health Research (award #111062), Alberta Innovates - Health Solutions, and by The Metabolomics Innovation Centre (TMIC), a nationally-funded research and core facility that supports a wide range of cutting-edge metabolomic studies. TMIC is funded by Genome Alberta, Genome British Columbia, and Genome Canada, a not-for-profit organization that is leading Canada's national genomics strategy with $900 million in funding from the federal government.