Persistent ID
|
doi:10.21950/DYAZRE |
Publication date
|
2020-12-18 |
Title
| Reproducible experiments on the master thesis: An experimental survey of Named Entity Recognition methods in the biomedical domain |
Author
| Hennig, Sebastian (Universidad Nacional de Educación a Distancia (UNED))
Garcia-Serrano, Ana M. (Universidad Nacional de Educación a Distancia (UNED)), ORCID: https://orcid.org/0000-0003-0975-7205 |
Contact
|
Hennig, Sebastian (Universidad Nacional de Educación a Distancia (UNED))
Garcia-Serrano, Ana (Universidad Nacional de Educación a Distancia (UNED)) |
Description
| Semantic Textual Similarity (STS, also known as Semantic Short-text Similarity) is a research problem that aims to calculate the similarity between text units (phrases, sentences, paragraphs or texts), focusing on their semantic content. The importance of semantic similarity in Natural Language Processing has increased in recent years due to its relevance to many tasks and applications, such as Automatic Summarization, Machine Translation, Question Answering and Semantic Indexing. UB-NER is a self-contained Java software library for benchmarking state-of-the-art STS measures in the biomedical domain. It allows users to define and execute a set of experiments combining different measures and preprocessing methods. This dataset contains the reproducibility framework and its dependencies, which aim to allow the exact replication of the unsupervised named entity recognition experiments in the biomedical domain, as detailed in the "ReproductionProtocol.pdf" file. (2020-10-20) |
Subject
| Computer and Information Science |
Keyword
| Unsupervised Named Entity Recognition
NER biomedical domain
reproduction
UB-NER |
Notes
| ubner_public_experiments.gz (about 7 GB): this file contains the Docker container with the MetaMap [2], MetaMap Lite [3] and cTAKES [4] installations, as well as the UMLS [5] 2020AA dictionaries.
ReproductionProtocol.pdf (about 0.5 MB): this file contains the detailed instructions to reproduce the experiments from [1]. |
Depositor
| Admin, Dataverse |
Deposit date
| 2020-11-18 |
Software
| https://github.com/PSESEB/Reproduce-MetaMap-Experiments |
Related dataset
| Sebastian Hennig and Ana Garcia-Serrano, "An experimental survey of Named Entity Recognition methods in the biomedical domain", 2020 |
Other references
| [1] Sebastian Hennig. "An experimental survey of Named Entity Recognition methods in the biomedical domain". OVGU Master Thesis, 2020 (Supervisor: Ana Garcia-Serrano).
[2] Aronson, A. R. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proceedings. AMIA Symposium (2001): 17-21.
[3] Dina Demner-Fushman, Willie J. Rogers, Alan R. Aronson. MetaMap Lite: an evaluation of a new Java implementation of MetaMap. Journal of the American Medical Informatics Association, Volume 24, Issue 4, July 2017, Pages 841-844. https://doi.org/10.1093/jamia/ocw177
[4] Savova, Guergana K. et al. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. Journal of the American Medical Informatics Association: JAMIA, vol. 17(5) (2010): 507-13. doi:10.1136/jamia.2009.001560
[5] Bodenreider, O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D267-70. doi:10.1093/nar/gkh061. PMID: 14681409; PMCID: PMC308795. |