Instituto Nacional de Bioinformática

Biomedical Text Mining Position Available

Contract information: Several types of contract are offered in our research group, including postdoctoral, PhD and postgraduate positions at the Structural Computational Biology Group of the Spanish National Cancer Research Centre (CNIO). Salaries will depend on the type of position, expertise and academic background. The working language is English. Lab URL:

General description:
The candidate will work in a multidisciplinary team that develops and applies text mining and natural language processing technologies to the biomedical literature. The work covers automatic text classification using machine learning methods, the detection of mentions of named entities of biological interest in text, and the extraction of relationships from the biomedical literature. Special focus will be given to certain topics, such as cancer-related literature.

-Applicants should have a solid background in computational linguistics, Natural Language Processing, text mining or related areas.
-Ability to develop algorithms and software for natural language processing/text mining systems.
-Programming skills in at least one of the following languages are required: Python, Perl, Java, C/C++, Ruby.
-Good English communication skills.
-Interest in the Biomedical field.
Expertise in the following points is a plus when applying for the position:
-Training in topics related to statistics and machine learning.
-Development of online web applications.
-Previous experience with biomedical texts/domain.
-Ability to work in an interdisciplinary team.
-A publication record would be an advantage.
-Familiarity with NLP tasks such as named entity recognition, information extraction and information retrieval.

Background on our research group:
The Structural and Computational Biology Programme at CNIO, led by Dr. Alfonso Valencia, integrates several research groups, including Computational Biology, Bioinformatics and text mining.
The computational facilities and infrastructure cover all the needs of modern research in text mining, NLP and computational biology, and provide excellent grounds for the analysis of high-throughput genomic data. The research group has contributed significantly to biomedical text mining research over the past years, from initial work related to the analysis of protein families, microarray data and protein interactions to the development of online applications such as the iHOP server, PLAN2L or the BioCreative Metaserver. The group collaborates with experimental biomedicine labs to integrate NLP and text mining data with bioinformatics results. It has been co-organizing community evaluation efforts in the BioNLP area, in particular the BioCreative challenges.

Contact info: Requests for additional information or formal applications (including an application letter and an extensive CV) can be sent to Martin Krallinger (