Wednesday, June 6, 2007

[Work Log] BioCreative II: Gene Normalization Task

On 2007/5/15, I completed my military service and left the Army.

On 2007/5/21, I joined Professor Chun-Nan Hsu's research group at the Institute of Information Science, Academia Sinica, Taiwan.

Professor Hsu has assigned me a research topic on gene normalization.

This topic is one of the competition tasks of BioCreative II. The BioCreative II challenge consists of three tasks:
(1) Gene mention tagging (GM),
(2) Gene normalization (GN) and
(3) Extraction of protein-protein interactions from text (PPI).

Hsu's group proposed two methods for the GM task, and they achieved the second and third highest scores.

Furthermore, they also submitted a method for the GN task. However, this method ranked only 14th among the 21 participating groups. Therefore, Professor Hsu hopes I can propose some exciting approaches and achieve high performance (high recall and precision) on the GN task.

The goal of the task is to return the EntrezGene IDs corresponding to the human genes and direct gene products appearing in a given MEDLINE abstract.
Given a master list of human EntrezGene identifiers, with some common gene and protein names (synonyms) for each identifier, the system must return a list of EntrezGene identifiers and the corresponding text excerpts for each human gene or gene product mentioned in the abstract. Since the dictionary of gene synonyms contains some noise, it is difficult to map a gene name to the correct EntrezGene ID.
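
To make the input and output of the task concrete, here is a minimal sketch of the simplest possible baseline: plain dictionary lookup. It is not the method Hsu's group submitted, and it is not a serious GN system. The master list, the identifiers (`ID_A`, `ID_B`, `ID_C`), the synonyms, and the function names are all made up for illustration. The duplicated synonym `AK1` stands in for the dictionary noise mentioned above, and the sketch simply skips any synonym that maps to more than one identifier.

```python
import re
from collections import defaultdict

# Hypothetical master list: identifier -> synonyms.
# These IDs and names are placeholders, not real EntrezGene entries.
MASTER_LIST = {
    "ID_A": ["alpha kinase 1", "AK1"],
    "ID_B": ["beta receptor", "BR"],
    "ID_C": ["AK1"],  # noisy entry: the same synonym is listed under two IDs
}


def build_index(master_list):
    """Map each lowercased synonym to the set of identifiers that list it."""
    index = defaultdict(set)
    for gene_id, synonyms in master_list.items():
        for name in synonyms:
            index[name.lower()].add(gene_id)
    return index


def normalize(abstract, index):
    """Return sorted (identifier, text excerpt) pairs found in the abstract."""
    results = []
    for name, gene_ids in index.items():
        if len(gene_ids) != 1:
            continue  # ambiguous synonym: this naive baseline gives up on it
        pattern = r"\b" + re.escape(name) + r"\b"
        match = re.search(pattern, abstract, flags=re.IGNORECASE)
        if match:
            results.append((next(iter(gene_ids)), match.group(0)))
    return sorted(results)


if __name__ == "__main__":
    text = "We show that alpha kinase 1 binds the Beta Receptor; AK1 was also detected."
    for gene_id, excerpt in normalize(text, build_index(MASTER_LIST)):
        print(gene_id, excerpt)
```

Skipping ambiguous synonyms keeps precision up but costs recall, which is exactly the trade-off that makes the task hard; a real system would need some way to disambiguate instead.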
