Keyword Extraction of Biomedical Literature Using Text Mining


  • Nur Aniq Syafiq Rodzuan
  • Shahreen Kasim
  • Muhammad Zaki Hassan
  • Mohanavali Sithambranathan
  • Mir Jamaluddin


Textual information is conveyed in words and characters, which makes it easy for humans to understand. Text mining, the process of extracting non-trivial patterns or knowledge from text documents or textual databases, has emerged as a technology for extracting this kind of information. This paper performs and compares keyword extraction using statistical and linguistic extraction tools on 120 text documents related to hypertension and diabetes. For this comparison, RStudio and Fivefilters, which are statistical-based tools, and TerMine and Flexiterm, which are linguistic-based tools, were used to extract keywords from the biomedical literature. Classification with a K-Nearest Neighbor classifier was then carried out to evaluate and compare the performance of the statistical and linguistic approaches. The experimental results show the differences between the two groups of tools in keyword extraction.
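
To make the pipeline described above concrete, the following is a minimal sketch of statistical keyword extraction followed by nearest-neighbor classification. It is not the procedure implemented by RStudio, Fivefilters, TerMine or Flexiterm: it assumes TF-IDF weighting as the statistical scoring method and cosine similarity as the neighbor metric, and the four-sentence corpus, the stopword list and all function names are hypothetical stand-ins for the 120 documents used in the study.

```python
import math
from collections import Counter

# Hypothetical toy corpus (assumption): stands in for the study's 120 documents.
DOCS = [
    ("high blood pressure raises stroke risk in hypertension patients", "hypertension"),
    ("hypertension treatment lowers blood pressure with diuretics", "hypertension"),
    ("insulin resistance raises glucose levels in diabetes patients", "diabetes"),
    ("diabetes treatment controls glucose with insulin therapy", "diabetes"),
]
STOPWORDS = {"in", "with", "the", "of", "and", "for"}  # illustrative only


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def idf_table(token_lists):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf):
    """TF-IDF vector as a dict; terms unseen in the corpus are ignored."""
    tf = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in tf.items() if t in idf}


def top_keywords(vec, k=3):
    """The k highest-weighted terms, with ties broken alphabetically."""
    ranked = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(query_vec, train, k=3):
    """Majority label among the k training vectors most similar to the query."""
    ranked = sorted(train, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


token_lists = [tokenize(text) for text, _ in DOCS]
idf = idf_table(token_lists)
train = [(tfidf(tokens, idf), label) for tokens, (_, label) in zip(token_lists, DOCS)]

for (vec, label) in train:
    print(label, top_keywords(vec))

query = tfidf(tokenize("insulin therapy for glucose control"), idf)
print("predicted:", knn_predict(query, train))
```

In this sketch the extracted keywords double as the features for the classifier, which mirrors the idea of using classification accuracy as an indirect measure of keyword quality. A linguistic approach would instead select candidate terms by part-of-speech patterns before scoring them, which this example does not attempt.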