Our mission is to develop and implement Natural Language Processing (NLP) technologies for the electronic medical record. These technologies address core NLP tasks such as relation extraction, coreference resolution, and parsing, and they rely on statistical machine learning methods. Many machine learning methods require manually labeled (annotated) data that is specific to the domain and the task. To that end, we are heavily involved in many clinical document annotation projects. Because manual annotation is time-consuming, painstaking, and expensive, we also aim to develop and use algorithms that minimize the amount of labeled data required while making the most of existing labeled data.
Use cases for clinical NLP include automated phenotyping, cohort identification, and clinical question answering.
Our end-to-end system is called cTAKES (clinical Text And Knowledge Extraction System). See the left panel under Software for more information about cTAKES.