Application and evaluation of automated methods to extract brain connectivity statements from free text

Filed under: Digital atlasing


Leon French (University of British Columbia), Suzanne Lane (University of British Columbia), Lydia Xu (University of British Columbia), Cathy Kwok (University of British Columbia), Celia Siu (University of British Columbia), YiQi Chen (University of British Columbia), Claudia Krebs (University of British Columbia), Paul Pavlidis (University of British Columbia)

Automated annotation of neuroanatomical connectivity statements from the neuroscience literature would enable accessible and large-scale connectivity resources. Unfortunately, connectivity findings are not formally encoded, which hinders aggregation, indexing, searching, and integration. Here we describe progress in developing an automated approach to extracting connectivity reports from free text, building on our previous work on extracting and normalizing brain region mentions. We manually annotated a set of 1,377 abstracts for connectivity relations to form a “gold standard” set. We evaluated a range of methods, including simple algorithms (naïve co-occurrence) as well as sophisticated machine learning algorithms adapted from the protein interaction extraction domain that employ part-of-speech, dependency and syntax features. Co-occurrence-based methods at various thresholds achieved higher recall than, and precision equal to, techniques that employ complex features. A shallow linguistic kernel (SLK) method recalled 50% of the sentence-level connectivity statements at 70% precision by employing a limited set of lexical features. We applied the SLK and co-occurrence approaches to 12,557 abstracts from the Journal of Comparative Neurology, resulting in 28,107 predicted connectivity relationships. We compared a normalized subset of 2,688 relationships to the Brain Architecture Management System (BAMS; an established database of rat tract tracing studies). Of the extracted connections, 63.5% were reported as connected in BAMS, compared to 51.1% of co-occurring brain region pairs. Outside of the rat connections in BAMS, we estimated a precision of 55.3% based on a manual evaluation of 2,000 predicted connectivity statements (recall was not judged). We expanded our prediction set to an additional 5,797 abstracts in other journals deemed to be connectivity related by the Mscanner method.
By again employing BAMS for evaluation, we found that this new set of abstracts has similar levels of accuracy while yielding 1,430 unique relationships that were not seen in the previous corpus. By aggregating these data into a connectivity matrix, we found that precision can be increased at the cost of recall by requiring predicted connections to occur more than once across the corpus. Further analyses of the predicted connectomes are under way.
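
The precision/recall trade-off described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the region lexicon, the example sentences, the gold-standard pairs, and the function names (`cooccurring_pairs`, `aggregate`, `precision_recall`) are all hypothetical, and region matching is reduced to naive substring lookup rather than the normalization step mentioned in the abstract.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy lexicon; a real system would use a normalized region ontology.
REGIONS = ["hippocampus", "amygdala", "thalamus", "striatum"]


def cooccurring_pairs(sentence, regions=REGIONS):
    """Return alphabetically ordered region pairs mentioned in one sentence.

    Naive substring matching: it would wrongly match "thalamus" inside
    "hypothalamus", which is acceptable only for this illustration.
    """
    text = sentence.lower()
    found = sorted(r for r in regions if r in text)
    return list(combinations(found, 2))


def aggregate(sentences, min_count=1):
    """Count each pair across the corpus and keep pairs seen >= min_count times."""
    counts = Counter()
    for sentence in sentences:
        counts.update(cooccurring_pairs(sentence))
    return {pair: n for pair, n in counts.items() if n >= min_count}


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted pairs against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented example sentences and gold-standard connections.
SENTENCES = [
    "The amygdala projects to the hippocampus.",
    "Fibers from the amygdala terminate in the hippocampus.",
    "The thalamus and striatum were both stained.",
    "The thalamus projects to the amygdala.",
]
GOLD = {("amygdala", "hippocampus"), ("amygdala", "thalamus")}

if __name__ == "__main__":
    for threshold in (1, 2):
        predicted = aggregate(SENTENCES, min_count=threshold)
        p, r = precision_recall(predicted, GOLD)
        print(f"min_count={threshold}: precision={p:.2f}, recall={r:.2f}")
```

In this toy corpus, raising `min_count` from 1 to 2 drops the spurious thalamus/striatum pair (higher precision) but also loses the true amygdala/thalamus connection that was mentioned only once (lower recall), mirroring the qualitative behaviour reported for the aggregated connectivity matrix.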

Preferred presentation format: Poster

Topic: Digital atlasing

Conference: Neuroinformatics 2011
