Yimin Ma


Coreference Resolution in Information Extraction


 

Abstract:
 

Information Extraction (IE) is a process that takes free text as input and produces a structured representation of predefined target information as output.  This data can be displayed directly to users or stored in a database for later analysis.  As more and more information exists only in natural language form, the need for information extraction systems is growing dramatically.

 

Coreference resolution is one of the key components of an IE system.  It merges partial data objects that describe the same entities, entity relationships, and events mentioned at different positions in the discourse.  Most coreference resolution systems deal with resolving anaphors that are noun phrases or pronouns.  Traditional approaches to anaphora resolution rely on a set of “anaphora resolution factors”, such as gender or number constraints.
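As a rough illustration of how gender and number constraints can act as resolution factors, the following minimal Python sketch filters candidate antecedents for a pronoun by agreement.  The names used here (Mention, agrees, resolve) and the hand-assigned feature values are hypothetical, and the "most recent agreeing candidate" preference is an assumption added only to make the example complete; it is not a method taken from any particular system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """A hypothetical mention record; features are assigned by hand here."""
    text: str
    position: int  # token index in the discourse
    gender: str    # "m", "f", "n", or "unknown"
    number: str    # "sg" or "pl"


def agrees(pronoun: Mention, candidate: Mention) -> bool:
    """Return True if the candidate passes the number and gender constraints."""
    number_ok = pronoun.number == candidate.number
    gender_ok = (
        "unknown" in (pronoun.gender, candidate.gender)
        or pronoun.gender == candidate.gender
    )
    return number_ok and gender_ok


def resolve(pronoun: Mention, candidates: List[Mention]) -> Optional[Mention]:
    """Pick the closest preceding candidate that agrees with the pronoun.

    The recency preference is an illustrative assumption, not a claim about
    how any specific coreference system ranks antecedents.
    """
    preceding = [
        c for c in candidates
        if c.position < pronoun.position and agrees(pronoun, c)
    ]
    return max(preceding, key=lambda c: c.position, default=None)


# "Mary met John at the office. She gave him the report."
candidates = [
    Mention("Mary", 0, "f", "sg"),
    Mention("John", 2, "m", "sg"),
    Mention("the office", 5, "n", "sg"),
]
she = Mention("She", 6, "f", "sg")
him = Mention("him", 8, "m", "sg")
they = Mention("they", 11, "unknown", "pl")

she_antecedent = resolve(she, candidates)
him_antecedent = resolve(him, candidates)
they_antecedent = resolve(they, candidates)
```

In this sketch the constraints work purely as filters: "She" can only link to "Mary" and "him" only to "John", while the plural "they" finds no agreeing antecedent among the singular candidates.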

 

This presentation will first introduce the general architecture of information extraction and outline several of its real-world applications.  Special emphasis will be given to the coreference resolution component of the IE system.  We will discuss the importance of coreference resolution and analyze the different approaches taken in the best-known works.  Finally, we will talk about coreference resolution in the context of e-mail, and the algorithm used to solve this problem.