GitHub: kotfic/coref-presentation
Christopher Kotfila, INF720, Fall 2014, ckotfila@albany.edu
Understand how people spatially conceptualize vaguely defined regions like city neighborhoods and national regions
This conceptualization can be captured by analyzing the linguistic features that contextualize mentions of these regions in textual documents.
All points are actually regions.
All regions are actually points.
The East Valley's first Ronald McDonald House and the third serving the metro area opens for families whose children are undergoing often-critical medical care on Monday, Nov. 10, on the campus of Cardon Children's Medical Center, 2225 W. Southern Ave., Mesa.
Stanford NLP's named entity recognizer identifies many entity types;
for our purposes we are interested only in LOCATION
"East Valley" is a vaguely defined area (a neighborhood)
The street address, by contrast, is a significantly finer-grained region (the street)
Co-reference resolution is the process of determining whether the two mentions refer to the same place
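The pipeline sketched above can be illustrated with a toy example. The tagged tokens below are hypothetical (not actual Stanford NER output), but they show the two steps: keep only LOCATION entities, then treat a coarse mention (the vague neighborhood) and a fine-grained mention (the street address) as a candidate co-reference pair.

```python
# Hypothetical (token, NER tag) pairs, as a Stanford-style tagger might emit them
tagged = [
    ("East", "LOCATION"), ("Valley", "LOCATION"),
    ("opens", "O"), ("on", "O"), ("Monday", "DATE"),
    ("2225", "LOCATION"), ("W.", "LOCATION"),
    ("Southern", "LOCATION"), ("Ave.", "LOCATION"),
    (",", "O"), ("Mesa", "LOCATION"),
]

def location_spans(tagged):
    """Group consecutive LOCATION-tagged tokens into entity mentions."""
    spans, current = [], []
    for token, tag in tagged:
        if tag == "LOCATION":
            current.append(token)
        elif current:
            spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

mentions = location_spans(tagged)
# mentions: ["East Valley", "2225 W. Southern Ave.", "Mesa"]
# "East Valley" (coarse) and "2225 W. Southern Ave." (fine) become a
# candidate pair for cross-granularity co-reference resolution.
```

NER alone gives flat LOCATION spans with no granularity information; deciding that the neighborhood and the address co-refer is exactly the classification task this work targets.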
To improve existing corpora of spatially annotated documents by identifying the granularity of locations and vague places that are already annotated.
NIST-sponsored evaluations: MUC; ACE (Automatic Content Extraction)
Mani, I., Doran, C., Harris, D., Hitzeman, J., Quimby, R., Richer, J., … Clancy, S. (2010). SpatialML: annotation scheme, resources, and evaluation. Language Resources and Evaluation, 44(3), 263–280. doi:10.1007/s10579-010-9121-0
To explore the linguistic features and entities that spatially contextualize vague spaces and their spatial granularity
To assess the viability of spatially identifying vaguely defined places using automated natural language processing techniques.
Grothe, C., & Schaab, J. (2009). Automated Footprint Generation from Geotags with Kernel Density Estimation and Support Vector Machines. Spatial Cognition & Computation, 9(3), 195–211. doi:10.1080/13875860903118307
Montello, D. R., Goodchild, M. F., Gottsegen, J., & Fohl, P. (2003). Where's Downtown?: Behavioral Methods for Determining Referents of Vague Spatial Queries. Spatial Cognition & Computation, 3(2-3), 185–204. doi:10.1080/13875868.2003.9683761
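The footprint-generation idea in Grothe & Schaab can be sketched as kernel density estimation over geotagged mentions: a vague region's footprint is the area where the density of its geotags exceeds a threshold. The sketch below is a toy illustration; the coordinates, bandwidth, and threshold are all made up, not taken from the paper.

```python
import math

def kde_density(points, query, bandwidth=0.01):
    """Gaussian kernel density estimate at `query` (lat, lon) degrees.

    points: list of (lat, lon) geotags; bandwidth is in degrees
    (a toy scale -- real work would project to planar coordinates).
    """
    total = 0.0
    for lat, lon in points:
        d2 = (lat - query[0]) ** 2 + (lon - query[1]) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    return total / (len(points) * 2 * math.pi * bandwidth ** 2)

# Hypothetical geotags for mentions of a vague region
geotags = [(33.41, -111.83), (33.42, -111.82), (33.40, -111.84)]

# A footprint is the set of locations whose density clears a threshold;
# here we just compare a point near the geotags with a distant one.
inside = kde_density(geotags, (33.41, -111.83))
outside = kde_density(geotags, (34.50, -112.90))
```

Here `inside` comes out much larger than `outside`, so a density threshold between the two values would place the first point inside the region's footprint and the second outside it. Grothe & Schaab additionally fit support vector machines to the geotags; this sketch shows only the KDE half of their method.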
R1: To what degree can annotators agree on a granularity for a region/space?
R2: Can annotators agree on whether or not a geospatial referent has a strict administrative boundary?
R3: What are the distributions of spatial granularities across documents and across corpora?
R4: To what degree is co-reference resolution across spatial granularities possible?
R5: What are the linguistic features that support co-reference resolution across granularities?
R6: To what degree can the granularity of a location be classified based solely on its textual context?
R7: Does automated region detection based on co-reference resolution correspond with human judgment?
Validate based on human judgment (but probably small sample size) – specific, detailed validation
R8: Does automated region detection based on co-reference resolution correspond with other automated region detection techniques?
Validate based on automated technique (but probably much larger sample size) – general, broad validation
Formalizing the relationship between points and regions at different granularities
Exploring linguistic features that connect mentions of different granularities
Generating descriptive understandings of regions and assessing their relationships to administrative boundaries
Improving current spatial textual corpora