ITRI seminars - Spring 2001
ITRI seminars generally take place at 12 noon on Thursdays in room W107 on
the first floor of the Watts Building, University of Brighton
(Moulsecoomb site). Occasional deviations from this pattern are
indicated below.
Information on how to find W107 is available on our contact page.

25 Jan
Duska Rosenberg
Royal Holloway, University of London
Languages in Multimedia: Common-ground framework for investigating the
role of natural language interfaces in computer-mediated communication
(abstract below)

8 Feb
Mark Steedman
University of Edinburgh
Uses of Prosody in Spoken Language Processing

22 Feb
Andrea Setzer
University of Sheffield
Annotating Events and Temporal Information in Newswire Texts
(abstract below)

15 Mar
Fabio Ciravegna
Department of Computer Science, University of Sheffield
User-driven Adaptive Information Extraction from Internet-related Text
(abstract below)

Previous ITRI seminars
See also NLP seminars at COGS, University of Sussex

Abstracts
Duska Rosenberg
Languages in Multimedia: Common-ground
framework for investigating the role of natural language interfaces in
computer-mediated communication
My current research is a collaboration
with architects and urban planners who have developed models for the
design of the physical workplace. We're now working on the design of
a location-independent workplace in which ICT plays a key role in
supporting mobile workers. The first issue we're addressing is the
study of interaction in three types of space: the "cloister", which
requires privacy and solitude; the "club", where meetings among selected
members take place; and the "cafe", which anyone can join. My own
contribution is the study of language use in these different kinds of
space and, in particular, of the informational resources normally
available for people to establish common ground in such spaces.
The theoretical framework I'm using is based on the notion of common
ground developed by Karttunen and Peters and also by Clark, but involves
some non-trivial extensions. I'm currently working with Peters and Ginzburg
on adapting situation semantics for the study of communication.
Andrea Setzer
Annotating Events and Temporal Information in Newswire Texts
If one is concerned with natural language processing applications such as
information extraction (IE), which typically involve extracting information
about temporally situated scenarios, the ability to accurately position key
events in time is of great importance. To date only minimal work has been done
in the IE community on extracting temporal information from text, and the
importance of the task, together with its difficulty, suggests that a
concerted effort be made to analyse how temporal information is actually
conveyed in real texts. To this end we have devised a scheme for
annotating those features and relations in texts which enable us to determine
the relative order and, if possible, the absolute time, of the events reported
in them. Such a scheme could be used to construct an annotated corpus which
would yield the benefits normally associated with the construction of such
resources: a better understanding of the phenomena of concern, and a resource
for the training and evaluation of adaptive algorithms to automatically identify
features and relations of interest. We also describe a framework for evaluating
the annotation and compute precision and recall for different responses.
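
The precision and recall figures mentioned at the end of this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each annotation is a (start offset, end offset, label) triple and that a response annotation counts as correct only when it matches the key exactly. The representation, the labels and the function name are hypothetical and are not taken from the annotation scheme described in the talk.

```python
from typing import Set, Tuple

# Hypothetical representation: (start_offset, end_offset, label).
# This is an assumption for illustration, not the scheme from the talk.
Annotation = Tuple[int, int, str]


def precision_recall(key: Set[Annotation],
                     response: Set[Annotation]) -> Tuple[float, float]:
    """Score a response annotation set against a key by exact match."""
    correct = len(key & response)
    # Precision: share of response annotations that appear in the key.
    precision = correct / len(response) if response else 0.0
    # Recall: share of key annotations that the response found.
    recall = correct / len(key) if key else 0.0
    return precision, recall


# Made-up example data: four annotations in the key, three in the response,
# two of which match the key exactly.
key = {(0, 7, "EVENT"), (12, 20, "TIMEX"), (25, 31, "EVENT"), (40, 48, "EVENT")}
response = {(0, 7, "EVENT"), (12, 20, "TIMEX"), (33, 38, "EVENT")}

p, r = precision_recall(key, response)
print(f"precision={p:.2f} recall={r:.2f}")
```

Exact matching is the strictest choice; an evaluation framework could equally give partial credit for overlapping spans, which would change both figures.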
Fabio Ciravegna
User-driven Adaptive Information Extraction from Internet-related Text
In recent years, the growing importance of the Internet has
highlighted the central role of texts such as emails, Usenet posts and Web
pages. In this context, linguistically intensive approaches such as those
used in classical IE systems (e.g. [Hobbs97], [Humphreys98], [Grishman98],
[Ciravegna00]) are difficult to apply or unnecessary. Information carried by
extralinguistic structures (e.g. HTML tags, document formatting, and
stereotypical language) is more relevant and easier to use than deep
linguistic knowledge. For this reason a new research stream on adaptive
IE has arisen at the convergence of NLP, Information Integration and
Machine Learning. The goal is to produce IE algorithms and systems
adaptable to new Internet-related applications/scenarios by using only
an analyst's knowledge (i.e. knowledge of the domain/scenario itself)
[Kushmerick 1997], [Califf 1998], [Muslea 1998], [Freitag 1999],
[Soderland 1999], [Freitag 2000]. Such algorithms are easy to adapt to
new applications and very effective when applied to highly structured
HTML pages. Unfortunately they tend to be less effective on less
structured texts (e.g. free texts). In our opinion this is because most
successful algorithms make little (or no) use of NLP, tending to avoid
any generalization over the flat word sequence. When they are applied to
unstructured texts, data sparseness becomes a problem.
This paper presents LP-2, an adaptive IE algorithm designed in this
new stream of research that makes use of shallow NLP in order to
overcome data sparseness when confronted with NL texts, while remaining
effective on highly structured texts.
LP-2 has been notably successful. From a scientific point of
view, experiments report excellent results with respect to the current
state of the art on some publicly available corpora. From an application
point of view, a successful industrial IE tool has been based on it.
Real-world applications have been developed, and licenses have been
granted to some commercial companies for building other applications.
In this talk I will first introduce the algorithm, discuss experimental
results and show how the algorithm compares favourably with the
current state of the art on semi-structured texts. The role and
importance of shallow NLP for overcoming data sparseness will also be
discussed. Then I will present my experience in designing and delivering
LearningPinocchio, an industrial system for adaptive IE based on
LP-2. Finally I will describe my research agenda for user-driven
adaptive IE for the coming years.
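
The data-sparseness point in this abstract can be illustrated with a toy sketch. It is not LP-2, whose details are not given here. It only shows, under stated assumptions, why matching on the flat word sequence fails on unseen tokens while a coarser token class still matches. The `word_shape` classes, the example sentences and the bigram coverage measure are all hypothetical choices made for the illustration.

```python
import re
from typing import List, Set, Tuple


def word_shape(token: str) -> str:
    """Map a token to a coarse class (a hypothetical shallow feature)."""
    if re.fullmatch(r"\d{1,2}:\d{2}", token):
        return "<TIME>"
    if token.isdigit():
        return "<NUM>"
    if token[:1].isupper():
        return "<CAP>"
    return token.lower()


def bigrams(sentence: str, generalise: bool) -> Set[Tuple[str, str]]:
    """Return adjacent token pairs, optionally after generalising tokens."""
    tokens = sentence.split()
    if generalise:
        tokens = [word_shape(t) for t in tokens]
    return set(zip(tokens, tokens[1:]))


def coverage(train: List[str], test: str, generalise: bool) -> Tuple[int, int]:
    """Count how many bigrams of the test sentence were seen in training."""
    seen: Set[Tuple[str, str]] = set()
    for sentence in train:
        seen |= bigrams(sentence, generalise)
    target = bigrams(test, generalise)
    return len(target & seen), len(target)


# Made-up training and test sentences; only the time expression varies.
training = [
    "seminar at 12:00 in W107",
    "seminar at 14:30 in W107",
    "seminar at 9:15 in W107",
]
unseen = "seminar at 16:45 in W107"

print("flat words: ", coverage(training, unseen, generalise=False))
print("generalised:", coverage(training, unseen, generalise=True))
```

On the flat word sequence the unseen time "16:45" breaks two of the four bigrams; after generalisation every bigram has been seen in training. Real systems would draw such classes from shallow NLP components (for example a part-of-speech tagger) rather than from hand-written patterns like these.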
Maintained by
Adam Kilgarriff
(Adam.Kilgarriff@itri.brighton.ac.uk).
Last updated Tuesday March 13 2001
© Information Technology Research Institute