Description
We are using semantic information to identify when sentences share content even though
there is little similarity in their vocabulary or structure. For example, a scientific
paper might use the formal name for a particular species (e.g. Zootermopsis angusticollis)
while the same work presented to a general audience may use a common name (termite)
when the level of precision is less important. Documents may also be structured differently
depending on the intended audience, so that sentences which carry the same information
may share no common terms.
We are developing approximate, or “rough”, semantic representations which
can be matched with more lightweight algorithms than are required to recognise full
semantic equivalence. These allow us to recognise when different terms appear in similar
linguistic contexts, and so may have similar denotations. We are applying this work
to identify which parts of popular science articles discuss the source academic articles.
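
As a rough illustration of the idea that terms appearing in similar linguistic contexts may have similar denotations, the sketch below builds simple co-occurrence vectors and compares them with cosine similarity. This is not the project's actual representation, which is not described here; the toy corpus, the window size, and the function names `context_vectors` and `cosine` are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each term to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, term in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Neighbouring tokens on both sides, excluding the term itself.
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(term, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus: the formal and common names occur in the same context.
sentences = [
    "the termite digests wood with gut microbes",
    "the zootermopsis digests wood with gut microbes",
    "the journal publishes articles every month",
]

vectors = context_vectors(sentences)
similar = cosine(vectors["termite"], vectors["zootermopsis"])
unrelated = cosine(vectors["termite"], vectors["journal"])
print(f"termite vs zootermopsis: {similar:.2f}")
print(f"termite vs journal:      {unrelated:.2f}")
```

In this toy setting the two species names receive a much higher score than the unrelated pair, even though the names themselves share no characters or terms. A real system would need far more text and a richer notion of linguistic context than a fixed token window.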
People
Alistair Willis