Universidad Complutense de Madrid
E-Prints Complutense

Thematic patterning in English and Spanish: contrastive annotation of a bilingual newspaper corpus for linguistic and computational applications
La estructuración temática en inglés y español: anotación contrastiva de un corpus bilingüe para aplicaciones lingüísticas y computacionales

Moratón Gutiérrez, Lara (2016) Thematic patterning in English and Spanish: contrastive annotation of a bilingual newspaper corpus for linguistic and computational applications. [Thesis]


Abstract

Thematization is recognized by different linguistic schools as a fundamental phenomenon in the construction of messages and texts. This location within a text privileges the elements that guide the reader in the orientation and interpretation of discourse at different levels. Thematizing a linguistic unit by locating it in the first-initial position of a clause, paragraph, or text confers upon it a special status: a signal of the organizational strategy which characterizes different text types, playing a role as a variable in the distinction of registers, text types and genres. However, in spite of the importance of the study of thematization for message and textual structuring, to date there are no linguistic studies that have undertaken the task of validating its aspects in a comparative manner, either for linguistic or computational purposes. This study, therefore, fills a research gap by implementing a methodology based on contrastive corpus annotation, which allows the empirical validation of aspects of the phenomenon of Thematization in English and Spanish. It also seeks to develop a bilingual English-Spanish comparable corpus of newspaper texts automatically annotated with thematic features at clausal and discourse levels. The empirically validated categories (Thematic Field and its elements: Textual Theme, Interpersonal Theme, PreHead and Head) are used to annotate a larger corpus of three newspaper genres (news reports, editorials and letters to the editor) in terms of thematic choices. This characterization reveals interesting results, such as the use of genre-specific strategies in thematic position. In addition, the thesis investigates the possibility of automating the annotation of thematic features in the bilingual corpus through the development of a set of Java rules implemented in GATE. It also shows the efficacy of this method in comparison with the manual annotation results...


Item Type:Thesis
Additional Information:

Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Filología, Departamento de Filología Inglesa, defended on 04-12-2015

Directors:
Lavid López, Julia
Uncontrolled Keywords:Lingüística
Keywords (other languages):Linguistics
Subjects:Humanities > Philology > Linguistics
ID Code:39741
Deposited On:25 Oct 2016 13:44
Last Modified:06 May 2019 10:29
