ICAME Journal Feed
Sciendo RSS Feed for ICAME Journal
comparative corpus-based investigation of results sections of research articles in Applied Linguistics and Physics<abstract> <title style='display:none'>Abstract</title> <p>The present study sought to identify the generic structures of the results sections of scientific research articles (RAs) in Applied Linguistics and Physics. Following a manual search approach, a total of 200 RAs in the fields of Applied Linguistics and Physics were randomly selected from different prestigious journals and analyzed. In addition to offering a tentative template for the rhetorical organization of results sections, the findings revealed shared and non-shared rhetorical units as well as obligatory and optional steps in the results sections (RSs) of research articles across the two disciplines. The findings also indicated that RA writers organize the contents of the RSs around certain rhetorical resources (i.e., M1, M2, M3, M4, and M5) to present key experimental and factual analytical results of their studies. The findings further suggested the existence of a common core of rhetorical resources for writing RSs across the disciplines, although a set of specific steps plays an essential part in distinguishing the textual features of each discipline and in showing how the RSs of each discipline are developed. The findings offer a number of important pedagogical implications for teaching EAP and ESP courses, especially for teachers and students of Applied Linguistics and Physics.</p> </abstract>
Pérez-Paredes and Geraldine Mark (eds.). . Amsterdam/Philadelphia: John Benjamins Publishing Company, 2021. ix, 255 pp. 
ISBN: 978-9-02720989-4 (HB)
differences between English clippings and their source words: A corpus-based study<abstract> <title style='display:none'>Abstract</title> <p>This paper uses corpus data and methods of distributional semantics to study English clippings such as <italic>dorm</italic> (&lt; <italic>dormitory</italic>), <italic>memo</italic> (&lt; <italic>memorandum</italic>), or <italic>quake</italic> (&lt; <italic>earthquake</italic>). We investigate whether systematic meaning differences between clippings and their source words can be detected. The analysis is based on a sample of 50 English clippings. Each of the clippings is represented by a concordance of 100 examples in context that were gathered from the Corpus of Contemporary American English. We compare clippings and their source words both at the aggregate level and for individual pairs of clippings and source words. The data show that clippings tend to be used in contexts that represent involved text production, which aligns with the idea that clipped words signal familiarity with their referents. It is further observed that individual clippings and their source words partly diverge in their distributional profiles, reflecting both overlap and differences with regard to their meanings. We interpret these findings against the theoretical background of Construction Grammar and specifically the Principle of No Synonymy.</p> </abstract>
series as disseminators of emerging vocabulary: Non-codified expressions in the TV Corpus<abstract> <title style='display:none'>Abstract</title> <p>This study presents a method for identifying words that appear in corpus data earlier than their first date of attestation in dictionaries. We demonstrate the application of this method using a large diachronic corpus, the TV Corpus, and the <italic>Oxford English Dictionary</italic> (OED). 
Combining automatic extraction of candidate terms from the TV Corpus with comprehensive manual analysis and verification, the method identifies 32 words that were used in TV series before their first attestation in the OED. We present a detailed discussion of these words, analysing their distribution across decades and genres of the TV Corpus, their origins, semantic domains and word-formation processes. We also present extracts with their first uses in the TV Corpus and analyse how the words were presented to a large and anonymous mass audience. Our study shows that the method we present is suitable for identifying early attestations of words in large corpora, even though in the case of the TV Corpus, a great deal of manual analysis and verification is needed. In addition, we argue that TV series and other types of fictional texts are an important resource for studying the coinage and spread of terms, due to their function and the fact that they address a mass audience.</p> </abstract>
to : The simplification of leavetaking formulae in 18th-century Scottish and Irish English letters<abstract> <title style='display:none'>Abstract</title> <p>The present study investigates the impact of social status on the use and change of pragmatic formulae in historical varieties of English. The study asks which leavetaking formulae are used between writers of equal social status in varieties of English in the later 18th century. Working on a corpus compiled from two subsets of letters each from 18th-century Scottish and Irish English, the study illustrates pragmatic change on the basis of the investigation of leavetakings involving the <italic>servant</italic> formula. By doing so, the study also helps to widen the hitherto narrow focus on mainly English English.</p> <p>The study shows that the use of formulae is situationally dependent. 
It suggests that pragmatic change takes place amongst writers of equal social status in the private domain, which then leads to the use of such formulae in the public domain and to their use between writers of different status groups.</p> </abstract>
Schützler and Julia Schlüter (eds.). . Cambridge: Cambridge University Press, 2022. 357 pp. ISBN 978-1-10849964-4
: A new tool for analysing recent changes in English legal discourse<abstract> <title style='display:none'>Abstract</title> <p>Legal discourse is widely assumed to be resistant to change, and indeed legislative documents are extremely conservative, with fixed and formulaic structures. However, recent research has shown that changes can be observed in the lexico-grammatical features of some legal documents when examined diachronically, particularly since the emergence in the 1970s of the Plain Language Movement, which sought to draw attention to the unnecessary complexity of official language, including legal discourse. Despite the crucial changes in legal language in recent years, research in this direction remains scarce, particularly on the British English variety, probably due in part to the shortage of specialised corpora that allow this kind of study. In order to bridge this gap, we have embarked on the compilation of the <italic>Corpus of Contemporary English Legal Decisions, 1950–2021</italic> (CoCELD), a corpus of British judicial decisions produced between 1950 and 2021. In this paper we present the structure and characteristics of CoCELD, as well as the methodology used for its compilation. The new corpus, which was released in February 2022, contains sample texts of roughly 2,500 words for each year from 1950 to 2021, which adds up to more than 730,000 words. The corpus contains files in raw text and with POS-annotation, and is freely available to the research community under signed consent. 
With CoCELD we hope to contribute a new, useful resource for linguists with an interest in legal language, from both a synchronic and a diachronic perspective.</p> </abstract>
a corpus of South Asian online Englishes: A report, some reflections and a pilot study<abstract> <title style='display:none'>Abstract</title> <p>In this research article we introduce the <bold>S</bold>outh <bold>A</bold>sian <bold>On</bold>line <bold>E</bold>nglishes (SAOnE) corpus representing four South Asian countries, i.e. Bangladesh, India, Pakistan, and Sri Lanka, and two native English-speaking countries, i.e. the UK and the USA. We have used semi-automatic and manual methods to collect data from three internet registers, i.e. newspaper comments, web forums and tweets, and a collection of internet sub-registers which we label as blogs and websites. Additionally, we have collected text messages using online freelance hiring platforms from each of the South Asian countries mentioned above. Each register category in the corpus consists of approximately 1 million words per country, except text messages, which contain around 500,000 words per country and cover only the four South Asian countries. We have verified the origin of website and blog links, of Twitter authors and, where possible, of commenters and web forum users to make sure that only local content from each country is included. The corpus features some indigenous language content, which is tagged.</p> <p>In addition to the description of this dataset, we also present a pilot study analysing three discourse particles, namely <italic>na</italic>, <italic>neh</italic>, and <italic>yaar</italic>. The discourse particles <italic>na</italic> and <italic>yaar</italic> are native to Hindi/Urdu, while <italic>neh</italic> is based on a Sinhala negation marker. Our analysis indicates that <italic>na</italic> and <italic>neh</italic> have similarities in terms of their position in the clause/utterance. 
However, <italic>neh</italic> is confined to Sri Lanka while the Hindi/Urdu-based discourse particles are also used in our Twitter data from Sri Lanka and Bangladesh. The use of these discourse particles in Bangladeshi tweets shows the influence of Indian culture through Bollywood celebrities. Of the Hindi/Urdu discourse particles <italic>yaar</italic> and <italic>na</italic>, <italic>yaar</italic> is preferred in Pakistan while <italic>na</italic> is preferred in India; additionally, <italic>yaar</italic> is used at the start of the clause more often in our Pakistani data. Lastly, we discuss the implications of the pilot study, the advantages of the type of data used for the pilot study, and future research directions.</p> </abstract>
McEnery and Vaclav Brezina. . Cambridge: Cambridge University Press, 2022. 313 pp. ISBN 978-1-1071-1062-5
and evaluation in contemporary American English: A corpus study based on pronominal and nominal expressions with male and female reference<abstract> <title style='display:none'>Abstract</title> <p>This study of contemporary American English examines how males and females are evaluated in terms of their personality, physical appearance, societal importance, etc. across various registers. In this study, <italic>evaluation</italic> is defined as an expression of a speaker or writer’s attitude toward, viewpoint on, or feelings about a male or female referent, which generally carries a positive or a negative meaning. The evaluative tokens analyzed in the study include noun phrases (e.g., <italic>a real jerk</italic>) and adjectival modification (e.g., <italic>congenial</italic>) co-occurring with gender-specific nominal expressions (e.g., <italic>boy</italic>, <italic>lady</italic>) or pronominal expressions (e.g., <italic>he</italic>, <italic>she</italic>). 
The findings point to a distinct gender patterning in the evaluation: whereas males are evaluated in terms of their skills, abilities, acuities and importance in society, females are typically assessed in terms of their looks and appearance. Males occupy considerably more evaluative space than females, particularly in the Newspaper register. The preponderance of the evaluation of males even in twenty-first-century American English is surprising, considering changes in gender role attitudes in U.S. society in recent decades.</p> </abstract>
science in urgent times: CoViD-19 and its impact on scientific writing<abstract> <title style='display:none'>Abstract</title> <p>The urgent need for new knowledge as a result of the CoViD-19 pandemic has led to a significant increase in the amount of scientific writing on the topic. Various analyses of this phenomenon from different approaches have appeared thus far (Horbach 2020; Torres-Salinas 2020). However, less attention has been paid to the impact of this situation on the language of these studies, that is, to whether the continued emergency affects authors’ conscious or unconscious linguistic choices, and if so, how. This article compares texts on CoViD with texts written during the previous MERS emergency and its aftermath, to determine whether texts on CoViD present particular linguistic features reflective of this situation of urgency. Results suggest that texts on CoViD do indeed exhibit particular linguistic features, and that these point to a preference for conveying immediate knowledge and a departure from rhetorical practices common in scientific writing.</p> </abstract>
Bernaisch (ed.). . Cambridge: Cambridge University Press, 2021. xv, 235 pp. ISBN: 978-1-108-48254-7
Moessner. . Edinburgh: Edinburgh University Press. 2020. 272 pp. ISBN 978 1 4744 3799 8
Rautionaho, Arja Nurmi and Juhani Klemola (eds.). (Studies in Corpus Linguistics 96). 
Amsterdam and Philadelphia: John Benjamins Publishing Company, 2020. 305 pp. ISBN 9789027205438 (HB)
bundles in maritime texts<abstract> <title style='display:none'>Abstract</title> <p>Lexical bundles are frequently recurring word combinations. Research has shown that lexical bundles vary by genre and register (Biber 2006; Biber, Conrad and Cortes 2004; Hyland 2008a, 2008b; Scott and Tribble 2006). However, the evidence on the degree to which they vary by discipline remains inconclusive. The main aim of this paper is to establish whether lexical bundles are discipline specific, i.e., whether each discipline draws on a specialized lexical repertoire or whether there is a core vocabulary shared across various disciplines. For that purpose, maritime texts covering the subdomains marine engineering, navigation, maritime law and shipping have been collected so as to investigate the structure and function of lexical bundles and to find out how they shape meaning in specialized discourse. For the purposes of the study, a 7.4 M corpus consisting of two monolingual subcorpora and one bilingual subcorpus was compiled. This corpus can be used as a basis for further studies in the field. Furthermore, the paper discusses problems encountered while extracting N-grams from a corpus, as well as classification criteria for the identification of lexical bundles. The results show that lexical bundles identified in maritime texts are phrasal rather than clausal. The results also indicate that lexical bundles are discipline specific. 
Teaching these specialized features that shape discourse can improve students’ language production and should thus be the focus of instruction in ESP.</p> </abstract>
data for more researchers – using the audio features of BNCweb<abstract> <title style='display:none'>Abstract</title> <p>In spite of the wide agreement among linguists as to the significance of spoken language data, actual speech data have not formed the basis of empirical work on English as much as one would think. The present paper is intended to contribute to changing this situation, on a theoretical and on a practical level. On a theoretical level, we discuss different research traditions within (English) linguistics. Whereas speech data have become increasingly important in various linguistic disciplines, major corpora of English developed within the corpus-linguistic community, carefully sampled to be representative of language usage, are usually restricted to orthographic transcriptions of spoken language. As a result, phonological phenomena have remained conspicuously understudied within traditional corpus linguistics. At the same time, work with current speech corpora often requires a considerable level of specialist knowledge and tailor-made solutions. On a practical level, we present a new feature of BNCweb (Hoffmann et al. 2008), a user-friendly interface to the British National Corpus, which gives users access to audio and phonemic transcriptions of more than five million words of spontaneous speech. With the help of a pilot study on the variability of intrusive r we illustrate the scope of the new possibilities.</p> </abstract>
Götz and Joybrato Mukherjee (eds.). (Studies in Corpus Linguistics 92). Amsterdam/Philadelphia: John Benjamins. 2019. iv+267 pp. ISBN 978 90 272 0236 9. 
the corpus-based study of Shakespeare’s language: Enhancing a corpus of the First Folio<abstract> <title style='display:none'>Abstract</title> <p>This article explores challenges in the corpus linguistic analysis of Shakespeare’s language, and Early Modern English more generally, with particular focus on elaborating possible solutions and the benefits they bring. An account of work that took place within the <italic>Encyclopedia of Shakespeare’s Language</italic> Project (2016–2019) is given, which discusses the development of the project’s data resources, specifically, the <italic>Enhanced Shakespearean Corpus.</italic> Topics covered include the composition of the corpus and its subcomponents; the structure of the XML markup; the design of the extensive character metadata; and the word-level corpus annotation, including spelling regularisation, part-of-speech tagging, lemmatisation and semantic tagging. The challenges that arise from each of these undertakings are not exclusive to a corpus-based treatment of Shakespeare’s plays, but it is in the context of Shakespeare’s language that they are so severe as to seem almost insurmountable. The solutions developed for the <italic>Enhanced Shakespearean Corpus</italic> – often combining automated manipulation with manual interventions, and always principled – offer a way through.</p> </abstract>
systems for corpus linguists Corpus-based classification and frequency distribution