ICAME Journal Feed

Review: Theresa Neumaier. . Cambridge: Cambridge University Press, 2023. xvi. 288 pp. ISBN 9781108936996

Tue, 28 May 2024 00:00:00 GMT

Review: Jesse Egbert, Douglas Biber and Bethany Gray. . Cambridge: Cambridge University Press, 2022. 300 pp. ISBN 978-1316605882

Tue, 28 May 2024 00:00:00 GMT

How real is the quantitative turn? Investigating statistics as the new normal in linguistics

Tue, 28 May 2024 00:00:00 GMT

Statistical approaches in linguistics seem to have gained in importance in recent times, especially in the field of Corpus Linguistics. In particular, the last ten years have seen an upsurge in linguists dedicating themselves to statistical methods and to the improvement of statistical knowledge. This has repeatedly been described as ‘the quantitative turn’ in linguistics. In the present paper, we assess how real this quantitative turn actually is and whether statistics can be considered the ‘new normal’ in (corpus) linguistics. To this end, we have analyzed the contributions to six high-impact journals (Corpora, Corpus Linguistics and Linguistic Theory, ICAME Journal, English World-Wide, Journal of English Linguistics, and Language Variation and Change) for a period of eleven years (January 2011 until December 2021). Our results suggest that, indeed, statistical methods seem to be on the rise in linguistic studies. However, their frequency strongly varies between the journals, and, in general, we have identified some room for improvement in the use of advanced statistical methods, in particular the discussion of true prediction.

Review: Agnieszka Leńko-Szymańska and Sandra Götz (eds.). . Amsterdam: John Benjamins, 2022. vi. 327 pp. ISBN 9789027212580 (HB)

Tue, 28 May 2024 00:00:00 GMT

Against level-3-only analyses in corpus linguistics

Tue, 28 May 2024 00:00:00 GMT

In the last few decades, much work in corpus linguistics has attempted to discover, and then interpret, differences in the frequencies of use of linguistic elements (words, patterns, constructions, discourse features, etc.). It is probably fair to say that such studies were particularly frequent in (i) learner corpus research, (ii) corpus-based varieties research, and (iii) sociolinguistically motivated studies. For instance, many studies have discussed the differences in how often certain elements are used (i) in corpus data from native speakers vs. corpus data from learners from different L1 backgrounds, (ii) in corpora representing different inner- and outer-circle varieties, or (iii) by speakers in corpora representing people of different gender or sexual identities.

This paper will make the admittedly bold claim that any such study is in fact by definition unable to ‘prove’ what is often its main point, namely that the distributional differences found are in fact due to the hypothesized explanatory variable(s) of L1, VARIETY, or, e.g., GENDER, even when the distributional differences are significant and come with a decent effect size. To substantiate this claim, I will first discuss some terminology from the family of methods known as multi-level modeling, namely the distinction between level-1, level-2, ... level-n variables and its relevance for many corpus studies. Second, I will demonstrate how studies using only the above kinds of variables cannot distinguish the effect of their favored predictors from the effect of local/contextual level-1 variables. Third, in discussing this, I will exemplify how such effects need to be explored quantitatively instead.

Exploring variation in English as a lingua franca: Multivariate analysis of modal verbs of obligation and necessity in the VOICE corpus

Tue, 28 May 2024 00:00:00 GMT

The modal verbs of necessity and obligation, a testing ground of grammatical change, have been shown to exhibit change and variation in world Englishes. Previous studies have primarily concentrated on English as a native language (ENL) and English as a second language (ESL) varieties. The present study extends this line of research and explores variation in modal verbs of necessity and obligation in English used as a Lingua Franca (ELF). Descriptive statistics indicate that ELF resembles American English and also shares similarities with ESL varieties. In addition, ELF further exhibits divergence from both ENL and ESL varieties that arises in multilingual interactions. The multivariate analysis of this study employs mixed-effects logistic regression on the use of must and have to. Integrating social and linguistic factors, this analysis exploits metadata gathered from the VOICE corpus, which has thus far been underused. The results of the inferential statistics indicate that the same sociolinguistic factors that influence the variation in ENL and ESL varieties also shape ELF grammar. These findings not only bring ELF closer to other English varieties but also demonstrate the advantage of studying ELF from a variationist sociolinguistic perspective.

Mapping shared lexical bundles onto rhetorical moves in nursing research articles: A comparative study of paradigmatic variation

Tue, 28 May 2024 00:00:00 GMT

Previous studies have identified frequent lexical bundles associated with qualitative, quantitative, and mixed methods research paradigms. These paradigmatic investigations of lexical bundles conducted thus far seem to have two limitations. One is that they have primarily concentrated on distinctive lexical bundles, without much analysis of the shared bundles in qualitative, quantitative, and mixed methods research paradigms. Another shortcoming is that they tend not to explore in which contexts lexical bundles are likely to occur. These two problems deserve attention, as shared bundles are also frequently used to facilitate fluent linguistic production and analysing lexical bundles in their surrounding contexts can help reveal their specific textual meanings. To address these two limitations, this study seeks to link shared lexical bundles with rhetorical moves based on a corpus consisting of qualitative, quantitative, and mixed methods nursing research articles. The findings of this study show that in certain move-steps, shared lexical bundles have distinctive discourse functions in mixed methods research. Meanwhile, the findings also show that there are move-steps where shared lexical bundles have similar discourse functions in two or three research paradigms. Revealing shared lexical bundles’ discourse functions in specific contexts may enable learners to know where to use the bundles in a text.

Semantic prosody, semantic transfer and semantic change

Tue, 28 May 2024 00:00:00 GMT

This article investigates semantic prosody from a diachronic perspective. Although prosodies have been shown to change over time, there is no consensus regarding the source of such changes. The present study explores this further through a corpus study of the development of the lemmas fabric, fabricate and fabrication from the late 15th century to the late 20th century, drawing on material from Early English Books Online, the Corpus of Late Modern English Texts and the British National Corpus. The results of the study show that prosodic changes coincide with the emergence of new senses and indicate that these processes are related to and possibly caused by semantic transfer induced by persistent prosodies over time.

A comparative corpus-based investigation of results sections of research articles in Applied Linguistics and Physics

Sun, 30 Apr 2023 00:00:00 GMT

The present study sought to identify the generic structures of the results sections of scientific research articles (RAs) in Applied Linguistics and Physics. Following a manual search approach, a total of 200 RAs in Applied Linguistics and Physics were randomly selected from different prestigious journals and analyzed. In addition to offering a tentative template for the rhetorical organization of results sections, the findings revealed shared and non-shared rhetorical units as well as obligatory and optional steps in the results sections (RSs) of research articles across the two disciplines. The findings also indicated that RA writers organize the contents of the RSs around certain rhetorical resources (i.e., M1, M2, M3, M4, and M5) to present key experimental and factual analytical results of their studies. The findings further suggested the existence of a common core of rhetorical resources in writing RSs across the disciplines, although a set of certain steps plays an essential part in distinguishing the textual features of each discipline and in depicting how the RSs of each discipline are developed. The findings generated from the study can offer a number of important pedagogical implications for teaching EAP and ESP courses, especially for Applied Linguistics and Physics teachers and students.

Pascual Pérez-Paredes and Geraldine Mark (eds.). . Amsterdam/Philadelphia: John Benjamins Publishing Company, 2021. ix. 255 pp. ISBN: 978-9-02720989-4 (HB)

Mon, 01 May 2023 00:00:00 GMT

Meaning differences between English clippings and their source words: A corpus-based study

Mon, 01 May 2023 00:00:00 GMT

This paper uses corpus data and methods of distributional semantics in order to study English clippings such as dorm (< dormitory), memo (< memorandum), or quake (< earthquake). We investigate whether systematic meaning differences between clippings and their source words can be detected. The analysis is based on a sample of 50 English clippings. Each of the clippings is represented by a concordance of 100 examples in context that were gathered from the Corpus of Contemporary American English. We compare clippings and their source words both at the aggregate level and in terms of comparisons between individual clippings and their source words. The data show that clippings tend to be used in contexts that represent involved text production, which aligns with the idea that clipped words signal familiarity with their referents. It is further observed that individual clippings and their source words partly diverge in their distributional profiles, reflecting both overlap and differences with regard to their meanings. We interpret these findings against the theoretical background of Construction Grammar and specifically the Principle of No Synonymy.

TV series as disseminators of emerging vocabulary: Non-codified expressions in the TV Corpus

Mon, 01 May 2023 00:00:00 GMT

This study presents a method for identifying words that appear in corpus data earlier than their first date of attestation in dictionaries. We demonstrate the application of this method based on a large diachronic corpus, the TV Corpus, and the Oxford English Dictionary (OED). Combining automatic extraction of candidate terms from the TV Corpus with comprehensive manual analysis and verification, the method identifies 32 words that were used in TV series before their first attestation in the OED. We present a detailed discussion of these words, analysing their distribution across decades and genres of the TV Corpus, their origins, semantic domains and word-formation processes. We also present extracts with their first uses in the TV Corpus and analyse how the words were presented to the large and anonymous mass audience. Our study shows that the method we present is suitable for identifying early attestations of words in large corpora, even though in the case of the TV Corpus, a great deal of manual analysis and verification is needed. In addition, we argue that TV series and other types of fictional texts are an important resource for studying the coinage and spread of terms, due to their function and the fact that they address a mass audience.

From to : The simplification of leavetaking formulae in 18th-century Scottish and Irish English letters

Mon, 01 May 2023 00:00:00 GMT

The study at hand investigates the impact of social status on the use and change of pragmatic formulae in historical varieties of English. The study asks which leavetaking formulae are used between writers of equal social status in varieties of English in the later 18th century. Working on a corpus of letters compiled from two subsets of letters each from 18th-century Scottish and Irish English, the study illustrates pragmatic change on the basis of the investigation of leavetakings involving the servant formula. By doing so, the study also helps to widen the hitherto predominating narrow focus on mainly English English.

The study shows that the use of formulae is situationally dependent. It suggests that pragmatic change takes place amongst writers of equal social status in the private domain, which then leads to the use of such formulae in the public domain and to the use between writers of different status groups.

Ole Schützler and Julia Schlüter (eds.). . Cambridge: Cambridge University Press, 2022. 357 pp. ISBN 978-1-10849964-4

Mon, 01 May 2023 00:00:00 GMT

The : A new tool for analysing recent changes in English legal discourse

Mon, 01 May 2023 00:00:00 GMT

Legal discourse is widely assumed to be resistant to change, and indeed legislative documents are extremely conservative with fixed and formulaic structures. However, recent research has shown that changes can be observed in the lexico-grammatical features of some legal documents when examined diachronically, particularly since the emergence in the 1970s of the Plain Language Movement, which sought to draw attention to the unnecessary complexity of official language, including legal discourse. Despite the crucial changes in legal language in recent years, research in that direction is scarce to date, particularly on the British English variety, probably due, in part, to the shortage of specialised corpora that allow this kind of study. In order to bridge this gap, we have embarked on the compilation of the Corpus of Contemporary English Legal Decisions, 1950–2021 (CoCELD), a corpus of British judicial decisions produced between 1950 and 2021. In this paper we present the structure and characteristics of CoCELD, as well as the methodology used for its compilation. The new corpus, which was released in February 2022, contains sample texts of roughly 2,500 words for each year from 1950 to 2021, which adds up to more than 730,000 words. The corpus contains files in raw text and with POS-annotation, and is freely available to the research community under signed consent. With CoCELD we hope to contribute a new, useful resource for linguists with an interest in legal language, from both a synchronic and a diachronic perspective.

Compiling a corpus of South Asian online Englishes: A report, some reflections and a pilot study

Mon, 01 May 2023 00:00:00 GMT

In this research article we introduce the South Asian Online Englishes (SAOnE) corpus representing four South Asian countries, i.e. Bangladesh, India, Pakistan, and Sri Lanka, and two native English-speaking countries, i.e. the UK and the USA. We have used semi-automatic and manual methods to collect data from three internet registers, i.e. newspaper comments, web forums and tweets, and a collection of internet sub-registers which we label as blogs and websites. Additionally, we have collected text messages using online freelance hiring platforms from each of the South Asian countries mentioned above. Each register category in the corpus consists of approximately 1 million words per register per country, except text messages, which contain around 500,000 words per country and only include the four South Asian countries. We have verified the origin of website and blog links, of Twitter authors, and, where possible, of commenters and web forum users to make sure that only local content of each country is included. The corpus features some indigenous language content, which is tagged.

In addition to the description of this dataset, we also present a pilot study analysing three discourse particles, namely na, neh, and yaar. The discourse particles na and yaar are native to Hindi/Urdu, while neh is based on a Sinhala negation marker. Our analysis indicates that na and neh have similarities in terms of their position in the clause/utterance. However, neh is confined to Sri Lanka while the Hindi/Urdu based discourse particles are also used in our Twitter data from Sri Lanka and Bangladesh. The use of these discourse particles in Bangladeshi tweets shows the influence of Indian culture through Bollywood celebrities. Of the Hindi/Urdu discourse particles yaar and na, yaar is preferred in Pakistan while na is preferred in India; additionally, yaar is used at the start of the clause more often in our Pakistani data. Lastly, we discuss the implications of the pilot study, the advantages of the type of data used for the pilot study, and future research directions.

Tony McEnery and Vaclav Brezina. . Cambridge: Cambridge University Press, 2022. 313 pp. ISBN 978-1-1071-1062-5

Mon, 01 May 2023 00:00:00 GMT

Gender and evaluation in contemporary American English: A corpus study based on pronominal and nominal expressions with male and female reference

Mon, 01 May 2023 00:00:00 GMT

This study of contemporary American English examines how males and females are evaluated in terms of their personality, physical appearance, societal importance, etc. across various registers. In this study, evaluation is defined as an expression of a speaker or writer’s attitude toward, viewpoint on, or feelings about a male or female referent, which generally carries a positive or a negative meaning. The evaluative tokens analyzed in the study include noun phrases (e.g., a real jerk) and adjectival modification (e.g., congenial) co-occurring with gender-specific nominal expressions (e.g., boy, lady) or pronominal expressions (e.g., he, she). The findings imply a distinct gender patterning in the evaluation: whereas males are evaluated in terms of their skills, abilities, acuities and importance in society, females are typically assessed in terms of their looks and appearance. Males occupy considerably more evaluative space than females, particularly in the Newspaper register. The preponderance of the evaluation of males even in twenty-first-century American English is surprising, considering changes in gender role attitudes in U.S. society in recent decades.

Writing science in urgent times: CoViD-19 and its impact on scientific writing

Fri, 26 Aug 2022 00:00:00 GMT

The urgent need for new knowledge as a result of the CoViD-19 pandemic has led to a significant increase in the amount of scientific writing on the topic. Various analyses of this phenomenon from different approaches have appeared thus far (Horbach 2020; Torres-Salinas 2020). However, less attention has been paid to the impact of this situation on the language of these studies, looking into whether the continued emergency affects authors’ conscious or unconscious linguistic choices, and if so, how. This article compares texts on CoViD with texts written during the previous MERS emergency and its aftermath, trying to find if texts on CoViD present particular linguistic features reflective of this situation of urgency. Results suggest that texts on CoViD do indeed exhibit particular linguistic features, and that these point to a preference for conveying immediate knowledge and a departure from rhetorical practices common in scientific writing.

Tobias Bernaisch (ed.). . Cambridge: Cambridge University Press, 2021. xv, 235 pp. ISBN: 978-1-108-48254-7

Fri, 26 Aug 2022 00:00:00 GMT