Research Article
A refined and concise model of indices for quantitatively measuring lexical richness of Chinese university students’ EFL writing
1 College of International Studies, Southwest University, Chongqing, CHINA
2 Faculty of Modern Languages and Communication, Universiti Putra Malaysia, Serdang, Selangor, MALAYSIA
3 Independent Researcher, CHINA
* Corresponding Author
Contemporary Educational Technology, 16(3), July 2024, ep513, https://doi.org/10.30935/cedtech/14707
Published Online: 11 June 2024, Published: 01 July 2024
OPEN ACCESS
ABSTRACT
Scholars have proposed many indices for measuring the lexical richness (LR) of English as a foreign language (EFL) writing, but these indices are often redundant and are used inconsistently across studies. This study asks which indices are the most sensitive and effective in distinguishing between grade levels of Chinese university students’ EFL writing, and it puts forward a refined and concise model of indices that faithfully reflects LR in EFL writing. A total of 180 compositions were selected from a Chinese EFL learner corpus: Spoken and written English corpus of Chinese learners. Scores on 28 LR indices were computed for these compositions using the software Lexical Complexity Analyzer, MATTR, and Coh-Metrix. For each index, either a one-way ANOVA or Welch’s ANOVA was conducted, depending on whether the assumption of homogeneity of variances held for that variable. Two criteria determined which index of a measure was included in the refined model: whether the index differed significantly among grade levels, and the effect size of the ANOVA. Based on the quantitative ANOVA results and on qualitative human judgment informed by the literature, six indices representing the six LR measures were included in the refined model: lexical density, lexical sophistication-I, verb sophistication-II, number of different words-expected sequence 50, corrected TTR, and squared verb variation-I. This refined model addresses the redundancy and inconsistency found in previous studies and provides a more accurate and efficient tool for assessing LR in EFL writing.
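The selection procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the study used Lexical Complexity Analyzer, MATTR, and Coh-Metrix, whereas the function names `corrected_ttr` and `one_way_anova`, the whitespace tokenization, the toy scores, and the use of eta squared as the effect size are all assumptions made here for illustration. The sketch covers only the classical one-way ANOVA; Welch's variant and the homogeneity-of-variances check are omitted.

```python
import math
from statistics import mean


def corrected_ttr(tokens):
    """Corrected TTR: number of word types / sqrt(2 * number of tokens)."""
    if not tokens:
        raise ValueError("text must contain at least one token")
    types = len({t.lower() for t in tokens})
    return types / math.sqrt(2 * len(tokens))


def one_way_anova(groups):
    """Return (F statistic, eta squared) for a list of score groups.

    Eta squared (SS_between / SS_total) is an assumed effect-size choice;
    the abstract does not say which effect size the study reports.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


# Hypothetical index scores for three grade levels (not data from the study).
grade_scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, eta_squared = one_way_anova(grade_scores)
print(f_stat, eta_squared)

# Corrected TTR of a toy sentence: 4 types over 5 tokens.
print(corrected_ttr("the cat saw the dog".split()))
```

Under these assumptions, an index would be a candidate for the refined model when its F test is significant across grade levels and its effect size is large relative to the other indices of the same measure.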
CITATION (APA)
Yang, Y., & Zheng, Z. (2024). A refined and concise model of indices for quantitatively measuring lexical richness of Chinese university students’ EFL writing. Contemporary Educational Technology, 16(3), ep513. https://doi.org/10.30935/cedtech/14707