Research Article

A refined and concise model of indices for quantitatively measuring lexical richness of Chinese university students’ EFL writing

Yang Yang 1, 2, *, Ze Zheng 3
1 College of International Studies, Southwest University, Chongqing, CHINA
2 Faculty of Modern Languages and Communication, Universiti Putra Malaysia, Serdang, Selangor, MALAYSIA
3 Independent Researcher, CHINA
* Corresponding Author
Contemporary Educational Technology, 16(3), July 2024, ep513, https://doi.org/10.30935/cedtech/14707
Published Online: 11 June 2024, Published: 01 July 2024
OPEN ACCESS

ABSTRACT

In the existing literature, scholars have proposed various indices to measure the lexical richness (LR) of English as a foreign language (EFL) writing. However, many of these indices are redundant, and they are used inconsistently across studies. To identify which indices are the most sensitive and effective in distinguishing between grade levels in Chinese university students’ EFL writing, this study puts forward a refined and concise model of indices that can accurately reflect LR in EFL writing. A total of 180 compositions were selected from a Chinese EFL learner corpus, the Spoken and Written English Corpus of Chinese Learners. Scores on 28 LR indices were computed for these compositions using the software tools Lexical Complexity Analyzer, MATTR, and Coh-Metrix. For each index, either a one-way ANOVA or Welch’s ANOVA was conducted, depending on whether the variable met the assumption of homogeneity of variances. Two criteria determined which index of a measure should be included in the refined model: whether the index differed significantly across grade levels, and the ANOVA effect size. Based on the quantitative ANOVA results and qualitative judgment informed by the literature, six indices covering the six LR measures were included in the refined model: lexical density, lexical sophistication-I, verb sophistication-II, number of different words-expected sequence 50, corrected TTR, and squared verb variation-I. This refined model addresses the redundancy and inconsistency found in previous studies, providing a more accurate and efficient tool for assessing LR in EFL writing.
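To make some of the indices named above concrete, the following is a minimal Python sketch of how lexical density, corrected TTR, squared verb variation-I, and a moving-average TTR are commonly defined in the LR literature. It is not the authors' pipeline, which relied on Lexical Complexity Analyzer, MATTR, and Coh-Metrix. The function names, the toy sentence, the hand-labelled lexical words and verbs, and the window size are all illustrative assumptions, and real tools may tokenize, lemmatize, and tag parts of speech differently.

```python
import math


def lexical_density(tokens, lexical_tokens):
    """Lexical (content) word tokens divided by all word tokens."""
    return len(lexical_tokens) / len(tokens)


def corrected_ttr(tokens):
    """Corrected TTR: word types divided by sqrt(2 * word tokens)."""
    return len(set(tokens)) / math.sqrt(2 * len(tokens))


def squared_verb_variation(verb_tokens):
    """Squared verb variation-I: (verb types squared) / verb tokens."""
    return len(set(verb_tokens)) ** 2 / len(verb_tokens)


def mattr(tokens, window=50):
    """Moving-average TTR: mean TTR over every run of `window` tokens.

    The window size is an assumption; texts shorter than the window
    fall back to the plain TTR of the whole text.
    """
    if len(tokens) < window:
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Toy example; content words and verbs are labelled by hand (assumption:
# a real analysis would take these from a part-of-speech tagger).
tokens = "the cat sat on the mat and the dog sat on the rug".split()
lexical = ["cat", "sat", "mat", "dog", "sat", "rug"]
verbs = ["sat", "sat"]

print(round(lexical_density(tokens, lexical), 3))
print(round(corrected_ttr(tokens), 3))
print(squared_verb_variation(verbs))
print(round(mattr(tokens, window=5), 3))
```

Corrected TTR and the moving-average TTR both try to reduce the dependence of the plain type-token ratio on text length, which matters when compositions of different lengths are compared across grade levels.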

CITATION (APA)

Yang, Y., & Zheng, Z. (2024). A refined and concise model of indices for quantitatively measuring lexical richness of Chinese university students’ EFL writing. Contemporary Educational Technology, 16(3), ep513. https://doi.org/10.30935/cedtech/14707
