5.2.1 The Corpus
The present study examines the schematic structure of the opening section (Introduction and Literature Review) of Applied Linguistics master’s theses, and explores possible variations among three local communities, namely China, New Zealand and America. To fulfill these aims, we built a corpus containing the opening sections of 90 theses that have separate Introduction and Literature Review chapters. The Literature Review chapter goes under various names, such as Theoretical Background and Review of the Literature. Some theses have more than one chapter reviewing literature. In the present study, the chapter(s) that follow the Introduction and precede the Method, and that give an overview of the current body of knowledge on the topic, are regarded as the Literature Review chapter(s).
As shown in Table 9, the corpus contains 674,351 words in total and is made up of three sub-corpora representing the three local communities. Each sub-corpus contains 30 opening sections. The boxplots in Figure 6 illustrate the range of the length of the opening section in individual theses. New Zealand theses are longer on average than theses written by students in China and America. As shown in Figure 6, the opening sections of New Zealand theses (Mdn = 9,774) are also longer than the opening sections of theses written by students in China (Mdn = 5,714) and America (Mdn = 5,828). There is an outlier in the China sub-corpus, which is much longer than the others in the same sub-corpus (Zhou, 2013). Apart from this outlier, the data in the three sub-corpora are roughly normally distributed.
Table 9 Corpus description


Figure 6 Boxplots of the length of opening sections in the three sub-corpora
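The medians and the outlier described above are the kind of summary a boxplot encodes. The sketch below illustrates one way such values might be computed. It rests on several assumptions: the study does not state which software or outlier rule produced Figure 6, so the conventional 1.5 × IQR rule is used here; the function name `summarize_lengths` is hypothetical; and the word counts are invented for illustration and are not taken from the corpus.

```python
from statistics import median, quantiles


def summarize_lengths(word_counts):
    """Return the median and the 1.5 x IQR outliers for a list of word counts.

    Assumption: outliers are flagged with the conventional boxplot rule
    (values beyond 1.5 x IQR from the quartiles); the study itself does
    not say which rule was applied.
    """
    # Quartiles by linear interpolation over the observed data points.
    q1, _, q3 = quantiles(word_counts, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [n for n in word_counts if n < lower or n > upper]
    return {"median": median(word_counts), "outliers": outliers}


# Hypothetical opening-section lengths in words (NOT data from the corpus).
example_lengths = [4800, 5200, 5600, 5700, 5900, 6300, 6800, 15000]
summary = summarize_lengths(example_lengths)
print(summary)
```

With these invented values, the single very long text is flagged as an outlier while the median stays close to the bulk of the data, which is why the median is a suitable measure of typical length when a sub-corpus contains one unusually long thesis.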