5.2.3 Data Description and Statistical Tests

The present study describes the use of moves with two measures: (a) the number of theses containing each move; and (b) the length of each move in the three sub-corpora as well as in each thesis. Both measures are standardized as proportions for the purpose of comparison. The number of scripts containing moves/steps is a frequently used measure in previous genre analysis studies (e.g. Hu, 2010; Martín & León Pérez, 2014). The present study employs the length of moves as an additional measure for two reasons. First, the length of a move unit varies considerably. Since the number of occurrences cannot distinguish a move realised in one sentence from a move realised in one page, it cannot accurately reflect differences in how the moves are used. Second, given that a thesis is much longer than a research article, it is no surprise that the presence of the four moves and metatexts is relatively high in all sub-corpora, and thus no obvious difference is observed in the number of theses containing each move (see Table 10). However, as shown in Table 11, there are clear differences in the length of moves.
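
The two measures can be illustrated with a minimal sketch. It assumes that length is counted in words and standardized against the total length of the analysed section of each thesis; neither the unit nor the denominator is specified above, and all names and numbers are hypothetical rather than data from the study.

```python
# Hypothetical coding of one sub-corpus: for each thesis, the word count of
# each move and the total word count of the analysed section.
# All names and numbers are illustrative, not data from the study.
theses = [
    {"total_words": 2000, "moves": {"Move 1": 300, "Move 2": 500, "Move 3": 200}},
    {"total_words": 1500, "moves": {"Move 1": 150, "Move 3": 600}},
    {"total_words": 2500, "moves": {"Move 1": 250, "Move 2": 50, "Move 3": 500}},
]


def presence_proportion(theses, move):
    """Measure (a): share of theses that contain the move at least once."""
    containing = sum(1 for t in theses if t["moves"].get(move, 0) > 0)
    return containing / len(theses)


def standardized_lengths(theses, move):
    """Measure (b): length of the move as a proportion of each thesis.

    Assumption: the denominator is the total word count of the analysed
    section; a thesis without the move contributes a length of 0.
    """
    return [t["moves"].get(move, 0) / t["total_words"] for t in theses]


presence_move2 = presence_proportion(theses, "Move 2")
lengths_move2 = standardized_lengths(theses, "Move 2")
print(presence_move2)
print(lengths_move2)
```

In this made-up sub-corpus, two theses both count as containing Move 2, yet one devotes a quarter of its text to the move and the other only a small fraction, which is the kind of difference that only the length measure captures.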

Apart from the number of occurrences, some studies use frequency to describe the use of moves. However, as observed by many genre analysts, writers sometimes employ cyclic moves/steps in research articles (e.g. Bhatia, 1993; Swales, 1990). In my corpus, some thesis writers were observed to cycle moves/steps, while others were not. Since theses are much longer than research articles, there is a considerable difference between the length of a move/step unit that forms part of a cycle and the length of a move/step that stays on the same topic until its function is completed or its content is described. The same move/step may be realised in three lines or three pages. Therefore, frequency may not be the best measure for genre analysis of a thesis.

To describe the length of moves in each sub-corpus, the present study reports the median and interquartile range of each group rather than the mean. Whereas the standardized lengths of Moves 3 and 4 and of the metatexts of opening sections in the three sub-corpora are roughly normally distributed, there are some outliers in Moves 1, 2 and 5. Since outliers have a greater influence on the mean than on the median and can distort the overall picture, the median is a better measure in this case. Apart from the median, the present study also reports the upper and lower quartiles to give an indication of the variation among individual texts.
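
The effect of an outlier on the two summary statistics can be sketched as follows. The values are hypothetical, and the "inclusive" quartile method is only one of several conventions; which one was used in the study is not stated.

```python
import statistics

# Hypothetical standardized lengths of one move in one sub-corpus;
# the last value is a deliberately extreme outlier.
lengths = [0.10, 0.12, 0.11, 0.13, 0.12, 0.14, 0.11, 0.60]

mean = statistics.mean(lengths)
median = statistics.median(lengths)

# quantiles(n=4) returns three cut points: lower quartile, median, upper quartile.
# Assumption: the "inclusive" method; other quartile conventions give
# slightly different cut points.
q1, q2, q3 = statistics.quantiles(lengths, n=4, method="inclusive")
iqr = q3 - q1

print(f"mean={mean:.4f} median={median:.4f}")
print(f"lower quartile={q1:.4f} upper quartile={q3:.4f} IQR={iqr:.4f}")
```

With these made-up values the single outlier pulls the mean above the upper quartile, whereas the median stays among the typical values.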

Multivariate Analysis of Variance (MANOVA) was conducted to assess the significance of the differences observed in the standardized length of moves across the three sub-corpora. As mentioned above, there are some outliers in Moves 1, 2 and 5. Since MANOVA is a test based on means, outliers would skew the results. Therefore, winsorizing, a technique that makes the values of outliers less extreme and that has been used in business research (e.g. Watson, 1990) and medical sciences research (e.g. May, Van Putten, Jenden, Yale, & Dixon, 1981), was performed to reduce the influence of outliers and to help the dataset meet the multivariate normality requirement of MANOVA (Field, 2009).
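
A minimal sketch of winsorizing is given below. It assumes the symmetric, count-based variant, in which the k most extreme values at each end are replaced by the nearest remaining value; the variant and cut-off actually applied in the study are not specified above, and the function name, parameter and data are hypothetical. The MANOVA itself is not reproduced here.

```python
import statistics


def winsorize(values, k=1):
    """Return a copy of values with the k smallest and k largest values
    replaced by the nearest remaining value. The original order is kept.

    Assumption: symmetric, count-based winsorizing; other variants replace
    only flagged outliers or use percentile cut-offs.
    """
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must be non-negative and less than half the sample size")
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    # Clamp every value into the interval [low, high].
    return [min(max(v, low), high) for v in values]


# Hypothetical standardized lengths with one extreme value at the end.
lengths = [0.10, 0.12, 0.11, 0.13, 0.12, 0.14, 0.11, 0.60]
winsorized = winsorize(lengths, k=1)

print(winsorized)
print(statistics.mean(lengths), statistics.mean(winsorized))
```

The extreme value is pulled back to the largest remaining value, so the mean of the winsorized data moves closer to the median while the sample size stays the same. Note that this symmetric variant also adjusts the smallest value, even when it is not an outlier.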