Procedure
The research is designed to reveal the relations between discoursal semantic categories and their utterance patterns in authentic English discourse. Central to this investigation is the possibility of using words that have already been identified to predict the next word in a speech recognition task. In other words, if a speech recognition system has identified a word, and has therefore obtained its semantic category, it can predict the next word according to that category's discoursal pattern. To demonstrate the reliability of this approach on a large amount of data, three corpora are used: Bramshills, HCRC Map Task, and SPIDRE. In practice, all utterances that contain a specific keyword or phrase (a semantic category) are extracted from these three corpora by a statistical program designed for the purpose. We joined all 779 conversation segments of the three corpora, obtaining approximately 23 million words, and used them as the input data for the statistical analysis. Some semantic categories yielded a very large number of results. For example, the verb “have” appeared in 7050 utterances and accounted for about 0.64% of the overall frequency in these 2.5 million words. Some of the 7050 examples are given below:
· The best way to have a really good cup of tea is to...
· So we have a look at the photos now.
· Let's have a look at all of them I think.
· Have you?
· ...and they in fact have gone to the edge of the...
· Ah market scene let's have a look for somebody moving.
· What else have we got?
· Yeah let's have a look at something else.
As the above examples show, authentic discourse contains many “ungrammatical” or broken sentences. Only half of the above utterances can be treated as complete sentences; the other half are incomplete. These examples pinpoint one of the most difficult tasks for speech recognition and understanding: considerable effort is needed to reconstruct an utterance so that it fits the grammar of the recognition system.
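The extraction step and the next-word idea described above can be sketched as follows. This is a minimal illustration in Python, not the statistical program used in the study. The function names `utterances_with` and `next_word_counts`, the simple letter-and-apostrophe tokenisation, and the use of the eight sample utterances as input are all assumptions made for the example.

```python
import re
from collections import Counter

# The eight sample utterances quoted above (a tiny stand-in for the corpora).
SAMPLE = [
    "The best way to have a really good cup of tea is to...",
    "So we have a look at the photos now.",
    "Let's have a look at all of them I think.",
    "Have you?",
    "...and they in fact have gone to the edge of the...",
    "Ah market scene let's have a look for somebody moving.",
    "What else have we got?",
    "Yeah let's have a look at something else.",
]

def tokenize(utterance):
    """Lower-case an utterance and split it into word tokens (assumed scheme)."""
    return re.findall(r"[a-z']+", utterance.lower())

def utterances_with(keyword, utterances):
    """Return the utterances that contain `keyword` as a whole word."""
    return [u for u in utterances if keyword in tokenize(u)]

def next_word_counts(keyword, utterances):
    """Count the words that immediately follow `keyword`."""
    counts = Counter()
    for u in utterances:
        tokens = tokenize(u)
        for current, following in zip(tokens, tokens[1:]):
            if current == keyword:
                counts[following] += 1
    return counts

hits = utterances_with("have", SAMPLE)
followers = next_word_counts("have", hits)
print(len(hits))                 # 8
print(followers.most_common(1))  # [('a', 5)]
```

In the eight samples, “a” follows “have” five times. This is the kind of regularity the approach relies on, although counts from so few utterances are illustrative only.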
A parser is used to assign the syntactic status of the utterances automatically. Previous linguistic studies such as Levin's (1993) claimed that the semantics of a verb and its syntactic behaviour are predictably related, and Doug Jones's (1997) study further confirmed this relation. Recently, Rolland claimed that if words have the same dependent words in particular relations, they are similar in content. The question is whether this is a common phenomenon in natural authentic discourse and, if so, to what extent the relation exists. Both Levin's and Jones's studies were limited to Levin's 1525 examples, most of which are not authentic utterances. Through our own investigation, we believe that such an interface does exist between semantic categories and discoursal utterance patterns, but that the situation is more complicated: there are more alternative patterns and semantic categories than have been described in other studies. Compare some examples from our results with those of Levin and Jones:
Verbs of Creation and Transformation (Performance Verbs): the verb produce appears in the following syntactic patterns in Levin's and Jones's 1524 examples:
1-[np,v,np,pp(for)]
1-[np,v,np]
1-[np,v]
1-[np,v,np,np]
1-[np,v,np,pp(to)]
0-[np,v]
In our corpora, five of their patterns are confirmed (setting aside the features specific to authentic discourse):
1-[np,v,np,pp(in)]
1-[np,v,np]
1-[dm,np,v,np]
1-[dm,np,v,np,np]
0-[dm,np,v]
but the following patterns from our corpora are not among theirs:
1-[dm,np,pp,v,np]
1-[dm,pp(to),v,np]
1-[dm,np,v,pp(on)]
As the above examples show, authentic discourse contains more patterns than theirs. In particular, it has a large number of so-called expletive phrases such as “it seems”, “because”, “I think”, and “uh”. We labelled this special class of words DM (discourse markers); they are often used to express the speaker's evaluation of the previous utterances.
Turning to the relation between nouns and utterance patterns, we not only confirm the conventional view that most adjectives are followed by nouns or noun phrases but also list the varieties of these modifiers and their categories. An example is the Noun of Putting, arrangement:
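The comparison of the two pattern lists for produce can be sketched as below. The normalisation is an assumption made for this illustration and is not stated in the study: discourse markers (dm) are dropped, the preposition inside pp(...) is ignored, and the numeric prefix before each pattern is not interpreted. The names `parse` and `normalize` are hypothetical helpers.

```python
import re

def parse(pattern):
    """Turn '1-[dm,np,v,np]' into ('dm', 'np', 'v', 'np'); the numeric prefix is ignored."""
    inner = re.search(r"\[(.*)\]", pattern).group(1)
    return tuple(part for part in re.split(r"[,\s]+", inner) if part)

def normalize(elements):
    """Drop discourse markers and the preposition inside pp(...) (assumed normalisation)."""
    return tuple(re.sub(r"\(.*\)", "", e) for e in elements if e != "dm")

# Patterns for "produce" as listed for Levin and Jones.
LEVIN_JONES = [
    "1-[np,v,np,pp(for)]", "1-[np,v,np]", "1-[np,v]",
    "1-[np,v,np,np]", "1-[np,v,np,pp(to)]", "0-[np,v]",
]
# Patterns for "produce" as listed for the authentic corpora.
CORPUS = [
    "1-[np,v,np,pp(in)]", "1-[np,v,np]", "1-[dm,np,v,np]",
    "1-[dm,np,v,np,np]", "0-[dm,np,v]",
    "1-[dm,np,pp,v,np]", "1-[dm,pp(to),v,np]", "1-[dm,np,v,pp(on)]",
]

known = {normalize(parse(p)) for p in LEVIN_JONES}
shared = [p for p in CORPUS if normalize(parse(p)) in known]
new = [p for p in CORPUS if normalize(parse(p)) not in known]

print(len(shared), len(new))  # 5 3
```

Under these assumptions, five of the corpus patterns reduce to a pattern in the Levin and Jones list and three do not. A stricter normalisation, for example one that keeps the preposition, would give a different split.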
1-[np,v,np,np][1]
1-[dm,np,v,adj,np]
1-[utt,np,pp(of)]
1-[dm,np,pp(on)]
1-[dm,adj,np]
1-[dm,expl,np,np]
1-[dm,adj,np,adv]
1-[dm,np,dm,pp(in),np]
1-[dm,v,np,np,np]
Note: the bold np represents the position of the noun arrangement.
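The claim that adjectives are followed by nouns or noun phrases can be checked against the nine patterns listed for arrangement with a short sketch. The list-of-lists encoding and the helper name `followers_of` are assumptions made for this illustration. The check covers only the patterns shown here, not the full corpus results.

```python
# The nine patterns listed above for the noun "arrangement".
ARRANGEMENT_PATTERNS = [
    ["np", "v", "np", "np"],
    ["dm", "np", "v", "adj", "np"],
    ["utt", "np", "pp(of)"],
    ["dm", "np", "pp(on)"],
    ["dm", "adj", "np"],
    ["dm", "expl", "np", "np"],
    ["dm", "adj", "np", "adv"],
    ["dm", "np", "dm", "pp(in)", "np"],
    ["dm", "v", "np", "np", "np"],
]

def followers_of(category, patterns):
    """List the element that comes directly after each occurrence of `category`."""
    found = []
    for pattern in patterns:
        for current, following in zip(pattern, pattern[1:]):
            if current == category:
                found.append(following)
    return found

after_adj = followers_of("adj", ARRANGEMENT_PATTERNS)
print(after_adj)  # ['np', 'np', 'np']
```

In these nine patterns, every adj is immediately followed by an np.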