Amy Neale
One of the major products of my research is the resource that has come to be known as the Process Type Database (PTDB).
This body of data, which currently models almost 5,400 Process types (i.e. verb senses), provides the data for my research and is also a resource available for further development and for consultation by other grammarians and text analysts.
The PTDB is a living document, which can be altered as changes in usage occur, and to which new entries can be made.
A static description of TRANSITIVITY is only of use to a research project that aims to model a limited representation of language, and this is not the aim of the COMMUNAL project (the project to which my research results will contribute).
The database is intended to be a representative list of the most frequently used, and therefore most useful, Process types and their associated Participant Roles (PRs), and it is based on text corpora.
The Process Type Database: What it covers
The Process Type Database: Guidelines For Use
The Process Type Database takes the form of an alphabetical list. The entry for each verb sense occupies a separate row in the spreadsheet, and each column provides a different level of analysis, as follows:
Column A:
The first column is an alphabetical list of verb forms. This ordering makes it
easy to search for a particular verb sense.
Column B:
Column B gives the number of occurrences of the verb form, as reported by
West (1953). For a more detailed description of the use of West in creating this
database, refer to Chapter 7. The 'occurrences of form' figures in this list
show how frequently the verb form occurred in a corpus of 5 million words.
Column C:
The information in this column is taken from two sources. The first is
Francis et al (1996), Collins COBUILD Grammar Patterns 1: Verbs. It is
given as 'C0', 'C1', 'C2', 'C3', 'C4' or 'C5', representing the Band
distinctions that Francis et al recognised in their 'verb index' (see Section
7.3.1.3 of Chapter 7 for more detail), or as '-c' if the verb form does not
occur in Francis et al (1996). All of the C5 entries in this column also include
a figure, which indicates the total number of occurrences of that verb form
in the Bank of English.
The other source of information given in this column is Biber et al (1999), The Longman Grammar Of Spoken And Written English. This publication provides frequency information for 130 multi-word verb forms as they occur in a corpus of 1 million words, and all of this frequency information is given in Column C as 'Lo.n'.
Column D:
Column D provides the verb SENSE information in the form of a gloss of the meaning
and an example of use (taken from the Collins COBUILD English Dictionary,
2nd ed). This information is essential for determining how many
different senses of each verb form are recognised.
Column E:
The next column provides figures for the occurrences of the verb senses. These
figures are adapted from West (1953), as described in Chapter 7, and show how
many times the particular verb sense occurred in his 5 million word corpus.
Column F:
Column F provides the main analysis for each verb sense. This is the most useful
column for the research presented in this thesis, because it is possible to
search for Process types. By using the filter facility in Excel it is possible
to compile lists of all the verb senses in the PTDB that are a particular Process
type. Excel's 'AutoFilter' has been set up in the database. To use it,
click on the AutoFilter arrow at the very top of the PTDB, to the right of
Column F's column label. A drop-down list will then appear showing every option
that occurs in that column. The list for Column F starts as follows:
'all
top 10
custom
?
attributive, affected carrier
attributive, agent carrier
attributive, plus 3 p Ag
attributive, simple carrier
etc '
Clicking on an entry in this list displays all, and only, the occurrences of that particular entry in the entire PTDB. For example, clicking 'two role plus affected' displays a list of all the verb senses that formed the basis for the system network presented in Appendix B2 of the thesis: the system network for 'two role plus affected'.
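For readers who prefer to work outside Excel, the same kind of selection can be sketched in a few lines of Python. This is only an illustrative sketch: the field names `verb_form` and `process_type`, the helper functions, and the placeholder rows are all assumptions made for the example and are not taken from the PTDB itself.

```python
# Hypothetical stand-in for rows of the PTDB. The field names and the
# placeholder verb forms are assumptions for illustration only; they are
# not real PTDB entries. "process_type" plays the role of Column F.
PTDB_ROWS = [
    {"verb_form": "verb_a", "process_type": "two role plus affected"},
    {"verb_form": "verb_b", "process_type": "attributive, simple carrier"},
    {"verb_form": "verb_c", "process_type": "two role plus affected"},
    {"verb_form": "verb_d", "process_type": "attributive, agent carrier"},
]


def list_process_types(rows):
    """Return the distinct Column F values, sorted, much like the drop-down list."""
    return sorted({row["process_type"] for row in rows})


def filter_by_process_type(rows, process_type):
    """Return all, and only, the rows whose Column F value equals process_type."""
    return [row for row in rows if row["process_type"] == process_type]


selected = filter_by_process_type(PTDB_ROWS, "two role plus affected")
print([row["verb_form"] for row in selected])
```

To apply this to the real data, the spreadsheet would first need to be exported (for instance to CSV and read with `csv.DictReader`), with the field names adjusted to match the actual column labels.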
Column G:
This column provides Levin's (1993) approximate analysis for all of the verb
senses in the PTDB that she also considers.
Column H:
Column H provides the Participant Role (PR) configuration for each verb sense.
This configuration can also be determined from the information in Column F:
Cardiff Grammar Feature. However, Column H provides additional information about
the likelihood of the covertness and overtness of the PRs, as described in Section
7.6 of Chapter 7.
Column I:
The final column, Column I, provides a place for any notes that needed to be
recorded during the compilation and analysis of the PTDB.
The current version of the PTDB can be downloaded here. It is my intention that this should be a living document, and so changes and additions will be made.