Abstract
A procedure is presented that processes a corpus of text and produces, for each word, a numeric vector containing information about its meaning. This procedure is applied to a large corpus of natural language text taken from Usenet, and the resulting vectors are examined to determine what information they contain. The vectors provide the coordinates in a high-dimensional space in which word relationships can be analyzed. Analyses of vector similarity and multidimensional scaling both demonstrate that the vectors carry significant semantic information. A comparison of vector similarity with human reaction times in a single-word priming experiment is also presented. These vectors provide the basis for a representational model of semantic memory, the hyperspace analogue to language (HAL).
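The following is a minimal sketch of the kind of procedure the abstract describes: it turns a token sequence into co-occurrence vectors and compares them by distance. The abstract gives no parameters, so everything specific here is an assumption made for illustration: the window size, the linear distance weighting, the concatenation of preceding-context and following-context counts, the use of Euclidean distance, and the function names `cooccurrence_vectors` and `euclidean`.

```python
import math


def cooccurrence_vectors(tokens, window=4):
    """Return a dict mapping each word to a co-occurrence vector.

    Assumptions (not taken from the abstract): a sliding window of
    `window` preceding words, with closer words weighted more heavily.
    """
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)

    # counts[i][j]: weighted count of word j occurring before word i
    counts = [[0.0] * n for _ in range(n)]
    for pos, word in enumerate(tokens):
        for dist in range(1, window + 1):
            if pos - dist < 0:
                break
            prev = tokens[pos - dist]
            # Adjacent words get weight `window`, the farthest gets 1.
            counts[index[word]][index[prev]] += window - dist + 1

    vectors = {}
    for word, i in index.items():
        row = counts[i]                          # preceding context
        col = [counts[j][i] for j in range(n)]   # following context
        vectors[word] = row + col
    return vectors


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    toy_corpus = "the cat sat the dog sat".split()
    vecs = cooccurrence_vectors(toy_corpus, window=2)
    for other in ("dog", "sat", "the"):
        print("cat", other, euclidean(vecs["cat"], vecs[other]))
```

In this toy corpus, words that appear in similar contexts ("cat" and "dog") end up closer together than words that do not, which is the intuition behind analyzing word relationships as distances in the vector space. A real corpus would need tokenization, a vocabulary cutoff, and normalization, none of which are shown here.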
Additional information
This research was supported by an NSF Presidential Faculty Fellow award (SBR-9453406) to C.B.
About this article
Cite this article
Lund, K., Burgess, C. Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers 28, 203–208 (1996). https://doi.org/10.3758/BF03204766
DOI: https://doi.org/10.3758/BF03204766