How AI Learned to Think in Vectors

12 min read

The Geometry of Meaning: How High-Dimensional Vectors Are Reshaping Our Understanding of Language

We humans navigate a world saturated with meaning. We intuitively grasp that "sofa" is closer in meaning to "chair" than to "star," and that "joyful" echoes "happy." But how do we represent this complex web of semantic relationships in our minds? For centuries, philosophers and linguists have debated the nature of meaning. Now, concepts from computer science and mathematics are offering a radically new perspective: What if meaning could be mapped geometrically, represented as points or vectors within a vast, high-dimensional space?

This isn't just science fiction; it's a concept revolutionizing fields from Artificial Intelligence to Cognitive Science. Let's explore this fascinating idea of "meaning as a vector," how it connects to older theories, and what it implies for our understanding of the human mind.

From Philosophy to Vectors: Shifting Perspectives on Meaning

Traditionally, understanding meaning involved grappling with concepts like sense and reference, famously distinguished by philosopher Gottlob Frege [13]. "Reference" is the actual thing or concept a word points to in the world (e.g., both "the morning star" and "the evening star" refer to the planet Venus). "Sense" is the way that object is presented or conceived (each phrase presents Venus differently) [13, 14]. Theories often focused on definitions, necessary and sufficient conditions, or how words relate to mental concepts.

The vector space approach emerges from a different starting point: the distributional hypothesis. Popularized in linguistics in the mid-20th century, its core idea is simple yet profound: in J. R. Firth's phrase, "you shall know a word by the company it keeps" [1]. Words that frequently appear in similar linguistic contexts are likely to have similar meanings [1, 2]. "Doctor" and "nurse" often co-occur with "hospital," "patient," and "treatment," suggesting they inhabit a related semantic neighborhood, distinct from words like "car" or "planet," which keep different company.

This aligns, at least somewhat, with the philosophical notion of "meaning as use" associated with the later Wittgenstein: a word's meaning derives from its function and context within language, rather than from some inherent essence [2].

Building the Semantic Space: How Vectors Capture Meaning

The distributional hypothesis is computationally realized through Vector Space Models (VSMs), also known as semantic embeddings [1, 2]. Imagine a space with potentially hundreds or thousands of dimensions. Each dimension might represent a specific contextual feature – often, another word [1, 2].

A word like "dog" is then represented as a vector – essentially a list of numbers or coordinates – in this high-dimensional space [1]. Each number indicates the strength of association between "dog" and a particular contextual dimension (e.g., how often does "dog" appear near "bark," "walk," "pet," vs. "meow" or "engine"?).

For a simplified illustration, picture vectors for "dog," "cat," and "car" laid out across context dimensions such as "bark," "meow," "wheels," "pet," and "engine": "dog" and "cat" would both score highly on "pet," while only "car" would score highly on "wheels" and "engine."

Words with similar meanings, because they appear in similar contexts, will have similar vector coordinates and thus end up "closer" together in this semantic space [1, 2]. This geometric proximity becomes a quantifiable measure of semantic similarity, often calculated using metrics like cosine similarity [1].
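
To make the geometry concrete, here is a minimal sketch in Python, using NumPy and purely hypothetical co-occurrence counts (the dog/cat/car dimensions above are illustrative assumptions, not real corpus statistics):

```python
import numpy as np

# Hypothetical co-occurrence counts over the context dimensions
# ["bark", "meow", "wheels", "pet", "engine"] -- toy numbers for illustration only.
vectors = {
    "dog": np.array([12.0, 0.0, 0.0, 9.0, 0.0]),
    "cat": np.array([0.0, 15.0, 0.0, 8.0, 0.0]),
    "car": np.array([0.0, 0.0, 11.0, 0.0, 10.0]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means pointing the same way."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors["dog"], vectors["cat"]))  # ~0.28: shared "pet" context
print(cosine_similarity(vectors["dog"], vectors["car"]))  # 0.0: no shared contexts
```

Even with toy numbers, the geometry does the work: "dog" and "cat" point partly in the same direction, while "dog" and "car" are orthogonal.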

This basic idea extends beyond single words. Sentence embeddings aim to capture the overall meaning of an entire sentence as a single vector, often by combining the vectors of its constituent words in sophisticated ways [2, 9].
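
As a quick, hedged illustration of sentence embeddings in practice, the sketch below assumes the open-source sentence-transformers library and its pretrained all-MiniLM-L6-v2 model are available; it is one convenient option among many, not the definitive method:

```python
# Sketch only: requires `pip install sentence-transformers`; the first run
# downloads the pretrained model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The doctor treated the patient at the hospital.",
    "A physician cared for someone who was ill.",
    "The car would not start this morning.",
]
embeddings = model.encode(sentences)  # one fixed-length vector per sentence

# Paraphrases end up closer together than unrelated sentences.
print(util.cos_sim(embeddings[0], embeddings[1]))  # relatively high
print(util.cos_sim(embeddings[0], embeddings[2]))  # noticeably lower
```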

Mapping Vectors onto Minds and Meaning

This geometric view offers powerful advantages:

  1. Quantifying Similarity: We can now numerically calculate how similar words or sentences are, enabling tasks impossible with purely qualitative theories [1]. This powers fascinating demonstrations, such as showing that the vector for "king" - "man" + "woman" lands very close to the vector for "queen," capturing analogy [1] (see the sketch after this list).
  2. NLP Applications: These embeddings are the bedrock of modern Natural Language Processing (NLP), enabling better machine translation, information retrieval (search engines), sentiment analysis, question answering, and more [1, 2, 4, 9].
  3. Cognitive Science Insights: Crucially for cognitive science, these models show promise in reflecting human cognition. The structure of these vector spaces often aligns with human judgments of similarity [1, 8]. Some researchers propose that our own conceptual knowledge might be organized geometrically in "conceptual spaces" [25, 34]. Excitingly, studies are finding correlations between the geometry of word embeddings and patterns of neural activity (fMRI/EEG), suggesting our brains might actually employ high-dimensional representations for meaning [5, 31, 33].
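
The "king" and "queen" demonstration from point 1 can be reproduced in a few lines. This is a sketch assuming the gensim library and its downloadable pretrained GloVe vectors ("glove-wiki-gigaword-50"); any reasonable set of pretrained word embeddings would do:

```python
# Sketch only: requires `pip install gensim`; the first call downloads
# pretrained GloVe word vectors via gensim's downloader.
import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-50")  # 50-dimensional word vectors

# king - man + woman: positive terms are added, negative terms subtracted.
print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# "queen" typically appears at or near the top of the returned list.
```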

Philosophical Wrinkles and Cognitive Challenges

Despite its power, the vector space model isn't a perfect theory of meaning and faces significant hurdles:

  1. Sense and Polysemy: Can a single vector capture Frege's "sense," especially for words with multiple meanings (polysemy)? Early models struggled, averaging meanings together [2]. Newer contextualized embeddings (like those from BERT or GPT models) generate a different vector for a word depending on its sentence context, getting closer to capturing different senses, but the debate continues [1, 2] (see the first sketch after this list).
  2. Reference and Grounding: How do these vectors, derived from text, connect to the real world (Frege's "reference")? This is the grounding problem [5]. Can a model that only sees text truly understand what a "chair" is, or what "sitting" feels like, without sensory experience? This lack of grounding in perception and action is a major critique, especially for concepts tied to embodiment [5, 21]. Research into multi-modal models (combining text with images or other data) attempts to address this [2].
  3. Compositionality: How do we combine word vectors to accurately represent complex sentence meanings? The meaning of "dog bites man" is different from that of "man bites dog," though the words are the same. Simple vector addition or averaging often isn't enough to capture syntactic structure and logical relations [2, 12] (see the second sketch after this list). This remains an active research area [15, 28].
  4. Philosophical Objections: Distributional semantics faces deeper challenges still. The instability objection suggests that if meaning is purely relational, changing one word's meaning could destabilize the entire system [3, 29]. Furthermore, critics invoke Searle's Chinese Room argument: do these models understand meaning, or merely manipulate symbols expertly on the basis of statistical patterns [20]? Is co-occurrence structure merely correlated with meaning, or does it actually constitute meaning [20]?
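
Two of these challenges are easy to see in miniature. First, on polysemy (point 1): a contextualized model assigns "bank" different vectors in different sentences. The sketch below assumes the Hugging Face transformers library, PyTorch, and the bert-base-uncased checkpoint are available:

```python
# Sketch only: requires `pip install transformers torch`; downloads bert-base-uncased.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def contextual_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the contextualized vector for `word` within `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # one vector per token
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

river_bank = contextual_vector("he sat on the bank of the river", "bank")
money_bank = contextual_vector("he deposited cash at the bank", "bank")

# Same word, different contexts, different vectors (similarity well below 1.0).
print(torch.cosine_similarity(river_bank, money_bank, dim=0))
```

Second, on compositionality (point 3): with hypothetical three-dimensional word vectors, a bag-of-words average assigns exactly the same vector to "dog bites man" and "man bites dog":

```python
import numpy as np

# Hypothetical 3-dimensional word vectors, for illustration only.
word_vectors = {
    "dog":   np.array([1.0, 0.2, 0.0]),
    "bites": np.array([0.1, 1.0, 0.3]),
    "man":   np.array([0.4, 0.1, 1.0]),
}

def average_embedding(sentence: str) -> np.ndarray:
    """Order-insensitive sentence vector: the mean of the word vectors."""
    return np.mean([word_vectors[w] for w in sentence.split()], axis=0)

print(np.allclose(average_embedding("dog bites man"),
                  average_embedding("man bites dog")))  # True: word order is lost
```

Tensor-based and syntax-aware compositional models are, in large part, attempts to put that lost structure back [15, 28].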

Conclusion: A New Geometry for Thought?

Representing meaning as vectors in high-dimensional space is a paradigm shift. It provides a mathematically rigorous, computationally tractable way to model semantic relationships, bridging philosophy, computer science, and cognitive science. It allows us to quantify meaning, power sophisticated AI, and potentially offers a window into the geometric nature of conceptual representation in the human brain [5, 8, 25].

However, crucial challenges remain. Fully capturing the nuances of sense, grounding meaning in real-world experience, achieving robust compositionality, and answering fundamental philosophical questions about the nature of understanding are frontiers yet to be conquered [2, 5, 12, 20].

The journey to understand meaning is far from over, but the high-dimensional landscapes revealed by vector space models provide a compelling, powerful, and perhaps even neurologically plausible new map for exploring the territory of thought and language.


References

[1] Jurafsky, D., & Martin, J. H. Speech and Language Processing (3rd ed. draft), Chapter 6. Accessed April 2, 2025. https://web.stanford.edu/~jurafsky/slp3/6.pdf

[2] Turney, P. D., & Pantel, P. (2010). From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research, 37, 141-188. (Similar content also discussed in this survey: https://www.researchgate.net/publication/264474244_Vector_Space_Models_of_Word_Meaning_and_Phrase_Meaning_A_Survey)

[3] Rabern, B., & Ball, D. (2024). ChatGPT and the Philosophy of Linguistic Meaning. University of Reading. Accessed April 2, 2025. https://research.reading.ac.uk/digitalhumanities/chatgpt-and-the-philosophy-of-linguistic-meaning/

[4] Lin, D. (1998). Automatic Retrieval and Clustering of Similar Words. Proceedings of COLING-ACL. (Conceptually related to vector space models for lexical meaning, e.g., https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=e65afd4a071c6238e31ce0aebe1576cd47199384)

[5] Grand, G., Blank, I. A., Pereira, F., & Fedorenko, E. (2023). Semantic projection recovers rich human knowledge of multiple object features from word embeddings. Nature Human Behaviour, 7(8), 1394-1408. https://pmc.ncbi.nlm.nih.gov/articles/PMC10349641/

[8] Turney, P. D., & Pantel, P. (2010). From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research, 37, 141-188. https://www.microsoft.com/en-us/research/wp-content/uploads/2017/07/jair10.pdf

[9] Su, J., et al. (2024). Simple Techniques for Enhancing Sentence Embeddings in Generative Language Models. arXiv preprint arXiv:2404.03921. https://arxiv.org/html/2404.03921v2

[12] Lake, B. M., & Murphy, G. L. (2024). Compositionality and Sentence Meaning: Comparing Semantic Parsing and Transformers on a Challenging Sentence Similarity Dataset. Computational Linguistics, 1-40. https://direct.mit.edu/coli/article/doi/10.1162/coli_a_00536/124463/Compositionality-and-Sentence-Meaning-Comparing

[13] Frege, G. (1892). Über Sinn und Bedeutung (On Sense and Reference). Discussion referenced, e.g., https://philosophy.stackexchange.com/questions/121691/formal-definition-of-sense-in-regards-to-freges-sense-vs-reference-distinction

[14] Textor, M. (2011). Sense, Reference, and Psychological Epiphenomena in Frege's Semantics (1892). In Frege on Sense and Reference. Cambridge University Press. https://www.cambridge.org/core/books/frege/sense-reference-and-psychological-epiphenomena-in-freges-semantics-1892/4B72C14A24E27DD59C1D36F4C1D268B8

[15] Coecke, B., Sadrzadeh, M., & Clark, S. (2010). Mathematical Foundations for a Compositional Distributional Model of Meaning. Linguistic Analysis, 36(1-4), 345-384. (Conceptual basis for compositionality challenges, e.g., https://www.youtube.com/watch?v=xCzmOyB5J1o)

[20] Lenci, A. (2018). Distributional Models of Word Meaning. Annual Review of Linguistics, 4, 151-171. (Related discussion: https://www.thecrowned.org/the-distributional-hypothesis-semantic-models-in-theory-and-practice)

[21] Bergen, B. K. (2022). Distributional Semantics Still Can't Account for Affordances. Proceedings of the Annual Meeting of the Cognitive Science Society, 44. https://pages.ucsd.edu/~bkbergen/papers/cogsci_2022_nlm_affordances_final.pdf

[25] Piantadosi, S. T. (2024). Why Concepts Are (Probably) Vectors. Trends in Cognitive Sciences. (Related: https://colala.berkeley.edu/papers/piantadosi2024why.pdf)

[28] Rudolph, M., & Giesbrecht, E. (2016). Compositional Matrix-Space Models of Language: Definitions, Properties, and Learning Methods. Natural Language Engineering, 22(4), 539-571. https://www.cambridge.org/core/journals/natural-language-engineering/article/compositional-matrixspace-models-of-language-definitions-properties-and-learning-methods/40972532DD4C2F61F209B45EBA743366

[29] Sahlgren, M. (2024). Distributional Semantics, Holism, and the Instability of Meaning. Minds and Machines. https://www.researchgate.net/publication/380729776_Distributional_Semantics_Holism_and_the_Instability_of_Meaning

[31] Pereira, F., et al. (2018). Toward a Universal Decoder of Conceptual Representations from Human Brain Activity. Nature Communications, 9(1), 963. (Related concept: https://pmc.ncbi.nlm.nih.gov/articles/PMC8439276/)

[33] Bao, R., et al. (2021). Semantics in High-Dimensional Space. Frontiers in Artificial Intelligence, 4, 698809. https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2021.698809/full

[34] Gärdenfors, P. (2000). Conceptual Spaces: The Geometry of Thought. MIT Press. (Related: https://www.raubal.ethz.ch/Publications/RefConferences/Raubal_CognitiveModelingWithConceptualSpaces_08.pdf)
