Sparse Distributed Memory

April 2, 2018 - Machine Learning, AI, Maths, SDM

This is the first in what I hope becomes a series of posts about sparse distributed memory (SDM), the development of a database engine based on it, and a DSL to query it.

I have been working with SDM since 2012 and thought it about time to publish my work, both the software I have been developing, sdmdb, and some notes about its history and applications.

SDM was first proposed as a model for long-term human memory by Pentti Kanerva in his book Sparse Distributed Memory, MIT Press, 1988 (I’ve provided a brief mathematical overview here). This work has proven to be of value in text mining applications, as described in papers Dominic Widdows wrote whilst a post-doc at Stanford and in his book The Geometry of Meaning.

The remarkable properties of SDM, and indeed Kanerva’s assertion that it should be modelled with bit vectors rather than, e.g., floating-point arithmetic, continue to be validated from a modelling perspective, and bit vectors offer computational advantages as well.
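
As a rough illustration of the computational point, here is a minimal Python sketch of comparing two bit vectors. It is not taken from sdmdb; the function names, the 10,000-bit width, and the choice to pack each vector into a Python integer are all assumptions made for illustration. The distance is the Hamming distance, which needs only an XOR and a count of set bits, with no floating-point arithmetic.

```python
import random


def random_bit_vector(n_bits, rng):
    """Return a uniformly random bit vector of n_bits, packed into an int (illustrative helper)."""
    return rng.getrandbits(n_bits)


def hamming(a, b):
    """Hamming distance: XOR marks the differing bits, then count the ones."""
    return bin(a ^ b).count("1")


N_BITS = 10_000          # assumed width, chosen only for this example
rng = random.Random(0)   # fixed seed so the run is repeatable

x = random_bit_vector(N_BITS, rng)
y = random_bit_vector(N_BITS, rng)

print(hamming(x, x))     # a vector is at distance 0 from itself
print(hamming(x, y))     # distance between two unrelated random vectors
```

Two unrelated random vectors of this width differ in roughly half of their bits, so a vector that lands much closer than that to a stored one stands out clearly.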

In later posts I’ll outline some of the details of my own implementation, which avoids both re-normalization and saturation of vectors during training, and some of the latest developments, which allow more expressive power in the models. I’ll also argue why I prefer this transparent approach to semantic embeddings over, e.g., text2vec.