Abstract
Probabilistic proposals of Languages of Thought (LoTs) can explain learning across different domains as statistical inference over a compositionally structured hypothesis space. While frameworks differ on how a LoT may be implemented computationally, they all share the property that a LoT is built from a set of atomic symbols together with rules by which these symbols can be combined. In this work we show how the set of productions of a LoT grammar can be effectively selected from a broad repertoire of possible productions by an inferential process that starts from experimental data. We then test this method on the language of geometry, a specific LoT model (Amalric et al., 2017). Finally, despite the fact that the geometrical LoT is not a universal (i.e., Turing-complete) language, we show an empirical relation between a sequence's probability and its complexity that is consistent with the theoretical relationship described by the Coding Theorem for universal languages.
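For reference, a standard statement of the Coding Theorem relation invoked above is sketched below (this formulation is assumed and is not part of the original abstract): for a universal prefix machine $U$, the algorithmic probability of a string $x$ and its prefix Kolmogorov complexity $K_U(x)$ are related by

```latex
% Standard statement of the Coding Theorem (assumed formulation,
% added here for reference; not part of the original abstract).
% For a universal prefix machine U:
%   P_U(x) -- algorithmic probability of the string x,
%   K_U(x) -- prefix Kolmogorov complexity of x.
\[
  P_U(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-|p|},
  \qquad
  K_U(x) \;=\; -\log_2 P_U(x) \;+\; O(1).
\]
```

In words: under a universal language, high-probability sequences are, up to an additive constant, exactly the low-complexity ones; the abstract's empirical claim is that a similar probability-complexity relation holds for the (non-universal) geometrical LoT.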