Abstract
The earliest human graphic productions dating to the Lower and Middle Palaeolithic are associated with anatomically modern and archaic hominins. These productions, which consist of abstract patterns engraved on a variety of media, may have been used as symbols, and their emergence is thought to be associated with the evolution of the properties of the visual cortex. To test this hypothesis, we used functional magnetic resonance imaging to compare brain activations triggered by the perception of engraved patterns dating between 540,000 and 30,000 years before the present with those elicited by the perception of scenes, objects, symbol-like characters, and written words. The perception of the engravings bilaterally activated regions in the ventral route in a pattern similar to that produced by the perception of objects, suggesting that these graphic productions are processed as organized visual representations in the brain. Moreover, the perception of the engravings led to a leftward activation of the visual word form area. These results support the hypothesis that in contrast to random doodles, the earliest abstract graphic productions had a representational purpose for modern and archaic hominins.
From Palaeolithic rock paintings to contemporary art, abstract and figurative graphic productions have represented a major aspect of human cognitive activity. The ability to produce visual representations was assumed to have emerged at the beginning of the Upper Palaeolithic era approximately 42,000 years BP (Before Present) due to a sudden genetic mutation occurring only in anatomically modern human populations1. The subsequent discoveries of abstract paintings and engravings older than 42,000 years in Africa and Eurasia suggest that this ability emerged much earlier and was shared by different fossil species, including Homo erectus2, Neanderthals3,4 and Early African Homo sapiens. Analyses of several early abstract patterns have shown that these patterns were produced deliberately and had no apparent utilitarian function2,3,5,6, leading some authors7 to argue that these ancient abstract patterns represent the earliest instances of symbol use a thousand years prior to the earliest known 40- to 30,000-year-old examples of depictional representations8. The production of abstract patterns has been linked to the evolution of the primary visual areas and surrounding cortex9,10. Alternatively, the complexification of graphic productions from simple geometric to depictional representations may reflect the successive evolutionary stages of visual information processing through the primary visual cortex to the occipito-temporal cortex via the ventral route. The ventral route is involved in the extraction of visual information and allows the recognition of visual categories, such as shapes, scenes, and words (see11 and12 for a review). If the abstract patterns intentionally engraved by hominins were not random doodles and were perceived as structured forms with a potential meaning, their perception should engage the ventral route cortex, which extends anteriorly to the early visual areas in the occipito-temporal pathway. We test this hypothesis in the present study, which aims to characterize the cerebral regions involved in the perception of the earliest engraved abstract graphisms (EEAG). The blood oxygen level dependent (BOLD) signal was mapped in twenty-seven healthy volunteers using functional magnetic resonance imaging (fMRI), while the individuals were presented with tracings of EEAG dating between 540 ka and 30 ka (See Table S1). The perception of these patterns was compared to the perception of their scrambled version, in which the genuine organization of the abstract patterns was lost to assess whether the geometric organization of the EEAG (albeit some patterns were very simple) could be differentiated at the visual cortex level from patterns with no perceptual organization. Another aim of the study was to shed light on the nature of the EEAG. Therefore, we assessed whether the regions activated by the perception of EEAG were also involved in the processing of other visual categories visually presented to the same participants, including a scrambled version of each stimulus belonging to each categoriy (Figure 1).
The first category included pictures of nameable human-made artefacts. These stimuli conveyed semantic and lexical information but no symbolic content. The rationale for this choice is that proximity between the patterns of activation of the EEAG and objects could support the representational character of the EEAG, which could also indicate that the EEAG were processed as a whole rather than a collection of lines, fulfilling the “good shape” criteria of the gestalt theory of perception13,14. The second category was represented by outdoor scenes depicting largescale surroundings. These stimuli provided information regarding the outer world and conveyed semantic content without being perceived as a single item. In addition, two symbolic categories were included in the study. The first symbolic category included chains of characters from the linear B syllabic script, which is a writing system unknown to the participants, and these characters were chosen as stimuli perceptible as ‘potentially’ symbolic. A similar profile of activation between the EEAG and linear B script stimuli could suggest that the EEAG are processed as symbols or signs that result from a combination of structurally simple elements. The second symbolic category was represented by two-syllable words. This condition extended the requirements of processing the linear B alphabet because the perception of familiar individual overlearned signs (letters) and familiar combinations trigger access to the pronunciation and meaning of words. In addition, word reading is characterized by a leftward asymmetric activation of the ventral route that reflects symbol processing and lexical and semantic access15,16. Thus, this condition provides a pattern of leftward asymmetry that could be compared with the asymmetry elicited by the perception of EEAG. The regional BOLD values in response to the engravings minus the scramble contrast were extracted from each homotopic region (hROI) of the functional homotopic atlas AICHA17. The regions included in the analysis fulfilled the following two criteria: in at least one hemisphere, a significant positive value was observed in response to both the engravings minus scramble contrast (p < 0.05 False discovery rate, FDR corrected, one-sample two-sided t-test) and engravings minus fixation (see methods) contrast (p < 0.05 uncorrected, one- sample two-sided t-test). The fulfilment of these conditions ensured that regions with stronger deactivation under the reference condition (Scramble) during the EEAG comparison with the fixation were not selected.
Ten hROIs were significantly activated in either the right or left hemisphere in the engravings minus scramble contrast. All hROIs were located in the visual ventral pathway (Table 1, Table S2 and Figure 2). Thus, the perceptual processing of EEAG compared to their scrambled version did not involve primary visual areas but did involve visual regions beyond the primary visual and peristriate areas. Previous work has led authors to propose that the early visual cortex may play a role in the extraction of the geometric features composing EEAG9,10. However, the present work shows that the activity generated by the perception of scrambled and intact EEAG in the early visual cortex does not substantially differ. This similarity suggests that the evolution of the properties of the primary visual cortex was unlikely to play a central role in the emergence of EEAG or that the latter triggered the former.
Although EEAG are not recognized as existing entities, their perception elicited activation in anterior regions of the ventral pathway sensitive to distinct visual categories11,12,18,19. Since the EEAG and their scrambled version differed because the latter lacked geometric organization, the level of perceptual organization required for EEAG appears to be sufficiently high to recruit higher order visual areas. The involvement of the visual ventral pathway indicates that information processing requires more than the mere identification of visual primitives, such as the simple extraction of edges, oriented segments, or ends of lines.
Interestingly, in the left hemisphere, the 10 regions activated in the EEAG minus scramble contrast were also significantly activated by the perception of objects compared to their scrambled version (Table 1 and Figure 2). Repeated-measures ANOVA comparing EEAG and object perception conditions showed that there was no hROIs x condition interaction effect on the BOLD values (F(9,225) = 1.6, p = 0.11), reflecting the similarity of activation between these two conditions in the left hemisphere. In the right hemisphere, a condition by region interaction was present (F(9,225) = 2.4, p = 0.02) due to a significantly lower BOLD value in the right inferior temporal hROI (T3-4) under the object condition than under the EEAG condition. No other regions showed a significant BOLD difference between these two conditions in the right hemisphere. The activation profile in response to each visual category compared to its scrambled version among the 10 hROIs confirms that the EEAG and object conditions triggered similar activations along the ventral pathway (Figure 3). Notably, the activation profile under the EEAG perception condition in the 16 hROIs activated by object perception (listed in Table S3 as supplementary material) was also identical (see figure S1 in supplementary material), providing supplementary evidence of the brain functional overlap between object and EEAG perception.
In contrast, the comparison of the BOLD values observed under the EEAG condition with those in each of the other conditions revealed strongly significant hROIs x conditions interactions in all cases in both the left [EEAG, LinearB: F(9,225) = 3.6, p = 0.0003), EEAG, Words: F(9,225) = 23.6, p < 0.0001, EEAG, Scenes (F(9,225) = 13.9, p < 0.0001)] and the right [EEAG, LinearB (F(9,225) = 3.4, p = 0.0006, EEAG, Words F(9,225) = 18.0, p < 0.0001, EEAG, Scenes (F(9,225) = 11.4, p < 0.0001)] hemispheres. Thus, the activation profile under the linear B script, words and scenes conditions compared to their scrambled version differed from that under the EEAG condition, although some regions exhibited the same level of activation (Figure 3A). The plot of activations under the EEAG condition against the activations under each of the other conditions suggests that a positive linear relationship exists only between the objects and EEAG conditions (Figure 3B).
To verify that the relationship observed at the group level between the activation profiles elicited by the objects and EEAG was not determined by outliers, the correlation of the activation profiles in 10 regions in response to the EEAG and objects (compared to their scrambled version) was computed in each hemisphere separately for each subject. Then, the resulting Pearson’s correlation coefficients of each subject were Fisher z-transformed and analysed using a univariate two-sided t-test that was significant in both the left (t(25) = 7.6, p < 0.0001) and right (t(25) = 4.8, p < 0.0001) hemispheres. This finding confirms that both EEAG and object perception recruit the ventral pathway in a similar way in both hemispheres.
The proximity of the ventral visual cortex responses to these two visual categories was likely elicited by the global visual organization of EEAG, which triggered processes comparable to those required for object recognition. Among the regions activated by the perception of both EEAG and objects, O3-1 and O3-2 corresponded to the so-called LO functional region20. LO is defined as the brain area showing the greatest activation while viewing a known or novel object compared to its scrambled version20,21. The involvement of this region further supports the hypothesis that the processes involved in object recognition are involved in the perception of EEAG. This hypothesis is consistent with the fact that LO has been proven to be sensitive to the shape, but not the semantics, of objects22,23. Thus, although the similarity between the profiles of the objects and EEAG does not demonstrate that EEAG were previously used as symbols, modern brains perceive these EEAG as coherent visual entities to which symbolic meaning can be attached. Given that it has recently been argued that the anatomy of cortical areas belonging to the visual ventral cortex have been moderately impacted by the evolution of the brain from the earliest hominins to Homo sapiens24,25, it seems reasonable to speculate that the present results also apply to early modern humans and archaic hominins. Considering that EEAG were produced deliberately and had no apparent utilitarian function, our results are consistent with the hypothesis that these graphic productions had a representational purpose and were used and perceived as icons or symbols by modern and archaic hominins.
At the regional level, only one area, i.e., the occipito-temporal region of the left hemisphere or FUS4, was activated by both EEAG and tasks involving symbolic or iconic contents (objects, linear B, and words). This region corresponds to the so-called visual word form area (VWFA)26,15. This region was activated by the perception of both objects, i.e., the linear B script and words, which is consistent with the proposal that the role of the left FUS4 is not restricted to word recognition27–29. As proposed by Vogel, FUS4 processing extends to the processing of complex visual perception with a statistical regularity and a “groupable” characteristic (i.e., perceived as a whole)28. Crucially, our results suggest that these features are found not only in words, images of objects and strings of symbols but also in EEAG, supporting their potential representational value.
Remarkably, FUS4 was also the only region where the leftward asymmetry survived the FDR correction under the EEAG condition (mean = 0.12, t(25) = 3.17, p = 0.04 FDR corrected, one-sample two-sided t-test). As previously described16, word presentation causes the largest leftward asymmetry likely due to hemisphere high-order language areas top-down processing leading to right FUS4 deactivation (Table 1). A significant, but lower intensity, leftward asymmetry was also present during the perception of objects and the linear B script (Figure 4A, all p’s < 0.05 corrected, one-sample two-sided t-tests). The leftward asymmetries measured during the EEAG, linear B script and object presentations did not significantly differ (all p’s > 0.10, post hoc t-tests) but were significantly larger than those measured during the scene perception (all p’s < 0.01) and significantly smaller than those triggered in the words minus scrambled words contrast (all p’s < 0.0001). To ensure that this result was not determined by variation under the scrambled pictures conditions, we confirmed that the perception of the scrambled pictures of each category did not result in any significant asymmetry in this region (compared to the fixation dot, see methods, all p’s > 0.10 uncorrected, Figure 4B).
Importantly, the left hemisphere hosts language in most right-handed individuals. More generally, the left hemisphere is dedicated to cultural artifacts processing in most humans30. Under this framework, the leftward asymmetry observed in FUS4 during the perception of EEAG suggests that EEAG share some representational features with both objects and the linear B script that are not present in scenes, which did not show any asymmetry (p = 0.72, one-sample two-sided t-test). The trend of gradually decreasing leftward asymmetries in FUS4 from words, objects, linear B characters, EEAG to scenes may reflect the meaning conveyed by the different categories of stimuli; i.e., the more a stimulus is identified as conveying meaning, the more important the leftward asymmetry (Figure 4). This finding can be considered additional evidence supporting that EEAG are recognized as forms that could have been used by a human culture in the past to store and transmit coded information. Thus, the human brain perceives these graphic entities as having significant regularities to which semantic information can be connected. The present results support the as sumption that this type of production was not exclusively associated with anatomically modern humans.
Materials and Methods
Participants
The study was approved by the “Sud-ouest outremer III” local Ethics Committee (N° 2016/63). Twentysix healthy right-handed adults (mean age ± SD: 21.7 ± 2.5 years, 15 women) with no neurological history were included after providing written informed consent to participate in the study. The participants had a mean educational level of 15 years of schooling after first grade. One participant was excluded from the analysis due to technical issues in the fMRI images.
Data acquisition
Stimuli
Our experimental protocol included five categories of stimuli, and each category was compared to its scrambled version (Figure 1).
The stimuli of interest, labelled EEAG, consisted of 50 tracings of abstract patterns created with stone tools on a variety of media that were discovered at archaeological sites dating from the Lower, Middle and Early Upper Palaeolithic eras (see Table S1 in supplementary material). We used tracings in which the engraved lines were rendered in white on a grey background and the outline of the object was rendered in light grey. The thickness of the lines on the tracings reflects the width of the engraved lines on the original piece. These stimuli were 700 * 480 pixels in size.
The pictures of objects (256 * 256 pixels) were represented by/consisted of nameable human-made artefacts depicted in different shades of grey. The stimuli were used to determine whether the neural bases of the engravings’ perception can be distinguished according to their levels of perceptual organization and the meaning they convey. The lowest level of organization was represented by the scrambled pictures, which were designed to serve as a control for the processing of each condition. These stimuli were constructed by randomly scrambling the other pictures into 32 to 124 squares depending of the initial size of the original picture using MATLAB software (MathWorks, Inc., Sherborn, MA, USA). This treatment removes any perceptible organization from the images19. These stimuli were included as matched reference stimuli and used to subtract non-specific activations linked to the processing of any type of visual information (stimuli were equal in global luminance and size).
The corpus of linear B consisted of 50 images of six characters strings written in white on a grey background. These images were 653 * 114 pixels in size.
The word stimuli were bi-syllabic and composed of six letters in lowercase on a grey background. The word stimuli were nouns referring mostly to a concrete notion (80%). The pictures of words had a size of 256 * 64 pixels. The word frequency was 15.5 ± 3 (SD) according to the lexical data Lexicon 330. Low frequency words were chosen since these words activate ventral occipito-temporal regions more strongly than frequent words31.
The outdoor scenes were 256 * 256 pixels pictures of scenes, including the landscapes of mountains, beaches, or cities.
fMRI acquisition
The imaging was performed using a Siemens Prisma 3 Tesla MRI scanner. The structural images were acquired with a high-resolution 3D T1 weighted sequence (TR = 2000 ms, TE = 2.03 ms; flip angle = 8°; 192 slices; and 1 mm3 isotropic voxel size). In addition B0 maps were acquired to correct for geometric distortion (two acquisitions of 18s repeated four times). The functional images were acquired with a whole-brain T2*-weighted echo planar image acquisition (T2*-EPI Multiband x6, sequence parameters: TR = 850 ms; TE = 35 ms; flip angle = 56°; 66 axial slices; and 2.4 mm3 isotropic voxel size). The functional images were acquired in three sessions. The experiment presentation was programmed in E-prime software (Psychology Software Tools, Pittsburgh, PA, USA). The stimuli were displayed on a 27” screen. The participants viewed the stimuli through the rear of the magnet bore via a mirror mounted on the head coil.
Experimental protocol
The acquisition was organized in 3 sessions. The participants performed a 1-back task involving pictures of objects, scenes, and words and their scrambled versions in one session.
During the first run, the stimuli consisted of 234 distinct pictures presented in greyscale in a bitmap format belonging to one of the following three categories: scenes, objects, and words in their intact and scrambled versions. There were 39 different pictures per category. The block-designed paradigm consisted of 18 blocks lasting 12.75 s (6 blocks per category and their scramble version). Each of the 15 stimuli within a block was displayed for 300 ms, including two repetitions, and the participants were asked to detect the stimuli. A fixation point was displayed for 12.75 s every two blocks.
During the second and third runs, the subjects performed the same task, and the stimuli were the engravings, linear B strings and their scrambled version. Each run lasted for 4 min and 12 s with a random order of presentation and comprised 12 experimental blocks of 13.6 s interleaved with 7 fixation blocks of 12.75s. Each experimental block contained 11 stimuli and 2 repetitions with a 200 ms presentation time for each stimulus and an interstimulus interval of 1037 ms.
Image processing
Image analysis was performed using SPM12 software. The T1-weighted scans of the participants were normalized to a site-specific template (mean of 500 age-matched normal volunteers) matching the MNI space using the SPM12 “segment” procedure with the default parameters. To correct for subject motion during the fMRI runs, the 192 EPI-BOLD scans were realigned within each run using a rigid-body registration. Then, the EPI-BOLD scans were rigidly registered structurally to the T1-weighted scan. The combination of all registration matrices allowed for the warping of the EPI-BOLD functional scans to the standard space with a trilinear interpolation. Once in the standard space, a 6-mm FWHM Gaussian filter was applied.
Regional analysis
Within the selected hROIs, the activation profiles across the regions were calculated for each condition contrast (engravings minus scramble, objects minus scramble, scenes minus scramble, linear B minus scramble and words minus scramble) for each of the 26 participants.
The objects, linear B and scenes were compared to the engravings using repeated- measures 3-way ANOVA in each hemisphere (i.e., 2 (conditions) x 10 (hROIs)). In addition, Pearson’s correlation coefficients between the profiles of the engravings and objects conditions were computed for each subject and Fisher z-transformed. Then, the 26 Z-scores were analysed using a univariate two-sided t-test to determine whether the profile of objects was correlated to the profile of engravings. This analysis was conducted in each hemisphere separately.
A list of the regions activated by the objects, words, linear B, and scenes compared to their respective scrambled counterparts is presented in Table S3 to S6 as supplementary material.
Since a leftward asymmetry in the ventral route is typical of the processing of semantic and verbal material16,27, we calculated the left minus right BOLD values in the selected hROIs during the EEAG presentations to identify the hROIs with significant leftward asymmetries (one-sample t-test, FDR corrected). Then, we compared the identified hROI(s) asymmetries across all conditions with one-way ANOVA.
Author contributions
EM, NTM and FE conceived the study. EM, MS, GJ, and AM designed the study. EM, MS, and SC performed the experiment, EM, MS, SC, NTM, MJ, and BM carried out the analyses, EM, NTM and FE wrote the manuscript with contributions from AM, BM, and GJ.
Competing interest
The authors declare no competing interests.
Supplementary Materials
Acknowledgements
This research was supported by a grant from IdEx Bordeaux / CNRS (PEPS 2015). The authors thank GinesisLab (Programme Labcom 2016, ANR 16LCV2-0006-01) for their help in data management and data processing and Carole Peyrin for providing some of the stimuli used.