Abstract
Platforms and institutions that support MRI data sharing need to ensure that identifiable facial features are not present in shared images. Currently, this assessment requires manual effort, as no automated tools exist that can efficiently and accurately detect whether an image has been “defaced”. The scarcity of publicly available data with preserved facial features, as well as the meager incentives to create such a cohort privately, have hindered the development of face-detection models. Here, we introduce a framework to detect whether an input MRI brain scan has been defaced, with the ultimate goal of integrating it into the submission protocols of MRI data archiving and sharing platforms. We present a binary (“defaced”/“nondefaced”) classifier based on a custom convolutional neural network architecture. We train the model on 980 defaced MRI scans from 36 different studies that are publicly available at OpenNeuro.org. To overcome the unavailability of nondefaced examples, we augment the dataset by inpainting synthetic faces into each training image. We show the adequacy of this data augmentation in a cross-validation evaluation. We demonstrate that the performance estimated with cross-validation matches that of an evaluation on a held-out dataset (N = 581) preserving real faces, and obtain accuracy/sensitivity/specificity scores of 0.978/0.983/0.972, respectively. Data augmentations are key to boosting the performance of models bounded by limited sample sizes and insufficient diversity. Our model contributes towards developing classifiers with ∼100% sensitivity in detecting faces, which is crucial to ensure that no identifiable data are inadvertently made public.
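As a point of reference for the reported scores, the three metrics follow directly from the confusion-matrix counts of a binary classifier. The sketch below shows the standard definitions; the counts used in the usage example are hypothetical and chosen only to sum to a set of N = 581 scans, not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: defaced scans correctly flagged as defaced (true positives)
    fn: defaced scans missed by the classifier (false negatives)
    tn: nondefaced scans correctly flagged (true negatives)
    fp: nondefaced scans wrongly flagged as defaced (false positives)
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    return accuracy, sensitivity, specificity

# Hypothetical split of a 581-scan held-out set (illustrative only).
acc, sens, spec = binary_metrics(tp=290, fn=5, tn=280, fp=6)
```

Sensitivity is the quantity the abstract singles out: a missed face (a false negative) means identifiable data could be published, which is why the goal is to push sensitivity toward ∼100%.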
Competing Interest Statement
The authors have declared no competing interest.