Root system analysis is a complex task, often performed using fully automated image analysis pipelines. However, these pipelines are usually evaluated with a limited number of ground-truth root images, most likely of limited size and complexity. We used a root model, ArchiSimple, to create a large and diverse library of ground-truth root system images (10,000). This library was used to evaluate the accuracy and usefulness of several image descriptors classically used in root image analysis pipelines. Our analysis highlighted that the accuracy of the different metrics is strongly linked to the type of root system analyzed (e.g. dicot or monocot) as well as its size and complexity. Metrics that have been shown to be accurate for small dicot root systems might fail for large dicot root systems or small monocot root systems. Our study also demonstrated that the usefulness of the different metrics for discriminating genotypes or experimental conditions may vary. Overall, our analysis is a call for caution when automatically analyzing root images. If a thorough calibration is not performed on the dataset of interest, unexpected errors might arise, especially for large and complex root images. To facilitate such calibration, both the image library and the code used in the study have been made available to the community.