%0 Journal Article
%A Jaime Abraham Castro-Mondragon
%A Sébastien Jaeger
%A Denis Thieffry
%A Morgane Thomas-Chollier
%A Jacques van Helden
%T RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections
%D 2016
%R 10.1101/065565
%J bioRxiv
%P 065565
%X Transcription Factor Binding Motifs (TFBMs) databases contain many similar motifs, from which non-redundant collections are derived by manual curation. However, the numbers of motifs and collections are exploding. Meta-databases merging these collections do not offer non-redundant versions, because automatically regrouping similar motifs into clusters cannot be easily achieved with available tools. Motif discovery from genome-scale data sets (e.g. ChIP-seq peaks) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant collections of motifs. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, facilitating and accelerating the analysis of motif collections. It can simultaneously cluster multiple collections from various sources. We demonstrate how matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools. It facilitates the comparison of ChIP-seq datasets, and highlights biologically relevant variations of similar motifs. By clustering 12 entire databases (>5000 motifs), we show that matrix-clustering correctly groups motifs belonging to the same TF families, and can drastically reduce motif redundancy. It is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines.
%U https://www.biorxiv.org/content/biorxiv/early/2016/07/27/065565.full.pdf