Abstract
We formulate a method, termed Tree Branches Evaluated Statistically for Tightness (TBEST), for identifying significantly distinct tree branches in hierarchical clustering. For each branch of the tree, a measure of tightness is defined as a rational function of two heights: that of the branch and that of its parent. A statistical procedure is then developed to determine the significance of the observed tightness values. We test TBEST as a tool for tree-based data partitioning by applying it to four benchmark datasets, each from a different area of biology and each with a well-defined partition of the data into classes. In all four cases TBEST performs on par with or better than existing techniques. An eponymous R implementation of the method is available from the Comprehensive R Archive Network (CRAN).