This study adapts Faith’s Phylogenetic Diversity metric from ecology to measure controlled vocabulary utilization in archival collections, addressing a limitation of traditional diversity measures, which ignore hierarchical relationships between terms. We introduce three diagnostic ratios (Coverage, Completeness, and Cataloging Intensity) and apply them to 878,046 photographs across 16 Dutch National Archives collections cataloged with the hierarchical GTAA vocabulary. The results reveal substantial variation in both the scope of conceptual domains addressed (Coverage ratio) and the thoroughness of description within those domains (Completeness ratio). These findings suggest that the interaction between collection content characteristics and institutional cataloging practices creates different pathways for cultural heritage discovery, and they highlight a tension between cataloging intensity and comprehensive research utility. The framework provides quantitative tools and empirical benchmarks for evidence-based collection assessment and metadata evaluation, including for assessing how vocabulary utilization patterns influence cultural heritage accessibility.
