Embedded poetry is a defining feature of late imperial Chinese fiction, yet its narrative function remains contested. Some critics regard these poems as “parasitic,” meaning that they reiterate the surrounding prose while contributing little of their own; others argue that they play integral aesthetic and rhetorical roles. This study asks whether parasitic poems exist in late imperial Chinese fiction and, if so, how they can be identified systematically. We develop a computational framework to detect such poems across a corpus of Qing-dynasty novels, combining proxy-based measures (cosine similarity and mutual information) with prompt-based large language models (LLMs). Using a manually annotated dataset of 300 poem-context pairs, we evaluate how closely each method aligns with human judgments. Our preliminary findings show that the proxy models achieve higher accuracy but exhibit limited sensitivity to nonparasitic cases. A multilingual prompt-based approach yields more balanced performance, suggesting that LLMs can approximate literary interpretation when effectively prompted. Our work offers tools for analyzing Chinese poetry and demonstrates the potential of LLMs for modeling literary analysis.
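
To make the cosine-similarity proxy concrete, the following is a minimal sketch of how the overlap between a poem and its prose context might be scored. It is an illustration under stated assumptions, not the framework's actual implementation: the abstract does not specify how texts are vectorized, so character bigram counts are assumed here, and the function names `char_ngrams`, `cosine_similarity`, and `poem_context_similarity` are hypothetical.

```python
from collections import Counter
import math


def char_ngrams(text, n=2):
    """Count character n-grams, ignoring whitespace.

    Assumption: character bigrams stand in for whatever representation
    the study uses; Chinese text is not word-segmented here.
    """
    chars = [c for c in text if not c.isspace()]
    return Counter("".join(chars[i:i + n]) for i in range(len(chars) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


def poem_context_similarity(poem, context, n=2):
    """Score how much a poem's surface form overlaps with its context."""
    return cosine_similarity(char_ngrams(poem, n), char_ngrams(context, n))
```

Under this sketch, a score near 1 would indicate that a poem largely repeats the wording of its context, which is one possible signal of a parasitic poem; a score near 0 indicates little lexical overlap. Surface overlap alone cannot capture paraphrase, which is presumably one reason to pair such proxies with prompt-based LLMs.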
