Anthology of Computers and the Humanities · Volume 3

Vision Language Models for Novel Art Therapy Evaluation in Schizophrenia

Ivan Nenchev1,2, Karin Dannecker3, Maren Rabe1, Marie Jeschke4 and Christiane Montag1

  • 1 Department of Psychiatry and Psychotherapy, Charité at St. Hedwig Hospital, Charité – Universitätsmedizin Berlin, Berlin, Germany
  • 2 Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
  • 3 Weißensee Kunsthochschule Berlin MA Art Therapy Programme, Berlin, Germany
  • 4 Galerie ART CRU Berlin, Berlin, Germany

Permanent Link: https://doi.org/10.63744/sttzqxdNWsq1

Published: 21 November 2025

Keywords: art therapy, schizophrenia, CLIP embeddings, vision language models, semantic analysis, longitudinal assessment

Abstract

Established methodologies for evaluating visual artistic output in art therapy are rare and time-intensive, creating barriers to systematic assessment of therapeutic progress. This study presents the first application of multimodal dense embeddings for longitudinal evaluation of art therapy outcomes in individuals with schizophrenia. We analyzed 168 art therapy images produced by 14 participants with schizophrenia using CLIP (Contrastive Language-Image Pretraining) embeddings. The embeddings captured meaningful semantic patterns: real images showed significantly greater semantic dispersion than spatially randomized controls. Longitudinal analysis revealed progressive semantic diversification over time, with significant increases in semantic distance between consecutive images (β = 0.284, p = 0.001) and in cumulative semantic drift from participants' first images (β = 0.336, p < 0.001). Analysis of individual differences showed high variability in volume metrics, which spanned several orders of magnitude (M = 1.13 × 10¹¹, SD = 2.05 × 10¹¹), indicating highly individual patterns of semantic exploration. Vision language models provide a novel and objective methodology for evaluating the progression of art therapy, one that reveals systematic patterns of semantic evolution during treatment. The progressive semantic diversification observed suggests that art therapy facilitates an expansion of creative expression and psychological exploration over time. The substantial individual differences in semantic exploration patterns indicate potential for personalized treatment approaches based on creative trajectory analysis. This methodology offers promising applications for systematic art therapy assessment, treatment monitoring, and personalized intervention strategies in clinical practice.
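
To make the two longitudinal measures named in the abstract concrete, the following minimal sketch computes a distance between consecutive images and a drift from the first image for one participant's chronologically ordered embeddings. It is an illustration under stated assumptions, not the study's implementation. The abstract does not specify the distance metric, so cosine distance is assumed here. The function names `cosine_distance` and `trajectory_metrics` are hypothetical, and the toy two-dimensional vectors stand in for real CLIP embeddings. The regression step that yields the reported β values is not reproduced.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors.

    Assumption: the abstract does not name the metric; cosine distance is
    a common choice for CLIP embeddings and is used here for illustration.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def trajectory_metrics(embeddings):
    """Return (consecutive, drift) for one participant's ordered embeddings.

    consecutive[i] is the distance between image i and image i + 1.
    drift[i] is the distance between the first image and image i + 1.
    """
    e = np.asarray(embeddings, dtype=float)
    consecutive = np.array(
        [cosine_distance(e[i - 1], e[i]) for i in range(1, len(e))]
    )
    drift = np.array([cosine_distance(e[0], e[i]) for i in range(1, len(e))])
    return consecutive, drift


# Toy stand-in for three chronologically ordered image embeddings.
toy = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
consecutive, drift = trajectory_metrics(toy)
print(consecutive)
print(drift)
```

In this toy sequence each step covers the same angular distance, while the drift from the first image keeps growing, which is the qualitative pattern that a cumulative drift measure is meant to capture.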