To examine agreement on Rorschach Comprehensive System (CS; Exner, 2004) interpretations, 55 patient protocols were interpreted by 3 to 8 clinicians across 4 data sets on a representative set of 29 characteristics. Substantial reliability was observed across data sets, although a problematic design produced lower results in one of them. Unexpectedly, a Q-sort task yielded slightly lower reliability than a simple rating task. As expected, scales that summarized judgments showed higher agreement than judgments of individual interpretive statements, and some clinicians produced more generalizable inferences than others. Interpretations from all clinicians were more strongly associated with patients' psychometric true scores (aggregated judgment M range = .82 to .92) than with the judgments of other clinicians (range = .76 to .89). Compared with meta-analyses of interrater reliability in psychology and medicine, the findings indicate that these clinicians could reliably interpret Rorschach CS data.