From Theory to Practice: Enhancing validity in interdisciplinary NLP research

In September 2025, Qixiang Fang gave a keynote speech at the 5th Workshop on Computational Linguistics for the Political and Social Sciences (CPSS) in Germany.

About the workshop

In his keynote, Fang emphasized the critical importance of validity in interdisciplinary NLP research, that is, ensuring that measurements truly capture the concepts they claim to capture. He compared how validity is defined across the social sciences, educational testing, and physics, and highlighted frequent pitfalls that arise when NLP studies neglect these traditions, for example in personality prediction and LLM benchmarking.

Fang demonstrated how under-specified benchmarks and selective human comparisons can lead to misleading conclusions, and showcased item-response-theory methods for more rigorous evaluation of LLM capabilities. He further warned that high model accuracy does not guarantee valid downstream inferences when LLM-generated annotations are used. He closed by urging the field to standardize language around validity, rethink cognitive assessments of NLP systems, and explicitly account for uncertainty in LLM-based measurements.

	Additional information
When	September 2025.
Where	The 5th Workshop on Computational Linguistics for the Political and Social Sciences (CPSS), Germany.
Registration	Registration is no longer possible.
Instructors	Qixiang Fang, Postdoctoral Researcher at Utrecht University.
Materials	Course materials, including slides and code, are open and can be accessed here.
