调研世界, 2025, Issue 1: 85-96. DOI: 10.13778/j.cnki.11-3705/c.2025.01.007

社会调查访员身份与问卷质量预测

Types of Interviewers and Data Quality Prediction in Social Surveys

马超 贾朋

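Author information

  • 1. Institute of Population and Labor Economics, Chinese Academy of Social Sciences; Center for Human Resources Research, Chinese Academy of Social Sciences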

Abstract

This paper examines, from the perspective of interviewer type, how to build a paradata-based quality control system and a questionnaire quality prediction model for social surveys. Using paradata such as GPS records and facial recognition, we first construct the CULS quality control system along three dimensions (interview length, response burden, and interviewer behavior) and explore the factors that influence questionnaire quality in a multivariate regression framework. The prediction analysis shows that a machine learning model built on quality control indicators and respondent household characteristics predicts the quality of questionnaires completed by student interviewers fairly accurately, but its prediction consistency rate is low for questionnaires completed by full-time community workers. We further examine how the sizes of the initial training set and the updating training set affect the prediction consistency rate: the rate rises as the sample size of either set grows, and updating the training set effectively improves prediction performance in the middle stage of a survey. These findings offer some guidance for interviewer selection, the construction of quality control indicators, questionnaire quality prediction, and cost control in quality verification for face-to-face surveys.

Key words

Paradata/Quality Control/Prediction/Types of Interviewers

Publication year

2025
调研世界
中国统计学会

CHSSCD
Impact factor: 0.695
ISSN:1004-7794