Classification of Real Estate Registration Transaction Price Data and Quantitative Research on Its Supervision

Identification and processing of outliers for national real estate registration data based on statistical approaches

张俊逸¹, 贾文珏², 孙中孝¹, 张倩¹

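Author information

1. Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen, Guangdong 518000; College of Land Science and Technology, China Agricultural University, Beijing 100193
2. Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen, Guangdong 518000; Information Center of the Ministry of Natural Resources, Beijing 100812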

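Abstract (translated from the Chinese)

To address data-quality problems in real estate registration data, this paper examines methods for processing and classifying the residential transaction data held in the real estate registration database. Taking the residential transaction price indicator of S city as an example, it develops a technical approach to data cleaning, processing, and classification built on statistical methods such as kernel density estimation. ① After extreme values, duplicate values, and special values are removed, the subject data closest to actual local conditions are obtained; the subject data comprise low-probability data, market-behavior data, and non-market-behavior data. ② For the market-behavior data, the mean transaction price is extracted and calculated, then compared with housing price data published by intermediary agencies; in most regions the difference is below 15%. The quantitative analysis confirms that the data in the real estate registration database are more authoritative and valid. The study proposes a technical workflow that combines kernel density estimation with second-order differences to process and classify real estate registration data, which can provide a methodological basis for mining national real estate registration data and for quantitative supervision.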
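The abstract names kernel density estimation combined with second-order differences but does not spell out how the two are joined. The following is a minimal sketch under one possible reading: the density of unit prices is estimated on a grid, and grid positions where the second-order difference of the density changes sign are treated as candidate class boundaries. The function names, the fixed bandwidth, and the sample prices are all illustrative assumptions, not the authors' implementation.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at every grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def second_difference(values):
    """Second-order difference: d2[i] = v[i+2] - 2*v[i+1] + v[i]."""
    return [
        values[i + 2] - 2.0 * values[i + 1] + values[i]
        for i in range(len(values) - 2)
    ]


def candidate_boundaries(grid, density):
    """Return grid positions where the second difference changes sign."""
    d2 = second_difference(density)
    boundaries = []
    for i in range(len(d2) - 1):
        if d2[i] * d2[i + 1] < 0:
            # d2[i] is centred on grid[i + 1] and d2[i + 1] on grid[i + 2],
            # so the sign change lies between those two grid points.
            boundaries.append(0.5 * (grid[i + 1] + grid[i + 2]))
    return boundaries


if __name__ == "__main__":
    # Hypothetical unit prices (arbitrary units), not data from the study.
    prices = [2.1, 2.3, 2.4, 2.6, 5.8, 6.0, 6.1, 6.3, 6.4, 6.6]
    grid = [i * 0.1 for i in range(0, 91)]
    density = gaussian_kde(prices, grid, bandwidth=0.5)
    print(candidate_boundaries(grid, density))
```

The sign changes mark inflection points of the estimated density, that is, where the curve switches between concave and convex. Whether the study uses these points, local minima, or some other criterion to separate low-probability, market-behavior, and non-market-behavior data is not stated in the abstract.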

Abstract

This study analyzes data-quality problems in the national real estate registration database, constructs a method for improving data quality, and assesses the effectiveness of that method. Using residential registration data for S city drawn from the national real estate registration data, we apply statistical methods such as kernel density estimation to identify extreme values and duplicate values in residential prices and to classify the cleaned data. ① We categorize the data according to their distribution: extreme values are eliminated first, then duplicate values and special values are removed from the valid data to obtain the subject data, which comprise low-probability data, market-behavior data, and non-market-behavior data. ② For the market-behavior data, the mean transaction price is extracted, calculated, and compared with housing price data published by intermediary agencies; in most regions the difference is less than 15%. The quantitative analysis confirms that the data in the real estate registration database are more authoritative and valid. The study thus establishes a data-quality improvement method, based on kernel density estimation, for identifying extreme values and duplicate values in real estate registration data, and the results indicate that the method is robust and effective. Improved registration data quality provides a methodological basis and more accurate data resources for applications of national real estate registration data.
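The comparison in step ② can be illustrated with a short sketch. It assumes, since the abstract does not define it, that the "difference" is the absolute gap between the registered mean price and the agency price, divided by the agency price. The region names and prices in the usage note are hypothetical.

```python
from statistics import mean


def relative_difference(registered_mean, reference_price):
    """Absolute gap relative to the reference (agency) price."""
    return abs(registered_mean - reference_price) / reference_price


def compare_regions(registered_prices, reference_prices, threshold=0.15):
    """Map each region to (registered mean, relative difference, within threshold)."""
    report = {}
    for region, prices in registered_prices.items():
        registered_mean = mean(prices)
        diff = relative_difference(registered_mean, reference_prices[region])
        report[region] = (registered_mean, diff, diff < threshold)
    return report
```

For example, calling `compare_regions({"A": [100, 110, 120]}, {"A": 100})` compares a registered mean of 110 against an agency figure of 100, a relative difference of 10%, which falls under the 15% threshold mentioned in the abstract.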

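Keywords (translated from the Chinese)

data quality improvement/kernel density estimation/real estate registration/urban residential prices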

Key words

data quality improvement/kernel density estimation/registration of immovable property/urban residential housing price

Publication year

2024

测绘科学 (Science of Surveying and Mapping)

中国测绘科学研究院 (Chinese Academy of Surveying and Mapping)

Indexed in: CSTPCD, CSCD, Peking University Core Journals (北大核心)

Impact factor: 0.774

ISSN: 1009-2307
