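Urban Transport of China (《城市交通》)
2021, Issue 1
An Improved Method for Identifying Residential Population Distribution Based on Cellular Signaling Data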

Article ID: 1672-5328(2021)01-0095-07

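黄伟 (Huang Wei)1, 孙世超 (Sun Shichao)2, 孙娜 (Sun Na)1
(1. Beijing Tsinghua Tongheng Urban Planning & Design Institute Co., Ltd., Beijing 100085; 2. Dalian Maritime University, Dalian, Liaoning 116026)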

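Abstract: To address the shortcomings of traditional methods for analyzing residential population distribution from cellular signaling data, this paper fuses cellular signaling data with questionnaire survey data and applies supervised machine learning to analyze the current residential population distribution. First, the questionnaire survey provides each volunteer's actual place of residence and information about the telecom carrier they use, and the samples are screened. Second, the correspondence between volunteers' actual residences and the location information in the cellular signaling data is established inside the carrier's own equipment room. Finally, stay features of volunteers at their residential grid cells and at non-residential grid cells are extracted from the signaling data to train a naive Bayes classifier, which is then applied to identify the actual residences of other mobile phone users. The results show that the supervised-learning-based identification method achieves markedly higher prediction accuracy than the traditional threshold-based method.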

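Keywords: transportation planning; population distribution; cellular signaling data; supervised learning; naive Bayes classifier

CLC number: U491

Document code: A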

Improved Method of Population Distribution Identification Based on Cellular Signaling Data

Huang Wei1, Sun Shichao2, Sun Na1
(1. Beijing Tsinghua Tongheng Urban Planning & Design Institute, Beijing 100085, China; 2. Dalian Maritime University, Dalian, Liaoning 116026, China)

Abstract: To overcome the shortcomings of traditional population distribution analysis methods based on cellular signaling data, this paper proposes a multi-source data fusion method that combines cellular signaling data with questionnaire data and uses supervised machine learning to analyze the current population distribution. First, the actual residential locations of volunteers and information about their communication carriers are obtained through a questionnaire survey, and the samples are screened. Second, the relationship between the volunteers' actual residential locations and the location information in the cellular signaling data is established within the carrier's internal equipment room. Finally, a Naïve Bayesian Classifier is trained on the volunteers' stay features at residential and non-residential grid locations, extracted from the cellular signaling data, and is then applied to identify the actual residences of other mobile phone users. The results show that the population distribution identification method based on supervised learning achieves significantly higher prediction accuracy than the traditional threshold judgment method.

Keywords: transportation planning; population distribution; cellular signaling data; supervised learning method; Naïve Bayesian Classifier
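
The classification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify which stay features are used or which naive Bayes variant is trained, so the code below assumes a Gaussian naive Bayes model and two hypothetical features per (user, grid cell) pair, namely average night-time hours per day and the fraction of days the user was observed in the cell. The training values are invented toy numbers, and every function name is illustrative only.

```python
import math
from collections import defaultdict

# Each sample is one (user, grid cell) pair with two HYPOTHETICAL stay features:
# (average night-time hours per day, fraction of days observed in the cell).
# Label 1 = the volunteer's surveyed home cell, 0 = any other cell.
# All numbers are invented toy values, not data from the paper.
TRAIN = [
    ((7.5, 0.90), 1), ((6.8, 0.85), 1), ((8.1, 0.95), 1), ((5.9, 0.80), 1),
    ((0.5, 0.10), 0), ((1.2, 0.30), 0), ((0.0, 0.60), 0), ((2.0, 0.20), 0),
]

def fit_gaussian_nb(samples):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[label] = (n / len(samples), means, variances)
    return model

def log_posterior(model, features, label):
    """Unnormalized log P(label | features), assuming independent features."""
    prior, means, variances = model[label]
    total = math.log(prior)
    for x, m, v in zip(features, means, variances):
        total += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return total

def predict(model, features):
    """Return the class label with the highest log posterior."""
    return max(model, key=lambda label: log_posterior(model, features, label))

def identify_home(model, candidate_cells):
    """Pick the candidate cell whose features look most home-like."""
    def home_margin(item):
        _, feats = item
        return log_posterior(model, feats, 1) - log_posterior(model, feats, 0)
    return max(candidate_cells.items(), key=home_margin)[0]

model = fit_gaussian_nb(TRAIN)
# Hypothetical grid cells visited by one unlabeled user.
cells = {"grid_A": (7.0, 0.88), "grid_B": (1.0, 0.25), "grid_C": (0.3, 0.55)}
home = identify_home(model, cells)
```

A threshold-based baseline would instead label a cell as home whenever a single feature, such as night-time hours, exceeds a fixed cutoff. The classifier differs in that it weighs all features jointly and learns the class statistics from the surveyed volunteers.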