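Urban Transport of China (《城市交通》)
2017, Issue 5
Transportation Big Data Analysis Methodology Based on CRISP-DM and Its Practice: A Case Study of Cellular Signaling Data and RFID Data in Chongqing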

Article ID: 1672-5328(2017)05-0042-109

Zhou Tao (周涛), Zhao Bicheng (赵必成), Yu Bo (俞博)
(Chongqing Transport Planning Institute, Chongqing 400020)

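Abstract: As research on and applications of transportation big data become increasingly widespread, their problems are becoming more apparent. Many analyses suffer from vague concepts, uncertain data quality, and unclear analytical methods, so their results do not withstand scrutiny and cannot be compared with one another. The main cause is the lack of a scientific big data analysis method and a unified analysis standard. This paper proposes a transportation big data analysis method based on CRISP-DM that comprises six phases: objectives and requirements, data understanding, data preparation, data modeling, model validation, and engineering application (deployment). Drawing on the development of Chongqing's transportation big data platform and using cellular signaling data and vehicle RFID data as examples, it details the practice of three key steps (data understanding, data modeling, and model validation) and explores how to make transportation big data analysis standardized, indicator-based, and transparent.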

Keywords: transportation big data; big data analysis method; data understanding; data modeling; model validation; Chongqing

CLC number: U491.1+2

Document code: A

Transportation Big Data Analysis Methodology Based on CRISP-DM: An Example of Cellular Signaling and RFID Data in Chongqing

Zhou Tao, Zhao Bicheng, Yu Bo
(Chongqing Transport Planning Institute, Chongqing 400020, China)

Abstract: As transportation big data analysis becomes a widely used research tool, problems with unclear concepts, uncertain data quality, and ambiguous analysis methods have emerged, leading to conclusions that cannot be verified and results that cannot be compared. The main cause is the lack of a scientifically mature data analysis method and a unified analysis standard. This paper proposes a transportation big data analysis methodology based on CRISP-DM, which includes six steps: clarifying objectives and requirements, data understanding, data preparation, modeling, model validation, and deployment. Drawing on the development of the transportation big data platform in Chongqing, the paper elaborates the procedures of three important steps (data understanding, modeling, and model validation) using cellular signaling and vehicle RFID data, and explores how to achieve the standardization, indexation, and transparency of transportation big data analysis.

Keywords: transportation big data; big data analysis methodology; data understanding; data modeling; model validation; Chongqing