Understanding the Usual Environment in Tourism: A Technical Definition Based on Big Data Space Marking
The term "tourism" refers to various forms of activity that take place in an unusual environment. This "unusual" environment has to be defined in terms of its opposite, i.e., the "usual" environment. However, the lack of a reasonably uniform and unambiguous description of the usual environment, both academically and technically, has led to recurring mistakes in the implementation of tourism statistics in China. As a result, there has been a steady stream of negative public opinion and a variety of controversies relating to the concept of tourism. This study attempts to address these issues through the following procedure. First, we review the international practices, principles, and recommended expressions of the "usual environment" in a technical context. Second, we define an individual's usual environment as an ensemble of two distinct types of usual space, namely the direct vicinity of a person's residential address and the region surrounding a person's place of employment or education. Geographically speaking, the usual environment is an irregular area made up of uneven circles that is not bounded by administrative subdivisions. Third, based on the labeling of big data, we employ several spatial clustering algorithms to label the usual environment, and apply the method of inversion and expansion sampling to the monitoring of tourism flows. Finally, we present a preliminary determination of the feasible radius for the two types of usual space by comparing the operational parameters of different scanning radii in the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The findings reveal that the scanning radius should be restricted to less than 1 km for the optimal DBSCAN clustering of the two usual spaces, as this minimizes the positional noise interference that leads to a mean shift. Moreover, there is typically no more than a single usual residential space, and the number of usual locations relating to a person's place of employment or education is generally only one or two. Based on the attenuation of location points, the usual environment for a place of residence has a maximum radius of 40 km, whereas that for a place of employment or education has a maximum radius of 2 km to 3 km. An inference about the statistical population is reached by expanding the sample using the travelling rates or arrival rates of representative users, rather than by labeling a full sample of location data. In addition, it can be assumed that users whose usual environments cannot be identified have the same travelling or arrival rates as those whose usual locations can be specified, which is consistent with the assumption of homogeneity. The findings of this study serve as a reference for the standardized and consistent application of big data in tourism statistics, and reinforce the basis for big data-based research on tourism flows. Several significant policy and practical implications can be drawn from these findings.
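
As an illustration of the clustering step described above, the following is a minimal sketch of DBSCAN applied to one user's location points with a scanning radius below 1 km. It is not the authors' implementation: the coordinates, the 0.5 km radius, the minimum of 3 points per cluster, and the function names `haversine_km` and `dbscan` are all assumptions chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def dbscan(points, eps_km, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within the scanning radius, including point i itself.
        return [j for j, q in enumerate(points)
                if haversine_km(points[i], q) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical location points for a single user (illustrative values only).
home = [(39.9000, 116.4000), (39.9010, 116.4005),
        (39.9005, 116.3995), (39.9008, 116.4002)]
work = [(39.9500, 116.4500), (39.9504, 116.4503), (39.9498, 116.4497)]
trip = [(31.2300, 121.4700)]  # a single distant point
points = home + work + trip

labels = dbscan(points, eps_km=0.5, min_pts=3)
print(labels)
```

In this toy input the points around the assumed home and work locations form two separate clusters, and the isolated distant point is labeled as noise, which is the kind of point that would fall outside both usual spaces.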
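
The expansion step can be sketched as simple arithmetic under the homogeneity assumption stated above: the travelling rate observed among users whose usual environment could be labeled is applied to the whole population. The function name `expand_to_population` and all figures below are hypothetical and serve only to show the calculation.

```python
def expand_to_population(population, labeled_users, labeled_travellers):
    """Scale the travelling rate of labeled users up to the full population.

    Assumes (homogeneity) that users without an identifiable usual
    environment travel at the same rate as labeled users.
    """
    if labeled_users <= 0:
        raise ValueError("labeled_users must be positive")
    if not 0 <= labeled_travellers <= labeled_users:
        raise ValueError("labeled_travellers must lie between 0 and labeled_users")
    rate = labeled_travellers / labeled_users
    return rate, population * rate


# Hypothetical figures, not taken from the study.
rate, estimated_travellers = expand_to_population(
    population=1_000_000,      # resident population of the area
    labeled_users=20_000,      # users with an identified usual environment
    labeled_travellers=1_500,  # of those, users observed outside it
)
print(f"travelling rate: {rate:.3f}, estimated travellers: {estimated_travellers:.0f}")
```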