Analysis of the Characteristics and Influencing Factors of Maritime Administrative Penalties in China Based on Sparklyr Clustering
To analyze the categories and characteristics of the reasons for maritime administrative penalties in China, and to reveal the factors influencing the frequency of different penalty cases, 414,475 maritime administrative penalty records were taken as the research data. Sparklyr distributed clustering was used to cluster the large-scale text of penalty reasons, and the chi-square test was used to extract the semantic features of each category of penalty reason. Analysis of variance and least significant difference (LSD) multiple comparisons were then used to investigate the effects of area, season, category of reason, and their interactions on the frequency of penalty cases. The results show that Sparklyr distributed clustering can effectively cluster large-scale maritime administrative penalty reasons, which are difficult to process in a standalone environment. The results also reveal the diversity of illegal maritime behaviors: the frequency of penalty cases varies greatly across reasons, and area, season, and the interaction between area and category of reason are all significant influencing factors for this difference. Finally, some suggestions on differentiated maritime supervision are put forward.
maritime supervision; administrative penalty; distributed text clustering; analysis of variance; chi-square test
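
The abstract does not spell out how the chi-square test was applied to extract semantic features, so the following is only a minimal sketch of one common reading: for each cluster, score every term with the Pearson chi-square statistic of a 2x2 table (document in cluster or not, term present or not) and keep the top-scoring, positively associated terms. The study itself uses Sparklyr (R on Spark); Python is used here purely for illustration. The function names, the toy tokens, and the cluster labels are all hypothetical and are not taken from the study's data.

```python
from collections import Counter


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate margin carries no association signal
    return n * (a * d - b * c) ** 2 / denom


def top_terms_per_cluster(docs, labels, k=3):
    """Rank terms by chi-square association with each cluster.

    docs   : list of token lists (one per penalty-reason text)
    labels : cluster id for each document (assumed to come from a prior
             clustering step, which is not shown here)
    """
    n_docs = len(docs)
    cluster_sizes = Counter(labels)
    doc_freq = Counter()
    cluster_doc_freq = {c: Counter() for c in cluster_sizes}
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count document frequency, not raw counts
            doc_freq[term] += 1
            cluster_doc_freq[label][term] += 1

    result = {}
    for cluster, size in cluster_sizes.items():
        scores = []
        for term, df in doc_freq.items():
            a = cluster_doc_freq[cluster][term]  # in cluster, has term
            b = size - a                         # in cluster, lacks term
            c = df - a                           # outside cluster, has term
            d = n_docs - size - c                # outside cluster, lacks term
            if a * d > b * c:                    # keep positive association only
                scores.append((chi_square_2x2(a, b, c, d), term))
        # Highest score first; ties broken alphabetically for determinism.
        scores.sort(key=lambda s: (-s[0], s[1]))
        result[cluster] = [term for _, term in scores[:k]]
    return result


# Hypothetical toy corpus: two clusters of tokenized penalty reasons.
docs = [
    ["overload", "cargo", "vessel"],
    ["overload", "vessel"],
    ["crew", "certificate", "vessel"],
    ["crew", "certificate"],
]
labels = [0, 0, 1, 1]
top = top_terms_per_cluster(docs, labels, k=2)
print(top)
```

On a corpus of several hundred thousand records, the same counts would be computed with distributed aggregations rather than in-memory loops; the scoring formula itself is unchanged.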