Prediction and influencing factor analysis of harmful algal bloom species with random forest classification
In the context of global change, phytoplankton frequently cause harmful algal blooms, which lead to a series of problems for fisheries, aquaculture, human health and the social economy. Using the random forest classification method of machine learning, we developed two models, a shellfish-toxic model and a fish-kill model, based on the common harmful algal bloom species in the waters around Zhangzi Island in the northern Yellow Sea. The response variable of the shellfish-toxic model was defined by the cell abundances of Alexandrium tamarense, Dinophysis spp., Gonyaulax spp., Prorocentrum spp. and Pseudo-nitzschia spp., whilst that of the fish-kill model was defined by the cell abundances of Karenia mikimotoi, Noctiluca scintillans and Dictyocha fibula. The feature variables for both models were transparency, temperature, salinity, pH and dissolved oxygen. The accuracies of the shellfish-toxic and fish-kill models were 87.9% and 89.7%, respectively, while precision exceeded 80% in all cases. The feature importance analysis indicated that temperature and dissolved oxygen were the key predictive variables for the shellfish-toxic model, with MeanDecreaseGini values of 15.4% and 14.3%, respectively, while pH and salinity were the key variables for the fish-kill model, with values of 21.6% and 15.5%, respectively. Our findings provide a case study and baseline information for identifying the key predictive variables for harmful algal bloom species and for establishing early monitoring and warning systems in fishery and aquaculture waters and crucial habitats.
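The MeanDecreaseGini importance reported above is built from the Gini impurity decrease achieved at individual tree splits. In the usual random forest definition, these decreases are summed over every split on a given feature and averaged over all trees. The following is a minimal sketch of that building block for a single split, not the study's implementation: the function names, the toy temperature values, the labels and the threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting samples on `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the share of samples in each child.
    weighted_child = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_child


# Hypothetical toy data (not from the study): water temperature in deg C and
# a binary label, 1 = target species present above some abundance cutoff.
temperature = [8.0, 10.0, 12.0, 18.0, 20.0, 22.0]
present = [0, 0, 0, 1, 1, 1]

# A split at 15 deg C separates the two classes perfectly in this toy set.
decrease = gini_decrease(temperature, present, threshold=15.0)
print(decrease)
```

A feature whose splits repeatedly yield large decreases across the forest, as temperature does in this toy case, ends up with a high importance score. The percentages quoted in the abstract suggest the raw scores were normalized, although the normalization used is not stated here.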