This article presents a distributed storage and computing framework, built on Hadoop and Spark, for efficiently storing and managing geospatial big data in water conservancy. The distributed storage and computing method is designed and implemented to make effective use of Spark's in-memory computing and caching mechanisms. The method is applied to calculate and analyze reservoir capacity from DEM data of a hydroelectric station reservoir. The results show that the proposed distributed computing method effectively improves computing efficiency on large geospatial datasets and exhibits good adaptability.
Keywords
Hadoop/Spark/geospatial big data/distributed computing/reservoir capacity calculation