Geospatial analysis using the Hadoop Distributed File System (HDFS) and Apache Spark

  • Performing geospatial analysis on large spatial data stored in HDFS using Apache Spark
  • Retrieving geographical hotspots in a locality based on the data available in HDFS

tusharpandit18/AllSpark-master

AllSpark-master

Summary:

  • Performed geospatial database operations on large datasets stored in a distributed file system, using Hadoop, Apache Spark, Scala, and the GeoSpark library on Linux
  • Analyzed cluster performance (efficiency, memory usage, and CPU usage for each node) using Ganglia
  • Implemented an algorithm for spatio-temporal hotspot analysis that determines the top 50 hotspots for taxi pickups in New York City in January 2015 using the Getis-Ord statistic
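
The hotspot step above can be illustrated with a small single-machine sketch. It is not the repository's Scala/GeoSpark code; it only shows the commonly used Getis-Ord Gi* z-score on a space-time cube, assuming binary weights where each cell's neighbourhood is the cell itself plus its adjacent cells in x, y, and time. The function names `getis_ord_scores` and `top_hotspots`, the `(x, y, t)` cell indexing, and the `shape` argument are hypothetical choices made for this example.

```python
import math
from itertools import product


def getis_ord_scores(counts, shape):
    """Return {cell: Gi* z-score} for every cell of a space-time cube.

    counts -- mapping (x, y, t) -> pickup count; missing cells count as 0
              (assumption: every key lies inside the cube)
    shape  -- (nx, ny, nt), the number of cells along each axis
    """
    nx, ny, nt = shape
    n = nx * ny * nt
    mean = sum(counts.values()) / n
    variance = sum(v * v for v in counts.values()) / n - mean * mean
    if variance <= 0:
        raise ValueError("all cells have the same count; Gi* is undefined")
    std = math.sqrt(variance)
    offsets = list(product((-1, 0, 1), repeat=3))  # 27 offsets, including self

    scores = {}
    for x, y, t in product(range(nx), range(ny), range(nt)):
        # Keep only neighbours that fall inside the cube (edge cells have fewer).
        neighbours = [
            (x + dx, y + dy, t + dt)
            for dx, dy, dt in offsets
            if 0 <= x + dx < nx and 0 <= y + dy < ny and 0 <= t + dt < nt
        ]
        w = len(neighbours)  # with binary weights, sum(w) == sum(w^2) == w
        if w == n:
            raise ValueError("cube is too small for this neighbourhood")
        local_sum = sum(counts.get(c, 0) for c in neighbours)
        denom = std * math.sqrt((n * w - w * w) / (n - 1))
        scores[(x, y, t)] = (local_sum - mean * w) / denom
    return scores


def top_hotspots(counts, shape, k=50):
    """Return the k cells with the highest Gi* score as (cell, score) pairs."""
    scores = getis_ord_scores(counts, shape)
    # Sort by descending score; ties are broken by cell index for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    # Toy cube: one busy cell in the middle of a 5 x 5 x 5 grid.
    demo = {(2, 2, 2): 100}
    for cell, score in top_hotspots(demo, (5, 5, 5), k=3):
        print(cell, round(score, 3))
```

This sketch loops over every cell in plain Python, so it only suits small cubes. On a cluster, the same per-cell sums would typically be computed with distributed aggregations instead.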
