WebGIS - Workflow

Workflow

The data we used and how we processed the analysis

Site Area

Step 0 - Data Presentation

Presentation of the data used for the analysis: DUSAF, DTM, NDVI, distance characterization data, and the landslide inventory.
Click the toggle list to check the data details

DUSAF (Destinazione d'Uso dei Suoli Agricoli e Forestali, the Lombardy land use/land cover dataset)
NDVI (Normalized Difference Vegetation Index)
DTM (Digital Terrain Model)

NDVI

Because NDVI data are missing in some areas, we delineated those areas during preprocessing and excluded them when generating the training set.

DUSAF

DTM

Distance Characterization Data

Roads Buffer

Rivers Buffer

Faults Buffer

Landslide Inventory

Landslide Inventory in the area of Group 7
  • Data Preprocessing

Data preprocessing includes reprojection, clipping, resampling, and rasterization (a scripted sketch follows the parameter list below).
During preprocessing, all layers must be made consistent with the DTM layer's coordinate reference system, extent, and resolution, so that the training set can be generated on a common grid.

Coordinate Reference System (CRS): EPSG:32632 - WGS 84 / UTM zone 32N
Pixel Resolution: 5 meters
Extent: vector polygon of the Group 7 area
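Below is a minimal PyQGIS sketch of this alignment step, run from the QGIS Python console. All file names are hypothetical placeholders; the two GDAL algorithms stand in for the Reproject and Clip tools used interactively.

```python
# Minimal sketch of the preprocessing alignment; paths are hypothetical.
import processing

# Reproject a layer to the CRS and resolution of the DTM (EPSG:32632, 5 m).
reprojected = processing.run("gdal:warpreproject", {
    "INPUT": "ndvi_raw.tif",        # hypothetical input path
    "TARGET_CRS": "EPSG:32632",
    "TARGET_RESOLUTION": 5,         # match the 5 m DTM grid
    "RESAMPLING": 0,                # 0 = nearest neighbour
    "OUTPUT": "ndvi_32632.tif",
})["OUTPUT"]

# Clip to the Group 7 study-area polygon so all layers share the same extent.
processing.run("gdal:cliprasterbymasklayer", {
    "INPUT": reprojected,
    "MASK": "group7_area.gpkg",     # hypothetical mask path
    "CROP_TO_CUTLINE": True,
    "OUTPUT": "ndvi_clipped.tif",
})
```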

Step 1 - Data Processing

  • QGIS Analysis: Slope, Aspect, Plan Curvature and Profile Curvature

To define the no-landslide zones, we need to extract four factors from the DTM: slope angle, aspect, and plan and profile curvature. To extract these factors, we use the Processing > SAGA > Slope, aspect, curvature tool in QGIS, with the units set to degrees. A scripted sketch of this step follows.
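As a rough illustration, the same derivatives can be computed from the Python console. The SAGA provider must be installed, and the algorithm ID and parameter names vary between SAGA versions, so treat this as a sketch with hypothetical paths and verify with processing.algorithmHelp first.

```python
# Hedged sketch of the terrain-derivative step; check
# processing.algorithmHelp("saga:slopeaspectcurvature") for your SAGA version.
import processing

processing.run("saga:slopeaspectcurvature", {
    "ELEVATION": "dtm_5m.tif",            # hypothetical DTM path
    "UNIT_SLOPE": 1,                      # 1 = degrees
    "UNIT_ASPECT": 1,                     # 1 = degrees
    "SLOPE": "slope.sdat",
    "ASPECT": "aspect.sdat",
    "C_PLAN": "plan_curvature.sdat",
    "C_PROF": "profile_curvature.sdat",
})
```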

Aspect

Slope

Plan Curvature

Profile Curvature
  • Define the No Landslide Zones (NLZ)

A no-landslide zone is an area with a low probability of landslides. In our case study, we adopt a simplified no-landslide zone definition based on slope angle: areas with slopes below 20 degrees or above 70 degrees are generally less prone to landslides.

    To obtain the NLZ layer, we follow these three steps (a scripted sketch follows the list):
  1. We use the Raster Calculator on the slope layer derived from the DTM to compute "slope@1" < 20 OR "slope@1" > 70. In the resulting raster layer, no-landslide zones correspond to pixels with a value of 1.
  2. Use r.null to remove null values from the resulting raster, then use the Processing-GDAL-Raster analysis-Sieve tool to filter out small patches. We tried four different filter thresholds: 10, 30, 50, and 70. For our study area, a threshold of 30 works best.
  3. Vectorize the resulting raster to obtain the NLZ polygons. To achieve this, we use the Processing-GRASS-Raster-r.to.vect tool, treating the raster values as categories.
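A condensed sketch of the three steps, assuming a slope layer named 'slope' is loaded in the project. gdal:polygonize is used here as a close stand-in for the GRASS r.to.vect step, and all output paths are hypothetical.

```python
# Sketch of the NLZ chain: raster calc -> sieve -> vectorize.
import processing
from qgis.core import QgsProject
from qgis.analysis import QgsRasterCalculator, QgsRasterCalculatorEntry

slope = QgsProject.instance().mapLayersByName("slope")[0]
entry = QgsRasterCalculatorEntry()
entry.ref, entry.raster, entry.bandNumber = "slope@1", slope, 1

# Pixels meeting the NLZ condition get value 1, all others 0.
QgsRasterCalculator('"slope@1" < 20 OR "slope@1" > 70', "nlz_raw.tif", "GTiff",
                    slope.extent(), slope.width(), slope.height(),
                    [entry]).processCalculation()

# Remove patches smaller than 30 pixels (the threshold that worked best here).
sieved = processing.run("gdal:sieve", {
    "INPUT": "nlz_raw.tif", "THRESHOLD": 30, "OUTPUT": "nlz_sieved.tif",
})["OUTPUT"]

# Vectorize; the raster value ends up in the 'DN' field of the polygons.
processing.run("gdal:polygonize", {
    "INPUT": sieved, "BAND": 1, "FIELD": "DN", "OUTPUT": "nlz.gpkg",
})
```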
Final Vectorized No Landslide Zones (NLZ)

Sieved with threshold 10: too many small scattered patches remain
Sieved with threshold 30: most of the excessively small patches are removed
Sieved with threshold 50: only relatively large blocks are retained
Sieved with threshold 70: too many blocks are filtered out

  • Combine the NLZ and LZ Datasets

  1. Difference
    Since we adopt a simplified definition of no-landslide zones, the NLZ may overlap with the landslide inventory polygons. We use the Processing-Vector Overlay-Difference tool to remove the overlapping parts of the two datasets (see the sketch after this list).
  2. Define 'Hazard'
    To prepare the training/testing dataset, we create a new field 'Hazard' in the attribute tables of both the Landslide Inventory and the NLZ, assigning 1 to the NLZ and 2 to the Landslide Inventory.
  3. Union
    After this, perform a Union between the differenced NLZ and the Landslide Inventory polygons.
  4. Manual Intervention
    To ensure the training and validation data are evenly distributed, we manually modified the vector layer produced by the union operation. For a detailed description of this issue, see the "Problem Encountered" section.
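A minimal sketch of the Difference and Union steps (1 and 3), with hypothetical layer paths:

```python
# Sketch of steps 1 and 3; layer paths are hypothetical.
import processing

# 1. Difference: remove from the NLZ whatever overlaps the landslide inventory.
nlz_diff = processing.run("native:difference", {
    "INPUT": "nlz.gpkg",
    "OVERLAY": "landslide_inventory.gpkg",
    "OUTPUT": "nlz_diff.gpkg",
})["OUTPUT"]

# 3. Union of the differenced NLZ with the inventory; both layers carry the
#    'Hazard' field created in step 2.
processing.run("native:union", {
    "INPUT": nlz_diff,
    "OVERLAY": "landslide_inventory.gpkg",
    "OUTPUT": "nlz_lz_union.gpkg",
})
```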


Hazard Value
No Landslide Zones: 1
Landslides Inventory: 2
NLZ Differenced with the LZ Inventory

NLZ & LZ Union Result

Manually Cut Excessively Large Pieces

  • Separation of Train_Test Dataset

    The procedure for creating the training data is as follows (a scripted sketch follows the list):
  1. For both the landslide inventory and NLZ layers, define a training/testing ratio to be used for the machine learning model. Here we tried both 80/20 and 70/30 splits. Use Processing-Vector Selection-Random Selection to randomly select polygons according to the given ratio, and invert the selection to switch between training and testing polygons.
  2. Merge the processed Landslide Inventory layer with the processed NLZ layer.
  3. After determining how many points are needed for the training data, we select points within the polygons based on the training and validation set ratios, while ensuring a 1:1 point ratio between the 'Hazard' values. We use the Select Features by Value tool, selecting on the 'Hazard' and 'Train_Test' fields. After the selection, run Processing-Vector Creation-Random Points in layer bounds on the merged Landslide Inventory and NLZ layer, with 'Selected features only' enabled.
  4. Merge the training and testing layers separately into two point layers, trainingPoints and testingPoints.
  5. Sample the final Train_Test dataset. To obtain training values for our dataset, we sample the environmental factors at the training and testing point layers. We use the Point Sampling Tool plugin, selecting the point layers and all the environmental layers, and ensure the field names stay consistent. After sampling, we obtain two layers: trainingPointsSampled and testingPointsSampled.
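The following sketch mirrors the split-and-sample logic in script form. The GUI workflow above uses Random Selection, Random points in layer bounds, and the Point Sampling Tool plugin; here native:randomextract, qgis:randompointsinsidepolygons, and native:rastersampling play similar roles. Algorithm IDs and parameter names can vary across QGIS versions, and all layer names and counts are hypothetical.

```python
# Hedged sketch of the train/test split and sampling chain.
import processing

# Random extract stands in for the interactive Random Selection tool:
# keep 70 % of the polygons for training (METHOD 1 = percentage).
train_polys = processing.run("native:randomextract", {
    "INPUT": "nlz_lz_union.gpkg",
    "METHOD": 1,
    "NUMBER": 70,
    "OUTPUT": "train_polys.gpkg",
})["OUTPUT"]

# Random points inside the training polygons (a stand-in for the
# 'Random points in layer bounds' + selected-features workflow above).
points = processing.run("qgis:randompointsinsidepolygons", {
    "INPUT": train_polys,
    "STRATEGY": 0,            # 0 = fixed points count per feature
    "VALUE": 5,               # hypothetical count per polygon
    "OUTPUT": "trainingPoints.gpkg",
})["OUTPUT"]

# Sample the predictor rasters at the points (a scripted alternative to the
# Point Sampling Tool plugin; one call per raster, or use a VRT stack).
processing.run("native:rastersampling", {
    "INPUT": points,
    "RASTERCOPY": "predictors.vrt",   # hypothetical predictor stack
    "COLUMN_PREFIX": "pred_",
    "OUTPUT": "trainingPointsSampled.gpkg",
})
```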
  • Selected dataset for final analysis:
    1000 points with a 70/30 train/test ratio
  • Number of points per split and Hazard value:
    Train, Hazard 1: 350
    Train, Hazard 2: 350
    Test, Hazard 1: 150
    Test, Hazard 2: 150
    (an 80/20 1000-point dataset was prepared in the same way)

  • To select the best-performing model for susceptibility mapping, we created two datasets with training/testing ratios of 70/30 and 80/20, respectively. We also experimented with two total dataset sizes, 4000 and 1000 points, resulting in a total of four sets of training data.
  • Step 2 - Susceptibility Mapping Process

    • Hazard classification using dzetsaka

      1. Training procedure
        Before training, we need to remove NULL values from all attributes of the datasets, using the Field Calculator tool to update each attribute column and replace NULLs with 9999. We use the dzetsaka plugin to classify hazard. For each group of training/testing data, we choose Random Forest as the classifier and generate a virtual raster layer containing all the raster layers for training (see the sketch after this list).
      2. Probability Map
        To describe the probability of landslide occurrence and create a probability map, we convert the classification confidence into a class probability with the Raster Calculator. The confidence refers to the predicted class, so where class 2 (landslide) is predicted we keep the confidence, and where class 1 (no landslide) is predicted the landslide probability is its complement. The expression is:
        ("classification@1" = 2) * "confidence@1" + ("classification@1" = 1) * (100 - "confidence@1")
        Thus, we obtain the probability map derived from the training result.
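A sketch of the step 1 clean-up and raster stacking, assuming the sampled point layer is loaded as 'trainingPointsSampled'; raster file names are hypothetical.

```python
# Sketch: replace NULL attribute values with 9999, then build a VRT stack.
import processing
from qgis.core import QgsProject, NULL, edit

layer = QgsProject.instance().mapLayersByName("trainingPointsSampled")[0]
with edit(layer):
    for feat in layer.getFeatures():
        for idx in range(layer.fields().count()):
            val = feat[idx]
            if val is None or val == NULL:
                # 9999 is the placeholder value used before training.
                layer.changeAttributeValue(feat.id(), idx, 9999)

# Stack all predictor rasters into one virtual raster for dzetsaka.
processing.run("gdal:buildvirtualraster", {
    "INPUT": ["dtm.tif", "slope.tif", "aspect.tif", "ndvi.tif"],  # hypothetical
    "RESOLUTION": 0,     # 0 = average of input resolutions
    "SEPARATE": True,    # keep each input as its own band
    "OUTPUT": "predictors.vrt",
})
```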
      Landslide Susceptibility Map

      Steps 3 & 4 - Exposure Assessment

      • Population Analysis

      1. Data preprocessing
        For the exposure assessment, we use the WorldPop raster data, a spatial dataset with the estimated total number of people per grid cell. We adopted the same workflow as in the data preprocessing stage: first reproject the population raster to WGS 84 / UTM zone 32N (EPSG:32632) to keep the CRS consistent between datasets, then clip the population raster with the Group 7 mask vector to obtain the population raster for the exposure assessment.
      2. Ranking the Degree
        To rank the degree of exposure, we reclassify the susceptibility raster map into 4 classes (see the sketch after this list):
        [0, 0.25): low
        [0.25, 0.5): moderate
        [0.5, 0.75): high
        [0.75, 1): very high
        We use the Reclassify by Table tool to perform the classification. Then we resample the reclassified susceptibility map to the extent and resolution of the clipped population raster, which is about 81.67 m.
      3. Population map
        Population Map
        Reclassified Landslide Susceptibility Map
        Resampled to the same pixel resolution as the population raster

      4. Analysis with CSV
        To compute the population counts in each susceptibility class, we use the tool Processing > Raster Analysis > Raster layer zonal statistics, as sketched after this list. Set the input layer to the clipped population raster dataset and the zones layer to the resampled susceptibility raster map. Finally, plot a pie chart showing the percentage of the population in each susceptibility class.
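A combined sketch of the reclassification (step 2) and the zonal statistics (step 4). The class table assumes the susceptibility map is scaled to [0, 1]; the zonal-statistics parameter names are from memory, so verify them with processing.algorithmHelp. File names are hypothetical.

```python
# Sketch of steps 2 and 4; susceptibility assumed scaled to [0, 1].
import processing

# Step 2: reclassify into 4 classes. Each TABLE triple is (min, max, class);
# RANGE_BOUNDARIES = 1 means min <= value < max, matching [min, max) above.
processing.run("native:reclassifybytable", {
    "INPUT_RASTER": "susceptibility.tif",
    "RASTER_BAND": 1,
    "TABLE": [0.00, 0.25, 1,   # low
              0.25, 0.50, 2,   # moderate
              0.50, 0.75, 3,   # high
              0.75, 1.00, 4],  # very high
    "RANGE_BOUNDARIES": 1,
    "OUTPUT": "susceptibility_reclass.tif",
})

# Step 4: population per class; verify parameter names with
# processing.algorithmHelp("native:rasterlayerzonalstats").
processing.run("native:rasterlayerzonalstats", {
    "INPUT": "population_clipped.tif",          # values to aggregate
    "BAND": 1,
    "ZONES": "susceptibility_resampled.tif",    # one zone per hazard class
    "ZONES_BAND": 1,
    "OUTPUT_TABLE": "population_by_class.csv",
})
# The 'sum' column of the output table gives the population in each class.
```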
      • Alpine Pastures

      1. Considering that the study area has few buildings and that landslides pose a potential hazard, we chose the Alpine Pastures category for the exposure assessment (Alpine Pasture data source: Geoportale della Lombardia).
      2. There are four alpine pastures in the study area: Alpe Meden, Alpe Rhon con Campondola e Campo, Alpe Piano-Ortiche con Aiada, and Alpe Piano dei Cavalli con Malgina e Combolo. We rasterized these areas on the grid of the study-area DTM (see the sketch below). Then, following a method similar to the population analysis, we combined the result with the reclassified landslide susceptibility map. Finally, we obtained the exposure assessment for the alpine pastures.
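A sketch of the rasterization step using gdal:rasterize, burning a constant value onto the 5 m DTM grid; the paths and the burn value are hypothetical placeholders.

```python
# Hypothetical sketch: rasterize the pasture polygons onto the DTM grid.
import processing

processing.run("gdal:rasterize", {
    "INPUT": "alpine_pastures.gpkg",  # hypothetical pasture polygons
    "BURN": 1,                        # constant value for pasture pixels
    "UNITS": 1,                       # 1 = georeferenced units
    "WIDTH": 5, "HEIGHT": 5,          # 5 m pixels, matching the DTM
    "EXTENT": "dtm_5m.tif",           # reuse the DTM extent
    "NODATA": 0,
    "OUTPUT": "pastures_raster.tif",
})
```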
      Alpine Pastures in our Area

      Step 5 - WebGIS

      • In the WebGIS section, we use the OSM basemap and Stadia Maps layers as basemaps. We published the clipped original data, the landslide susceptibility map, the population map, and the exposure assessment map on the Polimi GeoServer. The maps are displayed through WMS services, and a pop-up has been added for the raster layer currently displayed on top: clicking a point shows the corresponding grayscale value.

      Get Source Code?
