Merge pull request #3 from opendilab/dev

v0.1.0 update
opendilab · Apr 24, 2022 · 362c6c6
2 parents cf448be + 6706b8d
commit 362c6c6

Showing 70 changed files with 38,372 additions and 14,590 deletions.
diff --git a/.github/workflows/unit_test.yml b/.github/workflows/unit_test.yml
@@ -23,8 +23,12 @@ jobs:
         run: |
           sudo add-apt-repository ppa:sumo/stable
           sudo apt-get update
-          sudo apt-get install sumo sumo-tools
+          sudo apt-get install sumo sumo-tools build-essential cmake
           export SUMO_HOME=/usr/share/sumo/tools
+          git clone https://github.com/cityflow-project/CityFlow.git
+          cd CityFlow
+          python -m pip install -e .
+          cd ..
           python -m pip install -e .
           python -m pip install -e ".[test]"
           ./modify_traci_connect_timeout.sh

diff --git a/.gitignore b/.gitignore
@@ -134,5 +134,8 @@ dmypy.json
 *_case_info.json
 *_flow*.sumocfg
 
+.vscode
 debug
-exp
+exp*
+replay*.json
+replay*.txt
diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
 # DI-smartcross
 
-<img src="./docs/figs/di-smartcross_logo.png" width="200" alt="icon"/>
+<img src="./docs/figs/di-smartcross_banner.png" alt="icon"/>
 
 DI-smartcross - Decision Intelligence Platform for Traffic Crossing Signal Control.
 
@@ -10,7 +10,7 @@ DI-smartcross is application platform under [OpenDILab](http://opendilab.org/)
 
 **DI-smartcross** is an open-source traffic crossing signal control platform. DI-smartcross applies several Reinforcement Learning policies training & evaluation for traffic signal control system in provided road nets.
 
-DI-smartcross uses [**DI-engine**](https://github.com/opendilab/DI-engine), a Reinforcement Learning platform to build RL experiments. DI-smartcross uses [SUMO](https://www.eclipse.org/sumo/) (Simulation of Urban MObility) traffic simulator package to run signal control simulation.
+DI-smartcross uses [**DI-engine**](https://github.com/opendilab/DI-engine), a Reinforcement Learning platform to build RL experiments. DI-smartcross uses [SUMO](https://www.eclipse.org/sumo/) (Simulation of Urban MObility) and [CityFlow](https://cityflow-project.github.io) traffic simulator packages to run signal control simulation.
 
 DI-smartcross supports:
 
@@ -24,9 +24,10 @@ DI-smartcross supports:
 DI-smartcross supports SUMO version >= 1.6.0. You can refer to 
 [SUMO documentation](https://sumo.dlr.de/docs/Installing/index.html) or follow our installation guidance in 
 [documents](https://opendilab.github.io/DI-smartcross/installation.html).
+CityFlow can be compiled and installed from source code. You can clone its repo and run `pip install .` in its root folder.
 
 Then, DI-smartcross is able to be installed from source code.
-Simply run `pip install` in the root folder of this repository.
+Simply run `pip install .` in the root folder of this repository.
 This will automatically install [DI-engine](https://github.com/opendilab/DI-engine) as well.
 
 ```bash
@@ -40,22 +41,37 @@ and Rainbow DQN RL methods with multi-discrete actions for each crossing, as wel
 in which each crossing is handled by an individual agent. A set of default DI-engine configs is provided for
 each policy. You can check the DI-engine documentation for detailed instructions on these configs.
 
+Here we show the RL training scripts for SUMO envs; the CityFlow env scripts work the same way.
+
 - train RL policies
 
-Example of running DQN in wj3 env with default config.
+Example of running DQN in sumo wj3 env with default config.
 
 ```bash
 sumo_train -e smartcross/envs/sumo_wj3_default_config.yaml -d entry/config/sumo_wj3_dqn_default_config.py
 ```
 
+Example of running PPO in cityflow grid env with default config.
+
+```bash
+cityflow_train -e ./smartcross/envs/cityflow_grid/cityflow_grid_config.json -d entry/cityflow_config/cityflow_grid_ppo_default_config.py 
+```
+
 - evaluate existing policies
 
 Example of running random policy in wj3 env.
 
+
 ```bash
 sumo_eval -p random -e smartcross/envs/sumo_wj3_default_config.yaml     
 ```
 
+Example of running the fixed-time policy (`-p fix`) in the CityFlow grid env.
+
+```bash
+cityflow_eval -e smartcross/envs/cityflow_grid/cityflow_auto_grid_config.json -d entry/cityflow_config/cityflow_eval_default_config.py -p fix
+```
+
 It is recommended to refer to the [documentation](https://opendilab.github.io/DI-smartcross/index.html)
 for detailed information.
 
@@ -68,6 +84,7 @@ We appreciate all contributions to improve DI-smartcross, both algorithms and sy
 DI-smartcross released under the Apache 2.0 license.
 
 ## Citation
+
 ```latex
 @misc{smartcross,
     title={{DI-smartcross: OpenDILab} Decision Intelligence platform for Traffic Crossing Signal Control},
@@ -77,4 +94,3 @@ DI-smartcross released under the Apache 2.0 license.
     year={2021},
 }
 ```
-
diff --git a/docs/figs/di-smartcross_banner.png b/docs/figs/di-smartcross_banner.png
diff --git a/docs/source/envs/cf_grid_env.rst b/docs/source/envs/cf_grid_env.rst
@@ -0,0 +1,2 @@
+CityFlow Grid Env
+#####################
diff --git a/docs/source/envs/rl_arterial7_env.rst b/docs/source/envs/rl_arterial7_env.rst
@@ -0,0 +1,2 @@
+SUMO RL Arterial 7 Crossings Env
+#################################
diff --git a/docs/source/envs/wj3_env.rst b/docs/source/envs/wj3_env.rst
@@ -0,0 +1,3 @@
+SUMO Beijing Wangjing 3 Crossings Env
+######################################
+
diff --git a/docs/source/faq.rst b/docs/source/faq.rst
@@ -0,0 +1,15 @@
+FAQ
+##############
+
+.. toctree::
+    :maxdepth: 2
+
+
+Q1: SUMO environment always showing `Retrying in 1 seconds`
+------------------------------------------------------------------
+
+:A1:
+    SUMO environments and the ``traci`` lib are slow to reset when running with large roadnets.
+    It only checks the connection after reset for 1 sec. DI-smartcross provides an easy way
+    to change the retry timeout for ``traci``. You can run the ``modify_traci_connect_timeout.sh``
+    file. It will automatically
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -1,4 +1,4 @@
-.. DI-drive documentation master file, created by
+.. DI-smartcross documentation master file, created by
    sphinx-quickstart on Mon Jan 25 13:49:15 2021.
    You can adapt this file completely to your liking, but it should at least
    contain the root `toctree` directive.
@@ -14,15 +14,14 @@ DI-smartcross Documentation
    installation
    quick_start
    rl_environments
+   faq
 
-
-.. figure:: ../figs/di-smartcross_logo.png
+.. figure:: ../figs/di-smartcross_banner.png
    :alt: DI-smartcross
-   :width: 500px
 
 Decision Intelligence Platform for Traffic Crossing Signal Control.
 
-Last updated on 
+Last updated on 2022.04.16
 
 -----
 
@@ -50,10 +49,13 @@ Content
 ==============
 
 `Installation <installation.html>`_
---------------------------------------
+------------------------------------------
+
+`Quick Start <quick_start.html>`_
+-------------------------------------
 
-`Quick Start <quickstart>`_
------------------------------
+`RL Environments <rl_environments.html>`_
+-------------------------------------------------
 
-`RL Environments <rl_environments>`_
-----------------------------------------
+`FAQ <faq.html>`_
+--------------------
diff --git a/docs/source/installation.rst b/docs/source/installation.rst
@@ -4,10 +4,17 @@ Installation
 .. toctree::
     :maxdepth: 2
 
+This page describes how to install **DI-smartcross** and all supported simulators.
+
+.. note::
+
+    You can choose one of the simulators to run your experiments. Only the chosen one needs to be installed.
+
+
 SUMO installation
 =====================
 
-**DI-smartcross** support SUMO version >= 1.6.0. Here we show two easy guides
+**DI-smartcross** supports SUMO version >= 1.6.0. Here we show two easy guides
 of SUMO installation on Linux.
 
 Install SUMO via apt-get or homebrew
@@ -92,6 +99,23 @@ If successful, the following message will be shown in the shell.
     License EPL-2.0: Eclipse Public License Version 2 <https://eclipse.org/legal/epl-v20.html>
     Use --help to get the list of options.
 
+
+CityFlow Installation
+==========================
+
+The CityFlow simulator is built and installed from source code via `CMake <https://cmake.org>`_.
+Please make sure CMake works correctly on your system.
+
+Simply download the CityFlow source code and run ``pip install .`` in its root folder to install it.
+
+.. code:: bash
+
+    git clone https://github.com/cityflow-project/CityFlow.git
+    cd CityFlow
+    pip install .
+
+You can check the installation by running ``import cityflow`` in Python.
+
 Install DI-smartcross
 ==========================
 

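Note on the installation.rst hunk above: the ``import cityflow`` check can also be scripted so that it reports every simulator binding found on the current interpreter. The following is a minimal sketch, not part of this commit. The helper name `available_simulators` is hypothetical, and it assumes the Python module names are `cityflow` (CityFlow) and `traci` (SUMO).

```python
import importlib.util


def available_simulators(candidates=("cityflow", "traci")):
    """Return the candidate module names that can be imported.

    The default names are assumptions: `cityflow` for CityFlow and
    `traci` for SUMO's Python bindings. Nothing is imported here;
    find_spec only looks the modules up on the import path.
    """
    return [
        name for name in candidates
        if importlib.util.find_spec(name) is not None
    ]


if __name__ == "__main__":
    found = available_simulators()
    print("importable simulator modules:", ", ".join(found) or "none")
```

Since only one simulator needs to be installed, an empty result is the only case that indicates a problem.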
diff --git a/docs/source/quick_start.rst b/docs/source/quick_start.rst
@@ -15,11 +15,13 @@ to get detail instructions of these configs.
 train RL policies
 --------------------
 
+The type of policy can be automatically parsed from the config file.
+
 .. code::
 
     usage: sumo_train [-h] -d DING_CFG -e ENV_CFG [-s SEED] [--dynamic-flow]
-                  [-cn COLLECT_ENV_NUM] [-en EVALUATE_ENV_NUM]
-                  [--exp-name EXP_NAME]
+                      [-cn COLLECT_ENV_NUM] [-en EVALUATE_ENV_NUM]
+                      [--exp-name EXP_NAME]
 
     DI-smartcross training script
 
@@ -40,18 +42,25 @@ train RL policies
 
 Example of running DQN in wj3 env with default config.
 
+.. note:: 
+
+    Running with dynamic flow is only supported for arterial7 env currently.
+
 .. code:: bash
 
     sumo_train -e smartcross/envs/sumo_wj3_default_config.yaml -d entry/config/sumo_wj3_dqn_default_config.py
 
 evaluate existing policies
 --------------------------------
 
+We provide two eval policies: random and fixed-time. You can choose one to evaluate
+as a baseline for comparison. It is suggested to use the ``eval_default_config`` for each env.
+
 .. code:: 
 
     usage: sumo_eval [-h] [-d DING_CFG] -e ENV_CFG [-s SEED]
-                 [-p {random,fix,dqn,rainbow,ppo}] [--dynamic-flow]
-                 [-n ENV_NUM] [--gui] [-c CKPT_PATH]
+                     [-p {random,fix,dqn,rainbow,ppo}] [--dynamic-flow]
+                     [-n ENV_NUM] [--gui] [-c CKPT_PATH]
 
     DI-smartcross testing script
 
@@ -76,4 +85,63 @@ Example of running random policy in wj3 env.
 
 .. code:: bash
 
-    sumo_eval -p random -e smartcross/envs/sumo_wj3_default_config.yaml
+    sumo_eval -p random -e smartcross/envs/sumo_wj3_default_config.yaml
+
+
+CityFlow Entries
+=================
+
+**DI-smartcross** provides simple DQN and off-policy PPO demos for the CityFlow env. Each
+policy comes with a default **DI-engine** config. You can
+check the DI-engine documentation for detailed instructions on these configs.
+
+train RL policies
+--------------------
+
+.. code::
+
+    usage: cityflow_train [-h] -d DING_CFG -e ENV_CFG [-s SEED]
+                          [-cn COLLECT_ENV_NUM] [-en EVALUATE_ENV_NUM]
+                          [--exp-name EXP_NAME]
+
+    DI-smartcross training script
+
+    optional arguments:
+    -h, --help            show this help message and exit
+    -d DING_CFG, --ding-cfg DING_CFG
+                            DI-engine configuration path
+    -e ENV_CFG, --env-cfg ENV_CFG
+                            cityflow json configuration path
+    -s SEED, --seed SEED  random seed
+    -cn COLLECT_ENV_NUM, --collect-env-num COLLECT_ENV_NUM
+                            collector env num for training
+    -en EVALUATE_ENV_NUM, --evaluate-env-num EVALUATE_ENV_NUM
+                            evaluator env num for training
+    --exp-name EXP_NAME   experiment name to save log and ckpt
+
+evaluate existing policies
+--------------------------------
+
+Note that CityFlow runs in fixed-time mode by default when not in RL mode,
+so the fix policy runs with an ``auto_config.json``.
+
+.. code::
+
+    usage: cityflow_eval [-h] [-d DING_CFG] -e ENV_CFG [-s SEED]
+                         [-p {fix,dqn,ppo}] [-n ENV_NUM] [-c CKPT_PATH]
+
+    DI-smartcross training script
+
+    optional arguments:
+    -h, --help            show this help message and exit
+    -d DING_CFG, --ding-cfg DING_CFG
+                            DI-engine configuration path
+    -e ENV_CFG, --env-cfg ENV_CFG
+                            sumo environment configuration path
+    -s SEED, --seed SEED  random seed for sumo
+    -p {fix,dqn,ppo}, --policy-type {fix,dqn,ppo}
+                            RL policy type
+    -n ENV_NUM, --env-num ENV_NUM
+                            sumo env num for evaluation
+    -c CKPT_PATH, --ckpt-path CKPT_PATH
+                            model ckpt path
diff --git a/docs/source/rl_environments.rst b/docs/source/rl_environments.rst
@@ -69,3 +69,67 @@ Multi-agent
 It is only necessary to add ``multi_agent`` in **DI-engine** config file to convert common PPO into MAPPO,
 and change the ``use_centrolized_obs`` in environment config into ``True``. The policy and observations can
 be automatically changed to run individual agent for each cross.
+
+Roadnets
+-------------
+
+.. `Beijing Wangjing 3 Crossings <./envs/wj3_env.html>`_
+
+.. `RL Arterial 7 Crossings <./envs/rl_arterial7_env.html>`_
+
+.. toctree::
+    :maxdepth: 2
+
+    envs/wj3_env
+    envs/rl_arterial7_env
+
+
+CityFlow environments
+=============================
+
+configuration
+-----------------
+
+CityFlow simulator has its own config `json` file, with roadnet file, flow file and replay file defined in it.
+DI-smartcross adds some extra configs together with CityFlow's config file path in DI-engine's env config.
+
+.. code:: python
+
+    main_config = dict(
+        env=dict(
+            obs_type=['phase', 'lane_vehicle_num', 'lane_waiting_vehicle_num'],
+            max_episode_duration=1000,
+            green_duration=30,
+            yellow_duration=5,
+            red_duration=0,
+            ...
+        ),
+        ...
+    )
+
+Observation
+----------------
+
+We provide several types of observations of each cross.
+
+- phase: one-hot vector of the crossing's current signal phase
+- lane_vehicle_num: number of vehicles on each incoming lane
+- lane_waiting_vehicle_num: number of waiting vehicles on each incoming lane
+
+Action
+-------------
+
+The CityFlow environment supports changing a crossing's signal to a target phase. The action space is multi-discrete, with one discrete action per crossing, to reduce the number of actions.
+
+Reward
+-------------
+
+The CityFlow environment uses the pressure of each crossing as the reward.
+
+Roadnets
+-------------
+
+.. toctree::
+    :maxdepth: 2
+
+    envs/cf_grid_env
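Note on the rl_environments.rst hunk above: the three observation types and the pressure reward can be illustrated with a small sketch. This is not the code added by this commit. The function names are hypothetical, the lane-count dictionaries stand in for per-lane counts obtained from the simulator, and the pressure formula (negative absolute difference between incoming and outgoing vehicle counts) is an assumed convention, since the docs only state that pressure is used.

```python
from typing import Dict, List, Sequence


def one_hot(index: int, size: int) -> List[int]:
    """Encode a phase index as a one-hot vector of the given size."""
    if not 0 <= index < size:
        raise ValueError(f"phase index {index} outside [0, {size})")
    return [1 if i == index else 0 for i in range(size)]


def build_observation(
    phase: int,
    num_phases: int,
    incoming_lanes: Sequence[str],
    vehicle_num: Dict[str, int],
    waiting_num: Dict[str, int],
) -> List[int]:
    """Concatenate phase, lane_vehicle_num and lane_waiting_vehicle_num.

    Lanes missing from a count dict are treated as empty (assumption).
    """
    obs = one_hot(phase, num_phases)
    obs += [vehicle_num.get(lane, 0) for lane in incoming_lanes]
    obs += [waiting_num.get(lane, 0) for lane in incoming_lanes]
    return obs


def pressure_reward(
    incoming_lanes: Sequence[str],
    outgoing_lanes: Sequence[str],
    vehicle_num: Dict[str, int],
) -> int:
    """Assumed convention: reward = -|vehicles in - vehicles out|."""
    incoming = sum(vehicle_num.get(lane, 0) for lane in incoming_lanes)
    outgoing = sum(vehicle_num.get(lane, 0) for lane in outgoing_lanes)
    return -abs(incoming - outgoing)


if __name__ == "__main__":
    counts = {"in_a": 3, "in_b": 1, "out_c": 1}
    waiting = {"in_a": 2}
    print(build_observation(1, 4, ["in_a", "in_b"], counts, waiting))
    print(pressure_reward(["in_a", "in_b"], ["out_c"], counts))
```

With one such vector per crossing, the observation length is the number of phases plus twice the number of incoming lanes.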