Merge pull request #22 from ANRGUSC/develop
Updated Documentation of Jupiter + Fixed a Small Bug
iampradiptaghosh authored Aug 15, 2018
2 parents c36e505 + aaf955b commit aece1a6
Showing 33 changed files with 236 additions and 51 deletions.
7 changes: 3 additions & 4 deletions circe/monitor.py
@@ -26,8 +26,7 @@
from socket import gethostbyname, gaierror, error
import multiprocessing
import time
import urllib.request
from urllib import parse
import urllib
import configparser

def send_monitor_data(msg):
@@ -47,7 +46,7 @@ def send_monitor_data(msg):
print("Sending message", msg)
url = "http://" + home_node_host_port + "/recv_monitor_data"
params = {'msg': msg, "work_node": taskname}
params = parse.urlencode(params)
params = urllib.parse.urlencode(params)
req = urllib.request.Request(url='%s%s%s' % (url, '?', params))
res = urllib.request.urlopen(req)
res = res.read()
@@ -75,7 +74,7 @@ def send_runtime_profile(msg):
print("Sending message", msg)
url = "http://" + home_node_host_port + "/recv_runtime_profile"
params = {'msg': msg, "work_node": taskname}
params = parse.urlencode(params)
params = urllib.parse.urlencode(params)
req = urllib.request.Request(url='%s%s%s' % (url, '?', params))
res = urllib.request.urlopen(req)
res = res.read()
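
As an aside, the query-string pattern used by both functions above can be exercised on its own; a minimal sketch with a placeholder home-node address (explicitly importing both ``urllib`` submodules, which is the safe form):

.. code-block:: python

   import urllib.parse
   import urllib.request

   home_node_host_port = '127.0.0.1:5000'  # placeholder, not Jupiter's real value
   url = 'http://' + home_node_host_port + '/recv_monitor_data'
   params = urllib.parse.urlencode({'msg': 'hello', 'work_node': 'task0'})
   req = urllib.request.Request(url='%s?%s' % (url, params))
   res = urllib.request.urlopen(req).read()
   print(res)
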
2 changes: 1 addition & 1 deletion docs/deploy_sphinx
@@ -21,7 +21,7 @@ rm -rf source/scripts
sphinx-apidoc -o source/scripts ../scripts

rm -rf source/task_mapper/heft
sphinx-apidoc -o source/task_mapper/heft ../task_mapper/heft
sphinx-apidoc -o source/task_mapper/heft/modified ../task_mapper/heft/modified

rm -rf source/task_mapper/wave
sphinx-apidoc -o source/task_mapper/wave/greedy_wave/home ../task_mapper/wave/greedy_wave/home
1 change: 1 addition & 0 deletions docs/source/Acirce.rst
@@ -12,6 +12,7 @@ Circe Reference
circe/rt_profiler_update_mongo
circe/runSQuery
circe/scheduler
circe/evaluate



1 change: 1 addition & 0 deletions docs/source/Aprofilers.rst
@@ -13,6 +13,7 @@ Network Profiler
profilers/network_resource_profiler/home/central_scheduler
profilers/network_resource_profiler/home/generate_link_list
profilers/network_resource_profiler/worker/automate_droplet
profilers/network_resource_profiler/worker/get_schedule



3 changes: 2 additions & 1 deletion docs/source/Ascripts.rst
@@ -63,14 +63,15 @@ Docker file preparation scripts
circe/circe_docker_files_generator
profilers/execution_profiler/exec_docker_files_generator
profilers/network_resource_profiler/profiler_docker_files_generator
task_mapper/heft/heft_dockerfile_generator
task_mapper/heft/modified/heft_dockerfile_generator


Other scripts
-------------
.. toctree::
:maxdepth: 4

scripts/auto_redeploy
scripts/static_assignment
scripts/utilities
scripts/keep_alive
10 changes: 5 additions & 5 deletions docs/source/Ataskmapper.rst
@@ -10,11 +10,11 @@ HEFT
.. toctree::
:maxdepth: 4

task_mapper/heft/create_input
task_mapper/heft/heft_dup
task_mapper/heft/master_heft
task_mapper/heft/read_input_heft
task_mapper/heft/write_input_heft
task_mapper/heft/modified/create_input
task_mapper/heft/modified/heft_dup
task_mapper/heft/modified/master_heft
task_mapper/heft/modified/read_input_heft
task_mapper/heft/modified/write_input_heft


WAVE
90 changes: 90 additions & 0 deletions docs/source/Japi.rst
@@ -0,0 +1,90 @@
Integration Interface
=====================

Jupiter uses ``SCP`` as the default file transfer method and ``DRUPE`` as the default network monitoring tool. We have decoupled these modules in Jupiter so that you can create and use your own modules if you want.

.. figure:: images/api_general.png
   :align: center

   General Integration Interface between Jupiter and the Network & Resource Monitor Tool

.. figure:: images/api_example.png
   :align: center

   Example of SCP and DRUPE using the Integration Interface

The Jupiter integration interface uses the following methods:

* ``get_network_data()``: retrieves network data from the Network & Resource Monitor Tool.
* ``get_resource_data()``: retrieves resource data from the Network & Resource Monitor Tool.
* ``data_transfer(IP, user, pword, source, destination)``: transfers a file to the destination node using the provided credentials (IP, username, password) and file paths (source, destination).


.. warning:: The network data from the Network & Resource Monitor Tool working with Jupiter must be in the format of ``quadratic parameters``, which specify the communication cost of the network links.

.. warning:: The resource data from the Network & Resource Monitor Tool working with Jupiter must be a combination of ``CPU and memory`` information.
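
For orientation, here is a minimal sketch of a custom module exposing these three methods, with placeholder return values that satisfy the two warnings above (all names and numbers are illustrative, not Jupiter's actual implementation):

.. code-block:: python

   def get_network_data():
       # Per-link quadratic parameters (a, b, c): the cost of moving x
       # bytes between two nodes is modeled as a*x**2 + b*x + c.
       return {('node1', 'node2'): (0.0, 0.25, 10.0)}

   def get_resource_data():
       # Combination of CPU and memory information per node.
       return {'node1': {'cpu': 0.42, 'memory': 0.61}}

   def data_transfer(IP, user, pword, source, destination):
       # Copy `source` on this machine to `destination` on the remote node.
       raise NotImplementedError('plug in your transfer tool here')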


Please follow the guideline below (with examples from ``SCP`` and ``DRUPE``) to map your specific modules to the corresponding methods of the interface.

.. figure:: images/API.png
   :align: center

   An example with a more detailed implementation.


File Transfer method
--------------------

Write the data transfer function. In the ``SCP`` example, this function is ``data_transfer_scp``. Then add the corresponding mapping for the data transfer function:

.. code-block:: python
   :linenos:

   def transfer_mapping_decorator(TRANSFER):
       # Copy `source` on this node to `destination` on the remote node
       # over SCP, retrying on failure; num_retries and ssh_port are
       # module-level settings.
       def data_transfer_scp(IP, user, pword, source, destination):
           retry = 0
           while retry < num_retries:
               try:
                   cmd = "sshpass -p %s scp -P %s -o StrictHostKeyChecking=no -r %s %s@%s:%s" % (pword, ssh_port, source, user, IP, destination)
                   os.system(cmd)
                   print('data transfer complete\n')
                   break
               except:
                   print('profiler_worker.txt: SSH Connection refused or File transfer failed, will retry in 2 seconds')
                   time.sleep(2)
                   retry += 1
       # TRANSFER selects the method from TRANSFER_LIST; 0 maps to SCP,
       # which is also the fallback.
       if TRANSFER == 0:
           return data_transfer_scp
       return data_transfer_scp
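
Once the decorator is in place, the chosen transfer function is resolved once and then called like any other function; a usage sketch with placeholder arguments:

.. code-block:: python

   # TRANSFER is read from config.ini (0 selects SCP).
   data_transfer = transfer_mapping_decorator(TRANSFER)
   data_transfer('10.0.0.5', 'root', 'password',
                 '/tmp/profile.txt', '/tmp/profile.txt')
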
Network & Resource Monitor Tool
-------------------------------

Get resource data
^^^^^^^^^^^^^^^^^

Write the resource data crawling function. In the ``DRUPE`` example, this function is ``get_resource_data_drupe``. Then add the corresponding mapping for the resource data crawling function:

.. code-block:: python
   :linenos:

   def get_resource_data_mapping(PROFILER=0):
       # PROFILER selects the tool from PROFILERS_LIST; 0 maps to DRUPE,
       # which is also the fallback.
       if PROFILER == 0:
           return profilers_mapping_decorator(get_resource_data_drupe)
       return profilers_mapping_decorator(get_resource_data_drupe)

Get network data
^^^^^^^^^^^^^^^^

Write the network data crawling function. In the ``DRUPE`` example, this function is ``get_network_data_drupe``. Then add the corresponding mapping for the network data crawling function:

.. code-block:: python
   :linenos:

   def get_network_data_mapping(PROFILER=0):
       # PROFILER selects the tool from PROFILERS_LIST; 0 maps to DRUPE,
       # which is also the fallback.
       if PROFILER == 0:
           return profilers_mapping_decorator(get_network_data_drupe)
       return profilers_mapping_decorator(get_network_data_drupe)
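
Assuming ``PROFILER`` has been read from ``config.ini``, the two mapped crawlers can then be resolved and queried together; a usage sketch:

.. code-block:: python

   # PROFILER is read from config.ini (0 selects DRUPE).
   get_resource_data = get_resource_data_mapping(PROFILER)
   get_network_data = get_network_data_mapping(PROFILER)

   resource_info = get_resource_data()  # CPU and memory per node
   network_info = get_network_data()    # quadratic link-cost parameters
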
67 changes: 54 additions & 13 deletions docs/source/Jdeploy.rst
@@ -128,7 +128,7 @@ Also change the following line to refer to your app:
Version 2.0
^^^^^^^^^^^
In version 2.0, to simplify the process, we have provided the following scripts:
Starting from version 2.0, to simplify the process, we have provided the following scripts:

.. code-block:: text
:linenos:
@@ -141,9 +141,9 @@ In version 2.0, to simplify the process we have provided with the following scri
These scripts read the configuration information from ``jupiter_config.ini`` and ``jupiter_config.py`` to generate the corresponding Docker files for all the components.
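
For instance, a generator script of this kind might begin by reading the options it needs; a minimal sketch using only keys shown in this documentation (the actual Jupiter scripts may read more):

.. code-block:: python

   import configparser

   # Read the deployment options that drive Docker file generation.
   config = configparser.ConfigParser()
   config.read('jupiter_config.ini')

   scheduler = int(config['CONFIG']['SCHEDULER'])  # chosen task mapper
   mongo_port = int(config['PORT']['MONGO_SVC'])   # e.g. 6200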

Step 6 : Choose the task mapper
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-------------------------------

You must choose the Task Mapper from ``config.ini``. Currently, there are 3 options in the scheduling algorithm list: centralized (HEFT) and distributed (random WAVE, greedy WAVE).
You must choose the Task Mapper from ``config.ini``. Currently, there are 4 options in the scheduling algorithm list: centralized (original HEFT, modified HEFT) and distributed (random WAVE, greedy WAVE).

.. code-block:: text
:linenos:
@@ -156,8 +156,48 @@ You must choose the Task Mapper from ``config.ini``. Currently, there are 3 opti
HEFT = 0
WAVE_RANDOM = 1
WAVE_GREEDY = 2
HEFT_MODIFIED = 3
.. note:: When HEFT tries to optimize the makespan by reducing communication overhead and putting many tasks on the same computing node, it ends up overloading those nodes. While the Jupiter system can recover from failures, multiple failures of the overloaded computing nodes actually end up adding more delay to the execution of the tasks, as well as to the communication between tasks, due to temporary disruptions of the data flow. The modified HEFT is restricted to allocating no more than ``MAX_TASK_ALLOWED`` containers per computing node, where ``MAX_TASK_ALLOWED`` depends on the processing power of the node. You can find the ``MAX_TASK_ALLOWED`` variable in ``heft_dup.py``.
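
A simplified sketch of the per-node cap described above (the helper below is illustrative, not the actual ``heft_dup.py`` internals):

.. code-block:: python

   # Sketch: modified HEFT only considers nodes that still have room,
   # i.e. nodes hosting fewer than MAX_TASK_ALLOWED task containers.
   def eligible_nodes(nodes, assignments, max_task_allowed):
       return [n for n in nodes
               if len(assignments.get(n, [])) < max_task_allowed]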

Step 7 : Optional - Modify the File Transfer Method or Network & Resource Monitor Tool
--------------------------------------------------------------------------------------

Select File Transfer method
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jupiter uses ``SCP`` as the default file transfer method. If you want to use another file transfer tool (such as ``XCP``), you can perform the following two steps:

First, refer to the :ref:`Integration Interface` and write your corresponding File Transfer module.

Second, update ``config.ini`` so that Jupiter uses your corresponding File Transfer method.

.. code-block:: text
   :linenos:

   [CONFIG]
   TRANSFER = 0
   [TRANSFER_LIST]
   SCP = 0
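
For example, supporting a hypothetical ``XCP`` method would mean adding ``XCP = 1`` under ``[TRANSFER_LIST]``, setting ``TRANSFER = 1``, and returning your new function from the mapping decorator shown in the Integration Interface; a sketch, assuming you have written ``data_transfer_xcp``:

.. code-block:: python

   def transfer_mapping_decorator(TRANSFER):
       def data_transfer_scp(IP, user, pword, source, destination):
           ...  # SCP implementation shown in the Integration Interface
       def data_transfer_xcp(IP, user, pword, source, destination):
           ...  # your XCP implementation (hypothetical)
       # Index values mirror TRANSFER_LIST in config.ini.
       if TRANSFER == 1:
           return data_transfer_xcp
       return data_transfer_scp
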
Step 7 : Push the Dockers
Select Network & Resource Monitor Tool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jupiter uses ``DRUPE`` as the default Network & Resource Monitor Tool. If you want to use another Network & Resource Monitor Tool, you can perform the following two steps:

First, refer to the :ref:`Integration Interface` and write your corresponding Network & Resource Monitor module.

Second, update ``config.ini`` so that Jupiter uses your corresponding Network & Resource Monitor module.

.. code-block:: text
   :linenos:

   [CONFIG]
   PROFILER = 0
   [PROFILERS_LIST]
   DRUPE = 0
Step 8 : Push the Dockers
-------------------------

Now, you need to build your Docker images.
@@ -199,7 +239,7 @@ The same thing needs to be done for the profilers, the WAVE and HEFT files.
.. warning:: However, before running any of these scripts, you should update the ``jupiter_config`` file with your own Docker image names as well as your Docker Hub username. DO NOT run the scripts without crosschecking the config file.

Step 8 : Setup the Proxy
Step 9 : Setup the Proxy
------------------------

Now, you have to create a Kubernetes proxy. You can do that by running the following command in a terminal.
@@ -210,8 +250,8 @@ Now, you have to create a kubernetes proxy. You can do that by running the follw
kubectl proxy -p 8080
Step 9 : Create the Namespaces
------------------------------
Step 10 : Create the Namespaces
-------------------------------

You need to create different namespaces in your Kubernetes cluster
that will be dedicated to the DRUPE, execution profiler, Task Mapper, and CIRCE deployments, respectively.
@@ -236,8 +276,8 @@ You can create these namespaces commands similar to the following:
EXEC_NAMESPACE = 'johndoe-exec'
Step 10 : Run the Jupiter Orchestrator
-------------------------------------
Step 11 : Run the Jupiter Orchestrator
--------------------------------------


Next, you can simply run:
@@ -249,18 +289,19 @@ Next, you can simply run:
python3 k8s_jupiter_deploy.py
Step 10 : Alternate
-------------------
Step 12 : Optional - Alternate scheduler
----------------------------------------

If you do not want to use WAVE for the scheduler and want to design your own, you can do so by simply using ``static_assignment.py``. Set ``STATIC_MAPPING`` to ``1`` in ``jupiter_config.ini``, and pipe your scheduling output to ``static_assignment.py`` while conforming to the sample DAG and sample schedule structure. Then you can run:
If you do not want to use our task mappers (``HEFT`` or ``WAVE``) and prefer to design your own scheduler, you can do so by simply using ``static_assignment.py``. Set ``STATIC_MAPPING`` to ``1`` in ``jupiter_config.ini``, and pipe your scheduling output to ``static_assignment.py`` while conforming to the sample DAG and sample schedule structure. Then you can run:

.. code-block:: bash
   :linenos:

   cd scripts/
   python3 k8s_jupiter_deploy.py
Step 11 : Interact With the DAG
Step 13 : Interact With the DAG
-------------------------------

Now you can interact with the pods using the Kubernetes dashboard.
25 changes: 20 additions & 5 deletions docs/source/Joverview.rst
@@ -73,6 +73,8 @@ This file includes all paths configuration for Jupiter system to start. The late
STATIC_MAPPING = int(config['CONFIG']['STATIC_MAPPING'])
SCHEDULER = int(config['CONFIG']['SCHEDULER'])
TRANSFER = int(config['CONFIG']['TRANSFER'])
PROFILER = int(config['CONFIG']['PROFILER'])
USERNAME = config['AUTH']['USERNAME']
PASSWORD = config['AUTH']['PASSWORD']
@@ -91,10 +93,12 @@ This file includes all paths configuration for Jupiter system to start. The late
WAVE_PATH = HERE + 'task_mapper/wave/random_wave/'
SCRIPT_PATH = HERE + 'scripts/'
if SCHEDULER == 1:
if SCHEDULER == config['SCHEDULER_LIST']['WAVE_RANDOM']:
WAVE_PATH = HERE + 'task_mapper/wave/random_wave/'
elif SCHEDULER == 2:
elif SCHEDULER == config['SCHEDULER_LIST']['WAVE_GREEDY']:
WAVE_PATH = HERE + 'task_mapper/wave/greedy_wave/'
elif SCHEDULER == config['SCHEDULER_LIST']['HEFT_MODIFIED']:
HEFT_PATH = HERE + 'task_mapper/heft/modified/'
KUBECONFIG_PATH = os.environ['KUBECONFIG']
@@ -156,14 +160,16 @@ You also need to specify the corresponding information:
File config.ini
---------------

This file includes all configuration options for the Jupiter system to start. The latest version of the ``config.ini`` file includes the type of mapping (static or dynamic), port information (SSH, Flask, Mongo), authorization (username and password), and the scheduling algorithm (HEFT, random WAVE, greedy WAVE):
This file includes all configuration options for the Jupiter system to start. The latest version of the ``config.ini`` file includes the type of mapping (static or dynamic), port information (SSH, Flask, Mongo), authorization (username and password), and the scheduling algorithm (original HEFT, random WAVE, greedy WAVE, modified HEFT):

.. code-block:: text
:linenos:
[CONFIG]
STATIC_MAPPING = 0
SCHEDULER = 2
TRANSFER = 0
PROFILER = 0
[PORT]
MONGO_SVC = 6200
MONGO_DOCKER = 27017
@@ -182,8 +188,17 @@ This file includes all configuration options for Jupiter system to start. The la
HEFT = 0
WAVE_RANDOM = 1
WAVE_GREEDY = 2
HEFT_MODIFIED = 3
[PROFILERS_LIST]
DRUPE = 0
[TRANSFER_LIST]
SCP = 0
.. warning:: You should specify the information in the ``CONFIG`` section to choose the specific scheduling algorithm from the ``SCHEDULER_LIST``. ``STATIC_MAPPING`` is chosen only for testing purposes.
.. warning:: You should set ``SCHEDULER`` in the ``CONFIG`` section to choose the specific scheduling algorithm from the ``SCHEDULER_LIST``. ``STATIC_MAPPING`` is chosen only for testing purposes.

.. warning:: You should set ``TRANSFER`` in the ``CONFIG`` section to choose the specific file transfer method from the ``TRANSFER_LIST``. The default file transfer method is ``SCP``. If you want to use another file transfer method, please refer to the guideline on how to use the integration interface.

.. warning:: You should set ``PROFILER`` in the ``CONFIG`` section to choose the specific network monitoring tool from the ``PROFILERS_LIST``. The default network monitoring tool is ``DRUPE``. If you want to use another network monitoring tool, please refer to the guideline on how to use the integration interface.
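
These two options are consumed by ``jupiter_config.py`` in the same way as the other ``CONFIG`` entries; a minimal reading sketch (the exact path handling in ``jupiter_config.py`` may differ):

.. code-block:: python

   import configparser

   config = configparser.ConfigParser()
   config.read('config.ini')

   TRANSFER = int(config['CONFIG']['TRANSFER'])   # 0 selects SCP
   PROFILER = int(config['CONFIG']['PROFILER'])   # 0 selects DRUPE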

File configuration.txt
----------------------
@@ -224,7 +239,7 @@ Inside the application folder, there should be a ``app_config.ini`` file having
Output
======

.. note:: Taking the node list from ``nodes.txt`` and the DAG information from ``configuration.txt``, Jupiter considers both the updated network connectivity (from the ``DRUPE`` network profiler) and the computational capabilities (from the ``DRUPE`` resource profiler) of all the nodes in the system. Jupiter then uses the chosen scheduling algorithm (``HEFT``, ``random WAVE``, or ``greedy WAVE``) to give an optimized mapping of tasks to nodes in the system. Next, ``CIRCE`` handles deploying the optimized mapping in the **Kubernetes** system.
.. note:: Taking the node list from ``nodes.txt`` and the DAG information from ``configuration.txt``, Jupiter considers both the updated network connectivity (from the ``DRUPE`` network profiler or your chosen tool) and the computational capabilities (from the ``DRUPE`` resource profiler or your chosen tool) of all the nodes in the system. Jupiter then uses the chosen scheduling algorithm (``original HEFT``, ``random WAVE``, ``greedy WAVE``, or ``modified HEFT``) to give an optimized mapping of tasks to nodes in the system. Next, ``CIRCE`` handles deploying the optimized mapping in the **Kubernetes** system.


