From d30caf4dd7cb218a9924b0564ae055af5c39c186 Mon Sep 17 00:00:00 2001
From: GitHub Action
Date: Mon, 18 May 2020 18:05:38 +0000
Subject: [PATCH] Update documentation

---
 README.md          |   1 +
 docs/hive.md       | 150 ++++++++++++++++++++++-----------------------
 docs/misc.md       | 128 +++++++++++++++++++-------------------
 docs/prometheus.md |  25 ++++++++
 4 files changed, 165 insertions(+), 139 deletions(-)
 create mode 100644 docs/prometheus.md

diff --git a/README.md b/README.md
index 37e7e0a..60e9a06 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,7 @@ mount_nfs fs-7abd2444.efs.us-east-1.amazonaws.com:/ /mnt/efs
 ## Available functions
 The following set of functions are available at present:
 * [spark](docs/spark.md)
+* [prometheus](docs/prometheus.md)
 * [misc](docs/misc.md)
 * [hive](docs/hive.md)
 * [hadoop](docs/hadoop.md)
diff --git a/docs/hive.md b/docs/hive.md
index 8bfb0d8..d103ae1 100644
--- a/docs/hive.md
+++ b/docs/hive.md
@@ -1,166 +1,166 @@
-# hive/hiveserver2.sh
+# hive/thrift-metastore.sh
 
-Provides functions to start/stop/restart HiveServer2
+Provides functions to start/stop/restart thrift metastore server
 
-* [is_hs2_configured()](#ishs2configured)
-* [stop_hs2()](#stophs2)
-* [start_hs2()](#starths2)
-* [restart_hs2()](#restarths2)
+* [start_thrift_metastore()](#startthriftmetastore)
+* [stop_thrift_metastore()](#stopthriftmetastore)
+* [restart_thrift_metastore()](#restartthriftmetastore)
 
 
-## is_hs2_configured()
+## start_thrift_metastore()
 
-Function to check if HiveServer2 is configured
+Function to start thrift metastore server
 
 ### Example
 
 ```bash
-if [[ is_hs2_configured ]]; then
- # do something here
-fi
+start_thrift_metastore
 ```
 
 _Function has no arguments._
 
-### Exit codes
-
-* **0**: If HiveServer2 is configured
-* **1**: Otherwise
-
-## stop_hs2()
-
-Function to stop HiveServer2 JVM
+## stop_thrift_metastore()
 
-Works on both Hadoop2 and HiveServer2 clusters
+Function to stop thrift metastore server
 
 ### Example
 
 ```bash
-stop_hs2
+stop_thrift_metastore
 ```
 
 _Function has no arguments._
 
-## start_hs2()
-
-Function to start HiveServer2 JVM
+## restart_thrift_metastore()
 
-Works on both Hadoop2 and HiveServer2 clusters
+Function to restart thrift metastore server
 
 ### Example
 
 ```bash
-start_hs2
+restart_thrift_metastore
 ```
 
 _Function has no arguments._
 
-## restart_hs2()
+# hive/ranger-client.sh
 
-Function to restart HiveServer2 JVM
+Provides function to install Apache Ranger client for Hive
 
-Works on both Hadoop2 and HiveServer2 clusters
+* [install_ranger()](#installranger)
+
+
+## install_ranger()
+
+Install Apache Ranger client for Hive
+
+Currently supported only on AWS
+Requires HiveServer2
 
 ### Example
 
 ```bash
-restart_hs2
+install_ranger -h example.host -p 6080 -r examplerepo
 ```
 
-_Function has no arguments._
+### Arguments
 
-# hive/thrift-metastore.sh
+* -h string Hostname of Ranger admin. Defaults to `localhost`
+* -p int Port where Ranger admin is running. Defaults to `6080`
+* -r string Name of Ranger repository. Defaults to `hivedev`
 
-Provides functions to start/stop/restart thrift metastore server
+# hive/glue-sync.sh
 
-* [start_thrift_metastore()](#startthriftmetastore)
-* [stop_thrift_metastore()](#stopthriftmetastore)
-* [restart_thrift_metastore()](#restartthriftmetastore)
+Provides function to install Hive Glue Catalog Sync Agent
 
+* [install_glue_sync()](#installgluesync)
 
-## start_thrift_metastore()
 
-Function to start thrift metastore server
+## install_glue_sync()
+
+Installs Hive Glue Catalog Sync Agent
+
+Requires Hive 2.x
+Currently supported only on AWS
 
 ### Example
 
 ```bash
-start_thrift_metastore
+install_glue_sync us-east-1
 ```
 
-_Function has no arguments._
+### Arguments
 
-## stop_thrift_metastore()
+* **$1** (string): Region for AWS Athena. Defaults to `us-east-1`
 
-Function to stop thrift metastore server
+# hive/hiveserver2.sh
 
-### Example
+Provides functions to start/stop/restart HiveServer2
 
-```bash
-stop_thrift_metastore
-```
+* [is_hs2_configured()](#ishs2configured)
+* [stop_hs2()](#stophs2)
+* [start_hs2()](#starths2)
+* [restart_hs2()](#restarths2)
 
-_Function has no arguments._
 
-## restart_thrift_metastore()
+## is_hs2_configured()
 
-Function to restart thrift metastore server
+Function to check if HiveServer2 is configured
 
 ### Example
 
 ```bash
-restart_thrift_metastore
+if [[ is_hs2_configured ]]; then
+ # do something here
+fi
 ```
 
 _Function has no arguments._
 
-# hive/glue-sync.sh
-
-Provides function to install Hive Glue Catalog Sync Agent
-
-* [install_glue_sync()](#installgluesync)
+### Exit codes
 
+* **0**: If HiveServer2 is configured
+* **1**: Otherwise
 
-## install_glue_sync()
+## stop_hs2()
 
-Installs Hive Glue Catalog Sync Agent
+Function to stop HiveServer2 JVM
 
-Requires Hive 2.x
-Currently supported only on AWS
+Works on both Hadoop2 and HiveServer2 clusters
 
 ### Example
 
 ```bash
-install_glue_sync us-east-1
+stop_hs2
 ```
 
-### Arguments
+_Function has no arguments._
 
-* **$1** (string): Region for AWS Athena. Defaults to `us-east-1`
+## start_hs2()
 
-# hive/ranger-client.sh
+Function to start HiveServer2 JVM
 
-Provides function to install Apache Ranger client for Hive
+Works on both Hadoop2 and HiveServer2 clusters
 
-* [install_ranger()](#installranger)
+### Example
 
+```bash
+start_hs2
+```
 
-## install_ranger()
+_Function has no arguments._
 
-Install Apache Ranger client for Hive
+## restart_hs2()
 
-Currently supported only on AWS
-Requires HiveServer2
+Function to restart HiveServer2 JVM
+
+Works on both Hadoop2 and HiveServer2 clusters
 
 ### Example
 
 ```bash
-install_ranger -h example.host -p 6080 -r examplerepo
+restart_hs2
 ```
 
-### Arguments
-
-* -h string Hostname of Ranger admin. Defaults to `localhost`
-* -p int Port where Ranger admin is running. Defaults to `6080`
-* -r string Name of Ranger repository. Defaults to `hivedev`
+_Function has no arguments._
 
diff --git a/docs/misc.md b/docs/misc.md
index fb28fbf..98c9710 100644
--- a/docs/misc.md
+++ b/docs/misc.md
@@ -1,3 +1,36 @@
+# misc/mount_nfs.sh
+
+Provides function to mount a NFS volume
+
+* [mount_nfs_volume()](#mountnfsvolume)
+
+
+## mount_nfs_volume()
+
+Mounts an NFS volume on master and worker nodes
+
+Instructions for AWS EFS mount:
+1. After creating the EFS file system, create a security group
+2. Create an inbound traffic rule for this security group that allows traffic on
+port 2049 (NFS) from this security group as described here:
+https://docs.aws.amazon.com/efs/latest/ug/accessing-fs-create-security-groups.html
+3. Add this security group as a persistent security group for the cluster from which
+you want to mount the EFS store, as described here:
+http://docs.qubole.com/en/latest/admin-guide/how-to-topics/persistent-security-group.html
+
+TODO: add instructions for Azure file share
+
+### Example
+
+```bash
+mount_nfs_volume "example.nfs.share:/" /mnt/efs
+```
+
+### Arguments
+
+* **$1** (string): Path to NFS share
+* **$2** (string): Mount point to use
+
 # misc/python_venv.sh
 
 Provides function to install Python virtualenv
@@ -29,110 +62,77 @@ install_python_env 3.6 /path/to/virtualenv/py36
 * **$1** (float): Version of Python to use. Defaults to 3.6
 * **$2** (string): Location to create virtualenv in. Defaults to /usr/lib/virtualenv/py36
 
-# misc/util.sh
-
-Provides miscellaneous utility functions
+# misc/awscli.sh
 
-* [set_timezone()](#settimezone)
-* [add_to_authorized_keys()](#addtoauthorizedkeys)
+Provides function to configure AWS CLI
 
+* [configure_awscli()](#configureawscli)
 
-## set_timezone()
 
-Set the timezone
+## configure_awscli()
 
-This function sets the timezone on the cluster node.
-The timezone to set is a mandatory parameter and must be present in /usr/share/zoneinfo
-Eg: "US/Mountain", "America/Los_Angeles" etc.
+Configure AWS CLI
 
-After setting the timezone, it is advised to restart engine daemons on the master and worker nodes
+A credentials file containing the AWS Access Key and the AWS Secret Key
+separated by a space, comma, tab or newline must be provided
 
 ### Example
 
 ```bash
-set_timezone "America/Los_Angeles"
+configure_awscli -p exampleprofile -r us-east-1 -c /path/to/credentials/file
 ```
 
 ### Arguments
 
-* **$1** (string): Timezone to set
-
-## add_to_authorized_keys()
-
-Add a public key to authorized_keys
-
-### Example
-
-```bash
-add_to_authorized_keys "ssh-rsa xyzxyzxyzxyz...xyzxyz user@example.com" ec2-user
-```
+* -p string Name of the profile. Defaults to `default`
+* -r string AWS region. Defaults to `us-east-1`
+* -c string Path to credentials file
 
-### Arguments
+### Exit codes
 
-* **$1** (string): Public key to add to authorized_keys file
-* **$2** (string): User for which the public key is added. Defaults to `ec2-user`
+* **0**: AWS CLI is configured
+* **1**: AWS CLI or credentials file not found
 
-# misc/mount_nfs.sh
+# misc/util.sh
 
-Provides function to mount a NFS volume
+Provides miscellaneous utility functions
 
-* [mount_nfs_volume()](#mountnfsvolume)
+* [set_timezone()](#settimezone)
+* [add_to_authorized_keys()](#addtoauthorizedkeys)
 
 
-## mount_nfs_volume()
+## set_timezone()
 
-Mounts an NFS volume on master and worker nodes
+Set the timezone
 
-Instructions for AWS EFS mount:
-1. After creating the EFS file system, create a security group
-2. Create an inbound traffic rule for this security group that allows traffic on
-port 2049 (NFS) from this security group as described here:
-https://docs.aws.amazon.com/efs/latest/ug/accessing-fs-create-security-groups.html
-3. Add this security group as a persistent security group for the cluster from which
-you want to mount the EFS store, as described here:
-http://docs.qubole.com/en/latest/admin-guide/how-to-topics/persistent-security-group.html
+This function sets the timezone on the cluster node.
+The timezone to set is a mandatory parameter and must be present in /usr/share/zoneinfo
+Eg: "US/Mountain", "America/Los_Angeles" etc.
 
-TODO: add instructions for Azure file share
+After setting the timezone, it is advised to restart engine daemons on the master and worker nodes
 
 ### Example
 
 ```bash
-mount_nfs_volume "example.nfs.share:/" /mnt/efs
+set_timezone "America/Los_Angeles"
 ```
 
 ### Arguments
 
-* **$1** (string): Path to NFS share
-* **$2** (string): Mount point to use
-
-# misc/awscli.sh
-
-Provides function to configure AWS CLI
-
-* [configure_awscli()](#configureawscli)
-
-
-## configure_awscli()
+* **$1** (string): Timezone to set
 
-Configure AWS CLI
+## add_to_authorized_keys()
 
-A credentials file containing the AWS Access Key and the AWS Secret Key
-separated by a space, comma, tab or newline must be provided
+Add a public key to authorized_keys
 
 ### Example
 
 ```bash
-configure_awscli -p exampleprofile -r us-east-1 -c /path/to/credentials/file
+add_to_authorized_keys "ssh-rsa xyzxyzxyzxyz...xyzxyz user@example.com" ec2-user
 ```
 
 ### Arguments
 
-* -p string Name of the profile. Defaults to `default`
-* -r string AWS region. Defaults to `us-east-1`
-* -c string Path to credentials file
-
-### Exit codes
-
-* **0**: AWS CLI is configured
-* **1**: AWS CLI or credentials file not found
+* **$1** (string): Public key to add to authorized_keys file
+* **$2** (string): User for which the public key is added. Defaults to `ec2-user`
 
diff --git a/docs/prometheus.md b/docs/prometheus.md
new file mode 100644
index 0000000..8320b7b
--- /dev/null
+++ b/docs/prometheus.md
@@ -0,0 +1,25 @@
+# prometheus/configure-prometheus.sh
+
+Provides functions to configure Prometheus
+
+* [configure_prometheus_ram_on_master()](#configureprometheusramonmaster)
+
+
+## configure_prometheus_ram_on_master()
+
+Ability to override the memory usage of prometheus daemon on master. Example : 500M
+
+ function requires one argument to be passed.
+ Argument must specify the ram to be allocated to the prometheus service from master node ram.
+ Input should be an integer. All the values are assumed in MB.
+
+### Example
+
+```bash
+ configure_prometheus_ram_on_master 600
+```
+
+### Arguments
+
+* **$1** (integer): Prometheus ram to be substituted in MB.
+