changes to use hs2 interface and to run suite in a loop #5
base: master
echo "Completed Running PerfData Collection Scripts"

zip -r $BENCH_HOME/$BENCHMARK/PerfData.zip $PERFDATA_OUTPUTDIR
zip -r $BENCH_HOME/$BENCHMARK/PerfData_$RUN_ID.zip $PERFDATA_OUTPUTDIR
The zip currently stores the full path (e.g. home/hdiuser/hive-testbench/PerfData_2/pat/tpch_query_2/....). Can we fix the zipping so the archive does not include the unnecessary /hdiuser/hive-testbench/ prefix?
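One possible way to keep the archive paths relative, sketched under the assumption that `$PERFDATA_OUTPUTDIR` is a directory and that `zip` is installed: change into the parent directory before zipping, so only the final directory name is stored. The helper name `zip_relative` is hypothetical and not part of this PR.

```shell
#!/bin/bash
# Hypothetical helper (not part of this PR): archive a directory so that the
# zip stores paths relative to the directory's parent, not the full path.
zip_relative() {
    local src_dir="$1"    # directory to archive, e.g. $PERFDATA_OUTPUTDIR
    local zip_file="$2"   # archive to create; must be an absolute path
    local parent name
    parent=$(dirname "$src_dir")
    name=$(basename "$src_dir")
    # Run zip from the parent directory in a subshell so the caller's
    # working directory is left unchanged.
    ( cd "$parent" && zip -r -q "$zip_file" "$name" )
}
```

It could then be called as `zip_relative "$PERFDATA_OUTPUTDIR" "$BENCH_HOME/$BENCHMARK/PerfData_$RUN_ID.zip"`, assuming `$BENCH_HOME` is absolute.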

LOG_DIR=$BENCH_HOME/$BENCHMARK/logs/
LOG_DIR=$BENCH_HOME/$BENCHMARK/logs_$RUN_ID/
Can we include everything about one run under a single dir?
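A rough sketch of what a single per-run directory might look like, with logs, results, and plans derived from one parent. The `RUN_DIR` variable, the `run_<id>` naming, and the `setup_run_dirs` helper are assumptions for illustration, not names from the PR.

```shell
#!/bin/bash
# Sketch: derive every per-run directory from one parent directory.
# RUN_DIR, run_<id> and setup_run_dirs are hypothetical names.
setup_run_dirs() {
    local run_id="$1"
    RUN_DIR=$BENCH_HOME/$BENCHMARK/run_$run_id
    LOG_DIR=$RUN_DIR/logs/
    RESULT_DIR=$RUN_DIR/results/
    PLAN_DIR=$RUN_DIR/plans/
    # Create all three in one call; -p makes this safe to repeat.
    mkdir -p "$LOG_DIR" "$RESULT_DIR" "$PLAN_DIR"
}
```

With this layout, archiving or deleting one run only involves `$RUN_DIR`.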
BENCH_HOME=$( cd "$( dirname "${BASH_SOURCE[0]}" )/../../" && pwd );
echo "\$BENCH_HOME is set to $BENCH_HOME";

BENCHMARK=hive-testbench

RESULT_DIR=$BENCH_HOME/$BENCHMARK/results/
RESULT_DIR=$BENCH_HOME/$BENCHMARK/results_$RUN_ID/
Can we include everything about one run under a single dir?
@@ -0,0 +1,22 @@
#!/bin/bash
#usage: ./RunSingleQueryLoop QUERY_NUMBER REPEAT_COUNT SCALCE_FACTOR CLUSTER_SSH_PASSWORD
Wrong usage.

PLAN_DIR=$BENCH_HOME/$BENCHMARK/plans/
PLAN_DIR=$BENCH_HOME/$BENCHMARK/plans_$RUN_ID/
Same as above. Can this go under a single dir?

timeout ${TIMEOUT} hive -i ${HIVE_SETTING} --database ${DATABASE} -d EXPLAIN="" -f ${QUERY_DIR}/tpch_query${2}.sql > ${RESULT_DIR}/${DATABASE}_query${j}.txt 2>&1
beeline -u ${CONNECTION_STRING} -i ${HIVE_SETTING} --hivevar EXPLAIN="" -f ${QUERY_DIR}/tpch_query${2}.sql > ${RESULT_DIR}/${DATABASE}_query${j}.txt 2>&1
nit: extra space at the start.
@@ -13,6 +13,8 @@ fi

>${STATS_DIR}/tableinfo_${DATABASE}.txt;

hive -d DB=${DATABASE} -f gettpchtablecounts.sql > ${STATS_DIR}/tablecounts_${DATABASE}.txt ;
hive -d DB=${DATABASE} -f gettpchtableinfo.sql >> ${STATS_DIR}/tableinfo_${DATABASE}.txt ;
CONNECTION_STRING="jdbc:hive2://localhost:10001/${DATABASE};transportMode=http"
Will this work in case of failover?
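If the cluster has HiveServer2 dynamic service discovery enabled, one option might be to point beeline at the ZooKeeper quorum instead of a fixed host, so the client resolves whichever HiveServer2 instance is currently registered. The sketch below only builds such a URL. The quorum hosts, the `hiveserver2` namespace default, and the helper name `hs2_zk_url` are assumptions to be replaced with the cluster's actual values, and whether this behaves correctly with HTTP transport on this cluster would need to be verified.

```shell
#!/bin/bash
# Sketch: build a HiveServer2 JDBC URL that uses ZooKeeper service discovery
# instead of a fixed localhost:10001 endpoint.
# hs2_zk_url is a hypothetical helper; hosts and namespace are placeholders.
hs2_zk_url() {
    local zk_quorum="$1"                  # e.g. "zk0:2181,zk1:2181,zk2:2181"
    local database="$2"
    local namespace="${3:-hiveserver2}"   # assumed default namespace
    echo "jdbc:hive2://${zk_quorum}/${database};serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=${namespace}"
}
```

Usage would then be along the lines of `CONNECTION_STRING=$(hs2_zk_url "$ZK_QUORUM" "$DATABASE")`, where `ZK_QUORUM` is again a hypothetical variable.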
@@ -53,7 +53,7 @@ hdfs dfs -mkdir -p ${DIR}
hdfs dfs -ls ${DIR}/${SCALE}/lineitem > /dev/null
if [ $? -ne 0 ]; then
echo "Generating data at scale factor $SCALE."
(cd tpch-gen; hadoop jar target/*.jar -d ${DIR}/${SCALE}/ -s ${SCALE})
(cd tpch-gen; hadoop jar target/*.jar -D mapreduce.map.memory.mb=8192 -d ${DIR}/${SCALE}/ -s ${SCALE})
We should not hard-code settings here. Maybe have a global variable or something if you really want.
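As a sketch of the global-variable idea, the extra generator options could come from one overridable variable that defaults to empty, so the cluster defaults apply unless the caller opts in. `TPCH_GEN_OPTS` and `tpch_gen_command` are hypothetical names, and the function only prints the command so the expansion is easy to inspect.

```shell
#!/bin/bash
# Sketch: take extra generator options from one overridable variable instead
# of hard-coding -D mapreduce.map.memory.mb=8192 in the command.
# TPCH_GEN_OPTS is a hypothetical name; empty means "use cluster defaults".
TPCH_GEN_OPTS=${TPCH_GEN_OPTS:-}

# Print the generator command for the given HDFS directory and scale factor.
tpch_gen_command() {
    local dir="$1" scale="$2"
    # ${VAR:+...} adds the options plus a separating space only when set.
    echo "hadoop jar target/*.jar ${TPCH_GEN_OPTS:+$TPCH_GEN_OPTS }-d ${dir}/${scale}/ -s ${scale}"
}
```

In the script itself the variable could be expanded directly in the existing line, for example `hadoop jar target/*.jar ${TPCH_GEN_OPTS} -d ${DIR}/${SCALE}/ -s ${SCALE}`, with the caller exporting `TPCH_GEN_OPTS="-D mapreduce.map.memory.mb=8192"` when needed.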
runcommand "hive -i settings/load-flat.sql -f ddl-tpch/bin_flat/alltables.sql -d DB=tpch_text_${SCALE} -d LOCATION=${DIR}/${SCALE}"

DATABASE=tpch_text_${SCALE}
CONNECTION_STRING="jdbc:hive2://localhost:10001/$DATABASE;transportMode=http"
Same as above.
Also, maybe we should have all of these settings in a config file rather than repeating them every time. This is error-prone.
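A minimal sketch of the config-file approach: each script sources one shared file and builds the connection string from it, rather than repeating the host, port, and transport mode. The `HS2_*` variable names and both helper functions are hypothetical. The defaults simply mirror the values currently repeated in the scripts.

```shell
#!/bin/bash
# Sketch: load shared settings from one file so scripts do not each repeat
# the HiveServer2 host, port and transport mode.
# load_bench_config, connection_string and HS2_* are hypothetical names.
load_bench_config() {
    local config_file="$1"
    # Defaults mirror the values currently repeated in the scripts.
    HS2_HOST=localhost
    HS2_PORT=10001
    HS2_TRANSPORT=http
    # Values in the config file, if it exists, override the defaults.
    if [ -f "$config_file" ]; then
        . "$config_file"
    fi
}

# Build the JDBC URL for a database from the loaded settings.
connection_string() {
    local database="$1"
    echo "jdbc:hive2://${HS2_HOST}:${HS2_PORT}/${database};transportMode=${HS2_TRANSPORT}"
}
```

A script would then call `load_bench_config` once with the path to the shared file and use `CONNECTION_STRING=$(connection_string "$DATABASE")` wherever the URL is needed.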
No description provided.