
Add methods to log the results of the api runner to a server #177

Merged

merged 10 commits into main from rishabh/log-to-server on Jun 24, 2024

Conversation

rishsriv
Member

@rishsriv rishsriv commented Jun 22, 2024

The main idea behind this is to help us use the eval visualizer with just a run name, instead of having the JSON files locally. One more PR in the sql-eval repo coming up!

This will just work™ with all the run_checkpoints*.sh scripts without any code changes. All you have to do is add the environment variable SQL_EVAL_UPLOAD_URL, which corresponds to our record-eval function (not including the full URL here since this is a public repo), and you'll be good to go!

I have already added this as an env variable on the a10, gpu-inference, and h100 production instances.
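For reference, here is a minimal sketch of what the upload step could look like; the helper name upload_results matches the call shown later in the diff, but the payload shape, the use of requests, and the environment-variable fallback are assumptions, not the exact code in this PR.

```python
# Hedged sketch of the upload step; the payload shape, the use of `requests`,
# and the SQL_EVAL_UPLOAD_URL fallback are assumptions, not the PR's exact code.
import os

import requests


def upload_results(results: list[dict], url: str, run_name: str) -> None:
    """POST eval results as JSON to the logging endpoint (the record-eval function)."""
    resp = requests.post(url, json={"run_name": run_name, "results": results})
    resp.raise_for_status()


# The runner could fall back to the environment variable when --upload_url is unset.
upload_url = os.environ.get("SQL_EVAL_UPLOAD_URL")
```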

@@ -33,6 +33,7 @@
parser.add_argument("-v", "--verbose", action="store_true")
parser.add_argument("-l", "--logprobs", action="store_true")
parser.add_argument("--upload_url", type=str)
parser.add_argument("--run_name", type=str, required=False)
Member Author
Added a param to optionally specify a run name when running an eval.

The run name is then stored as a static JSON object, which can be used by the eval visualizer.

Collaborator

Thanks! nit: shall we update the README with the new option as well? 😄

Member Author

Done, thank you for pointing this out!

Collaborator

@wongjingping wongjingping left a comment

Thanks for cleaning up and reviving the logic to log to a GCS bucket! One small suggestion for making the various option paths more robust 👌🏼

Just checking, where do we read the SQL_EVAL_UPLOAD_URL environment variable (which you mentioned in the description)? Or are we just using the args.upload_url now?

Separately, no action needed, but I thought it would also be nice to have the option to do blob.upload_from_string(json.dumps(results)) within the sql-eval process / local environment directly, since it's more secure to authenticate locally than to have a public HTTP endpoint that can write to one's GCS bucket.
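For illustration, a minimal sketch of that local upload path, assuming the google-cloud-storage client is installed and authenticated locally; the bucket and object names are placeholders:

```python
# Minimal sketch of uploading results to GCS directly from the sql-eval process;
# bucket and object names are placeholders, not the repo's actual configuration.
import json

from google.cloud import storage  # relies on local application-default credentials


def upload_results_to_gcs(results: list[dict], run_name: str) -> None:
    client = storage.Client()
    bucket = client.bucket("my-eval-results-bucket")  # placeholder bucket name
    blob = bucket.blob(f"{run_name}.json")
    blob.upload_from_string(json.dumps(results), content_type="application/json")
```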


@@ -0,0 +1,19 @@
# this is a Google Cloud Function for receiving eval results from the runner and storing them in a GCS bucket
Collaborator

nit: shall we add the command for launching this cloud function (in case we need to modify it and relaunch in the future)?


if args.upload_url is not None:
    upload_results(
        results=results,
Collaborator

@wongjingping wongjingping Jun 24, 2024

It seems like results is only created in L289 when logprobs is true:

results = output_df.to_dict("records")

And I think this might fail if logprobs is not set but upload_url is provided (e.g. if the user is not exporting logprobs, say when using TGI or some other inference API), since results hasn't yet been defined in that code path.

Shall we update the code before the check in L285 to output results regardless of whether logprobs is provided?
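For illustration, a self-contained sketch of the ordering being suggested; the DataFrame contents and flag handling are placeholders, not the repo's actual code:

```python
# Self-contained illustration of the suggested ordering; the DataFrame contents
# and flag handling are placeholders, not the repo's actual code.
import argparse

import pandas as pd

parser = argparse.ArgumentParser()
parser.add_argument("-l", "--logprobs", action="store_true")
parser.add_argument("--upload_url", type=str)
args = parser.parse_args([])  # empty args, just for this example

output_df = pd.DataFrame([{"question": "q1", "correct": 1}])

# Build `results` unconditionally, so the upload path below never references
# an undefined variable when --logprobs is not passed.
results = output_df.to_dict("records")

if args.logprobs:
    pass  # the existing logprobs export would go here

if args.upload_url is not None:
    pass  # the existing upload_results(...) call would go here
```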

Member Author

Great point, done!

Collaborator

@wongjingping wongjingping left a comment

Thanks for the updates!

@rishsriv
Member Author

Thanks for the detailed comments!

Separately, no action needed, but I thought it would also be nice to have the option to do blob.upload_from_string(json.dumps(results)) within the sql-eval process / local environment directly, since it's more secure to authenticate locally than to have a public HTTP endpoint that can write to one's GCS bucket.

Good point. This is a bit of a tradeoff: I wouldn't feel comfortable uploading sensitive credentials to some GPU providers (like, say, community instances). In the future, we could optionally add some kind of token authentication to the public logging URL if needed. That would give us both security and the ability to avoid uploading credentials to those machines!
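For illustration, a hedged sketch of what such a token check could look like in the cloud function; the handler name, header name, and secret variable are assumptions, not existing code:

```python
# Hypothetical token check for the logging endpoint; the handler name, header
# name, and secret variable are assumptions, not code that exists in the repo.
import os

import functions_framework


@functions_framework.http
def record_eval(request):
    expected = os.environ.get("SQL_EVAL_UPLOAD_TOKEN")  # assumed secret name
    provided = request.headers.get("X-Upload-Token")  # assumed header name
    if not expected or provided != expected:
        return ("Unauthorized", 401)
    results = request.get_json(silent=True) or []
    # ... the existing logic would write `results` to the GCS bucket here ...
    return ("OK", 200)
```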

@rishsriv rishsriv merged commit 49249fd into main Jun 24, 2024
2 checks passed
@rishsriv rishsriv deleted the rishabh/log-to-server branch June 24, 2024 02:59