Updated questions and golden queries to accept multiple correct answers and reduce ambiguity #37
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,8 +60,8 @@ What are the names of all the courses offered by the department of Computer Scie | |
"What are the easiness scores for courses in the ""Computer Science"" department?","SELECT {course.name, course.course_id, course.number}, course.easiness_score FROM course WHERE course.department ilike '%Computer Science%';",advising,where | ||
How many students have taken a course in-person or online?,SELECT count(DISTINCT sr.student_id) AS num_students FROM student_record sr JOIN student s ON sr.student_id = s.student_id WHERE sr.how ilike '%in-person%' OR sr.how ilike '%online%';,advising,where | ||
Which flight has the shortest duration between departure and arrival times? Convert to minutes.,"SELECT {flight.flight_number, flight.flight_id}, (arrival_time - departure_time) / 60 AS duration_minutes FROM flight ORDER BY duration_minutes LIMIT 1;",atis,date_functions | ||
-What's the average duration between departure and arrival times minus 34 minutes? Convert from UNIX to regular datetime.,SELECT avg(to_timestamp(arrival_time) - to_timestamp(departure_time) - interval '34 minutes') AS average_duration FROM flight;,atis,date_functions | ||
-Count the number of flight departures for each month?,"SELECT month.month_name, count(*) AS departure_count FROM flight JOIN MONTH ON extract(MONTH FROM to_timestamp(flight.departure_time)) = month.month_number GROUP BY month.month_name, month.month_number ORDER BY month.month_number;",atis,date_functions | ||
+What's the average duration between departure and arrival times minus 34 minutes? Convert from UNIX to regular datetime.,"SELECT avg(to_timestamp(arrival_time) - to_timestamp(departure_time) - interval '34 minutes') AS average_duration FROM flight;SELECT AVG(arrival_time - departure_time)/60 - 34 AS average_duration FROM flight;",atis,date_functions | ||
+Count the number of flight departures for each month?,"SELECT month.month_name, count(*) AS departure_count FROM flight JOIN MONTH ON extract(MONTH FROM to_timestamp(flight.departure_time)) = month.month_number GROUP BY month.month_name, month.month_number ORDER BY month.month_number;SELECT date_trunc('month', to_timestamp(flight.departure_time)) AS MONTH, COUNT(*) AS num_departures FROM flight GROUP BY MONTH ORDER BY MONTH;",atis,date_functions | ||
What's the earliest flight departure time in the day in HH:MM?,"SELECT to_char(to_timestamp(departure_time)::TIME, 'HH24:MI') AS earliest_departure_time FROM flight ORDER BY earliest_departure_time LIMIT 1",atis,date_functions | ||
What's the difference in time in days between today and the earliest flight departure?,"SELECT date_part('day', CURRENT_DATE - to_timestamp(departure_time)) AS difference_in_days FROM flight ORDER BY departure_time LIMIT 1;",atis,date_functions | ||
What is the total cost of round-trip fares for each airline code?,"SELECT fare.fare_airline, SUM(fare.round_trip_cost) AS total_round_trip_cost FROM fare GROUP BY fare.fare_airline ORDER BY total_round_trip_cost DESC;",atis,group_by | ||
|
@@ -70,7 +70,7 @@ What is the total cost of round-trip fares for each airline code?,"SELECT fare.f | |
"How many meals are served in each compartment, sorted by the number of meals in descending order?","SELECT food_service.compartment, COUNT(food_service.meal_number) AS number_of_meals FROM food_service GROUP BY food_service.compartment ORDER BY number_of_meals DESC NULLS LAST;",atis,group_by | ||
"How many flights depart from each airport code, excluding stopovers?","SELECT airport.airport_code, COUNT(flight.from_airport) AS num_departures FROM airport LEFT JOIN flight ON airport.airport_code = flight.from_airport GROUP BY airport.airport_code;SELECT airport.airport_code, COUNT(flight.from_airport) AS num_departures FROM airport JOIN flight ON airport.airport_code = flight.from_airport GROUP BY airport.airport_code;",atis,group_by | ||
"Which flight ids to Chicago (ORD) have the longest duration from departure to arrival, sorted in ascending order?","SELECT flight.flight_id, (flight.arrival_time - flight.departure_time) AS duration FROM flight WHERE to_airport = 'ORD' ORDER BY duration ASC NULLS LAST;",atis,order_by | ||
-"Which airport(s) have the shortest minimum connect time, sorted in ascending order? Show the minimum connect time.","SELECT {airport.airport_name, airport.airport_code}, airport.minimum_connect_time FROM airport ORDER BY airport.minimum_connect_time ASC NULLS LAST LIMIT 1;",atis,order_by | ||
+"Which airports have the shortest minimum connect time, sorted in ascending order? Show the minimum connect time.","SELECT {airport.airport_name, airport.airport_code}, airport.minimum_connect_time FROM airport ORDER BY airport.minimum_connect_time ASC NULLS LAST;",atis,order_by | ||
Which aircraft code can carry the highest weight of cargo that any aircraft can carry?,SELECT aircraft.aircraft_code FROM aircraft ORDER BY pay_load DESC NULLS LAST LIMIT 1;,atis,order_by | ||
What are the top 2 airlines with the most flights?,"SELECT {airline.airline_name, airline.airline_code}, COUNT(flight.flight_id) AS number_of_flights FROM flight JOIN airline ON flight.airline_code = airline.airline_code GROUP BY {} ORDER BY number_of_flights DESC NULLS LAST LIMIT 2;",atis,order_by | ||
What are the aircraft codes for all aircraft with a cruising speed of over 200 mph? sort the aircraft codes in ascending order.,SELECT aircraft.aircraft_code FROM aircraft WHERE aircraft.cruising_speed > 200 ORDER BY aircraft.aircraft_code ASC NULLS LAST;,atis,order_by | ||
|
@@ -82,7 +82,7 @@ How does the average ratio of the cruising speed to the payload of an aircraft v | |
Which flights serve meals in first class? Give me the flight id and meal description.,"SELECT flight.flight_id, food_service.meal_description FROM flight JOIN food_service ON flight.meal_code = food_service.meal_code WHERE LOWER(food_service.compartment) LIKE '%first class%';",atis,table_join | ||
Which airlines offer flights with a stopover in Dallas?,"SELECT DISTINCT {airline.airline_name, airline.airline_code} FROM flight_stop JOIN airport ON flight_stop.stop_airport = airport.airport_code JOIN flight ON flight_stop.flight_id = flight.flight_id JOIN airline ON flight.airline_code = airline.airline_code WHERE airport.airport_location ILIKE '%Dallas%';",atis,table_join | ||
Which airlines offer flights from LAX to ORD?,"SELECT DISTINCT {airline.airline_name, airline.airline_code} FROM flight JOIN airline ON flight.airline_code = airline.airline_code WHERE flight.from_airport = 'LAX' AND flight.to_airport = 'ORD';",atis,table_join | ||
-"Which airlines offer flights from Chicago (ORD) to New York (JFK), and how many stops do they have, sorted by number of stops?","SELECT {airline.airline_name, airline.airline_code}, flight.stops FROM flight JOIN airline ON flight.airline_code = airline.airline_code WHERE flight.from_airport = 'ORD' AND flight.to_airport = 'JFK' GROUP BY {}, flight.stops ORDER BY flight.stops NULLS LAST;",atis,table_join | ||
+"Which airlines offer flights from Chicago (ORD) to New York (JFK), and how many stops do they have, sorted by number of stops in ascending order?","SELECT {airline.airline_name, airline.airline_code}, flight.stops FROM flight JOIN airline ON flight.airline_code = airline.airline_code WHERE flight.from_airport = 'ORD' AND flight.to_airport = 'JFK' GROUP BY {}, flight.stops ORDER BY flight.stops NULLS LAST;",atis,table_join | ||
"Which airlines do not have flights that depart or arrive at JFK, excluding stopovers?","SELECT DISTINCT {a.airline_name, a.airline_code} FROM public.airline a LEFT JOIN public.flight f ON a.airline_code = f.airline_code AND (f.to_airport = 'JFK' OR f.from_airport = 'JFK') GROUP BY {} HAVING COUNT(f.flight_id) = 0;",atis,table_join | ||
Which state code is Orlando International Airport in?,SELECT state_code FROM airport WHERE airport_name ILIKE '%Orlando International Airport%';,atis,where | ||
Which flights operate on Mondays and Wednesdays? Give me the relevant flight numbers,"SELECT {flight.flight_number, flight.flight_id} FROM flight WHERE LOWER(flight.flight_days) LIKE '%mon%' AND LOWER(flight.flight_days) LIKE '%wed%';",atis,where | ||
|
@@ -146,7 +146,7 @@ Give me the total number of papers published in the first 12 months of 2019.,SEL | |
"On average, how many papers per month were published in the whole of 2020?",SELECT cast(count(*) AS float)/ 12 AS average_papers_per_month FROM paper WHERE YEAR = 2020;,scholar,date_functions | ||
What is the total number of papers published per year?,"SELECT paper.year, COUNT(paper.paperid) AS total_papers FROM paper GROUP BY paper.year ORDER BY paper.year NULLS LAST;",scholar,group_by | ||
What is the total number of papers published in each year?,"SELECT paper.year, COUNT(paper.paperid) AS total_papers FROM paper GROUP BY paper.year ORDER BY paper.year;",scholar,group_by | ||
-What is the total number of papers associated with each dataset?,"SELECT paperdataset.datasetid, COUNT(DISTINCT paperdataset.paperid) AS total_papers FROM paperdataset GROUP BY paperdataset.datasetid;",scholar,group_by | ||
+What is the total number of papers associated with each dataset?,"SELECT paperdataset.datasetid, COUNT(DISTINCT paperdataset.paperid) AS total_papers FROM paperdataset GROUP BY paperdataset.datasetid;SELECT dataset.datasetname, COUNT(paperdataset.paperid) AS total_papers FROM paperdataset JOIN dataset ON paperdataset.datasetid = dataset.datasetid GROUP BY dataset.datasetname ORDER BY total_papers DESC NULLS LAST;SELECT p.title, COUNT(DISTINCT a.authorid) AS num_authors FROM paper p JOIN writes w ON p.paperid = w.paperid JOIN author a ON w.authorid = a.authorid GROUP BY p.title ORDER BY num_authors DESC;",scholar,group_by | ||
The 2nd query looks good to me, but I think the 3rd one might not be related? It seems to be returning the paper title instead of the dataset name/id.
Ah, my bad. Pushing a fix!
Fixed!
How many keyphrases are associated with each paper?,"SELECT paperkeyphrase.paperid, COUNT(paperkeyphrase.keyphraseid) AS keyphrase_count FROM paperkeyphrase GROUP BY paperkeyphrase.paperid ORDER BY keyphrase_count DESC NULLS LAST;",scholar,group_by | ||
How many authors have published more than 2 papers?,SELECT COUNT(*) AS number_of_authors FROM (SELECT writes.authorid FROM writes GROUP BY writes.authorid HAVING COUNT(writes.paperid) > 2) AS subquery;,scholar,group_by | ||
"Which papers have the highest number of authors, ordered by the number of authors in descending order?","SELECT writes.paperid, COUNT(writes.authorid) AS num_authors FROM writes GROUP BY writes.paperid ORDER BY num_authors DESC NULLS LAST;",scholar,order_by | ||
|
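Several rows in the diff above now carry more than one acceptable golden query in a single CSV cell, joined by `;`. The sketch below shows one way such a cell could be split into its alternatives. It is only an illustration: `split_golden_queries` is a hypothetical helper, the column names in the comment are assumed, and the naive split assumes no query contains a semicolon inside a string literal (which holds for the rows shown here).

```python
import csv
import io
from typing import List


def split_golden_queries(cell: str) -> List[str]:
    """Split one golden-query cell into its alternative queries.

    Assumption: alternatives are separated by ';' and no query contains a
    semicolon inside a string literal.
    """
    return [part.strip() + ";" for part in cell.split(";") if part.strip()]


# One row copied from the diff above. The column order
# (question, query, database, category) is an assumption.
row_text = (
    '"How many flights depart from each airport code, excluding stopovers?",'
    '"SELECT airport.airport_code, COUNT(flight.from_airport) AS num_departures '
    "FROM airport LEFT JOIN flight ON airport.airport_code = flight.from_airport "
    "GROUP BY airport.airport_code;"
    "SELECT airport.airport_code, COUNT(flight.from_airport) AS num_departures "
    "FROM airport JOIN flight ON airport.airport_code = flight.from_airport "
    'GROUP BY airport.airport_code;",atis,group_by\n'
)

question, golden, db_name, category = next(csv.reader(io.StringIO(row_text)))
alternatives = split_golden_queries(golden)
for query in alternatives:
    print(query)
```

A generated query would then count as correct if its result matches the result of any one of the alternatives.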
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
+### Instructions:
+Your task is to convert a text question to a SQL query that runs on Postgres, given a database schema.
+
+### Input:
+Generate a SQL query that answers the question `{user_question}`.
+
+This query will run on a database whose schema is represented in this string:
+{table_metadata_string}
+
+### Response:
+Given the database schema, here is the SQL query that answers `{user_question}`:
+```sql
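The new prompt file has two placeholders, `{user_question}` and `{table_metadata_string}`. A minimal sketch of filling it with Python's `str.format` follows, assuming the placeholders are meant to be substituted that way. The schema string is made up for illustration (column names are taken from queries in the diff, the types are guesses), and the code fence at the end of the template is built indirectly only so that this example stays readable.

```python
FENCE = "`" * 3  # three backticks, built indirectly to avoid a literal fence here

# The twelve lines of the prompt file added in this PR.
PROMPT_TEMPLATE = "\n".join([
    "### Instructions:",
    "Your task is to convert a text question to a SQL query that runs on Postgres, given a database schema.",
    "",
    "### Input:",
    "Generate a SQL query that answers the question `{user_question}`.",
    "",
    "This query will run on a database whose schema is represented in this string:",
    "{table_metadata_string}",
    "",
    "### Response:",
    "Given the database schema, here is the SQL query that answers `{user_question}`:",
    FENCE + "sql",
])

# Hypothetical schema string, for illustration only.
schema = "CREATE TABLE airport (airport_code text, airport_name text, state_code text);"

prompt = PROMPT_TEMPLATE.format(
    user_question="Which state code is Orlando International Airport in?",
    table_metadata_string=schema,
)
print(prompt)
```

Because the template ends on an opening SQL fence, the model's completion is expected to begin directly with the query text.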
If we want the airports with the shortest minimum_connect_time, then we would need a subquery that gets the shortest minimum_connect_time and then finds the airports that have that particular value, right?
Ah, I read it as "give me the connect time in ascending order". I think the current one is fine?
Just texted a couple of folks to get more opinions, and it seems like "give me the connect time in ascending order" is the consensus understanding of this. Sticking with this for now, but we can change later :)
This was a very instructive lesson in phrasing, though. Context understanding is so non-trivial, and is one of the things our upcoming instruction-fine-tuned models will eventually get better at.
Thanks for explaining - agreed about the ascending order. I was focusing more on the 'shortest' bit, and it seemed to imply that we only want the one with the smallest value. Do you think it'd be ok if we remove the word 'shortest' here?
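To make the readings discussed in this thread concrete, here is a small sketch that runs them side by side on made-up data. It uses Python's built-in `sqlite3` as a stand-in for Postgres, so the table contents and airport codes are invented, `NULLS LAST` is left out because the sample has no NULLs, and `airport_code` is added as a tiebreaker only to keep the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airport (airport_code TEXT, minimum_connect_time INTEGER)")
conn.executemany(
    "INSERT INTO airport VALUES (?, ?)",
    [("AAA", 20), ("BBB", 20), ("CCC", 45), ("DDD", 60)],  # made-up rows
)

# Old golden query shape: sorted ascending, then LIMIT 1 (drops ties).
limit_one = conn.execute(
    "SELECT airport_code, minimum_connect_time FROM airport "
    "ORDER BY minimum_connect_time ASC, airport_code LIMIT 1"
).fetchall()

# Updated golden query shape: every airport, sorted ascending.
all_sorted = conn.execute(
    "SELECT airport_code, minimum_connect_time FROM airport "
    "ORDER BY minimum_connect_time ASC, airport_code"
).fetchall()

# Reviewer's reading: only the airports tied for the minimum value.
shortest_only = conn.execute(
    "SELECT airport_code, minimum_connect_time FROM airport "
    "WHERE minimum_connect_time = (SELECT MIN(minimum_connect_time) FROM airport) "
    "ORDER BY airport_code"
).fetchall()

print(limit_one)      # [('AAA', 20)]
print(all_sorted)     # [('AAA', 20), ('BBB', 20), ('CCC', 45), ('DDD', 60)]
print(shortest_only)  # [('AAA', 20), ('BBB', 20)]
```

The three result sets differ whenever there are ties or more than one distinct connect time, which is why the wording of the question (with or without "shortest") decides which query is the right golden answer.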