Feature request: have the ability to return URLs not in input data within a certain region #20

Open
evansiroky opened this issue Mar 31, 2022 · 2 comments
Labels: product: transit data quality (items that are a part of the Transit Data Quality Product, of which @evansiroky is the product owner); status: on hold

evansiroky commented Mar 31, 2022

User Story (Cal-ITP)

As a research data analyst,
I want to know if there are more up-to-date GTFS URLs found on feed aggregator websites than the GTFS URLs that Cal-ITP has
so that I can maintain a database of the GTFS URLs of the CA transit agencies
and so that I can have additional sources of information indicating which GTFS URLs transit agencies have

User Story (Community User)

As a transit application developer,
I want to get a list of all GTFS URLs on all feed aggregator websites for a particular region
so that I can have a complete list of all GTFS URLs to download data from to power my transit application

Acceptance Criteria

Given

  1. The input GTFS URLs given to any of the command-line input options of this program
  2. The input aggregator regions to check in

For transitland, it seems that agencies can be queried to determine where they operate, and the result compared with the feeds found from the input URLs. The command-line argument could look something like this:

--transit-land-adm1_iso=US-CA

For transitfeeds, the hardcoded location could be made configurable via a command line argument:

--transit-feeds-location=67-california-usa

  3. The GTFS URLs found on the aggregator websites for their respective regions

Then the URLs found on the aggregator websites that are not in the input list of URLs should be written to a separate section of the output.
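As a rough illustration of the two proposed region options, here is a minimal argparse sketch. It is not the project's actual CLI code: the function names `build_parser` and `regions_to_check` are hypothetical, and the way `--url` and `--output` are declared is an assumption based only on the example command in this issue.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering the flags mentioned in this issue."""
    parser = argparse.ArgumentParser(prog="gtfs_aggregator_checker")
    # Assumption: --url may be repeated to pass several input GTFS URLs.
    parser.add_argument("--url", action="append", default=None,
                        help="Input GTFS URL to check (repeatable)")
    parser.add_argument("--output", default=None,
                        help="Path of the JSON file to write")
    # Proposed in this issue: region filter for transitland.
    parser.add_argument("--transit-land-adm1_iso", dest="transit_land_adm1_iso",
                        default=None, help="Region code, e.g. US-CA")
    # Proposed in this issue: configurable transitfeeds location.
    parser.add_argument("--transit-feeds-location", dest="transit_feeds_location",
                        default=None, help="Location slug, e.g. 67-california-usa")
    return parser


def regions_to_check(args: argparse.Namespace) -> dict:
    """Map each aggregator name to the region requested for it, if any."""
    regions = {}
    if args.transit_land_adm1_iso:
        regions["transitland"] = args.transit_land_adm1_iso
    if args.transit_feeds_location:
        regions["transitfeeds"] = args.transit_feeds_location
    return regions
```

With this sketch, an aggregator is only searched by region when its option is supplied, so existing invocations without the new flags would behave as before.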

Example:

When searching for all transitfeeds URLs in Saskatchewan, Canada, while also checking against a single input URL, the CLI input and result could be as follows:

CLI Input

python -m gtfs_aggregator_checker --url https://opengis.regina.ca/reginagtfs/google_transit.zip --output results.json --transit-feeds-location=196-saskatchewan-canada

JSON Output

{
  "input_url_results": {
    "https://opengis.regina.ca/reginagtfs/google_transit.zip": {
      "transitfeeds": {
        "public_web_url": "https://transitfeeds.com/p/the-city-of-regina/830",
        "status": "present"
      },
      "transitland": {
        "public_web_url": "https://www.transit.land/feeds/f-c8vx-thecityofregina",
        "status": "present"
      }
    }
  },
  "additional_aggregator_urls_in_region_not_in_input_list": [
    {
      "transitfeeds_metadata": {
        "name": "Saskatoon Transit GTFS",
        "public_web_url": "https://transitfeeds.com/p/city-of-saskatoon/264",
        "type": "GTFS Schedule"
      },
      "url": "http://apps2.saskatoon.ca/app/data/google_transit.zip"
    },
    {
      "transitfeeds_metadata": {
        "name": "Saskatoon Transit Service Alerts",
        "public_web_url": "https://transitfeeds.com/p/city-of-saskatoon/842",
        "type": "GTFS Realtime Service Alerts"
      },
      "url": "http://apps2.saskatoon.ca/app/data/Alert/Alerts.pb"
    },
    {
      "transitfeeds_metadata": {
        "name": "Saskatoon Transit Trip Updates",
        "public_web_url": "https://transitfeeds.com/p/city-of-saskatoon/841",
        "type": "GTFS Realtime Trip Updates"
      },
      "url": "http://apps2.saskatoon.ca/app/data/TripUpdate/TripUpdates.pb"
    },
    {
      "transitfeeds_metadata": {
        "name": "Saskatoon Transit Vehicle Positions",
        "public_web_url": "https://transitfeeds.com/p/city-of-saskatoon/840",
        "type": "GTFS Realtime Vehicle Positions"
      },
      "url": "http://apps2.saskatoon.ca/app/data/Vehicle/VehiclePositions.pb"
    }
  ]
}
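The output above could be assembled along the following lines. This is only a sketch under stated assumptions: `find_additional_urls` and `build_report` are hypothetical names, the region feeds are assumed to arrive as dictionaries shaped like the entries in the example output, and URLs are compared by exact string match with no normalization.

```python
import json


def find_additional_urls(input_urls, region_feeds):
    """Return region feeds whose URL is not among the input URLs.

    Assumption: each feed is a dict with at least a "url" key.
    Order is preserved and repeated URLs are reported once.
    """
    known = set(input_urls)
    seen = set()
    additional = []
    for feed in region_feeds:
        url = feed["url"]
        if url in known or url in seen:
            continue
        seen.add(url)
        additional.append(feed)
    return additional


def build_report(input_url_results, region_feeds):
    """Combine per-input-URL results with the extra URLs found in the region."""
    return {
        "input_url_results": input_url_results,
        "additional_aggregator_urls_in_region_not_in_input_list":
            find_additional_urls(input_url_results.keys(), region_feeds),
    }


def write_report(report, path):
    """Write the report as indented JSON to the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(report, handle, indent=2)
```

Exact string comparison would treat `http://` and `https://` variants of the same feed as different URLs; whether the checker should normalize URLs before comparing is left open here.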
holly-g commented Apr 18, 2022

Based on Evan's input, this isn't a pressing priority and we can defer work on this. Removing it from our current sprint and setting it to Sprint: 5/16 - 5/27 for tracking purposes.

evansiroky commented

Please icebox this indefinitely.

evansiroky added the "product: transit data quality" label May 11, 2022