feat: Passthrough of ch. 1.
muziejus committed Dec 12, 2024
1 parent 9de86d8 commit ab0f5f7
Showing 14 changed files with 42 additions and 49 deletions.
87 changes: 40 additions & 47 deletions docs/index.html

Large diffs are not rendered by default.

Binary file removed docs/index_files/figure-html/unnamed-chunk-1-1.png
Binary file not shown.
Binary file modified docs/index_files/figure-html/unnamed-chunk-2-1.png
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file modified docs/results_files/figure-html/unnamed-chunk-22-1.png
Binary file modified docs/results_files/figure-html/unnamed-chunk-23-1.png
Binary file not shown.
Binary file modified docs/results_files/figure-html/unnamed-chunk-7-1.png
Binary file removed docs/results_files/figure-html/unnamed-chunk-8-1.png
Binary file not shown.
4 changes: 2 additions & 2 deletions docs/search.json
@@ -4,7 +4,7 @@
"href": "index.html",
"title": "“It’s Too Nice Out to Take a Cab”: Weather and Taxi Usage in New York, 2019–2024",
"section": "",
"text": "1 Introduction\nAt first, Moacir was interested in seeing if there is a relationship between “unseasonably” warm weather and New York and drought like conditions, but Sophie suggested crossing in a dataset from a different domain and seeing what kinds of results could emerge. Instead of just looking at the weather, perhaps we can draw a relationship between human behavioral response to the weather and taxi usage. What might this look like? After a bit of discussion, we had a preliminary idea of testing the hypothesis that people use cabs less often when it is “nice” out in Manhattan. That is, they are more inclined to walk to their destination than hail an expensive cab.\nQuickly it was clear, however, that proving this hypothesis would require coming up with a definition of “nice,” so we flipped the project: we’re assuming as true that people are more inclined to walk when the weather is nice, so we are using the taxi data to see if we can define what “nice” weather is. Does it just mean sunny skies, or does it have a relationship to a temperature threshold? How might relative temperature come into play, such as an unusually warm day after a cold spell, impacting people’s inclination to walk? And does the effect wear off if there are multiple nice days in a row, as the novelty of walking gives way to taking cabs again? These questions struck us as more amusing and speculative, so we decided to pursue them, instead.\nOverall, our project explores how weather influences the small, everyday decisions which collectively shape urban life. The unique spatial and temporal granularity of taxi data allows us to capture patterns of human mobility with precision. By doing so, we may observe behavioral shifts in response to weather changes in real time. Such a study not only provides a unique lens into how people adapt their transportation preferences due to the weather, but also serves as a microcosm for understanding human responses to environmental factors. 
Such insights are particularly relevant in a large, dynamic city like New York.",
"text": "1 Introduction\nAt first, Moacir was interested in seeing if there is a relationship between “unseasonably” warm weather and New York and drought-like conditions, but Sophie suggested crossing in a dataset from a different domain and seeing what kinds of results could emerge. Instead of just looking at the weather, perhaps we can draw a relationship between human behavioral response to the weather and taxi usage. What might this look like? After a bit of discussion, we had a preliminary idea of testing the hypothesis that people use cabs less often when it is “nice” out in Manhattan. That is, they are more inclined to walk to their destination than hail an expensive cab.\nQuickly it was clear, however, that proving this hypothesis would require coming up with a definition of “nice,” so we flipped the project: we’re assuming as true that people are more inclined to walk when the weather is nice, so we are using the taxi data to see if we can define what “nice” weather is. Does it just mean sunny skies, or does it have a relationship to a temperature threshold? How might relative temperature come into play, such as an unusually warm day after a cold spell, impacting people’s inclination to walk? And does the effect wear off if there are multiple nice days in a row, as the novelty of walking gives way to taking cabs again? These questions struck us as more amusing and speculative, so we decided to pursue them, instead.\nOverall, our project explores how weather influences the small, everyday decisions which collectively shape urban life. The unique spatial and temporal granularity of taxi data allows us to capture patterns of human mobility with precision. By doing so, we may observe behavioral shifts in response to weather changes in real time. Such a study not only provides a unique lens into how people adapt their transportation preferences due to the weather, but also serves as a microcosm for understanding human responses to environmental factors. 
Such insights are particularly relevant in a large, dynamic city like New York.",
"crumbs": [
"<span class='chapter-number'>1</span>  <span class='chapter-title'>Introduction</span>"
]
@@ -14,7 +14,7 @@
"href": "index.html#a-glance-at-the-data",
"title": "“It’s Too Nice Out to Take a Cab”: Weather and Taxi Usage in New York, 2019–2024",
"section": "1.1 A glance at the data",
"text": "1.1 A glance at the data\nLet’s take a quick peek at the data. Below we have weekly averages for daily temperature and taxi rides from January 2019 to June 2024.\n\n\nCode\nlibrary(ggplot2)\nlibrary(arrow)\nlibrary(dplyr)\nlibrary(lubridate)\nlibrary(scales)\nlibrary(tidyr)\ndf &lt;- read_parquet(\"data/complete_weather_and_taxi_data.parquet\")\ndf_day &lt;- df |&gt; \n group_by(date) |&gt; \n summarize(total_trips_day = sum(trip_count)) |&gt; \n select(date, total_trips_day)\ndf_day |&gt; \n mutate(week_start=lubridate::floor_date(date, unit=\"week\")) |&gt;\n group_by(week_start) |&gt;\n summarize(avg_trips_day = mean(total_trips_day))|&gt;\n ggplot(aes(x=week_start, y=avg_trips_day)) +\n geom_point(color=\"cornflowerblue\") +\n geom_line(color=\"cornflowerblue\") + \n scale_x_date(date_labels = \"%b %Y\", date_breaks = \"1 year\") +\n scale_y_continuous(labels = comma) +\n labs(title=\"Average daily taxi trips per week, January 2019 – June 2024\",\n x = \"Date\", \n y = \"Average number of trips in a day\") +\n theme_classic()\n\n\n\n\n\n\n\n\n\nConsidering the taxi data, there are many narratives that can be told. The most notable observation on the chart is the dramatic decline in ridership in March 2020, coinciding with the beginning of the COVID-19 pandemic. While ridership has increased since, it has not nearly returned to pre-pandemic levels. This trend is likely influenced by the shift towards a more work-from-home friendly economic environment, along with other behavioral changes.\nWe can also observe seasonal fluctuations; for example, it appears that there are dips and peaks around January of each year. These could be attributed to behavioral changes around the holidays, including increased travel around the holidays, staying in on the holidays themselves, or different travel patterns due to the weather. 
We will have to look at this with a lot more granularity in order to parse out further trends in the data.\n\n\nCode\ndf_day &lt;- df |&gt; \n filter(!is.na(temperature)) |&gt;\n group_by(date) |&gt; \n summarize(daily_temp = mean(temperature)) |&gt; \n select(date, daily_temp)\ndf_day |&gt; \n mutate(week_start=lubridate::floor_date(date, unit=\"week\")) |&gt;\n group_by(week_start) |&gt;\n summarize(avg_temp_day = mean(daily_temp)) |&gt;\n ggplot(aes(x=week_start, y=avg_temp_day)) +\n geom_point(color=\"cornflowerblue\") +\n geom_line(color=\"cornflowerblue\") + \n geom_hline(yintercept=0)+\n scale_x_date(date_labels = \"%b %Y\", date_breaks = \"1 year\") +\n labs(title=\"Average weekly temp, January 2019 – June 2024\",\n x = \"Date\", \n y = \"Tempurature (Celsius)\") +\n theme_classic()\n\n\n\n\n\n\n\n\n\nAverage weekly temperature looks fairly consistent over time, with expected seasonal peaks and valleys across the year. There may be a subtle trend of slightly higher average temperatures in more recent years, but nothing too definitive.\nAs just one data point, temperature provides a limited slice into what may distinguish a “nice day.” In the next chapter, we will see that it will be necessary to calculate additional numeric and categorical weather measurements to help establish this definition. Ideas include change in temperature and a simple categorical variable for cloud cover derived from the multiple columns currently devoted to cloud cover.",
"text": "1.1 A glance at the data\nLet’s take a quick peek at the data. Below we have weekly averages for daily taxi rides and temperature from January 2019 to June 2024.\n\n\nCode\ndf &lt;- read_parquet(\"data/complete_weather_and_taxi_data.parquet\")\ndf |&gt; \n group_by(date) |&gt; \n summarize(total_trips_day = sum(trip_count)) |&gt; \n select(date, total_trips_day) |&gt;\n mutate(week_start=lubridate::floor_date(date, unit=\"week\")) |&gt;\n group_by(week_start) |&gt;\n summarize(avg_trips_day = mean(total_trips_day))|&gt;\n ggplot(aes(x=week_start, y=avg_trips_day)) +\n geom_point(color=base_color, size=0.5) +\n geom_line(color=secondary_color) + \n scale_x_date(date_labels = \"%b %Y\", date_breaks = \"1 year\") +\n scale_y_continuous(labels = thousands) +\n labs(\n title=\"Average daily taxi trips per week, January 2019 – June 2024\",\n x = \"Date\", \n y = \"Average number of trips in a day\"\n ) \n\n\n\n\n\n\n\n\n\nConsidering the taxi data, there are many narratives that can be told. The most notable observation on the chart is the dramatic decline in ridership in March 2020, coinciding with the emergence of the full impact of the COVID-19 pandemic. While ridership has increased since, it has not nearly returned to pre-pandemic levels. This trend is likely influenced by the shift towards a more work-from-home friendly economic environment, along with other behavioral changes.\nWe can also observe seasonal fluctuations; for example, it appears that there are dips and peaks around January of each year. These could be attributed to behavioral changes around the holidays, including increased travel around the holidays, staying in on the holidays themselves, or different travel patterns due to the weather. 
We will have to look at this with a lot more granularity in order to parse out further trends in the data.\n\n\nCode\ndf |&gt; \n filter(!is.na(temperature)) |&gt;\n group_by(date) |&gt; \n summarize(daily_temp = mean(temperature)) |&gt; \n select(date, daily_temp) |&gt;\n mutate(week_start=lubridate::floor_date(date, unit=\"week\")) |&gt;\n group_by(week_start) |&gt;\n summarize(avg_temp_day = mean(daily_temp)) |&gt;\n ggplot(aes(x=week_start, y=avg_temp_day)) +\n geom_point(color=base_color, size=0.5) +\n geom_line(color=secondary_color) +\n geom_hline(yintercept=0)+\n scale_x_date(date_labels = \"%b %Y\", date_breaks = \"1 year\") +\n labs(title=\"Average Weekly Temperature, January 2019 – June 2024\",\n x = \"Date\", \n y = temperature_label\n ) \n\n\n\n\n\n\n\n\n\nAverage weekly temperature looks fairly consistent over time, with expected seasonal peaks and valleys across the year. There may be a subtle trend of slightly higher average temperatures in more recent years, but nothing too definitive.\nAs just one data point, temperature provides a limited slice into what may distinguish a “nice day.” In the next chapter, we will see that it will be necessary to calculate additional numeric and categorical weather measurements to help establish this definition. Ideas include change in temperature and a simple categorical variable for cloud cover derived from the multiple columns currently devoted to cloud cover.",
"crumbs": [
"<span class='chapter-number'>1</span>  <span class='chapter-title'>Introduction</span>"
]