Add a new TidierPlots get started #134

Merged
merged 3 commits into from
Apr 10, 2024
8 changes: 6 additions & 2 deletions docs/make.jl
@@ -13,8 +13,12 @@ DocMeta.setdocmeta!(Tidier,

pgs = [
"Home" => "index.md",
"Get Started" => ["Installation" => "installation.md", "A Simple Data Analysis" => "simple-analysis.md"],
"API Reference" => "reference.md",
"Get Started" => [
"Installation" => "installation.md",
"A Simple Data Analysis" => "simple-analysis.md",
"From Data to Plots" => "simple-plotting.md"
],
# "API Reference" => "reference.md",
"Changelog" => "news.md",
"FAQ" => "faq.md",
# "Contributing" => "contributing.md",
Binary file added docs/src/figs/customized-scatter.png
Binary file added docs/src/figs/scatter.png
5 changes: 5 additions & 0 deletions docs/src/index.md
@@ -26,6 +26,11 @@ features:
details: "TidierPlots.jl is a 100% Julia implementation of the R package ggplot in Julia. Powered by Makie.jl, and Julia’s extensive meta-programming capabilities, TidierPlots.jl is an R user’s love letter to data visualization in Julia."
link: https://tidierorg.github.io/TidierPlots.jl/latest/

- icon: <img width="200" height="200" src="https://github.com/TidierOrg/TidierDB.jl/raw/main/assets/logo.png" alt="tidierdb"/>
title: TidierDB.jl
details: "TidierDB.jl is a 100% Julia implementation of the R package dbplyr in Julia and similar to Python's ibis package. Its main goal is to bring the syntax of Tidier.jl to multiple SQL backends, making it possible to analyze data directly on databases without needing to copy the entire database into memory."
link: https://tidierorg.github.io/TidierDB.jl/latest/

- icon: <img width="200" height="200" src="https://github.com/TidierOrg/TidierFiles.jl/raw/main/assets/logo.png" alt="tidierfiles"/>
title: TidierFiles.jl
details: "TidierFiles.jl leverages the CSV.jl, XLSX.jl, and ReadStatTables.jl packages to reimplement the R haven and readr packages."
94 changes: 94 additions & 0 deletions docs/src/simple-plotting.md
@@ -0,0 +1,94 @@
# From data to plots

## Exploring the penguins data

A well-known dataset in the R community is `palmerpenguins`. It contains data about penguins, including their species and some ecological measurements. Let's load the data and take a look at it.

```julia
using Tidier # exports TidierPlots.jl and others
using DataFrames
using PalmerPenguins

penguins = dropmissing(DataFrame(PalmerPenguins.load()));
```

We can take a quick look at the columns of the `penguins` DataFrame with `@glimpse` from `TidierData.jl`:

```julia
@glimpse penguins
```

```
Rows: 333
Columns: 7
.species           InlineStrings.String15 Adelie, Adelie, Adelie, Adelie, Adelie, Ade
.island            InlineStrings.String15 Torgersen, Torgersen, Torgersen, Torgersen,
.bill_length_mm    Float64                39.1, 39.5, 40.3, 36.7, 39.3, 38.9, 39.2, 41.1, 38
.bill_depth_mm     Float64                18.7, 17.4, 18.0, 19.3, 20.6, 17.8, 19.6, 17.6, 21
.flipper_length_mm Int64                  181, 186, 195, 193, 190, 181, 195, 182, 191, 19
.body_mass_g       Int64                  3750, 3800, 3250, 3450, 3650, 3625, 4675, 3200, 38
.sex               InlineStrings.String7  male, female, female, female, male, female,
```

## A simple `TidierPlots.jl` scatterplot

Plotting with `TidierPlots.jl` is meant to feel as seamless as it does in R. Let's start by plotting the `bill_length_mm` and `bill_depth_mm` columns.

```julia
ggplot(penguins, @aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
    geom_point()
```

![A simple scatter plot](figs/scatter.png)

This is *not* R code; it's pure Julia. If you are familiar with R, you will find it very similar. The `ggplot` function creates a plot object, and the `geom_point` function adds a scatter layer on top of it. The `@aes` macro maps the variables of the `penguins` DataFrame to the aesthetics of the plot. Here, we map the `bill_length_mm` column to the x-axis, the `bill_depth_mm` column to the y-axis, and the `species` column to the color of the points. The output is a scatter plot of `bill_length_mm` against `bill_depth_mm`, colored by `species`.
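Because `ggplot` returns a plot object and each `geom_*` call adds a layer, the plot can also be built up step by step. The following is a minimal sketch, assuming the same packages as above are installed; the variable names `base` and `scatter` are illustrative only and not part of the package.

```julia
using Tidier          # exports TidierPlots.jl and others
using DataFrames
using PalmerPenguins

penguins = dropmissing(DataFrame(PalmerPenguins.load()))

# Plot object with the aesthetic mapping but no layers yet
base = ggplot(penguins, @aes(x = bill_length_mm, y = bill_depth_mm, color = species))

# Adding a scatter layer gives the same plot as the one-expression version
scatter = base + geom_point()
```

Keeping the base object in a variable makes it easy to try different layers on top of the same mapping.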

Now, `@aes()` is used to map variables in your data to visual properties (aesthetics) of the plot. These aesthetics can include things like position (x and y coordinates), color, shape, size, etc. Each aesthetic is a way of visualizing a variable or a statistical transformation of a variable.

Aesthetics are specified in the form `aes(aesthetic = variable)`, where `aesthetic` is the name of the aesthetic and `variable` is the column in your data that you want to map to it. With the `@aes` macro, column names do not need to be preceded by a colon. This is the first difference you might encounter when using `TidierPlots.jl`: it accepts several forms of `aes` specification, none of which is exactly the same as in ggplot2.

Option 1: `@aes` macro, aes as in ggplot2:

```julia
@aes(x = x, y = y)
```

Option 2: `@es`:

```julia
@es(x = x, y = y)
```

Option 3: `aes` function, Julia-style columns:

```julia
aes(x = :x, y = :y)
```

Option 4: `aes` function, strings for columns:

```julia
aes(x = "x", y = "y")
```
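Applied to the penguins data, the four options above would look roughly like this. This is a sketch that assumes, as described above, that all four forms are accepted interchangeably by `ggplot`; the names `p1` to `p4` are illustrative only.

```julia
using Tidier          # exports TidierPlots.jl and others
using DataFrames
using PalmerPenguins

penguins = dropmissing(DataFrame(PalmerPenguins.load()))

# Option 1: @aes macro with bare column names
p1 = ggplot(penguins, @aes(x = bill_length_mm, y = bill_depth_mm)) + geom_point()

# Option 2: @es macro, same syntax as @aes
p2 = ggplot(penguins, @es(x = bill_length_mm, y = bill_depth_mm)) + geom_point()

# Option 3: aes function with Symbols for the columns
p3 = ggplot(penguins, aes(x = :bill_length_mm, y = :bill_depth_mm)) + geom_point()

# Option 4: aes function with Strings for the columns
p4 = ggplot(penguins, aes(x = "bill_length_mm", y = "bill_depth_mm")) + geom_point()
```

The function forms (options 3 and 4) can be convenient when column names are held in variables, since a macro only sees the literal expression it is given.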

## Customizing the plot

Moving from general rules to specific plots, let us first explore `geom_point()`.

`geom_point()` creates a scatter plot. It is typically used with aesthetics that map variables to the x and y positions, and optionally to other aesthetics such as color, shape, and size. `geom_point()` can be used to visualize the relationship between two continuous variables, or between a continuous and a discrete variable. The following visual features can be changed within `geom_point()`: shape, size, stroke, strokecolor, and alpha.

```julia
ggplot(penguins, @aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
    geom_point(
        size = 20,
        stroke = 1,
        strokecolor = "black",
        alpha = 0.2) +
    labs(x = "Bill Length (mm)", y = "Bill Width (mm)") +
    lims(x = c(40, 60), y = c(15, 20)) +
    theme_minimal()
```

![Customized scatter plot](figs/customized-scatter.png)

To see more about the `TidierPlots.jl` package, you can visit the [documentation](https://tidierorg.github.io/TidierPlots.jl/latest/).