This repository has been archived by the owner on Jan 11, 2024. It is now read-only.
generated from pjournal/gh-pages-quarto-template
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
9970216
commit ba4e694
Showing
23 changed files
with
674 additions
and
42 deletions.
There are no files selected for viewing
File renamed without changes.
File renamed without changes
Large diffs are not rendered by default.
Oops, something went wrong.
File renamed without changes.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
[ | ||
{ | ||
"objectID": "index.html", | ||
"href": "index.html", | ||
"title": "[STUDENT] Progress Journal", | ||
"section": "", | ||
"text": "Introduction\n\nThis progress journal covers [STUDENT NAME SURNAME / PROJECT GROUP NAME]’s work during their term at BDA 503 Fall 2022.\nEach section is an assignment or an individual work." | ||
}, | ||
{ | ||
"objectID": "assignment1.html#about-me", | ||
"href": "assignment1.html#about-me", | ||
"title": "1 Assignment 1", | ||
"section": "1.1 About Me", | ||
"text": "1.1 About Me\nI’m Gözde Uğur. I’ve been working as Senior Data Analyst at Trendyol for almost 2 years. Since graduating from university, I have worked in various data analyst roles where I mainly used SQL. I want to make sophisticated analyzes and visualizations of data with what I learned in this program. You can check My Linkedin Page for further info." | ||
}, | ||
{ | ||
"objectID": "assignment1.html#bridging-the-gap-between-sql-and-r", | ||
"href": "assignment1.html#bridging-the-gap-between-sql-and-r", | ||
"title": "1 Assignment 1", | ||
"section": "1.2 Bridging the Gap between SQL and R", | ||
"text": "1.2 Bridging the Gap between SQL and R\n\n\n\n\nThe reason I chose this video called Bridging the Gap between SQL and R was that I was curious about ways to use SQL, which I use frequently in daily life, in R. In this video, I learned an R package, tidyquery, with which I can run SQL Queries directly." | ||
}, | ||
{ | ||
"objectID": "assignment1.html#dataset", | ||
"href": "assignment1.html#dataset", | ||
"title": "1 Assignment 1", | ||
"section": "1.3 Dataset", | ||
"text": "1.3 Dataset\n\n1.3.1 Amazon Products Dataset 2023\n\nlibrary(dplyr)\n\n\nAttaching package: 'dplyr'\n\n\nThe following objects are masked from 'package:stats':\n\n filter, lag\n\n\nThe following objects are masked from 'package:base':\n\n intersect, setdiff, setequal, union\n\n# Using read.csv()\nmyData = read.csv(\"/Users/gozde.ugur/Downloads/archive (3)/amazon_products.csv\") \nen_cok_satanlar <- myData %>% \n arrange(desc(boughtInLastMonth)) %>% # satisadedi sütununa göre azalan sırada sırala\n head(5) # İlk 5 kaydı getirilk_5_kayit <- head(myData, 5)\n# Show only selected columns\nsecilen_sutunlar <- en_cok_satanlar[, c(\"title\",\"stars\", \"price\" ,\"boughtInLastMonth\")]\nprint(secilen_sutunlar)\n\n title\n1 Bounty Quick Size Paper Towels, White, 8 Family Rolls = 20 Regular Rolls\n2 Amazon Brand - Presto! Flex-a-Size Paper Towels, 158 Sheet Huge Roll, 12 Rolls (2 Packs of 6), Equivalent to 38 Regular Rolls, White\n3 Stardrops - The Pink Stuff - The Miracle All Purpose Cleaning Paste\n4 Amazon Basics 2-Ply Paper Towels, Flex-Sheets, 150 Sheets per Roll, 12 Rolls (2 Packs of 6), White\n5 Hismile v34 Colour Corrector, Tooth Stain Removal, Teeth Whitening Booster, Purple Toothpaste, Colour Correcting, Hismile V34, Hismile Colour Corrector, Tooth Colour Corrector\n stars price boughtInLastMonth\n1 4.8 24.42 100000\n2 4.7 28.28 100000\n3 4.4 4.99 100000\n4 4.2 22.86 100000\n5 3.4 20.69 100000\n\n\nSince I work in the e-commerce industry, Amazon Products Dataset 2023 attracted my attention. Reasons why I find this dataset useful for out course:\n1. We may have the opportunity to get more insight about the trends and behaviors in the e-commerce industry, where we are usually on the customer side.\n2. This dataset allows us to work with different data types such as boolean, integer, float and character.\n3. This dataset contains 1.4M records. Although we work with much larger datasets in real business life, working with this dataset can also be a good opportunity to learn and overcome the difficulties of large datasets." | ||
}, | ||
{ | ||
"objectID": "assignment1.html#r-posts-relevant-to-my-interests", | ||
"href": "assignment1.html#r-posts-relevant-to-my-interests", | ||
"title": "1 Assignment 1", | ||
"section": "1.4 R posts relevant to my interests", | ||
"text": "1.4 R posts relevant to my interests" | ||
}, | ||
{ | ||
"objectID": "assignment1.html#r-histogram", | ||
"href": "assignment1.html#r-histogram", | ||
"title": "1 Assignment 1", | ||
"section": "1.5 R Histogram", | ||
"text": "1.5 R Histogram\nA histogram is like a bar chart as it uses bars of varying height to represent data distribution. However, in a histogram values are grouped into consecutive intervals called bins. In a Histogram, continuous values are grouped and displayed in these bins whose size can be varied.\nExample: \n\n# Histogram for Maximum Daily Temperature \ndata(airquality) \n\nhist(airquality$Temp, main =\"La Guardia Airport's\\ \nMaximum Temperature(Daily)\", \n xlab =\"Temperature(Fahrenheit)\", \n xlim = c(50, 125), col =\"yellow\", \n freq = TRUE)" | ||
}, | ||
{ | ||
"objectID": "assignment1.html#hypothesis-testing-in-r-programming", | ||
"href": "assignment1.html#hypothesis-testing-in-r-programming", | ||
"title": "1 Assignment 1", | ||
"section": "1.6 Hypothesis Testing in R Programming", | ||
"text": "1.6 Hypothesis Testing in R Programming\nhypothesis is made by the researchers about the data collected for any experiment or data set. A hypothesis is an assumption made by the researchers that are not mandatory true. In simple words, a hypothesis is a decision taken by the researchers based on the data of the population collected. Hypothesis Testing in R Programming is a process of testing the hypothesis made by the researcher or to validate the hypothesis. To perform hypothesis testing, a random sample of data from the population is taken and testing is performed. Based on the results of the testing, the hypothesis is either selected or rejected. This concept is known as Statistical Inference. In this article, we’ll discuss the four-step process of hypothesis testing, One sample T-Testing, Two-sample T-Testing, Directional Hypothesis, one sample -test, two samples -test and correlation test in R programming.\n\n1.6.0.1 One Sample T-Testing\nOne sample T-Testing approach collects a huge amount of data and tests it on random samples. To perform T-Test in R, normally distributed data is required. This test is used to test the mean of the sample with the population. For example, the height of persons living in an area is different or identical to other persons living in other areas.\n\nSyntax: t.test(x, mu) Parameters: x: represents numeric vector of data mu: represents true value of the mean\n\nTo know about more optional parameters of t.test(), try the below command:\nhelp(\"t.test\")\nExample: \n\n# Defining sample vector\nx <- rnorm(100)\n\n# One Sample T-Test\nt.test(x, mu = 5)\n\n\n One Sample t-test\n\ndata: x\nt = -52.617, df = 99, p-value < 2.2e-16\nalternative hypothesis: true mean is not equal to 5\n95 percent confidence interval:\n 0.04392102 0.40412977\nsample estimates:\nmean of x \n0.2240254" | ||
}, | ||
{ | ||
"objectID": "assignment1.html#reading-a-.csv-file-in-r", | ||
"href": "assignment1.html#reading-a-.csv-file-in-r", | ||
"title": "1 Assignment 1", | ||
"section": "1.7 Reading a .csv File in R", | ||
"text": "1.7 Reading a .csv File in R\nread.csv(): read.csv() is used for reading “comma separated value” files (“.csv”). In this also the data will be imported as a data frame.\n\nSyntax: read.csv(file, header = TRUE, sep = “,”, dec = “.”, …)\nParameters:\n\nfile: the path to the file containing the data to be imported into R.\nheader: logical value. If TRUE, read.csv() assumes that your file has a header row, so row 1 is the name of each column. If that’s not the case, you can add the argument header = FALSE.\nsep: the field separator character\ndec: the character used in the file for decimal points.\n\nlibrary(dplyr)\n# Using read.csv()\nmyData = read.csv(\"/Users/gozde.ugur/Downloads/movies.csv\") \n# Show only first 5 record\ncomedy_filmler <- myData %>% filter(Genre == \"Comedy\")\n\nilk_5_kayit <- head(comedy_filmler, 5)\n# Show only selected columns\nsecilen_sutunlar <- ilk_5_kayit[, c(\"Film\", \"Genre\", \"Year\")]\nprint(secilen_sutunlar)\n\n Film Genre Year\n1 Youth in Revolt Comedy 2010\n2 You Will Meet a Tall Dark Stranger Comedy 2010\n3 When in Rome Comedy 2010\n4 What Happens in Vegas Comedy 2008\n5 Valentine's Day Comedy 2010" | ||
}, | ||
{ | ||
"objectID": "inclass1.html", | ||
"href": "inclass1.html", | ||
"title": "2 inclass1", | ||
"section": "", | ||
"text": "2.0.1 Amazon Products Dataset 2023\n\nlibrary(dplyr)\n\n\nAttaching package: 'dplyr'\n\n\nThe following objects are masked from 'package:stats':\n\n filter, lag\n\n\nThe following objects are masked from 'package:base':\n\n intersect, setdiff, setequal, union\n\n# Using read.csv()\nmyData = read.csv(\"/Users/gozde.ugur/Downloads/archive (3)/amazon_products.csv\") \nen_cok_satanlar <- myData %>% \n arrange(desc(boughtInLastMonth)) %>% # satisadedi sütununa göre azalan sırada sırala\n head(5) # İlk 5 kaydı getirilk_5_kayit \n# Show only selected columns\ntop_5 <- en_cok_satanlar[, c(\"title\",\"stars\", \"price\" ,\"boughtInLastMonth\")]\nprint(top_5)\n\n title\n1 Bounty Quick Size Paper Towels, White, 8 Family Rolls = 20 Regular Rolls\n2 Amazon Brand - Presto! Flex-a-Size Paper Towels, 158 Sheet Huge Roll, 12 Rolls (2 Packs of 6), Equivalent to 38 Regular Rolls, White\n3 Stardrops - The Pink Stuff - The Miracle All Purpose Cleaning Paste\n4 Amazon Basics 2-Ply Paper Towels, Flex-Sheets, 150 Sheets per Roll, 12 Rolls (2 Packs of 6), White\n5 Hismile v34 Colour Corrector, Tooth Stain Removal, Teeth Whitening Booster, Purple Toothpaste, Colour Correcting, Hismile V34, Hismile Colour Corrector, Tooth Colour Corrector\n stars price boughtInLastMonth\n1 4.8 24.42 100000\n2 4.7 28.28 100000\n3 4.4 4.99 100000\n4 4.2 22.86 100000\n5 3.4 20.69 100000\n\n\n\n#kategori bazında grupla, satışları topla\ncategory_sales <- myData %>% \n group_by(category_id) %>%\n summarise(category_sale=sum(boughtInLastMonth)) %>%\n ungroup()\n\n\nprint(category_sales)\n\n# A tibble: 248 × 2\n category_id category_sale\n <int> <int>\n 1 1 1099750\n 2 2 67400\n 3 3 262100\n 4 4 145050\n 5 5 500500\n 6 6 1509950\n 7 7 46650\n 8 8 128550\n 9 9 197550\n10 10 2092300\n# ℹ 238 more rows\n\n\n\n#kategori bazında ortalama fiyat\ncategory_mean_prices <- myData %>% \n group_by(category_id) %>%\n summarise(mean_price=mean(price),median_price=median(price) ) %>%\n ungroup()\n\n\nprint(category_mean_prices)\n\n# A tibble: 248 × 3\n category_id mean_price median_price\n <int> <dbl> <dbl>\n 1 1 15.6 9.99\n 2 2 28.4 11.0 \n 3 3 16.8 12.8 \n 4 4 78.2 22.0 \n 5 5 15.4 9.99\n 6 6 15.0 9.99\n 7 7 19.2 14.3 \n 8 8 18.7 14.9 \n 9 9 20.6 15.0 \n10 10 17.5 13.0 \n# ℹ 238 more rows\n\n\n\nglimpse(myData)\n\nRows: 1,426,337\nColumns: 11\n$ asin <chr> \"B014TMV5YE\", \"B07GDLCQXV\", \"B07XSCCZYG\", \"B08MVFKGJ…\n$ title <chr> \"Sion Softside Expandable Roller Luggage, Black, Che…\n$ imgUrl <chr> \"https://m.media-amazon.com/images/I/815dLQKYIYL._AC…\n$ productURL <chr> \"https://www.amazon.com/dp/B014TMV5YE\", \"https://www…\n$ stars <dbl> 4.5, 4.5, 4.6, 4.6, 4.5, 4.5, 4.5, 4.5, 4.5, 4.4, 4.…\n$ reviews <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…\n$ price <dbl> 139.99, 169.99, 365.49, 291.59, 174.99, 144.49, 169.…\n$ listPrice <dbl> 0.00, 209.99, 429.99, 354.37, 309.99, 0.00, 0.00, 0.…\n$ category_id <int> 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 10…\n$ isBestSeller <chr> \"False\", \"False\", \"False\", \"False\", \"False\", \"False\"…\n$ boughtInLastMonth <int> 2000, 1000, 300, 400, 400, 500, 400, 100, 500, 200, …" | ||
} | ||
] |
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file was deleted.
Oops, something went wrong.