upd

pjournal · Oct 26, 2023 · ba4e694 · ba4e694
1 parent 9970216
commit ba4e694
Show file tree

Hide file tree

Showing 23 changed files with 674 additions and 42 deletions.
diff --git a/assignment1.html → docs/assignment1.html b/assignment1.html → docs/assignment1.html
diff --git a/...1_files/figure-html/unnamed-chunk-2-1.png → ...1_files/figure-html/unnamed-chunk-2-1.png b/...1_files/figure-html/unnamed-chunk-2-1.png → ...1_files/figure-html/unnamed-chunk-2-1.png
diff --git a/docs/inclass1.html b/docs/inclass1.html
diff --git a/index.html → docs/index.html b/index.html → docs/index.html
diff --git a/docs/search.json b/docs/search.json
@@ -0,0 +1,65 @@
+[
+  {
+    "objectID": "index.html",
+    "href": "index.html",
+    "title": "[STUDENT] Progress Journal",
+    "section": "",
+    "text": "Introduction\n\nThis progress journal covers [STUDENT NAME SURNAME / PROJECT GROUP NAME]’s work during their term at BDA 503 Fall 2022.\nEach section is an assignment or an individual work."
+  },
+  {
+    "objectID": "assignment1.html#about-me",
+    "href": "assignment1.html#about-me",
+    "title": "1  Assignment 1",
+    "section": "1.1 About Me",
+    "text": "1.1 About Me\nI’m Gözde Uğur. I’ve been working as Senior Data Analyst at Trendyol for almost 2 years. Since graduating from university, I have worked in various data analyst roles where I mainly used SQL. I want to make sophisticated analyzes and visualizations of data with what I learned in this program. You can check My Linkedin Page for further info."
+  },
+  {
+    "objectID": "assignment1.html#bridging-the-gap-between-sql-and-r",
+    "href": "assignment1.html#bridging-the-gap-between-sql-and-r",
+    "title": "1  Assignment 1",
+    "section": "1.2 Bridging the Gap between SQL and R",
+    "text": "1.2 Bridging the Gap between SQL and R\n\n\n\n\nThe reason I chose this video called Bridging the Gap between SQL and R was that I was curious about ways to use SQL, which I use frequently in daily life, in R. In this video, I learned an R package, tidyquery, with which I can run SQL Queries directly."
+  },
+  {
+    "objectID": "assignment1.html#dataset",
+    "href": "assignment1.html#dataset",
+    "title": "1  Assignment 1",
+    "section": "1.3 Dataset",
+    "text": "1.3 Dataset\n\n1.3.1 Amazon Products Dataset 2023\n\nlibrary(dplyr)\n\n\nAttaching package: 'dplyr'\n\n\nThe following objects are masked from 'package:stats':\n\n    filter, lag\n\n\nThe following objects are masked from 'package:base':\n\n    intersect, setdiff, setequal, union\n\n# Using read.csv()\nmyData = read.csv(\"/Users/gozde.ugur/Downloads/archive (3)/amazon_products.csv\") \nen_cok_satanlar &lt;- myData %&gt;% \n  arrange(desc(boughtInLastMonth)) %&gt;%  # satisadedi sütununa göre azalan sırada sırala\n  head(5)  # İlk 5 kaydı getirilk_5_kayit &lt;- head(myData, 5)\n# Show only selected columns\nsecilen_sutunlar &lt;- en_cok_satanlar[, c(\"title\",\"stars\", \"price\" ,\"boughtInLastMonth\")]\nprint(secilen_sutunlar)\n\n                                                                                                                                                                            title\n1                                                                                                        Bounty Quick Size Paper Towels, White, 8 Family Rolls = 20 Regular Rolls\n2                                            Amazon Brand - Presto! Flex-a-Size Paper Towels, 158 Sheet Huge Roll, 12 Rolls (2 Packs of 6), Equivalent to 38 Regular Rolls, White\n3                                                                                                             Stardrops - The Pink Stuff - The Miracle All Purpose Cleaning Paste\n4                                                                              Amazon Basics 2-Ply Paper Towels, Flex-Sheets, 150 Sheets per Roll, 12 Rolls (2 Packs of 6), White\n5 Hismile v34 Colour Corrector, Tooth Stain Removal, Teeth Whitening Booster, Purple Toothpaste, Colour Correcting, Hismile V34, Hismile Colour Corrector, Tooth Colour Corrector\n  stars price boughtInLastMonth\n1   4.8 24.42            100000\n2   4.7 28.28            100000\n3   4.4  4.99            100000\n4   4.2 22.86            100000\n5   3.4 20.69            100000\n\n\nSince I work in the e-commerce industry, Amazon Products Dataset 2023 attracted my attention. Reasons why I find this dataset useful for out course:\n1. We may have the opportunity to get more insight about the trends and behaviors in the e-commerce industry, where we are usually on the customer side.\n2. This dataset allows us to work with different data types such as boolean, integer, float and character.\n3. This dataset contains 1.4M records. Although we work with much larger datasets in real business life, working with this dataset can also be a good opportunity to learn and overcome the difficulties of large datasets."
+  },
+  {
+    "objectID": "assignment1.html#r-posts-relevant-to-my-interests",
+    "href": "assignment1.html#r-posts-relevant-to-my-interests",
+    "title": "1  Assignment 1",
+    "section": "1.4 R posts relevant to my interests",
+    "text": "1.4 R posts relevant to my interests"
+  },
+  {
+    "objectID": "assignment1.html#r-histogram",
+    "href": "assignment1.html#r-histogram",
+    "title": "1  Assignment 1",
+    "section": "1.5 R Histogram",
+    "text": "1.5 R Histogram\nA histogram is like a bar chart as it uses bars of varying height to represent data distribution. However, in a histogram values are grouped into consecutive intervals called bins. In a Histogram, continuous values are grouped and displayed in these bins whose size can be varied.\nExample: \n\n# Histogram for Maximum Daily Temperature \ndata(airquality) \n\nhist(airquality$Temp, main =\"La Guardia Airport's\\ \nMaximum Temperature(Daily)\", \n    xlab =\"Temperature(Fahrenheit)\", \n    xlim = c(50, 125), col =\"yellow\", \n    freq = TRUE)"
+  },
+  {
+    "objectID": "assignment1.html#hypothesis-testing-in-r-programming",
+    "href": "assignment1.html#hypothesis-testing-in-r-programming",
+    "title": "1  Assignment 1",
+    "section": "1.6 Hypothesis Testing in R Programming",
+    "text": "1.6 Hypothesis Testing in R Programming\nhypothesis is made by the researchers about the data collected for any experiment or data set. A hypothesis is an assumption made by the researchers that are not mandatory true. In simple words, a hypothesis is a decision taken by the researchers based on the data of the population collected. Hypothesis Testing in R Programming is a process of testing the hypothesis made by the researcher or to validate the hypothesis. To perform hypothesis testing, a random sample of data from the population is taken and testing is performed. Based on the results of the testing, the hypothesis is either selected or rejected. This concept is known as Statistical Inference. In this article, we’ll discuss the four-step process of hypothesis testing, One sample T-Testing, Two-sample T-Testing, Directional Hypothesis, one sample -test, two samples -test and correlation test in R programming.\n\n1.6.0.1 One Sample T-Testing\nOne sample T-Testing approach collects a huge amount of data and tests it on random samples. To perform T-Test in R, normally distributed data is required. This test is used to test the mean of the sample with the population. For example, the height of persons living in an area is different or identical to other persons living in other areas.\n\nSyntax: t.test(x, mu) Parameters: x: represents numeric vector of data mu: represents true value of the mean\n\nTo know about more optional parameters of t.test(), try the below command:\nhelp(\"t.test\")\nExample: \n\n# Defining sample vector\nx &lt;- rnorm(100)\n\n# One Sample T-Test\nt.test(x, mu = 5)\n\n\n    One Sample t-test\n\ndata:  x\nt = -52.617, df = 99, p-value &lt; 2.2e-16\nalternative hypothesis: true mean is not equal to 5\n95 percent confidence interval:\n 0.04392102 0.40412977\nsample estimates:\nmean of x \n0.2240254"
+  },
+  {
+    "objectID": "assignment1.html#reading-a-.csv-file-in-r",
+    "href": "assignment1.html#reading-a-.csv-file-in-r",
+    "title": "1  Assignment 1",
+    "section": "1.7 Reading a .csv File in R",
+    "text": "1.7 Reading a .csv File in R\nread.csv(): read.csv() is used for reading “comma separated value” files (“.csv”). In this also the data will be imported as a data frame.\n\nSyntax: read.csv(file, header = TRUE, sep = “,”, dec = “.”, …)\nParameters:\n\nfile: the path to the file containing the data to be imported into R.\nheader: logical value. If TRUE, read.csv() assumes that your file has a header row, so row 1 is the name of each column. If that’s not the case, you can add the argument header = FALSE.\nsep: the field separator character\ndec: the character used in the file for decimal points.\n\nlibrary(dplyr)\n# Using read.csv()\nmyData = read.csv(\"/Users/gozde.ugur/Downloads/movies.csv\") \n# Show only first 5 record\ncomedy_filmler &lt;- myData %&gt;% filter(Genre == \"Comedy\")\n\nilk_5_kayit &lt;- head(comedy_filmler, 5)\n# Show only selected columns\nsecilen_sutunlar &lt;- ilk_5_kayit[, c(\"Film\", \"Genre\", \"Year\")]\nprint(secilen_sutunlar)\n\n                                Film  Genre Year\n1                    Youth in Revolt Comedy 2010\n2 You Will Meet a Tall Dark Stranger Comedy 2010\n3                       When in Rome Comedy 2010\n4              What Happens in Vegas Comedy 2008\n5                    Valentine's Day Comedy 2010"
+  },
+  {
+    "objectID": "inclass1.html",
+    "href": "inclass1.html",
+    "title": "2  inclass1",
+    "section": "",
+    "text": "2.0.1 Amazon Products Dataset 2023\n\nlibrary(dplyr)\n\n\nAttaching package: 'dplyr'\n\n\nThe following objects are masked from 'package:stats':\n\n    filter, lag\n\n\nThe following objects are masked from 'package:base':\n\n    intersect, setdiff, setequal, union\n\n# Using read.csv()\nmyData = read.csv(\"/Users/gozde.ugur/Downloads/archive (3)/amazon_products.csv\") \nen_cok_satanlar &lt;- myData %&gt;% \n  arrange(desc(boughtInLastMonth)) %&gt;%  # satisadedi sütununa göre azalan sırada sırala\n  head(5)  # İlk 5 kaydı getirilk_5_kayit \n# Show only selected columns\ntop_5 &lt;- en_cok_satanlar[, c(\"title\",\"stars\", \"price\" ,\"boughtInLastMonth\")]\nprint(top_5)\n\n                                                                                                                                                                            title\n1                                                                                                        Bounty Quick Size Paper Towels, White, 8 Family Rolls = 20 Regular Rolls\n2                                            Amazon Brand - Presto! Flex-a-Size Paper Towels, 158 Sheet Huge Roll, 12 Rolls (2 Packs of 6), Equivalent to 38 Regular Rolls, White\n3                                                                                                             Stardrops - The Pink Stuff - The Miracle All Purpose Cleaning Paste\n4                                                                              Amazon Basics 2-Ply Paper Towels, Flex-Sheets, 150 Sheets per Roll, 12 Rolls (2 Packs of 6), White\n5 Hismile v34 Colour Corrector, Tooth Stain Removal, Teeth Whitening Booster, Purple Toothpaste, Colour Correcting, Hismile V34, Hismile Colour Corrector, Tooth Colour Corrector\n  stars price boughtInLastMonth\n1   4.8 24.42            100000\n2   4.7 28.28            100000\n3   4.4  4.99            100000\n4   4.2 22.86            100000\n5   3.4 20.69            100000\n\n\n\n#kategori bazında grupla, satışları topla\ncategory_sales &lt;- myData %&gt;% \n                  group_by(category_id) %&gt;%\n                  summarise(category_sale=sum(boughtInLastMonth)) %&gt;%\n                  ungroup()\n\n\nprint(category_sales)\n\n# A tibble: 248 × 2\n   category_id category_sale\n         &lt;int&gt;         &lt;int&gt;\n 1           1       1099750\n 2           2         67400\n 3           3        262100\n 4           4        145050\n 5           5        500500\n 6           6       1509950\n 7           7         46650\n 8           8        128550\n 9           9        197550\n10          10       2092300\n# ℹ 238 more rows\n\n\n\n#kategori bazında ortalama fiyat\ncategory_mean_prices &lt;- myData %&gt;% \n                  group_by(category_id) %&gt;%\n                  summarise(mean_price=mean(price),median_price=median(price) ) %&gt;%\n                  ungroup()\n\n\nprint(category_mean_prices)\n\n# A tibble: 248 × 3\n   category_id mean_price median_price\n         &lt;int&gt;      &lt;dbl&gt;        &lt;dbl&gt;\n 1           1       15.6         9.99\n 2           2       28.4        11.0 \n 3           3       16.8        12.8 \n 4           4       78.2        22.0 \n 5           5       15.4         9.99\n 6           6       15.0         9.99\n 7           7       19.2        14.3 \n 8           8       18.7        14.9 \n 9           9       20.6        15.0 \n10          10       17.5        13.0 \n# ℹ 238 more rows\n\n\n\nglimpse(myData)\n\nRows: 1,426,337\nColumns: 11\n$ asin              &lt;chr&gt; \"B014TMV5YE\", \"B07GDLCQXV\", \"B07XSCCZYG\", \"B08MVFKGJ…\n$ title             &lt;chr&gt; \"Sion Softside Expandable Roller Luggage, Black, Che…\n$ imgUrl            &lt;chr&gt; \"https://m.media-amazon.com/images/I/815dLQKYIYL._AC…\n$ productURL        &lt;chr&gt; \"https://www.amazon.com/dp/B014TMV5YE\", \"https://www…\n$ stars             &lt;dbl&gt; 4.5, 4.5, 4.6, 4.6, 4.5, 4.5, 4.5, 4.5, 4.5, 4.4, 4.…\n$ reviews           &lt;int&gt; 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…\n$ price             &lt;dbl&gt; 139.99, 169.99, 365.49, 291.59, 174.99, 144.49, 169.…\n$ listPrice         &lt;dbl&gt; 0.00, 209.99, 429.99, 354.37, 309.99, 0.00, 0.00, 0.…\n$ category_id       &lt;int&gt; 104, 104, 104, 104, 104, 104, 104, 104, 104, 104, 10…\n$ isBestSeller      &lt;chr&gt; \"False\", \"False\", \"False\", \"False\", \"False\", \"False\"…\n$ boughtInLastMonth &lt;int&gt; 2000, 1000, 300, 400, 400, 500, 400, 100, 500, 200, …"
+  }
+]
diff --git a/site_libs/bootstrap/bootstrap-icons.css → docs/site_libs/bootstrap/bootstrap-icons.css b/site_libs/bootstrap/bootstrap-icons.css → docs/site_libs/bootstrap/bootstrap-icons.css
diff --git a/site_libs/bootstrap/bootstrap-icons.woff → .../site_libs/bootstrap/bootstrap-icons.woff b/site_libs/bootstrap/bootstrap-icons.woff → .../site_libs/bootstrap/bootstrap-icons.woff
diff --git a/site_libs/bootstrap/bootstrap.min.css → docs/site_libs/bootstrap/bootstrap.min.css b/site_libs/bootstrap/bootstrap.min.css → docs/site_libs/bootstrap/bootstrap.min.css
diff --git a/site_libs/bootstrap/bootstrap.min.js → docs/site_libs/bootstrap/bootstrap.min.js b/site_libs/bootstrap/bootstrap.min.js → docs/site_libs/bootstrap/bootstrap.min.js
diff --git a/site_libs/clipboard/clipboard.min.js → docs/site_libs/clipboard/clipboard.min.js b/site_libs/clipboard/clipboard.min.js → docs/site_libs/clipboard/clipboard.min.js
diff --git a/site_libs/quarto-html/anchor.min.js → docs/site_libs/quarto-html/anchor.min.js b/site_libs/quarto-html/anchor.min.js → docs/site_libs/quarto-html/anchor.min.js
diff --git a/site_libs/quarto-html/popper.min.js → docs/site_libs/quarto-html/popper.min.js b/site_libs/quarto-html/popper.min.js → docs/site_libs/quarto-html/popper.min.js
diff --git a/...uarto-html/quarto-syntax-highlighting.css → ...uarto-html/quarto-syntax-highlighting.css b/...uarto-html/quarto-syntax-highlighting.css → ...uarto-html/quarto-syntax-highlighting.css
diff --git a/site_libs/quarto-html/quarto.js → docs/site_libs/quarto-html/quarto.js b/site_libs/quarto-html/quarto.js → docs/site_libs/quarto-html/quarto.js
diff --git a/site_libs/quarto-html/tippy.css → docs/site_libs/quarto-html/tippy.css b/site_libs/quarto-html/tippy.css → docs/site_libs/quarto-html/tippy.css
diff --git a/site_libs/quarto-html/tippy.umd.min.js → docs/site_libs/quarto-html/tippy.umd.min.js b/site_libs/quarto-html/tippy.umd.min.js → docs/site_libs/quarto-html/tippy.umd.min.js
diff --git a/site_libs/quarto-nav/headroom.min.js → docs/site_libs/quarto-nav/headroom.min.js b/site_libs/quarto-nav/headroom.min.js → docs/site_libs/quarto-nav/headroom.min.js
diff --git a/site_libs/quarto-nav/quarto-nav.js → docs/site_libs/quarto-nav/quarto-nav.js b/site_libs/quarto-nav/quarto-nav.js → docs/site_libs/quarto-nav/quarto-nav.js
diff --git a/site_libs/quarto-search/autocomplete.umd.js → ...te_libs/quarto-search/autocomplete.umd.js b/site_libs/quarto-search/autocomplete.umd.js → ...te_libs/quarto-search/autocomplete.umd.js
diff --git a/site_libs/quarto-search/fuse.min.js → docs/site_libs/quarto-search/fuse.min.js b/site_libs/quarto-search/fuse.min.js → docs/site_libs/quarto-search/fuse.min.js
diff --git a/site_libs/quarto-search/quarto-search.js → .../site_libs/quarto-search/quarto-search.js b/site_libs/quarto-search/quarto-search.js → .../site_libs/quarto-search/quarto-search.js
diff --git a/inclass1.qmd b/inclass1.qmd
@@ -37,3 +37,18 @@ category_sales <- myData %>%
 
 print(category_sales)
 ```
+
+```{r}
+#kategori bazında ortalama fiyat
+category_mean_prices <- myData %>% 
+                  group_by(category_id) %>%
+                  summarise(mean_price=mean(price),median_price=median(price) ) %>%
+                  ungroup()
+
+
+print(category_mean_prices)
+```
+
+```{r}
+glimpse(myData)
+```
diff --git a/inclass1.rmarkdown b/inclass1.rmarkdown