From 940965450b36b0aa8424cc09ea8366746123a2b6 Mon Sep 17 00:00:00 2001 From: CelinaWalkowicz <92951131+CelinaWalkowicz@users.noreply.github.com> Date: Mon, 25 Oct 2021 08:54:21 -0700 Subject: [PATCH] Created using Colaboratory --- Unit 1 Sprint 1 - Study Guide.ipynb | 623 +++++++++++++++++++++++++++- 1 file changed, 622 insertions(+), 1 deletion(-) diff --git a/Unit 1 Sprint 1 - Study Guide.ipynb b/Unit 1 Sprint 1 - Study Guide.ipynb index 5ec5cb4d..dbb00e2a 100644 --- a/Unit 1 Sprint 1 - Study Guide.ipynb +++ b/Unit 1 Sprint 1 - Study Guide.ipynb @@ -1 +1,622 @@ -{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Unit 1 Sprint 1 - Study Guide.ipynb","provenance":[{"file_id":"1lAICG6khpXBJmRXvaMlctWrxm8QhxI5R","timestamp":1601310442388},{"file_id":"10T4m64dmfLsGA91j2v5c5yG7rUzBqB2Q","timestamp":1578879651220},{"file_id":"1SGnO8ZjDtlDUKnPMbzT9QHCO4Fne-fyB","timestamp":1573058832416}],"collapsed_sections":[]},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"Nd2OOOVXxXS1"},"source":["This study guide should reinforce and provide practice for all of the concepts you have seen in Unit 1 Sprint 1. There are a mix of written questions and coding exercises, both are equally important to prepare you for the sprint challenge as well as to be able to speak on these topics comfortably in interviews and on the job.\n","\n","If you get stuck or are unsure of something remember the 20 minute rule. If that doesn't help, then research a solution with google and stackoverflow. Only once you have exausted these methods should you turn to your track team and mentor - they won't be there on your SC or during an interview. 
That being said, don't hesitate to ask for help if you truly are stuck.\n","\n","Have fun studying!"]},{"cell_type":"markdown","metadata":{"id":"fpvInKdXekFi"},"source":["## Questions"]},{"cell_type":"code","metadata":{"id":"Q8aB5qieZG-k"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Q6bS8AhBZ86H"},"source":["When completing this section, try to limit your answers to 2-3 sentences max and use plain english as much as possible. It's very easy to hide incomplete knowledge and undertanding behind fancy or technical words, so imagine you are explaining these things to a non-technical interviewer.\n","\n","1. What is a Data Frame?\n","```\n","your answer\n","```\n","\n","2. What is Pandas?\n","```\n","your answer\n","```\n","\n","3. How do you check for missing values?\n","```\n","your answer\n","```\n","\n","4. What is numpy?\n","```\n","your answer\n","```\n","\n","5. Explain the difference between tidy and wide (summary) data.\n","```\n","your answer\n","```\n","\n","6. Explain the difference between categorical and quantitative data.\n","```\n","your answer\n","```\n","\n","7. For categorical variables, explain the difference between an ordinal, nominal or identifier variable.\n","```\n","your answer\n","```\n","\n","8. For quantitative variables, explain the difference between a discrete and a continuous variable.\n","```\n","your answer\n","```\n","\n","9. Explain the differnece between an inner, outer, left and right merge.\n","```\n","your answer\n","```\n","\n","10. Explain the differnece between merging and concatenating data.\n","```\n","your answer\n","```\n","\n","11. Explain the purpose of a function.\n","```\n","your answer\n","```\n","\n","12. Explain what .apply() does.\n","```\n","your answer\n","```\n","\n","13. Explain what .strip() does.\n","```\n","your answer\n","```\n","\n","14. Explain what .strip('%') does.\n","```\n","your answer\n","```\n","\n","15. 
Explain what .split('-') does.\n","```\n","your answer\n","```\n","\n","16. Give an example of a misleading figure and how you would fix it.\n","```\n","your answer\n","```\n","\n","17. Describe the important fetures of the distribution of a quantitative variable.\n","```\n","your answer\n","```"]},{"cell_type":"markdown","metadata":{"id":"dUQaIwbceohq"},"source":["## Coding problems"]},{"cell_type":"markdown","metadata":{"id":"4jnYgnFjP6eE"},"source":["Import pandas, numpy, matplotlib, etc.\n"]},{"cell_type":"code","metadata":{"id":"S9hFYrmqQlLA"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"bCbET5ioQlmQ"},"source":["Import a dataset from a link\n"]},{"cell_type":"code","metadata":{"id":"lwNwPn5nQowi"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"2hq-PhcTQph7"},"source":["Import a dataset from a .csv file saved on your personal computer."]},{"cell_type":"code","metadata":{"id":"ZsWsuYwXRRP3"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"jVqQYLgxld7M"},"source":["Import matplotlib"]},{"cell_type":"code","metadata":{"id":"ScovMuwvRdtq"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"uD5cTw9plh9c"},"source":["Loading and viewing a Dataframe"]},{"cell_type":"code","metadata":{"id":"XlazD59ClhXi"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"JyThYBHGlm60"},"source":["Using the loaded DataFrame to create and display a plot or graph."]},{"cell_type":"code","metadata":{"id":"pr_vT8VSmK6J"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"jsVSa5EXnS_o"},"source":["Print the first five rows of a dataset\n"]},{"cell_type":"code","metadata":{"id":"mmdHcvXznVec"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"gmYQEUVtnVrS"},"source":["Print the last five 
rows of a dataset"]},{"cell_type":"code","metadata":{"id":"bs4t6foXnXyC"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Lk74FAh9nYHL"},"source":["Print a single variable in a dataset"]},{"cell_type":"code","metadata":{"id":"abVBumEJnbTj"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"yxvIfMg2nblq"},"source":["Drop rows from a dataset"]},{"cell_type":"code","metadata":{"id":"lxpnqi1PnuS4"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"3lMapML2nueU"},"source":["Find the dimensions of a dataframe"]},{"cell_type":"code","metadata":{"id":"MO8vcOK3oAI3"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"-TMBHELcoAUL"},"source":["Identify the data types for each column in a dataframe"]},{"cell_type":"code","metadata":{"id":"VdHEuBNQoEdL"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"c0LRaORtoPJj"},"source":["Display summary statstics for a dataset."]},{"cell_type":"code","metadata":{"id":"PXP_Ir9noTY6"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"DqzkGVSaoT0b"},"source":["Create a new variable that is a linear combination of other variables."]},{"cell_type":"code","metadata":{"id":"wK8t9QrNosLv"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"9wUphwc2ostv"},"source":["Create a new variable using the .apply() function."]},{"cell_type":"code","metadata":{"id":"HbtajpAQpSsB"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"uLQiD_pXpS4Q"},"source":["Create a new variable using if-then statments with .loc"]},{"cell_type":"code","metadata":{"id":"gzaD7rkjpb-5"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"jiI41_j_pcNT"},"source":["Add and and or statements to your 
if-then statement with .loc"]},{"cell_type":"code","metadata":{"id":"jZUAwgcXpiIa"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"TSzazmW1piUj"},"source":["Convert a date to a datetime format."]},{"cell_type":"code","metadata":{"id":"bZu1fB6Qraqr"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"wzarAGNkrGyh"},"source":["Make a histogram"]},{"cell_type":"code","metadata":{"id":"fIPPUOP0rG97"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"_q0dSVW3Nsms"},"source":["Make a box plot"]},{"cell_type":"code","metadata":{"id":"f7QCHb6xNv4v"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"5l623djktvYg"},"source":["Make a bar plot"]},{"cell_type":"code","metadata":{"id":"bcdFHrFOtyzE"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"ZunmkAzDty-Q"},"source":["Make a line plot"]},{"cell_type":"code","metadata":{"id":"iUUR_m3LuHL2"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"yrpxkNbRuHXd"},"source":["Print axis and figure legends"]},{"cell_type":"code","metadata":{"id":"Aj5E84MAuR3Z"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"e2zgIRceuSDg"},"source":["Identify missing data in a dataframe."]}]} \ No newline at end of file +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "Unit 1 Sprint 1 - Study Guide.ipynb", + "provenance": [], + "collapsed_sections": [] + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "Nd2OOOVXxXS1" + }, + "source": [ + "This study guide should reinforce and provide practice for all of the concepts you have seen in Unit 1 Sprint 1. 
There is a mix of written questions and coding exercises; both are equally important for preparing you for the sprint challenge and for speaking comfortably about these topics in interviews and on the job.\n", + "\n", + "If you get stuck or are unsure of something, remember the 20-minute rule. If that doesn't help, research a solution with Google and Stack Overflow. Only once you have exhausted these methods should you turn to your track team and mentor - they won't be there on your SC or during an interview. That being said, don't hesitate to ask for help if you truly are stuck.\n", + "\n", + "Have fun studying!" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fpvInKdXekFi" + }, + "source": [ + "## Questions" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Q8aB5qieZG-k" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Q6bS8AhBZ86H" + }, + "source": [ + "When completing this section, try to limit your answers to 2-3 sentences max and use plain English as much as possible. It's very easy to hide incomplete knowledge and understanding behind fancy or technical words, so imagine you are explaining these things to a non-technical interviewer.\n", + "\n", + "1. What is a Data Frame?\n", + "```\n", + "A Data Frame is a structured data set with rows and columns, much like a spreadsheet.\n", + "```\n", + "\n", + "2. What is Pandas?\n", + "```\n", + "Pandas is a Python library built on top of NumPy that allows for easier access to, manipulation of, and analysis of a DataFrame and the data within it.\n", + "```\n", + "\n", + "3. How do you check for missing values?\n", + "```\n", + "After importing pandas and reading in the DataFrame, enter df.isnull().sum() in a cell to count the missing values in each column.\n", + "```\n", + "\n", + "4. What is numpy?\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "5. 
Explain the difference between tidy and wide (summary) data.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "6. Explain the difference between categorical and quantitative data.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "7. For categorical variables, explain the difference between ordinal, nominal, and identifier variables.\n", + "```\n", + "A\n", + "```\n", + "\n", + "8. For quantitative variables, explain the difference between a discrete and a continuous variable.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "9. Explain the difference between an inner, outer, left, and right merge.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "10. Explain the difference between merging and concatenating data.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "11. Explain the purpose of a function.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "12. Explain what .apply() does.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "13. Explain what .strip() does.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "14. Explain what .strip('%') does.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "15. Explain what .split('-') does.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "16. Give an example of a misleading figure and how you would fix it.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "17. 
Describe the important features of the distribution of a quantitative variable.\n", + "```\n", + "your answer\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dUQaIwbceohq" + }, + "source": [ + "## Coding problems" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4jnYgnFjP6eE" + }, + "source": [ + "Import pandas, numpy, matplotlib, etc.\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "S9hFYrmqQlLA" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bCbET5ioQlmQ" + }, + "source": [ + "Import a dataset from a link\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "lwNwPn5nQowi" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2hq-PhcTQph7" + }, + "source": [ + "Import a dataset from a .csv file saved on your personal computer." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ZsWsuYwXRRP3" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jVqQYLgxld7M" + }, + "source": [ + "Import matplotlib" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ScovMuwvRdtq" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "uD5cTw9plh9c" + }, + "source": [ + "Loading and viewing a DataFrame" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "XlazD59ClhXi" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JyThYBHGlm60" + }, + "source": [ + "Using the loaded DataFrame to create and display a plot or graph." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "pr_vT8VSmK6J" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jsVSa5EXnS_o" + }, + "source": [ + "Print the first five rows of a dataset\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "mmdHcvXznVec" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gmYQEUVtnVrS" + }, + "source": [ + "Print the last five rows of a dataset" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bs4t6foXnXyC" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Lk74FAh9nYHL" + }, + "source": [ + "Print a single variable in a dataset" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "abVBumEJnbTj" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yxvIfMg2nblq" + }, + "source": [ + "Drop rows from a dataset" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "lxpnqi1PnuS4" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3lMapML2nueU" + }, + "source": [ + "Find the dimensions of a dataframe" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "MO8vcOK3oAI3" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-TMBHELcoAUL" + }, + "source": [ + "Identify the data types for each column in a dataframe" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "VdHEuBNQoEdL" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "c0LRaORtoPJj" + }, + "source": [ + "Display summary statistics for a dataset." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "PXP_Ir9noTY6" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DqzkGVSaoT0b" + }, + "source": [ + "Create a new variable that is a linear combination of other variables." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "wK8t9QrNosLv" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9wUphwc2ostv" + }, + "source": [ + "Create a new variable using the .apply() function." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "HbtajpAQpSsB" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "uLQiD_pXpS4Q" + }, + "source": [ + "Create a new variable using if-then statements with .loc" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "gzaD7rkjpb-5" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jiI41_j_pcNT" + }, + "source": [ + "Add `and` and `or` statements to your if-then statement with .loc" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "jZUAwgcXpiIa" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TSzazmW1piUj" + }, + "source": [ + "Convert a date to a datetime format." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bZu1fB6Qraqr" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "wzarAGNkrGyh" + }, + "source": [ + "Make a histogram" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "fIPPUOP0rG97" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_q0dSVW3Nsms" + }, + "source": [ + "Make a box plot" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "f7QCHb6xNv4v" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5l623djktvYg" + }, + "source": [ + "Make a bar plot" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bcdFHrFOtyzE" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZunmkAzDty-Q" + }, + "source": [ + "Make a line plot" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "iUUR_m3LuHL2" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yrpxkNbRuHXd" + }, + "source": [ + "Print axis and figure legends" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Aj5E84MAuR3Z" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "e2zgIRceuSDg" + }, + "source": [ + "Identify missing data in a dataframe." + ] + } + ] +} \ No newline at end of file