diff --git a/Unit 1 Sprint 1 - Study Guide.ipynb b/Unit 1 Sprint 1 - Study Guide.ipynb index 5ec5cb4d..5d74215b 100644 --- a/Unit 1 Sprint 1 - Study Guide.ipynb +++ b/Unit 1 Sprint 1 - Study Guide.ipynb @@ -1 +1,622 @@ -{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Unit 1 Sprint 1 - Study Guide.ipynb","provenance":[{"file_id":"1lAICG6khpXBJmRXvaMlctWrxm8QhxI5R","timestamp":1601310442388},{"file_id":"10T4m64dmfLsGA91j2v5c5yG7rUzBqB2Q","timestamp":1578879651220},{"file_id":"1SGnO8ZjDtlDUKnPMbzT9QHCO4Fne-fyB","timestamp":1573058832416}],"collapsed_sections":[]},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"Nd2OOOVXxXS1"},"source":["This study guide should reinforce and provide practice for all of the concepts you have seen in Unit 1 Sprint 1. There are a mix of written questions and coding exercises, both are equally important to prepare you for the sprint challenge as well as to be able to speak on these topics comfortably in interviews and on the job.\n","\n","If you get stuck or are unsure of something remember the 20 minute rule. If that doesn't help, then research a solution with google and stackoverflow. Only once you have exausted these methods should you turn to your track team and mentor - they won't be there on your SC or during an interview. That being said, don't hesitate to ask for help if you truly are stuck.\n","\n","Have fun studying!"]},{"cell_type":"markdown","metadata":{"id":"fpvInKdXekFi"},"source":["## Questions"]},{"cell_type":"code","metadata":{"id":"Q8aB5qieZG-k"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Q6bS8AhBZ86H"},"source":["When completing this section, try to limit your answers to 2-3 sentences max and use plain english as much as possible. 
It's very easy to hide incomplete knowledge and undertanding behind fancy or technical words, so imagine you are explaining these things to a non-technical interviewer.\n","\n","1. What is a Data Frame?\n","```\n","your answer\n","```\n","\n","2. What is Pandas?\n","```\n","your answer\n","```\n","\n","3. How do you check for missing values?\n","```\n","your answer\n","```\n","\n","4. What is numpy?\n","```\n","your answer\n","```\n","\n","5. Explain the difference between tidy and wide (summary) data.\n","```\n","your answer\n","```\n","\n","6. Explain the difference between categorical and quantitative data.\n","```\n","your answer\n","```\n","\n","7. For categorical variables, explain the difference between an ordinal, nominal or identifier variable.\n","```\n","your answer\n","```\n","\n","8. For quantitative variables, explain the difference between a discrete and a continuous variable.\n","```\n","your answer\n","```\n","\n","9. Explain the differnece between an inner, outer, left and right merge.\n","```\n","your answer\n","```\n","\n","10. Explain the differnece between merging and concatenating data.\n","```\n","your answer\n","```\n","\n","11. Explain the purpose of a function.\n","```\n","your answer\n","```\n","\n","12. Explain what .apply() does.\n","```\n","your answer\n","```\n","\n","13. Explain what .strip() does.\n","```\n","your answer\n","```\n","\n","14. Explain what .strip('%') does.\n","```\n","your answer\n","```\n","\n","15. Explain what .split('-') does.\n","```\n","your answer\n","```\n","\n","16. Give an example of a misleading figure and how you would fix it.\n","```\n","your answer\n","```\n","\n","17. 
Describe the important fetures of the distribution of a quantitative variable.\n","```\n","your answer\n","```"]},{"cell_type":"markdown","metadata":{"id":"dUQaIwbceohq"},"source":["## Coding problems"]},{"cell_type":"markdown","metadata":{"id":"4jnYgnFjP6eE"},"source":["Import pandas, numpy, matplotlib, etc.\n"]},{"cell_type":"code","metadata":{"id":"S9hFYrmqQlLA"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"bCbET5ioQlmQ"},"source":["Import a dataset from a link\n"]},{"cell_type":"code","metadata":{"id":"lwNwPn5nQowi"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"2hq-PhcTQph7"},"source":["Import a dataset from a .csv file saved on your personal computer."]},{"cell_type":"code","metadata":{"id":"ZsWsuYwXRRP3"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"jVqQYLgxld7M"},"source":["Import matplotlib"]},{"cell_type":"code","metadata":{"id":"ScovMuwvRdtq"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"uD5cTw9plh9c"},"source":["Loading and viewing a Dataframe"]},{"cell_type":"code","metadata":{"id":"XlazD59ClhXi"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"JyThYBHGlm60"},"source":["Using the loaded DataFrame to create and display a plot or graph."]},{"cell_type":"code","metadata":{"id":"pr_vT8VSmK6J"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"jsVSa5EXnS_o"},"source":["Print the first five rows of a dataset\n"]},{"cell_type":"code","metadata":{"id":"mmdHcvXznVec"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"gmYQEUVtnVrS"},"source":["Print the last five rows of a 
dataset"]},{"cell_type":"code","metadata":{"id":"bs4t6foXnXyC"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Lk74FAh9nYHL"},"source":["Print a single variable in a dataset"]},{"cell_type":"code","metadata":{"id":"abVBumEJnbTj"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"yxvIfMg2nblq"},"source":["Drop rows from a dataset"]},{"cell_type":"code","metadata":{"id":"lxpnqi1PnuS4"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"3lMapML2nueU"},"source":["Find the dimensions of a dataframe"]},{"cell_type":"code","metadata":{"id":"MO8vcOK3oAI3"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"-TMBHELcoAUL"},"source":["Identify the data types for each column in a dataframe"]},{"cell_type":"code","metadata":{"id":"VdHEuBNQoEdL"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"c0LRaORtoPJj"},"source":["Display summary statstics for a dataset."]},{"cell_type":"code","metadata":{"id":"PXP_Ir9noTY6"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"DqzkGVSaoT0b"},"source":["Create a new variable that is a linear combination of other variables."]},{"cell_type":"code","metadata":{"id":"wK8t9QrNosLv"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"9wUphwc2ostv"},"source":["Create a new variable using the .apply() function."]},{"cell_type":"code","metadata":{"id":"HbtajpAQpSsB"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"uLQiD_pXpS4Q"},"source":["Create a new variable using if-then statments with .loc"]},{"cell_type":"code","metadata":{"id":"gzaD7rkjpb-5"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"jiI41_j_pcNT"},"source":["Add and and or statements to your if-then 
statement with .loc"]},{"cell_type":"code","metadata":{"id":"jZUAwgcXpiIa"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"TSzazmW1piUj"},"source":["Convert a date to a datetime format."]},{"cell_type":"code","metadata":{"id":"bZu1fB6Qraqr"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"wzarAGNkrGyh"},"source":["Make a histogram"]},{"cell_type":"code","metadata":{"id":"fIPPUOP0rG97"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"_q0dSVW3Nsms"},"source":["Make a box plot"]},{"cell_type":"code","metadata":{"id":"f7QCHb6xNv4v"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"5l623djktvYg"},"source":["Make a bar plot"]},{"cell_type":"code","metadata":{"id":"bcdFHrFOtyzE"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"ZunmkAzDty-Q"},"source":["Make a line plot"]},{"cell_type":"code","metadata":{"id":"iUUR_m3LuHL2"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"yrpxkNbRuHXd"},"source":["Print axis and figure legends"]},{"cell_type":"code","metadata":{"id":"Aj5E84MAuR3Z"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"e2zgIRceuSDg"},"source":["Identify missing data in a dataframe."]}]} \ No newline at end of file +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "Unit 1 Sprint 1 - Study Guide.ipynb", + "provenance": [], + "collapsed_sections": [] + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "Nd2OOOVXxXS1" + }, + "source": [ + "This study guide should reinforce and provide practice for all of the concepts you have seen in Unit 1 Sprint 1. 
There is a mix of written questions and coding exercises; both are equally important for preparing you for the sprint challenge and for speaking about these topics comfortably in interviews and on the job.\n", + "\n", + "If you get stuck or are unsure of something, remember the 20-minute rule. If that doesn't help, then research a solution with Google and Stack Overflow. Only once you have exhausted these methods should you turn to your track team and mentor - they won't be there during your sprint challenge (SC) or an interview. That being said, don't hesitate to ask for help if you truly are stuck.\n", + "\n", + "Have fun studying!" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fpvInKdXekFi" + }, + "source": [ + "## Questions" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Q8aB5qieZG-k" + }, + "source": [ + "#This is a test edit" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Q6bS8AhBZ86H" + }, + "source": [ + "When completing this section, try to limit your answers to 2-3 sentences max and use plain English as much as possible. It's very easy to hide incomplete knowledge and understanding behind fancy or technical words, so imagine you are explaining these things to a non-technical interviewer.\n", + "\n", + "1. What is a Data Frame?\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "2. What is Pandas?\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "3. How do you check for missing values?\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "4. What is numpy?\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "5. Explain the difference between tidy and wide (summary) data.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "6. Explain the difference between categorical and quantitative data.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "7. 
For categorical variables, explain the difference between ordinal, nominal, and identifier variables.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "8. For quantitative variables, explain the difference between a discrete and a continuous variable.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "9. Explain the difference between an inner, outer, left and right merge.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "10. Explain the difference between merging and concatenating data.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "11. Explain the purpose of a function.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "12. Explain what .apply() does.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "13. Explain what .strip() does.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "14. Explain what .strip('%') does.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "15. Explain what .split('-') does.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "16. Give an example of a misleading figure and how you would fix it.\n", + "```\n", + "your answer\n", + "```\n", + "\n", + "17. 
Describe the important features of the distribution of a quantitative variable.\n", + "```\n", + "your answer\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dUQaIwbceohq" + }, + "source": [ + "## Coding problems" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4jnYgnFjP6eE" + }, + "source": [ + "Import pandas, numpy, matplotlib, etc.\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "S9hFYrmqQlLA" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bCbET5ioQlmQ" + }, + "source": [ + "Import a dataset from a link\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "lwNwPn5nQowi" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2hq-PhcTQph7" + }, + "source": [ + "Import a dataset from a .csv file saved on your personal computer." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ZsWsuYwXRRP3" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jVqQYLgxld7M" + }, + "source": [ + "Import matplotlib" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "ScovMuwvRdtq" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "uD5cTw9plh9c" + }, + "source": [ + "Load and view a DataFrame" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "XlazD59ClhXi" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JyThYBHGlm60" + }, + "source": [ + "Use the loaded DataFrame to create and display a plot or graph." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "pr_vT8VSmK6J" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jsVSa5EXnS_o" + }, + "source": [ + "Print the first five rows of a dataset\n" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "mmdHcvXznVec" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gmYQEUVtnVrS" + }, + "source": [ + "Print the last five rows of a dataset" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bs4t6foXnXyC" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Lk74FAh9nYHL" + }, + "source": [ + "Print a single variable in a dataset" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "abVBumEJnbTj" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yxvIfMg2nblq" + }, + "source": [ + "Drop rows from a dataset" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "lxpnqi1PnuS4" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3lMapML2nueU" + }, + "source": [ + "Find the dimensions of a dataframe" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "MO8vcOK3oAI3" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-TMBHELcoAUL" + }, + "source": [ + "Identify the data types for each column in a dataframe" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "VdHEuBNQoEdL" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "c0LRaORtoPJj" + }, + "source": [ + "Display summary statistics for a dataset." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "PXP_Ir9noTY6" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DqzkGVSaoT0b" + }, + "source": [ + "Create a new variable that is a linear combination of other variables." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "wK8t9QrNosLv" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9wUphwc2ostv" + }, + "source": [ + "Create a new variable using the .apply() function." + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "HbtajpAQpSsB" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "uLQiD_pXpS4Q" + }, + "source": [ + "Create a new variable using if-then statements with .loc" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "gzaD7rkjpb-5" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jiI41_j_pcNT" + }, + "source": [ + "Add `and` and `or` conditions to your if-then statement with .loc" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "jZUAwgcXpiIa" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TSzazmW1piUj" + }, + "source": [ + "Convert a date to a datetime format." 
+ ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bZu1fB6Qraqr" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "wzarAGNkrGyh" + }, + "source": [ + "Make a histogram" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "fIPPUOP0rG97" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_q0dSVW3Nsms" + }, + "source": [ + "Make a box plot" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "f7QCHb6xNv4v" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5l623djktvYg" + }, + "source": [ + "Make a bar plot" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "bcdFHrFOtyzE" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZunmkAzDty-Q" + }, + "source": [ + "Make a line plot" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "iUUR_m3LuHL2" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yrpxkNbRuHXd" + }, + "source": [ + "Print axis and figure legends" + ] + }, + { + "cell_type": "code", + "metadata": { + "id": "Aj5E84MAuR3Z" + }, + "source": [ + "" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "e2zgIRceuSDg" + }, + "source": [ + "Identify missing data in a dataframe." + ] + } + ] +} \ No newline at end of file