This is the code repository for Machine Learning with R, Fourth Edition, published by Packt. Learn techniques for building and improving machine learning models, from data preparation to model tuning, evaluation, and working with big data.
With the expert help of Brett Lantz, you’ll learn how to uncover key insights and make new predictions using this hands-on, practical guide to machine learning with R. This 10th Anniversary Edition features an overview of R and plenty of new use cases for advanced users. The book is fully updated to R 4.0.0, with newer and better examples and the most up-to-date R libraries, advice on ethical and bias issues, and new chapters that dive deeper into advanced modeling techniques and methods for using big data in R.
- Learn the end-to-end process of machine learning from raw data to implementation
- Classify important outcomes using nearest neighbor and Bayesian methods
- Predict future events using decision trees, rules, and support vector machines
- Forecast numeric data and estimate financial values using regression methods
- Model complex processes with artificial neural networks
- Prepare, transform, and clean data using the tidyverse
- Evaluate your models and improve their performance
- Connect R to SQL databases and emerging big data technologies such as Spark, Hadoop, H2O, and TensorFlow
- Introducing Machine Learning
- Managing and Understanding Data
- Lazy Learning – Classification Using Nearest Neighbors
- Probabilistic Learning – Classification Using Naive Bayes
- Divide and Conquer – Classification Using Decision Trees and Rules
- Forecasting Numeric Data – Regression Methods
- Black Box Methods – Neural Networks and Support Vector Machines
- Finding Patterns – Market Basket Analysis Using Association Rules
- Finding Groups of Data – Clustering with k-means
- Evaluating Model Performance
- Being Successful with Machine Learning
- Advanced Data Preparation
- Challenging Data – Too Much, Too Little, Too Complex
- Building Better Learners
- Making Use of Big Data
If you feel this book is for you, get your copy today!
The following software and hardware list covers everything needed to run the code files in the book.
| Chapter | Software required | Link to the software | Hardware specifications | OS required |
|---|---|---|---|---|
| All chapters | R version 4.2 or higher | https://cran.r-project.org/ | Should work on virtually any recent computer (with the exception of Chapter 15, which may require more resources) | Windows, macOS, Linux |
| All chapters | RStudio | https://posit.co/products/open-source/rstudio/ | Should work on virtually any recent computer | Windows, macOS, Linux |
| Chapters 5 and 15 | Java SE (any recent version) | https://www.java.com/en/ | Should work on virtually any recent computer | Windows, macOS, Linux |
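
As a quick sanity check before running the chapter code, the requirements above can be verified from an R session. The snippet below is a minimal sketch and not part of the book's code: the variable names are illustrative, the 4.2.0 threshold is taken from the table, and the Java check only looks for a `java` executable on the `PATH`, which is an assumption about how Java is installed on your machine.

```r
# Minimal environment check (illustrative; not from the book's code files).
# The 4.2.0 threshold mirrors the "R version 4.2 or higher" requirement.
required_r <- "4.2.0"
r_ok <- getRversion() >= required_r

# Assumption: Java, if installed, is reachable as `java` on the PATH.
# Sys.which() returns an empty string when the executable is not found.
java_path <- Sys.which("java")
java_ok <- nzchar(java_path)

message("R ", as.character(getRversion()),
        if (r_ok) " meets" else " does not meet",
        " the ", required_r, " requirement")
message(if (java_ok) paste("Java found at", java_path)
        else "Java not found on PATH (needed for Chapters 5 and 15)")
```

A missing `java` on the `PATH` does not necessarily mean Java is absent, so treat a negative result as a prompt to check your installation rather than as a definitive answer.
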
For the latest updates and community discussions, join the Discord server at Discord
If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost. Simply click on the link to claim your free PDF. Free-Ebook
We also provide a PDF file that has color images of the screenshots/diagrams used in this book at GraphicBundle
Brett Lantz (DataSpelunking) has spent more than 10 years using innovative data methods to understand human behavior. A sociologist by training, Brett was first captivated by machine learning during research on a large database of teenagers' social network profiles. Brett is a DataCamp instructor and a frequent speaker at machine learning conferences and workshops around the world. He is known to geek out about data science applications for sports, autonomous vehicles, foreign language learning, and fashion, among many other subjects, and hopes to one day blog about these subjects at Data Spelunking, a website dedicated to sharing knowledge about the search for insight in data.