An open source Data Science repository to learn and apply towards solving real world problems.
This is a shortcut path to start studying Data Science. Just follow the steps to answer the questions, "What is Data Science and what should I study to learn Data Science?"
Everything that you will need on Data Science
Data Science is one of the hottest topics in computing and on the Internet today. People have been gathering data from applications and systems for years, and now is the time to analyze it. The next steps are producing suggestions from the data and making predictions about the future. Here you can find the biggest questions about Data Science and hundreds of answers from experts.
Link | Preview |
---|---|
What is Data Science @ O'Reilly | Data scientists combine entrepreneurship with patience, the willingness to build data products incrementally, the ability to explore, and the ability to iterate over a solution. They are inherently interdisciplinary. They can tackle all aspects of a problem, from initial data collection and data conditioning to drawing conclusions. They can think outside the box to come up with new ways to view the problem, or to work with very broadly defined problems: “here’s a lot of data, what can you make from it?” |
What is Data Science @ Quora | Data Science is a combination of a number of aspects of data, such as technology, algorithm development, and data inference, used to study the data, analyse it, and find innovative solutions to difficult problems. Basically, Data Science is all about analysing data and driving business growth by finding creative ways. |
The sexiest job of the 21st century | Data scientists today are akin to Wall Street “quants” of the 1980s and 1990s. In those days people with backgrounds in physics and math streamed to investment banks and hedge funds, where they could devise entirely new algorithms and data strategies. Then a variety of universities developed master’s programs in financial engineering, which churned out a second generation of talent that was more accessible to mainstream firms. The pattern was repeated later in the 1990s with search engineers, whose rarefied skills soon came to be taught in computer science programs. |
Wikipedia | Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. Data science is related to data mining, machine learning and big data. |
How to Become a Data Scientist | Data scientists are big data wranglers, gathering and analyzing large sets of structured and unstructured data. A data scientist’s role combines computer science, statistics, and mathematics. They analyze, process, and model data then interpret the results to create actionable plans for companies and other organizations. |
a very short history of #datascience | The story of how data scientists became sexy is mostly the story of the coupling of the mature discipline of statistics with a very young one--computer science. The term “Data Science” has emerged only recently to specifically designate a new profession that is expected to make sense of the vast stores of big data. But making sense of data has a long history and has been discussed by scientists, statisticians, librarians, computer scientists and others for years. The following timeline traces the evolution of the term “Data Science” and its use, attempts to define it, and related terms. |
Our favorite programming language for #DataScience nowadays is Python. Python's Pandas library offers full functionality for collecting and analyzing data. We use Anaconda to play with data and to create applications.
- Algorithms
- Colleges
- MOOCs
- Podcasts
- Books
- YouTube Videos & Channels
- Toolboxes - Environment
- Journals, Publications and Magazines
- Presentations
- Tutorials
These are some Machine Learning and Data Mining algorithms and models that help you understand your data and derive meaning from it.
- Regression
- Linear Regression
- Ordinary Least Squares
- Logistic Regression
- Stepwise Regression
- Multivariate Adaptive Regression Splines
- Locally Estimated Scatterplot Smoothing
- Classification
- k-nearest neighbor
- Support Vector Machines
- Decision Trees
- ID3 algorithm
- C4.5 algorithm
- Ensemble Learning
- Boosting
- Bagging
- Random Forest
- AdaBoost
- Clustering
- Hierarchical clustering
- k-means
- Fuzzy clustering
- Mixture models
- Dimension Reduction
- Principal Component Analysis (PCA)
- t-SNE
- Neural Networks
- Self-organizing map
- Adaptive resonance theory
- Hidden Markov Models (HMM)
- S3VM
- Clustering
- Generative models
- Low-density separation
- Laplacian regularization
- Heuristic approaches
- Q Learning
- SARSA (State-Action-Reward-State-Action) algorithm
- Temporal difference learning
- C4.5
- k-Means
- SVM
- Apriori
- EM
- PageRank
- AdaBoost
- kNN
- Naive Bayes
- CART
- Multilayer Perceptron
- Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN)
- Boltzmann Machines
- Autoencoder
- Generative Adversarial Network (GAN)
- Self-Organized Maps
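To make the first family in the algorithm list above more concrete (Regression, Linear Regression, Ordinary Least Squares), here is a minimal sketch of a one-variable least-squares fit using only the Python standard library. The function names `fit_ols` and `predict` and the sample data are hypothetical and chosen for illustration; they do not come from any of the linked resources. In practice a library such as scikit-learn or statsmodels would usually be used instead.

```python
from statistics import mean


def fit_ols(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Uses the closed-form solution for a single predictor:
    slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x (proportional to its variance).
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    # Sum of cross deviations (proportional to the covariance of x and y).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Hypothetical toy data that lies exactly on the line y = 2x + 1.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = fit_ols(hours, scores)
print(slope, intercept, predict(6.0, slope, intercept))
```

Because the toy data is noise-free, the fit recovers the generating line; with real data the same formulas give the line that minimizes the sum of squared residuals.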
- Data Scientist with Python
- Data Analyst with Python
- Data Analyst with SQL Server
- Data Science for Everyone
- Machine Learning Scientist with Python
A trove of carefully curated resources and links (on the topics of software, platforms, language, techniques, etc.) related to data science, all in one place.
Please feel free to connect with me on LinkedIn if you are interested in data science.
MONTRÉAL.AI ACADEMY: ARTIFICIAL INTELLIGENCE 101 FIRST WORLD-CLASS OVERVIEW OF AI FOR ALL
AI thinks like a corporation—and that’s worrying - Open Voices
Does the Brain Store Information in Discrete or Analog Form?
Explainable Artificial Intelligence (Part 1) — The Importance of Human Interpretable Machine…
Is The Singularity Coming? – Arc Digital
Michael I. Jordan NYSE Machine Learning Presentation
Some scientists fear superintelligent machines could pose a threat to humanity | The Washington Post
The Four Waves of A.I. | LinkedIn
When algorithms go wrong we need power to fight back, say researchers - The Verge
Amazon CloudWatch - Application and Infrastructure Monitoring
Amazon Elastic Block Store (EBS) - Amazon Web Services
Amazon Elastic File System (EFS) | Cloud File Storage
AWS Concepts: Understanding AWS - YouTube
AWS Concepts: Understanding the Course Material & Features - YouTube
AWS re:Invent 2017: Building production apps easily with Amazon Lightsail (CMP212) - YouTube
Classless Inter-Domain Routing - Wikipedia
Cloud Compute Products – Amazon Web Services (AWS)
Cloud Object Storage | Store & Retrieve Data Anywhere | Amazon Simple Storage Service
Elastic Load Balancing - Amazon Web Services
Getting Spark, Python, and Jupyter Notebook running on Amazon EC2
Use PuTTY to access EC2 Linux Instances via SSH from Windows
What is Cloud Computing? - Amazon Web Services
7-Step Guide to Become a Machine Learning Engineer in 2021
Reducing the Need for Labeled Data in Generative Adversarial Networks
10 Free Must-Read Books for Machine Learning and Data Science
Advice to aspiring data scientists: start a blog – Variance Explained
Chris Albon - Data Science, Machine Learning, and Artificial Intelligence
explained.ai - Deep explanations of machine learning and related topics
Here Are (Approximately) 3000 Free Data Sources You Can Use Right Now
If you want to learn Data Science, take a few of these statistics classes
Learn Data Science - Infographic (article) - DataCamp
LIGO Gravity Wave GW150914_tutorial
O.R. & Analytics Success Stories - INFORMS
Paul Ford: What Is Code? | Bloomberg
Science Isn’t Broken | FiveThirtyEight
Top 28 Cheat Sheets for Machine Learning, Data Science, Probability, SQL & Big Data
GitHub Python Data Science Spotlight: AutoML, NLP, Visualization, ML Workflows
Solved end-to-end Data Science projects
Dive into Deep Learning (An interactive deep learning book with code, math, and discussions)
60+ Free Books on Big Data, Data Science, Data Mining, Machine Learning, Python, R, and more
Feature Engineering and Selection: A Practical Approach for Predictive Models
Neural Networks and Deep Learning - an online book
Adding an existing project to GitHub using the command line - User Documentation
An Intro to Git and GitHub for Beginners (Tutorial)
Follow these simple rules and you’ll become a Git and GitHub master
git - the simple guide - no deep shit!
How not to be afraid of GIT anymore – freeCodeCamp.org
joshnh/Git-Commands: A list of commonly used Git commands
The beginner’s guide to contributing to a GitHub project – Rob Allen's DevNotes
Understanding the GitHub Flow · GitHub Guides
Towards an anti-fascist AI (from opendemocracy.net)
Becoming a Level 3.0 Data Scientist
The Third-wave of Data Scientist
46 Most Intellectually Stimulating Sites That Will Spark Your Inner Genius in 10 Minutes a Day
Artificial Intelligence Learns to Learn Entirely on Its Own | Quanta Magazine
Edward Witten Ponders the Nature of Reality | Quanta Magazine
Foundations Built for a General Theory of Neural Networks - Quanta Magazine
General Thinking Tools: 9 Mental Models to Solve Difficult Problems
How Social Media Endangers Knowledge | WIRED
In These Small Cities, AI Advances Could Be Costly - MIT Technology Review
Machine Learning’s ‘Amazing’ Ability to Predict Chaos | Quanta Magazine
New Brain Maps With Unmatched Detail May Change Neuroscience | WIRED
Pedro Domingos on the Arms Race in Artificial Intelligence - SPIEGEL ONLINE
Quantum Leaps in Quantum Computing? - Scientific American
The Fragile State of the Midwest’s Public Universities - The Atlantic
The Future of Human Work Is Imagination, Creativity, and Strategy
The Quantum Thermodynamics Revolution | Quanta Magazine
What Is Code? | Paul Ford | Bloomberg
The Economics Of Artificial Intelligence - How Cheaper Predictions Will Change The World
OpenAI’s Dota 2 defeat is still a win for artificial intelligence - The Verge
Machine Learning Confronts the Elephant in the Room | Quanta Magazine
Complete lecture notes of the Stanford/Coursera Machine Learning class by Andrew Ng
200 universities just launched 560 free online courses. Here’s the full list.
Artificial Intelligence | MIT OpenCourseWare
Dashboard | MIT Professional Education Digital Programs
Data Science A-Z™: Real-Life Data Science Exercises Included | Udemy
How to choose effective MOOCs for machine learning and data science?
I uncovered 1,150+ Coursera courses that are still completely free
Information and Entropy | MIT OpenCourseWare
Introduction to Algorithms | MIT OpenCourseWare
Introduction to Data Analysis using Excel | edX
Introduction to Python for Data Science | edX
Introduction to R for Data Science | edX
Mathematics for Computer Science | MIT OpenCourseWare
Programming with Python for Data Science!
Statistical Thinking for Data Science course
Top Data Science Online Courses in 2017 – LearnDataSci
U. Wash ML course Jupyter Home
A Visual Explanation of SQL Joins
PostgreSQL: Mathematical Functions and Operators
PostgreSQL: String Functions and Operators
Psycopg2 Tutorial - PostgreSQL with Python
The SQL Tutorial for Data Analysis | SQL Tutorial - Mode Analytics
SQL vs NoSQL or MySQL vs MongoDB - YouTube
Thinking in SQL vs Thinking in Python
Kaggle SQL course (including BigQuery topics)
Common statistical tests are linear models (or: how to teach stats)
Introductory statistics - OpenText Library
Regression Analysis Tutorial and Examples | Minitab
The 10 Statistical Techniques Data Scientists Need to Master
The Ultimate Guide to 12 Dimensionality Reduction Techniques (with Python codes)
Thomas Bayes and the crisis in science – TheTLS
Welcome to STAT 505! | STAT 505
Introduction to Bayesian Linear Regression – Towards Data Science
Probability and Statistics Visually
The paper describing Scikit-image from its core developers
Full-screen interactive that lets you explore the first 300 years of Data Visualization
designing-great-visualizations.pdf
Gallery of Data Visualization - Missed Opportunities and Graphical Failures
Lesson 1-4, first visualization data - Govind Acharya | Tableau Public
Mapping the 1854 Cholera Outbreak | Tableau Public
GGobi data visualization system.
Medium – Read, write and share stories that matter
[ISLR class videos](https://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/)
R Markdown: The Definitive Guide
How to Prepare for a Machine Learning Interview - Semantic Bits
AI Knowledge Map: How To Classify AI Technologies
Building A Linear Regression with PySpark and MLlib
Complete Guide on DataFrame Operations in PySpark
Install_Spark_on_Windows10.pdf
Introduction · Mastering Apache Spark
MLlib: Main Guide - Spark 2.3.1 Documentation
Overview - Spark 2.3.1 Documentation
RDD Programming Guide - Spark 2.3.1 Documentation
rdflib 5.0.0-dev — rdflib 5.0.0-dev documentation
Spark SQL and DataFrames - Spark 2.3.1 Documentation
Welcome to Spark Python API Docs! — PySpark 2.3.1 documentation
Why You Should Consider Google AI Platform For Your Machine Learning Projects
A Short Guide to Hard Problems | Quanta Magazine
The 10 Mining Techniques Data Scientists Need for Their Toolbox
Wikipedia Data Science: Working with the World’s Largest Encyclopedia
A Brief Overview of Outlier Detection Techniques – Towards Data Science
A Beginner-Friendly Introduction to Containers, VMs and Docker
A fast and easy Docker tutorial for beginners (video series)
Docker Compose in 12 Minutes - YouTube
How to Install and Use Docker on Ubuntu 18.04 | DigitalOcean
How to Install Docker On Ubuntu 18.04 Bionic Beaver - LinuxConfig.org
Learn Docker in 12 Minutes 🐳 - YouTube
What is a Container? - YouTube
What is Docker | Docker Tutorial for Beginners | Docker Container | DevOps Tools | Edureka - YouTube
Building Your Own Data Science Platform With Python & Docker - YouTube
50+ Data Structure and Algorithms Interview Questions for Programmers
GraphQL vs. REST – Apollo GraphQL
Microservices, APIs, and Swagger: How They Fit Together | Swagger
REST API concepts and examples - YouTube
Web Architecture 101 – VideoBlocks Product & Engineering
REST API & RESTful Web Services Explained - YouTube
Our Collections – Towards Data Science
JSON Crash Course - YouTube
Can I use... Support tables for HTML5, CSS3, etc
HTML5 Form Validation Examples < HTML | The Art of Web
The CSS Handbook: a handy guide to CSS for developers
Creating a Simple Website with HTML and CSS - Part 1 - YouTube
Learn CSS in 12 Minutes - YouTube
Beginner JavaScript Tutorial - 1 - Introduction to JavaScript - YouTube
Form Validation with JavaScript - Check for an Empty Text Field - YouTube
JavaScript beginner tutorial 30 - form validation text boxes and passwords - YouTube
JavaScript: Simple Form Validation - YouTube
Learn JavaScript in 12 Minutes - YouTube
Machine Learning with JavaScript : Part 1 – Hacker Noon
Machine Learning with JavaScript : Part 2 – Hacker Noon
W3School - JavaScript Form Validation
W3schools - JavaScript Tutorial
ClearlyDecoded.com - Yaakov Chaikin
GoDaddy Hosting Account Getting Started Guide
How to Make A Website in 2018 - Web Hosting Guide | WHSR
Art of Problem Solving - LaTeX symbols
Detexify LaTeX handwritten symbol recognition
The Comprehensive LaTeX Symbol List - symbols-a4.pdf
MathJax Documentation — MathJax 2.7 documentation
TeX Commands available in MathJax
How to Install Ubuntu Linux on VirtualBox on Windows 10 [Step by Step Guide] | It's FOSS
Microsoft PowerShell Tutorial & Training Course – Microsoft Virtual Academy
Most Popular Linux Distributions and Why They Dominate the Market
[Solved] Could not get lock /var/lib/dpkg/lock Error in Ubuntu | It's FOSS
Time Series Analysis in Python: An Introduction – Towards Data Science
RJT1990/pyflux: Open source time series library for Python
Getting Started with Time Series — PyFlux 0.4.7 documentation
Complete guide to create a Time Series Forecast (with Codes in Python)
How to Create an ARIMA Model for Time Series Forecasting with Python
Time series with Siraj course by Kaggle
Debunking The Myths And Reality Of Artificial Intelligence - Forbes
Artificial Intelligence — The Revolution Hasn’t Happened Yet
Can Buddhist philosophy explain what came before the Big Bang? | Aeon Essays
Coming to Grips with the Implications of Quantum Mechanics - Scientific American Blog Network
Did Toolmaking Pave the Road for Human Language? - The Atlantic
Gatekeeping and Elitism in Data Science
How Do Aliens Solve Climate Change? - The Atlantic
How I Learned to Stop Worrying About the LHC’s Missing New Physics
How Information Got Re-Invented – Limits – Medium
Inside Amazon’s $3.5 million competition to make Alexa chat like a human - The Verge
Let’s make private data into a public good - MIT Technology Review
On Chomsky and the Two Cultures of Statistical Learning
Strategy vs. Tactics: What's the Difference and Why Does it Matter?
The Way You Read Books Says A Lot About Your Intelligence, Here’s Why
To Build Truly Intelligent Machines, Teach Them Cause and Effect | Quanta Magazine
Why Is American Mass Transit So Bad? It's a Long Story. - CityLab
Yuval Noah Harari on what 2050 has in store for humankind | WIRED UK
Yuval Noah Harari on Why Technology Favors Tyranny - The Atlantic
Yuval Noah Harari: ‘The idea of free information is extremely dangerous’ | Culture | The Guardian
Beyond Weird: Decoherence, Quantum Weirdness, and Schrödinger's Cat - The Atlantic
Life Is a Braid in Spacetime – Time – Medium
Mental Models: How to Train Your Brain to Think in New Ways - James Clear - Pocket
Don’t Compete. Create! - Darius Foroux - Pocket
Tesla will live and die by the Gigafactory - The Verge
So you want to be a Research Scientist – Vincent Vanhoucke – Medium
Homeland Security Will Let Software Flag Potential Terrorists
What Happens When a World Order Ends
Kevin Slavin: How algorithms shape our world | TED Talk
The Brain's Autopilot Mechanism Steers Consciousness - Scientific American
What is Intelligence? – Towards Data Science
This Is Exactly How You Should Train Yourself To Be Smarter - Michael Simmons - Pocket
The blind spot of science is the neglect of lived experience | Aeon Essays
A Complete Tutorial to Learn Data Science with Julia from Scratch
ML Experiment Tracking: What It Is, Why It Matters, and How to Implement It
Evaluating machine learning models for fairness and bias
Creating data science APIs with Flask
Flask and Heroku for online Machine Learning deployment
Overview of the different approaches to putting Machine Learning (ML) models in production
[Guide] Building Data Science Web Application with React, NodeJS, and MySQL
A beginner’s guide to training and deploying machine learning models using Python
A Guide to Scaling Machine Learning Models in Production
Deploying Keras Deep Learning Models with Flask – Towards Data Science
Deploying Machine Learning at Scale - Algorithmia Blog
Deploying Machine Learning has never been so easy – Towards Data Science
Quora - How do you take a machine learning model to production?
Tutorial to deploy Machine Learning model in Production as API with Flask
From Big Data to micro-services: how to serve Spark-trained models through AWS lambdas
How to deliver on Machine Learning projects – Insight Data
Deploying a Keras Deep Learning Model as a Web Application in P
Genetic Algorithm Implementation in Python – Towards Data Science
Introduction to Optimization with Genetic Algorithm
A tutorial on Differential Evolution with Python · Pablo R. Mier
Guide to the Sequential model - Keras Documentation
How to Use Word Embedding Layers for Deep Learning with Keras - Machine Learning Mastery
Brandon Rohrer - Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM)
CS231n Lecture 10 - Recurrent Neural Networks, Image Captioning, LSTM - YouTube
Nuts and Bolts of Applying Deep Learning (Andrew Ng) - YouTube
Siraj Raval - LSTM Networks - The Math of Intelligence (Week 8) - YouTube
Siraj Raval - Recurrent Neural Networks - The Math of Intelligence (Week 5) - YouTube
Andrew Ng: Artificial Intelligence is the New Electricity - YouTube
A Visual Guide to Evolution Strategies
Andrej Karpathy blog - Hacker's guide to Neural Networks
Best (and Free!!) Resources to understand Nuts and Bolts of Deep learning
But what is a Neural Network? | Deep learning, chapter 1
Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data
Convolutional Networks in Java - Deeplearning4j: Open-source, Distributed Deep Learning for the JVM
CS231n Convolutional Neural Networks for Visual Recognition
Deep Dive into Math Behind Deep Networks – Towards Data Science
Deep Learning Fundamentals - Cognitive Class
J Alammar – Explorations in touchable pixels and intelligent androids
Learning without Backpropagation: Intuition and Ideas (Part 1) – Tom Breloff
Must know Information Theory concepts in Deep Learning (AI)
Neural networks and deep learning
Neural Style Transfer: Creating Art with Deep Learning using tf.keras and eager execution
The Unreasonable Effectiveness of Recurrent Neural Networks
Understanding Hinton’s Capsule Networks. Part I: Intuition.
Understanding LSTM Networks -- colah's blog
A Neural Network in 13 lines of Python (Part 2 - Gradient Descent) - i am trask
How Do Artificial Neural Networks Learn? – Towards Data Science
The Neural Network Zoo - The Asimov Institute
A History of Deep Learning | Import.io
The Ultimate NanoBook to understand Deep Learning based Image Classifier
How to solve 90% of NLP problems: a step-by-step guide
Coding & English Lit: Natural Language Processing in Python
TextBlob: Simplified Text Processing — TextBlob 0.15.1 documentation
Python Regular Expression Tutorial (article) - DataCamp
Reinforcement Learning Course - Full Machine Learning Tutorial
A brief introduction to reinforcement learning – freeCodeCamp.org
An introduction to Reinforcement Learning – freeCodeCamp.org
Key Papers in Deep RL — Spinning Up documentation
Nuts & Bolts of Reinforcement Learning: Model Based Planning using Dynamic Programming
Reinforcement Learning: A Deep Dive | Toptal
Part 1: Key Concepts in RL — Spinning Up documentation
Dissecting Reinforcement Learning-Part.1
Reinforcement Q-Learning from Scratch in Python with OpenAI Gym – LearnDataSci
Google AI Blog: Curiosity and Procrastination in Reinforcement Learning
Reinforcement Learning: Monte Carlo Learning using OpenAI Gym
Building Input Functions with tf.estimator | TensorFlow
Getting Started With TensorFlow | TensorFlow
Installing TensorFlow on Windows | TensorFlow
TensorFlow Linear Model Tutorial | TensorFlow
TensorFlow Wide & Deep Learning Tutorial | TensorFlow
Using TensorFlow in Windows with a GPU | Heaton Research
Installation Guide Windows :: CUDA Toolkit Documentation
7 Steps to Mastering Machine Learning With Python
A visual introduction to machine learning
Approaching (Almost) Any Machine Learning Problem | Abhishek Thakur | No Free Hunch
Automated Machine Learning Hyperparameter Tuning in Python
Deep Learning For Coders fast.ai
Essentials of Machine Learning Algorithms (with Python and R Codes)
GOOGLE - Rules of Machine Learning: | Machine Learning Rules | Google Developers
http://www.r2d3.us/visual-intro-to-machine-learning-part-2/
Lecture Collection | Machine Learning - Stanford course
Machine Learning Zero-to-Hero: Everything you need in order to compete on Kaggle for the first…
Microsoft Azure ML Cheat sheet
Open Machine Learning Course (beta) • mlcourse.ai
Pedro Domingos Machine Learning lectures
The Hitchhiker’s Guide to Machine Learning in Python
Top 10 Machine Learning Projects on Github
UCI Machine Learning Repository
Hello Kaggle! - A Kaggle Guide for someone who is new at Kaggle
Everything About Python — Beginner To Advanced
Interactive spreadsheets in Jupyter
Built-in magic commands — IPython 6.2.1 documentation
Concrete Statistics Jupyter Notebook Peter Norvig
Economics simulation Jupyter Notebook Peter Norvig
Using Interact — Jupyter Widgets 7.0.3 documentation
Pixie - visual Python debugger for Jupyter notebook
color example code: colormaps_reference.py — Matplotlib 2.0.2 documentation
Matplotlib Plotting commands summary
Seaborn tutorial — seaborn 0.7.1 documentation
Github/jmportilla/Complete-Python-Bootcamp: Lectures
Jupyter Notebook - Udemy Complete Python Bootcamp course
Python for Data Science and Machine Learning Bootcamp | Udemy
Computational Science and Engineering I | Mathematics | MIT OpenCourseWare
Foundations of Machine Learning (A course by Bloomberg)
Linear algebra (numpy.linalg) — NumPy v1.12 Manual
NumPy v1.12 Universal functions
Random sampling (numpy.random) — NumPy v1.13 Manual
SciPy — SciPy v0.19.0 Reference Guide
numpy-100/100 Numpy exercises with hint.md at master · rougier/numpy-100
Pandas: Python Data Analysis Library
How to publish your own Python Package on PyPi – freeCodeCamp
Step-by-Step Guide to Creating R and Python Libraries (in JupyterLab)
How to submit a package to PyPI — Peter Downs
Packaging and Distributing Projects — Python Packaging User Guide
reStructuredText Primer — Sphinx 1.8.0+ documentation
Using TestPyPI — Python Packaging User Guide
How to open source your Python library | Opensource.com
Amazon Web Services (AWS) - Cloud Computing Services
Connecting to Your Linux Instance from Windows Using PuTTY - Amazon Elastic Compute Cloud
Install Spark on Windows (PySpark) – Michael Galarnyk – Medium
10 Steps to Set Up Your Python Project for Success
itertools — Functions creating iterators for efficient looping — Python 3.6.3 documentation
Processing XML in Python with ElementTree - Eli Bendersky's website
28 Jupyter Notebook tips, tricks and shortcuts
A curated list of awesome Python frameworks, libraries, software and resources
Archived Problems - Project Euler
Choosing the right estimator — scikit-learn 0.18.1 documentation
Installing XGBoost For Anaconda on Windows (IT Best Kept Secret Is Optimization)
Python Conquers The Universe | Adventures across space and time with the Python programming language
Python Flask From Scratch - YouTube
Python Tricks 101 – Hacker Noon
Python tutorial - TutorialsPoint
Regular Expressions for Data Scientists
Simple Linear Regression Analysis - ReliaWiki
Introduction — Python 101 1.0 documentation
Documenting Python Code: A Complete Guide – Real Python
MIT AI: Python (Guido van Rossum) - YouTube
Python IDEs and Code Editors (Guide) – Real Python
Advanced Python web scraping tricks and tips
A Beginner’s Guide to Neural Networks with R
A Comprehensive Guide to Data Visualisation in R for Beginners
An R Introduction to Statistics | R Tutorial
Data Manipulation with dplyr | R-bloggers
Data Science and Machine Learning Bootcamp with R | Udemy
R Tutorial Series - Statistical Tests | Saranya Anandh | Pulse | LinkedIn
R: Recursive Partitioning and Regression Trees
- A list of colleges and universities offering degrees in data science.
- Data Science Degree @ Berkeley
- Data Science Degree @ UVA
- Data Science Degree @ Wisconsin
- MS in Computer Information Systems @ Boston University
- MS in Business Analytics @ ASU Online
- Data Science Engineer @ BTH
- MS in Applied Data Science @ Syracuse
- M.S. Management & Data Science @ Leuphana
- Master of Data Science @ Melbourne University
- MSc in Data Science @ The University of Edinburgh
- Master of Management Analytics @ Queen's University
- Master of Data Science @ Illinois Institute of Technology
- Master of Applied Data Science @ The University of Michigan
- Master Data Science and Artificial Intelligence @ Eindhoven University of Technology
- Coursera Introduction to Data Science
- Data Science - 9 Steps Courses, A Specialization on Coursera
- Data Mining - 5 Steps Courses, A Specialization on Coursera
- Machine Learning – 5 Steps Courses, A Specialization on Coursera
- CS 109 Data Science
- OpenIntro
- CS 171 Visualization
- Process Mining: Data science in Action
- Oxford Deep Learning
- Oxford Deep Learning - video
- Oxford Machine Learning
- UBC Machine Learning - video
- Data Science Specialization
- Coursera Big Data Specialization
- Statistical Thinking for Data Science and Analytics by edX
- Cognitive Class AI by IBM
- Udacity - Deep Learning
- Keras in Motion
- Microsoft Professional Program for Data Science
- COMP3222/COMP6246 - Machine Learning Technologies
- CS 231 - Convolutional Neural Networks for Visual Recognition
- Coursera TensorFlow in Practice
- Coursera Deep Learning Specialization
- 365 Data Science Course
- Coursera Natural Language Processing Specialization
- Coursera GAN Specialization
- Codecademy's Data Science
- Linear Algebra - Linear Algebra course by Gilbert Strang
- A 2020 Vision of Linear Algebra (G. Strang)
- Python for Data Science Foundation Course
- Data Science: Statistics & Machine Learning
- Machine Learning Engineering for Production (MLOps)
- NLP Specialization Coursera
- Recommender Systems Specialization from the University of Minnesota is an intermediate/advanced-level specialization on the Coursera platform focused on recommender systems.
- 1000 Data Science Projects you can run in the browser with IPython.
- #tidytuesday A weekly data project aimed at the R ecosystem.
- Data science your way
- PySpark Cheatsheet
- Machine Learning, Data Science and Deep Learning with Python
- How To Label Data
- Your Guide to Latent Dirichlet Allocation
- Over 1000 Data Science Online Courses at Classpert Online Search Engine
- Tutorials of source code from the book Genetic Algorithms with Python by Clinton Sheppard
- Tutorials to get started on signal processing for machine learning
- Realtime deployment Tutorial on Python time-series model deployment.
- Python for Data Science: A Beginner’s Guide
- Data Scientist with R
- Data Scientist with Python
- Genetic Algorithms OCW Course
- AI Expert Roadmap - Roadmap to becoming an Artificial Intelligence Expert
- Convex Optimization - Convex Optimization (basics of convex analysis; least-squares, linear and quadratic programs, semidefinite programming, minimax, extremal volume, and other problems; optimality conditions, duality theory...)
Link | Description |
---|---|
The Data Science Lifecycle Process | The Data Science Lifecycle Process is a process for taking data science teams from Idea to Value repeatedly and sustainably. The process is documented in this repo |
Data Science Lifecycle Template Repo | Template repository for data science lifecycle project |
RexMex | A general purpose recommender metrics library for fair evaluation. |
ChemicalX | A PyTorch based deep learning library for drug pair scoring. |
PyTorch Geometric Temporal | Representation learning on dynamic graphs. |
Little Ball of Fur | A graph sampling library for NetworkX with a Scikit-Learn like API. |
Karate Club | An unsupervised machine learning extension library for NetworkX with a Scikit-Learn like API. |
ML Workspace | All-in-one web-based IDE for machine learning and data science. The workspace is deployed as a Docker container and is preloaded with a variety of popular data science libraries (e.g., Tensorflow, PyTorch) and dev tools (e.g., Jupyter, VS Code) |
Neptune.ai | Community-friendly platform supporting data scientists in creating and sharing machine learning models. Neptune facilitates teamwork, infrastructure management, models comparison and reproducibility. |
steppy | Lightweight Python library for fast and reproducible machine learning experimentation. Introduces a very simple interface that enables clean machine learning pipeline design. |
steppy-toolkit | Curated collection of the neural networks, transformers and models that make your machine learning work faster and more effective. |
Datalab from Google | easily explore, visualize, analyze, and transform data using familiar languages, such as Python and SQL, interactively. |
Hortonworks Sandbox | is a personal, portable Hadoop environment that comes with a dozen interactive Hadoop tutorials. |
R | is a free software environment for statistical computing and graphics. |
RStudio | IDE – powerful user interface for R. It’s free and open source, works on Windows, Mac, and Linux. |
Python - Pandas - Anaconda | Completely free enterprise-ready Python distribution for large-scale data processing, predictive analytics, and scientific computing |
Pandas GUI | Pandas GUI |
Scikit-Learn | Machine Learning in Python |
NumPy | NumPy is fundamental for scientific computing with Python. It supports large, multi-dimensional arrays and matrices and includes an assortment of high-level mathematical functions to operate on these arrays. |
Vaex | Vaex is a Python library that allows you to visualize large datasets and calculate statistics at high speeds. |
SciPy | SciPy works with NumPy arrays and provides efficient routines for numerical integration and optimization. |
Data Science Toolbox | Coursera Course |
Data Science Toolbox | Blog |
Wolfram Data Science Platform | Take numerical, textual, image, GIS or other data and give it the Wolfram treatment, carrying out a full spectrum of data science analysis and visualization and automatically generating rich interactive reports—all powered by the revolutionary knowledge-based Wolfram Language. |
Datadog | Solutions, code, and devops for high-scale data science. |
Variance | Build powerful data visualizations for the web without writing JavaScript |
Kite Development Kit | The Kite Software Development Kit (Apache License, Version 2.0) , or Kite for short, is a set of libraries, tools, examples, and documentation focused on making it easier to build systems on top of the Hadoop ecosystem. |
Domino Data Labs | Run, scale, share, and deploy your models — without any infrastructure or setup. |
Apache Flink | A platform for efficient, distributed, general-purpose data processing. |
Apache Hama | Apache Hama is an Apache Top-Level open source project, allowing you to do advanced analytics beyond MapReduce. |
Weka | Weka is a collection of machine learning algorithms for data mining tasks. |
Octave | GNU Octave is a high-level interpreted language, primarily intended for numerical computations. (Free Matlab) |
Apache Spark | Lightning-fast cluster computing |
Hydrosphere Mist | a service for exposing Apache Spark analytics jobs and machine learning models as realtime, batch or reactive web services. |
Data Mechanics | A data science and engineering platform making Apache Spark more developer-friendly and cost-effective. |
Caffe | Deep Learning Framework |
Torch | A scientific computing framework for LuaJIT |
Nervana's python based Deep Learning Framework | . |
Skale | High performance distributed data processing in NodeJS |
Aerosolve | A machine learning package built for humans. |
Intel framework | Intel® Deep Learning Framework |
Datawrapper | An open source data visualization platform helping everyone to create simple, correct and embeddable charts. Also at github.com |
TensorFlow | TensorFlow is an Open Source Software Library for Machine Intelligence |
Natural Language Toolkit | An introductory yet powerful toolkit for natural language processing and classification |
nlp-toolkit for node.js | . |
Julia | high-level, high-performance dynamic programming language for technical computing |
IJulia | a Julia-language backend combined with the Jupyter interactive environment |
Apache Zeppelin | Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more |
Featuretools | An open source framework for automated feature engineering written in Python |
Optimus | Cleansing, pre-processing, feature engineering, exploratory data analysis and easy ML with PySpark backend. |
Albumentations | A fast and framework-agnostic image augmentation library that implements a diverse set of augmentation techniques. Supports classification, segmentation, detection out of the box. Was used to win a number of Deep Learning competitions at Kaggle, Topcoder and those that were a part of the CVPR workshops. |
DVC | An open-source data science version control system. It helps track, organize and make data science projects reproducible. In its very basic scenario it helps version control and share large data and model files. |
Lambdo | is a workflow engine which significantly simplifies data analysis by combining in one analysis pipeline (i) feature engineering and machine learning (ii) model training and prediction (iii) table population and column evaluation. |
Feast | A feature store for the management, discovery, and access of machine learning features. Feast provides a consistent view of feature data for both model training and model serving. |
Polyaxon | A platform for reproducible and scalable machine learning and deep learning. |
LightTag | Text Annotation Tool for teams |
UBIAI | Easy-to-use text annotation tool for teams with most comprehensive auto-annotation features. Supports NER, relations and document classification as well as OCR annotation for invoice labeling |
Trains | Auto-Magical Experiment Manager, Version Control & DevOps for AI |
Hopsworks | Open-source data-intensive machine learning platform with a feature store. Ingest and manage features for both online (MySQL Cluster) and offline (Apache Hive) access, train and serve models at scale. |
MindsDB | MindsDB is an Explainable AutoML framework for developers. With MindsDB you can build, train and use state-of-the-art ML models with as little as one line of code. |
Lightwood | A PyTorch-based framework that breaks down machine learning problems into smaller blocks that can be glued together seamlessly, with the objective of building predictive models with one line of code. |
AWS Data Wrangler | An open-source Python package that extends the power of Pandas library to AWS connecting DataFrames and AWS data related services (Amazon Redshift, AWS Glue, Amazon Athena, Amazon EMR, etc). |
Amazon Rekognition | AWS Rekognition is a service that lets developers working with Amazon Web Services add image analysis to their applications. Catalog assets, automate workflows, and extract meaning from your media and applications. |
Amazon Textract | Automatically extract printed text, handwriting, and data from any document. |
Amazon Lookout for Vision | Spot product defects using computer vision to automate quality inspection. Identify missing product components, vehicle and structure damage, and irregularities for comprehensive quality control. |
Amazon CodeGuru | Automate code reviews and optimize application performance with ML-powered recommendations. |
CML | An open source toolkit for using continuous integration in data science projects. Automatically train and test models in production-like environments with GitHub Actions & GitLab CI, and autogenerate visual reports on pull/merge requests. |
Dask | An open source Python library to painlessly transition your analytics code to distributed computing systems (Big Data) |
Statsmodels | A Python-based inferential statistics, hypothesis testing and regression framework |
Gensim | An open-source library for topic modeling of natural language text |
spaCy | A performant natural language processing toolkit |
Grid Studio | Grid studio is a web-based spreadsheet application with full integration of the Python programming language. |
Python Data Science Handbook | Python Data Science Handbook: full text in Jupyter Notebooks |
Shapley | A data-driven framework to quantify the value of classifiers in a machine learning ensemble. |
DAGsHub | A platform built on open source tools for data, model and pipeline management. |
Deepnote | A new kind of data science notebook. Jupyter-compatible, with real-time collaboration and running in the cloud. |
Valohai | An MLOps platform that handles machine orchestration, automatic reproducibility and deployment. |
PyMC3 | A Python Library for Probabilistic Programming (Bayesian Inference and Machine Learning) |
PyStan | Python interface to Stan (Bayesian inference and modeling) |
hmmlearn | Unsupervised learning and inference of Hidden Markov Models |
Chaos Genius | ML powered analytics engine for outlier/anomaly detection and root cause analysis |
Nimblebox | A full-stack MLOps platform designed to help data scientists and machine learning practitioners around the world discover, create, and launch multi-cloud apps from their web browser. |
- scikit-learn
- scikit-multilearn
- sklearn-expertsys
- scikit-feature
- scikit-rebate
- seqlearn
- sklearn-bayes
- sklearn-crfsuite
- sklearn-deap
- sigopt_sklearn
- sklearn-evaluation
- scikit-image
- scikit-opt
- scikit-posthocs
- pystruct
- Shogun
- xLearn
- cuML
- causalml
- mlpack
- MLxtend
- modAL
- Sparkit-learn
- hyperlearn
- dlib
- RuleFit
- pyGAM
- Deepchecks
- torchvision
- torchtext
- torchaudio
- ignite
- PyTorchNet
- PyToune
- skorch
- PyVarInf
- pytorch_geometric
- GPyTorch
- pyro
- Catalyst
- pytorch_tabular
- TensorLayer
- TFLearn
- Sonnet
- tensorpack
- TRFL
- Polyaxon
- NeuPy
- tfdeploy
- tensorflow-upstream
- TensorFlow Fold
- tensorlm
- TensorLight
- Mesh TensorFlow
- Ludwig
- TF-Agents
- TensorForce
- altair
- addepar
- amcharts
- anychart
- bokeh
- slemma
- cartodb
- Cube
- d3plus
- Data-Driven Documents(D3js)
- datahero
- dygraphs
- ECharts
- exhibit
- gephi
- ggplot2
- Glue
- Google Chart Gallery
- Highcharts
- import.io
- ipychart
- jqplot
- Matplotlib
- nvd3
- Netron
- Opendata-tools
- Openrefine
- plot.ly
- raw
- Seaborn
- techanjs
- Timeline
- variancecharts
- vida
- vizzu
- Wrangler
- r2d3
- NetworkX
- Redash
- C3
- TensorWatch
- ICML - International Conference on Machine Learning
- GECCO - The Genetic and Evolutionary Computation Conference (GECCO)
- epjdatascience
- Journal of Data Science - an international journal devoted to applications of statistical methods at large
- Big Data Research
- Journal of Big Data
- Big Data & Society
- Data Science Journal
- datatau.com/news - Like Hacker News, but for data
- Data Science Trello Board
- Medium Data Science Topic - Data Science related publications on medium
- Towards Data Science Genetic Algorithm Topic - Genetic Algorithm related publications on Towards Data Science
- How to Become a Data Scientist
- Introduction to Data Science
- Intro to Data Science for Enterprise Big Data
- How to Interview a Data Scientist
- How to Share Data with a Statistician
- The Science of a Great Career in Data Science
- What Does a Data Scientist Do?
- Building Data Start-Ups: Fast, Big, and Focused
- How to win data science competitions with Deep Learning
- Full-Stack Data Scientist
- AI at Home
- AI Today
- Adversarial Learning
- Becoming a Data Scientist
- Chai time Data Science
- Data Crunch
- Data Engineering Podcast
- Data Science at Home
- Data Science Mixer
- Data Skeptic
- Data Stories
- Datacast
- DataFramed
- DataTalks.Club
- Gradient Dissent
- Learning Machines 101
- Let's Data (Brazil)
- Linear Digressions
- Not So Standard Deviations
- O'Reilly Data Show Podcast
- Partially Derivative
- Superdatascience
- The Data Engineering Show
- The Radical AI Podcast
- The Robot Brains Podcast
- What's The Point
- Artificial Intelligence with Python - Tutorialspoint
- Machine Learning from Scratch
- Probabilistic Machine Learning: An Introduction
- A Comprehensive Guide to Machine Learning
- Become a Leader in Data Science - Early access
- Fighting Churn With Data
- Data Science at Scale with Python and Dask
- Python Data Science Handbook
- The Data Science Handbook: Advice and Insights from 25 Amazing Data Scientists
- Think Like a Data Scientist
- Introducing Data Science
- Practical Data Science with R
- Everyday Data Science & (cheaper PDF version)
- Exploring Data Science - free eBook sampler
- Exploring the Data Jungle - free eBook sampler
- Classic Computer Science Problems in Python
- Math for Programmers Early access
- R in Action, Third Edition Early access
- Data Science Bookcamp Early access
- Data Science Thinking: The Next Scientific, Technological and Economic Revolution
- Applied Data Science: Lessons Learned for the Data-Driven Business
- The Data Science Handbook
- Essential Natural Language Processing - Early access
- Mining Massive Datasets - free e-book accompanied by an online course
- Pandas in Action - Early access
- Genetic Algorithms and Genetic Programming
- Genetic algorithms in search, optimization, and machine learning - Free Download
- Advances in Evolutionary Algorithms - Free Download
- Genetic Programming: New Approaches and Successful Applications - Free Download
- Evolutionary Algorithms - Free Download
- Advances in Genetic Programming, Vol. 3 - Free Download
- Global Optimization Algorithms: Theory and Application - Free Download
- Genetic Algorithms and Evolutionary Computation - Free Download
- Convex Optimization - Convex Optimization book by Stephen Boyd - Free Download
- Data Analysis with Python and PySpark - Early access
- R for Data Science
- Build a Career in Data Science
- Machine Learning Bookcamp - Early access
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition
- Effective Data Science Infrastructure
- Practical MLOps: How to Get Ready for Production Models
- Data Analysis with Python and PySpark
- Regression, a Friendly guide - Early access
- Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing
- Data Science at the Command Line: Facing the Future with Time-Tested Tools
- Machine Learning - CIn UFPE
- Machine Learning with Python - Tutorialspoint
- Deep Learning
- Designing Cloud Data Platforms - Early access
- An Introduction to Statistical Learning with Applications in R
- Deep Learning with PyTorch
- Neural Networks and Deep Learning
- Deep Learning Cookbook
- Introduction to Machine Learning with Python
- Artificial Intelligence: Foundations of Computational Agents, 2nd Edition - Free HTML version
- The Quest for Artificial Intelligence: A History of Ideas and Achievements - Free Download
- Graph Algorithms for Data Science - Early access
- Data Mesh in Action - Early access
- Bloggers
- Facebook Accounts
- Twitter Accounts
- Newsletters
- Telegram Channels
- Slack Communities
- Data Science Competitions
- Wes McKinney - Wes McKinney Archives.
- Matthew Russell - Mining The Social Web.
- Greg Reda - Greg Reda Personal Blog
- Kevin Davenport - Kevin Davenport Personal Blog
- Julia Evans - Recurse Center alumna
- Hakan Kardas - Personal Web Page
- Sean J. Taylor - Personal Web Page
- Drew Conway - Personal Web Page
- Hilary Mason - Personal Web Page
- Noah Iliinsky - Personal Blog
- Matt Harrison - Personal Blog
- Vamshi Ambati - AllThings Data Science
- Prash Chan - Tech Blog on Master Data Management And Every Buzz Surrounding It
- Clare Corthell - The Open Source Data Science Masters
- Paul Miller - Based in the UK and working globally, Cloud of Data's consultancy services help clients understand the implications of taking data and more to the Cloud.
- Data Science London - Data Science London is a non-profit organization dedicated to the free, open, dissemination of data science. We are the largest data science community in Europe. We are more than 3,190 data scientists and data geeks in our community.
- Datawrangling by Peter Skomoroch - Machine learning, data mining, and more
- Quora Data Science - Data Science Questions and Answers from experts
- Siah a PhD student at Berkeley
- Data Science Report MDS, Inc. Helps Build Careers in Data Science, Advanced Analytics, Big Data Architecture, and High Performance Software Engineering
- Louis Dorard a technology guy with a penchant for the web and for data, big and small
- Machine Learning Mastery about helping professional programmers to confidently apply machine learning algorithms to address complex problems.
- Daniel Forsyth - Personal Blog
- Data Science Weekly - Weekly News Blog
- Revolution Analytics - Data Science Blog
- R Bloggers - R Bloggers
- The Practical Quant Big data
- Datascope Analytics data-driven consulting and design
- Yet Another Data Blog Yet Another Data Blog
- Spenczar a data scientist at Twitch. I handle the whole data pipeline, from tracking to model-building to reporting.
- KD Nuggets - Data Mining, Analytics, Big Data, Data Science; not a blog, a portal
- Meta Brown - Personal Blog
- Data Scientist is building the data scientist culture.
- WhatSTheBigData is some of, all of, or much more than the above and this blog explores its impact on information technology, the business world, government agencies, and our lives.
- Tevfik Kosar - Magnus Notitia
- New Data Scientist How a Social Scientist Jumps into the World of Big Data
- Harvard Data Science - Thoughts on Statistical Computing and Visualization
- Data Science 101 - Learning To Be A Data Scientist
- Kaggle Past Solutions
- DataScientistJourney
- NYC Taxi Visualization Blog
- Learning Lover
- Dataists
- Data-Mania
- Data-Magnum
- Map Reduce Blog
- P-value - Musings on data science, machine learning and stats.
- datascopeanalytics
- Digital transformation
- datascientistjourney
- Data Mania Blog - The File Drawer - Chris Said's science blog
- Emilio Ferrara's web page
- DataNews
- Reddit TextMining
- Periscopic
- Hilary Parker
- Data Stories
- Data Science Lab
- Meaning of
- Adventures in Data Land
- DATA MINERS BLOG
- Dataclysm
- FlowingData - Visualization and Statistics
- Calculated Risk
- O'reilly Learning Blog
- Dominodatalab
- i am trask - A Machine Learning Craftsmanship Blog
- Vademecum of Practical Data Science - Handbook and recipes for data-driven solutions of real-world problems
- Dataconomy - A blog on the new emerging data economy
- Springboard - A blog with resources for data science learners
- Analytics Vidhya - A full-fledged website about data science and analytics study material.
- Occam's Razor - Focused on Web Analytics.
- Data School - Data science tutorials for beginners!
- Colah's Blog - Blog for understanding Neural Networks!
- Sebastian's Blog - Blog for NLP and transfer learning!
- Distill - Dedicated to clear explanations of machine learning!
- Chris Albon's Website - Data Science and AI notes
- Andrew Carr - Data Science with Esoteric programming languages
- floydhub - Blog for Evolutionary Algorithms
- Jingles - Review and extract key concepts from academic papers
- nbshare - Data Science notebooks
- Deep and Shallow - All things Deep and Shallow in Data Science
- Loic Tetrel - Data science blog
- Chip Huyen's Blog - ML Engineering, MLOps, and the use of ML in startups
- Maria Khalusova - Data science blog
- Data
- Big Data Scientist
- Data Science Day
- Data Science Academy
- Facebook Data Science Page
- Data Science London
- Data Science Technology and Corporation
- Data Science - Closed Group
- Center for Data Science
- Big data hadoop NOSQL Hive Hbase
- Analytics, Data Mining, Predictive Modeling, Artificial Intelligence
- Big Data Analytics using R
- Big Data Analytics with R and Hadoop
- Big Data Learnings
- Big Data, Data Science, Data Mining & Statistics
- BigData/Hadoop Expert
- Data Mining / Machine Learning / AI
- Data Mining/Big Data - Social Network Ana
- Vademecum of Practical Data Science
- Veri Bilimi Istanbul
- The Data Science Blog
Twitter | Description |
---|---|
Big Data Combine | Rapid-fire, live tryouts for data scientists seeking to monetize their models as trading strategies |
Big Data Mania | Data Viz Wiz , Data Journalist , Growth Hacker , Author of Data Science for Dummies (2015) |
Big Data Science | Big Data, Data Science, Predictive Modeling, Business Analytics, Hadoop, Decision and Operations Research. |
Charlie Greenbacker | Director of Data Science at @ExploreAltamira |
Chris Said | Data scientist at Twitter |
Clare Corthell | Dev, Design, Data Science @mattermark #hackerei |
DADI Charles-Abner | #datascientist @Ekimetrics. , #machinelearning #dataviz #DynamicCharts #Hadoop #R #Python #NLP #Bitcoin #dataenthousiast |
Data Science Central | Data Science Central is the industry's single resource for Big Data practitioners. |
Data Science London | Data Science. Big Data. Data Hacks. Data Junkies. Data Startups. Open Data |
Data Science Renee | Documenting my path from SQL Data Analyst pursuing an Engineering Master's Degree to Data Scientist |
Data Science Report | Mission is to help guide & advance careers in Data Science & Analytics |
Data Science Tips | Tips and Tricks for Data Scientists around the world! #datascience #bigdata |
Data Vizzard | DataViz, Security, Military |
DataScienceX | |
deeplearning4j | |
DJ Patil | White House Data Chief, VP @ RelateIQ. |
Domino Data Lab | |
Drew Conway | Data nerd, hacker, student of conflict. |
Emilio Ferrara | #Networks, #MachineLearning and #DataScience. I work on #Social Media. Postdoc at @IndianaUniv |
Erin Bartolo | Running with #BigData--enjoying a love/hate relationship with its hype. @iSchoolSU #DataScience Program Mgr. |
Greg Reda | Working @ GrubHub about data and pandas |
Gregory Piatetsky | KDnuggets President, Analytics/Big Data/Data Mining/Data Science expert, KDD & SIGKDD co-founder, was Chief Scientist at 2 startups, part-time philosopher. |
Hadley Wickham | Chief Scientist at RStudio, and an Adjunct Professor of Statistics at the University of Auckland, Stanford University, and Rice University. |
Hakan Kardas | Data Scientist |
Hilary Mason | Data Scientist in Residence at @accel. |
Jeff Hammerbacher | ReTweeting about data science |
John Myles White | Scientist at Facebook and Julia developer. Author of Machine Learning for Hackers and Bandit Algorithms for Website Optimization. Tweets reflect my views only. |
Juan Miguel Lavista | Principal Data Scientist @ Microsoft Data Science Team |
Julia Evans | Hacker - Pandas - Data Analyze |
Kenneth Cukier | The Economist's Data Editor and co-author of Big Data (http://www.big-data-book.com/). |
Kevin Davenport | Organizer of https://www.meetup.com/San-Diego-Data-Science-R-Users-Group/ |
Kevin Markham | Data science instructor, and founder of Data School |
Kim Rees | Interactive data visualization and tools. Data flaneur. |
Kirk Borne | DataScientist, PhD Astrophysicist, Top #BigData Influencer. |
Linda Regber | Data story teller, visualizations. |
Luis Rei | PhD Student. Programming, Mobile, Web. Artificial Intelligence, Intelligent Robotics Machine Learning, Data Mining, Natural Language Processing, Data Science. |
Mark Stevenson | Data Analytics Recruitment Specialist at Salt (@SaltJobs) Analytics - Insight - Big Data - Datascience |
Matt Harrison | Opinions of full-stack Python guy, author, instructor, currently playing Data Scientist. Occasional fathering, husbanding, organic gardening. |
Matthew Russell | Mining the Social Web. |
Mert Nuhoğlu | Data Scientist at BizQualify, Developer |
Monica Rogati | Data @ Jawbone. Turned data into stories & products at LinkedIn. Text mining, applied machine learning, recommender systems. Ex-gamer, ex-machine coder; namer. |
Noah Iliinsky | Visualization & interaction designer. Practical cyclist. Author of vis books: https://www.oreilly.com/pub/au/4419 |
Paul Miller | Cloud Computing/ Big Data/ Open Data Analyst & Consultant. Writer, Speaker & Moderator. Gigaom Research Analyst. |
Peter Skomoroch | Creating intelligent systems to automate tasks & improve decisions. Entrepreneur, ex Principal Data Scientist @LinkedIn. Machine Learning, ProductRei, Networks |
Prash Chan | Solution Architect @ IBM, Master Data Management, Data Quality & Data Governance Blogger. Data Science, Hadoop, Big Data & Cloud. |
Quora Data Science | Quora's data science topic |
R-Bloggers | Tweet blog posts from the R blogosphere, data science conferences and (!) open jobs for data scientists. |
Rand Hindi | |
Randy Olson | Computer scientist researching artificial intelligence. Data tinkerer. Community leader for @DataIsBeautiful. #OpenScience advocate. |
Recep Erol | Data Science geek @ UALR |
Ryan Orban | Data scientist, genetic origamist, hardware aficionado |
Sean J. Taylor | Social Scientist. Hacker. Facebook Data Science Team. Keywords: Experiments, Causal Inference, Statistics, Machine Learning, Economics. |
Silvia K. Spiva | #DataScience at Cisco |
Harsh B. Gupta | Data Scientist at BBVA Compass |
Spencer Nelson | Data nerd |
Talha Oz | Enjoys ABM, SNA, DM, ML, NLP, HI, Python, Java. Top percentile kaggler/data scientist |
Tasos Skarlatidis | Complex Event Processing, Big Data, Artificial Intelligence and Machine Learning. Passionate about programming and open-source. |
Terry Timko | InfoGov; Bigdata; Data as a Service; Data Science; Open, Social & Business Data Convergence |
Tony Baer | IT analyst with Ovum covering Big Data & data management with some systems engineering thrown in. |
Tony Ojeda | Data Scientist , Author , Entrepreneur. Co-founder @DataCommunityDC. Founder @DistrictDataLab. #DataScience #BigData #DataDC |
Vamshi Ambati | Data Science @ PayPal. #NLP, #machinelearning; PhD, Carnegie Mellon alumni (Blog: https://allthingsds.wordpress.com ) |
Wes McKinney | Pandas (Python Data Analysis library). |
WileyEd | Senior Manager - @Seagate Big Data Analytics @McKinsey Alum #BigData + #Analytics Evangelist #Hadoop, #Cloud, #Digital, & #R Enthusiast |
WNYC Data News Team | The data news crew at @WNYC. Practicing data-driven journalism, making it visual and showing our work. |
Alexey Grigorev | Data science author |
- AI Digest. A weekly newsletter to keep up to date with AI, machine learning, and data science. Archive.
- DataTalks.Club. A weekly newsletter about data-related things. Archive.
- The Analytics Engineering Roundup. A newsletter about data science. Archive.
- What is machine learning?
- Andrew Ng: Deep Learning, Self-Taught Learning and Unsupervised Feature Learning
- Data36 - Data Science for Beginners by Tomi Mester
- Deep Learning: Intelligence from Big Data
- Interview with Google's AI and Deep Learning 'Godfather' Geoffrey Hinton
- Introduction to Deep Learning with Python
- What is machine learning, and how does it work?
- Data School - Data Science Education
- Neural Nets for Newbies by Melanie Warrick (May 2015)
- Neural Networks video series by Hugo Larochelle
- Google DeepMind co-founder Shane Legg - Machine Super Intelligence
- Data Science Primer
- Data Science with Genetic Algorithms
- Data Science for Beginners
- DataTalks.Club
- Mildlyoverfitted - Tutorials on intermediate ML/DL topics
- mlops.community - Interviews of industry experts about production ML
- ML Street Talk - Unabashedly technical and non-commercial, so you will hear no annoying pitches.
- Neural networks by 3Blue1Brown
- Neural networks from scratch by Sentdex
- Manning Publications YouTube channel
- Ask Dr Chong: How to Lead in Data Science - Part 1
- Ask Dr Chong: How to Lead in Data Science - Part 2
- Ask Dr Chong: How to Lead in Data Science - Part 3
- Ask Dr Chong: How to Lead in Data Science - Part 4
- Ask Dr Chong: How to Lead in Data Science - Part 5
- Ask Dr Chong: How to Lead in Data Science - Part 6
- Regression Models: Applying simple Poisson regression
- Open Data Science – First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former.
- Loss function porn — Beautiful posts on DS/ML themes with video or graphic visualization.
- Machinelearning – Daily ML news.
Some data mining competition platforms
Preview | Description |
---|---|
Key differences of a data scientist vs. data engineer | |
A visual guide to Becoming a Data Scientist in 8 Steps by DataCamp (img) | |
Mindmap on required skills (img) | |
Swami Chandrasekaran made a Curriculum via Metro map. | |
by @kzawadz via twitter | |
By Data Science Central | |
Data Science Wars: R vs Python | |
How to select statistical or machine learning techniques | |
Choosing the Right Estimator | |
The Data Science Industry: Who Does What | |
Data Science | |
Different Data Science Skills and Roles from this article by Springboard | |
A simple and friendly way of teaching your non-data scientist/non-statistician colleagues how to avoid mistakes with data. From Geckoboard's Data Literacy Lessons. |
- Academic Torrents
- hadoopilluminated.com
- data.gov - The home of the U.S. Government's open data
- United States Census Bureau
- usgovxml.com
- enigma.com - Navigate the world of public data - Quickly search and analyze billions of public records published by governments, companies and organizations.
- datahub.io
- aws.amazon.com/datasets
- datacite.org
- The official portal for European data
- quandl.com - Get the data you need in the form you want; instant download, API or direct to your app.
- figshare.com
- GeoLite Legacy Downloadable Databases
- Quora's Big Datasets Answer
- Public Big Data Sets
- Kaggle Datasets
- A Deep Catalog of Human Genetic Variation
- A community-curated database of well-known people, places, and things
- Google Public Data
- World Bank Data
- NYC Taxi data
- Open Data Philly - Connecting people with data for Philadelphia
- grouplens.org - Sample movie (with ratings), book and wiki datasets
- UC Irvine Machine Learning Repository - contains data sets good for machine learning
- research-quality data sets by Hilary Mason
- National Climatic Data Center - NOAA
- ClimateData.us (related: U.S. Climate Resilience Toolkit)
- r/datasets
- MapLight - provides a variety of data free of charge for uses that are freely available to the general public.
- GHDx - Institute for Health Metrics and Evaluation - a catalog of health and demographic datasets from around the world and including IHME results
- St. Louis Federal Reserve Economic Data - FRED
- New Zealand Institute of Economic Research – Data1850
- Open Data Sources
- UNICEF Data
- undata
- NASA SocioEconomic Data and Applications Center - SEDAC
- The GDELT Project
- Statistics Sweden
- Github free data source list
- StackExchange Data Explorer - an open source tool for running arbitrary queries against public data from the Stack Exchange network.
- SocialGrep - a collection of open Reddit datasets.
- San Francisco Government Open Data
- IBM Blog about open data
- IBM Asset Dataset
- Open Data Index
- Public Git Archive
- GHTorrent
- Microsoft Research Open Data
- Open Government Data Platform India
- Google Dataset Search (beta)
- NAYN.CO - Turkish News with categories
- Covid-19
- Covid-19 Google
- Enron Email Dataset
- 5000 Images of Clothes
- Other amazingly awesome lists can be found in the awesome-awesomeness
- Awesome Machine Learning
- lists
- awesome-dataviz
- awesome-python
- Data Science IPython Notebooks.
- awesome-r
- awesome-datasets
- awesome-Machine Learning & Deep Learning Tutorials
- Awesome Data Science Ideas
- Machine Learning for Software Engineers
- Community Curated Data Science Resources
- Awesome Machine Learning On Source Code
- Awesome Community Detection
- Awesome Graph Classification
- Awesome Decision Tree Papers
- Awesome Fraud Detection Papers
- Awesome Gradient Boosting Papers
- Awesome Computer Vision Models
- Awesome Monte Carlo Tree Search
- Glossary of common statistics and ML terms
- 100 NLP Papers
- Awesome Game Datasets
- Data Science Interviews Questions
- Awesome Explainable Graph Reasoning
- Top Data Science Interview Questions
- Awesome Drug Synergy, Interaction and Polypharmacy Prediction
Course | Slides | Dataset | Notes | Solutions |
---|---|---|---|---|
Introduction to Python | - | - | - | - |
Intermediate Python | - | - | - | - |
PROJECT TV, Halftime Shows, and the Big Game | - | - | - | - |
Data Manipulation with pandas | - | - | - | - |
PROJECT The Android App Market on Google Play | - | - | - | - |
Merging DataFrames with pandas | - | - | - | - |
PROJECT The GitHub History of the Scala Language | - | - | - | - |
Introduction to Data Visualization with Matplotlib | - | - | - | - |
Introduction to Data Visualization with Seaborn | - | - | - | - |
Python Data Science Toolbox (Part 1) | - | - | link | - |
Python Data Science Toolbox (Part 2) | - | - | - | - |
Intermediate Data Visualization with Seaborn | - | - | - | - |
PROJECT A Visual History of Nobel Prize Winners | - | - | - | - |
Introduction to Importing Data in Python | - | - | - | - |
Intermediate Importing Data in Python | - | - | - | - |
Importing & Cleaning Data with Python | - | - | - | - |
Cleaning Data in Python | - | - | - | - |
Working with Dates and Times in Python | - | - | - | - |
Writing Functions in Python | - | - | - | - |
Exploratory Data Analysis in Python | - | - | - | - |
Analyzing Police Activity with pandas | - | - | - | - |
Statistical Thinking in Python (Part 1) | - | - | - | - |
Statistical Thinking in Python (Part 2) | - | - | - | - |
PROJECT Dr. Semmelweis and the Discovery of Handwashing | - | - | - | - |
Supervised Learning with scikit-learn | - | - | - | - |
PROJECT Predicting Credit Card Approvals | - | - | - | - |
Unsupervised Learning in Python | - | - | - | - |
Machine Learning with Tree-Based Models in Python | - | - | - | - |
Case Study: School Budgeting with Machine Learning in Python | - | - | - | - |
Cluster Analysis in Python | - | - | - | - |
Course | Slides | Dataset | Notes | Solutions |
---|---|---|---|---|
Introduction to Data Science in Python | - | - | - | - |
Intermediate Python | - | - | - | - |
Data Manipulation with pandas | - | - | - | - |
Merging DataFrames with pandas | - | - | - | - |
Introduction to Data Visualization with Matplotlib | - | - | - | - |
Introduction to Data Visualization with Seaborn | - | - | - | - |
Introduction to Importing Data in Python | - | - | - | - |
Intermediate Importing Data in Python | - | - | - | - |
Cleaning Data in Python | - | - | - | - |
Exploratory Data Analysis in Python | - | - | - | - |
Analyzing Police Activity with pandas | - | - | - | - |
Introduction to SQL | - | - | - | - |
Streamlined Data Ingestion with pandas | - | - | - | - |
Introduction to Relational Databases in SQL | - | - | - | - |
Joining Data in SQL | - | - | - | - |
Introduction to Databases in Python | - | - | - | - |
Course | Solutions |
---|---|
Introduction to SQL Server | link |
Introduction to Relational Databases in SQL | link |
Intermediate SQL Server | link |
Time Series Analysis in SQL Server | - |
Functions for Manipulating Data in SQL Server | - |
Database Design | link |
Hierarchical and Recursive Queries in SQL Server | - |
Transactions and Error Handling in SQL Server | - |
Writing Functions and Stored Procedures in SQL Server | - |
Building and Optimizing Triggers in SQL Server | link |
Improving Query Performance in SQL Server | - |
Course | Slides | Dataset | Notes | Solutions |
---|---|---|---|---|
Introduction to Python | - | - | - | - |
Intermediate Python | - | - | - | - |
Python Data Science Toolbox (Part 1) | - | - | - | - |
Python Data Science Toolbox (Part 2) | - | - | - | - |
Introduction to Importing Data in Python | - | - | - | - |
Intermediate Importing Data in Python | - | - | - | - |
Cleaning Data in Python | - | - | - | - |
Data Manipulation with pandas | - | - | - | - |
Merging DataFrames with pandas | - | - | - | - |
Analyzing Police Activity with pandas | - | - | - | - |
Introduction to SQL | - | - | - | - |
Introduction to Relational Databases in SQL | - | - | - | - |
Introduction to Data Visualization with Matplotlib | - | - | - | - |
Introduction to Data Visualization with Seaborn | - | - | - | - |
Statistical Thinking in Python (Part 1) | - | - | - | - |
Statistical Thinking in Python (Part 2) | - | - | - | - |
Joining Data in SQL | - | - | - | - |
Introduction to Shell | - | - | - | - |
Conda Essentials | - | - | - | - |
Supervised Learning with scikit-learn | - | - | - | - |
Case Study: School Budgeting with Machine Learning in Python | - | - | - | - |
Unsupervised Learning in Python | - | - | - | - |
Machine Learning with Tree-Based Models in Python | - | - | - | - |
Introduction to Deep Learning in Python | - | - | - | - |
Introduction to Network Analysis in Python | - | - | - | - |
Course | Slides | Dataset | Notes | Solutions |
---|---|---|---|---|
Machine Learning for Everyone | - | - | - | - |
Introduction to Python | - | - | - | - |
Intermediate Python | - | - | - | - |
Python Data Science Toolbox (Part 1) | - | - | - | - |
Python Data Science Toolbox (Part 2) | - | - | - | - |
Statistical Thinking in Python (Part 1) | - | - | - | - |
Supervised Learning with scikit-learn | - | - | - | - |
Unsupervised Learning in Python | - | - | - | - |
Linear Classifiers in Python | - | - | - | - |
Machine Learning with Tree-Based Models in Python | - | - | - | - |
Extreme Gradient Boosting with XGBoost | - | - | - | - |
Cluster Analysis in Python | - | - | - | - |
Dimensionality Reduction in Python | - | - | - | - |
Preprocessing for Machine Learning in Python | - | - | - | - |
Machine Learning for Time Series Data in Python | - | - | - | - |
Feature Engineering for Machine Learning in Python | - | - | - | - |
Model Validation in Python | - | - | - | - |
Introduction to Natural Language Processing in Python | - | - | - | - |
Feature Engineering for NLP in Python | - | - | - | - |
Introduction to TensorFlow in Python | - | - | - | - |
Introduction to Deep Learning in Python | - | - | - | - |
Introduction to Deep Learning with Keras | - | - | - | - |
Advanced Deep Learning with Keras | - | - | - | - |
Image Processing in Python | - | - | - | - |
Image Processing with Keras in Python | - | - | - | - |
Hyperparameter Tuning in Python | - | - | - | - |
Introduction to PySpark | - | - | - | - |
Machine Learning with PySpark | - | - | - | - |
Winning a Kaggle Competition in Python | - | - | - | - |