# Monthly Archives: May 2013

## Free Data Science Education

The field of data science is heating up fast. The educational resources listed below will help you get up to speed and join the data revolution.

Data science (and the driving force behind it, machine learning) is the process of deriving added value from data assets. Commerce and research are being transformed by data-driven discovery and prediction. The skills required for data analytics at massive scale span a variety of disciplines and are not easy to obtain through conventional curricula. They include machine learning algorithms (e.g., neural networks and clustering), parallel algorithms, basic statistical modeling (logistic regression and linear/non-linear regression), and proficiency with a complex ecosystem of tools and platforms.

**Meetup Groups**

A good place to start is with meetup groups. Two of my favorite data science groups deal with the primary ingredients of data science work: machine learning and R, the programming environment of choice for building algorithms. The LA area R user group is excellent; try to find one near you. The LA Machine Learning group also holds regular meetings that are extremely useful.

**Open Courseware**

The Massive Open Online Course (MOOC) movement is very active in the data science space and constitutes a superb educational resource. These free courses (some offer certifications) provide an excellent path toward the background required to become a data scientist. I’ve put together a Radical Data Science “pseudo degree program” for you to follow.

Lower-Division Courses

Data Science 101 – Statistics One

Data Science 102 – Computing for Data Analysis

Data Science 103 – Data Analysis

Data Science 104 – Introduction to Data Science

Upper-Division Courses

Data Science 201 – Machine Learning I

Data Science 202 – Machine Learning II

Data Science 203 – Neural Networks for Machine Learning

Graduate Courses

Data Science 301 – Learning from Data (Caltech course CS 156)

Data Science 302 – Machine Learning III (MIT course 6.867)

**Free Data Science Books**

To go along with the coursework, there are also a number of excellent free books available:

Mining of Massive Datasets

Bayesian Reasoning and Machine Learning (pdf)

Information Theory, Inference, and Learning Algorithms

Gaussian Processes for Machine Learning (pdf)

The Elements of Statistical Learning

Introduction to Machine Learning (pdf)

Think Bayes (pdf)

As interest in data science continues to grow and the shortage of talent becomes apparent, now is an excellent time to retool yourself and climb aboard the data science gravy train. If you know of any other good educational resources for data science and machine learning, please leave a comment for all of us.