Do you need to quickly become a data scientist? Great guide for self starters #datascience #bigdata
Mo Data stashed this in Training in Anything Data
http://hail-data.quora.com/How-to-acquire-the-Essential-Skill-Set-the-Self-Starter-way
Great post on Quora
How to acquire the "Essential Skill Set"? The Self-Starter way.
The Essential Skill Set comprises the fundamental skills that form the foundation of every data scientist. Since my focus is on the self-learner's perspective, I list here open online resources that can be used to develop the 4 Essential Skills of a Data Scientist as a whole. The idea is to pick one or two resources (links) from each subgroup and work through them.
0. Basic Pre-requisites:
- Linear Algebra, Calculus: Mathispower4u-Calculus, Coursera-Linear Algebra, Linear Algebra & Probability for Machine Learning- U of Glasgow
- Statistics, Probability: Probability and Statistics for Programmers, Statistical Formulas For Programmers, Coursera- Data Analysis, Coursera- Statistics One
- Algorithms & Databases: Coursera-Analysis of Algorithms, Coursera- Introduction to Databases
- Programming: Google Developers R Programming Lectures, Scientific Python Lectures, How to Think Like a Computer Scientist
1. Acquire & Scrub Data:
- DFS & Databases: Hadoop Tutorial - Yahoo, BigDataUniversity: Big Data Course, Hortonworks Sandbox, Learning to Process Big Data with MapReduce and Hadoop - Hands-On Exercises
- Data Munging: Predictive Analytics: Data Preparation, Data Wrangling in Pandas, Data Wrangler, OpenRefine
2. Filter & Mine data:
- Data Analysis in R: Data science in R, Coursera-Computing for Data Analysis in R
- Data Analysis in Python (numpy, scipy, pandas, scikit): Getting Started With Python For Data Science, SciPy 2013- NumPy Tutorials, Statistical Data Analysis in Python, Pandas (1st Video Below), SciPy 2013- Introduction to SciKit Learn Tutorial I & II (2nd & 3rd Video Below)
- Exploratory Data Analysis: Exploratory Data Analysis in R, Exploratory Data Analysis in Python, UC Berkeley: Descriptive Statistics, Basic Unix Shell Commands for the Data Scientist
- Data Mining, Machine Learning: Data Mining Map, Coursera-Machine Learning, Stanford - Statistical Learning, MITx: The Analytics Edge, STATS 202 Data Mining & Analysis, Mining Massive Data Sets-Stanford, Learning From Data - CalTech, Coursera-Web Intelligence & Big Data
3. Represent & Refine Data: Tableau-Training & Tutorials, Data visualisation in R with ggplot2 and plyr, Predictive Analytics: Overview and Data visualization, Flowing Data-Tutorials, UC Berkeley-Data Visualization, D3.js Tutorial
4. Domain Knowledge: The Black-Box, as per your interest.
Combining all the above: Data Literacy Course -- IAP, UC Berkeley Introduction to Data Science, Coursera-Introduction to Data Science, Teach Data Science-Syracuse University, Coursera - Data Science Track, Udacity - Data Science Track
Apply the knowledge: Harvard Data Science Course Homework, Analyzing Big Data with Twitter, Analyzing Twitter Data with Apache Hadoop
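To get a feel for how steps 1 to 3 fit together before starting any of the courses, here is a minimal sketch in plain Python (standard library only). It is an illustration, not taken from the original post or from any of the listed resources: the toy data, the column names, and the function names `acquire_and_scrub`, `filter_and_mine`, and `represent` are all made up for this example. In practice you would likely use pandas, scikit-learn, or ggplot2 as covered in the links above.

```python
import csv
import io
import statistics

# Hypothetical toy data, invented for this sketch (not from any listed course).
RAW_CSV = """hours_studied,exam_score
1,52
2,55
,60
3,61
4,n/a
5,70
"""


def acquire_and_scrub(text):
    """Step 1: parse the CSV and drop rows with missing or non-numeric values."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        try:
            rows.append((float(record["hours_studied"]),
                         float(record["exam_score"])))
        except (TypeError, ValueError):
            continue  # skip rows that cannot be converted to numbers
    return rows


def filter_and_mine(rows):
    """Step 2: fit a least-squares line, score = slope * hours + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def represent(rows, slope, intercept):
    """Step 3: render a crude text bar chart with the fitted value per row."""
    lines = []
    for x, y in rows:
        bar = "#" * round(y / 5)
        fitted = slope * x + intercept
        lines.append(f"{x:>4.0f} h | {bar} {y:.0f} (fit {fitted:.1f})")
    return "\n".join(lines)


rows = acquire_and_scrub(RAW_CSV)
slope, intercept = filter_and_mine(rows)
print(represent(rows, slope, intercept))
```

Step 4 (domain knowledge) is what tells you whether a fitted line like this means anything for the problem at hand, which is why no code can stand in for it.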
Stashed in: Big Data!
11:42 AM Sep 21 2014