What is Data Science?
Data science, at its most basic level, means using data to obtain insights and information that provide some level of value. The field is evolving fast and covers a wide range of possibilities, so limiting it to that basic definition sells it short. A fuller definition is that data science combines skills such as programming, data visualization, command line tools, databases, statistics, machine learning, and more in order to analyze vast amounts of data and extract insights, information, and value from it.
If you are interested in more specifics about what data science is, click here…
The very first thing you should learn is some basic Python programming. To get started, learn the syntax, variables and data types, lists and for loops, conditional statements, dictionaries and frequency tables, functions, and object-oriented Python.
For Python programming this is the only resource you will ever need…
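As a small illustration of how several of these basics fit together, here is a minimal sketch that builds a frequency table with a dictionary, a for loop, a conditional statement, and a function. The function name `frequency_table` and the sample list are made up for this example.

```python
def frequency_table(items):
    """Count how often each value appears in a list."""
    counts = {}
    for item in items:
        if item in counts:
            counts[item] += 1   # seen before: add one to its count
        else:
            counts[item] = 1    # first time we see this value
    return counts


# Hypothetical sample data: tools mentioned in a few job ads
tools = ["python", "sql", "python", "r", "python", "sql"]
table = frequency_table(tools)
print(table)  # {'python': 3, 'sql': 2, 'r': 1}
```

Once this feels comfortable, the standard library's `collections.Counter` does the same job in one line, but writing it by hand first is good practice.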
Data Analysis and Visualization:-
Next comes data analysis and visualization. Start by learning pandas and NumPy for cleaning and exploring your data. Then learn Matplotlib for exploratory data visualization and storytelling with your data.
Command Line Tools:-
Next you will want to learn how to navigate the file system, create and delete directories, edit and manage files and their permissions, work with programs from the command line, and create virtual environments. You’ll also want to learn about Git and GitHub for version control.
I find the best way to get into the command line is to use it on a day-to-day basis.
You’ll want to learn SQL for querying data, as well as PostgreSQL for more advanced database management. You should also know how to work with APIs and web scraping so you can create your own datasets. It is also worth learning Spark and MapReduce.
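To get a feel for what querying data with SQL looks like, here is a minimal sketch. It uses SQLite rather than PostgreSQL, only because the `sqlite3` module ships with Python and needs no server. The `sales` table and its rows are invented for the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Hypothetical rows, purely for illustration
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [
        ("north", 120.0),
        ("south", 80.0),
        ("north", 60.0),
        ("south", 40.0),
        ("east", 200.0),
    ],
)

# Total sales per region, largest first
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
conn.close()

for region, total in rows:
    print(region, total)
```

The `SELECT ... GROUP BY ... ORDER BY` pattern carries over to PostgreSQL largely unchanged, so time spent practicing here is not wasted.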
Next you’ll want to learn statistics fundamentals, which include sampling, frequency distributions, the mean, the weighted mean, the median, the mode, measures of variability, z-scores, probability, probability distributions, significance testing, and chi-squared tests.
An Introduction to Statistical Learning and The Elements of Statistical Learning will give you a statistics foundation that will make you the go-to person for all things statistics…
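Several of these fundamentals can be tried out with nothing more than Python's built-in `statistics` module. The sketch below is only an illustration: the `scores`, `grades`, and `weights` values are made up, and the z-scores use the population standard deviation.

```python
import statistics

# Hypothetical sample of eight scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)      # arithmetic mean
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value
std = statistics.pstdev(scores)     # population standard deviation

# z-score: how many standard deviations each value is from the mean
z_scores = [(x - mean) / std for x in scores]

# Weighted mean: each value counts in proportion to its weight
grades = [80, 90]
weights = [1, 3]
weighted_mean = sum(g * w for g, w in zip(grades, weights)) / sum(weights)

print(mean, median, mode, std)
print(z_scores)
print(weighted_mean)
```

Working a few of these out by hand and then checking them against the code is a good way to make the definitions stick.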
You will want to learn at least 10 basic machine learning algorithms: linear regression, logistic regression, support vector machines (SVM), random forests, gradient boosting, PCA, k-means, collaborative filtering, k-NN, and ARIMA.
You will also need to understand model evaluation, hyperparameter optimization, cross-validation, linear and nonlinear functions, basic calculus and linear algebra, feature selection and preparation, gradient descent, binary classifiers, overfitting and underfitting, decision trees, and neural networks. Then you should build something with those skills and even try some Kaggle competitions. You can also move on to more advanced topics like NLP and AI if those interest you.
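Gradient descent is one of the topics in that list that is easier to grasp once you have seen it run. Here is a minimal sketch in plain Python that fits a straight line by repeatedly stepping against the gradient of the mean squared error. The function name `fit_line`, the learning rate, the step count, and the toy data are all assumptions chosen for this example, not a recommended setup.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every point with the current w and b
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step downhill
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so we know what to expect
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
w, b = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

On this toy data the result should land very close to w = 2 and b = 1. Try a much larger learning rate to watch it diverge, which is a useful lesson in why that setting matters.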
Once you’ve gotten the basic skills down, I recommend getting really good at one thing, such as deep learning, AI, statistics, NLP, or something else. It lets you be the go-to person for a specific skill, and it looks really good in a job interview if that’s what you are aiming for.
You should really build some projects as you go. I recommend starting to build things once you’ve learned basic Python and data visualization tools. Learning by doing is one of the best ways to truly learn the skills you need in data science, and it also proves to others that you actually can build something with data.
Once you have a few projects under your belt and feel confident coding, the next step is to apply for jobs. But before you do that you need a resume. Not just any resume: a good data science resume.
You are entering a new field, so you will be dealing with a different type of recruiter or boss. A well-constructed resume will help you get that initial interview.
This is the part that everybody dreads, but it is also the part everyone is working towards: getting a job! Demand in this sector is high, so you should not be short of openings to apply for. However, preparing for your interviews and smashing them is the hard part.
There is a lot of content to remember, and it can be difficult to recall when you’re asked on the spot under pressure. However, there are data science prep courses, lists of typical interview questions, and other material that can help you during this stage.
Starting a career in DATA SCIENCE:
You will want to build two advanced projects that you can put on a resume or in a portfolio:
The first should show that you can do an end-to-end data science project.
Then the second one should be a project that showcases your specialized skill. Make sure your projects are presentable, well-documented, and easy to understand, and put them on GitHub.
Create a great resume that stands out and communicates the right information, tailored to the specific job you are applying for.
Create a solid LinkedIn profile so recruiters can find you. You can also use LinkedIn to apply for jobs.
I hope this data science study roadmap has inspired you either to make a career change or to finally take the leap and start learning data science. The majority of these resources are known for their great content, are bestsellers, or have proven to help people on their data science journey.