I understand data science is not all about programming but, as far as I know, Python comes into play to some extent on this matter. How much should I know about programming to do data science?
This question was asked earlier today by one of our community members in our main channel, @bigdataspecialist.
I decided to share my answer here since it might be interesting to some of you, and I am pretty sure the great majority of you didn't even notice his question or my answer.
TL;DR
Programming is important, but you don't have to be an expert. You just need some basic to intermediate skills to prepare your data (which you are going to use in data science tasks), possibly make some data visualizations to gain insights, and in the end create your machine learning models. These basic skills could probably be gained in a month, especially if you are not a complete newbie who has never heard of programming 😅
You can get these skills from this course:
https://www.coursera.org/learn/python-data-analysis
The teacher is Christopher Brooks, and the course was created by the University of Michigan.
Note: I know it says it's a paid course, but you can apply for financial aid and get the course for free. That's how I got this course when I had just started learning data science.
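To give you a feeling for what "basic to intermediate" means in practice, here is a minimal sketch of the three steps from the TL;DR: prepare the data, look at it, fit a simple model. The data is made up, and the function names (load_rows, fit_line, text_chart) are just my illustration, not something from the course. In real work you would probably reach for libraries like pandas, matplotlib and scikit-learn; I stick to the Python standard library here so it runs anywhere.

```python
import csv
import io
from statistics import mean

# Made-up sample data (hypothetical): hours studied vs. exam score.
# One row has a missing value on purpose.
RAW_CSV = """hours,score
1,52
2,58
,61
4,71
5,75
"""


def load_rows(text):
    """Data preparation: parse the CSV and drop rows with missing values."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        if record["hours"] and record["score"]:
            rows.append((float(record["hours"]), float(record["score"])))
    return rows


def text_chart(rows):
    """Visualization: a crude text bar chart, one '#' per 5 points of score."""
    return [f"{x:>4.0f}h | " + "#" * int(y // 5) for x, y in rows]


def fit_line(rows):
    """Modeling: least-squares fit of score = slope * hours + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in rows)
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


rows = load_rows(RAW_CSV)
for line in text_chart(rows):
    print(line)

slope, intercept = fit_line(rows)
print(f"score ~ {slope:.2f} * hours + {intercept:.2f}")
```

If you can read and write something like this comfortably, you are roughly at the level I mean.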
Keep in mind that Python is not the only programming language that comes to mind when you think about doing data science.
For example, I use Java for almost all data science and machine learning tasks. That's quite unusual and data scientists usually don't do that, but the platform developed by my company receives 4k requests per second, so we need something blazing fast, and Python is pretty slow. That's why we use Java.
But if I am going to test something locally, or I need some easy data preparation or data visualizations, I use Python. Creating charts to gain some insights would be a real nightmare with Java. But for you as a beginner, Python is probably the best choice.
Long story short:
If you want it fast and easy, Python is the way to go.
If you want it very fast (but probably pretty hard to make it work), Java.
If you want to perform advanced calculations and visualizations, R.
If you want to show your visualizations dynamically on a web page, then certain JavaScript libraries like D3.js or Chart.js.
Hope this helps.
*This channel belongs to @bigdataspecialist group