This guide is meant to help web developers, software engineers and other IT professionals transition into the analytics / data science industry.
Last week, I gave a guest lecture at one of the well-known institutes in India. Rather (un)surprisingly, more than 60% of the students were experienced IT professionals. Most of them face a common problem: “I have been in IT / software / web development for more than a few years and want to up-skill myself in analytics. I have taken a few MOOCs and have tried using a few books / platforms. Still, I don’t get what I should do next.”
This scenario is not very different from several learnups / meetups we have conducted over the last year. To help these people as much as I can, I’ve created this comprehensive career guide to get you started quickly with data science. Once you finish reading this post, you will know the next steps in making the transition.
People working in the IT industry are generally comfortable with coding, working with databases and using frameworks. After spending a few years as a developer, you would know at least some languages and technologies such as Java, ASP.NET, JavaScript, C++, C, HTML, Python and PHP, and you would have worked with several database technologies including SQL, MongoDB and Oracle. With these skills, you are envied by people trying to transition from a non-programming background. Ask them how they feel about it!
Let me explain the set of skills (and eventual advantages) which I expect every good IT professional to have:
On the other hand, you can be fooled into believing that data science is about learning a few more tools. Just as knowing a few languages or frameworks does not make you a good software engineer, knowing a few tools does not make you a good analyst. What differentiates a good analyst from a bad one is problem-solving and structured thinking skills. Tools are just a way to implement this thinking. Hence, I recommend that people pick one tool depending on their convenience and then focus on getting hands-on experience.
Here are the areas you need to focus on going forward:
The problem in today’s world is the problem of plenty. I am sure you couldn’t agree more. Try searching the Internet for resources on R / Python / Data Science and you end up with a long list. Talk to a few people who have made the transition and they will add a few more resources which worked for them. If you are an avid reader, you can add a few books and blogs over and above this. Check out the platforms offering MOOCs and you will find a few good courses.
The sad part is that while you have access to plenty of resources today, you find it difficult to find your way through these resources. Hence, I have created this learning path:
The Hardware:
Data Science is computationally intense (this should not be news to you!). So, the first thing you should do is set up a machine which helps you in your learning. Ideally, I would say that any machine used for serious data science work should have at least 8 GB RAM and a processor equivalent to an Intel i5 / i7. Of course, the higher the capacity you buy, the better it is! If you have some more money to spend at this stage, you can even get an SSD upgrade.
The Operating System:
Ironically, there is no single OS which works perfectly for Data Science. You would likely need a mix of Unix and Windows machines. Unix is better at resource management and for doing the data science work itself. On the other hand, you would need Windows for PowerPoint and Excel, both of which are used very heavily in data science workflows.
Also, there will be a few visualization tools which work better in a Windows environment. Hence, I would recommend using Linux as the core OS with a virtual machine running Windows, or vice versa. If you are used to a Mac and can work comfortably with Excel on Mac, you might be good too.
The Software:
You would need to choose the language / tool of your preference here. If you have experience coding in object-oriented languages, I would recommend Python. It is easy to learn and has a vibrant community on the internet. If you aren’t used to object-oriented programming, you can give R a try as well. If you need more details before making a decision, read this comparison: SAS vs. R vs. Python
Here is the list of software I would recommend at a minimum:
If you have chosen Python as your preferred tool:
If you have chosen R as your preferred tool:
Your machine is ready to crunch some numbers now!
The art of structured thinking is a tacit requirement, but it is keenly sought in every data scientist. Below are some good resources to enhance this skill.
Assignment: Solve this case study on operational analytics: Call Center Optimization. It’s a beginner level assignment. Once you complete it successfully, you can move to medium and advanced level.
Think you are ready for the next level? Try out our practice problem on strategic thinking.
Mathematics plays an important role in defining data science. Thankfully, you don’t need to learn all of math, just a few topics would do. You can start from the basic topics (marked mandatory) and pick up the rest of the topics as you progress:
Assignment: Do a statistical analysis on the Big Mart Practice Problem. After you have finished this assignment, you can showcase it on your LinkedIn profile as project work.
After step 3, move on to programming. Coding in data science is concise by nature: the best practitioners avoid redundant lines of code and look for ways to make it run faster. Your prior knowledge of programming basics should give you a nice head start in solving practical problems using R or Python.
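As a small illustration of what concise code looks like in practice, the sketch below computes the average of the positive values in a list twice: once with an explicit loop, and once with a generator expression and `statistics.mean` from the Python standard library. The data and the function names (`mean_positive_verbose`, `mean_positive_concise`) are made up for this example.

```python
from statistics import mean

# Hypothetical sample data: daily sales figures, where negative values are refunds.
sales = [120, -15, 80, 0, 200, -5, 40]


def mean_positive_verbose(values):
    """Average of the positive values, written as an explicit loop."""
    total = 0
    count = 0
    for value in values:
        if value > 0:
            total += value
            count += 1
    return total / count


def mean_positive_concise(values):
    """Same calculation, expressed in a single line."""
    return mean(v for v in values if v > 0)


print(mean_positive_verbose(sales))
print(mean_positive_concise(sales))
```

Both functions give the same answer; the second simply states the intent (filter, then average) without the bookkeeping variables.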
Get used to the basics of R / Python from any of the following introductory courses:
Apart from these two tools, you can also use Julia, Go or Java to build predictive models. However, a possible drawback with other programming languages is the lack of community support. So far, Python and R have the best community support on the web, which will help you debug issues and learn faster.
Assignment: Work through the practice links given above.
Time to get your hands on your first project. Like programming, the best way to learn data science is to do data science. Hence, let us start by taking up a problem to work on. You can choose any of our Practice problems or any of the projects mentioned here to start with. Perform an exploratory analysis on the data to get you started.
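A first exploratory pass usually means answering simple questions: how many rows are there, which values are missing, and what do the numeric columns look like? The sketch below does this using only the Python standard library on a tiny inline CSV. The column names and values are invented for illustration and are not taken from any of the practice problems; on a real project you would typically load the file with a data analysis library instead.

```python
import csv
import io
from statistics import mean, median

# Hypothetical data standing in for a real dataset file.
RAW = """item_id,price,outlet_type
A1,249.5,Supermarket
A2,,Grocery
A3,141.0,Supermarket
A4,53.5,Grocery
A5,180.0,Supermarket
"""


def summarize(raw_csv, numeric_column, category_column):
    """Return row count, missing-value count, basic stats and category counts."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    # Keep only the rows where the numeric column is filled in.
    values = [float(r[numeric_column]) for r in rows if r[numeric_column] != ""]
    counts = {}
    for r in rows:
        counts[r[category_column]] = counts.get(r[category_column], 0) + 1
    return {
        "rows": len(rows),
        "missing": len(rows) - len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "category_counts": counts,
    }


summary = summarize(RAW, "price", "outlet_type")
for key, value in summary.items():
    print(f"{key}: {value}")
```

Even a summary this small raises the questions that drive the rest of an analysis, such as how to treat the missing price and whether the two outlet types behave differently.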
Here are a few good places to look at:
Assignment: Perform a similar analysis on the project of your choice.
Now that you have tasted blood, go for the kill! Check out our learning paths on R & Python – follow them step by step. Skip the steps you are comfortable with. Do as many exercises as you can!
Here are some additional machine learning resources you can look at. Remember that the best way to do data science is to learn data science thoroughly:
Assignment: Build a machine learning model on the Loan Prediction Problem using the following algorithms:
For Decision Tree and Random Forest, you can seek help from: Complete Guide on Tree Based Modeling.
Make sure you understand how each of these algorithms works. Just implementing them and obtaining predictions wouldn’t be a success for you. The real success lies in gaining knowledge of how they work!
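One way to check your understanding of an algorithm is to reproduce its core calculation by hand. As an example, a common criterion for choosing a split in a decision tree is Gini impurity. The sketch below computes the impurity of a set of labels and the weighted impurity of each candidate split on a single feature. The (income, loan_approved) pairs are invented for illustration and are not the Loan Prediction dataset, and real libraries add many refinements on top of this idea.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, threshold):
    """Weighted Gini impurity after splitting (income, label) rows at a threshold."""
    left = [label for income, label in rows if income <= threshold]
    right = [label for income, label in rows if income > threshold]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical (income, loan_approved) pairs, sorted by income.
applicants = [
    (20, "N"), (25, "N"), (30, "Y"), (40, "N"),
    (45, "Y"), (60, "Y"), (80, "Y"),
]

parent = gini([label for _, label in applicants])
# Try every income value except the largest as a threshold; keep the purest split.
best = min((split_impurity(applicants, t), t) for t, _ in applicants[:-1])

print(f"impurity before split: {parent:.3f}")
print(f"best threshold: {best[1]}, impurity after split: {best[0]:.3f}")
```

A full tree repeats this search over every feature and then recurses into each side of the chosen split. A random forest builds many such trees on resampled data and combines their predictions.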
Time to step onto the battleground. A benefit of being part of the analytics community is that you get access to so many thrilling ways of learning concepts. You no longer need to stick to traditional ways of learning.
Competitions: Several data science competitions are organized across the globe, where you can participate and win prizes too. After you’ve completed the above steps, you should participate in these competitions to assess your learning level.
Assignment: In 6 months to 1 year, try to rank in the top 100 in the competition rankings on both websites. This would give a massive boost to your profile.
We are prone to making mistakes (unknowingly) in pursuit of learning concepts quickly. Nothing to worry about; we are all susceptible to such things. But we need to be prudent enough to analyze our learning pace and proceed accordingly. Below is the list of mistakes (bad practices) you should avoid while completing this learning path:
Just for some motivation…
https://www.youtube.com/watch?v=7mXFkvu98vI
I hope this guide will help people from an IT / software development background to take up data science / machine learning as a career option. In summary, rely on your strengths, focus on developing structured thinking and problem solving, practice a lot and get your hands dirty on as many real-life problems as possible. If you get stuck in the process, leverage the communities and people in your network to help you out.
As usual, if you have any questions, or suggestions for anything I might have missed, feel free to reach out to us through the comments below.
Kunal, The effort that you are making to help the data ppl is very commendable.Please keep it up.
Thanks Vishwachandra!
Hi Kunal, Great stuff!! I am sure many aspiring data scientists will find this useful. Thank you for your help. Venkat
Glad that you liked this piece Venkat!
Thanks a lot, Kunal. This is very useful to me. I'm planning to switch my career to Analytics and I hope I will be successful soon by adopting the above mentioned steps.
I hope this helps tremendously in your journey. Let us know if you need any further help.
Awesome post, Kunal
Thanks Ashok
Good one for starters...
Hoping to help people at the start of this journey, as well as those who are already in the middle of it.
I am one of the IT professionals who has transitioned to analytic and big data area, the only point i would want to mention is, keep doing what is been mentioned by Kunal and over time you would get an opportunity. It took me nearly 1.5 year to get in analytics field.
Thanks Moeen for seconding the approach. Congratulations for making the transition!
Hi Kunal, This is really a great article that I have ever found. I wish every aspiring data scientist (even myself) to make use of it to achieve their goal. Many thanks to Analytics Vidhya in providing the base/foundation for this type of articles from this great authors. Regards, Srinivas G Rao
Glad that you found it useful Srinivas!
It is very nice Article, Based on the article, one can know what is the pre-requisites to learn data science
Thanks Rambabu
Excellent work Kunal. This is what i was looking for, since i am IT professional looking to change from IT to Data Science.
Ksmvsn, Let us know how your journey goes or in case you need any help. Regards, Kunal
Any Recommendation/Learning path for Statistics lecturers wanting to move into Analytics/Data Science I am a Stats lecturer with 1 year of experience in teaching to undergraduates
Ragini, It should be much easier for you to transition than for people from an IT background. I would suggest focusing on business understanding and domain knowledge as a first and possibly the most critical step. Regards, Kunal
Super
Hi Kunal, I am IT professional with 10 yrs of exp in testing n BA in finance. I am looking for a course which will help to switch a career as data scientiest. 1 great lakes pgpba 2 ms program by ibm n aegis 3 pgdda by IIT banglore n Upgrad Which one is good?
GN, A successful transition is much more dependent on your efforts than on the course you take. Also, I don't recommend a complete transition for people with a lot of experience, so choose your path wisely. Regards, Kunal
i required guidance on which institute course will be better practical real time oriented for me to start with in data analytics im non it background. plz suggest some good institute to join
Srikanth, Please post your query with details on the discussion portal (http://discuss.analyticsvidhya.com). Also provide details like your education background, experience till now and expectations from the course. Regards, Kunal
I am a Java Professional with 3 yrs experience.I have taken a break in my career due to child care.I want to make a career as data scientist. 1.Most of the Jobs posted in LinkedIn requires data scientist with 5+ yrs experience. What about the requirement for someone who is new to the field? 2.Does the short term certification courses provided by online training academies is on par with PGDM in Business Analytics provided by B-schools?Are these certifications industry recognized? 3..As data scientist is an umbrella term and there are many roles around it like data visualization expert,machine learning expert. Pls list down other roles as well. Thanks in advance for taking time to answer these questions.
i am new learner so do you have any basic tips for candidate like me? please post it.
please post the basic analytical tools for the fresh learner.
Very insightful. Thanks for sharing!
Great Stuff . Very much helpful to get into the course
Really nice points to start with transition to Data analytics. Sincerely thanks to you Kunal for providing really good guidance to the interested ones. It really helps. Kepe up the good work and let the community grow each day.
Hello; very interesting article. thank you for the advice, Can you give an estimation the time required to complete a Step before going to the next one ? I know that it depends on the persons skills. but let say for someone who have (My profile) - not much experience in programming (I used to program in C; Delphi in the university and I was one of the first in class and I learn C++ by my slef but I didn't work on any big project. - good skill in math and analytic thinking to resolve problem Thank you for answer in advance,
What an excellent guide, Kunal. Thank you for the contents and efforts.
hi, i am mechanical engineer working in oil and gas Engineering management profile. can i go for the business analytics?? As i want to switch from oil & gas industry
This is the best guideline for data scientist I ever found. Thanks Kunal and well done!
Thanks so much Kunal for such a beautiful explanation. This is the first article of my life which I read from start to end with interest.
Oops...missed to mention that I have bookmarked it and will start following it...
Hi Kunal, Excellent guide to switch over to data science. I am 15+ yrs experience in Oracle technologies mostly application programming and database design. Is the correct choice for me to move to data science or Big data development or not ? . Can you and all give your feedback and i am looking this transfer both for career growth , sustainability in IT industry from next 10 yrs and better finance.. Regards Robert