How do you get a job in data science if you currently aren’t a data scientist?
My path to DS
I asked this question while working as an analyst 8 years ago and wanting to break into data science. I wrote about it back then (lost the post), so I’m gonna revisit it now.
I was asked this question today: “how do I get started and get a job in data science if I don’t have any experience?” You may then wonder “I can’t get experience without a job…this is a chicken-and-egg problem.” False.
Simple: get experience. For free. In 10 weeks. Write a good resume. Network like crazy: connect with real people (and not just spam). Get rejected probably 20-30 times, learn from each rejection, keep learning and presto, you’ll finally have an offer. It’s honestly just a numbers game. Everyone “ahead” of you is an imposter just like you. Just figure out enough basics and you’ll be ready for your first job.
There are a million ways to do this, and a million blog posts like this one. But here’s my suggestion. Enjoy!
No Single Data Science Definition
Look at 10 job postings for data science and you’ll see 10 unique definitions for a data scientist. Roughly, I’d categorize them as such:
- Human decision support: someone who supports others in their decision making. This would be their boss, their team, etc. Decisions are: should we shut this line of business down? Are our customers churning and how do we prevent that?
- Machine decision support: someone who helps machines make automated decisions, like whether to approve an online credit card application, recommend a YouTube video, etc.
- Product developer: someone who uses data to build a web product or service. This is similar to machine decision support, but the role might be in a SaaS capacity.
Most of what follows will help with the first two. Today, AI products are all the rage, so I offer one week on deep learning to get you interested in the third. That being said, the principles in each of these sections are important in any field you pursue.
Mindset
While working through the following, write one blog post each week, for weeks 2-10, outlining what you learned.
- Where did you start? Why was it hard to learn? What did you want to know?
- What concepts were new to you?
- Do something with that knowledge. Search the internet to find data you can write about.
If you don’t get very far, just move on to the next topic. Breadth is better early on to find out what you enjoy. Don’t get caught up in the details. Skim more. Write more. Worry less. You got this.
In all things, use ChatGPT to tutor you. Ask it questions. Be relentless. Have it explain things to you like you’re five.
Balancing learning between real books and ChatGPT is the best combo. ChatGPT is a better teacher, and the books have real info in them you can more fully trust. Using both together will get you there faster.
If any of the books below don’t vibe for you, there are a million free books out there. Find a simpler one on the same topic (and post it in the comments so I can add it!)
Week 2: Storytelling
You can’t get credit for great data science work unless you can tell a story. Now that you have a blog, learn how to write about data.
- Telling Stories with Data
- Find three data-focused blogs that use charts and analysis. Economist, FiveThirtyEight, DataIsBeautiful, etc.
Week 3: Analytics and Data Intuition
To make predictions about the future, you first need to learn how to understand the past.
Analytics is the foundation of data science. It’s the process of manipulating data and reshaping it to see it in new ways.
- Python for Data Analysis, 3E - written by the creator of pandas
- R for Data Science (2e) - written by the creator of dplyr, the best data manipulation language ever.
Both these books are great. Read both. It’s good to learn early that there are multiple right ways of doing your analysis the wrong way 😉. Seeing how to think in two languages is actually easier than getting stuck on one.
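To make "reshaping data to see it in new ways" concrete, here is a minimal sketch in plain Python. The records and the `pivot` helper are made up for illustration; in practice you would likely do the same thing with pandas or dplyr, as the books above teach.

```python
from collections import defaultdict

# Made-up "long" records: one row per (region, month) observation.
rows = [
    {"region": "East", "month": "Jan", "sales": 100},
    {"region": "East", "month": "Feb", "sales": 120},
    {"region": "West", "month": "Jan", "sales": 80},
    {"region": "West", "month": "Feb", "sales": 60},
    {"region": "East", "month": "Jan", "sales": 50},
]

def pivot(records, index, column, value):
    """Sum `value` into a nested dict keyed by `index`, then by `column`."""
    table = defaultdict(lambda: defaultdict(int))
    for rec in records:
        table[rec[index]][rec[column]] += rec[value]
    # Convert the nested defaultdicts into plain dicts for easy printing.
    return {key: dict(cols) for key, cols in table.items()}

# Hypothetical helper call: regions become rows, months become columns.
wide = pivot(rows, index="region", column="month", value="sales")
for region, months in wide.items():
    print(region, months)
```

The same five rows now answer a different question (how does each region do month to month?) without any new data, which is most of what analytics is.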
Related: understand the bias in your data.
Week 4: Data Visualization
Learning how to make a plot is like learning how to write. Learn how colors work. When to use them. When to use a bar vs. a line chart. How to make ideas shine.
This comes after analytics because analytics (knowing what to look for and how to think about data) is more important than plotting it.
There are two types of plots: those for you to learn from and those for others to learn from. First, learn what’s in the data. Learn for yourself.
“Look at the plot. Look at the plot. No seriously, look at the plot.” Chris Peterson, Capital One
Second, plot for other people. After you’ve learned something by sifting out all the noise, learn how to communicate that externally.
Beautiful plotting is really hard and time consuming. Details can take hours. Focus on the basics at first. The simplest plot focuses on “what’s the one thing I want someone to take away from this?”
Week 5: Get Data You Don’t Yet Have
If you work at a company with more than 50 people, your company’s data is probably stored in a database. Learning how to get this data yourself, so you can do analytics and visualization, will help you be self-sufficient.
Analytics comes first because it is more important: someone else may be able to get you the data. And once you know analytics, you can use SQL to do analytics too.
Three primary ways to get data include the following:
- SQL (from a database, returns a spreadsheet like table)
- API (get data from a website, returns JSON data)
- Webscraping (getting it yourself, this is considered “unstructured” data)
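Here is a rough sketch of what each of the three looks like, using only Python's standard library. Everything in it is made up for illustration: the `customers` table, the JSON body, and the HTML page are stand-ins, and a real API call or scrape would fetch its text over the network first.

```python
import json
import sqlite3
from html.parser import HTMLParser

# 1) SQL: query an in-memory SQLite database; results come back as rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, churned INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", 0), ("Ben", 1), ("Cy", 1)],
)
churned_count = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE churned = 1"
).fetchone()[0]
conn.close()

# 2) API: a hard-coded stand-in for the JSON body a web API might return.
api_body = '{"videos": [{"title": "Intro", "views": 42}]}'
payload = json.loads(api_body)
first_title = payload["videos"][0]["title"]

# 3) Web scraping: pull the text of every <h2> out of raw HTML.
class HeadingParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        # Only keep text that appears inside an <h2> element.
        if self.in_h2:
            self.headings.append(data.strip())

page = "<html><body><h2>Week 5</h2><p>text</p><h2>Week 6</h2></body></html>"
parser = HeadingParser()
parser.feed(page)

print(churned_count, first_title, parser.headings)
```

Notice how much more code the scraping case needs to get to something usable. That is roughly what "unstructured" means in practice.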
Interactive tutorials here are the way to go. Just google around for some good ones.
Week 6: Intro to Machine Learning
This book nails the foundations of machine learning. Doesn’t get too mathy, but teaches you the principles.
- An Introduction to Statistical Learning
- For reference, a more mathy and dense book that supports the Intro book is The Elements of Statistical Learning
For content on how to get insights from machine learning models:
- Interpretable Machine Learning
Additional: understand the bias in your models
- Tobias Baer - Risk Management, Data Science, and Psychology
- 4. Managing Bias in Machine Learning - Machine Learning for High-Risk Applications
Week 7: Regression Analysis
Regression is key to getting insights out of data. It’s what sets you apart from a data analyst role.
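As a small taste of what "getting insights" means here, this is a minimal sketch of simple linear regression (ordinary least squares with one predictor) in plain Python. The ad-spend numbers and the `fit_line` helper are made up for illustration; real work would use a statistics library and check the model's assumptions.

```python
from statistics import mean

# Made-up data: ad spend (in $1,000s) and the sign-ups that followed.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
signups = [12.0, 19.0, 29.0, 37.0, 45.0]

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

slope, intercept = fit_line(ad_spend, signups)
print(f"Each extra $1,000 is associated with about {slope:.1f} more sign-ups")
```

The slope is the insight: it puts a number on how two things move together, which a chart alone can only hint at.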
Week 8: Choose Your Own Adventure
A/B Testing and KPI Optimization for Online Companies (applies to offline too)
All big internet companies use A/B testing, or split testing, to make decisions about how to improve their product. If you’re interested in software, or understanding how all modern websites and software are improved, you need to understand A/B testing.
If you’ve taken Stat 101, this is where the “t.test” gets used to literally make Big Tech billions of dollars. I’m not kidding. It’s simple, but it’s powerful.
- It’s All A/Bout Testing: The Netflix Experimentation Platform | by Netflix Technology Blog | Netflix TechBlog
- A seven-part series by Netflix: Netflix: A Culture of Learning. Martin Tingley with Wenjing Zheng… | by Netflix Technology Blog | Netflix TechBlog. I link the seventh post here because it links to parts 1-6 in its intro. Part 7 is probably the best place to start because it explains the context behind A/B testing.
- Experiment Guide – Accelerate innovation using trustworthy online controlled experiments This book tells you how Microsoft, Amazon, and Google use online experimentation to make billions of dollars. It’s written by the people who invented online experimentation and who are top data scientists at those companies.
- Sequential A/B Testing Keeps the World Streaming Netflix Part 1: Continuous Data | by Netflix Technology Blog | Feb, 2024 | Netflix TechBlog
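The “t.test” mentioned above is R's function for this. Here is a rough Python sketch of the same idea: a Welch two-sample t statistic comparing a control and a treatment group. The conversion data and the `welch_t` helper are made up for illustration, and the p-value uses a normal approximation to the t distribution, which I'm assuming is acceptable only because both groups are large.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Made-up outcomes: 1.0 = visitor converted, 0.0 = did not.
control = [1.0] * 100 + [0.0] * 900      # 10% conversion
treatment = [1.0] * 130 + [0.0] * 870    # 13% conversion

def welch_t(a, b):
    """Return (t statistic, approximate two-sided p-value) for mean(b) - mean(a).

    The p-value comes from a normal approximation, so treat it as a
    rough guide that is only reasonable for large samples.
    """
    # Standard error of the difference in means, without pooling variances.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(b) - mean(a)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

t_stat, p_value = welch_t(control, treatment)
print(f"t = {t_stat:.2f}, approximate p = {p_value:.3f}")
```

A small p-value suggests the difference between the two versions is unlikely to be noise alone. The posts above cover the many ways this simple picture gets complicated in production.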
Time Series Data
All modern data is time series data in some way (data collected over time). Even if you never end up forecasting, understanding the nature of this data will set you apart. Focus on principles rather than techniques, since you may never use any specific technique.
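Two of those principles can be sketched in a few lines of plain Python: smoothing out noise with a moving average, and checking whether today's value depends on yesterday's (autocorrelation). The series and both helper functions are made up for illustration.

```python
from statistics import mean

# Made-up daily sign-up counts with an upward drift.
series = [10, 12, 11, 13, 15, 14, 16, 18, 17, 19]

def moving_average(values, window):
    """Trailing mean over each full window of consecutive points."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]

def autocorrelation(values, lag):
    """Correlation between the series and itself shifted by `lag` steps."""
    m = mean(values)
    num = sum(
        (values[i] - m) * (values[i - lag] - m)
        for i in range(lag, len(values))
    )
    den = sum((v - m) ** 2 for v in values)
    return num / den

smooth = moving_average(series, window=3)
rho1 = autocorrelation(series, lag=1)
print(smooth)
print(f"lag-1 autocorrelation: {rho1:.2f}")
```

A lag-1 autocorrelation well above zero means neighboring points are not independent, which is exactly the assumption many Stat 101 methods quietly make.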
Week 9: Intro to Deep Learning
Deep learning is all the rage, and will change the future for everyone. It’s vital you know how these systems work because this is the future of society.
- fast.ai - fast.ai—Making neural nets uncool again
- Great courses that teach you how to build ChatGPT and image generators, and how to understand them.
This YouTube video and others by him are great:
Also this:
Week 10: Recommendation Systems
If you’re making automated decisions with data, you’re probably using a recommendation system at some level. Recommendation systems typically focus on “what video does Bryan want given all the data we have on him and people like him?”. But the principles also extend to: how do I recommend one course of action to my stakeholder?
I don’t have great resources here, but getting the concepts down will help you understand how virtually every social media app or streaming service works. (Netflix, Spotify, Instagram, etc. all use recommenders to figure out the content you’re interested in.)
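To give a feel for the "people like him" idea, here is a minimal sketch of one classic approach: user-based collaborative filtering with cosine similarity. The users, ratings, and helper functions are made up for illustration, and I'm not claiming any of the services above work exactly this way.

```python
from math import sqrt

# Made-up ratings (1-5). A missing key means "not watched yet".
ratings = {
    "bryan": {"cooking": 5, "nba": 4},
    "dana": {"cooking": 5, "nba": 5, "woodworking": 4},
    "eli": {"cooking": 1, "chess": 5},
}

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if dot else 0.0

def recommend(user, all_ratings):
    """Rank the user's unseen items by similarity-weighted ratings."""
    mine = all_ratings[user]
    scores = {}
    for other, theirs in all_ratings.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        for item, rating in theirs.items():
            if item not in mine:
                # People more similar to the user get a bigger say.
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("bryan", ratings))
```

Dana's tastes overlap heavily with Bryan's, so her unseen pick outranks Eli's. Swap "video" for "course of action" and the same scoring idea applies to recommending options to a stakeholder.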
Week 11+: Do Projects
Now that you’ve studied analytics, data viz, regression, machine learning, and deep learning, create two blog posts or personal projects.
- Build a dashboard using a Jupyter notebook that updates daily. You can host it on GitHub, using Quarto to render the notebook and GitHub Actions to update it every day.
- Create a model to predict who will win the next NBA game
- Analyze any data from data.gov
- Kaggle competitions!
Prepare your Resume
Now you’ve got 10 blog posts, which is 10 more bullet points on your resume than you had before you started. You’ve learned a ton even if you only do 10% of all that’s above.
I’ve loved the advice I found written up here: Writing a Tech Resume.
Once you have a good resume, find real people to give it to. Don’t apply blindly online. Talk to everyone you know. Find people on LinkedIn and message them directly with a personal message. Use one of your blog posts in your message to tell them why you’re a good fit for their role. Perhaps even write a blog post specific to data from their company!
Good luck!
_________________________
Bryan lives somewhere at the intersection of faith, fatherhood, and futurism and writes about tech, books, Christianity, gratitude, and whatever’s on his mind.