If you’re getting ready to launch a new career as a data analyst, chances are you’ve encountered an age-old dilemma. Job listings ask for experience, but how do you get experience if you’re looking for your first data analyst job?
This is where your portfolio comes in. The projects you include in your portfolio demonstrate your skills and experience to hiring managers and interviewers, even if they don’t come from a previous data analytics job. Populating your portfolio with the right projects can go a long way toward building confidence that you’re the right person for the job, even without previous work experience.
In this article, we’ll discuss five types of projects you should include in your data analytics portfolio, especially if you’re just starting out. You’ll see some examples of how these projects are presented in real portfolios, and find a list of public data sets you can use to start completing projects.
Tip: When you’re just starting out, think in terms of “mini projects.” A portfolio project doesn’t need to feature a complete analysis end-to-end. Instead, complete smaller projects based on individual data analytics skills or steps in the data analysis process.
Data analysis project ideas
As an aspiring data analyst, you’ll want to demonstrate a few key skills in your portfolio. These data analytics project ideas reflect tasks that are fundamental to many data analyst roles.
1. Web scraping
While you’ll find no shortage of excellent (and free) public data sets on the internet, you might want to show prospective employers that you’re able to find and scrape your own data as well. Plus, knowing how to scrape web data means you can find and use data sets that match your interests, regardless of whether or not they’ve already been compiled.
If you know some Python, you can use tools like Beautiful Soup or Scrapy to crawl the web for interesting data. If you don’t know how to code, don’t worry. You’ll also find several tools that automate the process (many offer a free trial), like Octoparse or ParseHub.
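To give a sense of what the parsing step of a scraping project can look like, here is a minimal sketch. It deliberately uses Python’s standard-library html.parser module rather than Beautiful Soup or Scrapy, so it runs without installing anything; those libraries offer a more convenient interface for the same idea. The sample HTML, the "title" class name, and the TitleExtractor and extract_titles names are all made up for illustration, and a real project would download the page first instead of using a hard-coded string.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this HTML.
SAMPLE_HTML = """
<html><body>
  <h2 class="title">First post</h2>
  <p>Some text</p>
  <h2 class="title">Second post</h2>
  <h2 class="ad">Sponsored</h2>
</body></html>
"""


class TitleExtractor(HTMLParser):
    """Collects the text inside every <h2 class="title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "h2" and ("class", "title") in attrs:
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        # Only keep text that appears inside a matching heading.
        if self._in_title:
            self.titles[-1] += data


def extract_titles(html):
    """Return the stripped text of each matching heading, in page order."""
    parser = TitleExtractor()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


titles = extract_titles(SAMPLE_HTML)
print(titles)
```

The heading with the "ad" class is skipped, which mirrors a common scraping task: keeping only the elements that carry the data you care about.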
If you’re unsure where to start, here are some websites with interesting data options to inspire your project:
Reddit
Wikipedia
Job portals
Tip: Anytime you’re scraping data from the internet, remember to respect and abide by each website’s terms of service. Limit your scraping activities so as not to overwhelm a company’s servers, and always cite your sources when you present your data findings in your portfolio.
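One way to put part of this tip into practice is to check a site’s robots.txt file and pause between requests. The sketch below assumes a made-up robots.txt and a hypothetical scraper name, and it makes no network requests. Keep in mind that robots.txt is not the same thing as a website’s terms of service, so you still need to read those separately.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would download this
# from the site's /robots.txt before requesting any other page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "portfolio-project-bot"  # hypothetical name for your scraper


def build_parser(robots_text):
    """Parse robots.txt rules from a string."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent=USER_AGENT):
    """Keep only the URLs that the rules allow this user agent to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def delay_seconds(parser, user_agent=USER_AGENT, default=1.0):
    """Use the site's requested crawl delay, or a default if none is given."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/articles/1",
    "https://example.com/private/drafts",
]
to_fetch = allowed_urls(parser, candidates)
pause = delay_seconds(parser)

# In a real scraper, you would call time.sleep(pause) between requests.
print(to_fetch, pause)
```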
Example web scraping project: Todd W. Schneider of Wedding Crunchers scraped some 60,000 New York Times wedding announcements published from 1981 to 2016 to measure the frequency of specific phrases.
2. Data cleaning
A significant part of your role as a data analyst is cleaning data to make it ready to analyze. Data cleaning (also called data scrubbing) is the process of removing incorrect and duplicate data, managing any holes in the data, and making sure the formatting of data is consistent.
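The steps named above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a complete cleaning workflow: the rows, the column names, and the rule that a negative amount counts as incorrect are all invented for the example, and in practice many analysts would reach for a library such as pandas instead.

```python
# Hypothetical raw rows, as they might look after loading a messy CSV file.
RAW_ROWS = [
    {"order_id": "a1", "country": " us ", "amount": "1,200"},
    {"order_id": "a1", "country": "US", "amount": "1200"},  # duplicate once formatted
    {"order_id": "b2", "country": "gb", "amount": ""},      # hole in the data
    {"order_id": "c3", "country": "CA", "amount": "-50"},   # incorrect value
]


def clean_rows(rows):
    """Standardize formatting, mark missing values, and drop bad or duplicate rows."""
    cleaned = []
    seen = set()
    for row in rows:
        # Consistent formatting: trim whitespace and uppercase country codes.
        country = row["country"].strip().upper()

        # Holes in the data: an empty amount becomes None instead of a guess.
        amount_text = row["amount"].replace(",", "").strip()
        amount = int(amount_text) if amount_text else None

        # Incorrect data: this example assumes a negative amount is an error.
        if amount is not None and amount < 0:
            continue

        # Duplicates: keep only the first row for each (order_id, country) pair.
        key = (row["order_id"], country)
        if key in seen:
            continue
        seen.add(key)

        cleaned.append({"order_id": row["order_id"], "country": country, "amount": amount})
    return cleaned


clean = clean_rows(RAW_ROWS)
for row in clean:
    print(row)
```

Here the missing amount is kept as None rather than filled in. Whether to drop, flag, or estimate missing values depends on the analysis, and explaining that choice is a good detail to include in a portfolio write-up.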
When you choose a data set to practice cleaning, look for one that includes multiple files gathered from multiple sources without much curation. Some sites where you can find “dirty” data sets to work with include:
CDC Wonder
Data.gov
World Bank
Data.world
/r/datasets
Example data cleaning project: This Medium article outlines how data analyst Raahim Khan cleaned a set of daily-updated statistics on trending YouTube videos.