Posts

Showing posts from February, 2023

SQL data cleaning method for Google Data Analytics Case Study 1 - Bike share

Case study CSV cleaning and joining process

When I was working on my case study, I felt overwhelmed and had a difficult time cleaning the data. As I worked through it, I had no real idea how to combine the CSV files and clean them in a reasonable amount of time, so I did it the hard way: I opened each CSV file and cleaned the data individually in Excel. At the time, I could think of nothing else. It was time consuming and, in hindsight, wasted effort. Now that I am done with the project, I have found a much more time-efficient way to complete this part of it.

This part of the process can all be done in SQL. SQLite Browser is the tool to use, as it makes importing CSV files and creating a table from them easy. It uses standard SQL code, and I will show you the code that I used to quickly get rid of rows with empty cells and to pull a random sample based on your sample calculator output.

1. Install SQLite Browser ...
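As a rough illustration of the two SQL steps mentioned above (dropping rows with empty cells, then pulling a random sample), here is a minimal sketch that runs the statements through Python's built-in `sqlite3` module so it can be tried without any setup. The table name `trips`, its columns, the toy rows, and the sample size of 3 are all assumptions made up for the example; they are not taken from the post, and the post's own code may differ. The same `DELETE` and `SELECT` statements could be pasted into the Execute SQL tab of SQLite Browser against an imported table.

```python
import sqlite3

# In-memory database standing in for a table created from an imported CSV.
# Table and column names are hypothetical, chosen only for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips ("
    "ride_id TEXT, start_station_name TEXT, "
    "end_station_name TEXT, member_casual TEXT)"
)

# Toy rows: two of the six have an empty cell ('' or NULL).
rows = [
    ("A1", "Station 1", "Station 2", "member"),
    ("A2", "", "Station 3", "casual"),          # empty start station
    ("A3", "Station 4", "Station 5", "casual"),
    ("A4", "Station 6", None, "member"),        # missing end station
    ("A5", "Station 7", "Station 8", "member"),
    ("A6", "Station 9", "Station 1", "casual"),
]
conn.executemany("INSERT INTO trips VALUES (?, ?, ?, ?)", rows)

# Step 1: delete rows with empty cells. Blank cells from a CSV import
# can show up as empty strings rather than NULL, so check for both.
conn.execute(
    """
    DELETE FROM trips
    WHERE ride_id IS NULL OR TRIM(ride_id) = ''
       OR start_station_name IS NULL OR TRIM(start_station_name) = ''
       OR end_station_name IS NULL OR TRIM(end_station_name) = ''
       OR member_casual IS NULL OR TRIM(member_casual) = ''
    """
)
remaining = conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]

# Step 2: pull a random sample. The sample size would come from your
# sample size calculator; 3 is a placeholder.
sample_size = 3
sample = conn.execute(
    "SELECT * FROM trips ORDER BY RANDOM() LIMIT ?", (sample_size,)
).fetchall()

print(remaining, "rows left after cleaning")
for row in sample:
    print(row)
```

`ORDER BY RANDOM()` shuffles the whole table before `LIMIT` takes the first rows, which is simple and fine for a one-off sample but can be slow on very large tables.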

Google Data Analytics Case Study 1 - Bike share

Statement of the business task

The business task is to help Cyclistic identify key metrics that will allow them to grow their business. Cyclistic is a successful bike sharing company with a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. Cyclistic launched operations in 2016.

Marketing strategy goal

The goal is to convert casual riders into annual members by understanding how annual members and casual riders differ. The marketing analyst team wants to know why casual riders would buy a membership and how digital media could affect their marketing tactics. Insights that can drive business decisions include how members and casual riders use the bikes, when they use them, and how far they ride.

Preparing the data used

The data used was the monthly data from the year 2022. It was located on a local hard drive in a folder labeled data/zipfiles. It was decided that since th...