Below is the dashboard I created for Lab 10:
See story below for Labs 7-9. See on Tableau Public here:
When the pandemic started, everyone had to make lifestyle adjustments. I wanted to create a picture of what my health and fitness looked like (and continues to look like) during the pandemic.
I take health and fitness very seriously and already acknowledge that I have been less active recently due to the pandemic. After all, I no longer have a daily commute where I am power-walking to and from the train stations or at the office consistently choosing the stairs over the elevator. Conversely, not having a commute has allowed me to be more consistent in my at-home weight lifting workouts.
I hope that my project will illuminate my current habits and motivate me to be more proactive in my fitness journey. How has my day-to-day activity changed in comparison to pre-pandemic levels? How has the transition from the gym to my home “gym” affected my lifting stats and exercises? What’s changed in my workouts?
The primary audience for this project is myself. Although I am sure the data will be most interesting and meaningful to me, Professor McSweeney opened my mind to the idea that many others could benefit from seeing this personal fitness project. Throughout the past year and a half, we have all taken part in a collective experience of unexpected change, increased fear, and disruption of daily routines. If you are interested in your fitness and health and needed to make adjustments as I did, this project may appeal to you. I hope you can see similar experiences that can help you on your own fitness journey.
I have split the data into two analysis groupings.
First, I want to look at general activity. I have already pulled my Google Fit data, which contains my daily steps, distance walked, and average speed over the past 6 years. Six years is beyond the scope of this project, but I will use this data set to show differences in activity between the pre-pandemic period and the pandemic.
For the second grouping, I will put together data sets from the two places where I tracked my weight lifting progress: my notebook (pre-pandemic) and the whiteboard in my apartment (during the pandemic). I have photos of both sources and split them into two viewable albums, one pre-pandemic and one during the pandemic (unfortunately, the Commons will not let me embed these albums, so I embedded them in a post elsewhere). The two photos below show examples of how I kept track of my weight lifting. From these images, I coded the information into an Excel file to create visualizations in Tableau. Because coding the data is quite time-consuming, I decided to focus on three exercises.
IMAGE 1 – Pre-Pandemic Tracking: I used a small notebook to track all of my exercises when I went to the gym throughout 2019 and the beginning of 2020. I used colors to correspond to dates and tracked the weight, the number of sets performed, and the number of repetitions within each set.
See this visualization on Tableau Public.
From the line charts above, we can observe the following:
Since I had historical data dating back to February of 2015, I wanted to quickly see the monthly averages and check whether the decrease in activity above could be explained by seasonal shifts from month to month. I thought that perhaps weather could also be a contributing factor to the differences in daily step averages. The chart below shows the average daily steps for each month over the 5-year span.
See this visualization on Tableau Public.
There do appear to be natural fluctuations in my steps, likely due to changes in weather. Even with this in mind, the numbers during the pandemic are lower than in the chart above.
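As a rough illustration of the monthly averaging described above, here is a minimal Python sketch. It is not how the Tableau chart was built; the function name and the sample step counts are made up for the example, and it assumes the step data is available as (date, steps) pairs.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def average_daily_steps_by_month(daily_steps):
    """Average daily step counts per calendar month, pooled across all years."""
    by_month = defaultdict(list)
    for day, steps in daily_steps:
        by_month[day.month].append(steps)
    # Keys are month numbers (1 = January ... 12 = December).
    return {month: mean(values) for month, values in sorted(by_month.items())}


# Hypothetical sample values, not taken from my Google Fit export.
sample_steps = [
    (date(2019, 1, 5), 9000),
    (date(2019, 1, 6), 11000),
    (date(2020, 1, 5), 4000),
    (date(2019, 7, 4), 12000),
    (date(2020, 7, 4), 6000),
]

monthly_averages = average_daily_steps_by_month(sample_steps)
for month, avg in monthly_averages.items():
    print(month, avg)
```

Pooling every January together, every February together, and so on is what makes seasonal patterns visible separately from the year-to-year change.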
In compiling the weightlifting data into the Excel Spreadsheet, I flagged a few things that viewers should take note of due to the imperfect method of tracking:
With these limitations and specifics noted, we can move on to the important observations to be made from the data. See below for visualizations for each of the analyses. See here for the data set.
See this visualization on Tableau Public.
The scatterplot above charts the average weight lifted vs the average repetitions for each set and uses color to differentiate which are pre-pandemic exercises and which are during the pandemic. We can observe:
I wonder, did I get stronger during the pandemic than I was previously? The following visualizations may be able to shed some light on that. One way to measure strength could be to sum the total weight lifted for each exercise. Below are the numbers for total weight lifted (that is, the weight multiplied by the number of times it was lifted).
See this visualization on Tableau Public.
If strength is measured by how much I lift per exercise, you can certainly see that during the pandemic my totals were much higher in all three of the exercises I chose to spotlight. That said, I know from above that I performed more reps with weight that was slightly lighter, so I am not sure this gives a full picture of my weight lifting ability. Below, I break each exercise down further with the Average Weight that was lifted in each exercise and the Average Amount of Reps.
See this visualization on Tableau Public.
It appears that, with the exception of Shoulder Press, I was lifting much heavier weight before the pandemic. During the pandemic, the number of reps I performed in each set was much higher for all three exercises. None of my pre-pandemic rep averages is over 13, while it appears most sets during the pandemic exceeded that number. If strength is measured by how heavy one can lift, one could say that my pre-pandemic numbers are more successful.
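To make the three measures used above concrete (total weight lifted, average weight, and average reps), here is a small Python sketch. The function name, the record layout of (exercise, weight, reps), and all of the numbers are assumptions for illustration only; they are not my actual logged sets.

```python
from collections import defaultdict


def summarize_sets(sets):
    """Summarize logged sets per exercise.

    Each entry in `sets` is (exercise, weight, reps).
    Total weight is the sum of weight * reps over all sets.
    """
    grouped = defaultdict(list)
    for exercise, weight, reps in sets:
        grouped[exercise].append((weight, reps))

    summary = {}
    for exercise, entries in grouped.items():
        summary[exercise] = {
            "total_weight": sum(w * r for w, r in entries),
            "avg_weight": sum(w for w, _ in entries) / len(entries),
            "avg_reps": sum(r for _, r in entries) / len(entries),
        }
    return summary


# Hypothetical sets: heavier and fewer reps before, lighter and more reps during.
pre_pandemic = [("Shoulder Press", 40, 10), ("Shoulder Press", 40, 8)]
during_pandemic = [("Shoulder Press", 35, 15), ("Shoulder Press", 35, 15)]

pre_summary = summarize_sets(pre_pandemic)
during_summary = summarize_sets(during_pandemic)
print(pre_summary["Shoulder Press"])
print(during_summary["Shoulder Press"])
```

With these made-up numbers the total is higher during the pandemic even though the average weight is lower, which is the same pattern the charts show: total weight rewards extra reps, so it should be read alongside the averages.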
Within Excel, I used VLOOKUP to fill in step data for all of the dates on which I had a logged exercise. In the dashboard below, I have dual-axis charts for the pre-pandemic period and for the pandemic.
See this visualization on Tableau Public.
Here, we can observe that before the pandemic, steps were much higher than during the pandemic. Since it is a dual axis, the numbers are not on the same scale, but you can still observe that on almost all days during the pandemic the steps fall underneath the total lifted weight amounts, with the opposite taking place before the pandemic. This is consistent with the findings above.
It is clear from my analysis that my workout habits changed when the pandemic struck. Generally, I did not do as much walking, which is reflected in my step counts. When it came to weight lifting, I definitely made progress in the total amount of weight I was lifting during the pandemic, likely due to the number of reps I was performing with the lighter weights. Before the pandemic, my general activity was much higher, but the pandemic allowed me to focus more on weight lifting at home. Shoulder Press is where I made the most important progress, as my rep counts went up at around the same weight.
This project has illuminated to me that there are a lot of different ways to stay active. Though my routine changed during the pandemic, I was still able to stay active and healthy. The lesson here is that as long as you stay active and stay aware of what you are doing with your body, you can maintain your health! Looking forward, I think I will make it my goal to balance my weightlifting, so that I can increase the amount of weight I lift but also maintain the higher number of reps that I complete.
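For readers who do not use Excel, the lookup step described above (matching each workout date to that day's step count) can be sketched in Python with a dictionary. The function name, field names, and sample values below are hypothetical, and this is only an approximation of what the VLOOKUP did in my spreadsheet.

```python
from datetime import date


def attach_steps(workouts, steps_by_date):
    """Return a copy of each workout with that day's step count added.

    Dates with no step record get None, similar to a failed lookup.
    """
    return [
        {**workout, "steps": steps_by_date.get(workout["date"])}
        for workout in workouts
    ]


# Hypothetical values for illustration only.
steps_by_date = {
    date(2019, 11, 4): 10500,
    date(2020, 11, 2): 3200,
}
workouts = [
    {"date": date(2019, 11, 4), "total_weight": 720},
    {"date": date(2020, 11, 2), "total_weight": 1050},
    {"date": date(2020, 11, 9), "total_weight": 980},  # no step record
]

workouts_with_steps = attach_steps(workouts, steps_by_date)
for row in workouts_with_steps:
    print(row["date"], row["total_weight"], row["steps"])
```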
Line Graphs: I wanted to keep the design very simple for both of these graphs. I focused on the tooltip to ensure that it was easy to view and had the relevant information.
Scatter Plot: I needed to use color to differentiate between pre-pandemic and during pandemic workouts as I did not include the dates or chronological order of the workouts.
Bar Graphs: I thought it would be most effective to have bar graphs on top of one another for easy comparison.
Bar & Line (Dot) Graphs: I chose this setup because I thought it would be important to see the differences in weight lifted and reps side by side. Following a similar strategy to the other bar graphs, I stacked them on top of each other for easy viewing. I thought the findings there were striking and interesting.
Dual Axis: I thought it would be beneficial to see step data and weight lifting data side by side to really bring both of the analyses together.
In general, I tried to give color a lot of importance, but also made sure that the tooltips had all of the information that would be needed if color was not considered.
See below for visualization from Tableau:
For the first portion of this lab, we manually cleaned the data. CSV file linked here. Screenshot below from my manual cleaning work:
Second portion of Population Data screenshots below:
For this first project, I put myself into the shoes of a DSNY (Department of Sanitation NY) data analyst. I would like to examine the 311 calls that DSNY handles and see if there are any specific types of complaints that need to be addressed more than others. As an initial filter, I think a good way to simplify the data is to look at it in three categories: citizen issues, sanitation issues, and other issues that do not fit within either of those categories. To clarify, citizen issues deal with problems that are caused by the average New Yorker, such as illegal dumping or residents not cleaning up their garbage properly. Sanitation issues refer to problems caused by the department itself, such as complaints about workers or baskets not being picked up. I wonder: As we continue into this new normal, which specific complaints can the DSNY focus on to decrease the number of sanitation-related 311 calls the city receives and, consequently, improve the quality of life of New Yorkers and the performance of the agency? Additionally, how has the volume of DSNY-related 311 service requests changed throughout the pandemic?
Each of the complaint types was coded in the following way:
Citizen Issues | Sanitation Issues | Other
Abandoned Bike | Collection Truck Noise | Litter Basket / Request
Derelict Bicycle | DSNY Spillage | Other Enforcement
Derelict Vehicle | Employee Behavior | Sanitation Condition
Dirty Conditions | Missed Collection | Snow
Graffiti | Missed Collection (All Materials) | Vacant Lot
Recycling Enforcement | Overflowing Litter Baskets |
Snow Removal | |
Sweeping/Inadequate | |
Sweeping/Missed | |
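The coding step above amounts to a lookup from complaint type to issue category, followed by a count per category. Here is a minimal Python sketch of that idea. The mapping includes only a handful of the complaint types from the table, the "Uncoded" fallback label is my own assumption for types that have not been coded yet, and the sample requests are made up.

```python
from collections import Counter

# Partial, illustrative mapping; the full coding is in the table above.
ISSUE_CATEGORY = {
    "Derelict Vehicle": "Citizen Issues",
    "Dirty Conditions": "Citizen Issues",
    "Missed Collection": "Sanitation Issues",
    "Employee Behavior": "Sanitation Issues",
    "Snow": "Other",
    "Vacant Lot": "Other",
}


def categorize(complaint_type):
    """Look up the issue category; flag anything not in the mapping."""
    return ISSUE_CATEGORY.get(complaint_type, "Uncoded")


def count_by_category(complaint_types):
    """Count service requests per issue category."""
    return Counter(categorize(c) for c in complaint_types)


# Hypothetical complaint types, not real 311 records.
sample_requests = [
    "Missed Collection",
    "Dirty Conditions",
    "Missed Collection",
    "Derelict Vehicle",
    "Snow",
    "Noise",
]

category_counts = count_by_category(sample_requests)
print(category_counts)
```

Flagging unmapped types separately, rather than silently placing them in "Other", makes it easier to spot complaint types that still need to be coded.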
The 311 Service Requests were pulled from NYC Open Data and then sorted. Since the analysis covers the period of the pandemic, data was pulled from March 11, 2020 (my first day of working from home) through August of 2021.
City officials, specifically in the DSNY, but also beyond at City Hall, would be interested in seeing the answer to the research questions above. Having this information would be beneficial to these stakeholders because it can influence where departmental resources are distributed for more efficient use of the DSNY workforce. Additionally, this information can determine if there is any legislation that can be passed to improve conditions for New Yorkers.
To begin, I created a stacked bar chart to show which types of issues are reported most frequently and, further, which complaint types are most common within each. I chose a stacked bar chart because it allows us to see both of these measures in one chart.
In reviewing the stacked bar chart above, we can observe the following:
Design choices for stacked bar chart:
I tried to keep this chart simple so that it communicates clearly. I coded each of the issue types with a different color and then made a gradient for the complaint types within each. We primarily look at the quantities here, so I made sure the tooltip only contained the important information: the complaint type and quantity.
Below, we see how DSNY-related 311 service request volume has changed throughout the pandemic. It is important to note that for all graphs below, we used the date that the report was created, which would correspond to the day that the call was initially made.
I also broke these out into the different issue-type categories to see if there was anything to glean from those differences.
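To illustrate how request volume over time can be tallied from the created date, here is a short Python sketch. It is not the method used to build the Tableau graphs. The function name and sample dates are hypothetical, and the end date of August 31, 2021 is my assumption for "through August of 2021".

```python
from collections import Counter
from datetime import date

STUDY_START = date(2020, 3, 11)
STUDY_END = date(2021, 8, 31)  # assumed last day of the study window


def monthly_request_counts(created_dates, start=STUDY_START, end=STUDY_END):
    """Count requests per (year, month), keeping only dates inside the window."""
    counts = Counter(
        (d.year, d.month) for d in created_dates if start <= d <= end
    )
    return dict(sorted(counts.items()))


# Hypothetical created dates, not real 311 records.
sample_created = [
    date(2020, 3, 10),  # before the window, excluded
    date(2020, 3, 11),
    date(2020, 3, 30),
    date(2021, 2, 1),
    date(2021, 9, 1),   # after the window, excluded
]

monthly_counts = monthly_request_counts(sample_created)
print(monthly_counts)
```

The same function can be run on the requests from a single issue category or complaint type to get the broken-out views.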
We can observe the following:
I am most interested in doing further analysis on the Missed Collection, Dirty Conditions, and Derelict Vehicle complaint types as those are the most common 311 reports to the DSNY. Further analysis of these complaint types can provide information to resolve the most frequent complaints from both issue categories.
We can observe from the line graph above:
Design Choices for Line Graphs: I decided to make the line graphs very simple and thought that it was more beneficial to split the categories and complaint types into different graphs to make them easier to view.
I had initially tried to do these charts as a bar graph or pie chart. Both were quite ineffective and unnecessary, so I decided to go with a table. We can observe that in all of these categories, most of the tickets are closed.
Opportunities for further analysis:
There are many interesting things to be gleaned from looking at the DSNY 311 Data. It is fair to conclude that both Citizen and Sanitation issues are of major concern and that the Missed Collection, Dirty Conditions, and Derelict Vehicle complaint types are the most common. It would be fair to suggest that DSNY officials take a closer look at these specific complaint types to try to decrease the number of calls they get in those categories.
Interestingly, it appears that the winter months of 2021 saw the highest volume of 311 complaint calls, with another spike happening in the warmer months of 2020.
With this, there are definitely more questions and further analysis to be done.
Please see my Blogpost 1 titled AN EXPLORATORY ANALYSIS OF 311 CALLS HANDLED BY THE DSNY DURING THE COVID-19 PANDEMIC where I completed these steps.
Data cleaning for the Fall 2021 Intro to Data Viz class’ Restaurant Recommendations is linked here in Google Sheets, copied from Tableau Desktop. One important note is that the cleaning in Tableau resulted in one row being removed because its location, Long Island, is not a borough of New York City.
QUESTIONS:
Below, you can see a visualization of the World Population Change by region for the period between 1980 and 2015. Each region is color-coded per the chart.