- Our focus should be on cluster 1, comprising individuals with both high Spending Scores and substantial income.
- Among these customers, 54 percent are women. To appeal to this group, we should devise a marketing campaign centered around the popular items found in this cluster.
- As for cluster 2, there is a promising chance to conduct a sales event targeting its customers, especially for popular items.
- Abstract: Cleaning, Analysis and Visualization of Netflix Data
- Link: https://www.kaggle.com/datasets/ariyoomotade/netflix-data-cleaning-analysis-and-visualization?select=netflix1.csv
- Conclusion
- The total count of Movies is higher than that of TV Shows.
- Prior to the year 2020, there was a positive growth in the number of Movie and TV Show releases.
- Moreover, the year 2017 saw the highest number of Movie releases, with 766 releases, while the year 2020 had the highest number of TV Show releases, with 436 releases.
- The top five countries by number of releases are United States (57.6%), India (18.8%), United Kingdom (11.3%), Pakistan (7.5%), and Canada (4.8%). The percentages sum to 100%, so they are shares within these five countries rather than shares of all releases.
- The director with the most published works is Rajiv Chilaka, with 20 productions.
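The counts behind these conclusions can be reproduced with a few tallies. The following is a minimal sketch in plain Python, not the original notebook's code: the field names (`type`, `release_year`, `country`, `director`) are assumptions about the CSV layout, the helper names are hypothetical, and the rows are invented toy values.

```python
from collections import Counter

# Toy rows standing in for netflix1.csv. Field names are assumed and the
# values are invented for illustration only.
rows = [
    {"type": "Movie", "release_year": 2016, "country": "Canada", "director": "Director D"},
    {"type": "Movie", "release_year": 2017, "country": "United States", "director": "Director A"},
    {"type": "Movie", "release_year": 2017, "country": "India", "director": "Director B"},
    {"type": "Movie", "release_year": 2018, "country": "United States", "director": "Director A"},
    {"type": "TV Show", "release_year": 2019, "country": "India", "director": "Director B"},
    {"type": "TV Show", "release_year": 2020, "country": "United States", "director": "Director C"},
    {"type": "TV Show", "release_year": 2020, "country": "United Kingdom", "director": "Director A"},
]


def peak_year(rows, show_type):
    """Return (year, count) for the year with the most releases of one type."""
    per_year = Counter(r["release_year"] for r in rows if r["type"] == show_type)
    return per_year.most_common(1)[0]


def top_share(rows, key, n=5):
    """Return the n most common values of `key` as (value, count, share).

    The share is a percentage of the top-n total, not of all rows.
    """
    top = Counter(r[key] for r in rows).most_common(n)
    total = sum(count for _, count in top)
    return [(value, count, round(100 * count / total, 1)) for value, count in top]


type_counts = Counter(r["type"] for r in rows)          # Movies vs. TV Shows
top_director = Counter(r["director"] for r in rows).most_common(1)[0]

print(type_counts)
print(peak_year(rows, "Movie"), peak_year(rows, "TV Show"))
print(top_share(rows, "country", n=2))
print(top_director)
```

On the real file the rows would come from `csv.DictReader`, in which case `release_year` arrives as a string and needs converting to `int` first.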
- Dataset: https://www.kaggle.com/datasets/jayrav13/olympic-track-field-results (Olympic Track & Field Results published on Kaggle)
- Tasks:
- Use Tableau to create a scatterplot with years on the X axis and performance on the Y axis, similar to the ones posted here: https://www.kaggle.com/code/drgilermo/ahead-of-their-time
- Create a multiple-display chart: years on the X axis, with one copy of the chart produced in task 1 for each event I have selected. Different events have different ranges on the Y axis.
- Include sports with times in the same range (for example, 100m, 200m, 400m, 800m)
- Recode the Y variable as % improvement since the sport was introduced.
- Dashboards help us explore our data quickly. Create a dashboard in which the user chooses the event and the gender, and the visual from task 1 appears.
- Create a connected scatterplot with the results from task 2. Put years on the X axis and a variable of your choice on the Y axis (seconds, % improvement, or pace). Include in the connected scatterplot only the performance of the gold medalist, or the average performance of all medalists, for a given year.
- Create the following visual presented in NYT for an event of your choice https://archive.nytimes.com/www.nytimes.com/interactive/2012/08/05/sports/olympics/the-100-meter-dash-one-race-every-medalist-ever.html
- Use Python to find:
- In which sports has performance improved or declined the most?
- In which sports is the difference between men and women the least/most (in percentage terms)?
- In which sports and years is the difference between the gold and the silver the greatest?
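The three Python questions reduce to grouped comparisons of medal times. Below is a minimal sketch in plain Python under stated assumptions: each result is already a tuple `(event, gender, year, medal, seconds)` with a numeric time, medals are coded `"G"` and `"S"`, and genders `"M"` and `"W"`. The function names are hypothetical and the times are invented toy values, not real Olympic results.

```python
from collections import defaultdict

# (event, gender, year, medal, seconds): invented toy values for illustration.
results = [
    ("100M", "M", 2000, "G", 10.0), ("100M", "M", 2000, "S", 10.1),
    ("100M", "M", 2016, "G", 9.8),  ("100M", "M", 2016, "S", 9.9),
    ("100M", "W", 2000, "G", 11.0), ("100M", "W", 2000, "S", 11.2),
    ("100M", "W", 2016, "G", 10.7), ("100M", "W", 2016, "S", 10.8),
    ("800M", "M", 2000, "G", 105.0), ("800M", "M", 2000, "S", 105.5),
    ("800M", "M", 2016, "G", 102.0), ("800M", "M", 2016, "S", 103.0),
    ("800M", "W", 2000, "G", 118.0), ("800M", "W", 2000, "S", 118.4),
    ("800M", "W", 2016, "G", 115.0), ("800M", "W", 2016, "S", 116.0),
]


def gold_times(results):
    """Map (event, gender) -> {year: gold-medal time in seconds}."""
    out = defaultdict(dict)
    for event, gender, year, medal, seconds in results:
        if medal == "G":
            out[(event, gender)][year] = seconds
    return out


def pct_improvement(results):
    """Percent drop in gold time from the first to the latest Games."""
    out = {}
    for key, by_year in gold_times(results).items():
        first, last = by_year[min(by_year)], by_year[max(by_year)]
        out[key] = (first - last) / first * 100  # positive means faster
    return out


def gender_gap(results):
    """Percent by which the mean women's gold time exceeds the men's, per event."""
    golds = gold_times(results)
    out = {}
    for event in {e for e, _ in golds}:
        men, women = golds.get((event, "M")), golds.get((event, "W"))
        if men and women:  # skip events held for only one gender
            m = sum(men.values()) / len(men)
            w = sum(women.values()) / len(women)
            out[event] = (w - m) / m * 100
    return out


def gold_silver_gap(results):
    """Percent gap between silver and gold per (event, gender, year)."""
    times = defaultdict(dict)
    for event, gender, year, medal, seconds in results:
        times[(event, gender, year)][medal] = seconds
    return {k: (t["S"] - t["G"]) / t["G"] * 100
            for k, t in times.items() if "G" in t and "S" in t}


improvement = pct_improvement(results)
gaps = gender_gap(results)
margins = gold_silver_gap(results)

print("largest improvement:", max(improvement, key=improvement.get))
print("smallest / largest gender gap:", min(gaps, key=gaps.get), max(gaps, key=gaps.get))
print("widest gold-silver margin:", max(margins, key=margins.get))
```

This sketch only covers timed events, where a lower value is better. For jumps and throws the signs would flip, and the raw results would first need parsing into numbers.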