In this challenge, we worked with school and student data for a district and used pandas to build data frames for analyzing the performance of different schools across various indicators and categories.
We started with two CSV files containing data on the schools and the students. We began our analysis by merging the two datasets into a single data frame. We then ran an overall analysis of the district, followed by the same analysis for each school in the district. After completing this basic analysis for the whole district and for the individual schools, and saving the results in their respective data frames, we carried out the following further analyses:\
- Performance of the schools (Highest and Lowest) based on their overall passing rate.
- Passing rates of math and reading for each school, broken down by different grades.
- Categorizing schools by their per capita spending, and performance analysis of said categories based on passing rates.
- Categorizing schools by the number of students in each school, and performance analysis of said categories based on passing rates.
- Finally, categorizing schools by their type (district or charter), and performance analysis of said categories based on passing rates.
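
The workflow above (merge, per-school summary, then binning into categories) can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the column names, the toy values, the passing threshold of 70, and the spending bin edges are all assumptions made for the example.

```python
import pandas as pd

# Toy stand-ins for the two CSV files. All column names and values are
# hypothetical; the real files would be loaded with pd.read_csv instead.
schools = pd.DataFrame({
    "school_name": ["Alpha High", "Beta High", "Gamma High"],
    "type": ["District", "Charter", "Charter"],
    "budget": [2600, 1200, 1160],
})
students = pd.DataFrame({
    "school_name": ["Alpha High"] * 4 + ["Beta High"] * 2 + ["Gamma High"] * 2,
    "grade": ["9th", "10th", "11th", "12th", "9th", "10th", "9th", "10th"],
    "math_score": [65, 72, 58, 90, 88, 95, 71, 69],
    "reading_score": [80, 68, 75, 85, 92, 90, 85, 74],
})

PASSING_SCORE = 70  # assumed threshold, not taken from the source

# Step 1: merge the two tables into a single data frame (one row per student).
merged = students.merge(schools, on="school_name", how="left")

# Step 2: flag students who pass both subjects.
merged["pass_overall"] = (
    (merged["math_score"] >= PASSING_SCORE)
    & (merged["reading_score"] >= PASSING_SCORE)
)

# Step 3: summarize each school.
per_school = merged.groupby("school_name").agg(
    school_type=("type", "first"),
    students=("math_score", "count"),
    budget=("budget", "first"),
    pct_overall_passing=("pass_overall", "mean"),
)
per_school["pct_overall_passing"] *= 100
per_school["per_student_budget"] = per_school["budget"] / per_school["students"]

# Highest- and lowest-performing schools by overall passing rate.
best_school = per_school["pct_overall_passing"].idxmax()
worst_school = per_school["pct_overall_passing"].idxmin()

# Step 4: bin schools by per-student spending (bin edges are made up)
# and average the passing rate within each bin.
per_school["spending_range"] = pd.cut(
    per_school["per_student_budget"],
    bins=[0, 585, 630, 700],
    labels=["<$585", "$585-630", "$630-700"],
)
by_spending = per_school.groupby("spending_range", observed=True)[
    "pct_overall_passing"
].mean()

print(per_school)
print(by_spending)
```

The same `pd.cut` pattern would apply to school size, and a plain `groupby` on the school type column would cover the district/charter comparison.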

Among the many conclusions that could be drawn from our analysis, below are two:
- For math, the average score varies significantly between schools, but the variation between the average scores for different grades is small. For reading, by contrast, the average score shows little variation between either grades or schools.
- The overall performance of charter schools is significantly better than that of district schools. (However, this does not necessarily mean that charter schools are better at teaching. The gap should be studied further, as it may be related to other factors, such as the syllabus and the difficulty of exams.)
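
Both observations can be checked with a few lines of pandas. The sketch below shows one way to do it, assuming a merged per-student data frame with hypothetical column names and a passing threshold of 70. The toy values are invented to make the pattern visible and do not reproduce the actual results.

```python
import pandas as pd

# Hypothetical merged data: one row per student. Values are invented.
merged = pd.DataFrame({
    "school_name": ["Alpha High"] * 4 + ["Beta High"] * 4,
    "type": ["District"] * 4 + ["Charter"] * 4,
    "grade": ["9th", "9th", "10th", "10th"] * 2,
    "math_score": [60, 64, 62, 66, 84, 88, 85, 89],
    "reading_score": [80, 82, 81, 83, 82, 84, 83, 85],
})

# Average math score per school and grade (schools as rows, grades as columns).
math_by_grade = merged.pivot_table(
    index="school_name", columns="grade", values="math_score", aggfunc="mean"
)

# Variation between grades within each school.
spread_across_grades = math_by_grade.max(axis=1) - math_by_grade.min(axis=1)

# Variation between schools, using each school's overall math average.
school_means = merged.groupby("school_name")["math_score"].mean()
spread_across_schools = school_means.max() - school_means.min()

# Overall passing rate by school type (threshold of 70 is an assumption).
passed = (merged["math_score"] >= 70) & (merged["reading_score"] >= 70)
pass_by_type = merged.assign(passed=passed).groupby("type")["passed"].mean() * 100

print(math_by_grade)
print(spread_across_grades)
print(spread_across_schools)
print(pass_by_type)
```

A comparison like this only describes the gap between school types; it says nothing about its cause.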