This program lets users easily generate links to IB papers from the IBDocuments website. The links are saved to a CSV file. No personal data is stored.
In high school, I spent a lot of time sifting through the IBDocuments website to download test papers for extra practice. This Python script automatically generates a list of links to the PDFs of IB math papers via web scraping, without any clicking around the website. The user specifies the subject level (Higher Level or Standard Level) and the papers desired (P1, P2, P3); the links are obtained by scraping the IBDocuments.com website and are then saved to a CSV file. This list of links can then be opened via the OpenList Google Chrome extension.
- Web scraping using the Beautiful Soup module
- CSV file parsing
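
As a rough illustration of the scrape-then-save idea, here is a minimal sketch that pulls PDF links out of a page, keeps the ones matching a level and paper number, and writes them as CSV. It is not the code in main.py. To stay dependency-free it uses the standard library's `html.parser` in place of Beautiful Soup (with Beautiful Soup, `soup.find_all("a", href=True)` would do the link-collecting step). The sample HTML, the `example.com` URL, the `Mathematics_paper_1_HL.pdf` naming pattern, and the helper names are all assumptions made up for the example.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a PDF."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative hrefs against the page URL.
            self.links.append(urljoin(self.base_url, href))


def filter_links(links, level, papers):
    """Keep links whose file name matches the level and one of the papers.

    Assumes hypothetical file names such as 'Mathematics_paper_1_HL.pdf'.
    """
    wanted = []
    for link in links:
        name = link.rsplit("/", 1)[-1].lower()
        level_ok = name.endswith(f"_{level.lower()}.pdf")
        paper_ok = any(f"paper_{p}_" in name for p in papers)
        if level_ok and paper_ok:
            wanted.append(link)
    return wanted


def write_links(links, out_file):
    """Write one link per row under a single 'link' header."""
    writer = csv.writer(out_file)
    writer.writerow(["link"])
    for link in links:
        writer.writerow([link])


# Made-up page content standing in for a downloaded directory listing.
SAMPLE_HTML = """
<a href="Mathematics_paper_1_HL.pdf">P1 HL</a>
<a href="Mathematics_paper_2_HL.pdf">P2 HL</a>
<a href="Mathematics_paper_1_SL.pdf">P1 SL</a>
<a href="../">Parent directory</a>
"""

parser = PdfLinkParser("https://example.com/2019/")
parser.feed(SAMPLE_HTML)
selected = filter_links(parser.links, level="HL", papers=[1])

# An in-memory buffer stands in for the output CSV file.
buffer = io.StringIO()
write_links(selected, buffer)
print(buffer.getvalue())
```

In a real run, the HTML would come from an HTTP request for each year's page, and `write_links` would be given a file opened with `open(..., "w", newline="")` instead of the in-memory buffer.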
- First, make sure you have Python and Git installed on your computer.
- These are the Python download links for Windows and for Mac
- Git comes preinstalled on most Mac and Linux machines. If it is not installed, this installation guide may help.
- Clone/download the git repo into a folder on your computer.
- Open main.py and enter your search parameters (the subject level and the IB papers you want).
- In a terminal, navigate to the program folder and run "python3 main.py".
- The program should run, displaying its progress collecting links for each year.
- When the 'Done!' message appears, a CSV file titled 'cms_scrape.csv' should have been created in the folder. Open it in Excel or another CSV reader and copy the links you want.
- Paste these links into OpenList, a free Google Chrome extension, and click 'Open' to open all the links.
- If you do not have OpenList, it can be installed here for free.
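
If you would rather not copy links out of a spreadsheet by hand, the copy step can be sketched in a few lines of Python. This is an optional helper, not part of the program. It assumes nothing about the column layout of 'cms_scrape.csv' and simply collects every cell that looks like a URL; the sample rows and the `links_from_csv` name are made up for the example. It prints one link per line, on the assumption that OpenList accepts links in that form.

```python
import csv
import io


def links_from_csv(csv_file):
    """Return every cell that looks like a URL, in file order."""
    links = []
    for row in csv.reader(csv_file):
        for cell in row:
            cell = cell.strip()
            if cell.startswith(("http://", "https://")):
                links.append(cell)
    return links


# Made-up rows standing in for open("cms_scrape.csv", newline="").
sample = io.StringIO(
    "year,link\n"
    "2019,https://example.com/2019/p1.pdf\n"
    "2018,https://example.com/2018/p1.pdf\n"
)

links = links_from_csv(sample)
print("\n".join(links))
```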
Enjoy using this program, and please feel free to contact me at lrm444@nyu.edu about any issues you encounter!