# job-scraper
A scheduled job that scrapes relevant job links once a day and sends them to Discord.

+## Deployment
+Deployment currently uses [Terraform](https://www.terraform.io/) to set up AWS services.
+### Prerequisites
+This repo requires a private [Amazon ECR repo](https://us-east-1.console.aws.amazon.com/ecr/repositories?region=us-east-1) created in the same region that our container-based Lambda is deployed to (in our case us-east-1). Name the private repo `headless`.
+### Setting up remote state
+Terraform has a feature called [remote state](https://www.terraform.io/docs/state/remote.html), which keeps the state of your infrastructure in sync across multiple team members and any CI system.
+This project **requires** this feature to be configured. To configure it, **run the following commands once per team**:
+cd terraform/remote-state
+terraform init
+terraform apply
+# lol-counter-source-api
+A Lambda that invokes a headless browser to render a page (including its JavaScript) and returns the rendered HTML.
+## Why is this lambda using a container deployment rather than the standard zip deployment?
+[Puppeteer](https://pptr.dev/) requires a Chrome/Chromium binary, which exceeds the standard [Lambda size limit](https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html#function-configuration-deployment-and-execution) for zip deployments. Using a container image raises the limit to 10 GB, which allows the binary to be deployed. Currently this service also uses [@sparticuz/chromium](https://github.com/Sparticuz/chromium) because the standard Puppeteer Chromium install has permissions issues when running in the deployed AWS environment.
+## Prerequisites
+You must have the following installed and configured on your system for this to work correctly:
+1. [Docker](https://www.docker.com/)
+2. [Docker Compose](https://docs.docker.com/compose/)
+## Development Environment
+The development environment uses a pinned version of [AWS's Node.js 18 image](https://gallery.ecr.aws/lambda/nodejs) to mimic the running Lambda. Start it with:
+docker-compose up
+The output is similar to what you would see in CloudWatch Logs, for example:
+headless-lambda-1 | 18 Aug 2023 09:47:04,515 [INFO] (rapid) exec '/var/runtime/bootstrap' (cwd=/var/task, handler=)
+The local container's endpoint is `localhost:3000/2015-03-31/functions/function/invocations`. Send a POST request to it with the following body:
+{
+  "queryStringParameters": {
+    "url": "https://www.google.com"
+  }
+}
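As a rough illustration of the request described above, the following Go sketch builds the same JSON body and POSTs it to the local container. The helper names `buildInvokeBody` and `invokeLocal` are hypothetical, and the `http://` scheme is an assumption, since only the host, port and path are given here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// invokeEvent mirrors the request body documented above.
type invokeEvent struct {
	QueryStringParameters queryParams `json:"queryStringParameters"`
}

type queryParams struct {
	URL string `json:"url"`
}

// buildInvokeBody (hypothetical helper) serialises the event for a target URL.
func buildInvokeBody(target string) ([]byte, error) {
	return json.Marshal(invokeEvent{QueryStringParameters: queryParams{URL: target}})
}

// invokeLocal (hypothetical helper) POSTs the event to the local container
// and returns the raw response body.
// Assumption: the endpoint is served over plain http.
func invokeLocal(target string) (string, error) {
	body, err := buildInvokeBody(target)
	if err != nil {
		return "", err
	}
	const endpoint = "http://localhost:3000/2015-03-31/functions/function/invocations"
	resp, err := http.Post(endpoint, "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d: %s", resp.StatusCode, out)
	}
	return string(out), nil
}
```

Calling `invokeLocal("https://www.google.com")` from a `main` function while the container is running should return whatever the Lambda responds with; the exact response shape is not documented here, so the sketch returns it unparsed.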
+# Scraper
+This is a Go Lambda that receives a URL as a query string parameter and returns that website's HTML. This Lambda does not render any JavaScript; for that functionality, see the folder called `headless`.
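The core of such a function can be sketched with the Go standard library alone. This is a minimal sketch, not the repo's actual handler: `validateTarget` and `fetchHTML` are hypothetical names, the restriction to `http`/`https` is an assumption, and the Lambda event wiring is omitted.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// validateTarget (hypothetical helper) rejects anything that is not an
// absolute http(s) URL. Assumption: only web pages are valid targets.
func validateTarget(raw string) (*url.URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, fmt.Errorf("invalid url %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return nil, fmt.Errorf("missing host in %q", raw)
	}
	return u, nil
}

// fetchHTML (hypothetical helper) downloads the page as-is.
// No JavaScript is executed; the body is returned exactly as served.
func fetchHTML(client *http.Client, raw string) (string, error) {
	u, err := validateTarget(raw)
	if err != nil {
		return "", err
	}
	resp, err := client.Get(u.String())
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}
```

Passing in the `*http.Client` keeps timeouts configurable, for example `fetchHTML(&http.Client{Timeout: 10 * time.Second}, target)`.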
+## Prerequisites
+You must have the following installed and configured on your system for this to work correctly:
+1. [Go](https://go.dev/doc/install)
+# Scraper
+This is a Go Lambda that goes through the proxy API to retrieve website HTML. Once the HTML is received, it parses it and runs a keyword check on the job description. If any keyword appears in the description, the job link and company are sent to Discord for manual review.
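The keyword check could look roughly like the sketch below. The function names are hypothetical, and case-insensitive substring matching is an assumption; the actual matching rules may differ.

```go
package main

import "strings"

// matchKeywords (hypothetical helper) returns every keyword that occurs in
// the job description. Assumption: matching is a case-insensitive substring
// test, so a short keyword can also match inside a longer word.
func matchKeywords(description string, keywords []string) []string {
	lower := strings.ToLower(description)
	var hits []string
	for _, kw := range keywords {
		kw = strings.TrimSpace(kw)
		if kw == "" {
			continue // skip blank entries so they never match everything
		}
		if strings.Contains(lower, strings.ToLower(kw)) {
			hits = append(hits, kw)
		}
	}
	return hits
}

// shouldNotify reports whether the job qualifies for manual review,
// i.e. at least one keyword was found in the description.
func shouldNotify(description string, keywords []string) bool {
	return len(matchKeywords(description, keywords)) > 0
}
```

Returning the matched keywords rather than a bare boolean makes it possible to include them in the Discord message, which may help during manual review.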
+## Prerequisites
+You must have the following installed and configured on your system for this to work correctly:
+1. [Go](https://go.dev/doc/install)