From 4c9a7dfaf4855c8b0e98c59306fd5bdb3b51ca58 Mon Sep 17 00:00:00 2001
From: Austin Davis
Date: Thu, 29 Feb 2024 00:39:17 -0700
Subject: [PATCH] docs

---
 README.md          | 16 ++++++++++++++++
 headless/README.md | 31 +++++++++++++++++++++++++++++++
 proxy/README.md    |  7 +++++++
 scraper/README.md  |  6 ++++++
 4 files changed, 60 insertions(+)
 create mode 100644 headless/README.md
 create mode 100644 proxy/README.md
 create mode 100644 scraper/README.md

diff --git a/README.md b/README.md
index b97e26f..f163fb5 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,19 @@
 # job-scraper
 A timed event that once a day scraps relevant jobs links and sends them to discord.
 ![job-scraper (1)](https://github.com/austin1237/job-scraper/assets/1394341/39688936-66f2-4819-93bf-fcafb83930c4)
+
+## Deployment
+Deployment currently uses [Terraform](https://www.terraform.io/) to set up AWS services.
+### Prerequisites
+This repo needs a private [Amazon ECR repo](https://us-east-1.console.aws.amazon.com/ecr/repositories?region=us-east-1) created in the same region that the container-based lambda is deployed to (in our case, us-east-1). Name the private repo `headless`.
+
+### Setting up remote state
+Terraform has a feature called [remote state](https://www.terraform.io/docs/state/remote.html) which keeps the state of your infrastructure in sync across multiple team members as well as any CI system.
+
+This project **requires** this feature to be configured. To configure it, **RUN THE FOLLOWING COMMANDS ONCE PER TEAM**.
+
+```bash
+cd terraform/remote-state
+terraform init
+terraform apply
+```
\ No newline at end of file
diff --git a/headless/README.md b/headless/README.md
new file mode 100644
index 0000000..5ccff97
--- /dev/null
+++ b/headless/README.md
@@ -0,0 +1,31 @@
+# headless
+A lambda that invokes a headless browser to render a page (including its JavaScript) and passes along the rendered HTML.
+
+## Why is this lambda using a container deployment rather than the standard zip deployment?
+[Puppeteer](https://pptr.dev/) requires a Chrome/Chromium binary, which exceeds the standard [lambda size limit](https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html#function-configuration-deployment-and-execution). Using a container image greatly increases the limit and allows the binary to be deployed. Currently this service also uses [@sparticuz/chromium](https://github.com/Sparticuz/chromium) because the standard Puppeteer Chromium install has permission issues when running in the deployed AWS environment.
+
+## Prerequisites
+You must have the following installed/configured on your system for this to work correctly:
+1. [Docker](https://www.docker.com/)
+2. [Docker-Compose](https://docs.docker.com/compose/)
+
+## Development Environment
+The development environment uses a pinned version of [AWS's Node 18 image](https://gallery.ecr.aws/lambda/nodejs) to mimic the running lambda.
+
+```bash
+docker-compose up
+```
+
+The output is similar to what you would see in CloudWatch logs, e.g.:
+
+```bash
+headless-lambda-1 | 18 Aug 2023 09:47:04,515 [INFO] (rapid) exec '/var/runtime/bootstrap' (cwd=/var/task, handler=)
+```
+
+The local container's endpoint is `localhost:3000/2015-03-31/functions/function/invocations`; send a POST request with the following body:
+```json
+{
+  "queryStringParameters": {
+    "url": "https://www.google.com"
+  }}
+```
\ No newline at end of file
diff --git a/proxy/README.md b/proxy/README.md
new file mode 100644
index 0000000..4234a87
--- /dev/null
+++ b/proxy/README.md
@@ -0,0 +1,7 @@
+# Proxy
+This is a Go lambda that receives a URL as a query string and passes along that website's HTML. This lambda does not render any JavaScript; for that functionality, see the `headless` folder.
+
+## Prerequisites
+You must have the following installed/configured on your system for this to work correctly:
+1. [Go](https://go.dev/doc/install)
+
diff --git a/scraper/README.md b/scraper/README.md
new file mode 100644
index 0000000..b53d317
--- /dev/null
+++ b/scraper/README.md
@@ -0,0 +1,6 @@
+# Scraper
+This is a Go lambda that goes through the proxy API to receive website HTML. Once received, it parses the HTML and does a keyword check on the job description. If any keyword exists in the description, the job link and company are sent to Discord for manual review.
+
+## Prerequisites
+You must have the following installed/configured on your system for this to work correctly:
+1. [Go](https://go.dev/doc/install)
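
The keyword check that scraper/README.md describes (scan a job description, flag it if any keyword matches) could be sketched roughly as below. This is a minimal illustration, not the project's actual code; the function and variable names are hypothetical, and a case-insensitive substring match is assumed:

```go
package main

import (
	"fmt"
	"strings"
)

// containsKeyword reports whether any of the given keywords appears in the
// job description. Matching here is a case-insensitive substring check;
// the real scraper may match differently.
func containsKeyword(description string, keywords []string) bool {
	lowered := strings.ToLower(description)
	for _, kw := range keywords {
		if strings.Contains(lowered, strings.ToLower(kw)) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical sample input, for illustration only.
	keywords := []string{"golang", "terraform"}
	desc := "We are hiring a Golang engineer to build AWS lambdas."
	fmt.Println(containsKeyword(desc, keywords)) // prints "true" for this sample
}
```

A description that matches would then have its job link and company posted to Discord for manual review, per the README above.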