Commit 9ea8108

Merge pull request #307 from DedSecInside/python-3.11-dev
Replacing gotor with httpx and other major changes
2 parents 7df1e9e + c80f844 commit 9ea8108

24 files changed: 549 additions, 814 deletions

.env

Lines changed: 2 additions & 3 deletions
@@ -1,4 +1,3 @@
 export TORBOT_DATA_DIR=${PWD}/data
-export HOST='localhost'
-export PORT=8081
-export LOG_LEVEL="info" # OPTIONS - info, debug, fatal
+export SOCKS5_HOST='127.0.0.1'
+export SOCKS5_PORT=9050
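
The new `SOCKS5_HOST` and `SOCKS5_PORT` variables suggest that the Python code now talks to Tor's SOCKS5 listener directly instead of going through `gotor`. Below is a minimal sketch of how such variables might be turned into a proxy URL. The helper name `socks5_proxy_url`, its `disable_socks5` parameter, and the fallback defaults are assumptions for illustration, not code taken from this commit.

```python
import os
from typing import Mapping, Optional


def socks5_proxy_url(env: Mapping[str, str] = os.environ,
                     disable_socks5: bool = False) -> Optional[str]:
    """Build a SOCKS5 proxy URL from SOCKS5_HOST / SOCKS5_PORT.

    Hypothetical helper: returns None when the proxy is disabled, so the
    caller can fall back to direct HTTP requests.
    """
    if disable_socks5:
        return None
    # Fallbacks mirror the values exported in .env (an assumption).
    host = env.get("SOCKS5_HOST", "127.0.0.1")
    port = int(env.get("SOCKS5_PORT", "9050"))
    return f"socks5://{host}:{port}"


if __name__ == "__main__":
    print(socks5_proxy_url({"SOCKS5_HOST": "127.0.0.1", "SOCKS5_PORT": "9050"}))
```

A URL of this form could then be handed to an HTTP client with SOCKS support; with `httpx` that typically requires installing the `httpx[socks]` extra.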

.gitmodules

Lines changed: 0 additions & 3 deletions
This file was deleted.

README.md

Lines changed: 14 additions & 64 deletions
@@ -34,95 +34,45 @@
 6. Crawl custom domains
 7. Check if the link is live
 8. Built-in Updater
-9. Build visual tree of link relationship that can be quickly viewed or saved to an image file
+9. Build visual tree of link relationship that can be quickly viewed or saved to an file

 ...(will be updated)

 ### Dependencies
-- Tor
+- Tor (Optional)
 - Python ^3.9
-- Golang 1.19
 - Poetry

 ### Python Dependencies

-(see requirements.txt for more details)
-
-### Golang Dependencies
-- https://github.com/KingAkeem/gotor (This service needs to be ran in tandem with TorBot)
+(see pyproject.toml or requirements.txt for more details)

 ## Installation

-### Gotor
-gotor is needed to run this module.
-Note: If the `gotor` directory is empty, you may need to run `git submodule update --init --recursive` to initialize the submodule.
-
-#### Using local Tor service
-* Run the tor service:
-```sh
-sudo service tor start
-```
-* Make sure that your torrc is configured to SOCKS_PORT localhost:9050
-
-* Open a new terminal and start `gotor`, this can be done using `docker` or `go`
-- using go:
-```sh
-cd gotor && go run cmd/main/main.go -server
-```
-
-#### Using tor and gotor docker containers
-- using docker (multi-stage image, builds tor and gotor container):
-```sh
-cd gotor && ./build.sh
-```
-
 ### TorBot
 * TorBot dependencies are managed using `poetry`, you can find the installation commands below:
 ```sh
 poetry install # to install dependencies
-poetry run python run.py -u https://www.example.com --depth 2 -v # example of running command with poetry
-poetry run python run.py -h # for help
-```
-
-### Full Installation
-There is a shell script that will attempt to install both `torbot` and `gotor` as global modules.
-The script `install.sh` will first install the latest version of `torbot` found in `PyPI`,
-then it will attempt to install `gotor` to the `GOBIN` path after making the path globally accessible.
-```sh
-source install.sh # execute script
-```
-
-You can now run
-```sh
-gotor -server
-```
-and crawl using
-```sh
-python -m torbot -u https://www.example.com
+poetry run python torbot/main.py -u https://www.example.com --depth 2 --visualize tree --save json # example of running command with poetry
+poetry run python torbot/main.py -h # for help
 ```

 ### Options
 <pre>
 usage: Gather and analyze data from Tor sites.

 optional arguments:
--h, --help show this help message and exit
---version Show current version of TorBot.
---update Update TorBot to the latest stable version
--q, --quiet
 -u URL, --url URL Specifiy a website link to crawl
--s, --save Save results in a file
--m, --mail Get e-mail addresses from the crawled sites
--p, --phone Get phone numbers from the crawled sites
 --depth DEPTH Specifiy max depth of crawler (default 1)
---gather Gather data for analysis
--v, --visualize Visualizes tree of data gathered.
--d, --download Downloads tree of data gathered.
--e EXTENSION, --extension EXTENSION
-Specifiy additional website extensions to the list(.com , .org, .etc)
--c, --classify Classify the webpage using NLP module
--cAll, --classifyAll Classify all the obtained webpages using NLP module
--i, --info Info displays basic info of the scanned site </pre>
+-h, --help Show this help message and exit
+-v Displays DEBUG level logging, default is INFO
+--version Show current version of TorBot.
+--update Update TorBot to the latest stable version
+-q, --quiet Prevents display of header and IP address
+--save FORMAT Save results in a file. (tree, json)
+--visualize FORMAT Visualizes tree of data gathered. (tree, json, table)
+-i, --info Info displays basic info of the scanned site
+--disable-socks5 Executes HTTP requests without using SOCKS5 proxy</pre>

 * NOTE: -u is a mandatory for crawling
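
The revised option list in the README could be expressed with `argparse` roughly as follows. This is a hypothetical reconstruction from the help text above, not the parser in `torbot/main.py`; the function name `build_parser`, the `prog` value, and the `verbose` destination name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the option list in the updated README."""
    parser = argparse.ArgumentParser(
        prog="torbot",  # assumed program name
        description="Gather and analyze data from Tor sites.",
    )
    # -u is described as mandatory for crawling, but it is left optional here
    # so that flags such as --version or --update can be used on their own.
    parser.add_argument("-u", "--url", help="Website link to crawl")
    parser.add_argument("--depth", type=int, default=1,
                        help="Max depth of crawler (default 1)")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="Display DEBUG level logging, default is INFO")
    parser.add_argument("--version", action="store_true",
                        help="Show current version")
    parser.add_argument("--update", action="store_true",
                        help="Update to the latest stable version")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="Prevent display of header and IP address")
    parser.add_argument("--save", metavar="FORMAT", choices=["tree", "json"],
                        help="Save results in a file")
    parser.add_argument("--visualize", metavar="FORMAT",
                        choices=["tree", "json", "table"],
                        help="Visualize the data gathered")
    parser.add_argument("-i", "--info", action="store_true",
                        help="Display basic info of the scanned site")
    parser.add_argument("--disable-socks5", action="store_true",
                        help="Execute HTTP requests without the SOCKS5 proxy")
    return parser


if __name__ == "__main__":
    # Same flags as the README's example command.
    args = build_parser().parse_args(
        ["-u", "https://www.example.com", "--depth", "2",
         "--visualize", "tree", "--save", "json"]
    )
    print(args)
```

Using `choices` for `--save` and `--visualize` makes the parser reject formats other than those named in the help text, which matches the change from boolean `-s`/`-v` flags to `FORMAT` arguments.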

gotor

Lines changed: 0 additions & 1 deletion
This file was deleted.

poetry.lock

Lines changed: 145 additions & 58 deletions
Some generated files are not rendered by default.
