Nils Magnus 7e6d4ee581 explain how to enter the mentioned venv
there was a line missing how to activate the virtual environment. In the described way it would've affected or harmed the system-wide installation
2024-03-23 23:52:11 +00:00
2023-10-09 16:06:05 +02:00
2023-10-10 09:58:18 +02:00
2023-10-09 14:23:37 +02:00
2023-10-09 14:32:49 +02:00

Help Center Spider

About

This is a spider tool with which you can visit all links on https://docs.otc.t-systems.com to find urls that are not correct.

Requirements

After you cloned the repository you need to prepare an environment to run the tool. You can easily do this with python virtual environment:

$ cd <local_folder>/
$ python -m venv venv/
$ source venv/bin/activate
(venv)$ python -m pip install -r requirements.txt

Configuration

In config.json you can define a couple items:

  • watchdog_file: if you run the tool in the background and want to stop it properly (not using kill), just send an exit message into the watchdog file: echo exit > watchdog.fifo
  • timer_runtime: maximum runtime limit in seconds
  • log_dir: logging folder
  • logging_interval: frequency of dumping log files
  • workers: number of workers (background processes) you want to run. If you set to 0 it will count from the number of cores (number_of_cores - 1)
  • starting_point: base url where to start

How to run

There are two ways to do it

In foreground

$ source venv/bin/activate
$ python main.py

In background

$ source venv/bin/activate
$ nohup python main.py > log/hc_spider.log 2> log/hc_spider.err <&- &

In case you running the tool in background you can stop the execution with $ echo exit > <watchdog_file>

Description
No description provided
Readme 300 KiB
Languages
Python 100%