Mirror of https://github.com/liameno/librengine.git (synced 2024-10-05 18:28:21 +03:00)
Privacy Web Search Engine (not meta, own crawler)
cpp, crawler, encryption, frontend, privacy, robots-txt, rsa, search-engine, self-hosted, spider, starred-liameno-repo, starred-repo, websearch, websearchengine
.github
cli
crawler
images
lib
scripts
website
.gitignore
CMakeLists.txt
config.json
docker-compose.yml
LICENSE
README.md
sites.txt
Privacy Web Search Engine
Website
Features
Crawler
- Multithreading
- Cache
- Robots.txt
- Proxy
- Queue (BFS)
- Detect Trackers
- HTTP -> HTTPS
Website / CLI
- Encryption (rsa)
- API
- Proxy
- Nodes
- Rating
Usage (Docker)
Rebuild the image every time you change the build arguments.
By default the website listens on port 8080 AND goes through a Tor proxy (!!!). To change this, edit config.json and rebuild the website.
The API key for the database must be changed in both places: in the config and in the --api-key argument passed when the database is started.
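The commands in this README use the placeholder key xyz. As a minimal sketch, a random key could be generated like this; the helper name `gen_api_key` and the variable `TYPESENSE_API_KEY` are assumptions for illustration, not part of the project:

```shell
# Hypothetical helper (not part of librengine): print a random
# 64-character hex string built from 32 bytes of /dev/urandom.
gen_api_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Illustrative variable name; use the value instead of "xyz".
TYPESENSE_API_KEY="$(gen_api_key)"
echo "$TYPESENSE_API_KEY"
```

The same value would then go into config.json and into `--api-key="$TYPESENSE_API_KEY"` when the database is started, since the two have to match.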
DB - please start it before the other services
sudo docker pull typesense/typesense:0.24.0.rcn6
mkdir -p /tmp/typesense-data
sudo docker run -p 8108:8108 -v /tmp/typesense-data:/data typesense/typesense:0.24.0.rcn6 --data-dir /data --api-key=xyz
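Since the database has to be up before the crawler or website starts, it may help to wait for it explicitly. The sketch below polls the Typesense `/health` endpoint on the port published above; the function name `wait_for_typesense`, the 30-attempt default, and the one-second delay are assumptions chosen for illustration:

```shell
# Hypothetical helper (not part of librengine): poll the Typesense
# health endpoint until it reports ready, or give up after N attempts.
#   $1 = health URL (default: http://localhost:8108/health)
#   $2 = number of attempts (default: 30)
wait_for_typesense() {
  url="${1:-http://localhost:8108/health}"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors; grep checks the ready flag.
    if curl -fsS "$url" 2>/dev/null | grep -q '"ok":true'; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

One possible use is `wait_for_typesense && sudo docker-compose up crawler`, so the crawler only starts once the database answers.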
Crawler
sudo docker-compose build crawler --build-arg SITES="$(cat sites.txt)" --build-arg THREADS=1 --build-arg CONFIG="$(cat config.json)"
sudo docker-compose up crawler
Website
sudo docker-compose build website --build-arg CONFIG="$(cat config.json)"
sudo docker-compose up website
Instances
¯\_(ツ)_/¯
TODO
- Docker
- Encryption (asymmetric)
- Multithreading crawler
- Robots Rules (from headers & html) & crawl-delay
- Responsive web design
- Own FTS (...)
- Images Crawler
Dependencies
Config
./config.json
Mirrors
https://github.com/liameno/librengine
https://codeberg.org/liameno/librengine
License
GNU Affero General Public License v3.0