Privacy Web Search Engine (not meta, own crawler)

Privacy Web Search Engine

Website

Features

Crawler

  • Multithreading
  • Cache
  • Robots.txt
  • Proxy
  • Queue (BFS)
  • Detect Trackers
  • HTTP -> HTTPS

Website / CLI

  • Encryption (RSA)
  • API
  • Proxy
  • Nodes
  • Rating

Install dependencies

cd scripts && sh install_deps.sh

Build

cd scripts && sh build_all.sh

Run

DB

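mkdir -p /tmp/typesense-data
# Start the server in the background: it keeps running, so a trailing && would never reach the next command
./typesense-server --data-dir=/tmp/typesense-data --api-key=xyz --enable-cors &
# Once the server is accepting requests, initialize the database
sh scripts/init_db.sh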
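
Typesense may need a moment after launch before it accepts requests, so running init_db.sh right away can fail. The sketch below shows one way to wait for the server first. It assumes Typesense listens on its default port 8108 and answers on its /health endpoint. wait_until and WAIT_DELAY are hypothetical names introduced here and are not part of this repository.

```sh
#!/bin/sh
# wait_until ATTEMPTS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it succeeds or ATTEMPTS run out.
# WAIT_DELAY (seconds between attempts, default 1) is also an assumption.
wait_until() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        # Stop waiting as soon as the command succeeds.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "${WAIT_DELAY:-1}"
    done
    # Every attempt failed.
    return 1
}

# Usage (assumes the default Typesense port 8108):
#   ./typesense-server --data-dir=/tmp/typesense-data --api-key=xyz --enable-cors &
#   wait_until 30 curl -fsS http://localhost:8108/health && sh scripts/init_db.sh
```

If the server uses a different port, adjust the URL in the usage lines accordingly.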

Crawler

./crawler ../../sites.txt 5 ../../config.json
# arguments: [sites_path] [threads_count] [config_path]

Website

./website ../../config.json
# arguments: [config_path]

CLI

Start the website before running the CLI.
./cli gnu 1 ../../config.json
# arguments: [query] [page] [config_path]
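
Because the CLI takes the page number as its second argument, fetching several result pages means calling it once per page. A minimal sketch, assuming pages are numbered from 1 as in the example above. search_pages and the CLI variable are hypothetical names, not part of this repository.

```sh
#!/bin/sh
# search_pages QUERY LAST_PAGE CONFIG_PATH
# Hypothetical wrapper: calls the CLI once per page, from page 1 to LAST_PAGE.
# CLI can be set to override the binary path (defaults to ./cli).
search_pages() {
    query=$1
    last_page=$2
    config=$3
    page=1
    while [ "$page" -le "$last_page" ]; do
        # Same argument order as documented: [query] [page] [config path]
        "${CLI:-./cli}" "$query" "$page" "$config"
        page=$((page + 1))
    done
}

# Usage: print the first three result pages for "gnu"
#   search_pages gnu 3 ../../config.json
```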

Instances

¯\_(ツ)_/¯

TODO

  • Encryption (asymmetric)
  • Multithreading crawler
  • Robots Rules (from headers & html) & crawl-delay
  • Responsive web design
  • Own FTS (...)
  • Images Crawler

Dependencies

Config

./config.json

Mirrors

  • https://github.com/liameno/librengine
  • https://codeberg.org/liameno/librengine

License

GNU Affero General Public License v3.0