Mirror of https://github.com/liameno/librengine.git (synced 2024-11-28 01:13:15 +03:00)
Privacy Web Search Engine (not meta, own crawler)
Topics: cpp, crawler, encryption, frontend, privacy, robots-txt, rsa, search-engine, self-hosted, spider, starred-liameno-repo, starred-repo, websearch, websearchengine
Privacy Web Search Engine
Website
Features
Crawler
- Multithreading
- Cache
- Robots.txt
- Proxy
- Queue (BFS)
- Detect Trackers
- HTTP -> HTTPS
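The queue, cache and HTTP -> HTTPS items above can be pictured with a small sketch. This is not the crawler's actual code: `Frontier`, `upgrade_to_https`, `crawl_order` and the `links_of` callback are hypothetical names, and the "cache" here is only an in-memory set of URLs already seen.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: rewrite http:// to https://, leave other URLs untouched.
std::string upgrade_to_https(const std::string &url) {
    const std::string http = "http://";
    if (url.compare(0, http.size(), http) == 0) {
        return "https://" + url.substr(http.size());
    }
    return url;
}

// Hypothetical frontier: a FIFO queue plus a "seen" set that acts as the cache.
class Frontier {
public:
    // Returns true if the (normalized) URL was new and has been queued.
    bool push(const std::string &url) {
        const std::string normalized = upgrade_to_https(url);
        if (!seen_.insert(normalized).second) return false;
        queue_.push(normalized);
        return true;
    }

    bool empty() const { return queue_.empty(); }

    // Pops the oldest queued URL; FIFO order is what makes the crawl breadth-first.
    std::string pop() {
        assert(!queue_.empty());
        std::string url = queue_.front();
        queue_.pop();
        return url;
    }

private:
    std::queue<std::string> queue_;
    std::unordered_set<std::string> seen_;
};

// Assumed callback type: returns the links found on a page.
using LinkFn = std::function<std::vector<std::string>(const std::string &)>;

// Visits up to max_pages URLs in breadth-first order, starting from the seeds.
std::vector<std::string> crawl_order(const std::vector<std::string> &seeds,
                                     const LinkFn &links_of,
                                     std::size_t max_pages) {
    Frontier frontier;
    for (const auto &seed : seeds) frontier.push(seed);

    std::vector<std::string> visited;
    while (!frontier.empty() && visited.size() < max_pages) {
        const std::string url = frontier.pop();
        visited.push_back(url);
        for (const auto &link : links_of(url)) frontier.push(link);
    }
    return visited;
}
```

A real crawler would fetch each page inside `links_of` and share the frontier between threads behind a lock; both are left out here to keep the sketch short.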
Website / CLI
- Encryption (RSA)
- API
- Proxy
- Nodes
- Rating
Install dependencies
cd scripts && sh install_deps.sh
Build
cd scripts && sh build_all.sh
Run
DB
mkdir -p /tmp/typesense-data
./typesense-server --data-dir=/tmp/typesense-data --api-key=xyz --enable-cors
# The server keeps running in the foreground; once it is up, run this from a second terminal:
sh scripts/init_db.sh
Crawler
./crawler ../../sites.txt 5 ../../config.json
# [sites_path] [threads_count] [config_path]
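The three positional arguments could be validated along these lines. This is only a sketch under assumptions: `CrawlerArgs` and `parse_crawler_args` are hypothetical names, C++17 is assumed, and the real crawler may read its arguments differently.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical container for the three positional arguments.
struct CrawlerArgs {
    std::string sites_path;
    unsigned threads_count;
    std::string config_path;
};

// Returns the parsed arguments, or std::nullopt if the argument count is wrong
// or the thread count is not a positive integer.
std::optional<CrawlerArgs> parse_crawler_args(int argc, const char *const argv[]) {
    assert(argv != nullptr);
    if (argc != 4) return std::nullopt;

    const std::string threads = argv[2];
    const char *first = threads.data();
    const char *last = threads.data() + threads.size();

    unsigned count = 0;
    const auto [ptr, ec] = std::from_chars(first, last, count);
    // Reject parse errors, trailing characters ("5x") and a zero thread count.
    if (ec != std::errc() || ptr != last || count == 0) return std::nullopt;

    return CrawlerArgs{argv[1], count, argv[3]};
}
```

With the command shown above, this would yield `sites_path = "../../sites.txt"`, `threads_count = 5` and `config_path = "../../config.json"`.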
Website
./website ../../config.json
# [config_path]
CLI
Start the website before running the CLI.
./cli gnu 1 ../../config.json
# [query] [page] [config_path]
Instances
¯\_(ツ)_/¯
TODO
- Encryption (asymmetric)
- Multithreading crawler
- Robots Rules (from headers & html) & crawl-delay
- Responsive web design
- Own FTS (...)
- Images Crawler
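For the crawl-delay item in the list above, one possible shape is sketched below. It is an illustration, not the project's implementation: `crawl_delay_for`, `trim` and `to_lower` are hypothetical helpers, agent names are matched exactly (real matchers are more lenient), and `Crawl-delay` itself is a non-standard directive that is not part of RFC 9309.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <optional>
#include <sstream>
#include <string>

// Strips leading and trailing whitespace.
std::string trim(const std::string &s) {
    const char *ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Lowercases ASCII letters so directive and agent names compare case-insensitively.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Returns the Crawl-delay (in seconds) from the group naming `agent`,
// falling back to the "*" group, or std::nullopt if neither sets one.
std::optional<double> crawl_delay_for(const std::string &robots_txt, const std::string &agent) {
    assert(!agent.empty());
    const std::string wanted = to_lower(agent);
    std::optional<double> wildcard, specific;
    bool in_wildcard = false, in_specific = false, reading_agents = false;

    std::istringstream in(robots_txt);
    std::string line;
    while (std::getline(in, line)) {
        line = line.substr(0, line.find('#'));  // drop trailing comments
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;
        const std::string key = to_lower(trim(line.substr(0, colon)));
        const std::string value = trim(line.substr(colon + 1));

        if (key == "user-agent") {
            // A User-agent line that follows a rule line starts a new group.
            if (!reading_agents) {
                in_wildcard = in_specific = false;
                reading_agents = true;
            }
            const std::string name = to_lower(value);
            if (name == "*") in_wildcard = true;
            else if (name == wanted) in_specific = true;
            continue;
        }

        reading_agents = false;
        if (key != "crawl-delay") continue;

        char *end = nullptr;
        const double delay = std::strtod(value.c_str(), &end);
        if (end == value.c_str() || delay < 0) continue;  // not a usable number
        if (in_specific && !specific) specific = delay;
        if (in_wildcard && !wildcard) wildcard = delay;
    }
    return specific ? specific : wildcard;
}
```

A crawler thread could sleep for the returned number of seconds between requests to the same host; rules delivered through headers or HTML meta tags would need separate handling.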
Dependencies
Config
./config.json
Mirrors
https://github.com/liameno/librengine
https://codeberg.org/liameno/librengine
License
GNU Affero General Public License v3.0