Privacy Web Search Engine (not meta, own crawler)

Librengine

Privacy Web Search Engine

Website

https://raw.githubusercontent.com/liameno/librengine/master/preview.gif

Donate to web-hosting

Currency Address
Bitcoin (BTC) bc1qxpu9vfzah3vw5pzanny0zmfsgd64klcj24pa8x
Dogecoin (DOGE) DM8cqzbrW2rrmGk4K6UCD7rfeoqnKjJTum
Ethereum (ETH) 0x1857A1A7a543ED123151ACCAbBF4EB058741e614
Litecoin (LTC) LLQMiWpF1cxET7p7UMYoWjJ26JuTp14u8K
Monero (XMR) 4AkPUBr4uoFV1K4fSitpGJjRHo4dfSzZ257YR9HxiQi3DvmgLW1rteRQfRRCFYytKugcygfHAvvJu3Tt96mSoVUE6JKJDZL

Features

  • Crawler
    • Proxy
    • HTTP to HTTPS
    • robots.txt...
  • Website
    • RSA encryption (if JavaScript is enabled)
    • API...
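
The two crawler features named above (upgrading HTTP links to HTTPS and honoring robots.txt) can be illustrated with a short C++ sketch. This is not librengine's actual code: `upgrade_to_https` and `is_allowed_by_robots` are hypothetical names, and the robots.txt check is deliberately simplified (it reads only `User-agent: *` groups and plain `Disallow` prefixes, with no `Allow` rules, wildcards, or per-agent groups).

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: rewrite an http:// URL to https://, leave others untouched.
// The scheme comparison is case-sensitive for brevity.
std::string upgrade_to_https(const std::string &url) {
    const std::string http = "http://";
    if (url.compare(0, http.size(), http) == 0) {
        return "https://" + url.substr(http.size());
    }
    return url;
}

// Strip leading and trailing whitespace (including '\r' from CRLF files).
static std::string trim(const std::string &s) {
    const char *ws = " \t\r\n";
    const auto begin = s.find_first_not_of(ws);
    if (begin == std::string::npos) return "";
    const auto end = s.find_last_not_of(ws);
    return s.substr(begin, end - begin + 1);
}

static std::string lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: returns false if `path` starts with any Disallow prefix
// listed under a "User-agent: *" line, true otherwise.
bool is_allowed_by_robots(const std::string &robots_txt, const std::string &path) {
    std::istringstream in(robots_txt);
    std::string line;
    bool in_star_group = false;
    while (std::getline(in, line)) {
        line = trim(line.substr(0, line.find('#')));  // drop trailing comments
        const auto colon = line.find(':');
        if (colon == std::string::npos) continue;     // not a "key: value" line
        const std::string key = lower(trim(line.substr(0, colon)));
        const std::string value = trim(line.substr(colon + 1));
        if (key == "user-agent") {
            in_star_group = (value == "*");
        } else if (key == "disallow" && in_star_group) {
            // An empty Disallow value means "nothing is disallowed".
            if (!value.empty() && path.compare(0, value.size(), value) == 0) return false;
        }
    }
    return true;
}
```

With these definitions, `upgrade_to_https("http://www.gnu.org")` returns `https://www.gnu.org`, and a robots.txt containing `User-agent: *` followed by `Disallow: /private` makes `is_allowed_by_robots` return false for `/private/x` and true for `/public`.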

TODO

  • Encryption (asymmetric)
  • Robots rules from headers and HTML, crawl-delay
  • Images Crawler
  • Adaptive Website

Dependencies

Arch:

yay -S curl lexbor openssl &&
wget https://dl.typesense.org/releases/0.22.2/typesense-server-0.22.2-linux-amd64.tar.gz &&
tar -zxf typesense-server-0.22.2-linux-amd64.tar.gz

Debian:

sudo apt install libcurl4-openssl-dev &&
wget https://dl.typesense.org/releases/0.22.2/typesense-server-0.22.2-linux-amd64.tar.gz &&
tar -zxf typesense-server-0.22.2-linux-amd64.tar.gz &&
git clone https://github.com/lexbor/lexbor && 
cd lexbor &&
cmake . && make && sudo make install &&
sudo apt install libssl-dev

Build

git clone https://github.com/liameno/librengine &&
cd librengine &&
sh scripts/build_all.sh

Run

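mkdir -p /tmp/typesense-data
./typesense-server --data-dir=/tmp/typesense-data --api-key=xyz --enable-cors &
#the server keeps running, so start it in the background (or in another terminal);
#wait until it is up, then initialize the database
sh scripts/init_db.sh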

Crawler

./crawler https://www.gnu.org ../../config.json
#[start_site] [config path]
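
To give a rough idea of what a depth limit means for a crawl that begins at `[start_site]`, here is a minimal C++ sketch of a breadth-first walk over an in-memory link graph. It assumes that `max_recursive_deep` from the config counts link hops from the start site; that reading is an assumption, not something confirmed by librengine's source. `LinkGraph` and `crawl_order` are hypothetical names, and real fetching, delays, and robots.txt checks are left out.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the web: each URL maps to the links found on it.
using LinkGraph = std::map<std::string, std::vector<std::string>>;

// Visit pages breadth-first from `start`, following links at most
// `max_depth` hops away. Returns the URLs in the order they were visited.
std::vector<std::string> crawl_order(const LinkGraph &links,
                                     const std::string &start,
                                     int max_depth) {
    std::vector<std::string> visited_order;
    std::set<std::string> seen;
    seen.insert(start);
    std::queue<std::pair<std::string, int>> frontier;
    frontier.push(std::make_pair(start, 0));

    while (!frontier.empty()) {
        const std::pair<std::string, int> current = frontier.front();
        frontier.pop();
        visited_order.push_back(current.first);

        if (current.second >= max_depth) continue;  // do not expand deeper
        const auto it = links.find(current.first);
        if (it == links.end()) continue;            // page has no known links
        for (const auto &next : it->second) {
            // insert() reports whether the URL was new, so each page is queued once
            if (seen.insert(next).second) {
                frontier.push(std::make_pair(next, current.second + 1));
            }
        }
    }
    return visited_order;
}
```

With a depth of 0 only the start page is visited; each extra level adds the pages linked from the previous level, each of them once.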

Website

./website ../../config.json
#[config path]

Config

//proxy format: type://ip:port
//example: socks5://127.0.0.1:9050

//_s in a key name - the value is in seconds

{
  "crawler": {
    "user_agent": "librengine",
    "proxy": "socks5://127.0.0.1:9050",
    "load_page_timeout_s": 20,
    "update_time_site_info_s_after": 86400, //1 day (864000 = 10 days)
    "delay_time_s": 3,
    "max_recursive_deep": 2,
    "max_pages_site": 1,
    "max_page_symbols": 50000000, //~50 MB
    "max_robots_txt_symbols": 3000,
    "is_one_site": false,
    "is_http_to_https": true,
    "is_check_robots_txt": true
  },
  "cli": {
    //local=0 | nodes=1(website/backend)
    "mode": 0
  },
  "website": {
    "port": 8080,
    "proxy": "socks5://127.0.0.1:9050",
    "nodes": [ {
        "name": "This",
        "url": "http://127.0.0.1:8080"
      }
    ]
  },
  //also edit scripts/init_db.sh
  "db": {
    "url": "http://localhost:8108",
    "api_key": "xyz"
  }
}
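
The config above is annotated with `//` comments, which strict JSON parsers reject. Whether librengine's own loader tolerates them is not stated here, so if the file is fed to another tool, a small pre-pass like the following C++ sketch can remove them first. `strip_line_comments` is a hypothetical helper, not part of librengine. It tracks string literals so that the `//` inside values such as `"socks5://127.0.0.1:9050"` is left alone.

```cpp
#include <cstddef>
#include <string>

// Remove `//` line comments that appear outside JSON string literals.
// Line breaks are kept so that line numbers in parser errors still match.
std::string strip_line_comments(const std::string &text) {
    std::string out;
    out.reserve(text.size());
    bool in_string = false;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (in_string) {
            out += c;
            if (c == '\\' && i + 1 < text.size()) {
                out += text[++i];   // copy the escaped character verbatim
            } else if (c == '"') {
                in_string = false;  // closing quote
            }
        } else if (c == '"') {
            in_string = true;       // opening quote
            out += c;
        } else if (c == '/' && i + 1 < text.size() && text[i + 1] == '/') {
            // Skip everything up to (but not including) the end of the line.
            while (i < text.size() && text[i] != '\n') ++i;
            if (i < text.size()) out += '\n';
        } else {
            out += c;
        }
    }
    return out;
}
```

The sketch handles only `//` comments, which is the only comment style used in the config shown here.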

License

GNU General Public License v3.0