feat(docs): add architecture docs

2024-09-11 12:05:38 +03:00 · 2024-05-03 16:31:58 +05:30 · 2024-05-03 16:31:58 +05:30 · 205373d676
commit 205373d676
parent 408abd24ea
3 changed files with 30 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -22,6 +22,8 @@ Perplexica is an open-source AI-powered searching tool or an AI-powered search e

 Using SearxNG to stay current and fully open source, Perplexica ensures you always get the most up-to-date information without compromising your privacy.

+Want to know more about its architecture and how it works? You can read it [here](https://github.com/ItzCrazyKns/Perplexica/docs/architecture/README.md).
+
 ## Preview

 ![video-preview](.assets/perplexica-preview.gif)
--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -0,0 +1,9 @@
+## Perplexica's Architecture
+Perplexica's architecture consists of the following key components:
+
+1. **User Interface**: A web-based interface that allows users to interact with Perplexica for searching images, videos, and much more.
+2. **Agent/Chains**: These components predict Perplexica's next actions, understand user queries, and decide whether a web search is necessary.
+3. **SearXNG**: A metasearch engine used by Perplexica to search the web for sources.
+4. **LLMs (Large Language Models)**: Utilized by agents and chains for tasks like understanding content, writing responses, and citing sources. Examples include Claude and the GPT models.
+5. **Embedding Models**: To improve the accuracy of search results, the results are re-ranked by comparing embeddings with similarity measures such as cosine similarity and dot product.
+For a more detailed explanation of how these components work together, see [WORKING.md](https://github.com/ItzCrazyKns/Perplexica/docs/architecture/WORKING.md).
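
The re-ranking described in item 5 can be illustrated with a minimal sketch. This is not Perplexica's implementation: the function names (`dot`, `cosine_similarity`, `rerank`) and the two-dimensional toy vectors are assumptions made for illustration, and real embeddings would come from an embedding model.

```python
import math
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector lengths."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot(a, b) / (norm_a * norm_b)


def rerank(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the top_k documents, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings, chosen by hand for the example.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranking = rerank(query, docs, top_k=2)
```

Cosine similarity ignores vector length, so `[2.0, 0.0]` scores the same as `[1.0, 0.0]` would. A plain dot product would reward longer vectors; the two measures agree when the embeddings are normalized to unit length.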
--- a/docs/architecture/WORKING.md
+++ b/docs/architecture/WORKING.md
@@ -0,0 +1,19 @@
+## How does Perplexica work?
+
+Curious about how Perplexica works? Don't worry, we'll cover it here. Before we begin, make sure you've read about the architecture of Perplexica to ensure you understand what it's made up of. Haven't read it? You can read it [here](https://github.com/ItzCrazyKns/Perplexica/docs/architecture/README.md).
+
+To see how Perplexica works, consider a scenario where a user asks: "How does an A.C. work?". The process breaks down into the following steps:
+
+1. The message is sent over a WebSocket (WS) connection to the backend server, where it invokes the chain. Which chain is invoked depends on your focus mode. For this example, let's assume we use the "webSearch" focus mode.
+2. The chain is now invoked. The message is first passed to another chain, which uses the chat history and the question to predict whether sources or a web search are needed. If so, it generates a search query (in accordance with the chat history), which is used in the next step. If not, the chain ends there, and the answer generator chain, also known as the response generator, is started.
+3. The query returned by the first chain is passed to SearXNG to search the web for information.
+4. The information retrieved at this point comes from a keyword-based search. We then convert both the retrieved information and the query into embeddings and perform a similarity search to find the sources most relevant to the query.
+5. After all this is done, the sources are passed to the response generator. This chain takes the full chat history, the query, and the sources, and generates a response that is streamed to the UI.
+
+### How are the answers cited?
+
+The LLMs are prompted to cite their sources. The prompts lead them to include the citations in the answer themselves, and the UI then renders those citations for the user.
+
+### Image and Video Search
+
+Image and video searches are conducted in a similar manner. A query is always generated first, then we search the web for images and videos that match the query. These results are then returned to the user.
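
Steps 2 to 5 of the "webSearch" walkthrough in WORKING.md can be sketched as a small pipeline. Everything here is a hypothetical stand-in rather than Perplexica's code: `answer`, `fake_rephrase`, `fake_search`, and `fake_generate` are invented names, the search stub returns canned results instead of calling SearXNG, and `embed` is a toy word-count vector rather than a real embedding model.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Optional

Source = Dict[str, str]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing words
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def answer(
    message: str,
    history: List[str],
    rephrase: Callable[[str, List[str]], Optional[str]],
    search: Callable[[str], List[Source]],
    generate: Callable[[str, List[str], List[Source]], str],
    top_k: int = 2,
) -> str:
    # Step 2: decide whether a search is needed and, if so, produce a query.
    query = rephrase(message, history)
    if query is None:
        return generate(message, history, [])
    # Step 3: keyword-based web search (SearXNG in Perplexica, stubbed here).
    results = search(query)
    # Step 4: re-rank the results by similarity to the query.
    q_vec = embed(query)
    ranked = sorted(
        results, key=lambda r: cosine(q_vec, embed(r["content"])), reverse=True
    )
    # Step 5: hand the best sources to the response generator.
    return generate(message, history, ranked[:top_k])


def fake_rephrase(message: str, history: List[str]) -> Optional[str]:
    # Assumption: greetings need no sources; anything else is searched as-is.
    return None if message.lower().strip() in {"hi", "hello"} else message


def fake_search(query: str) -> List[Source]:
    # Canned results; a real search would query a SearXNG instance.
    return [
        {"title": "Gardening tips", "content": "how to water tomato plants"},
        {"title": "Air conditioning",
         "content": "how does an air conditioner work refrigerant cycle"},
        {"title": "AC basics", "content": "an a.c. moves heat outdoors"},
    ]


def fake_generate(message: str, history: List[str], sources: List[Source]) -> str:
    cited = ", ".join(f"[{i}] {s['title']}" for i, s in enumerate(sources, 1))
    return f"Answer to {message!r} using: {cited or 'no sources'}"


reply = answer("How does an A.C. work?", [], fake_rephrase, fake_search, fake_generate)
greeting = answer("hi", [], fake_rephrase, fake_search, fake_generate)
```

Passing the three stages in as callables keeps the control flow visible: when `rephrase` returns `None`, the search and re-ranking steps are skipped and the response generator runs without sources, mirroring the "If not" branch of step 2.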