mirror of https://github.com/QuivrHQ/quivr.git, synced 2024-12-14 17:03:29 +03:00
Feat/docs rework (#1525)
# Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

## Checklist before requesting a review

Please delete options that are not relevant.

- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented hard-to-understand areas
- [ ] I have ideally added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged

## Screenshots (if appropriate):
This commit is contained in:
parent
6323931a8b
commit
90ee40f9f2
253
docs/docs/Developers/run_fully_local.md
Normal file
@@ -0,0 +1,253 @@
---
sidebar_position: 2
title: Using Quivr fully locally
---

# Using Quivr fully locally

The following is a guide to set up everything for using Quivr locally:

##### Table of Contents

* [Database](#database)
* [Embeddings](#embeddings)
* [LLM for inference](#llm)

This is a first, working setup, but a lot of work remains to be done, e.g. finding the appropriate settings for the model.

Importantly, this will currently only work on tag v0.0.46.

The guide was put together in collaboration with members of the Quivr Discord, in the **Using Quivr fully locally** thread. That is a good place to discuss it.

This worked for me, but I sometimes got strange results (the output contains repeating answers/questions), possibly because `stopping_criteria=stopping_criteria` must be uncommented in `transformers.pipeline`. I will update this page as I continue learning.
<a name="database"/>

## Local Supabase

Instead of relying on a remote Supabase instance, we have to set it up locally. Follow the instructions at https://supabase.com/docs/guides/self-hosting/docker.

Troubleshooting:

* If the Quivr backend container cannot reach Supabase on port 8000, change the Quivr backend container to use the host network.
* If the email service does not work, add a user using the Supabase web UI at http://localhost:8000/project/default/auth/users and check "Auto Confirm User?".
<a name="embeddings"/>

## Local embeddings

First, let's get local embeddings to work with GPT4All. Instead of relying on OpenAI to generate embeddings of both the prompt and the documents we upload, we will use a local model for this.

Remove any existing data from the Postgres database:

* `supabase/docker $ docker compose down -v`
* `supabase/docker $ rm -rf volumes/db/data/`
* `supabase/docker $ docker compose up -d`
Change the vector dimensions in the necessary Quivr SQL files:

* Replace all occurrences of 1536 with 768 in Quivr's `scripts/tables.sql` (see the dimension check after this list)
* Run `tables.sql` in the Supabase web UI SQL editor: http://localhost:8000
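Why 768? That is the output dimension of the `all-mpnet-base-v2` embedding model used below, whereas OpenAI's `text-embedding-ada-002` produces 1536-dimensional vectors. A minimal check, assuming `langchain` and `sentence-transformers` are installed:

```python
from langchain.embeddings import HuggingFaceEmbeddings

# all-mpnet-base-v2 produces 768-dimensional vectors, which is why the
# pgvector columns in tables.sql are changed from 1536 to 768
emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
print(len(emb.embed_query("dimension check")))  # expected: 768
```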
Change the Quivr code to use the local LLM (GPT4All) and local embeddings:

* add this code to `backend/core/llm/private_gpt4all.py`:
```python
from langchain.embeddings import HuggingFaceEmbeddings
...

def embeddings(self) -> HuggingFaceEmbeddings:
    # Generate embeddings locally on the GPU instead of calling OpenAI
    emb = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2",
        model_kwargs={'device': 'cuda'},
        encode_kwargs={'normalize_embeddings': False}
    )
    return emb
```
Note that there may be better models for generating the embeddings: https://huggingface.co/spaces/mteb/leaderboard

Update the Private LLM Variables in Quivr's `backend/core/.env`:
```
#Private LLM Variables
PRIVATE=True
MODEL_PATH=./local_models/ggml-gpt4all-j-v1.3-groovy.bin
```
Download the GPT4All model:

* `$ cd backend/core/local_models/`
* `$ wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin`
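Before wiring it into Quivr, you can sanity-check the downloaded weights. A minimal sketch, assuming the `gpt4all` Python bindings are installed (LangChain's wrapper loads them under the hood):

```python
from langchain.llms import GPT4All

# Point LangChain's GPT4All wrapper at the downloaded weights
llm = GPT4All(model="./local_models/ggml-gpt4all-j-v1.3-groovy.bin")
print(llm("The capital of France is"))
```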
Ensure the Quivr backend Docker container has CUDA and the GPT4All package:
```
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel
#FROM python:3.11-bullseye

ARG DEBIAN_FRONTEND=noninteractive
ENV DEBIAN_FRONTEND=noninteractive

RUN pip install gpt4all
```
Modify the docker-compose YAML file for the backend container. The following example is for using 2 GPUs:
```
...
network_mode: host
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 2
          capabilities: [gpu]
```
Install the NVIDIA Container Toolkit on the host (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html):
```
$ wget https://nvidia.github.io/nvidia-docker/gpgkey --no-check-certificate
$ sudo apt-key add gpgkey
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update

$ sudo apt-get install -y nvidia-container-toolkit

$ nvidia-ctk --version

$ sudo systemctl restart docker
```
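After restarting Docker and recreating the container, it's worth confirming that PyTorch inside the backend container actually sees the GPUs. A quick check from a Python shell in the container:

```python
import torch

# Both should succeed if the NVIDIA runtime is wired up correctly
print(torch.cuda.is_available())   # expected: True
print(torch.cuda.device_count())   # expected: 2 with the compose file above
```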
At this point, if we try to upload a PDF, we get an error:
```
backend-core | 1989-01-01 21:51:41,211 [ERROR] utils.vectors: Error creating vector for document {'code': '22000', 'details': None, 'hint': None, 'message': 'expected 768 dimensions, not 1536'}
```
This can be remedied by also using local embeddings for document embeddings. In `backend/core/utils/vectors.py`, replace the `create_vector` method:
```python
# def create_vector(self, doc, user_openai_api_key=None):
#     logger.info("Creating vector for document")
#     logger.info(f"Document: {doc}")
#     if user_openai_api_key:
#         self.commons["documents_vector_store"]._embedding = OpenAIEmbeddings(
#             openai_api_key=user_openai_api_key
#         )  # pyright: ignore reportPrivateUsage=none
#     try:
#         sids = self.commons["documents_vector_store"].add_documents([doc])
#         if sids and len(sids) > 0:
#             return sids
#     except Exception as e:
#         logger.error(f"Error creating vector for document {e}")

def create_vector(self, doc, user_openai_api_key=None):
    logger.info("Creating vector for document")
    logger.info(f"Document: {doc}")
    # Always embed documents with the local HuggingFace model so the vector
    # dimensions match the 768-dimensional database schema
    self.commons["documents_vector_store"]._embedding = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2",
        model_kwargs={'device': 'cuda'},
        encode_kwargs={'normalize_embeddings': False}
    )  # pyright: ignore reportPrivateUsage=none
    logger.info('||| creating embedding')
    try:
        sids = self.commons["documents_vector_store"].add_documents([doc])
        if sids and len(sids) > 0:
            return sids
    except Exception as e:
        logger.error(f"Error creating vector for document {e}")
```
<a name="llm"/>

## Local LLM

The final step is to use a local model from Hugging Face for inference. (The HF token is optional; it is only required for certain models on Hugging Face.)

Update the Quivr backend Dockerfile:
```
ENV HUGGINGFACEHUB_API_TOKEN=hf_XXX

RUN pip install accelerate
```
Update the `private_gpt4all.py` file as follows:
```python
import langchain
langchain.debug = True
langchain.verbose = True

import os
import transformers
from langchain.llms import HuggingFacePipeline
from langchain.embeddings import HuggingFaceEmbeddings
...

model_id = "stabilityai/StableBeluga-13B"
...

def _create_llm(
    self,
    model,
    streaming=False,
    callbacks=None,
) -> BaseLLM:
    """
    Override the _create_llm method to enforce the use of a private model.
    :param model: Language model name to be used.
    :param streaming: Whether to enable streaming of the model
    :param callbacks: Callbacks to be used for streaming
    :return: Language model instance
    """

    model_path = self.model_path

    logger.info("Using private model: %s", model)
    logger.info("Streaming is set to %s", streaming)
    logger.info("--- model %s", model)
    logger.info("--- model path %s", model_path)

    model_id = "stabilityai/StableBeluga-13B"

    # The HF token set in the Dockerfile; only needed for gated models
    hf_auth = os.environ.get("HUGGINGFACEHUB_API_TOKEN")

    # Load the model in 4-bit so it fits in GPU memory; device_map='auto'
    # spreads the layers across the available GPUs
    llm = transformers.AutoModelForCausalLM.from_pretrained(
        model_id,
        use_cache=True,
        load_in_4bit=True,
        device_map='auto',
        # use_auth_token=hf_auth
    )
    logger.info('<<< transformers.AutoModelForCausalLM.from_pretrained')

    llm.eval()
    logger.info('<<< eval')

    tokenizer = transformers.AutoTokenizer.from_pretrained(
        model_id,
        use_auth_token=hf_auth
    )
    logger.info('<<< transformers.AutoTokenizer.from_pretrained')

    generate_text = transformers.pipeline(
        model=llm,
        tokenizer=tokenizer,
        return_full_text=True,  # langchain expects the full text
        task='text-generation',
        # we pass model parameters here too
        # stopping_criteria=stopping_criteria,  # without this the model rambles during chat
        temperature=0.5,  # 'randomness' of outputs, 0.0 is the min and 1.0 the max
        max_new_tokens=512,  # max number of tokens to generate in the output
        repetition_penalty=1.1  # without this the output begins repeating
    )
    logger.info('<<< generate_text = transformers.pipeline(')

    result = HuggingFacePipeline(pipeline=generate_text)

    logger.info("<<< created llm HuggingFace")
    return result
```
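The `transformers.pipeline` parameters are worth experimenting with in isolation from Quivr. Here is a standalone sketch of the same loading path; it uses `gpt2` purely as a small stand-in so it runs on modest hardware (an assumption for illustration, the guide itself uses `stabilityai/StableBeluga-13B`):

```python
import transformers
from langchain.llms import HuggingFacePipeline

model_id = "gpt2"  # small stand-in model; the guide uses stabilityai/StableBeluga-13B
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
model = transformers.AutoModelForCausalLM.from_pretrained(model_id)

generate_text = transformers.pipeline(
    task='text-generation',
    model=model,
    tokenizer=tokenizer,
    return_full_text=True,   # langchain expects the full text
    max_new_tokens=64,
    repetition_penalty=1.1,  # discourages the output from repeating itself
)

llm = HuggingFacePipeline(pipeline=generate_text)
print(llm("Quivr is a"))
```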
12
docs/docs/User_Guide/what-is-a-brain.md
Normal file
@@ -0,0 +1,12 @@
---
title: Concept of Brain
---

:::info
A few brains were harmed in the making of this documentation 🤯😏
:::

A **brain** is a concept that we created to allow you to **create** and **organize** your knowledge in Quivr.
@@ -3,32 +3,71 @@ sidebar_position: 1
title: 🚀 Welcome to Quivr
---

# Intro

Quivr, your second brain, utilizes the power of GenerativeAI to store and retrieve unstructured information. Think of it as Obsidian, but turbocharged with AI capabilities.

## Welcome to Quivr 🧠

## Key Features 🎯

[Quivr](https://quivr.app) is your **Second Brain** that can act as your **personal assistant**.

- **Universal Data Acceptance**: Quivr can handle almost any type of data you throw at it. Text, images, code snippets, we've got you covered.
- **Generative AI**: Quivr employs advanced AI to assist you in generating and retrieving information.
- **Fast and Efficient**: Designed with speed and efficiency at its core. Quivr ensures rapid access to your data.
- **Secure**: Your data, your control. Always.
- **File Compatibility**:
  - Text
  - Markdown
  - PDF
  - PowerPoint
  - Excel
  - Word
  - Audio
  - Video
- **Open Source**: Freedom is beautiful, and so is Quivr. Open source and free to use.

It can:

- **Answer questions** about the **files** that you uploaded
- **Interact** with the **applications** that you connected to Quivr (Soon 🚀)

## Demo Highlights 🎥

:::info
**Our goal** is to make Quivr the **best open-source personal assistant** that is powered by your knowledge and your applications 🔥
It will remain open-source.
:::

### **Demo**:

## What does it do ?

> A video equals a thousand words

<div style={{ textAlign: 'center' }}>
  <video width="640" height="480" controls>
    <source src="https://quivr-cms.s3.eu-west-3.amazonaws.com/singlestore_demo_quivr_232893659c.mp4" type="video/mp4"/>
    Your browser does not support the video tag.
  </video>
</div>

## How to get started ? 👀

:::tip
It takes less than **5 seconds** to get started with Quivr. You can even use your Google account to sign up.
:::

- Create an account on [Quivr](https://quivr.app)
- Upload your files
- Ask questions to Quivr

<div style={{ textAlign: 'center' }}>
  <img src="/img/homepage.png" alt="Quivr home page" style={{ width: '60%' }} />
</div>

## How can you leverage Quivr ? 🚀

You can use Quivr to:

- **Search** information in your files
- **Cross-search** information from your files
- **Store** all your knowledge in one place
- **Share** your knowledge with your team by leveraging the **collaboration** features

## Features 🤗

### Public & Private Brains

You can create **public** and **private** brains.

- **Public** brains are **searchable** by everyone on Quivr
- **Private** brains are **only searchable** by you

You can share your brains with a set of people by using their emails. This means that you can **collaborate** with your team on a brain without making it public.

### Ask Questions

You can **ask questions** to Quivr. Quivr will **answer** your questions by leveraging the **knowledge** that you uploaded to Quivr.

### Custom Personality

You can **customize** the personality of Quivr by changing the **prompt** of your brain. You could tell your brain to always answer with a **funny** or **serious** tone, or act like a **robot** or a **human**.

<video width="640" height="480" controls>
  <source src="https://github.com/StanGirard/quivr/assets/19614572/a6463b73-76c7-4bc0-978d-70562dca71f5" type="video/mp4"/>
  Your browser does not support the video tag.
</video>
BIN
docs/static/img/homepage.png
vendored
Normal file
Binary file not shown. After: Size: 888 KiB
@@ -44,11 +44,11 @@
"how_to_use_quivr": "How to use Quivr ?",
"what_is_quivr": "What is Quivr ?",
"what_is_brain": "What is a brain ?",
"answer":{
"how_to_use_quivr": "Check the documentation https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Check the documentation https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr is a helpful assistant.",
"what_is_brain": "A brain contains knowledge"
}
},
"welcome":"Welcome"
}
"welcome": "Welcome"
}
@@ -44,11 +44,11 @@
"how_to_use_quivr": "¿Cómo usar Quivr?",
"what_is_quivr": "¿Qué es Quivr?",
"what_is_brain": "¿Qué es un cerebro?",
"answer":{
"how_to_use_quivr": "Consulta la documentación en https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Consulta la documentación en https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr es un asistente útil.",
"what_is_brain": "Un cerebro contiene conocimiento."
}
},
"welcome": "Bienvenido"
}
}
@@ -44,11 +44,11 @@
"how_to_use_quivr": "Comment utiliser Quivr ?",
"what_is_quivr": "Qu'est-ce que Quivr ?",
"what_is_brain": "Qu'est-ce qu'un cerveau ?",
"answer":{
"how_to_use_quivr": "Consultez la documentation sur https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Consultez la documentation sur https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr est un assistant utile.",
"what_is_brain": "Un cerveau contient des connaissances."
}
},
"welcome": "Bienvenue"
}
}
@@ -44,11 +44,11 @@
"how_to_use_quivr": "Como usar o Quivr?",
"what_is_quivr": "O que é o Quivr?",
"what_is_brain": "O que é um cérebro?",
"answer":{
"how_to_use_quivr": "Consulte a documentação em https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Consulte a documentação em https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr é um assistente útil.",
"what_is_brain": "Um cérebro contém conhecimento."
}
},
"welcome": "Bem-vindo"
}
}
@@ -44,11 +44,11 @@
"how_to_use_quivr": "Как использовать Quivr?",
"what_is_quivr": "Что такое Quivr?",
"what_is_brain": "Что такое мозг?",
"answer":{
"how_to_use_quivr": "Проверьте документацию на https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Проверьте документацию на https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr - полезный ассистент.",
"what_is_brain": "Мозг содержит знания."
}
},
"welcome": "Добро пожаловать"
}
}
@@ -45,11 +45,11 @@
"how_to_use_quivr": "如何使用Quivr?",
"what_is_quivr": "什么是Quivr?",
"what_is_brain": "什么是大脑?",
"answer":{
"how_to_use_quivr": "查看文档https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "查看文档https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr是一个有用的助手。",
"what_is_brain": "大脑包含知识。"
}
},
"welcome": "欢迎来到"
}
}