Feat/docs rework (#1525)

# Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

## Checklist before requesting a review

Please delete options that are not relevant.

- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented hard-to-understand areas
- [ ] I have ideally added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged

## Screenshots (if appropriate):
2024-12-25 12:22:58 +03:00 · 2023-10-30 20:53:39 +01:00 · 2023-10-30 20:53:39 +01:00 · 90ee40f9f2
commit 90ee40f9f2
parent 6323931a8b
10 changed files with 346 additions and 42 deletions
--- a/docs/docs/Developers/run_fully_local.md
+++ b/docs/docs/Developers/run_fully_local.md
@ -0,0 +1,253 @@
 ---
 sidebar_position: 2
 title: Using Quivr fully locally
 ---
 # Using Quivr fully locally
 ## Headers
 The following is a guide to set up everything for using Quivr locally:
 ##### Table of Contents  
 * [Database](#database)
 * [Embeddings](#embeddings)
 * [LLM for inference](#llm)
 This is a first working setup; more work is needed, for example to find appropriate settings for the model.
 Importantly, this will currently only work on tag v0.0.46.
 The guide was put together in collaboration with members of the Quivr Discord, **Using Quivr fully locally** thread. That is a good place to discuss it. 
 This worked for me, but I sometimes got strange results (the output contains repeated answers and questions). This may be because `stopping_criteria=stopping_criteria` is commented out in the `transformers.pipeline` call and needs to be enabled. I will update this page as I continue learning.
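The repeated answers and questions mentioned above can also be trimmed after generation. The following is a minimal sketch of that workaround, not the `stopping_criteria` mechanism of `transformers`: the function name `truncate_at_stop` and the example stop strings are assumptions made for illustration, and it should be applied to the generated continuation only, not to the prompt.

```python
from typing import List


def truncate_at_stop(text: str, stop_sequences: List[str]) -> str:
    """Cut `text` at the earliest occurrence of any stop sequence.

    Hypothetical helper: it post-processes generated text and does not
    stop the model from generating the extra tokens.
    """
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


# Assumed stop strings; the right ones depend on the prompt format of the model.
generated = "Paris is the capital of France.\nQuestion: What is the capital of Spain?\nAnswer: Madrid."
answer = truncate_at_stop(generated, ["\nQuestion:", "\nHuman:"])
print(answer)
```

This only hides the repetition; the model still spends time generating the discarded tokens, so a real stopping criterion remains preferable once it is configured.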
 <a name="database"/>
 ## Local Supabase
 Instead of relying on a remote Supabase instance, we have to set it up locally. Follow the instructions on https://supabase.com/docs/guides/self-hosting/docker.
 Troubleshooting:
 * If the Quivr backend container cannot reach Supabase on port 8000, change the Quivr backend container to use the host network.
 * If the email service does not work, add a user through the Supabase web UI and check "Auto Confirm User?".
  * http://localhost:8000/project/default/auth/users
 <a name="embeddings"/>
 ## Local embeddings
 First, let's get local embeddings to work with GPT4All. Instead of relying on OpenAI for generating embeddings of both the prompt and the documents we upload, we will use a local LLM for this.
 Remove any existing data from the postgres database:
 * `supabase/docker $ docker compose down -v`
 * `supabase/docker $ rm -rf volumes/db/data/`
 * `supabase/docker $ docker compose up -d`
 Change the vector dimensions in the necessary Quivr SQL files:
 * Replace all occurrences of 1536 with 768 in Quivr's `scripts/tables.sql`
 * Run `tables.sql` in the Supabase web UI SQL editor: http://localhost:8000
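The dimension replacement above can be scripted instead of done by hand. This is a minimal sketch, assuming the number appears in `vector(1536)` type declarations; the function name `set_vector_dimension` and the sample SQL are hypothetical and not taken from Quivr's actual `tables.sql`.

```python
import re


def set_vector_dimension(sql: str, old: int = 1536, new: int = 768) -> str:
    """Rewrite vector(<old>) declarations to vector(<new>).

    Only the number inside a vector(...) declaration is changed, so an
    unrelated literal 1536 elsewhere in the file is left alone.
    """
    pattern = re.compile(rf"\bvector\s*\(\s*{old}\s*\)", re.IGNORECASE)
    return pattern.sub(lambda m: m.group(0).replace(str(old), str(new)), sql)


# Hypothetical SQL fragment used only to demonstrate the rewrite.
sample_sql = (
    "CREATE TABLE IF NOT EXISTS vectors (embedding vector(1536));\n"
    "CREATE FUNCTION match_vectors(query_embedding VECTOR (1536), match_count INT)"
)
updated_sql = set_vector_dimension(sample_sql)
print(updated_sql)
```

Review the result before running it in the SQL editor, since the guide asks for all occurrences of 1536 to be replaced and this sketch only covers the `vector(...)` form.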
 Change the Quivr code to use local LLM (GPT4All) and local embeddings:
 * Add the following code to `backend/core/llm/private_gpt4all.py`:
 ```python
    from langchain.embeddings import HuggingFaceEmbeddings
    ...
    def embeddings(self) -> HuggingFaceEmbeddings:
        emb = HuggingFaceEmbeddings(
            model_name="sentence-transformers/all-mpnet-base-v2",
            model_kwargs={'device': 'cuda'},
            encode_kwargs={'normalize_embeddings': False}
        )
        return emb
 ```
 Note that there may be better models out there for generating the embeddings: https://huggingface.co/spaces/mteb/leaderboard
 Update Quivr `backend/core/.env`'s Private LLM Variables:
 ```
    #Private LLM Variables
    PRIVATE=True
    MODEL_PATH=./local_models/ggml-gpt4all-j-v1.3-groovy.bin
 ```
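A quick way to confirm that the two variables above are set is to parse the file before starting the backend. This is a minimal sketch under stated assumptions: the helper names `parse_env` and `check_private_settings` are hypothetical, and the parser handles only simple `KEY=VALUE` lines and `#` comments.

```python
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_private_settings(values: Dict[str, str]) -> List[str]:
    """Return a list of problems with the private LLM settings."""
    problems: List[str] = []
    if values.get("PRIVATE", "").lower() != "true":
        problems.append("PRIVATE is not set to True")
    if not values.get("MODEL_PATH"):
        problems.append("MODEL_PATH is missing")
    return problems


env_text = """#Private LLM Variables
PRIVATE=True
MODEL_PATH=./local_models/ggml-gpt4all-j-v1.3-groovy.bin
"""
settings = parse_env(env_text)
print(check_private_settings(settings))
```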
 Download GPT4All model:
 * `$ cd backend/core/local_models/`
 * `wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin`
 Ensure the Quivr backend docker container has CUDA and the GPT4All package:
 ```
    FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel
    #FROM python:3.11-bullseye
    ARG DEBIAN_FRONTEND=noninteractive
    ENV DEBIAN_FRONTEND=noninteractive
    RUN pip install gpt4all
 ```
 Modify the docker-compose yml file (for backend container). The following example is for using 2 GPUs:
 ```
    ...
    network_mode: host
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 2
              capabilities: [gpu]
 ```
 Install the NVIDIA Container Toolkit on the host (see https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html):
 ```
 $ wget https://nvidia.github.io/nvidia-docker/gpgkey --no-check-certificate
 $ sudo apt-key add gpgkey
 $ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
 $ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
 $ sudo apt-get update
 $ sudo apt-get install -y nvidia-container-toolkit
 $ nvidia-ctk --version
 $ sudo systemctl restart docker
 ```
 At this point, if we try to upload a PDF, we get an error:
 ```
 backend-core  | 1989-01-01 21:51:41,211 [ERROR] utils.vectors: Error creating vector for document {'code': '22000', 'details': None, 'hint': None, 'message': 'expected 768 dimensions, not 1536'}
 ```
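The error comes from a size mismatch: the table now expects 768-dimensional vectors, while the uploaded document was presumably still embedded with the OpenAI model, whose vectors have 1536 dimensions. The check below is a minimal sketch of that comparison; the function name `check_embedding_dimensions` is hypothetical and the vectors are dummy data, not real embeddings.

```python
from typing import Sequence

# Dimension configured in tables.sql after the change described above.
EXPECTED_DIMENSIONS = 768


def check_embedding_dimensions(
    vector: Sequence[float], expected: int = EXPECTED_DIMENSIONS
) -> None:
    """Raise ValueError if the vector length differs from the column size."""
    actual = len(vector)
    if actual != expected:
        raise ValueError(f"expected {expected} dimensions, not {actual}")


# A 768-dimensional vector (the size all-mpnet-base-v2 produces) passes.
check_embedding_dimensions([0.0] * 768)

# A 1536-dimensional vector reproduces the message seen in the log.
try:
    check_embedding_dimensions([0.0] * 1536)
except ValueError as err:
    message = str(err)
    print(message)
```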
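 This can be remedied by using local embeddings for document embeddings. In `backend/core/utils/vectors.py`, replace the original `create_vector` method (shown commented out below) with the version that uses `HuggingFaceEmbeddings`: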
 ```python
    from langchain.embeddings import HuggingFaceEmbeddings
    ...
    # def create_vector(self, doc, user_openai_api_key=None):
    #     logger.info("Creating vector for document")
    #     logger.info(f"Document: {doc}")
    #     if user_openai_api_key:
    #         self.commons["documents_vector_store"]._embedding = OpenAIEmbeddings(
    #             openai_api_key=user_openai_api_key
    #         )  # pyright: ignore reportPrivateUsage=none
    #     try:
    #         sids = self.commons["documents_vector_store"].add_documents([doc])
    #         if sids and len(sids) > 0:
    #             return sids
    #     except Exception as e:
    #         logger.error(f"Error creating vector for document {e}")
    def create_vector(self, doc, user_openai_api_key=None):
        logger.info("Creating vector for document")
        logger.info(f"Document: {doc}")
        self.commons["documents_vector_store"]._embedding = HuggingFaceEmbeddings(
            model_name="sentence-transformers/all-mpnet-base-v2",
            model_kwargs={'device': 'cuda'},
            encode_kwargs={'normalize_embeddings': False}
        )  # pyright: ignore reportPrivateUsage=none
        logger.info('||| creating embedding')
        try:
            sids = self.commons["documents_vector_store"].add_documents([doc])
            if sids and len(sids) > 0:
                return sids
        except Exception as e:
            logger.error(f"Error creating vector for document {e}")
 ```
 <a name="llm"/>
 ## Local LLM
 The final step is to use a local model from HuggingFace for inference. (The HF token is optional, only required for certain models on HF.)
 Update the Quivr backend dockerfile:
 ```
    ENV HUGGINGFACEHUB_API_TOKEN=hf_XXX
    RUN pip install accelerate
 ```
 Update the `private_gpt4all.py` file as follows:
 ```python
    import langchain
    langchain.debug = True
    langchain.verbose = True
    import os
    import transformers
    from langchain.llms import HuggingFacePipeline
    from langchain.embeddings import HuggingFaceEmbeddings
    ...
    model_id = "stabilityai/StableBeluga-13B"    
    ...
    def _create_llm(
        self,
        model,
        streaming=False,
        callbacks=None,
    ) -> BaseLLM:
        """
        Override the _create_llm method to enforce the use of a private model.
        :param model: Language model name to be used.
        :param streaming: Whether to enable streaming of the model
        :param callbacks: Callbacks to be used for streaming
        :return: Language model instance
        """
        model_path = self.model_path
        logger.info("Using private model: %s", model)
        logger.info("Streaming is set to %s", streaming)
        logger.info("--- model  %s",model)
        logger.info("--- model path %s",model_path)
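        model_id = "stabilityai/StableBeluga-13B"
        # Optional Hugging Face token, read from the variable set in the Dockerfile (None if unset)
        hf_auth = os.environ.get("HUGGINGFACEHUB_API_TOKEN")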
        llm = transformers.AutoModelForCausalLM.from_pretrained(
            model_id,
            use_cache=True,
            load_in_4bit=True,
            device_map='auto',
            #use_auth_token=hf_auth
        )
        logger.info('<<< transformers.AutoModelForCausalLM.from_pretrained')
        llm.eval()
        logger.info('<<< eval')
        tokenizer = transformers.AutoTokenizer.from_pretrained(
            model_id,
            use_auth_token=hf_auth
        )
        logger.info('<<< transformers.AutoTokenizer.from_pretrained')
        generate_text = transformers.pipeline(
            model=llm, tokenizer=tokenizer,
            return_full_text=True,  # langchain expects the full text
            task='text-generation',
            # we pass model parameters here too
            #stopping_criteria=stopping_criteria,  # without this model rambles during chat
            temperature=0.5,  # 'randomness' of outputs; lower values give more deterministic output
            max_new_tokens=512,  # max number of tokens to generate in the output
            repetition_penalty=1.1  # without this output begins repeating
        )
        logger.info('<<< generate_text = transformers.pipeline(')
        result = HuggingFacePipeline(pipeline=generate_text)
        logger.info('<<< HuggingFacePipeline(pipeline=generate_text)')
        logger.info("<<< created llm HuggingFace")
        return result
 ```
--- a/docs/docs/User_Guide/what-is-a-brain.md
+++ b/docs/docs/User_Guide/what-is-a-brain.md
@ -0,0 +1,12 @@
 ---
 title: Concept of Brain
 ---
 :::info
 A few brains were harmed in the making of this documentation 🤯😏
 :::
 A **brain** is a concept that we created to allow you to **create** and **organize** your knowledge in Quivr. 
--- a/docs/docs/intro.md
+++ b/docs/docs/intro.md
@ -3,32 +3,71 @@ sidebar_position: 1
 title: 🚀 Welcome to Quivr
 ---
 # Intro
-Quivr, your second brain, utilizes the power of GenerativeAI to store and retrieve unstructured information. Think of it as Obsidian, but turbocharged with AI capabilities.
+## Welcome to Quivr 🧠
-## Key Features 🎯
+[Quivr](https://quivr.app) is your **Second Brain** that can act as your **personal assistant**. 
- **Universal Data Acceptance**: Quivr can handle almost any type of data you throw at it. Text, images, code snippets, we've got you covered.
+It can:
- **Generative AI**: Quivr employs advanced AI to assist you in generating and retrieving information.
+- **Answer questions** about the **files** that you uploaded
- **Fast and Efficient**: Designed with speed and efficiency at its core. Quivr ensures rapid access to your data.
+- **Interact** with the **applications** that you connected to Quivr (Soon 🚀)
 - **Secure**: Your data, your control. Always.
 - **File Compatibility**:
  - Text
  - Markdown
  - PDF
  - Powerpoint
  - Excel
  - Word
  - Audio
  - Video
 - **Open Source**: Freedom is beautiful, so is Quivr. Open source and free to use.
-## Demo Highlights 🎥
+:::info
 **Our goal** is to make Quivr the **best open-source personal assistant** that is powered by your knowledge and your applications 🔥
 It will remain open-source.
 :::
-### **Demo**:
+## What does it do ? 
 > A video equals a thousand words
 <div style={{ textAlign: 'center' }}>
  <video width="640" height="480" controls>
    <source src="https://quivr-cms.s3.eu-west-3.amazonaws.com/singlestore_demo_quivr_232893659c.mp4" type="video/mp4"/>
    Your browser does not support the video tag.
  </video>
 </div>
 ## How to get started ? 👀
 :::tip
 It takes less than **5 seconds** to get started with Quivr. You can even use your Google account to sign up.
 :::
 - Create an account on [Quivr](https://quivr.app)
 - Upload your files
 - Ask questions to Quivr
 <div style={{ textAlign: 'center' }}>
  <img src="/img/homepage.png" alt="Quivr home page" style={{ width: '60%' }} />
 </div>
 ## How can you leverage Quivr ? 🚀
 You can use Quivr to:
 - **Search** information in your files
 - **Cross-search** information from your files
 - **Store** all your knowledge in one place
 - **Share** your knowledge with your team by leveraging the **collaboration** features
 ## Features 🤗
 ### Public & Private Brains
 You can create **public** and **private** brains.
 - **Public** brains are **searchable** by everyone on Quivr
 - **Private** brains are **only searchable** by you
 You can share your brains with a set of people by using their emails. This means that you can **collaborate** with your team on a brain without making it public.
 ### Ask Questions
 You can **ask questions** to Quivr. Quivr will **answer** your questions by leveraging the **knowledge** that you uploaded to Quivr.
 ### Custom Personality 
 You can **customize** the personality of Quivr by changing the **prompt** of your brain. You could tell your brain to always answer with a **funny** or **serious** tone or act like a **robot** or a **human**.
 <video width="640" height="480" controls>
  <source src="https://github.com/StanGirard/quivr/assets/19614572/a6463b73-76c7-4bc0-978d-70562dca71f5" type="video/mp4"/>
  Your browser does not support the video tag.
 </video>
--- a/docs/static/img/homepage.png
+++ b/docs/static/img/homepage.png
--- a/frontend/public/locales/en/chat.json
+++ b/frontend/public/locales/en/chat.json
@ -44,11 +44,11 @@
    "how_to_use_quivr": "How to use Quivr ?",
    "what_is_quivr": "What is Quivr ?",
    "what_is_brain": "What is a brain ?",
-    "answer":{
+    "answer": {
-      "how_to_use_quivr": "Check the documentation https://brain.quivr.app/docs/get_started/intro.html",
+      "how_to_use_quivr": "Check the documentation https://brain.quivr.app/docs/intro.html",
      "what_is_quivr": "Quivr is a helpful assistant.",
      "what_is_brain": "A brain contains knowledge"
    }
  },
-  "welcome":"Welcome"
+  "welcome": "Welcome"
-}
+}
--- a/frontend/public/locales/es/chat.json
+++ b/frontend/public/locales/es/chat.json
@ -44,11 +44,11 @@
    "how_to_use_quivr": "¿Cómo usar Quivr?",
    "what_is_quivr": "¿Qué es Quivr?",
    "what_is_brain": "¿Qué es un cerebro?",
-    "answer":{
+    "answer": {
-      "how_to_use_quivr": "Consulta la documentación en https://brain.quivr.app/docs/get_started/intro.html",
+      "how_to_use_quivr": "Consulta la documentación en https://brain.quivr.app/docs/intro.html",
      "what_is_quivr": "Quivr es un asistente útil.",
      "what_is_brain": "Un cerebro contiene conocimiento."
    }
  },
  "welcome": "Bienvenido"
-}
+}
--- a/frontend/public/locales/fr/chat.json
+++ b/frontend/public/locales/fr/chat.json
@ -44,11 +44,11 @@
    "how_to_use_quivr": "Comment utiliser Quivr ?",
    "what_is_quivr": "Qu'est-ce que Quivr ?",
    "what_is_brain": "Qu'est-ce qu'un cerveau ?",
-    "answer":{
+    "answer": {
-      "how_to_use_quivr": "Consultez la documentation sur https://brain.quivr.app/docs/get_started/intro.html",
+      "how_to_use_quivr": "Consultez la documentation sur https://brain.quivr.app/docs/intro.html",
      "what_is_quivr": "Quivr est un assistant utile.",
      "what_is_brain": "Un cerveau contient des connaissances."
    }
  },
  "welcome": "Bienvenue"
-}
+}
--- a/frontend/public/locales/pt-br/chat.json
+++ b/frontend/public/locales/pt-br/chat.json
@ -44,11 +44,11 @@
    "how_to_use_quivr": "Como usar o Quivr?",
    "what_is_quivr": "O que é o Quivr?",
    "what_is_brain": "O que é um cérebro?",
-    "answer":{
+    "answer": {
-      "how_to_use_quivr": "Consulte a documentação em https://brain.quivr.app/docs/get_started/intro.html",
+      "how_to_use_quivr": "Consulte a documentação em https://brain.quivr.app/docs/intro.html",
      "what_is_quivr": "Quivr é um assistente útil.",
      "what_is_brain": "Um cérebro contém conhecimento."
    }
  },
  "welcome": "Bem-vindo"
-}
+}
--- a/frontend/public/locales/ru/chat.json
+++ b/frontend/public/locales/ru/chat.json
@ -44,11 +44,11 @@
    "how_to_use_quivr": "Как использовать Quivr?",
    "what_is_quivr": "Что такое Quivr?",
    "what_is_brain": "Что такое мозг?",
-    "answer":{
+    "answer": {
-      "how_to_use_quivr": "Проверьте документацию на https://brain.quivr.app/docs/get_started/intro.html",
+      "how_to_use_quivr": "Проверьте документацию на https://brain.quivr.app/docs/intro.html",
      "what_is_quivr": "Quivr - полезный ассистент.",
      "what_is_brain": "Мозг содержит знания."
    }
  },
  "welcome": "Добро пожаловать"
-}
+}
--- a/frontend/public/locales/zh-cn/chat.json
+++ b/frontend/public/locales/zh-cn/chat.json
@ -45,11 +45,11 @@
    "how_to_use_quivr": "如何使用Quivr？",
    "what_is_quivr": "什么是Quivr？",
    "what_is_brain": "什么是大脑？",
-    "answer":{
+    "answer": {
-      "how_to_use_quivr": "查看文档https://brain.quivr.app/docs/get_started/intro.html",
+      "how_to_use_quivr": "查看文档https://brain.quivr.app/docs/intro.html",
      "what_is_quivr": "Quivr是一个有用的助手。",
      "what_is_brain": "大脑包含知识。"
    }
  },
  "welcome": "欢迎来到"
-}
+}