Mirror of https://github.com/StanGirard/quivr.git, synced 2024-12-25 04:12:44 +03:00
Feat/docs rework (#1525)
# Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

## Checklist before requesting a review

Please delete options that are not relevant.

- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented hard-to-understand areas
- [ ] I have ideally added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged

## Screenshots (if appropriate):
This commit is contained in:
parent 6323931a8b
commit 90ee40f9f2
253 docs/docs/Developers/run_fully_local.md Normal file
@@ -0,0 +1,253 @@
---
sidebar_position: 2
title: Using Quivr fully locally
---

# Using Quivr fully locally

The following is a guide to setting up everything needed to use Quivr fully locally:

##### Table of Contents
* [Database](#database)
* [Embeddings](#embeddings)
* [LLM for inference](#llm)

This is a first, working setup, but a lot of work remains to be done, e.g. to find the appropriate settings for the model.

Importantly, this will currently only work on tag v0.0.46.

The guide was put together in collaboration with members of the Quivr Discord, in the **Using Quivr fully locally** thread. That is a good place to discuss it.

This worked for me, but I sometimes got strange results (the output contains repeating answers/questions), possibly because `stopping_criteria=stopping_criteria` must be uncommented in `transformers.pipeline` (a sketch of such a criterion is included at the end of this guide). I will update this page as I continue learning.
<a name="database"/>

## Local Supabase

Instead of relying on a remote Supabase instance, we have to set it up locally. Follow the instructions at https://supabase.com/docs/guides/self-hosting/docker.

Troubleshooting:
* If the Quivr backend container cannot reach Supabase on port 8000, change the Quivr backend container to use the host network (see the docker-compose example further down).
* If the email service does not work, add a user via the Supabase web UI and check "Auto Confirm User?":
  * http://localhost:8000/project/default/auth/users
<a name="embeddings"/>

## Local embeddings

First, let's get local embeddings to work with GPT4All. Instead of relying on OpenAI to generate embeddings of both the prompt and the documents we upload, we will use a local embedding model.

Remove any existing data from the postgres database:
* `supabase/docker $ docker compose down -v`
* `supabase/docker $ rm -rf volumes/db/data/`
* `supabase/docker $ docker compose up -d`

Change the vector dimensions in the necessary Quivr SQL files:
* Replace all occurrences of 1536 with 768 in Quivr's `scripts/tables.sql` (the local embedding model used below produces 768-dimensional vectors, whereas OpenAI's produces 1536).
* Run tables.sql in the Supabase web UI SQL editor: http://localhost:8000

Change the Quivr code to use the local LLM (GPT4All) and local embeddings:
* add the following to `backend/core/llm/private_gpt4all.py`:

```python
from langchain.embeddings import HuggingFaceEmbeddings
...
    def embeddings(self) -> HuggingFaceEmbeddings:
        emb = HuggingFaceEmbeddings(
            model_name="sentence-transformers/all-mpnet-base-v2",
            model_kwargs={'device': 'cuda'},
            encode_kwargs={'normalize_embeddings': False}
        )
        return emb
```
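To double-check that the dimensions match, you can inspect the embedding size directly. This is a small sanity-check sketch run outside of Quivr (assuming `langchain` and `sentence-transformers` are installed), not part of the Quivr code:

```python
# Sanity check: all-mpnet-base-v2 should produce 768-dimensional vectors,
# matching the dimensions set in scripts/tables.sql.
from langchain.embeddings import HuggingFaceEmbeddings

emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vector = emb.embed_query("hello quivr")
print(len(vector))  # expected: 768
```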
Note that there may be better models out there for generating the embeddings: https://huggingface.co/spaces/mteb/leaderboard

Update the Private LLM Variables in Quivr's `backend/core/.env`:

```
#Private LLM Variables
PRIVATE=True
MODEL_PATH=./local_models/ggml-gpt4all-j-v1.3-groovy.bin
```

Download the GPT4All model:
* `$ cd backend/core/local_models/`
* `$ wget https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin`
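Once downloaded, you can check that the model file loads at all before wiring it into Quivr. This is a minimal sketch using langchain's GPT4All wrapper (assuming the `gpt4all` package is installed; it is not the exact code Quivr runs):

```python
# Quick load test for the downloaded GPT4All model file.
from langchain.llms import GPT4All

llm = GPT4All(model="./local_models/ggml-gpt4all-j-v1.3-groovy.bin")
print(llm("What is Quivr?"))  # any short completion confirms the model loads
```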
Ensure the Quivr backend docker container has CUDA and the GPT4All package:

```
# CUDA-enabled base image (replaces the original python:3.11-bullseye base)
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel
#FROM python:3.11-bullseye

ARG DEBIAN_FRONTEND=noninteractive
ENV DEBIAN_FRONTEND=noninteractive

RUN pip install gpt4all
```
Modify the docker-compose yml file (for the backend container). The following example is for using 2 GPUs:

```
...
    network_mode: host
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 2
              capabilities: [gpu]
```
Install the NVIDIA Container Toolkit on the host (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html):

```
$ wget https://nvidia.github.io/nvidia-docker/gpgkey --no-check-certificate
$ sudo apt-key add gpgkey
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update

$ sudo apt-get install -y nvidia-container-toolkit

$ nvidia-ctk --version

$ sudo systemctl restart docker
```
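To confirm that the backend container can actually see the GPUs after these changes, a quick check from inside the container (using the PyTorch that ships with the base image above) is:

```python
# Run inside the backend container to verify GPU passthrough.
import torch

print(torch.cuda.is_available())   # expected: True
print(torch.cuda.device_count())   # expected: 2 with the compose file above
```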
At this moment, if we try to upload a pdf, we get an error:

```
backend-core | 1989-01-01 21:51:41,211 [ERROR] utils.vectors: Error creating vector for document {'code': '22000', 'details': None, 'hint': None, 'message': 'expected 768 dimensions, not 1536'}
```

This happens because documents are still embedded with OpenAI (1536 dimensions) while the database now expects 768. It can be remedied by using local embeddings for document embeddings as well. In `backend/core/utils/vectors.py`, replace the `create_vector` method as follows (the original is kept commented out for reference):

```python
    # def create_vector(self, doc, user_openai_api_key=None):
    #     logger.info("Creating vector for document")
    #     logger.info(f"Document: {doc}")
    #     if user_openai_api_key:
    #         self.commons["documents_vector_store"]._embedding = OpenAIEmbeddings(
    #             openai_api_key=user_openai_api_key
    #         )  # pyright: ignore reportPrivateUsage=none
    #     try:
    #         sids = self.commons["documents_vector_store"].add_documents([doc])
    #         if sids and len(sids) > 0:
    #             return sids
    #     except Exception as e:
    #         logger.error(f"Error creating vector for document {e}")

    def create_vector(self, doc, user_openai_api_key=None):
        logger.info("Creating vector for document")
        logger.info(f"Document: {doc}")
        # Always embed documents with the local HuggingFace model so the
        # vector dimensions (768) match the database schema.
        self.commons["documents_vector_store"]._embedding = HuggingFaceEmbeddings(
            model_name="sentence-transformers/all-mpnet-base-v2",
            model_kwargs={'device': 'cuda'},
            encode_kwargs={'normalize_embeddings': False}
        )  # pyright: ignore reportPrivateUsage=none
        logger.info('||| creating embedding')
        try:
            sids = self.commons["documents_vector_store"].add_documents([doc])
            if sids and len(sids) > 0:
                return sids
        except Exception as e:
            logger.error(f"Error creating vector for document {e}")
```
<a name="llm"/>

## Local LLM

The final step is to use a local model from HuggingFace for inference. (The HF token is optional; it is only required for certain models on HF.)

Update the Quivr backend dockerfile:

```
ENV HUGGINGFACEHUB_API_TOKEN=hf_XXX

RUN pip install accelerate
```

Note: loading the model with `load_in_4bit=True` (see below) may also require the `bitsandbytes` package; if it is not already in the image, add `RUN pip install bitsandbytes` as well.

Update the `private_gpt4all.py` file as follows:
```python
import langchain
langchain.debug = True
langchain.verbose = True

import os
import transformers
from langchain.llms import HuggingFacePipeline
from langchain.embeddings import HuggingFaceEmbeddings
...

model_id = "stabilityai/StableBeluga-13B"
# HF token set in the dockerfile (only needed for gated models).
hf_auth = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
...

    def _create_llm(
        self,
        model,
        streaming=False,
        callbacks=None,
    ) -> BaseLLM:
        """
        Override the _create_llm method to enforce the use of a private model.
        :param model: Language model name to be used.
        :param streaming: Whether to enable streaming of the model
        :param callbacks: Callbacks to be used for streaming
        :return: Language model instance
        """
        model_path = self.model_path

        logger.info("Using private model: %s", model)
        logger.info("Streaming is set to %s", streaming)
        logger.info("--- model %s", model)
        logger.info("--- model path %s", model_path)

        model_id = "stabilityai/StableBeluga-13B"

        llm = transformers.AutoModelForCausalLM.from_pretrained(
            model_id,
            use_cache=True,
            load_in_4bit=True,
            device_map='auto',
            #use_auth_token=hf_auth
        )
        logger.info('<<< transformers.AutoModelForCausalLM.from_pretrained')

        llm.eval()
        logger.info('<<< eval')

        tokenizer = transformers.AutoTokenizer.from_pretrained(
            model_id,
            use_auth_token=hf_auth
        )
        logger.info('<<< transformers.AutoTokenizer.from_pretrained')

        generate_text = transformers.pipeline(
            model=llm, tokenizer=tokenizer,
            return_full_text=True,  # langchain expects the full text
            task='text-generation',
            # we pass model parameters here too
            #stopping_criteria=stopping_criteria,  # without this the model rambles during chat
            temperature=0.5,  # 'randomness' of outputs, 0.0 is the min and 1.0 the max
            max_new_tokens=512,  # max number of tokens to generate in the output
            repetition_penalty=1.1  # without this the output begins repeating
        )
        logger.info('<<< generate_text = transformers.pipeline(')

        result = HuggingFacePipeline(pipeline=generate_text)

        logger.info("<<< created llm HuggingFace")
        return result
```
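The intro notes that uncommenting `stopping_criteria=stopping_criteria` may fix the repeating output, but the snippet above never defines it. The following is a minimal, untested sketch of such a criterion; the stop strings are hypothetical and depend on the prompt template actually used with the model, and `tokenizer` is the one created in `_create_llm` above:

```python
import torch
import transformers

# Hypothetical stop strings; adjust to the chat template actually used.
stop_strings = ["### User:", "### Assistant:"]

class StopOnTokens(transformers.StoppingCriteria):
    def __init__(self, tokenizer, stop_strings):
        # Pre-tokenize each stop string once.
        self.stop_ids = [
            tokenizer(s, add_special_tokens=False).input_ids for s in stop_strings
        ]

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Stop as soon as the generated sequence ends with any stop string.
        for stop in self.stop_ids:
            if len(stop) > 0 and input_ids[0][-len(stop):].tolist() == stop:
                return True
        return False

stopping_criteria = transformers.StoppingCriteriaList(
    [StopOnTokens(tokenizer, stop_strings)]
)
```

This `stopping_criteria` can then be passed to `transformers.pipeline(...)` in place of the commented-out argument.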
12 docs/docs/User_Guide/what-is-a-brain.md Normal file
@@ -0,0 +1,12 @@
---
title: Concept of Brain
---

:::info
A few brains were harmed in the making of this documentation 🤯😏
:::

A **brain** is a concept that we created to allow you to **create** and **organize** your knowledge in Quivr.
@@ -3,32 +3,71 @@ sidebar_position: 1
title: 🚀 Welcome to Quivr
---

# Intro

Quivr, your second brain, utilizes the power of GenerativeAI to store and retrieve unstructured information. Think of it as Obsidian, but turbocharged with AI capabilities.
## Welcome to Quivr 🧠

## Key Features 🎯
[Quivr](https://quivr.app) is your **Second Brain** that can act as your **personal assistant**.

- **Universal Data Acceptance**: Quivr can handle almost any type of data you throw at it. Text, images, code snippets, we've got you covered.
- **Generative AI**: Quivr employs advanced AI to assist you in generating and retrieving information.
- **Fast and Efficient**: Designed with speed and efficiency at its core. Quivr ensures rapid access to your data.
- **Secure**: Your data, your control. Always.
- **File Compatibility**:
  - Text
  - Markdown
  - PDF
  - Powerpoint
  - Excel
  - Word
  - Audio
  - Video
- **Open Source**: Freedom is beautiful, so is Quivr. Open source and free to use.
It can:
- **Answer questions** to the **files** that you uploaded
- **Interact** with the **applications** that you connected to Quivr (Soon 🚀)

## Demo Highlights 🎥
:::info
**Our goal** is to make Quivr the **best open-source personal assistant** that is powered by your knowledge and your applications 🔥
It will remain open-source
:::

### **Demo**:
## What does it do ?

> A video equals a thousand words

<div style={{ textAlign: 'center' }}>
  <video width="640" height="480" controls>
    <source src="https://quivr-cms.s3.eu-west-3.amazonaws.com/singlestore_demo_quivr_232893659c.mp4" type="video/mp4"/>
    Your browser does not support the video tag.
  </video>
</div>


## How to get started ? 👀

:::tip
It takes less than **5 seconds** to get started with Quivr. You can even use your Google account to sign up.
:::

- Create an account on [Quivr](https://quivr.app)
- Upload your files
- Ask questions to Quivr


<div style={{ textAlign: 'center' }}>
  <img src="/img/homepage.png" alt="Quivr home page" style={{ width: '60%' }} />
</div>

## How can you leverage Quivr ? 🚀

You can use Quivr to:
- **Search** information in your files
- **Cross-search** information from your files
- **Store** all your knowledge in one place
- **Share** your knowledge with your team by leveraging the **collaboration** features

## Features 🤗

### Public & Private Brains

You can create **public** and **private** brains.

- **Public** brains are **searchable** by everyone on Quivr
- **Private** brains are **only searchable** by you

You can share your brains with a set of people by using their emails. This means that you can **collaborate** with your team on a brain without making it public.

### Ask Questions

You can **ask questions** to Quivr. Quivr will **answer** your questions by leveraging the **knowledge** that you uploaded to Quivr.

### Custom Personality

You can **customize** the personality of Quivr by changing the **prompt** of your brain. You could tell your brain to always answer with a **funny** or **serious** tone or act like a **robot** or a **human**.

<video width="640" height="480" controls>
  <source src="https://github.com/StanGirard/quivr/assets/19614572/a6463b73-76c7-4bc0-978d-70562dca71f5" type="video/mp4"/>
  Your browser does not support the video tag.
</video>
BIN docs/static/img/homepage.png vendored Normal file
Binary file not shown.
After Width: | Height: | Size: 888 KiB
@@ -44,11 +44,11 @@
"how_to_use_quivr": "How to use Quivr ?",
"what_is_quivr": "What is Quivr ?",
"what_is_brain": "What is a brain ?",
"answer":{
"how_to_use_quivr": "Check the documentation https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Check the documentation https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr is a helpful assistant.",
"what_is_brain": "A brain contains knowledge"
}
},
"welcome":"Welcome"
}
"welcome": "Welcome"
}
@@ -44,11 +44,11 @@
"how_to_use_quivr": "¿Cómo usar Quivr?",
"what_is_quivr": "¿Qué es Quivr?",
"what_is_brain": "¿Qué es un cerebro?",
"answer":{
"how_to_use_quivr": "Consulta la documentación en https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Consulta la documentación en https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr es un asistente útil.",
"what_is_brain": "Un cerebro contiene conocimiento."
}
},
"welcome": "Bienvenido"
}
}
@@ -44,11 +44,11 @@
"how_to_use_quivr": "Comment utiliser Quivr ?",
"what_is_quivr": "Qu'est-ce que Quivr ?",
"what_is_brain": "Qu'est-ce qu'un cerveau ?",
"answer":{
"how_to_use_quivr": "Consultez la documentation sur https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Consultez la documentation sur https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr est un assistant utile.",
"what_is_brain": "Un cerveau contient des connaissances."
}
},
"welcome": "Bienvenue"
}
}
@@ -44,11 +44,11 @@
"how_to_use_quivr": "Como usar o Quivr?",
"what_is_quivr": "O que é o Quivr?",
"what_is_brain": "O que é um cérebro?",
"answer":{
"how_to_use_quivr": "Consulte a documentação em https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Consulte a documentação em https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr é um assistente útil.",
"what_is_brain": "Um cérebro contém conhecimento."
}
},
"welcome": "Bem-vindo"
}
}
@@ -44,11 +44,11 @@
"how_to_use_quivr": "Как использовать Quivr?",
"what_is_quivr": "Что такое Quivr?",
"what_is_brain": "Что такое мозг?",
"answer":{
"how_to_use_quivr": "Проверьте документацию на https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "Проверьте документацию на https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr - полезный ассистент.",
"what_is_brain": "Мозг содержит знания."
}
},
"welcome": "Добро пожаловать"
}
}
@@ -45,11 +45,11 @@
"how_to_use_quivr": "如何使用Quivr?",
"what_is_quivr": "什么是Quivr?",
"what_is_brain": "什么是大脑?",
"answer":{
"how_to_use_quivr": "查看文档https://brain.quivr.app/docs/get_started/intro.html",
"answer": {
"how_to_use_quivr": "查看文档https://brain.quivr.app/docs/intro.html",
"what_is_quivr": "Quivr是一个有用的助手。",
"what_is_brain": "大脑包含知识。"
}
},
"welcome": "欢迎来到"
}
}