mirror of https://github.com/StanGirard/quivr.git synced 2024-12-24 20:03:41 +03:00

🧠 Dump all your files and chat with them using your Generative AI Second Brain, powered by LLMs (GPT 3.5/4, Private, Anthropic, VertexAI) & Embeddings 🧠

ai audio chat chatgpt csv embeddings generative gpt gpt4all llm models pdf private privategpt second-brain starred-repo starred-stangirard-repo vectorstore whisper

Stan Girard b0e62b08d6 fix(demo): remove multi file upload		2023-05-17 16:26:25 +02:00
.github/workflows	feat(releaseplease): added	2023-05-16 16:25:08 +02:00
.streamlit	feat(demo): app can now have a demo	2023-05-17 12:12:52 +02:00
.vscode	Support for Anthropics Models	2023-05-14 01:30:03 -07:00
loaders	fix(demo): max size audio	2023-05-17 12:18:55 +02:00
website	docs(website): added demo link	2023-05-17 14:44:57 +02:00
.gitignore	feat(website): first iteration	2023-05-14 21:12:30 +02:00
2023-05-13-02-16-02.png	feat(demo): added	2023-05-13 02:16:41 +02:00
brain.py	feat(forget): now able to forget things	2023-05-13 01:30:00 +02:00
Dockerfile	fix(requirements): fixed the issue	2023-05-13 16:37:18 +02:00
explorer.py	feat(explorer): beta	2023-05-16 17:04:45 +02:00
files.py	fix(demo): remove multi file upload	2023-05-17 16:26:25 +02:00
LICENSE	feat(license): added	2023-05-13 18:12:35 +02:00
logo.png	feat(readme): first iteration	2023-05-13 02:02:45 +02:00
main.py	doc(demo): made it more visual	2023-05-17 12:53:36 +02:00
question.py	feat(demo): app can now have a demo	2023-05-17 12:12:52 +02:00
README.md	Update README.md (#40 )	2023-05-17 00:14:13 +02:00
requirements.txt	Support for Anthropics Models	2023-05-14 01:30:03 -07:00
sidebar.py	feat(visual): moved things around	2023-05-12 23:58:19 +02:00
stats.py	feat(demo): app can now have a demo	2023-05-17 12:12:52 +02:00
utils.py	feat(init): init repository	2023-05-12 23:05:31 +02:00

README.md

Quivr

Quivr is your second brain in the cloud, designed to easily store and retrieve unstructured information. It's like Obsidian but powered by generative AI.

Features

Store Anything: Quivr can handle almost any type of data you throw at it. Text, images, code snippets, you name it.
Generative AI: Quivr uses advanced AI to help you generate and retrieve information.
Fast and Efficient: Designed with speed and efficiency in mind. Quivr makes sure you can access your data as quickly as possible.
Secure: Your data is stored securely in the cloud and is always under your control.
Compatible Files:
- Text
- Markdown
- PDF
- Audio
- Video
Open Source: Quivr is open source and free to use.

Demo

Demo with GPT3.5

https://github.com/StanGirard/quivr/assets/19614572/80721777-2313-468f-b75e-09379f694653

Demo with Claude 100k context

https://github.com/StanGirard/quivr/assets/5101573/9dba918c-9032-4c8d-9eea-94336d2c8bd4

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

You need the following before installing the software:

Python 3.10 or higher
Pip
Virtualenv
Supabase account
Supabase API key
Supabase URL
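
As a rough sketch, the local prerequisites above (Python version, Pip, Virtualenv) could be checked with a few lines of Python before installing. The helper names `python_is_supported` and `missing_tools` are hypothetical and not part of Quivr; the Supabase account, key, and URL still have to be checked by hand.

```python
import shutil
import sys

# Minimum version taken from the prerequisites list above.
MINIMUM_PYTHON = (3, 10)


def python_is_supported(version=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= tuple(minimum)


def missing_tools(tools=("pip", "virtualenv")):
    """Return the command-line tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.10 or higher is required")
    for tool in missing_tools():
        print(f"Missing tool: {tool}")
```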

Installing

Clone the repository

git clone git@github.com:StanGirard/Quivr.git && cd Quivr

Create a virtual environment

virtualenv venv

Activate the virtual environment

source venv/bin/activate

Install the dependencies

pip install -r requirements.txt

Copy the Streamlit secrets.toml example file

cp .streamlit/secrets.toml.example .streamlit/secrets.toml

Add your credentials to the .streamlit/secrets.toml file

supabase_url = "SUPABASE_URL"
supabase_service_key = "SUPABASE_SERVICE_KEY"
openai_api_key = "OPENAI_API_KEY"
anthropic_api_key = "ANTHROPIC_API_KEY" # Optional
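
A missing or unedited credential usually only surfaces once the app is running, so it can help to check the values up front. The sketch below is one possible way to do that, not code from Quivr: `missing_secrets` is a hypothetical helper that accepts any mapping, such as Streamlit's `st.secrets` or a plain dict, and it assumes the placeholder values are exactly the ones shown above.

```python
from typing import Mapping

# Keys from the secrets.toml example above; anthropic_api_key is optional there.
REQUIRED_KEYS = ("supabase_url", "supabase_service_key", "openai_api_key")

# Placeholder values copied from the example file (assumed unchanged).
PLACEHOLDERS = {"SUPABASE_URL", "SUPABASE_SERVICE_KEY", "OPENAI_API_KEY"}


def missing_secrets(secrets: Mapping) -> list:
    """Return required keys that are absent, empty, or still a placeholder."""
    missing = []
    for key in REQUIRED_KEYS:
        value = str(secrets[key]).strip() if key in secrets else ""
        if not value or value in PLACEHOLDERS:
            missing.append(key)
    return missing


if __name__ == "__main__":
    example = {
        "supabase_url": "https://example.supabase.co",
        "supabase_service_key": "SUPABASE_SERVICE_KEY",  # still a placeholder
    }
    print(missing_secrets(example))
```

Inside the Streamlit app, the same function could be called as `missing_secrets(st.secrets)` before any client is created.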

Run the migration script on the Supabase database via the web interface

-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
    id bigserial primary key,
    content text, -- corresponds to Document.pageContent
    metadata jsonb, -- corresponds to Document.metadata
    embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed
);

CREATE FUNCTION match_documents(query_embedding vector(1536), match_count int)
    RETURNS TABLE(
        id bigint,
        content text,
        metadata jsonb,
        -- we return matched vectors to enable maximal marginal relevance searches
        embedding vector(1536),
        similarity float)
    LANGUAGE plpgsql
    AS $$
    #variable_conflict use_column
BEGIN
    RETURN QUERY
    SELECT
        id,
        content,
        metadata,
        embedding,
        1 - (documents.embedding <=> query_embedding) AS similarity
    FROM
        documents
    ORDER BY
        documents.embedding <=> query_embedding
    LIMIT match_count;
END;
$$;
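
To make the ranking in match_documents easier to follow, here is a small pure-Python sketch of the same logic. In pgvector, `<=>` is the cosine distance operator, so `1 - distance` is the cosine similarity. This is an illustration only, not how Quivr queries Supabase: the in-memory document list, the 3-dimensional toy vectors (real OpenAI embeddings have 1536 dimensions), and the assumption of non-zero vectors are all made up for the example.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance between two non-zero vectors (what <=> computes)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_documents(documents, query_embedding, match_count):
    """Return the match_count nearest documents, each with a similarity score."""
    scored = []
    for doc in documents:
        distance = cosine_distance(doc["embedding"], query_embedding)
        scored.append((distance, doc))
    # ORDER BY embedding <=> query_embedding: smallest distance first.
    scored.sort(key=lambda pair: pair[0])
    # LIMIT match_count, with similarity = 1 - distance as in the SQL.
    return [
        {**doc, "similarity": 1.0 - distance}
        for distance, doc in scored[:match_count]
    ]


# Toy data: hypothetical rows standing in for the documents table.
docs = [
    {"id": 1, "content": "cats", "embedding": [1.0, 0.0, 0.0]},
    {"id": 2, "content": "dogs", "embedding": [0.0, 1.0, 0.0]},
    {"id": 3, "content": "kittens", "embedding": [0.9, 0.1, 0.0]},
]
results = match_documents(docs, [1.0, 0.0, 0.0], match_count=2)
print([row["id"] for row in results])  # [1, 3]
```

The embedding is kept in each returned row, mirroring the SQL function, so a caller could re-rank the matches afterwards (for example with maximal marginal relevance).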

Run the app

streamlit run main.py