# Quivr
<p align="center">
<img src="./logo.png" alt="Quivr-logo" width="30%">
<p align="center">
<a href="https://discord.gg/HUpRgp2HG8">
<img src="https://img.shields.io/badge/discord-join%20chat-blue.svg" alt="Join our Discord" height="40">
</a>
Quivr is your second brain in the cloud, designed to easily store and retrieve unstructured information. It's like Obsidian but powered by generative AI.
## Features
- **Store Anything**: Quivr can handle almost any type of data you throw at it. Text, images, code snippets, you name it.
- **Generative AI**: Quivr uses advanced AI to help you generate and retrieve information.
- **Fast and Efficient**: Designed with speed and efficiency in mind. Quivr makes sure you can access your data as quickly as possible.
- **Secure**: Your data is stored securely in the cloud and is always under your control.
- **Compatible Files**:
  - **Text**
  - **Markdown**
  - **PDF**
  - **Audio**
  - **Video**
- **Open Source**: Quivr is open source and free to use.
## Demo
### Demo with GPT-3.5
https://github.com/StanGirard/quivr/assets/19614572/80721777-2313-468f-b75e-09379f694653
### Demo with Claude 100k context
https://github.com/StanGirard/quivr/assets/5101573/9dba918c-9032-4c8d-9eea-94336d2c8bd4
## Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
### Prerequisites
You need the following to install and run Quivr:
- Python 3.10 or higher
- Pip
- Virtualenv
- Supabase account
- Supabase API key
- Supabase URL
- OpenAI API key (an Anthropic API key is optional)
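As a quick sanity check before installing, a short script along these lines could confirm that your interpreter meets the Python 3.10 requirement listed above. This is only a sketch: the helper name `meets_minimum` is made up for illustration and is not part of Quivr.

```python
import sys

# Minimum version taken from the prerequisites list above.
MINIMUM_PYTHON = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) part of the version is at least `minimum`.

    Hypothetical helper for illustration; it is not part of the Quivr codebase.
    """
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} is recent enough for Quivr.")
    else:
        print(f"Python {current} is too old; Quivr needs 3.10 or higher.")
```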
### Installing
- Clone the repository
```bash
git clone git@github.com:StanGirard/Quivr.git && cd Quivr
```
- Create a virtual environment
```bash
virtualenv venv
```
- Activate the virtual environment
```bash
source venv/bin/activate
```
- Install the dependencies
```bash
pip install -r requirements.txt
```
- Copy the Streamlit `secrets.toml` example file
```bash
cp .streamlit/secrets.toml.example .streamlit/secrets.toml
```
- Add your credentials to the `.streamlit/secrets.toml` file
```toml
supabase_url = "SUPABASE_URL"
supabase_service_key = "SUPABASE_SERVICE_KEY"
openai_api_key = "OPENAI_API_KEY"
anthropic_api_key = "ANTHROPIC_API_KEY" # Optional
```
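A common mistake at this step is leaving one of the placeholder values in place. The sketch below shows one way to catch that, assuming the secrets have already been loaded into a dict-like object. The function name `missing_secrets` and the placeholder check are assumptions made for this example, not part of Quivr.

```python
from typing import Mapping

# Key names taken from the secrets.toml example above.
REQUIRED_KEYS = ("supabase_url", "supabase_service_key", "openai_api_key")
OPTIONAL_KEYS = ("anthropic_api_key",)  # marked "Optional" in the example file


def missing_secrets(secrets: Mapping[str, str]) -> list[str]:
    """Return the required keys that are absent, empty, or still a placeholder.

    Hypothetical helper: it assumes the placeholder for each key is the
    upper-cased key name, as in the example file (e.g. "SUPABASE_URL").
    """
    missing = []
    for key in REQUIRED_KEYS:
        value = str(secrets.get(key, "")).strip()
        if not value or value == key.upper():
            missing.append(key)
    return missing


if __name__ == "__main__":
    # Example values only; real credentials come from your Supabase and OpenAI accounts.
    example = {
        "supabase_url": "SUPABASE_URL",  # placeholder left unchanged
        "supabase_service_key": "example-service-key",
        "openai_api_key": "",
    }
    print(missing_secrets(example))
```

In the app itself, the same check should work on Streamlit's `st.secrets`, assuming it behaves like a read-only mapping.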
- Run the migration script on the Supabase database via the web interface
```sql
-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
  id bigserial primary key,
  content text, -- corresponds to Document.pageContent
  metadata jsonb, -- corresponds to Document.metadata
  embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed
);

CREATE FUNCTION match_documents(query_embedding vector(1536), match_count int)
RETURNS TABLE(
  id bigint,
  content text,
  metadata jsonb,
  -- we return matched vectors to enable maximal marginal relevance searches
  embedding vector(1536),
  similarity float)
LANGUAGE plpgsql
AS $$
#variable_conflict use_column
BEGIN
  RETURN query
  SELECT
    id,
    content,
    metadata,
    embedding,
    1 - (documents.embedding <=> query_embedding) AS similarity
  FROM
    documents
  ORDER BY
    documents.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;
```
Then create the `stats` table:
```sql
create table stats (
  -- A column called "time" with data type "timestamp"
  time timestamp,
  -- Columns called "chat" and "embedding" with data type "boolean"
  chat boolean,
  embedding boolean,
  -- A column called "details" with data type "text"
  details text,
  -- A column called "metadata" with data type "jsonb"
  metadata jsonb,
  -- An "integer" primary key column called "id" that is generated always as identity
  id integer primary key generated always as identity
);
```
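To make the ranking in `match_documents` concrete: pgvector's `<=>` operator computes cosine distance, so `1 - distance` is cosine similarity, and ordering by distance ascending returns the most similar rows first. The following is a pure-Python sketch of that logic on an in-memory list, for illustration only. The real query runs inside Postgres, and the sample documents and their 2-dimensional embeddings are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_documents(documents, query_embedding, match_count):
    """Mimic the SQL function: score every row, sort by similarity, keep the top rows.

    Illustrative stand-in only; it scans a Python list instead of the
    `documents` table and uses no index.
    """
    scored = [
        {**doc, "similarity": cosine_similarity(doc["embedding"], query_embedding)}
        for doc in documents
    ]
    # Highest similarity first, equivalent to ORDER BY cosine distance ascending.
    scored.sort(key=lambda row: row["similarity"], reverse=True)
    return scored[:match_count]


if __name__ == "__main__":
    # Toy rows with made-up 2-dimensional embeddings (real ones have 1536 dimensions).
    sample = [
        {"id": 1, "content": "a", "metadata": {}, "embedding": [1.0, 0.0]},
        {"id": 2, "content": "b", "metadata": {}, "embedding": [0.0, 1.0]},
        {"id": 3, "content": "c", "metadata": {}, "embedding": [1.0, 1.0]},
    ]
    for row in match_documents(sample, [1.0, 0.0], match_count=2):
        print(row["id"], round(row["similarity"], 4))
```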
- Run the app
```bash
streamlit run main.py
```
## Built With
* [Python](https://www.python.org/) - The programming language used.
* [Streamlit](https://streamlit.io/) - The web framework used.
* [Supabase](https://supabase.io/) - The open source Firebase alternative.
## Contributing
Open a pull request and we'll review it as soon as possible.
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=StanGirard/quivr&type=Date)](https://star-history.com/#StanGirard/quivr&Date)