
# Quivr

<p align="center">
<img src="./logo.png" alt="Quivr-logo" width="30%">
</p>

<a href="https://discord.gg/HUpRgp2HG8">
  <img src="https://img.shields.io/badge/discord-join%20chat-blue.svg" alt="Join our Discord" height="40">
</a>

Quivr is your second brain in the cloud, designed to easily store and retrieve unstructured information. It's like Obsidian but powered by generative AI.

## Features

- **Store Anything**: Quivr can handle almost any type of data you throw at it. Text, images, code snippets, you name it.
- **Generative AI**: Quivr uses advanced AI to help you generate and retrieve information.
- **Fast and Efficient**: Designed with speed and efficiency in mind. Quivr makes sure you can access your data as quickly as possible.
- **Secure**: Your data is stored securely in the cloud and is always under your control.
- **Compatible Files**: 
  - **Text**
  - **Markdown**
  - **PDF**
  - **Audio**
  - **Video**
- **Open Source**: Quivr is open source and free to use.

## Demo

### Demo with GPT-3.5
https://github.com/StanGirard/quivr/assets/19614572/80721777-2313-468f-b75e-09379f694653


### Demo with Claude 100k context
https://github.com/StanGirard/quivr/assets/5101573/9dba918c-9032-4c8d-9eea-94336d2c8bd4

## Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

### Prerequisites

You need the following to install and run Quivr:

- Python 3.10 or higher
- Pip
- Virtualenv
- Supabase account
- Supabase API key
- Supabase URL

### Installing

- Clone the repository

```bash
git clone git@github.com:StanGirard/Quivr.git && cd Quivr
```

- Create a virtual environment

```bash
virtualenv venv
```

- Activate the virtual environment

```bash
source venv/bin/activate
```

- Install the dependencies

```bash
pip install -r requirements.txt
```

- Copy the Streamlit `secrets.toml` example file

```bash
cp .streamlit/secrets.toml.example .streamlit/secrets.toml
```

- Add your credentials to the `.streamlit/secrets.toml` file

```toml
supabase_url = "SUPABASE_URL"
supabase_service_key = "SUPABASE_SERVICE_KEY"
openai_api_key = "OPENAI_API_KEY"
anthropic_api_key = "ANTHROPIC_API_KEY" # Optional
```

- Run the migration script on the Supabase database via the web interface

```sql
-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
    id bigserial primary key,
    content text, -- corresponds to Document.pageContent
    metadata jsonb, -- corresponds to Document.metadata
    embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed
);

CREATE FUNCTION match_documents(query_embedding vector(1536), match_count int)
    RETURNS TABLE(
        id bigint,
        content text,
        metadata jsonb,
        -- we return matched vectors to enable maximal marginal relevance searches
        embedding vector(1536),
        similarity float)
    LANGUAGE plpgsql
    AS $$
    #variable_conflict use_column
BEGIN
    RETURN query
    SELECT
        id,
        content,
        metadata,
        embedding,
        1 - (documents.embedding <=> query_embedding) AS similarity
    FROM
        documents
    ORDER BY
        documents.embedding <=> query_embedding
    LIMIT match_count;
END;
$$;
```
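
In pgvector, `<=>` is the cosine distance operator, so `match_documents` orders rows by cosine distance to the query and reports `1 - distance` as the similarity. The following is a minimal pure-Python sketch of that same logic over an in-memory list, intended only to illustrate what the SQL function returns. The `cosine_distance` helper, the dictionary layout, and the two-dimensional sample vectors are assumptions made for the example; real OpenAI embeddings have 1536 dimensions.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two vectors, as computed by pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def match_documents(documents, query_embedding, match_count):
    """Return the match_count closest documents, each with a similarity score."""
    # ORDER BY documents.embedding <=> query_embedding
    ranked = sorted(
        documents,
        key=lambda doc: cosine_distance(doc["embedding"], query_embedding),
    )
    # LIMIT match_count, with similarity = 1 - cosine distance
    return [
        {**doc, "similarity": 1.0 - cosine_distance(doc["embedding"], query_embedding)}
        for doc in ranked[:match_count]
    ]


# Hypothetical rows standing in for the documents table.
docs = [
    {"id": 1, "content": "a", "metadata": {}, "embedding": [1.0, 0.0]},
    {"id": 2, "content": "b", "metadata": {}, "embedding": [0.0, 1.0]},
    {"id": 3, "content": "c", "metadata": {}, "embedding": [1.0, 1.0]},
]
results = match_documents(docs, [1.0, 0.0], match_count=2)
print([row["id"] for row in results])  # [1, 3]
```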

Then run this second script to create the `stats` table:

```sql
create table
  stats (
    -- A column called "time" with data type "timestamp"
    time timestamp,
    -- Columns called "chat" and "embedding" with data type "boolean"
    chat boolean,
    embedding boolean,
    -- A column called "details" with data type "text"
    details text,
    -- A column called "metadata" with data type "jsonb"
    metadata jsonb,
    -- An "integer" primary key column called "id" that is generated always as identity
    id integer primary key generated always as identity
  );
```

- Run the app

```bash
streamlit run main.py
```

## Built With

* [Python](https://www.python.org/) - The programming language used.
* [Streamlit](https://streamlit.io/) - The web framework used.
* [Supabase](https://supabase.io/) - The open source Firebase alternative.

## Contributing

Open a pull request and we'll review it as soon as possible.

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=StanGirard/quivr&type=Date)](https://star-history.com/#StanGirard/quivr&Date)