Commit Graph

37 Commits

Author SHA1 Message Date
Stan Girard
380cf82706
feat: quivr core 0.1 (#2970)
# Description


# Testing backend 

## Docker setup
1. Copy `.env.example` to `.env`. Some env variables were added:
`EMBEDDING_DIM`
2. Apply Supabase migrations:
```sh
supabase stop
supabase db reset
supabase start
```
3. Start backend containers
```
make dev
```
## Local setup 
You can also run the backend without Docker.
1. Install [`rye`](https://rye.astral.sh/guide/installation/). Choose
the managed python version and set the version to 3.11
2. Run the following: 
```
cd quivr/backend
rye sync
```
3. Source `.venv` virtual env : `source .venv/bin/activate`
4. Run the backend. Make sure Redis and Supabase are running.
API: 
```
LOG_LEVEL=debug uvicorn quivr_api.main:app --log-level debug --reload --host 0.0.0.0 --port 5050 --workers 1
```
Worker: 
```
LOG_LEVEL=debug celery -A quivr_worker.celery_worker worker -l info -E --concurrency 1
```
Notifier: 
```
LOG_LEVEL=debug python worker/quivr_worker/celery_monitor.py
```

---------

Co-authored-by: chloedia <chloedaems0@gmail.com>
Co-authored-by: aminediro <aminedirhoussi1@gmail.com>
Co-authored-by: Antoine Dewez <44063631+Zewed@users.noreply.github.com>
Co-authored-by: Chloé Daems <73901882+chloedia@users.noreply.github.com>
Co-authored-by: Zewed <dewez.antoine2@gmail.com>
2024-09-02 10:20:53 +02:00
AmineDiro
685558560c
feat: quivr core tox test + parsers (#2929) 2024-07-30 18:49:12 +02:00
Stan Girard
2e4b80138c
chore: Update flashrank npm dependency to version 0.2.5 (#2781)
# Description

Please include a summary of the changes and the related issue. Please
also include relevant motivation and context.

## Checklist before requesting a review

Please delete options that are not relevant.

- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented hard-to-understand areas
- [ ] I have ideally added tests that prove my fix is effective or that
my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged

## Screenshots (if appropriate):
2024-06-28 09:09:17 -07:00
Stan Girard
49c6eb686a
chore: Add supabase directory to Dockerfile (#2768)
2024-06-27 05:02:10 -07:00
Stan Girard
1cd5ff6e78
chore: Add ci-migration.sh to Dockerfile (#2767)
2024-06-27 04:02:15 -07:00
AmineDiro
2e75de4039
feat(backend): quivr-monorepo and quivr-core package (#2765)
# Description

closes #2722.

- Creates `quivr-monorepo`
- Separates `quivr-core`
- Updates Dockerfiles and docker-compose

---------

Co-authored-by: aminediro <aminediro@github.com>
2024-06-27 03:51:01 -07:00
AmineDiro
ca93cb9062
refacto(backend): poetry package manager and chat route refactoring (#2684)
# Description
- Added package manager
- Added precommit checks
- Rewrote dependency injection of Services and Repositories
- Integrated async SQLAlchemy engine
- Migrated Chat repository to SQLModel
- Migrated ChatHistory repository to SQLModel
- User SQLModel
- Unit test methodology with db rollback
- Unit tests ChatRepository
- Test ChatService get_history
- Brain entity SQLModel
- Prompt SQLModel
- Rewrote chat/{chat_id}/question route
- Updated Dockerfiles and docker-compose for dev and production

Added `quivr_core` subpackages:
- Refactored KnowledgebrainQa
- Added Rag service to interface with non-rag dependencies

---------

Co-authored-by: aminediro <aminediro@github.com>
2024-06-26 00:58:55 -07:00
Stan Girard
e33d497598
feat(crawler): Add Playwright for web crawling (#2562)
This pull request adds the Playwright library for web crawling. It
includes the necessary dependencies and updates the code to use
Playwright for crawling websites.
2024-05-08 07:20:35 -07:00
Stan Girard
eb360830e0 Update Dockerfile dependencies 2024-04-28 14:34:44 +02:00
Stan Girard
b3e8c3d711 Add Supabase schema, migrations, and .gitignore file 2024-04-27 15:31:43 +02:00
Stan Girard
74c0e2d72c Add Supabase to Dockerfile 2024-04-27 15:20:39 +02:00
Stan Girard
e7b5699818
feat(docker): Update Dockerfile to install Supabase CLI (#2505)
This pull request updates the Dockerfile to include the installation of
the Supabase CLI. The Supabase CLI is required for interacting with the
Supabase backend. This update ensures that the Supabase CLI is installed
in the Docker image, allowing developers to easily use the Supabase CLI
within their Docker environment.
2024-04-27 04:42:24 -07:00
Stan Girard
743528a6e6 Add libpq-dev and gcc to Dockerfile 2024-02-14 20:29:57 -08:00
Stan Girard
33eedcf5eb Increase pip install timeout to 20000 2024-02-14 20:16:04 -08:00
Stan Girard
2ba3bc1f07
feat: 🎸 ocr (#2187)
added ocr
2024-02-12 19:56:20 -08:00
Stan Girard
dfdb294c50
feat: 🎸 api (#2078)
adding metadata to api
2024-01-25 15:56:46 -08:00
renovate[bot]
fe373fdd10
chore(deps): pin dependencies (#1975)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [actions/checkout](https://togithub.com/actions/checkout) | action | pinDigest | -> `b4ffde6` |
| [actions/setup-node](https://togithub.com/actions/setup-node) | action | pinDigest | -> `b39b52d` |
| [actions/setup-python](https://togithub.com/actions/setup-python) | action | pinDigest | -> `65d7f2d` |
| [aws-actions/amazon-ecr-login](https://togithub.com/aws-actions/amazon-ecr-login) | action | pinDigest | -> `2fc7ace` |
| [aws-actions/amazon-ecs-deploy-task-definition](https://togithub.com/aws-actions/amazon-ecs-deploy-task-definition) | action | pinDigest | -> `df96430` |
| [aws-actions/amazon-ecs-render-task-definition](https://togithub.com/aws-actions/amazon-ecs-render-task-definition) | action | pinDigest | -> `4225e0b` |
| [aws-actions/configure-aws-credentials](https://togithub.com/aws-actions/configure-aws-credentials) | action | pinDigest | -> `5fd3084` |
| darthsim/imgproxy |  | pinDigest |  -> `0facd35` |
| [docker/build-push-action](https://togithub.com/docker/build-push-action) | action | pinDigest | -> `0a97817` |
| [docker/build-push-action](https://togithub.com/docker/build-push-action) | action | pinDigest | -> `4a13e50` |
| [docker/login-action](https://togithub.com/docker/login-action) | action | pinDigest | -> `465a078` |
| [docker/login-action](https://togithub.com/docker/login-action) | action | pinDigest | -> `343f7c4` |
| [docker/setup-buildx-action](https://togithub.com/docker/setup-buildx-action) | action | pinDigest | -> `885d146` |
| [docker/setup-buildx-action](https://togithub.com/docker/setup-buildx-action) | action | pinDigest | -> `f95db51` |
| [docker/setup-qemu-action](https://togithub.com/docker/setup-qemu-action) | action | pinDigest | -> `6882732` |
| [google-github-actions/release-please-action](https://togithub.com/google-github-actions/release-please-action) | action | pinDigest | -> `db8f2c6` |
| kong |  | pinDigest |  -> `1b53405` |
| [pavelzw/pytest-action](https://togithub.com/pavelzw/pytest-action) | action | pinDigest | -> `510c5e9` |
| postgrest/postgrest |  | pinDigest |  -> `23b2dab` |
| python | final | pinDigest |  -> `0c1fbb2` |
| redis |  | pinDigest |  -> `a7cee7c` |
| supabase/edge-runtime |  | pinDigest |  -> `4e02aac` |
| supabase/gotrue |  | pinDigest |  -> `b503f1f` |
| supabase/logflare |  | pinDigest |  -> `e693c78` |
| supabase/postgres |  | pinDigest |  -> `fb8387f` |
| supabase/postgres-meta |  | pinDigest |  -> `31a107d` |
| supabase/realtime |  | pinDigest |  -> `634a59e` |
| supabase/storage-api |  | pinDigest |  -> `2cd146f` |
| supabase/studio |  | pinDigest |  -> `393669f` |
| timberio/vector |  | pinDigest |  -> `4bc04ac` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

👻 **Immortal**: This PR will be recreated if closed unmerged. Get
[config help](https://togithub.com/renovatebot/renovate/discussions) if
that's undesired.

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/StanGirard/quivr).


Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-01-04 14:00:18 +01:00
Pascal Gula
8bbe6e7054
Adds pytesseract, tesseract and poppler-utils (#1648)
To enable the ingestion of copy-protected PDFs via OCR instead of text
extraction

# Description

Copy-protected PDFs can't be imported properly via the standard LangChain
loader.

See the following errors:

```
2023-11-15 14:16:31,927 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
[nltk_data] Downloading package punkt to
[nltk_data]     /home/pascal_gula_luccid_ai/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/pascal_gula_luccid_ai/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
Error processing file: detectron2 is not installed, pytesseract is not installed and the text of the PDF is not extractable. To process this file, install detectron2, install pytesseract, or remove copy protection from the PDF.
```

```
2023-11-15 15:04:14,624 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
Error processing file: Unable to get page count. Is poppler installed and in PATH?
```

```
2023-11-15 15:59:11,886 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
Error processing file: tesseract is not installed or it's not in your PATH. See README file for more information.
```
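
The last two errors come from external binaries missing on `PATH`. A minimal preflight sketch follows, assuming the binaries are named `tesseract` and `pdfinfo` (the latter ships with poppler-utils); the `missing_tools` helper is hypothetical and not part of this change.

```sh
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed binary names: tesseract (OCR engine), pdfinfo (from poppler-utils).
missing_tools tesseract pdfinfo
```

If this prints nothing, both tools were found. Any name it prints still has to be installed in the image.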


## Checklist before requesting a review

Please delete options that are not relevant.

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented hard-to-understand areas
- [x] I have ideally added tests that prove my fix is effective or that
my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged

## Screenshots (if appropriate):

None

Co-authored-by: Stan Girard <girard.stanislas@gmail.com>
2023-11-22 17:26:11 +01:00
Stan Girard
744eea6d43
feat: 🎸 docker reduced size by 2 (#1653)
reduced size by 2
2023-11-18 19:23:56 +01:00
Stan Girard
1e00f6929f fix: 🐛 docker
fixed multi stage
2023-11-13 15:03:55 +01:00
Mohamed Messaad
71d4a63a17
feat(docker): use multi-stage Docker builds for smaller images (#1614)
# Description

Currently, the production Docker images are very large, sitting at 4.17
GB for the frontend image, and 3.49 GB for backend images. This change
adds multi-stage builds, to optimize the image sizes, which results in
the following improvements:

- frontend image size: 4.17 GB -> 1.64 GB
- backend image size: 3.49 GB -> 1.71 GB
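
As a rough illustration, the relative savings can be worked out from the sizes listed above. The `reduction_pct` helper is hypothetical and only does the arithmetic with `awk`.

```sh
# Hypothetical helper: percentage reduction between two sizes in the same unit.
reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

reduction_pct 4.17 1.64   # frontend, prints 60.7
reduction_pct 3.49 1.71   # backend, prints 51.0
```

Both images shrink by half or more.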

I hope this is appropriate as there is no open issue for this that I
know of.
I implemented this change and tested it locally, and would be glad to
discuss this and open an issue if necessary.

## Checklist before requesting a review

Please delete options that are not relevant.

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented hard-to-understand areas
- [x] New and existing unit tests pass locally with my changes

## Screenshots (if appropriate):
Image sizes before:
<img width="1416" alt="image"
src="https://github.com/StanGirard/quivr/assets/8296549/fcbb020f-8165-4549-ae30-823318691ec6">

Image sizes after:
<img width="1416" alt="image"
src="https://github.com/StanGirard/quivr/assets/8296549/d3f43c78-be26-4c38-9d23-9c1b0e9e37f2">
2023-11-13 10:13:56 +01:00
Matthieu Jacq
fa92243a18
feat: ⚙️🐞 configure debugger for the backend (#1345) 2023-10-09 15:23:13 +02:00
Stan Girard
2e4fdc80ec
feat(concurrency): added concurrency for increased performance (#1189) 2023-09-17 22:36:42 +02:00
Stan Girard
1d33fbd3eb
feat(file-system): added queue and filesystem (#1159)
* feat(queue): added

* feat(crawling): added queue

* fix(crawler): fixed github

* feat(docker): simplified docker compose

* feat(celery): added worker

* feat(files): now uploaded

* feat(files): missing routes

* feat(delete): added

* feat(storage): added policy and migrations

* feat(sqs): implemented

* feat(redis): added queue name variable

* fix(task): updated

* style(env): removed unused env

* ci(tests): removed broken tests
2023-09-14 11:56:59 +02:00
Pat Tran
43a00b06ec
fix(dockerfile): backend Dockerfile exit code 1 (#1032) 2023-08-25 11:05:24 +02:00
Stan Girard
d0370ab499
feat(refacto): changed a bit of things to make better dx (#984) 2023-08-19 13:32:16 +02:00
Matt
e61f437ce8
Feat/backend core (#656) 2023-07-17 07:57:27 +01:00
nicksan222
c4c15a497c
Fixed pandocs (#662) 2023-07-15 23:20:47 +02:00
Matt
211740b400
fix: defined executable for windows/linux users (#652) 2023-07-14 18:24:09 +02:00
Stan Girard
fbd1e17018
feat(sentry): added sentry (#443) 2023-07-01 21:12:13 +02:00
Matt
d9b2be19d7
feat: start script (#367)
* feat: start script

* make faster
2023-06-23 14:20:03 +02:00
Matt
83fde0aeea
feat: private llm (#360)
* feat: private llm

* Update backend/vectorstore/supabase.py

* Update backend/vectorstore/supabase.py
2023-06-22 09:45:35 +01:00
Stan Girard
a3ca7ecb37
Back/refacto files (#240)
* feat(docker): added docker for prod

* feat(refacto): moved to modules
2023-06-03 23:12:42 +02:00
Stan Girard
72c92b1a54
VertexAI Google Cloud Palm2 Support (#226)
* feat(bard): added

* docs(readme): update

* chore(print): removed
2023-06-01 16:01:27 +02:00
shaun
a1693d94b2 Better envs 2023-05-21 21:18:55 -07:00
Dheerapat Tookkane
020c41b986 set pip timeout to 100 second (default 15) 2023-05-21 23:11:00 +07:00
Stan Girard
f952d7a269
New Webapp migration (#56)
* feat(v2): loaders added

* feature: Add scroll animations

* feature: upload ui

* feature: upload multiple files

* fix: Same file name and size remove

* feat(crawler): added

* feat(parsers): v2 added more

* feat(v2): audio now working

* feat(v2): all loaders

* feat(v2): explorer

* chore: add links

* feat(api): added status in return message

* refactor(website): remove old code

* feat(upload): return type for messages

* feature: redirect to upload if ENV=local

* fix(chat): fixed some issues

* feature: respect response type

* loading state

* feature: Loading stat

* feat(v2): added explore and chat pages

* feature: modal settings

* style: Chat UI

* feature: scroll to bottom when chatting

* feature: smooth scroll in chat

* feature(anim): Slide chat in

* feature: markdown chat

* feat(explorer): list

* feat(doc): added document item

* feat(explore): added modal

* Add clarification on Project API keys and web interface for migration scripts to Readme (#58)

* fix(demo): changed link

* add support to uploading zip file (#62)

* Catch UnicodeEncodeError exception (#64)

* feature: fixed chatbar

* fix(loaders): missing argument

* fix: layout

* fix: One whole chatbox

* fix: Scroll into view

* fix(build): vercel issues

* chore(streamlit): moved to own file

* refactor(api): moved to backend folder

* feat(docker): added docker compose

* Fix a bug where langchain memories were not being cleaned (#71)

* Update README.md (#70)

* chore(streamlit): moved to own file

* refactor(api): moved to backend folder

* docs(readme): updated for new version

* docs(readme): added old readme

* docs(readme): update copy dot env file

* docs(readme): cleanup

---------

Co-authored-by: iMADi-ARCH <nandanaditya985@gmail.com>
Co-authored-by: Matt LeBel <github@lebel.io>
Co-authored-by: Evan Carlson <45178375+EvanCarlson@users.noreply.github.com>
Co-authored-by: Mustafa Hasan Khan <65130881+mustafahasankhan@users.noreply.github.com>
Co-authored-by: zhulixi <48713110+zlxxlz1026@users.noreply.github.com>
Co-authored-by: Stanisław Tuszyński <stanislaw@tuszynski.me>
2023-05-21 01:20:55 +02:00