Commit Graph

34 Commits

Author SHA1 Message Date
Pascal Gula
8bbe6e7054
Adds pytesseract, tesseract and poppler-utils (#1648)
Enables the ingestion of copy-protected PDFs via OCR instead of text
extraction.

# Description

Copy-protected PDFs can't be properly imported via the standard langchain
loader.

The loader fails with the following errors:

```
2023-11-15 14:16:31,927 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
[nltk_data] Downloading package punkt to
[nltk_data]     /home/pascal_gula_luccid_ai/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/pascal_gula_luccid_ai/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
Error processing file: detectron2 is not installed, pytesseract is not installed and the text of the PDF is not extractable. To process this file, install detectron2, install pytesseract, or remove copy protection from the PDF.
```

```
2023-11-15 15:04:14,624 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
Error processing file: Unable to get page count. Is poppler installed and in PATH?
```

```
2023-11-15 15:59:11,886 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
Error processing file: tesseract is not installed or it's not in your PATH. See README file for more information.
```
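
The three errors above each point at a missing dependency. As a minimal sketch (not part of this change), a preflight check along the following lines could report them all at once. The executable names are assumptions: `tesseract` is taken to be the Tesseract CLI, and `pdfinfo` and `pdftoppm` are taken to be the tools shipped with poppler-utils. The function names are hypothetical.

```python
import shutil

# Executables the errors above point at. These names are assumptions:
# "tesseract" is the Tesseract OCR CLI; "pdfinfo" and "pdftoppm" are
# assumed to be provided by the poppler-utils package.
REQUIRED_BINARIES = ("tesseract", "pdfinfo", "pdftoppm")


def missing_ocr_dependencies(binaries=REQUIRED_BINARIES, which=shutil.which):
    """Return the names of required executables that are not found on PATH."""
    return [name for name in binaries if which(name) is None]


def pytesseract_available():
    """Report whether the pytesseract Python package can be imported."""
    try:
        import pytesseract  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    missing = missing_ocr_dependencies()
    if missing:
        print("Missing executables:", ", ".join(missing))
    if not pytesseract_available():
        print("Missing Python package: pytesseract")
```

Running the script inside the container before ingesting a file would show whether the image still lacks any of the tools named in the logs.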


## Checklist before requesting a review

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented hard-to-understand areas
- [x] I have ideally added tests that prove my fix is effective or that
my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged

## Screenshots (if appropriate):

None

Co-authored-by: Stan Girard <girard.stanislas@gmail.com>
2023-11-22 17:26:11 +01:00
Stan Girard
744eea6d43
feat: 🎸 docker reduced size by 2 (#1653)
reduced size by 2
2023-11-18 19:23:56 +01:00
Stan Girard
5a3f284785
test(all): added (#1624)
2023-11-14 08:36:44 +01:00
Stan Girard
6017fa2e9c test(contact): added contact route with mock 2023-11-13 17:58:11 +01:00
Stan Girard
6bc9dd1894
ci: 🎡 tests (#1615)
fixed
2023-11-13 13:53:25 +01:00
Mamadou DICKO
db5a6e4b9b
feat: allow users to chat with apis (#1612)
You can now create a brain that fetches data from external APIs, with
or without authentication

- POST query example with authentication


https://github.com/StanGirard/quivr/assets/63923024/15013ba9-dedb-4f24-9e06-49daad9de7f3


- GET query example with authentication and search params



https://github.com/StanGirard/quivr/assets/63923024/1763875d-a8e9-4478-b07c-e99ca7337942


- GET query without authentication and search params



https://github.com/StanGirard/quivr/assets/63923024/f2742963-790d-4cb2-864a-8173979b650a
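
The three cases above differ only in HTTP method, search params, and whether credentials are attached. A minimal sketch of that idea, using only the Python standard library, might look like this. The function `build_api_request`, its parameter names, and the Bearer scheme are all hypothetical and are not taken from the project's code.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_api_request(url, method="GET", search_params=None, token=None, body=None):
    """Build (but do not send) an HTTP request for an external API.

    All parameter names are hypothetical, chosen for illustration only.
    """
    if search_params:
        # Append search params, respecting any query string already present.
        separator = "&" if "?" in url else "?"
        url = f"{url}{separator}{urlencode(search_params)}"
    headers = {}
    if token is not None:
        # Bearer auth is an assumption; a given API may expect another scheme.
        headers["Authorization"] = f"Bearer {token}"
    data = body.encode("utf-8") if body is not None else None
    return Request(url, data=data, headers=headers, method=method)
```

For example, `build_api_request("https://example.com/items", search_params={"q": "pdf"}, token="secret")` would yield a GET request with an `Authorization` header, while omitting `token` yields an unauthenticated one.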
2023-11-09 16:58:51 +01:00
Mamadou DICKO
c47548d3cd
feat: setup premium feature backend (#1467)
Issue: https://github.com/StanGirard/quivr/issues/1468
2023-10-23 18:19:04 +02:00
Stan Girard
e62c3e0579 feat(litellm): adding huggingface compatibility mistral 2023-10-03 17:18:50 +02:00
Stan Girard
160588cfae feat(litellm): improved 2023-10-03 10:12:44 +02:00
Stan Girard
ead1ae86fc
feat(user_settings): increased (#1291) 2023-09-30 22:32:53 +02:00
Stan Girard
d8e188788f
fix(gpt-3.5-instruct): bug and new version of node (#1228) 2023-09-20 16:16:50 +02:00
Stan Girard
1d33fbd3eb
feat(file-system): added queue and filesystem (#1159)
* feat(queue): added

* feat(crawling): added queue

* fix(crawler): fixed github

* feat(docker): simplified docker compose

* feat(celery): added worker

* feat(files): now uploaded

* feat(files): missing routes

* feat(delete): added

* feat(storage): added policy and migrations

* feat(sqs): implemented

* feat(redis): added queue name variable

* fix(task): updated

* style(env): removed unused env

* ci(tests): removed broken tests
2023-09-14 11:56:59 +02:00
Ishaan Jaff
02964c4077
feat(liteLLM): Add support for Azure OpenAI, Palm, Claude-2, Llama2, CodeLlama (100+LLMs) (#1097)
* v0 litellm

* bump versions
2023-09-05 17:38:19 +02:00
Zineb El Bachiri
3821502c6d
add xlsx and xls parser (#997) 2023-08-21 12:56:48 +02:00
Stan Girard
d0370ab499
feat(refacto): changed a bit of things to make better dx (#984) 2023-08-19 13:32:16 +02:00
Matt
e61f437ce8
Feat/backend core (#656) 2023-07-17 07:57:27 +01:00
Matt
8fbb4b2d91
fix: gpt4all (#595)
* fix: gpt4all

* fix: pyright

* Update backend/llm/openai.py

* fix: remove backend tag

* fix: typing

* feat: qa_base class

* fix: pyright

* fix: model_path not found
2023-07-11 20:15:56 +02:00
Zineb El Bachiri
f837a6e9b9
Feat/shareable brains send link be (#599)
* 🗃️ new table for invitations to subscribe to brain

*  new BrainSubscription class

*  new subscription router

* 👽️ add RESEND_API_KEY to .env in BE

* 📦 add 'resend' lib to requirements

* ♻️ fix some stanGPT
2023-07-11 18:20:31 +02:00
Mamadou DICKO
9e9f531c99
Feat/static analysis (#582)
* feat: add static analysis

* chore: update Makefile add static analysis script

* chore: add vscode extensions recommendations
2023-07-10 14:27:49 +02:00
Stan Girard
fbd1e17018
feat(sentry): added sentry (#443) 2023-07-01 21:12:13 +02:00
Mamadou DICKO
59fe7b089b
feat(chat): use openai function for answer (#354)
* feat(chat): use openai function for answer (backend)

* feat(chat): use openai function for answer (frontend)

* chore: refacto BrainPicking

* feat: update chat creation logic

* feat: simplify chat system logic

* feat: set default method to gpt-3.5-turbo-0613

* feat: use user own openai key

* feat(chat): slightly improve prompts

* feat: add global error interceptor

* feat: remove unused endpoints

* docs: update chat system doc

* chore(linter): add unused import remove config

* feat: improve dx

* feat: improve OpenAiFunctionBasedAnswerGenerator prompt
2023-06-22 17:50:06 +02:00
Matt
83fde0aeea
feat: private llm (#360)
* feat: private llm

* Update backend/vectorstore/supabase.py

* Update backend/vectorstore/supabase.py
2023-06-22 09:45:35 +01:00
Mamadou DICKO
e1a740472f
Feat: chat name edit (#343)
* feat(chat): add name update

* chore(linting): add flake8

* feat: add chat name edit
2023-06-20 09:54:23 +02:00
Stan Girard
f4e85db187 fix(llm): using wrong llm probably because of breaking change in langchain 2023-06-14 22:15:48 +02:00
Matt
33f49ee289
feat: user can create api keys (#329)
* feat: user can create api keys

* fix: linting on build

* Update backend/routes/api_key_routes.py

* chore: rename and refactor AuthBearer

* chore: add types
2023-06-14 21:21:13 +02:00
Stan Girard
67530c13f2 fix(google): requirements 2023-06-12 17:37:58 +02:00
Stan Girard
e0cf37791b feat(pdf): added new pdf miner that works 2023-06-06 11:18:33 +02:00
Stan Girard
72c92b1a54
VertexAI Google Cloud Palm2 Support (#226)
* feat(bard): added

* docs(readme): update

* chore(print): removed
2023-06-01 16:01:27 +02:00
Stan Girard
327074c5d4
feat(auth): now application has authentication (#144)
* feat(auth): backend authentication verification

* feat(auth): added to all endpoints

* feat(auth): added to all endpoints

* feat(auth): redirect if not connected

* chore(print): removed

* feat(login): redirect

* feat(icon): added

* chore(yarn): removed lock

* chore(gitignore): removed
2023-05-24 22:21:22 +02:00
shaun
c38265a5f5 add summarization backend 2023-05-21 23:39:55 -07:00
Stan Girard
97bf4464ad
Merge branch 'main' into main 2023-05-21 09:34:31 +02:00
Evan Carlson
6dab1259ef
add docx2txt package for uploading word docs (#93) 2023-05-21 09:21:51 +02:00
Murtaza
1706538343 Add epub loader to parse epub uploads. 2023-05-21 11:45:31 +05:30
Stan Girard
f952d7a269
New Webapp migration (#56)
* feat(v2): loaders added

* feature: Add scroll animations

* feature: upload ui

* feature: upload multiple files

* fix: Same file name and size remove

* feat(crawler): added

* feat(parsers): v2 added more

* feat(v2): audio now working

* feat(v2): all loaders

* feat(v2): explorer

* chore: add links

* feat(api): added status in return message

* refactor(website): remove old code

* feat(upload): return type for messages

* feature: redirect to upload if ENV=local

* fix(chat): fixed some issues

* feature: respect response type

* loading state

* feature: Loading state

* feat(v2): added explore and chat pages

* feature: modal settings

* style: Chat UI

* feature: scroll to bottom when chatting

* feature: smooth scroll in chat

* feature(anim): Slide chat in

* feature: markdown chat

* feat(explorer): list

* feat(doc): added document item

* feat(explore): added modal

* Add clarification on Project API keys and web interface for migration scripts to Readme (#58)

* fix(demo): changed link

* add support to uploading zip file (#62)

* Catch UnicodeEncodeError exception (#64)

* feature: fixed chatbar

* fix(loaders): missing argument

* fix: layout

* fix: One whole chatbox

* fix: Scroll into view

* fix(build): vercel issues

* chore(streamlit): moved to own file

* refactor(api): moved to backend folder

* feat(docker): added docker compose

* Fix a bug where langchain memories were not being cleaned (#71)

* Update README.md (#70)

* chore(streamlit): moved to own file

* refactor(api): moved to backend folder

* docs(readme): updated for new version

* docs(readme): added old readme

* docs(readme): update copy dot env file

* docs(readme): cleanup

---------

Co-authored-by: iMADi-ARCH <nandanaditya985@gmail.com>
Co-authored-by: Matt LeBel <github@lebel.io>
Co-authored-by: Evan Carlson <45178375+EvanCarlson@users.noreply.github.com>
Co-authored-by: Mustafa Hasan Khan <65130881+mustafahasankhan@users.noreply.github.com>
Co-authored-by: zhulixi <48713110+zlxxlz1026@users.noreply.github.com>
Co-authored-by: Stanisław Tuszyński <stanislaw@tuszynski.me>
2023-05-21 01:20:55 +02:00