# Description
- Added package manager
- Added pre-commit checks
- Rewrote dependency injection of Services and Repositories
- Integrated async SQLAlchemy engine
- Migrated Chat repository to SQLModel
- Migrated ChatHistory repository to SQLModel
- User SQLModel
- Unit test methodology with db rollback
- Unit tests ChatRepository
- Test ChatService get_history
- Brain entity SQLModel
- Prompt SQLModel
- Rewrite chat/{chat_id}/question route
- Updated Dockerfiles and Docker Compose files for dev and production
Added `quivr_core` subpackages:
- Refactored KnowledgebrainQa
- Added Rag service to interface with non-rag dependencies
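The rollback-based unit test methodology listed above can be illustrated with a minimal sketch. It uses the standard-library `sqlite3` module rather than the async SQLAlchemy/SQLModel stack this PR introduces, and the `chats` table and the helper names `make_db`, `rollback_session`, and `count_chats` are hypothetical, chosen only for illustration. The idea is that each test runs inside a transaction that is always rolled back, so tests never leak rows into one another.

```python
import sqlite3
from contextlib import contextmanager


def make_db():
    """Create an in-memory database with a hypothetical `chats` table."""
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are controlled explicitly with BEGIN / ROLLBACK.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE chats (chat_id INTEGER PRIMARY KEY, name TEXT)")
    return conn


@contextmanager
def rollback_session(conn):
    """Yield the connection inside a transaction that is always rolled back."""
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        # Undo everything the test wrote, whether it passed or failed.
        if conn.in_transaction:
            conn.execute("ROLLBACK")


def count_chats(conn):
    """Return the number of rows currently visible in `chats`."""
    return conn.execute("SELECT COUNT(*) FROM chats").fetchone()[0]
```

A test would wrap its body in `with rollback_session(conn) as session:` and assert on the state it created; once the block exits, the table is back to its original contents.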
---------
Co-authored-by: aminediro <aminediro@github.com>
This pull request adds the Playwright library for web crawling. It
includes the necessary dependencies and updates the code to use
Playwright for crawling websites.
This pull request updates the Dockerfile to install the Supabase CLI,
which is required for interacting with the Supabase backend. With the
CLI included in the Docker image, developers can use it directly within
their Docker environment.
added ocr
# Description
Please include a summary of the changes and the related issue. Please
also include relevant motivation and context.
## Checklist before requesting a review
Please delete options that are not relevant.
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented hard-to-understand areas
- [ ] I have ideally added tests that prove my fix is effective or that
my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged
## Screenshots (if appropriate):
adding metadata to api
Enable the ingestion of copy-protected PDFs via OCR instead of text
extraction
# Description
Copy-protected PDFs can't be properly imported via the standard
langchain loader.
See the following errors:
```
2023-11-15 14:16:31,927 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
[nltk_data] Downloading package punkt to
[nltk_data] /home/pascal_gula_luccid_ai/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] /home/pascal_gula_luccid_ai/nltk_data...
[nltk_data] Unzipping taggers/averaged_perceptron_tagger.zip.
Error processing file: detectron2 is not installed, pytesseract is not installed and the text of the PDF is not extractable. To process this file, install detectron2, install pytesseract, or remove copy protection from the PDF.
```
```
2023-11-15 15:04:14,624 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
Error processing file: Unable to get page count. Is poppler installed and in PATH?
```
```
2023-11-15 15:59:11,886 [INFO] models.files: Computing documents from file Cradle to Cradle Criteria for the built environmen.pdf
Error processing file: tesseract is not installed or it's not in your PATH. See README file for more information.
```
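The errors above each point to a missing system dependency, so a preflight check before ingestion may help surface them all at once. The following is a minimal sketch, not the code in this PR: the function name `missing_ocr_tools` and the `REQUIRED_TOOLS` mapping are hypothetical, and it assumes the page-count error comes from a missing `pdfinfo` executable (shipped with poppler) and the last error from a missing `tesseract` executable.

```python
import shutil

# Assumed mapping from executable name to the package that provides it.
# `pdfinfo` ships with poppler; `tesseract` is the OCR engine that
# pytesseract wraps.
REQUIRED_TOOLS = {"pdfinfo": "poppler", "tesseract": "tesseract"}


def missing_ocr_tools(which=shutil.which):
    """Return the packages whose executables cannot be found on PATH."""
    return sorted(
        package
        for executable, package in REQUIRED_TOOLS.items()
        if which(executable) is None
    )


if __name__ == "__main__":
    missing = missing_ocr_tools()
    if missing:
        print("Missing OCR dependencies: " + ", ".join(missing))
    else:
        print("All OCR dependencies found on PATH.")
```

The `which` parameter defaults to `shutil.which` and exists only so the lookup can be replaced in tests.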
## Checklist before requesting a review
Please delete options that are not relevant.
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented hard-to-understand areas
- [x] I have ideally added tests that prove my fix is effective or that
my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged
## Screenshots (if appropriate):
None
Co-authored-by: Stan Girard <girard.stanislas@gmail.com>
reduced size by 2
# Description
Currently, the production Docker images are very large: 4.17 GB for the
frontend image and 3.49 GB for backend images. This change adds
multi-stage builds to reduce the image sizes, with the following
results:
- frontend image size: 4.17 GB -> 1.64 GB
- backend image size: 3.49 GB -> 1.71 GB
I hope this is appropriate as there is no open issue for this that I
know of.
I implemented this change and tested it locally, and would be glad to
discuss this and open an issue if necessary.
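For reference, the relative savings can be computed from the sizes reported above. This is a small arithmetic sketch; the helper name `percent_reduction` is hypothetical, and the only inputs are the before and after sizes quoted in this description.

```python
def percent_reduction(before_gb: float, after_gb: float) -> float:
    """Return the size reduction as a percentage, rounded to one decimal."""
    return round((before_gb - after_gb) / before_gb * 100, 1)


# Sizes in GB as reported in this description: (before, after).
sizes = {"frontend": (4.17, 1.64), "backend": (3.49, 1.71)}

for name, (before, after) in sizes.items():
    print(f"{name}: {percent_reduction(before, after)}% smaller")
```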
## Checklist before requesting a review
Please delete options that are not relevant.
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented hard-to-understand areas
- [x] New and existing unit tests pass locally with my changes
## Screenshots (if appropriate):
Image sizes before:
<img width="1416" alt="image"
src="https://github.com/StanGirard/quivr/assets/8296549/fcbb020f-8165-4549-ae30-823318691ec6">
Image sizes after:
<img width="1416" alt="image"
src="https://github.com/StanGirard/quivr/assets/8296549/d3f43c78-be26-4c38-9d23-9c1b0e9e37f2">
* feat(v2): loaders added
* feature: Add scroll animations
* feature: upload ui
* feature: upload multiple files
* fix: Same file name and size remove
* feat(crawler): added
* feat(parsers): v2 added more
* feat(v2): audio now working
* feat(v2): all loaders
* feat(v2): explorer
* chore: add links
* feat(api): added status in return message
* refactor(website): remove old code
* feat(upload): return type for messages
* feature: redirect to upload if ENV=local
* fix(chat): fixed some issues
* feature: respect response type
* loading state
* feature: Loading state
* feat(v2): added explore and chat pages
* feature: modal settings
* style: Chat UI
* feature: scroll to bottom when chatting
* feature: smooth scroll in chat
* feature(anim): Slide chat in
* feature: markdown chat
* feat(explorer): list
* feat(doc): added document item
* feat(explore): added modal
* Add clarification on Project API keys and web interface for migration scripts to Readme (#58)
* fix(demo): changed link
* add support to uploading zip file (#62)
* Catch UnicodeEncodeError exception (#64)
* feature: fixed chatbar
* fix(loaders): missing argument
* fix: layout
* fix: One whole chatbox
* fix: Scroll into view
* fix(build): vercel issues
* chore(streamlit): moved to own file
* refactor(api): moved to backend folder
* feat(docker): added docker compose
* Fix a bug where langchain memories were not being cleaned (#71)
* Update README.md (#70)
* chore(streamlit): moved to own file
* refactor(api): moved to backend folder
* docs(readme): updated for new version
* docs(readme): added old readme
* docs(readme): update copy dot env file
* docs(readme): cleanup
---------
Co-authored-by: iMADi-ARCH <nandanaditya985@gmail.com>
Co-authored-by: Matt LeBel <github@lebel.io>
Co-authored-by: Evan Carlson <45178375+EvanCarlson@users.noreply.github.com>
Co-authored-by: Mustafa Hasan Khan <65130881+mustafahasankhan@users.noreply.github.com>
Co-authored-by: zhulixi <48713110+zlxxlz1026@users.noreply.github.com>
Co-authored-by: Stanisław Tuszyński <stanislaw@tuszynski.me>