This pull request adds a GitHub Actions workflow for building and
pushing Docker images to Amazon ECR. The workflow is triggered on every
push to the main branch and includes steps for configuring AWS
credentials, logging in to Amazon ECR, GitHub Container Registry, and
Docker Hub, setting up Docker Buildx, creating a Docker cache storage
backend, and building, tagging, and pushing the Docker image to Amazon
ECR.
This pull request adds the Playwright library for web crawling. It
includes the necessary dependencies and updates the code to use
Playwright for crawling websites.
# Description
Stop replacing non-ASCII characters with spaces.
## Checklist before requesting a review
Please delete options that are not relevant.
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [ ] I have commented hard-to-understand areas
- [ ] I have ideally added tests that prove my fix is effective or that
my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged
## Screenshots (if appropriate):
# Description
Please include a summary of the changes and the related issue. Please
also include relevant motivation and context.
## Checklist before requesting a review
Please delete options that are not relevant.
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented hard-to-understand areas
- [ ] I have ideally added tests that prove my fix is effective or that
my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged
## Screenshots (if appropriate):
This pull request adds a new config parameter to the
`conversational_qa_chain` function. The config parameter allows for
passing metadata, specifically the conversation ID, to the function.
This change ensures that the conversation ID is included in the metadata
when invoking the `conversational_qa_chain` function.
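A minimal sketch of how the conversation ID might travel in the config
metadata. `build_chain_config`, `fake_chain_invoke`, and the
`conversation_id` key are hypothetical names used for illustration, not
taken from the Quivr code; the `{"metadata": {...}}` shape is assumed to
match the config dictionary the chain accepts.

```python
from typing import Any
from uuid import UUID, uuid4


def build_chain_config(conversation_id: UUID) -> dict[str, Any]:
    """Return a config dict carrying the conversation ID as metadata.

    Hypothetical helper: the key name "conversation_id" is an assumption.
    """
    return {"metadata": {"conversation_id": str(conversation_id)}}


def fake_chain_invoke(inputs: dict[str, Any], config: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for invoking the chain; echoes back the metadata it received."""
    return {"answer": f"echo: {inputs['question']}", "metadata": config["metadata"]}


chat_id = uuid4()
result = fake_chain_invoke(
    {"question": "What is Quivr?"},
    config=build_chain_config(chat_id),
)
```

Building the config in one helper keeps the metadata shape in a single
place, so every call site attaches the conversation ID the same way.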
This pull request adds the ProxyBrain integration to the project. The
ProxyBrain class is responsible for handling conversational QA and
generating answers based on the provided chat history and question.
---------
Co-authored-by: Stan Girard <girard.stanislas@gmail.com>
This pull request fixes the parsing instruction in `common.py`. The
`result_type` is corrected to `"markdown"`, and the
`parsing_instruction` now handles checkboxes, tables, and other
elements that are hard to parse in a meaningful way.
This pull request adds Llama Parse integration for complex document
parsing in Quivr. Llama Parse is a tool from Llama Index for reading
complex documents. It requires an API key, which must be added to the
`.env` file as `LLAMA_CLOUD_API_KEY`. Once configured, Quivr can use
Llama Parse to read `pdf`, `docx`, and `doc` files.
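As a rough illustration of the configuration check described above, the
sketch below decides whether a file could be routed to Llama Parse. The
helper `can_use_llama_parse` and the constant `LLAMA_PARSE_EXTENSIONS`
are hypothetical; only the `LLAMA_CLOUD_API_KEY` variable name and the
three file types come from this description.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# File types named in the description above.
LLAMA_PARSE_EXTENSIONS = {".pdf", ".docx", ".doc"}


def can_use_llama_parse(filename: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the API key is set and the file type is supported.

    Hypothetical helper: `env` defaults to the process environment and can
    be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    has_key = bool(env.get("LLAMA_CLOUD_API_KEY"))
    return has_key and Path(filename).suffix.lower() in LLAMA_PARSE_EXTENSIONS
```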
This pull request adds caching for the Supabase client and database
instances in order to improve performance and reduce unnecessary API
calls. The `get_supabase_client()` and `get_supabase_db()` functions now
check if the instances have already been created and return the cached
instances if available. This avoids creating new instances for every
function call, resulting in faster execution times.
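The check-then-reuse pattern described above can be sketched as follows.
`FakeSupabaseClient` is a hypothetical stand-in for the real client, and
the module-level `_client` variable is an assumption about where the
cached instance lives.

```python
from typing import Optional


class FakeSupabaseClient:
    """Stand-in for the real Supabase client (hypothetical, for illustration)."""

    instances_created = 0

    def __init__(self) -> None:
        FakeSupabaseClient.instances_created += 1


# Cached instance; None until the first call creates it.
_client: Optional[FakeSupabaseClient] = None


def get_supabase_client() -> FakeSupabaseClient:
    """Create the client on the first call and return the cached one afterwards."""
    global _client
    if _client is None:
        _client = FakeSupabaseClient()
    return _client


first = get_supabase_client()
second = get_supabase_client()
```

Both calls return the same object, so the constructor runs only once per
process.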
This pull request adds the pyinstrument package for profiling and
updates the Makefile and backend code to support it.
This pull request adds a ci-migration script that sets the project ID
from an environment variable and runs the `supabase link` and
`supabase db push` commands. The script is intended for use in
continuous integration.
This pull request removes the citation metadata from the
`generate_answer` and `generate_stream` functions. The citation metadata
was previously added to the `streamed_chat_history` and `metadata`
dictionaries, but it is no longer necessary. This change improves the
efficiency and clarity of the code.
This pull request updates the chunk size and overlap parameters in the
File class to improve performance. It also increases the top_n value in
the compressor for both the CohereRerank and FlashrankRerank models.
Additionally, it ensures that the page content is encoded in UTF-8
before processing.
This pull request updates the Dockerfile to install the Supabase CLI,
which is required for interacting with the Supabase backend. With the
CLI included in the Docker image, developers can use it directly within
their Docker environment.
… to the DB API
This pull request updates the chunk overlap value in the File class from
300 to 200. This change reduces the overlap between chunks, improving
the performance of chunking operations.
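To illustrate why a smaller overlap means less work, here is a
simplified character-based chunker. It is only a sketch: the
`chunk_text` helper, the chunk size of 500, and the 2000-character input
are assumptions for the example, and the splitter used by the `File`
class may count tokens rather than characters.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows where consecutive windows overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks


text = "x" * 2000  # placeholder document
old_chunks = chunk_text(text, chunk_size=500, chunk_overlap=300)  # 9 chunks
new_chunks = chunk_text(text, chunk_size=500, chunk_overlap=200)  # 6 chunks
```

With the same chunk size, lowering the overlap increases the step
between windows, so fewer chunks are produced, embedded, and stored.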
This pull request adds the flashrank and contextual compression
retriever to the codebase. The flashrank reranker model is used for
compression, and the contextual compression retriever combines the base
compressor and base retriever to improve document retrieval.
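The retrieve-then-rerank idea can be sketched in plain Python. This is
not the FlashRank model or the library's retriever class: the functions
`keyword_retriever`, `overlap_reranker`, and `compression_retriever` are
hypothetical, and word overlap stands in for a real relevance score.

```python
from typing import Callable

Retriever = Callable[[str], list[str]]
Compressor = Callable[[str, list[str]], list[str]]


def keyword_retriever(corpus: list[str]) -> Retriever:
    """Base retriever: return documents sharing at least one word with the query."""
    def retrieve(query: str) -> list[str]:
        words = set(query.lower().split())
        return [doc for doc in corpus if words & set(doc.lower().split())]
    return retrieve


def overlap_reranker(top_n: int) -> Compressor:
    """Base compressor: rank by word overlap with the query and keep top_n.

    Toy stand-in for a reranker model.
    """
    def compress(query: str, docs: list[str]) -> list[str]:
        words = set(query.lower().split())
        ranked = sorted(
            docs,
            key=lambda doc: len(words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_n]
    return compress


def compression_retriever(retriever: Retriever, compressor: Compressor) -> Retriever:
    """Combine both: retrieve broadly, then rerank and trim the candidates."""
    def retrieve(query: str) -> list[str]:
        return compressor(query, retriever(query))
    return retrieve


corpus = [
    "quivr stores documents in a vector database",
    "the reranker orders documents by relevance",
    "bananas are yellow",
]
search = compression_retriever(keyword_retriever(corpus), overlap_reranker(top_n=1))
top = search("how does the reranker order documents")
```

The base retriever casts a wide net, and the compressor decides which of
those candidates are worth passing on to the model.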
This pull request updates the API documentation to include new sections
on configuring Quivr and contacting the Quivr team. It also removes the
"API Brains" section from the documentation.
This pull request fixes the issue of duplicate sources in the model
response and adds metadata to the response. It removes duplicate sources
with the same name and creates a list of unique sources. Additionally,
it includes the generated URLs and sources in the metadata of the model
response.
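A minimal sketch of the deduplication step, assuming each source is a
dictionary with `name` and `url` keys. The `unique_sources` helper, the
field names, and the shape of `response` are hypothetical and chosen
only for illustration.

```python
def unique_sources(sources: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep the first source seen for each name, preserving the original order."""
    seen: set[str] = set()
    unique = []
    for source in sources:
        if source["name"] not in seen:
            seen.add(source["name"])
            unique.append(source)
    return unique


sources = [
    {"name": "report.pdf", "url": "https://example.com/report.pdf"},
    {"name": "notes.txt", "url": "https://example.com/notes.txt"},
    {"name": "report.pdf", "url": "https://example.com/report.pdf"},
]

# Attach the deduplicated list to the response metadata.
response = {"assistant": "placeholder answer", "metadata": {"sources": unique_sources(sources)}}
```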