quivr/backend/core/MegaParse/Dockerfile
Jacopo Chevallard ef90e8e672
feat: introducing configurable retrieval workflows (#3227)
# Description

Major PR which, among other things, introduces the possibility of easily
customizing the retrieval workflows. Workflows are based on LangGraph,
and can be customized using a [yaml configuration
file](core/tests/test_llm_endpoint.py), and adding the implementation of
the nodes logic into
[quivr_rag_langgraph.py](1a0c98437a/backend/core/quivr_core/quivr_rag_langgraph.py)

This is a first, simple implementation that will significantly evolve in
the coming weeks to enable more complex workflows (for instance, with
conditional nodes). We also plan to adopt a similar approach for the
ingestion part, i.e. to enable user to easily customize the ingestion
pipeline.

Closes CORE-195, CORE-203, CORE-204

## Checklist before requesting a review

Please delete options that are not relevant.

- [X] My code follows the style guidelines of this project
- [X] I have performed a self-review of my code
- [X] I have commented hard-to-understand areas
- [X] I have ideally added tests that prove my fix is effective or that
my feature works
- [X] New and existing unit tests pass locally with my changes
- [X] Any dependent changes have been merged

## Screenshots (if appropriate):
2024-09-23 09:11:06 -07:00

17 lines
445 B
Docker

# Using a slim version for a smaller base image
FROM python:3.11.6-slim-bullseye
# Install GEOS library, Rust, and other dependencies, then clean up
RUN apt-get clean && apt-get update && apt-get install -y \
poppler-utils \
tesseract-ocr
WORKDIR /code
# Upgrade pip and install dependencies
RUN pip install megaparse
# You can run the application with the following command:
# docker run -it megaparse_image python your_script.py